Topic Started: Aug 24 2016, 12:55 AM
Christian. Exterminator of Spammers.

I think this might be a first for this site. Yes, we have a lot of intelligent talk over here about a variety of different subjects, but I don't think anyone has ever tried to legitimately teach a course here, or at least offer the kind of material that would be used in such a thing.

But you've interpreted the title correctly if you guessed that I plan to give a crash course in basic linguistics here. :thumbsup:

What Is Linguistics?

Linguistics is not just the learning of many languages, although many linguists do end up polyglots. Linguistics is the scientific study of language and all the processes behind it, and it can be a very broad subject. Most of what I'm going to put in this thread consists of fundamentals that one starts with when first venturing into linguistics. We won't be talking about things like Optimality Theory or Transformationalist Minimalism or anything hardcore-theoretical like that; more like how to analyse and describe any language. The five areas I'll focus on are:

- Phonetics (properties of speech sounds). More specifically, I'll be focussing on articulatory phonetics, which pertains to the production of speech sounds within the vocal tract.
- Phonology (how speech sounds pattern).
- Morphology (word formation).
- Syntax (sentence formation).
- Typology (cross-linguistic language tendencies).

Of course, if you were to go deeper, not only would you see subdivisions in these (such as auditory phonetics, acoustic phonetics, generative phonology, generative grammar vs. functionalist grammar, etc.) but there are also other areas of linguistics.

- Sociolinguistics (the interaction between language and culture)
- Psycholinguistics (the interaction between language and other cognitive processes)
- Semantics (I think you can figure this one out :P)
- Pragmatics (language in usage)
- Discourse (the structuring of language beyond the level of the sentence; this and pragmatics often go together)
- Applied Linguistics (how to put the theory into practice; this would include things like principles of translation, principles of literacy, application of phonetics and phonology for speech-pathological purposes, application of auditory phonetics for audiology and hearing, etc. For Christians this also includes things like hermeneutics - the science of interpretation - which ties in very closely to things like semantics, pragmatics, and even sociolinguistics)
- Historical and Comparative Linguistics (I think you can figure this one out as well; a lot of the basic analytical methods are actually rooted in phonology, since it is regular sound correspondences that tip one off to common parentage of two words from separate languages. I think this is my favourite part of linguistics overall!)

I'll start with phonetics, but I need to scrounge up some sound files before I do. You can't really talk about individual sounds without having sound files to link them to!
Guv'nor's Other Woman

Linguistics is excellent and learning other languages makes you more aware of your own. I feel English is a very ambiguous language.
Certified Mutant

I think this might be a first for this site. Yes, we have a lot of intelligent talk over here about a variety of different subjects, but I don't think anyone has ever tried to legitimately teach a course here, or at least offer the kind of material that would be used in such a thing.

I think it might be a first too :P . I made an abortive attempt with mathematics a while back - but, because we all have varying amounts of mathematical education from our school days, it's difficult to know what level to start at. Besides, a lot of people actively dislike maths, so it was always going to be difficult to get something like that off the ground :P . I don't expect either of those difficulties here, though, given that linguistics isn't as unpopular, and most of us have no prior knowledge of it, so it makes sense to just start at the very beginning.

Anyway, I look forward to seeing more of this :D !
Christian. Exterminator of Spammers.

Aug 24 2016, 02:12 PM
Linguistics is excellent and learning other languages makes you more aware of your own. I feel English is a very ambiguous language.
English is weird in a lot of ways. We'll probably cover that quite a bit when we finally get to syntax. :P

Phonetics is, as I said, the study of the properties of speech sounds. Basically, you study the final product, looking at the different speech sounds individually to discern how they are produced, and when not done just for curiosity, it is done with the purpose of eventually being able to mimic the sounds. That's what articulatory phonetics in particular is good for, and that's what will be included here.

First, let's lay down some groundwork.

The International Phonetic Alphabet

This is what linguists use to write words down, even when there's no written form of the language. It's been in development for well over a century now, having first been proposed back in 1887 and having had several changes made to it since. I'm going to have to use the nocode tag a lot in the phonetics section as well, because in IPA, phonetic transcriptions are written using square brackets. All 26 letters of the Roman alphabet are in it, although they don't all sound like they would in English. The ones that sound exactly the same in every scenario are [b], [d], [f], [g], [h], [l], [m], [n], [s], [v], [w], and [z]. Ones that sound like English in certain contexts are [k], [p], and [t], ones that sound close to English are [a], [o], and [u], and ones that sound nothing at all like their English spelling counterparts are [c], [e], [i], [j], [q], [r], [x], and [y].
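As a quick sanity check on those four groups of letters, here's a minimal Python sketch. The group labels and function names are made up for illustration (they aren't IPA terminology); the letters themselves are copied from the paragraph above.

```python
import string

# Groupings of Roman letters as IPA symbols, copied from the post above.
# The dictionary keys are illustrative labels, not official IPA terms.
GROUPS = {
    "same_as_english": "bdfghlmnsvwz",
    "english_in_some_contexts": "kpt",
    "close_to_english": "aou",
    "unlike_english_spelling": "ceijqrxy",
}

def check_partition(groups):
    """Return True if the groups cover a-z with each letter used once."""
    all_letters = "".join(groups.values())
    no_duplicates = len(all_letters) == len(set(all_letters))
    covers_alphabet = set(all_letters) == set(string.ascii_lowercase)
    return no_duplicates and covers_alphabet

def group_of(letter):
    """Look up which group a single letter falls into, or None."""
    for name, letters in groups_items():
        if letter in letters:
            return name
    return None

def groups_items():
    # Small helper so group_of always reads the module-level table.
    return GROUPS.items()
```

Calling `check_partition(GROUPS)` confirms that the four groups really do account for all 26 letters with no overlap.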

Just to get your feet wet, here's a standard IPA chart. If you don't understand everything, don't worry; this will all be revealed in time.

Posted Image
Certified Mutant

Yeah, there is quite a bit that I don't understand :lol: . Even though I've seen these symbols quite a lot (e.g. at the beginning of Wikipedia articles), there are still a lot of them that I don't recognize.

I'm sure it'll get clearer over time, though :) .
Christian. Exterminator of Spammers.

Consonants: places of articulation, and the structure of the vocal tract

This is one of a few things I'm going to add that isn't straight from memory (the history of the IPA is another such thing :P ) - it's a vertical cross-section of the vocal tract. I can't draw worth crap, so here it is. ;)

Posted Image

These days, when first explaining places of articulation, phonetics profs have a strong tendency to start with the lips and go down the tract. This is evident on standard IPA charts as well. While classifying places of articulation can sometimes get incredibly precise to the point of being nitpicky, there's a more basic set of classifications that phoneticians will use in most cases:

Bilabial - using both lips to produce the sound.
Labiodental - upper teeth against lower lip to produce sound.
Linguolabial - tongue-tip against upper lip to produce sound.
Dental - tongue tip against or between the teeth.
Alveolar - you know that big bump behind your teeth, centred in your mouth? That's the alveolar ridge.
Postalveolar - behind the alveolar ridge. There are three subtypes of these:
- Palato-Alveolar; the body of the tongue is right behind the alveolar ridge
- Alveolo-Palatal; the body of the tongue approaches the alveolar ridge and the back of the tongue is up against or close to the hard palate; the tongue tip is lowered somewhat
- Retroflex - same place as palato-alveolar, but with the tongue curled back further
Palatal - back of the tongue against or related to the hard palate
Velar - back of the tongue against or related to the soft palate (velum)
Uvular - back of the tongue against or related to the uvula (the punching bag thingy in the back of your throat)
Pharyngeal - in the throat between the uvula and the vocal folds; these are the only sounds that use the root of the tongue.
- Includes epiglottals, which are made with a flap just above the vocal folds.
Glottal - no usage of the tongue at all, completely relating to the vocal folds.
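If it helps to see the list above as data, here's a rough Python sketch of the lips-to-glottis ordering. The names `PLACES`, `SUBTYPES`, `basic_place`, and `is_further_back` are hypothetical helpers for illustration only, and folding subtypes into their parent place is a simplification.

```python
# Basic places of articulation, ordered from the lips down the tract,
# in the same order as the list above.
PLACES = (
    "bilabial", "labiodental", "linguolabial", "dental", "alveolar",
    "postalveolar", "palatal", "velar", "uvular", "pharyngeal", "glottal",
)

# Subtypes mentioned in the list, mapped to the basic place they fall under.
SUBTYPES = {
    "palato-alveolar": "postalveolar",
    "alveolo-palatal": "postalveolar",
    "retroflex": "postalveolar",
    "epiglottal": "pharyngeal",
}

def basic_place(place):
    """Reduce a subtype to its basic place; basic places pass through."""
    place = place.lower()
    return SUBTYPES.get(place, place)

def is_further_back(a, b):
    """True if place a sits further down the vocal tract than place b."""
    return PLACES.index(basic_place(a)) > PLACES.index(basic_place(b))
```

For example, `is_further_back("velar", "alveolar")` is true, while two postalveolar subtypes compare as equally far back.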

So you ask, "what about nasal sounds?" Those are included in manners of articulation.

- Plosives (oral stops) mean a complete stop is made to the airflow, and the sound is made when that stop is released. Unreleased plosives occur utterance-finally fairly commonly.
- Fricatives are made when the airflow is restricted to the point of friction, causing noise.
- Affricates are a combination of a plosive and a fricative, where a quick plosive followed by a fricative release results in a single sound. Note: these aren't to be confused with aspirated plosives. More on those later.
- Nasals (nasal stops) mean that the airflow is stopped in the mouth but the velum is lowered, allowing air to freely pass through the nose. Pharyngeal and glottal nasals are impossible.
- Approximants form when the airstream is restricted, but not to the point of creating friction. The approximants near the back of the mouth more or less always have an equivalent vowel.
- Flaps require rapid contact with, and release from, a particular protruding surface and as such cannot be produced in certain places of articulation; palatal, velar, purely pharyngeal (that is, with no epiglottis), and glottal flaps are impossible.
- Trills are basically extended sequences of flaps. Rather than just quick contact, the tongue is held close enough to the point of contact that the airstream causes the tongue to make rapid on-off contact with the surface. Again, in certain contexts they are impossible, and trying to do it will merely result in a fricative or an approximant.
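The "impossible" combinations in the list above can be written down as a small lookup table. This is only a sketch: it encodes just the restrictions stated in this post, treats everything else as possible (a simplifying assumption), and leaves trills out because the post doesn't say exactly where they fail. The names `IMPOSSIBLE` and `is_possible` are invented for the example.

```python
# Place/manner combinations the list above calls impossible.
# "pharyngeal" here means purely pharyngeal; "epiglottal" is kept as a
# separate label because the post only rules out flaps with no epiglottis.
IMPOSSIBLE = {
    "nasal": {"pharyngeal", "glottal"},
    "flap": {"palatal", "velar", "pharyngeal", "glottal"},
}

def is_possible(manner, place):
    """Return False only for combinations explicitly ruled out above."""
    ruled_out = IMPOSSIBLE.get(manner.lower(), set())
    return place.lower() not in ruled_out
```

So `is_possible("nasal", "glottal")` comes back false, while `is_possible("flap", "epiglottal")` stays true.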

There are extensions of these basic manners as well.

- When a sound is lateral, it means that the airstream goes around the sides of the tongue. Fricatives, affricates, approximants, and flaps can be lateral.
- Sibilants (or stridents, or grooved fricatives) are formed when the air is pushed towards the sharp edge of the teeth via a groove in the tongue, resulting in an extra noticeable layer of noise. Only alveolar and post-alveolar sounds can be strident, and stridence is only noticeable in fricatives and affricates. It should be noted, too, that the sibilant/non-sibilant difference never means sound contrast. I'll talk more about that when I get to phonology.
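The two extensions above come with their own restrictions, which can be sketched the same way in Python. The set and function names are hypothetical, and the rules are taken only from the two bullet points above.

```python
# Manners that the post says can be lateral.
LATERAL_MANNERS = {"fricative", "affricate", "approximant", "flap"}

# Sibilance, per the post, needs both a suitable place and a suitable manner.
SIBILANT_PLACES = {"alveolar", "postalveolar"}
SIBILANT_MANNERS = {"fricative", "affricate"}

def can_be_lateral(manner):
    """True if the manner is one the post lists as allowing laterals."""
    return manner.lower() in LATERAL_MANNERS

def can_be_sibilant(place, manner):
    """True if both place and manner allow a noticeable sibilant."""
    return (place.lower() in SIBILANT_PLACES
            and manner.lower() in SIBILANT_MANNERS)
```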

Airstream mechanics - voicing and "non-pulmonic" consonants

The last basic is voicing. Ever hear a [s] and a [z] and think "Hey, these are really similar! But how are they different?" The answer is voicing. Now the most basic types of voicing are voiced and voiceless - either the vocal folds are held apart when the sound is made (voiceless) or they are vibrating (voiced). But the bigger picture is actually more complicated than that, since the vocal folds don't just have two settings: voicing ranges from breathy voice, which has almost no vibration of the vocal folds, to creaky voice, in which a stiffening of the cartilage in the larynx causes an almost complete blockage. I won't go into too great a detail here, as that would be beyond the scope of a basic linguistics class.
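To make the [s]/[z] idea concrete, here's a small Python sketch that flips a sound to its voiced or voiceless partner. The pairs are a hand-picked sample of symbols that show up in the sound lists in this thread, not a complete inventory, and `toggle_voicing` is just an illustrative name.

```python
# A sample of voiceless -> voiced pairs made at the same place and manner.
VOICELESS_TO_VOICED = {
    "p": "b",
    "t": "d",
    "f": "v",
    "θ": "ð",
    "s": "z",
    "ʃ": "ʒ",
    "x": "ɣ",
}

# Build the reverse direction so lookups work both ways.
VOICED_TO_VOICELESS = {v: k for k, v in VOICELESS_TO_VOICED.items()}

def toggle_voicing(symbol):
    """Return the voicing counterpart of a symbol, or None if not listed."""
    if symbol in VOICELESS_TO_VOICED:
        return VOICELESS_TO_VOICED[symbol]
    return VOICED_TO_VOICELESS.get(symbol)
```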

There is also aspiration, which is considered part of voicing. This happens when the glottis opens briefly after (or sometimes before) a plosive, causing an extra bit of air to be released. English has aspiration. I'll talk more about that when I get to phonology.

What has been brought up so far is the range of pulmonic consonants, where the source of the final airstream is the lungs. Sure, the air almost always comes from the lungs, but sometimes the airstream is held up en route and therefore the final airstream source is either partially or fully from elsewhere. These are the "non-pulmonic sounds."

The most common kind of non-pulmonic sound is called an ejective. These are formed when the glottis shuts before a plosive then opens during its release, causing the air to rush out and producing a "heavy-hitting" variation of a plosive, affricate, or in rarer cases even a fricative. These are always voiceless. Rather than having a pulmonic airstream, these are said to have a glottalic airstream because the glottis is the final source of the air. Ejectives are widespread in terms of languages in which they are spoken, but few major languages have them and they are noticeably absent from European languages, if you don't count the Caucasus.

Implosives are always stops, and they're what I was referring to when I said "partially from elsewhere." They have an ingressive glottalic airstream - meaning that air rushes inward at the opening of the glottis rather than outward - caused by the larynx being lowered during production as the glottis is closed, but there is also an egressive pulmonic airstream. The odd combination results in something of a gulping sound, and they are regularly voiced. While they do occur in languages elsewhere (a particularly noteworthy case of this is Sindhi, a major language of southern Pakistan), they are most commonly found in sub-Saharan Africa.

And then there are clicks, which have a lingual airstream. This is what I mean when I say the air almost always initially comes from the lungs: with clicks, it doesn't. Instead, the tongue blocks off the flow at the soft palate, and the sound comes from the release of a vacuum when the articulator is released along with the tongue on the velum. These are generally somewhat loud, although dental clicks aren't. Languages that use these in everyday meaningful speech are exclusive to Africa and almost exclusive to the southern quarter of the continent; there was one attested language in Australia with clicks, but it is a) a created register and b) extinct. Clicks can be heard outside of the realm of actual words in our culture, as dental and alveolar lateral clicks are used to call animals, and dental clicks are used to show pity or disapproval (this was eventually written down as tsk-tsk).
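Here's one way to summarise the airstream mechanisms from the last few paragraphs as a Python table. It's a sketch under a couple of assumptions: the class and key names are invented for illustration, and the "ingressive" label on clicks is the standard description rather than something spelled out in this post.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Airstream:
    initiator: str  # final source of the airstream
    direction: str  # "egressive" (outward) or "ingressive" (inward)

# Each sound type maps to one or more airstreams used together.
AIRSTREAMS = {
    "pulmonic consonant": (Airstream("pulmonic", "egressive"),),
    "ejective": (Airstream("glottalic", "egressive"),),
    # Implosives combine an inward glottalic and an outward pulmonic stream.
    "implosive": (
        Airstream("glottalic", "ingressive"),
        Airstream("pulmonic", "egressive"),
    ),
    "click": (Airstream("lingual", "ingressive"),),
}

def initiators(kind):
    """List the airstream sources involved in a given sound type."""
    return [stream.initiator for stream in AIRSTREAMS[kind]]

def is_non_pulmonic(kind):
    """True if any part of the airstream comes from outside the lungs."""
    return any(name != "pulmonic" for name in initiators(kind))
```

Implosives are the interesting case: `initiators("implosive")` lists both sources, which is the "partially from elsewhere" situation described above.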

Secondary articulations and double articulations

Occasionally you'll get times when there is a secondary element to an articulation. The sound [w] (spelt with the same letter in English and a number of other languages) is probably the most common example of this, as it has a primary velar articulation with a secondary labial articulation - the lips are rounded. French has a similar sound phonetically, with a primary palatal articulation and a secondary labial articulation, denoted by the symbol [ɥ]. These are both examples of "labialisation." There's actually a more common secondary articulation than this (although labialisation is quite common) and that is palatalisation; this is also the name of a very widespread phonological assimilation process. (More on that later.) East Slavic languages (Russian/Ukrainian/Belarusian/Rusyn) are ready examples of this.

But then you have situations where two places of articulation are hit by exactly the same manner at more or less exactly the same time. This is double articulation, where the articulations are equally audible and equally important. Most of these are either plosives or nasals, and the process is most common in West Africa. I cringe at people pronouncing NHL hockey player Kyle Okposo's last name. :p In English our usual tendency is to treat it as two separate consonants, but given how the co-articulated version sounds, I've heard more commentators treat the velar half of that labial-velar stop like it didn't even exist! Examples of languages that have this in their very names are the Liberian language Kpelle and the Nigerian language Igbo.

There is a hard-to-classify double-articulated fricative in Swedish, called the "sje-sound" in Swedish grammar literature and [ɧ] in the IPA.

Naming conventions for consonants

When writing the technical names for consonants (which you will have to do sometimes if you specialise in phonetics or phonology), the convention is generally voicing-place-manner. Sub-manners like "lateral" or "ejective" come between main place and main manner. The same is true of secondary articulations. Like this:

[k] is a voiceless velar plosive. [kʰ] is a voiceless aspirated velar plosive. [l] is a voiced alveolar lateral approximant. [ɫ] is a voiced velarised alveolar lateral approximant. [k͡p] is a voiceless labial-velar plosive, while [kʷ] is a voiceless labialised velar plosive.
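The naming convention can be sketched as a tiny Python function. This is only an illustration with made-up parameter names. It follows the word order used in the worked examples above, so "aspirated" and secondary articulations like "labialised" go before the place, and sub-manners like "lateral" go between place and manner.

```python
def consonant_name(voicing, place, manner, *,
                   aspirated=False, secondary=None, sub_manner=None):
    """Assemble a technical consonant name in voicing-place-manner order."""
    parts = [voicing]
    if aspirated:
        parts.append("aspirated")
    if secondary:
        # e.g. "labialised" or "velarised"
        parts.append(secondary)
    parts.append(place)
    if sub_manner:
        # e.g. "lateral" or "ejective"
        parts.append(sub_manner)
    parts.append(manner)
    return " ".join(parts)
```

For instance, `consonant_name("voiceless", "velar", "plosive", secondary="labialised")` gives the name used above for [kʷ], and passing `sub_manner="lateral"` with an alveolar fricative gives the name for [ɬ].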


Vowels

Vowels are generally voiced sounds produced with next to no tension, in the back half of the vocal tract. All languages have 'em.

There are three features that divide base segmental vowels up in a phonetic sense - tongue height, frontness/backness, and roundedness. The convention for these is obviously different, too. For example, [ɑ] is a low back unrounded vowel, while [y] is the high front rounded vowel, which is found in languages like French and Finnish.

There are two naming conventions for height, though. One can either use high, near-high, high-mid, mid, low-mid, near-low, and low, or close, near-close, close-mid, mid, open-mid, near-open, and open. The IPA prefers the latter in academia, but the former is good for starters or if you want to keep things simpler.
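The two height vocabularies line up one-to-one, so switching between them is just a lookup. Here's a minimal Python sketch; the function name, the `convention` parameter, and its two string values are invented for the example.

```python
# Height terms in the high/low convention mapped to the close/open one,
# in the same order as listed above.
HIGH_TO_CLOSE = {
    "high": "close",
    "near-high": "near-close",
    "high-mid": "close-mid",
    "mid": "mid",
    "low-mid": "open-mid",
    "near-low": "near-open",
    "low": "open",
}

def vowel_name(height, backness, rounded, convention="high-low"):
    """Name a vowel from its three features in either height convention.

    height must be given in high/low terms; rounded is a bool.
    """
    if height not in HIGH_TO_CLOSE:
        raise ValueError(f"unknown height: {height}")
    if convention == "close-open":
        height = HIGH_TO_CLOSE[height]
    elif convention != "high-low":
        raise ValueError(f"unknown convention: {convention}")
    rounding = "rounded" if rounded else "unrounded"
    return f"{height} {backness} {rounding} vowel"
```

With this, [ɑ] comes out as "low back unrounded vowel" by default and "open back unrounded vowel" when `convention="close-open"` is passed.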
Certified Mutant

Nice - that is helping me to make sense of the IPA chart from the previous post :D .

(BTW, about those double articulations: I did find this video which claims to provide the correct pronunciation of 'Kpelle'....but a native speaker has left a comment to say it's wrong :P !)
Christian. Exterminator of Spammers.

And now comes the fun part - the sounds themselves! (I couldn't find a consistent sound chart for all of them - you'll have to search the individual sounds on Wikipedia, because most if not all have a sound file with them.)


Plosives

All languages have plosives. Clean voiceless plosives are the most common, although many of these can also be aspirated.

[p] - voiceless bilabial plosive. Occurs in a large percentage of languages. The English citation form is actually the aspirated [pʰ] but [p] does exist in certain environments (more about this in phonology); [pʰ] and [p] are considered different sounds in many Indo-Aryan languages, the Chinese languages, and Scottish Gaelic, among others. However, Standard Arabic lacks the sound, as do many regional Arabic dialects.

[t̼] - voiceless linguolabial plosive. Incredibly rare; only attested in disordered speech until it was discovered being used contrastively in a group of languages of Vanuatu. (Yes, I looked this one up.)

[t] - voiceless alveolar plosive. Citation form is aspirated in English. Almost every language has it, with Hawaiian being among the very few exceptions. As with [p] and also [k], the aspirated and non-aspirated variants are contrasted in Indo-Aryan languages, the Chinese languages, and Scottish Gaelic, among others. Can vary somewhat in placement, from dental to post-alveolar.

[t͡p] - voiceless labial-alveolar plosive. VERY rare; only decisively attested in one specific language in Papua New Guinea.

[ʈ] - voiceless retroflex plosive. Found primarily contrastively in Asia and the Pacific, especially in South Asia and Australia. Also occurs in Swedish and Norwegian.

[c] - voiceless palatal plosive. Exists in English as a variant of [k] happening before [i] and [e]. Considered a different sound in Hungarian and Albanian among others.

[k] - voiceless velar plosive. Citation form is aspirated in English. VERY common, almost as much so as [t] and more than [p]. Tahitian is a rare counter-example.

[k͡p] - voiceless labial-velar plosive. Fairly rare, occurring mainly in West and Central West Africa. NHL player Kyle Okposo has this sound in his last name, or at least in the original pronunciation thereof - his father is from Nigeria. Also occurs in the language name Kpelle.

[q] - voiceless uvular plosive. Occurs in a number of non-Indo-European languages, which are many but dispersed. Several Turkic languages, Inuktitut, a large number of Salishan languages (if not all of them), all Wakashan languages, many Northeast and Northwest Caucasian languages, some dialects of Arabic and Hebrew (indeed, it's posited that Biblical Hebrew had this sound), and several other indigenous languages of North America have this sound. Iranic languages have it as well, under influence from Arabic.

[q͡ʡ] - voiceless uvular-epiglottal plosive. Supposedly occurs in Somali; could actually just be a clean uvular plosive [q].

[ʡ] - voiceless epiglottal plosive. Quite rare. Occurs in Haida and Archi, and some linguists believe it occurs in Nuu-Chah-Nulth.

[ʔ] - glottal stop. Better understood as an absence of sound. Almost every word in human language that is perceived as beginning in a vowel actually begins with a glottal stop phonetically when at the beginning of an utterance. Many languages use this word-medially or finally as a contrastive sound. Believe it or not, English is among these. "Uh-oh" is transcribed [ʔʌʔɔw] phonetically in Western American/Canadian English.

Voiced plosives are less common than their voiceless counterparts, but still common.

[b] - voiced bilabial plosive. Occurs in numerous languages, including English.

[d̼] - voiced linguolabial plosive. Incredibly rare. Had to look this up, too. Attested in Vanuatu, and the Kakojo dialect of Bijago. And disordered speech. :P

[d] - voiced alveolar plosive. Occurs in numerous languages, including English. Can vary somewhat in placement, from dental to post-alveolar. Is the only voiced stop to occur in Finnish, as a variant of [t] in certain environments.

[ɖ] - voiced retroflex plosive. Occurs in languages of India, as well as in Swedish and Norwegian.

[ɟ] - voiced palatal plosive. Occurs primarily in Eastern Europe as a contrastive sound, most notably in Hungarian, Albanian, Czech, Slovak, and Latvian.

[g] - voiced velar plosive. Common, but the least frequent of the "common six plosives" ([p, t, k, b, d, g]).

[ɡ͡b] - voiced labial-velar plosive. Rare, primarily found in West and Central West Africa, as in the language names Igbo and Gbe.

[ɢ] - voiced uvular plosive. Quite rare. Attested in Mongolian, some dialects of Arabic (non-contrastive), and Canadian indigenous language Kwak'wala, among some others.
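Since every entry in these lists follows the same "[symbol] - name. notes" layout, the entries are easy to pull apart with a few lines of Python if you want to sort or filter them. This is a rough sketch that assumes that exact layout; `parse_entry` and the dictionary keys are names made up for the example.

```python
import re

# Matches "[symbol] - technical name." at the start of an entry line.
ENTRY = re.compile(r"^\[(?P<symbol>[^\]]+)\]\s*-\s*(?P<name>[^.]+)\.")

def parse_entry(line):
    """Split one list entry into symbol, voicing, place, and manner.

    Returns None if the line does not look like an entry.
    """
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    words = match.group("name").split()
    voicing = None
    # Some names (e.g. "glottal stop") carry no voicing word.
    if words and words[0] in ("voiced", "voiceless"):
        voicing = words[0]
        words = words[1:]
    if not words:
        return None
    return {
        "symbol": match.group("symbol"),
        "voicing": voicing,
        "place": " ".join(words[:-1]),  # everything before the manner
        "manner": words[-1],            # the last word is the manner
    }
```

Feeding it the [p] entry above gives a record with place "bilabial" and manner "plosive"; the glottal stop entry comes back with no voicing value.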


Fricatives

Very few languages lack fricatives phonetically, although some language families lack them contrastively. More on that in phonology.

[ɸ] - voiceless bilabial fricative. Not particularly common in contrast, although African language Ewe has it. But as a variant of other sounds it is surprisingly common, and occurs in Spanish and Japanese in this manner, among others.

[f] - voiceless labiodental fricative. Fairly common. Occurs in most Indo-European languages, including English, French, Italian, German, etc.

[θ̼] - voiceless linguolabial fricative. VERY rare. Only attested in Vanuatu.

[θ] - voiceless interdental fricative. Fairly rare. Occurs contrastively in English, Icelandic, Castilian Spanish, Albanian, Greek, and Bashkort, among others.

[s] - voiceless alveolar (grooved) fricative. Most common fricative. Most languages said to not have this sound are in the Pacific, and include Hawaiian and Maori.

[ɬ] - voiceless alveolar lateral fricative. Found primarily in North America, the Caucasus, and southern Africa, also attested contrastively in Welsh and some languages of East and Southeast Asia.

[ʃ] - voiceless palato-alveolar (grooved) fricative. Quite common. Most Indo-European and Turkic languages, and many indigenous languages of the Americas, have this sound.

[ʂ] - voiceless retroflex (grooved) fricative. Not super-common, but not exactly rare, either. Occurs in languages of India, North Germanic languages, Chinese languages, Polish, and the East Slavic languages (Russian/Ukrainian/Belarusian/Rusyn).

[ꞎ] - voiceless retroflex lateral fricative. Attested only in Toda, a Dravidian language of southern India.

[ɕ] - voiceless alveolo-palatal (grooved) fricative. A bit less common. Occurs contrastively in Polish, Russian, Chinese languages, and some languages of the Caucasus; also attested in Japanese as a variant of [s].

[ç] - voiceless palatal fricative. Rare in contrast. Does, however, occur semi-frequently as a variant of another sound, even in certain English dialects (as a variant of [h]). German, Greek, Dutch, and Finnish are among the other languages that have it this way.

[ʎ̥˔] - voiceless palatal lateral fricative. Attested only in a couple of Afro-Asiatic languages of Central Africa.

[x] - voiceless velar fricative. Quite common. A number of Indo-European languages have this in dialectal inventory; Spanish, Russian, and Mandarin are three particularly major languages that have this sound. Old English had this sound.

[ɧ] - voiceless postalveolo-velar fricative. This one is rare and also somewhat controversial. It is only clearly attested in Swedish as the "sje-sound" which is written "sj." Officially it is considered a co-articulation of [ʃ] and [x], but this is still a point of argument amongst those who study Swedish phonetics and phonology. Supposedly also occurs in the Kölsch dialect in western Germany.

[ʟ̝̊] - voiceless velar lateral fricative. Attested in some languages of the Chimbu-Wahgi branch of the Trans New Guinea family, and also in Northeast Caucasian language Archi.

[χ] - voiceless uvular fricative. A rather harsh sound, not quite as common. Fairly common in indigenous languages of western North America, the Caucasus, the Middle East, and several dialects of German.

[ħ] - voiceless pharyngeal fricative. Fairly rare. Occurs mainly in Semitic languages and languages of the Caucasus.

[h] - voiceless glottal fricative. Occurs in a large variety of languages, English included.

And of course, your voiced fricatives as well:

[β] - voiced bilabial fricative. Is attested as a contrastive sound (most notably in Ewe) but occurs much more frequently as a variant of another sound, as in Spanish, Japanese, and Korean, among others.

[v] - voiced labiodental fricative. Occurs contrastively primarily in Europe, the Middle East, and Siberia, but is attested elsewhere. (I call this the "hard v.")

[ð̼] - voiced linguolabial fricative. VERY rare, attested only in Vanuatu.

[ð] - voiced interdental fricative. Fairly rare. Attested contrastively in English, Icelandic, Arabic, Bashkort, Welsh, and a number of indigenous North American languages. Also a variant in Greek, Spanish, and some dialects of Hebrew.

[z] - voiced alveolar (grooved) fricative. MUCH less common than its voiceless counterpart. Mainly found contrastively in Europe, Africa, and the Middle East.

[ɮ] - voiced alveolar lateral fricative. Quite rare; attested contrastively in Northwest Caucasian languages, Zulu, Xhosa, and a small number of other scattered languages from diverse families.

[ʒ] - voiced palato-alveolar (grooved) fricative. Not particularly common. Found contrastively in many Slavic languages, French, Portuguese, many Na-Dene languages, and various Turkic and Uralic languages. English has it as a variant of [s] or [z].

[ʐ] - voiced retroflex (grooved) fricative. Kinda rare. Occurs in Russian, Polish, Vietnamese, and Mandarin, among others.

[ʑ] - voiced alveolo-palatal (grooved) fricative. Very rare. Occurs in Chinese languages, Polish, Sorbian, and some Northwest Caucasian languages contrastively. Also occurs in some languages as a variant, such as Russian, Portuguese, and Catalan.

[ʝ] - voiced palatal fricative. Incredibly rare in contrast with other sounds - only attested contrastively in Scottish Gaelic and some Berber languages of Saharan Africa. A number of others have this as a variant.

[ɣ] - voiced velar fricative. Surprisingly common. Occurs contrastively in a number of Caucasian languages and languages of the Americas (especially North), and also in Scottish and Irish Gaelic, Arabic, and certain dialects of Hebrew, among others.

[ʟ̝] - voiced velar lateral fricative. Attested in some languages of the Chimbu-Wahgi branch of the Trans New Guinea family, and also in Northeast Caucasian language Archi.

[ʁ] - voiced uvular fricative. Occurs mainly in Europe, the Middle East, and North America; in Europe it is often referred to as the "guttural r" which is used in French, Portuguese, German, many dialects of Dutch, and Danish - in this regard it is also used in Hebrew and Inuktitut. It occurs in Turkic and various Caucasian languages contrastively as well.

[ʕ] - voiced pharyngeal fricative/approximant. Hard to tell in most cases whether there is friction or not since it is so close to the glottis and has no trilling feature. Fairly rare. Occurs contrastively in Arabic, Interior Salish and Wakashan languages (North America), and Caucasian languages. Is the consonantal equivalent of [ɑ].

[ɦ] - voiced (most often breathy-voiced) glottal fricative. Exists in numerous languages, but seldom contrasting with the voiceless [h].


Affricates

These are usually written with digraphs (that is, two symbols) connected by a ligature.

[p͡ɸ] - voiceless bilabial affricate. VERY rare. Unattested in contrast, and most of the languages in which it is attested - mainly on a dialectal level - are West Germanic (English, German, Dutch).

[p̪͡f] - voiceless labiodental affricate. VERY rare. Most of the languages it is attested in at all are West Germanic (German, Lëtzebuergesch, Bavarian) and it apparently also exists contrastively in Tsonga, one of the official languages of South Africa.

[t͡θ] - voiceless interdental affricate. Quite rare. Exists in a small number of indigenous North American languages - some Coast Salish languages have this sound, as do a few Na-Dene languages of the Northwest Territories in Canada.

[t͡s] - voiceless alveolar affricate. Very common, although non-contrastive in English. Best-known in contrast from German, Italian, Slavic languages, and Chinese languages; also exists in a number of indigenous languages of the Americas, the Caucasus, and southern Africa.

[t͡ɬ] - voiceless alveolar lateral affricate. Somewhat rare. Attested primarily in North American indigenous languages (especially in the west) and southern Africa, but also occurs in Icelandic, and Mexican Spanish (through borrowings from Nahuatl).

[t͡ʃ] - voiceless palato-alveolar affricate. Quite common - this is the English "ch" sound. Fairly widespread. Might actually be more widespread an affricate than [t͡s], which is typologically unusual.

[ʈ͡ʂ] - voiceless retroflex affricate. Rarer. Attested mainly in pockets of languages - Slavic languages (primarily West and East Slavic), Northwest Caucasian languages, and Chinese languages.

[t͡ɕ] - voiceless alveolo-palatal affricate. Very rare in contrast (Polish and Serbo-Croatian, for example) but more common as a variant (Chinese languages, Russian, Korean, even certain dialects of English).

[c͡ç] - voiceless palatal affricate. Very rare. Languages that use it often have the plosive [c] as a free variant of it. Some Samic languages of Northern Europe have this attested as contrastive, as does Hungarian.

[c͡ʎ̥˔] - voiceless palatal lateral affricate. Only a couple of African languages have this sound contrastively.

[k͡x] - voiceless velar affricate. Quite rare. Most commonly seen in contrast in Southern Africa.

[k͡ʟ̝̊] - voiceless velar lateral affricate. VERY rare. Attested contrastively in Archi and the Laghuu language of Vietnam.

[q͡χ] - voiceless uvular affricate. Very rare. Mainly found contrastively in the Caucasus and in the Pacific Northwest. I call this the "death rattle" sound. :p

[ʡ͡ħ] - voiceless pharyngeal-epiglottal affricate. Apparently attested in Haida, a language of British Columbia and Alaska. Otherwise unattested.

[ʔ͡h] - glottal affricate. Never occurs contrastively. Attested as a variant in Queen's English (aka Received Pronunciation) and certain dialects of Chinese languages.

And the voiced ones.

[b͡β] - voiced bilabial affricate. Unattested in contrast and very rare otherwise. Some British dialects and this one specific African language have it attested as a variant.

[b̪͡v] - voiced labiodental affricate. Very few languages have this attested in contrast, and these are primarily in southern Africa. Occurs in some West Germanic languages as a variant.

[d͡ð] - voiced interdental affricate. VERY rare and only attested as a variant in languages where dental variations of the alveolar plosive [d] or certain dental sound combinations occur.

[d͡z] - voiced alveolar affricate. Not the most common of sounds, but more common than most voiced affricates. Occurs regularly in languages of Eastern Europe - Slavic languages either have it contrastively (most South Slavic languages) or as a variant (East and West Slavic languages); also attested in Albanian, Caucasian languages, Hungarian, Italian, Armenian, and a few Chinese languages and dialects.

[d͡ɮ] - voiced alveolar lateral affricate. VERY rare, and not known to occur contrastively. Attested in Xhosa and Séliš.

[d͡ʒ] - voiced palato-alveolar affricate. The most common voiced affricate, occurring in a wide range of locations.

[ɖ͡ʐ] - voiced retroflex affricate. Rare. Seen mainly in Slavic languages and languages of China.

[d͡ʑ] - voiced alveolo-palatal affricate. Rare, and usually seen as a variant. Contrastive in some Chinese languages, Polish, and Serbo-Croatian.

[ɟ͡ʝ] - voiced palatal affricate. VERY rare. Hungarian and Samic languages have it in free variation with the plosive [ɟ]. Some dialects of Albanian have it as well.

[ɡ͡ɣ] - voiced velar affricate. VERY rare and unattested contrastively.

[ɡ͡ʟ̝] - voiced velar lateral affricate. Attested contrastively only in Laghuu in Vietnam, and Hiw in Vanuatu.

[ɢ͡ʁ] - voiced uvular affricate. Unattested but considered possible.

[ʡ͡ʕ] - voiced pharyngeal-epiglottal affricate. Unattested but considered possible. How, I really don't know. :lol: I blame Eric.


Nasals are typically voiced.

[m] - voiced bilabial nasal. Incredibly common, with very few languages (among them the ironically named Mohawk - in its own language it is Kanien’kéha; Rotokas also lacks it) not having it.

[ɱ] - voiced labiodental nasal. Almost nonexistent contrastively, but any language with both [m], and [f] and/or [v], WILL have this as a variant. Very hard to tell apart from [m] to my ears!

[n̼] - voiced linguolabial nasal. You guessed it. VANUATU! :lol:

[n] - voiced alveolar nasal. Incredibly common, with very few languages (among them Samoan and Rotokas) lacking it.

[n͡m] - voiced labial-alveolar nasal. VERY rare; only attested in one language of Papua New Guinea.

[ɳ] - voiced retroflex nasal. For the most part, limited to India, Australia, and Scandinavia. Vietnamese also has it as a variant.

[ɲ] - voiced palatal nasal. Surprisingly common, especially in Europe. A number of languages have this contrastively (most Romance languages, Hungarian, Czech, Slovak, most South Slavic languages, Irish Gaelic, Albanian), with others having it as a variant.

[ŋ] - voiced velar nasal. Not hugely common but not super-rare, either. Occurs frequently as a variant of [n] before velar plosives/fricatives/affricates. In languages where it occurs contrastively it is sometimes consigned to the part of a syllable after the vowel (as in languages like English; this is more common amongst Indo-European languages with the sound) but sometimes not (certain African languages, Samoyedic languages, some Caucasian languages, Chinese languages, several languages of the Philippines).

[ŋ͡m] - voiced labial-velar nasal. Rare, and primarily occurs in West and Central West Africa.

[ɴ] - voiced uvular nasal. Rarer, and usually occurs as a variant of [n] or [ŋ] preceding another uvular. Attested as contrastive in some Eskaleut languages (Inuktitut, Kalaallisut) and one language of Papua New Guinea.

(remember, anything beyond uvular is impossible for a nasal)
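The variant patterns mentioned in the nasal entries above can be sketched in a few lines of Python. This is only a rough illustration, not a model of any real language: the function name `nasal_variant` and the consonant sets are made up for the example, and the idea that [ɱ] shows up specifically *before* [f] or [v] is my reading of the [ɱ] entry rather than something it states outright.

```python
# Hypothetical consonant sets, just big enough for the illustration.
LABIODENTALS = {"f", "v"}
VELARS = {"k", "g", "ɡ", "x", "ɣ"}
UVULARS = {"q", "ɢ", "χ", "ʁ"}

def nasal_variant(nasal, following):
    """Return the variant of `nasal` expected before the sound `following`."""
    if nasal == "m" and following in LABIODENTALS:
        return "ɱ"   # assumed: [m] turns labiodental before [f]/[v]
    if nasal == "n" and following in VELARS:
        return "ŋ"   # [n] turns velar before velar consonants
    if nasal in {"n", "ŋ"} and following in UVULARS:
        return "ɴ"   # [n] or [ŋ] turns uvular before another uvular
    return nasal     # otherwise the nasal is left alone

print(nasal_variant("n", "k"))  # ŋ
print(nasal_variant("m", "f"))  # ɱ
```

Whether a given language actually applies any of these is a language-by-language question; the sketch only shows the general shape of "nasal copies the place of the next consonant."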

Voiceless nasals are usually variants, but not always. On rare occasion, they are contrastive. The Hmong-Mien language family is known for its voiceless nasals - heck, there's one even in the name of the family!

[m̥] - voiceless bilabial nasal. Occurs contrastively in the Hmong-Mien family, Burmese, Yupik, Shixing, Kildin Sami, and Mazatecan languages, and a few others.

[n̥] - voiceless alveolar nasal. Occurs contrastively in the Hmong-Mien family, Burmese, Yupik, Shixing, Kildin Sami, and Mazatecan languages, and a few others.

[ɳ̊] - voiceless retroflex nasal. Only occurs contrastively in one language of New Caledonia.

[ɲ̊] - voiceless palatal nasal. Occurs contrastively in Mazatecan languages, the Hmong language, Burmese, and Shixing.

[ŋ̊] - voiceless velar nasal. Occurs contrastively in Mazatecan languages, Burmese, Yupik, and Shixing.


There are five approximants that are actually known to contrast with or at least differ from their fricative counterparts:

[ʋ] - voiced labiodental approximant. The "soft V" found in Dutch, North Germanic languages, Finnish, Czech, and some South Slavic languages. Not particularly common outside of Europe, but it does occur.

[ɹ] - voiced alveolar/postalveolar approximant. Your stereotypical "English R," fairly rare and tends to be dialectal and a variant in the languages it does occur in. Primarily occurs in Indo-European languages, but is also attested in Burmese, Vietnamese, and Igbo. Is considered a vowel in some dialects of Mandarin.

[ɻ] - voiced retroflex approximant. Quite rare, mainly occurs in India and Australia, but is also attested in southern South America, and dialectally in certain Germanic languages, including English.

[j] - voiced palatal approximant. One of the two most common approximants, occurring in a wide number of languages (written with "y" in English); it is the consonantal equivalent of [i]. Sometimes takes on noise and becomes [ʝ], but this is a variant.

[ɰ] - voiced velar approximant. Very rare contrastively, primarily occurring as a variant of [k], [g], or [ɣ]. The consonantal equivalent of [ɯ].

And then there're these gems:

[l] - voiced alveolar lateral approximant. VERY common. Often has a velarised variant; occasionally this [lˠ] is contrastive with [l] (Albanian, for example)

[ɭ] - voiced retroflex lateral approximant. Rare. Attested primarily in Dravidian and North Germanic languages, also in Australia, and in Korean and Khanty.

[ʎ] - voiced palatal lateral approximant. Not really common, not really rare. A large number of Indo-European languages (of Europe, anyway) have this sound, as do Basque and Hungarian. Outside of Europe, the best examples are in Quechua, Aymara, and more neutral American and Canadian dialects of English.

[ɥ] - voiced labialised palatal approximant. VERY rare contrastively, occurring in Abkhaz and Iaai. (Not sure if there are others.) However, it is more common as a variant, with languages with the vowel [y] and [w] usually having it as a variant of [w] (such as some Chinese languages, Korean, and Shixing); occurs in French as a variant of [y].

[ʟ] - voiced velar lateral approximant. Acoustically almost indistinguishable from the velarised alveolar; the one that is simply velar is much rarer and attested only in the Pacific, and in Scots.

[w] - voiced labio-velar approximant. One of the two most common approximants, occurring in a wide variety of languages. Is the consonantal equivalent of [u].

[ʟ̠] - voiced uvular lateral approximant. Unattested contrastively but apparently possible, and tentatively attested as a variant in certain American Englishes. How, I don't know.

Contrastive voiceless approximants are generally very rare, as voiceless approximants mainly occur as variants of their voiced counterparts following voiceless plosives or affricates. Only those attested as contrastive are listed below.

[ʋ̥] - voiceless labiodental approximant. Attested contrastively only amongst English-speaking South Africans of Indian extraction.

[l̥] - voiceless alveolar lateral approximant. Usually a variant of [l], only attested contrastively in Shixing, Tibetan, and Moksha.

[ɭ̊] - voiceless retroflex lateral approximant. Attested contrastively only in the Iaai language of New Caledonia and the Dravidian language Toda.

[j̊] - voiceless palatal approximant. Attested contrastively in some Samic languages, Moksha, and Jalapa Mazatec (possibly other Mazatecan languages as well).

[ʎ̥] - voiceless palatal lateral approximant. Attested contrastively only in Shixing.

[ʍ] - voiceless labio-velar approximant. Attested contrastively in a number of English dialects (including Southern American English; these dialects preserve an audible contrast between the words "whine" and "wine," with this being the sound in "whine") and Hupa; Old English had this sound. In many languages it takes on noise and becomes [xʷ] instead.


As with approximants and nasals, flaps (or taps) are usually voiced.

[ⱱ̟] - voiced bilabial flap. VERY rare - attested primarily in scattered African languages - and not yet proven to be contrastive in any language, rather being a variant of...

[ⱱ] - voiced labiodental flap. Quite rare - attested in scattered African languages, mainly in Central Africa.

[ɾ] - voiced coronal flap. Don't worry about the term "coronal" for now - I'll expand on that when I do phonology. The typical place of articulation for this is alveolar, but it can also be post-alveolar or dental, and these never contrast. Anyway, these are somewhat common as an "r-sound" in language, as in Spanish, Turkish, and Arabic, and also as a variant in Russian. It does occasionally occur as a variant of [d] as well, as in North American Englishes, Estuary English, and Danish. There's even a nasalised variant of this - [ɾ̃] - that occurs in several North American Englishes as a merger of [n] and [t]. (It isn't as common in Canada.)

[ɺ] - voiced coronal lateral flap. VERY rare, the only major language with this sound is Japanese; occurs as a postalveolar in one particular dialect of Norwegian.

[ɽ] - voiced retroflex flap. As with most retroflex sounds, this is most common to North Germanic, and languages of India and Australia, but in the cases of the North Germanic languages it occurs as a variant of laterals.

[ɭ̆] - voiced retroflex lateral flap. Very rare; occurs primarily in South and Southeast Asia, the most notable contrastive example being Pashto.

[ʟ̆] - voiced velar lateral tap. Only attested in two languages of Papua New Guinea, and not contrastively. It is able to be a tap on the soft palate because air can still pass around the edges of the tongue, while the time of contact is still very short.

[ɢ̆] - voiced uvular flap. Very rare and never occurs in contrast. German, Dutch, and Limburgish are attested to have it.

[ʡ̮] - voiced epiglottal flap. Only attested in one language - Dahalo - and even there it isn't contrastive.

Voiceless flaps do occur, but only as variants. They've never been attested in contrast.


[ʙ] - voiced bilabial trill. Like mimicking a horse, except with voicing! :p Occurs in diverse languages on most continents, but is quite rare.

[r] - voiced coronal trill. Usually alveolar, and quite common as an "r-sound."

[ɽ͡r] - voiced retroflex trill. VERY rare - attested contrastively only in three languages and tenuously in some dialects of Dutch.

[ʀ] - voiced uvular trill. Never contrasts with the fricative [ʁ], and is found as the "guttural R" of languages like French, Portuguese, German, some dialects of Dutch, and Hebrew, among others; most of the languages in question are Indo-European.

[ʢ] - voiced epiglottal trill. Basically a growl; only attested as a speech sound in the Aghul language of Dagestan, Russia, and in some dialects of Arabic.

Voiceless trills are relatively rare in contrast. Only those attested in contrast are shown.

[r̥] - voiceless coronal trill. This does occur in contrast in some languages, such as Icelandic, Moksha, and Welsh.

[ʜ] - voiceless epiglottal trill. Patterns like a fricative; still quite rare, being attested in languages like Chechen, Dahalo, Haida, and some dialects of Arabic.


Ejectives come in three subtypes. There are ejective plosives, ejective affricates, and ejective fricatives. These are always voiceless. All are fairly rare, but ejective plosives and affricates are more common than ejective fricatives.

Ejective plosives

If a language has ejectives, it WILL have ejective plosives.

[p'] - bilabial ejective plosive. Found in scattered languages, but particularly common in the Americas, southern Africa, the Cape Horn area, and the Caucasus (including even Armenian, which is an Indo-European language, quite possibly the only one to have contrastive ejectives).

[t'] - alveolar/dental ejective plosive. While these are written with the same symbol in most cases, the Dahalo language of Kenya has both. Otherwise, in much the same way as the bilabial, these are particularly common in the Americas, southern Africa, the Cape Horn area, and the Caucasus.

[ʈʼ] - retroflex ejective plosive. VERY rare, supposedly occurs in Gwich'in, a Na-Dene language primarily spoken in northwestern Canada and western Alaska.

[cʼ] - palatal ejective plosive. Very rare, occurring mainly in North American indigenous languages, like Haida for example; also attested in isolated languages in southern Africa and the Caucasus, and in Hausa, a major trade language of central west Africa.

[k'] - velar ejective plosive. Found in scattered languages, but particularly common in the Americas, southern Africa, the Cape Horn area, and the Caucasus. Labialised versions of this are common in the Caucasus and western North America.

[q'] - uvular ejective plosive. Limited mainly to the Caucasus and western North America; labialised versions of this occur in Northwest Caucasian, Salishan, Wakashan, and some Na-Dene languages. Palatalised versions occur in Abkhaz and occurred in now-extinct Ubykh, which also had pharyngealised and labialised-pharyngealised versions as well!

[ʡʼ] - epiglottal ejective plosive. VERY rare, only attested in Dargwa, a Northeast Caucasian language.

Ejective affricates

[t͡sʼ] - alveolar ejective affricate. Primarily attested in languages of the Caucasus and the Pacific Northwest.

[t͡ɬ'] - alveolar lateral ejective affricate. I LOVE this sound. As with the above, it is primarily attested in languages of the Caucasus and the Pacific Northwest.

[t͡ʃʼ] - palato-alveolar ejective affricate. Primarily attested in languages of the Caucasus and the Pacific Northwest.

[ʈ͡ʂʼ] - retroflex ejective affricate. Very rare. Attested in Adyghe, a Northwest Caucasian language. Proposed for Avar and Yokutsan languages.

[c͡ʎ̝̥ʼ] - palatal lateral ejective affricate. VERY rare. Attested in Dahalo, and an African language isolate called Hadza.

[k͡xʼ] - velar ejective affricate. Fairly rare. Occurs mainly in southern Africa, esp. Zulu and Xhosa, but also attested in Haida and Hadza.

[k͡ʟ̝̊ʼ] - velar lateral ejective affricate. VERY rare, only occurring contrastively in Archi, a Northeast Caucasian language, and in that language having plain and labialised forms thereof. Occurs in some southern African languages as a variant of [k͡xʼ].

[q͡χʼ] - uvular ejective affricate. The ultimate throat-killer. :p Occurs as a variant of [q'] in Northeast Caucasian and also Salishan languages. It's actually easy to go from one to the other because the force of an ejective causes vibration of the uvula and therefore some noise.

Ejective fricatives - these are all VERY rare.

[fʼ] - labiodental ejective fricative. Attested in Kabardian, a Northwest Caucasian language. Proposed for Yapese.

[θʼ] - dental ejective fricative. Attested in South Arabian languages (esp. Mehri) and also in Yapese.

[sʼ] - alveolar ejective fricative. Mostly found in scattered North American indigenous languages; attested dialectally in Adyghe and Hausa.

[ɬ’] - alveolar lateral ejective fricative. Found in most Northwest Caucasian languages (not Abkhaz, though, which actually doesn't have ejective fricatives); also attested in Tlingit.

[ʃʼ] - palato-alveolar ejective fricative. Attested in Adyghe and the Keresan languages.

[ʂʼ] - retroflex ejective fricative. Attested in the Keresan languages.

[x’] - velar ejective fricative. Attested in Tlingit.

[χ’] - uvular ejective fricative. Attested in Tlingit, and supposedly Georgian as a variant.


Implosives only have one form - they pattern in the same way as plosives. Implosives are more commonly voiced. They occur most frequently in Sub-Saharan Africa and Southeast Asia, but also happen in the Sindhi language and the Saraiki dialect of Punjabi, in Pakistan.

[ɓ] - voiced bilabial implosive. Occurs in a number of African and Southeast Asian languages, and also in Sindhi and Saraiki. Apparently attested in Southern American English at the beginning of words. (Not sure I buy that. ;) ) Notable languages with it include Vietnamese, Khmer, and Hausa. Also attested in some Mayan languages of Guatemala.

[ɗ] - voiced alveolar implosive. Occurs in a number of African and Southeast Asian languages, and also in Sindhi and Saraiki. Notable languages with it include Vietnamese, Khmer, and Hausa.

[ᶑ] - voiced retroflex implosive. Much rarer; attested in Ngadha, a language of Indonesia, and Oromo, a major vernacular language of Ethiopia.

[ʄ] - voiced palatal implosive. Not as common as some implosives, and not attested in Southeast Asia, being mainly used in Africa. Does appear in Sindhi and Saraiki, however.

[ɠ] - voiced velar implosive. Occurs primarily in Africa, but also occurs in Sindhi and Saraiki.

[ʛ] - voiced uvular implosive. Occurs in actual language in a very odd place; the Mam language of Guatemala. Apparently no other language has it, which is odd in the sense that it is outside the regular occurrence zones of implosives. Used by many other languages, though, as a way to mimic gulping sounds!

There are voiceless implosives attested; those few languages confirmed using them in contrast are mostly in Africa.

[ɓ̥] - voiceless bilabial implosive. Attested contrastively in Serer-Sine, a Senegalese language, and in the Owere dialect of Igbo in Nigeria.

[ɗ̥] - voiceless alveolar implosive. Attested contrastively in Serer-Sine and in the Owere dialect of Igbo.

[ʄ̊] - voiceless palatal implosive. Attested contrastively only in Serer-Sine.

[ʛ̥] - voiceless uvular implosive. Attested in Kaqchikel and Q’anjob’al, both Mayan languages spoken in Guatemala. These are also the only languages on the books that are claimed to have a voiceless implosive without having the voiced counterpart.


Only one language outside of Africa has ever been attested as having clicks, and that language (in Australia) is now extinct. Clicks have numerous contrastive forms, but only the baseline forms are shown here.

[ʘ] - bilabial click. Not a kissing sound, rather a lip-smacking sound done with non-puckered lips. Attested in the Tuu and Kxa languages of southern Africa.

[ǀ] - dental click. Used as an actual speech sound in a number of southern African languages, including Zulu, Xhosa, and the languages formerly designated as "Khoisan." (This term fell out of use when it was determined that those languages didn't necessarily form a provably cohesive language family.) Used paralinguistically in English to denote pity or disapproval, written "tsk-tsk."

[ǃ] - alveolar click. Occurs in a number of southern African languages, including Xhosa, Zulu, Sesotho (which has no clicks at any other place of articulation), and the languages formerly designated as "Khoisan."

[ǂ] - palatal click. Only occurs in languages formerly designated as "Khoisan."

[ǁ] - lateral click, formed by making the click on the side of the mouth rather than in the middle or across the whole mouth. Used in many southern African languages, including Xhosa (the name of which has the aspirated variant of this), Zulu, and the languages formerly designated as "Khoisan."

[ǃ˞] - retroflex click. Extremely rare, actually only being attested in the Central !Kung language of Namibia.


Unlike consonants, which occur in more or less a specific spot in the mouth, vowels have a larger range and are often relative to the language. Still, there are specific formant frequency ranges that are set out for each vowel. You would never hear an [i] with the formants of an [ɑ], for example. Anyway.

Keep in mind as well, there are two different conventions for vowel height. Vowels can also be contrastively long and short, which they were in Latin and Old English, and remain in languages like Finnish, Japanese, Navajo, etc. They can be contrastively nasalised in languages like the Na-Dene languages of North America (Navajo, Tlingit, Apache, Dakelh, Tsilhqut'in, etc.), French, most dialects of Portuguese, Polish, etc. Furthermore, phonation can vary; some languages have contrastive creaky voice. (Creekynoise? :p [/insidejoke])

What are shown here are your basic segmental vowels.
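Since the two height conventions get used side by side in every vowel entry in this post, here is a small Python sketch of how they pair up and how a vowel description is put together from height, backness, and rounding. The names `HEIGHTS` and `describe_vowel` are just made up for the illustration.

```python
# The two naming conventions for vowel height, paired from closest to most open.
HEIGHTS = [
    ("close", "high"),
    ("near-close", "near-high"),
    ("close-mid", "high-mid"),
    ("mid", "mid"),
    ("open-mid", "low-mid"),
    ("near-open", "near-low"),
    ("open", "low"),
]

def describe_vowel(height_index, backness, rounded):
    """Build a label like 'close/high front unrounded vowel'."""
    close_open, high_low = HEIGHTS[height_index]
    # "mid" is the same word in both conventions, so don't print it twice.
    height = close_open if close_open == high_low else f"{close_open}/{high_low}"
    rounding = "rounded" if rounded else "unrounded"
    return f"{height} {backness} {rounding} vowel"

print(describe_vowel(0, "front", False))  # close/high front unrounded vowel
print(describe_vowel(3, "back", True))    # mid back rounded vowel
```

The first call gives the label for [i] and the second the label for [o̞]; length, nasalisation, and phonation would be extra features layered on top of these three.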

Front Vowels

[i] - close/high front unrounded vowel. Among the most common vowel sounds - most languages have this sound in contrast. Can't actually think of a language off the top of my head that doesn't. It isn't always a "clean" [i], though; some dialects of English have a slight diphthong instead, [ɪj].

[y] - close/high front rounded vowel. Considerably less common than its unrounded counterpart. Occurs primarily in Europe and North/Central Asia. English is probably the only major Germanic language that doesn't have it cross-dialectally, and even some dialects of English (most stereotypically Scottish) have it. Major languages with this sound include French, German, Mandarin, Cantonese, Wu, Dutch, Turkish, Hungarian, and Finnish.

[ɪ] - near-close/near-high front unrounded vowel. Not as common as its close/high counterpart, but still occurs fairly widely, particularly in Europe and Africa. In English, this replaced the short [i] after vowel-shift, and prescriptive grammars still call it "short-I," but there is more than just length that is a factor here; other West Germanic languages follow suit in this regard.

[ʏ] - near-close/near-high front rounded vowel. Fairly rare. Almost all the languages that use it are Indo-European, as a short variant of [y], although it is also attested in Turkish and Hungarian. On top of this, the bulk of the languages from Indo-European that use it are either themselves Germanic (Dutch, Icelandic, Swedish, Faeroese, German, Limburgish, Norwegian) or have a historically heavy Germanic influence (French; one could possibly make this argument for non-Indo-European Hungarian as well).

[e] - Close-mid/high-mid front unrounded vowel. As with [i], this is sometimes not a "clean vowel" but a diphthong. Fairly common; occurring in some "5-vowel systems" and more or less all "7-vowel" and "9-vowel" systems. (I'll talk about that in phonology). Noteworthy examples of this being found clean are Cantonese, German, French, Hindi, Scottish English, and Arabic.

[ø] - Close-mid/high-mid front rounded vowel. A little less common, occurring primarily in Europe and Northern/Central Asia. Occurs in most Germanic languages and in French (why the crap isn't it in more standard English, then? :p) and also occurs in Turkish, Wu, and Hungarian, among others.

[e̞] or [ɛ̝] - Mid front unrounded vowel. This sound occurs in languages that don't have a contrast between two mid-range front unrounded vowels, such as Spanish, Finnish, Japanese, Romanian, a number of Slavic languages, Hebrew, and Tagalog, among others.

[ø̞] or [œ̝] - Mid front rounded vowel. Fairly rare. Uralic languages with a front rounded vowel in the mid-range, like Finnish, Estonian, Võro, and Hungarian, have this sound, as does Turkish. In a number of Germanic languages, English included, it occurs dialectally, although in some cases this is argued.

[ɛ] - open-mid/low-mid front unrounded vowel. Fairly common. Occurs in some "5-vowel," "7-vowel," and "9-vowel" systems, although in some of these, it is supplanted by [ə].

[œ] - open-mid/low-mid front rounded vowel. Rarer, and typically a variant of [ø]. The only language I can think of where these two actually contrast is French, which is a nightmare for me, because I have trouble telling the two apart even now after years of training! :lol:

[æ] - near-open/near-low front unrounded vowel. Fairly rare. In a number of languages that have it, it only occurs in certain dialects. Some languages that consistently have it are English, Finnic languages (Finnish, Estonian, etc.), Northern Azeri, Farsi, and Tsilhqut'in (I can verify this from personal experience as I studied the language with two speakers in my undergrad). In Southern American English, it contrasts with [a].

[a] - open/low front unrounded vowel. Fairly common. Since most languages have only a single low-range vowel, this is either the default form of it, as in Spanish, Mandarin, Arabic etc., or a variant thereof. A number of "5-vowel," "7-vowel," and "9-vowel" systems have this, with this being the only one that is unpaired in those cases.

[ɶ] - open/low front rounded vowel. Extremely rare and only attested in a few Germanic languages as a variant. Probably because it would make you look like a contortionist trying to pronounce the thing while keeping your lips rounded!

Central vowels

With the exception of [ə], the mid-range of the central vowels is hard to find in contrast. This is because in usage, they seldom if ever contrast with their back counterparts.

[ɨ] - close/high central unrounded vowel. [ə] aside, this might be the most common central vowel attested in language, appearing contrastively in a large range of languages in various places around the world. Russian, Uzbek, Mongolian, Irish Gaelic, and Mandarin have the prototypical examples of this sound.

[ʉ] - close/high central rounded vowel. Typically a variant of [u]; dialectal in English, contrastive in Swedish and Norwegian, and in fact Swedish contrasts three high rounded vowels, a rarity in language.

[ɘ] - close-mid/high-mid central unrounded vowel. Rare in contrast, occurs dialectally to a degree in English, but only consistently in languages like Kazakh and Skolt Sami. Often freely variant with back vowel [ɤ].

[ɵ] - close-mid/high-mid central rounded vowel. Rare in contrast, occurring definitively in languages such as Cantonese, Mongolian, and Tajik but being conflated with other vowels in a number of other supposed attestations.

[ə] - mid central vowel. If a language reduces vowels, it will have this as a variant, and a LOT of languages reduce vowels! It is much less common contrastively, but still occurs fairly frequently; Northwestern and Northeastern Caucasian languages, indigenous languages of Western North America, Indo-Iranian languages (Hindi, Punjabi, Marathi, Kurdish), Armenian, Albanian, Palauan, and even French have this contrastively. Seriously, I remember my French alphabet. [a], [be], [se], [de], [ə], [ɛf], [ʒe], [aʃ], [i], [ʒi], [ka], [ɛl], [ɛm], [ɛn], [ɔ], [pe]... :P

[ɚ] - rhotacised mid central vowel. This can also be written as a syllabified [ɹ]. Very rare in terms of number of languages that use it, but for number of speakers? Most North American Englishes (New England or certain New York or Southern sub-dialects notwithstanding) have this, as do some widely-spoken dialects of Mandarin.

[ɜ] - open-mid/low-mid central unrounded vowel. Rare in contrast, but surprisingly, Queen's English (RP) is one of the languages that does use it, replacing the sequence [ə] + [ɹ]. Minority languages like Paicî (New Caledonia) and Ladin (northern Italy) have it in contrast as well.

[ɞ] - open-mid/low-mid central rounded vowel. Mainly occurs as a variant, occasionally occurring dialectally, and only occurring consistently in one language, the Kashubian language of Poland.

[ɐ] - near-open/near-low central unrounded vowel. Surprisingly common as a variant; rarer in contrast, but is considered the "citation form" corresponding to the letter "a" in languages like Catalan, Cantonese, and the Baltic languages. A rounded equivalent does occur in one language, the Sabiny language of Uganda.

Back vowels

[ɯ] - close/high back unrounded vowel. While it never contrasts with [ɨ], the two are quite distinct nonetheless. Somewhat common, occurring in many Turkic and Mongolic languages, some Chinese languages, some languages of southeast Asia, Korean, and Scottish Gaelic.

[u] - close/high back rounded vowel. Very common, easily the most common rounded vowel in any world language. English doesn't have a clean [u] per se, with it instead being partly diphthongised. Some languages, like Japanese, Wu, Swedish, and Norwegian, have a form where the lips are rounded but don't stick out (called "compressed"), giving it a slightly different sound. But typically the rounding of this vowel results in protruded lips.

[ʊ] - near-close/near-high back rounded vowel. Fairly common, whether in five/seven/nine-vowel systems, as a variant of [u] as in Russian or Québecois French, or as a development from a historic short [u] as in English; it does contrast in English (compare the words soot [sʊt] and suit [sut]). There is an unrounded vowel of this height and backness, but it occurs strictly as a variant or a dialectal sound.

[o] - close-mid/high-mid back rounded vowel. Quite common. Many English dialects don't have a clean [o]; languages that do include French, German, some dialects of English (Scottish English, Singlish, Indian English), and a number of others, many of which have 7-vowel or 9-vowel systems. Wu Chinese has a "compressed" form of this vowel.

[ɤ] - close-mid/high-mid back unrounded vowel. Fairly rare. Languages that do have it include some languages of East and Southeast Asia, most notably Mandarin, Taiwanese, and Thai.

[o̞] - mid back rounded vowel. As with the front vowels, back vowels "go mid" when they don't distinguish multiple vowels in the mid-range. Finnic languages, Spanish, several Slavic languages, Japanese, Turkish, and Hebrew are among those that have this vowel.

[ɤ̞] - mid back unrounded vowel. Quite rare. Estonian, Võro, Danish, and Bulgarian have this sound attested, as do certain dialects of English (Norfolk and Cardiff supposedly) and Vietnamese.

[ɔ] - open-mid/low-mid back rounded vowel. Common, and from what I've seen it's probably more common than [o]. This "open O" occurs in all 7-vowel and 9-vowel systems and in a fairly large number of 5-vowel systems. Its place in English is dialectally varied, but in R-retaining dialects it is generally the variant of [o] before [ɹ]. In R-drop dialects, "or" generally becomes a long [o], usually written [oː].

[ʌ] - open-mid/low-mid back unrounded vowel. Not really that common. Occurs in a large number of English dialects (especially in North America), though, as the vowel in words like "butt," "shut," "fronting," etc. Also appears in Standard Korean, Tamil, and as a variant of [ə] in Salishan languages.

[ɑ] - open/low back unrounded vowel. The only unrounded back vowel that is more common than its rounded counterpart, and the continuum between it and centralised [ä] gives us quite possibly the most common vowel sound area in language. Arguably all languages have a vowel somewhere in this formant range; the only other vowel that is in the same range of commonality is [i]. A language contrasting three low vowels is almost unheard of, although apparently Skolt Sami does so, contrasting this, [æ], and [ɐ].

[ɒ] - open/low back rounded vowel. In contrast to its unrounded counterpart, this is very rare; it's usually dialectal or a variant of something else. An exception is its frequency in a large number of dialects of English (including my own, "Western Canadian"); it is also attested in Farsi, Uzbek, Hungarian, the dialects of the Western Desert in Australia, and Assamese. In most cases, the vowel-rounding is less pronounced than in other rounded vowels.

Welp, that's the end of the basics of phonetics. (Your poor brain is probably reeling by now. :lol: )
Edited by Jarkko, Aug 25 2016, 03:48 AM.
Certified Mutant

Nice :D . I bet that took a while to write up!

I did have a go at reproducing some of those rare/unattested sounds....without any success whatsoever :lol: . I blame Eric.
Christian. Exterminator of Spammers.

Don't worry too much about that. Not even the best phoneticians can nail everything. I would hate to have to learn to pronounce Ubykh because there are so many fine distinctions that actually contrast meaning in words. (This is more phonology than phonetics, of course, but phonetics is an absolute necessity for phonology anyway.)
Certified Mutant

So, when do you expect to have the next lesson up :P ?
Christian. Exterminator of Spammers.

I need to spend more time working on this, but for now, I'll give you this.


So having seen all these sounds, you think, "Okay, seems simple enough! I think I can learn all these languages now!" Not so fast. A language may use a large number of phonetic sounds, but you'll find out quickly enough that pairs or groups of these can count as the same sound in that language, or in various contexts within it. When someone asks how many sounds are in a language, they usually want to know how many distinct sounds there are, and that comes down to how the sounds pattern and make sense within each individual language.

This is in the realm of what is called "phonology."

You know how I used the word "contrastively" a lot when describing the individual sounds? Well, that was actually cheating a little. That's something you would hear more in a phonology class than in a phonetics class. And consistent sound contrast gives the listener/linguist an idea of what sounds are actually perceived in a language.

The first word you need to know is phoneme. You run into this word everywhere in phonology. It is the underlying abstract sound in the brain that can surface as one or more sounds in actual speech. The surface sounds tied to an underlying phoneme are called allophones. Every time I used the word "variant" in the section on phonetics and individual sounds, you could replace it with "allophone." The great thing about phonemes is that you write them down differently than you do allophones, or sounds in phonetic transcription as you would in a phonetics class or lab session. And for the rest of this series on linguistics, I intend to write words "phonemically" to simplify things a bit (and not have to abuse the "nocode" tag like I had to with phonetics ;) ). Phonetics uses square brackets for everything, but phonology only uses them for allophones. For phonemes /ðe ɑɹ ɹɪtn̩ bətwi:n slæʃəz/!

Sound Classes

I'll lay some groundwork before talking about phonemes a little more. Sounds pattern in various ways, and there are actually classes of sounds that lump together a lot of the places or manners of sounds.

There are four major place feature classes, although sometimes a sound will have more than one of these (co-articulated sounds generally do). First there are labial sounds, which comprise any sound using the lips as one of the articulators. Coronal sounds use the tip, blade, or front of the tongue, while dorsal sounds use the back. Laryngeal sounds are either made using the root of the tongue or just the glottis. As far as the different phonetic places go, here's how they line up:

Labial: bilabial, labiodental, linguolabial; rounded vowels are said to have a secondary labial feature.
Coronal: linguolabial, dental, alveolar, palato-alveolar, retroflex, alveolo-palatal
Dorsal: alveolo-palatal, palatal, velar, uvular; furthermore, all vowels are considered dorsal - perhaps controversially so in the case of low-back vowels - because the formants get their resonance from the area between the hard palate and the upper pharynx.
Laryngeal: uvular (rarely), pharyngeal, epiglottal, glottal

There is a two-way sonority contrast as well. Sonorants are sounds that are produced without turbulent airflow in the vocal tract, while obstruents have turbulence, resulting in noise or complete airflow stoppage. This is a grouping of manners.

Obstruents: plosives, affricates, fricatives
Sonorants: all other consonants, and also vowels.

And then there's continuancy. Continuants have continuous airflow through the mouth, while non-continuants do not. Notice I said, "through the mouth." Nasals are NOT continuants because the airflow goes through the nose. Other than this, plosives and affricates are non-continuants as well, while all other consonants, and vowels, are considered continuants.
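
To make the two manner groupings concrete, here is a minimal Python sketch of the sonority and continuancy classes just described. The manner labels and the `manner_features` helper are names made up for illustration; they are not standard notation, and the sketch only covers what is stated above.

```python
# Manner groupings as described above. The label strings are hypothetical
# names chosen for this sketch, not an official feature set.
OBSTRUENT_MANNERS = {"plosive", "affricate", "fricative"}
NON_CONTINUANT_MANNERS = {"plosive", "affricate", "nasal"}


def manner_features(manner):
    """Return the sonorant/continuant classification for a manner label."""
    return {
        # Everything that is not an obstruent counts as a sonorant.
        "sonorant": manner not in OBSTRUENT_MANNERS,
        # Nasals are non-continuants: the air goes through the nose.
        "continuant": manner not in NON_CONTINUANT_MANNERS,
    }


for manner in ("plosive", "fricative", "nasal", "approximant", "vowel"):
    print(manner, manner_features(manner))
```

Notice that nasals come out as sonorant but non-continuant, which is exactly the wrinkle pointed out above.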

More About Phonemes

Here's an example from Canadian English (the following is also true of most of Western American English, North Central American English, Northern American English, and "General American") of a phoneme and its allophones.

Let's take /t/ for example. When asked to pronounce this, most people will say [tʰ]. That's because English has what's called "aspiration." But we don't say /bɪt/ as [bɪtʰ]. Instead, we say either [bɪt] or [bɪt̚] (the diacritic on the second one is for an unreleased stop). On top of this, we never say "stick" [stʰɪk] or "schtick" [ʃtʰɪk]. There's no aspiration in those contexts. Some people don't even aspirate the first /t/ in "antidote." (I do, but the aspiration is weaker.) We do, however, aspirate the "t" in "tick" [tʰɪk] and "catastrophe" [kətʰæstɹ̥əfi].

So what happens is that /t/ becomes [tʰ] at the very beginning of a syllable and [t] elsewhere, with unreleased [t̚] as a free variant when it is the only sound after a vowel at the end of an utterance. (I'll talk about free variation later.) In some dialects the aspiration rule is even more specific, applying only at the beginning of a stressed syllable; I'll talk about stress later, because there are actually multiple types thereof. (Aspiration also exists in other Englishes, but the distribution of it is different.)

If only it were that simple. :p

In North American Englishes, if a /t/ is between a vowel and a syllabic approximant, it will instead become [ɾ]. So an underlying /bʌtɹ̩/ "butter" will actually surface as [bʌɾɹ̩], with the sound represented by the "tt" being relatively short in length. This is a "flapping process", which across languages mainly affects /r/ but also occurs with /t/, /d/, and sometimes even the sequence /nt/, which happens in some American Englishes and with certain speakers in Canada. Flapping actually falls under a larger group of processes called lenition, in which a sound becomes "weakened" for articulatory reasons.

As for aspiration, it is the opposite, which is called fortition. The sound becomes stronger. You'll need to remember this for when I talk about syllable structure, because both fortition and lenition play into it somewhat.

Back on track - the phoneme /t/ actually has more than two allophones - [t], [tʰ], [t̚], and [ɾ]. (There are actually more than this, but I'll get to that later.)
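
Here is a rough sketch of that /t/ rule as a Python function, covering only the four allophones listed so far. The function name and its three yes/no parameters are hypothetical, and putting flapping ahead of aspiration is an assumption made so that "butter" comes out right; treat it as an illustration of "environment A, environment B, elsewhere" rather than a full analysis.

```python
def t_allophones(syllable_initial=False,
                 flapping_context=False,
                 utterance_final_after_vowel=False):
    """Return the possible surface forms of /t/ in a given environment.

    flapping_context: /t/ sits between a vowel and a syllabic approximant.
    More than one result means the forms are in free variation.
    """
    if flapping_context:
        # Assumption: flapping wins over aspiration ("butter").
        return ["ɾ"]
    if syllable_initial:
        return ["tʰ"]          # "tick"
    if utterance_final_after_vowel:
        return ["t", "t̚"]      # "bit": released or unreleased
    return ["t"]               # elsewhere, e.g. "stick"


print(t_allophones(syllable_initial=True))             # tick
print(t_allophones())                                  # stick
print(t_allophones(utterance_final_after_vowel=True))  # bit
print(t_allophones(syllable_initial=True, flapping_context=True))  # butter
```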
Christian. Exterminator of Spammers.

Phonological processes

Last time I introduced the concept of the phonological process, but now I'll explain it a bit. It's how a phoneme becomes an allophone. Sometimes nothing happens at all! This would result in an elsewhere form (not sure if this is an actual linguistic term, but it's a helpful and accurate one nonetheless) - sound X becomes Y in environment A, Z in environment B, and X elsewhere.

I mentioned lenition and fortition last time. There are actually several processes that fall under these two umbrellas.


Starting with lenition: there is flapping, which I mentioned last time, but it isn't the most common process. It does happen with /t/ and /d/ in English, and often happens with /r/ in languages that have it. It tends to happen between vowels.

There is what is called spirantisation, where a plosive or affricate becomes a fricative, typically between vowels. We have a bit of this in English, but there's actually more going on than just simple spirantisation, and I'll bring examples of this up later. An example of pure spirantisation would look more like this:

/bat͡sa/ -> [basa]

/baga/ -> [baɣa]

Intervocalic voicing happens when a voiceless sound takes on voicing between two vowels. It's common, and actually, Old English had it - the remnants of it can be seen in very old words where the second vowel has since become silent, as in "knife" vs. "knives."

An example would look like this:

/kasa/ -> [kaza]
/tupo/ -> [tubo]

There's also post-nasal voicing. This doesn't happen in English (or any Germanic language for that matter) but it does happen in some languages - an underlying voiceless obstruent will become voiced after a nasal.

/sompa/ -> [somba]
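
The two voicing rules above can be sketched as string rewrites in Python. This is a toy: it assumes a made-up language with only the five vowels and the handful of consonants listed in the code, and the function names are invented for the example.

```python
import re

# Toy inventory, assumed for this sketch only.
VOWELS = "aeiou"
NASALS = "mnŋ"
VOICED = {"p": "b", "t": "d", "k": "g", "s": "z", "f": "v"}


def _voice(match):
    """Swap a voiceless obstruent for its voiced counterpart."""
    return VOICED[match.group()]


def intervocalic_voicing(form):
    # Voiceless obstruent with a vowel on both sides becomes voiced.
    return re.sub(rf"(?<=[{VOWELS}])[ptksf](?=[{VOWELS}])", _voice, form)


def post_nasal_voicing(form):
    # Voiceless obstruent right after a nasal becomes voiced.
    return re.sub(rf"(?<=[{NASALS}])[ptksf]", _voice, form)


print(intervocalic_voicing("kasa"))  # the /kasa/ example
print(intervocalic_voicing("tupo"))  # the /tupo/ example
print(post_nasal_voicing("sompa"))   # the /sompa/ example
```

The lookbehind and lookahead only check the neighbours without consuming them, so the word-initial consonants are left alone.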


Fortition processes are less common in general, and many of them happen to sounds over time rather than in active usage. However, there is one particularly common fortition process called final devoicing. It's pretty self-explanatory - you get an underlyingly voiced segment that becomes voiceless at the end of a word. An example from German:

/hand/ -> [hant] "Hand" (gee, I wonder what this means? :P)

It also happens in Slavic languages. An example from Russian:

/ɔlʲɛg/ -> [ɐlʲɛk] "Oleg" (bloke's name - this name will come up as an example later as well)

Some languages have post-nasal fortition; while the lenition process usually deals with voice, this process deals with continuancy, and usually gives it the ol' heave-ho. :P Examples I've seen generally follow a pattern like this:

/kumva/ -> [kumba]
/kinzi/ -> [kindi]
/nrasa/ -> [ndasa]
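
For comparison with the lenition rules, here is a small Python sketch of the two fortition processes just described. The segment tables are limited to what the examples need, the helper names are hypothetical, and the Russian vowel change in "Oleg" is a separate process that this sketch deliberately ignores.

```python
import re

# Voiced obstruents and their voiceless partners (toy inventory).
DEVOICED = {"b": "p", "d": "t", "g": "k", "z": "s", "v": "f"}
# Continuants and the plosives they harden into after a nasal,
# following the pattern in the examples above.
HARDENED = {"v": "b", "z": "d", "r": "d"}


def final_devoicing(form):
    """Devoice a word-final voiced obstruent."""
    if form and form[-1] in DEVOICED:
        return form[:-1] + DEVOICED[form[-1]]
    return form


def post_nasal_fortition(form):
    """Turn a continuant into a plosive right after a nasal."""
    return re.sub(r"(?<=[mnŋ])[vzr]", lambda m: HARDENED[m.group()], form)


print(final_devoicing("hand"))
for word in ("kumva", "kinzi", "nrasa"):
    print(word, "->", post_nasal_fortition(word))
```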

Place Assimilation

Now these two process types could both be regarded as assimilation of features of one sound to another, in terms of continuancy (or lack thereof) or voicing. But there's also place assimilation. Although consonants will interact with one another from time to time, the most common triggers are actually vowels. A very common one is palatalisation, where the tongue is moved towards the hard palate. In some languages this is actually contrastive, such as East Slavic languages (Russian, Ukrainian, etc.) and a number of Uralic languages (not Finnish, but still including Estonian to a degree), but in others it is actually an allophonic process. Japanese is one of the easiest examples to explain, where underlying /si/ will surface as [ɕi]. This also happens in Polish (exactly the same) and Korean (slightly different, as the surface form is [ʃi] - untrained anglo ears wouldn't really be able to tell the difference ;)). This is why L1 speakers of Japanese and Korean struggle to differentiate "see" from "she" when learning English at first.

Velarisation occurs when the tongue is pulled back to the soft palate (velum). Aside from languages like Irish Gaelic and Russian, where velarisation is actually contrastive, this usually only happens with /l/, but is it ever common! Usually what happens is that /l/ will surface as [lˠ] at the end of a syllable. Now in some dialects of English (including many North American dialects), [lˠ] is actually the elsewhere form, if not the form that occurs everywhere. In Queen's English, though, [lˠ] will only occur after back vowels at the end of a syllable.

Labialisation will typically occur in the environment of a rounded vowel, and usually before it. The labial feature of a vowel will be passed back to the preceding consonant, resulting in a secondary articulation. This is actually not all that common, as usually when labialised consonants are talked about, they are contrastive.

Other contiguous assimilation processes

Not all voicing assimilation processes happen because of lenition. In Russian, there is an intuitive process where the second consonant in a cluster determines the voicing of both.

/futbɔl/ -> [fudbɑlˠ]

English has this going in the opposite direction, but it's only at the boundary of word parts. (I'll talk more about this in morphology)

/tɑkd/ -> [tʰɑkt] "talked." (Canadian/Western American/North Central American - the process is the same in other Englishes, but the vowels are different!)

These two actually demonstrate another point. Assimilation can move in either direction. When it moves towards the end of the word (as in the English example), it is referred to as progressive assimilation. If it moves towards the beginning of the word (as in the examples from Russian or Japanese), it is called regressive assimilation. I find regressive assimilation is more common, but progressive isn't exactly rare.
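
The difference in direction is easy to see in code. Below is a hedged Python sketch of voicing assimilation in obstruent clusters, run once right-to-left (regressive) and once left-to-right (progressive). The obstruent table and function names are assumptions for the example, and the vowel and /l/ changes in the Russian surface form are left out since they are separate processes.

```python
# Voiceless/voiced obstruent pairs (toy inventory for this sketch).
PAIRS = [("p", "b"), ("t", "d"), ("k", "g"), ("f", "v"), ("s", "z")]
TO_VOICED = dict(PAIRS)
TO_VOICELESS = {voiced: voiceless for voiceless, voiced in PAIRS}
OBSTRUENTS = set(TO_VOICED) | set(TO_VOICELESS)


def match_voicing(target, trigger):
    """Give `target` the same voicing as `trigger`."""
    if trigger in TO_VOICELESS:          # trigger is voiced
        return TO_VOICED.get(target, target)
    return TO_VOICELESS.get(target, target)


def regressive_voicing(form):
    """The second obstruent of a cluster decides (Russian-style)."""
    segs = list(form)
    for i in range(len(segs) - 2, -1, -1):
        if segs[i] in OBSTRUENTS and segs[i + 1] in OBSTRUENTS:
            segs[i] = match_voicing(segs[i], segs[i + 1])
    return "".join(segs)


def progressive_voicing(form):
    """The first obstruent of a cluster decides (English -ed style)."""
    segs = list(form)
    for i in range(1, len(segs)):
        if segs[i] in OBSTRUENTS and segs[i - 1] in OBSTRUENTS:
            segs[i] = match_voicing(segs[i], segs[i - 1])
    return "".join(segs)


print(regressive_voicing("futbɔl"))   # /tb/ cluster, voicing spreads leftwards
print(progressive_voicing("tɑkd"))    # /kd/ cluster, voicelessness spreads rightwards
```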


Dissimilation usually happens in morphophonology, where a sound becomes less like its preceding sound for ease of pronunciation. More on that later.


Epenthesis is when a sound is inserted, usually a vowel into a tough consonant cluster, for ease of pronunciation. It doesn't happen quite as much in English, although the "e" (an epenthetic schwa [ə]) in "kisses," which separates two alveolar fricatives, is one example.


Metathesis is when any two sounds are reversed for ease of pronunciation. This is more common in English than one might think, especially dialectally. For example, the oft-lampooned North American (especially Southern American) pronunciation of nuclear, that is, [nukj̊əlɚ], is actually metathesis - the more "accepted" pronunciation is [nukl̥i(ə)ɹ] (so you have not only metathesis of the [i], which is shortened to [j], with the [l], but also epenthesis of a schwa after the fact). Now in Hebrew, metathesis is an active part of the verb system, with the reflexive hitpael construction metathesising the first root consonant with the last consonant of the prefix hit- if the root consonant is coronal. In the SENĆOŦEN language of the Greater Victoria area in BC, metathesis in and of itself has a grammatical meaning, marking the "actual" aspect (similar to the English present progressive).

Next lesson will feature vowel harmony and a few other long-distance processes, plus a discussion of what's called "phonotactics."
Certified Mutant

Yay, I'm glad to see this thread is back :D .

Interesting stuff - and I'll look forward to the next lesson!
Christian. Exterminator of Spammers.

Vowel Harmony

I could talk about this one for a while, but I'll keep it simple. Vowel harmony is a process where all vowels in a word or word root will share one or more specific features. This is typically binary, and the most common types of it are backness (front vs. back vowels), height (high vs. non-high vowels), roundedness (rounded vs. unrounded vowels), and tongue-root position (advanced vs. retracted; commonly called ATR harmony).

Backness harmony is common in languages of northern Eurasia, especially Uralic and Turkic languages. Standard Estonian is the only Uralic language known to have completely lost its vowel harmony, but the most comprehensive system is found in Finnish, where [a, o, u] and [æ, ø, y] cannot occur in the same word root. [i, e] are considered "neutral vowels" as they do not have a back counterpart; however, when a suffix attaches to a word of all neutral vowels, it will take the front-vowel surface form if one exists. For example, the inessive case suffix -ssa/-ssä will be -ssä on a word with all neutral vowels, but -ssa if there's even a single back vowel present in the root. In the case of a compound word, the very last root determines how the suffix harmonises. If the last root has front vowels or all neutral vowels, the suffix will come out in its front form. If it has back vowels, or neutrals with even one back vowel, it will come out in its back form. I yanked a list from Wikipedia for examples:

(B) kaura → kauralla
(B) kuori → kuorella (ignore the i → e change; I'll come back to that later)
(N) sieni → sienellä (again, because the root has all neutral vowels)
(F) käyrä → käyrällä
(B) tuote → tuotteessa (something else going on here; will come to it later)
(F) kerä → kerällä
(B) kera → keralla
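
The suffix choice in that list boils down to one test: does the (last) root contain a back vowel? Here is a minimal Python sketch of just that decision, working on ordinary Finnish spelling. The `pick_suffix` name is made up, and the sketch only picks the suffix variant; it does not handle stem changes like the i → e in "kuorella".

```python
import unicodedata

# Back vowels in Finnish spelling; ä, ö, y are front and i, e are neutral.
BACK_VOWELS = set("aou")


def pick_suffix(last_root, back_form, front_form):
    """Choose the back or front variant of a suffix for a root.

    For a compound, pass only the last root, since that is the one
    the suffix harmonises with.
    """
    # NFC makes sure "ä" is one character, not "a" plus a combining mark.
    root = unicodedata.normalize("NFC", last_root.lower())
    if any(ch in BACK_VOWELS for ch in root):
        return back_form
    # Front vowels only, or all neutral vowels: front form.
    return front_form


for root in ("kaura", "sieni", "käyrä", "kerä", "kera"):
    print(root, "->", "-" + pick_suffix(root, "lla", "llä"))
```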

Roundedness harmony is scattered, but is perhaps best known from the Turkic languages, where it interacts with backness harmony. It is only the high vowels that are directly affected by the roundedness harmony, meaning that [i, y] and [ɨ, u] cannot occur in the same root. As with Finnish and most other vowel harmony languages, compounds will buck the trend and any affixes will harmonise to an adjacent root.

ATR harmony is mainly found in Africa, but occurs elsewhere as well. According to Dr. Rod Casali (my advanced phonology professor who is a relative expert on this system), ATR languages have a strong tendency to have 7-vowel or 9-vowel systems, with [a] being the ATR-neutral vowel. With this said, the typical ATR contrasts would have [ɪ, ʊ, ɔ, ɛ] vs. [i, u, o, e] in a 9-vowel inventory (and from what I remember of Dholuo from when I studied it, this is their vowel inventory). 7-vowel systems will typically drop the mid, [+ATR] pair, leaving two neutral vowels so no [e] or [o]. But some languages do other things. It could be argued that the "pharyngeal harmony" in Mongolian is ATR harmony.

Height harmony is a bit rarer and can be trickier to explain. In feature-based phonology, there are binary features [±high] and [±low], meaning that it is both theoretically possible, and actually the case, that a vowel can be [-high] and [-low] at the same time. (These are the mid vowels, categorised as open-mid/close-mid and sometimes just mid in phonetics.) However, as Dr. Casali pointed out to us in AP class, harmony where the harmonising feature is the [±low] feature, while theoretically possible, does not actually exist. So it would actually be high vowels and non-high vowels (lows and mids) being separated. I've heard Nez Percé (a Penutian language of the Pacific Northwest), Coeur D'Alene (a nearly extinct Salishan language of northern Idaho) and a few African languages, among them Rwanda-Rundi, cited as examples of height harmony.

There are also nasal and rhotic harmony systems - the latter is incredibly rare, only being attested in the recently-extinct Yurok language of northern California.

Syllable structure and phonotactics

You'd be amazed how much diversity there is in syllable structure. Cross-linguistically, there is a whole host of different combinations of permissible syllable structure, ranging from languages that only allow "CV" syllables (that is, one consonant and one vowel) to languages like Nuxalk and certain Tamazight languages that don't even have syllabic sonorants in some words - in Nuxalk, [ɬχʷtʰɬt͡sʰxʷ] "you spat on me" has all voiceless obstruents!

Now any syllable will have at least an onset and a nucleus. The nucleus is generally what helps one perceive a syllable, and in the vast majority of cases, it will be a vowel. (Even those which aren't, which are sonorants in most cases, are understood to "have a vowel," sometimes transcribed with a schwa [ə].) Many languages do allow coda consonants, but many do not. Even those that do sometimes limit how large a coda can be and/or what consonants can go in it. Onsets and codas can be simple (one segment, and in the case of an underlying starting vowel, that will almost always be a glottal stop [ʔ]) or complex (more than one segment - in the case of codas this is surprisingly rare although English is notorious for it).

Most languages allow only simple codas; some (Finnish, for example) allow two consonants in certain scenarios, and a scant few allow more. English is peculiar in that it allows fairly large codas, sometimes up to four consonants. Consider the word strengths [stɹ̥ɛŋkθs] for example. Already odd for having a large onset, the English phonological rule of putting an epenthetic (inserted) voiceless plosive between a nasal and a voiceless obstruent of a different place of articulation leads to a four-sound coda, very rare even in English. In fast speech for a number of English speakers, though, this is dodged by either assimilating the nasal to the place of the obstruent (giving us [stɹ̥ɛnθs]) or deleting the obstruent after insertion (giving us [stɹ̥ɛŋks]).

An example of a very restrictive language for codas is Japanese. Only nasals (underlyingly /n/ and assimilating to the following onset) or geminate consonants (basically double-length consonants) can end a syllable, and only vowels or nasals can end a word. (Japanese folks get around this to some degree with using voiceless vowels when trying to pronounce foreign words that end in a consonant.) Finnish, on the other hand, does allow syllables to end in just about anything, provided it's not at the end of a word - only vowels or alveolar consonants (obstruents or sonorants) can end a word.

Is there rhyme or reason to this? Absolutely. With the exception of some languages of the Pacific Northwest and the Caucasus Mountains, languages tend to base what they can do with a syllable - their phonotactics - on what's called the "Sonority Sequencing Hierarchy." (Will talk about this in more detail later.) The most sonorous thing in a syllable will be the nucleus, and it can only be followed by something with equal or less sonority. The usual ranking, from most to least sonorous, has traditionally been vowel - semivowel approximant ([w], [j], etc.) - other approximant ([l], [ɹ], etc.) - trill, nasal - fricative - affricate/plosive. The behaviour of /s/ in many languages, though, has led me to believe that sibilant fricatives are actually less sonorous than plosives or affricates. This isn't the only thing that contributes to phonotactics, because there are some perfectly SSH-compatible sequences of sounds, absent from native English words, that English speakers have trouble with when learning other languages. One such sequence, /ʃt/, has actually become part of English phonotactics because of the long-standing influence of German and Yiddish. But take the Bulgarian word for "mayor," kmet. Most English speakers would pronounce this kuh-MET [kʰəmɛt]. It is pronounced as a single syllable in the original language ([km̥ɛt]). Inversely, English's large codas are problematic for speakers of most other languages, who will insert vowels to compensate. English also doesn't like having two plosives in a row in one syllable, and will either insert a vowel between the two (as in trying to pronounce the Russian word for "what," kto, properly) or delete one (usually the first) plosive entirely, as in "ptarmigan." Nasals also contribute to this, as in the word "tmesis" or the Norwegian name "Knut," or in codas, the word "hymn."
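
To show how the hierarchy can be checked mechanically, here is a minimal Python sketch that scores each segment on the traditional ranking above and tests whether sonority rises to a single peak and then falls. The numeric ranks, the small segment table, and the `obeys_ssh` name are all assumptions for illustration; affricates and diacritics are left out to keep every segment a single character, and plateaus (equal sonority) are allowed on both sides of the peak.

```python
# Traditional ranking, least to most sonorous. The numbers are arbitrary;
# only their order matters. The segment lists are a small sample.
RANKED_CLASSES = [
    "ptkbdg",          # plosives
    "fvszθðʃʒx",       # fricatives
    "mnŋr",            # nasals and trill
    "lɹ",              # other approximants
    "wj",              # semivowels
    "aeiouɛɪɔʊəæɑʌ",   # vowels
]
SONORITY = {seg: rank
            for rank, segs in enumerate(RANKED_CLASSES, start=1)
            for seg in segs}


def obeys_ssh(syllable):
    """True if sonority never falls before the peak or rises after it."""
    ranks = [SONORITY[seg] for seg in syllable]
    peak = ranks.index(max(ranks))
    rising = all(a <= b for a, b in zip(ranks[:peak], ranks[1:peak + 1]))
    falling = all(a >= b for a, b in zip(ranks[peak:], ranks[peak + 1:]))
    return rising and falling


print(obeys_ssh("kmɛt"))  # Bulgarian "kmet": plosive - nasal - vowel - plosive
print(obeys_ssh("stɪk"))  # "stick": fricative before plosive in the onset
```

On this traditional ranking "kmet" passes while "stick" fails, which is the /s/ oddity mentioned above: either /s/ clusters are exceptions, or the ranking of sibilants needs adjusting.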

For the same reason English codas are hard for speakers of many languages, English speakers have real trouble with the vowelless words of Salishan languages, or even the seven-consonant onsets of Georgian (there's actually been a study or two done on how Georgians cope with this).
Certified Mutant

Nice work - that's certainly something I'd never thought about at all. It's especially interesting to read about aspects of English which are very uncommon in other languages around the world!
Christian. Exterminator of Spammers.

Of which there are many. :P Oh man just wait until I get to syntax. :lol:

Syllable weight and stress assignment

Oftentimes, how a syllable is composed will also determine how stress is assigned in a word. Now sometimes, primary stress patterns much more regularly - take Finnish or Hungarian, for example, where primary stress is word-initial, Polish, where stress is on the penultimate (second-to-last) syllable, or Permian languages (Komi, Udmurt), where primary stress is word-final. But you'll get languages where primary stress patterns according to syllable weight; one such language is Latin.

So what determines syllable weight? In Latin, for example, a syllable is heavy if it has a long vowel or a diphthong, or if it ends in a consonant; otherwise it is light. Latin stress then looks at the penultimate syllable: if it is heavy, it is stressed, and if it is light, the antepenultimate (third-to-last) syllable is stressed instead. When measuring syllable weight, the unit used is the mora, marked in linguistic notation by the Greek letter mu (μ).

While it doesn't do this for primary stress, Finnish secondary stress is weight-based. Finnish also distinguishes short and long vowels (and consonants as well, but this is inconsequential to stress), and while the primary stress is always initial, it is every second mora that gets secondary stress. (It is worth noting that Finnish secondary stress can be hard to perceive in fast speech!)

Of course, then there's English, which does have some stress rules, but they're too complex for the scope of this lesson and might as well be a form of "lexical stress," where you just have to memorise it. A language that has legit lexical stress is Russian.


Now we get into those pesky pitch-related things. Let's start with tone. Tone can be defined as "variant pitch that results in change of the core meaning of a word or word root." You'd be surprised how many languages actually have tone. Conservative estimates suggest that half of the world's seven thousand plus languages have tone, and I've heard estimates as high as 70%. When most people think of tone, they immediately think of the Chinese languages, and perhaps rightly so - Mandarin is the world's most spoken mother tongue, and it is a tonal language. But you'd be surprised to know that a large number of tone languages are actually spoken in Africa; also, a number of Central American/Mexican indigenous languages have tone, and a smaller number of indigenous languages in Canada and the USA, most notably in the Na-Dene language family (Navajo, Apachean languages, Gwich'in, many Dene languages of British Columbia).

How does tone work, exactly, though? Like stress and intonation (which I'll bring up later), it is suprasegmental, meaning that it affects more than a single sound. This may seem odd to people who are only familiar with a Chinese language, but hear me out. Tone doesn't always attach to sounds, per se. It moves parallel to word parts, or morphemes (remember this term, because I'll use it a LOT when I get to morphology). Tones tend to work a little differently in so-called "Asian-type" tone systems than they do in so-called "African-type" systems, too. I'll expand on this as I go along.

Tone starts with registers, and that's all some languages actually have. The simplest register systems have two tones, high and low. Languages with three registers are common as well, with a mid tone or neutral tone involved; as far as registers go, Mandarin falls into this category. There have been languages analysed as having five registers, though - extra-high, high, mid, low, and extra low!

There are also contours. In African tone languages, these are often analysed merely as sequences of register tones, but this is a little trickier to do with Asian tone languages because word parts are generally monosyllabic, whereas in African tone languages they very often aren't.

Regardless, my tone professor gets irked when people ask the question, "how many tones are in your language?" The question probably stems from a knowledge of Chinese prescriptive grammar, where this is the terminology used. Said professor prefers the term "tone melody" in this context, even for an Asian language. There are a number of factors that play into African and Western Hemisphere tone languages that are the reason for this.

If you were to analyse tone as parallel to Mandarin, for example, you'd think the Mende language of West Africa had an absurd number of tones, because there are high, low, rising, falling, and rise-fall tones, which can change absolute pitch as one progresses in a word. However, there's a method to how these tones are distributed. There is an underlying melody that goes with a morpheme, and it is assigned to tone-bearing units (which are typically vowels, but as I found when I studied Chumburung, they can be coda consonants as well) in a fixed pattern - in Mende, it is one tone per TBU from beginning to end until the last TBU, then all remaining tones in the melody are assigned to the last TBU. If there is only one tone in the melody, it is assigned to all syllables; in the case of a two-tone melody in a trisyllabic morpheme, the first tone is only assigned to the first syllable and the second to the other two. I forget what each of the words means, but I'll use the actual attested sequences /mba/ and /ɲaha/ as well as a hypothetical /kiguɾu/ to demonstrate the five melodies that exist in the language:

L - [mbà], [ɲàhà], [kìgùɾù]
H - [mbá], [ɲáhá], [kígúɾú]
LH - [mbǎ], [ɲàhá], [kìgúɾú]
HL - [mbâ], [ɲáhà], [kígùɾù]
LHL - [mbã]*, [ɲàhâ], [kìgúɾù]

* - I used a tilde here because there's no diacritic for the combination of a hacek with a grave accent on my keyboard. :P
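
The Mende assignment pattern described above is mechanical enough to sketch in Python. The function below hands out one tone per tone-bearing unit from left to right, dumps any leftover tones on the last TBU, and spreads the last tone if the melody runs out early. The `assign_melody` name and the use of plain "L"/"H" strings instead of diacritics are choices made for this sketch only.

```python
def assign_melody(melody, n_tbus):
    """Map a tone melody such as "LHL" onto n_tbus tone-bearing units.

    Returns one string per TBU; a string with more than one tone
    (e.g. "HL") is a contour on that TBU.
    """
    tones = list(melody)
    assigned = []
    for i in range(n_tbus):
        if i == n_tbus - 1:
            # Last TBU: take every remaining tone, or spread the final one.
            rest = tones[i:]
            assigned.append("".join(rest) if rest else tones[-1])
        elif i < len(tones):
            assigned.append(tones[i])
        else:
            # Melody already used up: the final tone spreads rightwards.
            assigned.append(tones[-1])
    return assigned


for melody in ("L", "H", "LH", "HL", "LHL"):
    print(melody, [assign_melody(melody, n) for n in (1, 2, 3)])
```

Reading the output against the table above: "LHL" on two TBUs gives low then falling ([ɲàhâ]), and "LH" on three gives low-high-high ([kìgúɾú]).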

But then there's this pesky little thing called downstep. Tones, within a melody or otherwise, influence one another. In most cases, it's the lows that have the influence, pushing everything slightly lower every time they surface. It was a couple of tonologists, one of whom was my tone professor, Dr. Keith Snider, that had to come up with a new theory of tone analysis just to explain it. If you're interested, you can look up "register tier theory." But within an utterance, low tones will pull the overall pitch of any following lows and highs down a notch. (Obviously this has its limits from a purely mechanical point of view!) This happens in African and Western Hemisphere tone languages, and on rare occasion even Asian tone languages.

There are two types of downstep as well. Automatic downstep is when the tones initiating the downstep are there to be heard. Non-automatic downstep, on the other hand, means there's something more going on. Occasionally there will be a morpheme whose segments will delete for whatever reason, leaving just the tone behind. But since it can't attach to the TBUs of another morpheme, it just "floats" there. (Yes, the technical term used for such a tone is floating tone.) But its effects are felt. Floating low tones cause downstep, even if the tone itself is not actually pronounced. So if one has a high followed by a "mid" in these languages, the best practice is to start looking for anything that could've left a floating tone! Now here's a real kicker for you - sometimes, a floating tone can be a morpheme in and of itself. The majority of the time, this tone is low. Non-automatic downstep happens primarily in West Africa.
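
Here is a toy Python model of both kinds of downstep, just to make the mechanics visible. It is a deliberate oversimplification: the pitch numbers are arbitrary, every low tone lowers everything after it by the same fixed step, and a floating low is written as "(L)", which is a notation invented for this sketch. It is not register tier theory, only an illustration of the effect.

```python
def pitch_track(tones, high=10, low=5, step=1):
    """Return the pitch of each pronounced tone in an utterance.

    tones: list of "H", "L", or "(L)"/"(H)" for floating tones.
    Floating tones are not pronounced, but a floating low still
    lowers everything that follows it.
    """
    drops = 0          # how many lows have occurred so far
    track = []
    for tone in tones:
        floating = tone.startswith("(")
        kind = tone.strip("()")
        if not floating:
            base = high if kind == "H" else low
            track.append(base - drops * step)
        if kind == "L":
            drops += 1
    return track


# Automatic downstep: the lows are audible and each high is a bit lower.
print(pitch_track(["H", "L", "H", "L", "H"]))
# Non-automatic downstep: a floating low makes the second high sound "mid".
print(pitch_track(["H", "(L)", "H"]))
```

In the second call only two pitches come out, with the second high lower than the first even though no low tone was pronounced between them.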

Much rarer than this is upstep. For whatever reason, low tones trigger upstep. I'll have to talk with Keith to get my facts straight on this, but the only language I recall this happening in from memory is Krache, which is a close relative to Keith's language of study, Chumburung. (He was a Bible translator in Ghana.)

Moving back to Asian tone languages for a moment, very often, low tone is accompanied by creaky voice.

Languages with tone include: the Chinese languages, the entire Tai-Kadai family (Thai, Lao, and those related), Vietnamese and some of its relatives (not Khmer, though), some Tibeto-Burman languages (including Tibetan, Burmese, and Dzongkha), Punjabi and a couple of minority languages fairly closely related (the only Indo-European languages with full-on tone, btw), the entire Hmong-Mien family, the entire Oto-Manguean family (now spoken exclusively in Mexico), Nilotic languages like Luo and Dinka, a large number of Niger-Congo languages (including major ones like Bambara, Igbo, Yoruba, Lingala, Zulu, Sesotho, Setswana, but not Swahili, Fula, or Wolof), about half of the Na-Dene languages (see above), Iroquoian languages (primarily spoken in Ontario and Quebec, but also in New York), Chadic and Omotic languages (most notably Hausa, a major language of Niger and northern Nigeria), and many languages of Papua New Guinea. This is by no means an exhaustive list.

Next lesson will include intonation, which unlike tone, doesn't change the core meaning per se, just adds nuance. Not only that, but intonation spreads out over an entire utterance.
Christian. Exterminator of Spammers.


Intonation is always a tricky one to explain because there is so much variation. Put simply, it is the use of pitch at the level of the utterance to add nuance to the overall meaning of the idea. In some languages, intonation works in more or less fixed patterns, while other languages can use it with more versatility. In Finnish, which has freer word order than English due to its extensive case system (I'll touch on this in morphology and syntax), shifting word order or adding discourse particles is often used where we in English would use changes in intonation.

Here's an example of English intonation changing the overall meaning of an utterance:

"You went to the ball and didn't even think to invite me?" is going to be our sample sentence. ;) Capitals will indicate rising pitch and/or volume to denote a focus on that word via intonation.

YOU went to the ball and didn't even think to invite me? -> Speaker implying that subject did something that others didn't.
You WENT to the ball and didn't even think to invite me? -> Speaker implying that subject wasn't going to go, but did anyway.
You went to the BALL and didn't even think to invite me? -> Speaker implying that subject was thinking of possibly going someplace else, or that the ball was very important to him/her (speaker, that is).
You went to the ball and DIDN'T even think to invite me? -> Speaker really offended that subject didn't think to invite him/her. Possibly implying that somebody else invited him/her.
You went to the ball and didn't EVEN think to invite me? -> Not much different than above, but perhaps a bit angrier, and no implication that someone else did.
You went to the ball and didn't even THINK to invite me? -> Speaker implying that subject might have been thinking about something else, such as excluding speaker, or something completely different.
You went to the ball and didn't even think to invite ME? -> Speaker implying that subject invited others to his/her exclusion.

Now yes-no questions in English have a distinct intonation (though this actually varies between North American Englishes and other Englishes). Questions with an interrogative word (this is considered a pronoun in the case of "what" or "who(m)" and an adverb in the case of "where," "when," "why," and "how") use a statement intonation in their most basic form. Some non-standard questions like "who did what where to who(m)?" may have a question intonation.

There is a link between intonation and certain grammatical functions. Many Indo-European languages have a "question intonation." (Not sure if they all do - I haven't researched this that much :P ) But some other languages instead mark questions with a particle or an affix. Finnish is an example of this, where the first word of an interrogative utterance takes the -ko or -kö marker depending on vowel harmony.

Some languages, like French, have distinguishable list intonations as well. When you are listing things, there is a rising intonation for each item except the last one, which has a falling intonation.

Back to English. Intonation is a very versatile thing in this language. :lol: Sometimes stretching a word out in length (which wouldn't work in languages that have actual contrastive vowel length, such as Finnish - wouldn't've worked in Old English for this exact reason) adds emphasis. Also using pitch height or vowel length to indicate surprise is common, especially with the word "what," where a flat high pitch indicates sheer shock, but a high-falling pitch is usually substituted for the full sentences "what's your problem?" or "what the crap are you looking at?" :P Sometimes even epiglottalisation (aka growling) of a word, as well as a rise in pitch (it can be very slight) can indicate anger or disgust with a particular subject. That's part of intonation.

I find when learning Chinese that some of the sounds are pretty hard to pronounce
Christian. Exterminator of Spammers.

Apr 18 2017, 05:16 AM
I find when learning Chinese that some of the sounds are pretty hard to pronounce
Three possibilities I can think of as to why. (This also depends on which Chinese you're talking about. ;) )

1. Segmental similarity issues. Sometimes it's the sounds that sound closer to our own that we find harder to pronounce. Mandarin, for example, has [ɕ] ("x" in pinyin) and [ʂ] ("sh" in pinyin), both of which sound, to an untrained native English-speaker's ear, like [ʃ], our "sh"-sound. Even those who can hear the difference can't always consistently replicate it at first. I'm having this issue with Russian. :P

2. Segmental difference issues. The Chinese languages have a couple sounds that English does not.

3. That pesky tone. :P
Certified Mutant

Yeah, intonation is certainly versatile. It's just a pity the nuance it provides often seems to get lost in text-only communication!

(I suppose it is possible to use italics to substitute for it, but it seems a lot of people don't think to do this :( )

It does sound like a x or sh sound when heard pronounced, it ain't that bad to pronounce. it can just perform a tongue twister mid sentence... practice makes perfect though right?
Christian. Exterminator of Spammers.

It's just a case of tripping over one's tongue because one is saying similar sounds in too rapid a succession. Either you end up assimilating one to the other (which can prove embarrassing), or you derp up completely.

My next (and last) segment on phonology will be about something I missed, which is the sonority scale; after that, it's on to morphosyntax.
Christian. Exterminator of Spammers.

Okay, I took way too long to get this back going again.


Now the sonority scale is a bit complicated, and some parts are actually controversial. It does have to do somewhat with syllable structure, though. Languages whose syllables are limited to CV or CVC at most don't have this issue, but every other language does. The Sonority Sequencing Principle was devised to answer the question, "Is there a logical order in which different types of segments are ordered in a single syllable?"

Now scroll back to where I talked about syllable structure if you forgot what onset, coda, and nucleus were. ;) Although the onset and the coda aren't treated the same with regards to syllable weight (in most languages, anyway - I had a theory at one point that Estonian perhaps did treat them the same, due to an advanced phonology project I was doing), what they generally do share is this pattern: the farther a segment is from the nucleus (which is typically a vowel), the less sonorous it is.

The initial sequence I was taught was plosive/affricate < fricative < nasal < lateral/trill < approximant < vowel. Upon further investigation I think this is actually wrong, or at least insufficient. While plosive-fricative onset beginnings are found and fricative-plosive coda endings are quite common, I find that in onset position, fricative-plosive sequences are more common. This is especially true of sibilants. In English, we seem to have issues with things that start with /ts/ (affricate or not) like the place name "Tsawwassen" or the original pronunciation of "tsunami," and so we often delete the opening /t/ (I don't, but I have enough linguistics training that my native speaker intuition has been somewhat compromised :P ) , but /ʃt/ (as in "stein" or "schtick"), even when it is only found in more recent borrowings, comes as easy as 1-2-3, and is quite a common sequence in those languages that do allow complex onsets (more than one consonant).

In short, I'd actually argue that fricatives, from a standpoint more consistent with what I see cross-linguistically, are actually less sonorous than plosives, even though from a purely phonetic point of view they are more so because they involve continuous airflow either through the mouth or (in the case of nasal consonants) through the nose. Just ftr, ejectives, implosives, and clicks are all treated as plosives in this, because they pattern the same way.
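To make the disagreement concrete, here's a rough Python sketch of a Sonority Sequencing check. It is only an illustration under my own assumptions: the function names, the tiny segment table, and the name REORDERED_SCALE are all made up for this example, and "r" is treated as a trill. The first scale is the one I was taught; the second just swaps plosives and fricatives, as argued above.

```python
# Scale as taught (left = least sonorous, right = most sonorous).
TAUGHT_SCALE = ["plosive/affricate", "fricative", "nasal",
                "lateral/trill", "approximant", "vowel"]

# Hypothetical reordering: fricatives ranked BELOW plosives.
REORDERED_SCALE = ["fricative", "plosive/affricate", "nasal",
                   "lateral/trill", "approximant", "vowel"]

# Tiny illustrative sample of segments, not a real inventory.
SEGMENT_CLASS = {
    "p": "plosive/affricate", "t": "plosive/affricate", "k": "plosive/affricate",
    "s": "fricative", "f": "fricative",
    "m": "nasal", "n": "nasal",
    "l": "lateral/trill", "r": "lateral/trill",
    "w": "approximant", "j": "approximant",
    "a": "vowel", "e": "vowel", "i": "vowel", "o": "vowel", "u": "vowel",
}


def sonority_profile(syllable, scale=TAUGHT_SCALE):
    """Map each segment to its rank on the given scale."""
    return [scale.index(SEGMENT_CLASS[segment]) for segment in syllable]


def obeys_ssp(syllable, scale=TAUGHT_SCALE):
    """True if sonority strictly rises to a single peak, then strictly falls."""
    profile = sonority_profile(syllable, scale)
    peak = profile.index(max(profile))
    rising = all(a < b for a, b in zip(profile[:peak], profile[1:peak + 1]))
    falling = all(a > b for a, b in zip(profile[peak:], profile[peak + 1:]))
    return rising and falling


print(obeys_ssp("plan"))                   # True on the taught scale
print(obeys_ssp("stan"))                   # False: s outranks t on the taught scale
print(obeys_ssp("stan", REORDERED_SCALE))  # True once fricatives rank lowest
print(obeys_ssp("tsa", REORDERED_SCALE))   # False: now /ts/ onsets are the odd ones
```

Notice that neither ordering licenses both "st-" and "ts-" onsets at once, which is a big part of why this stuff stays controversial.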

And here's where things get controversial. Certain languages have words that seem to either have very unsonorous things as nuclei, like fricatives in Berber languages, or even seem to have no nuclei at all, such as in some Salishan languages. How then do we determine what is actually a syllable? The only plausible answer I've heard is to rely on L1 speaker intuition. On another note, the people that came up with this probably studied a huge corpus of words. One never really sees a segment from every single sonority class in the same word, because of phonotactic limits on the number of consonants in an onset or coda. Maybe in Georgian onsets, but those buck the whole thing to a degree. :P When it comes to "linguistic universals" (and I'll talk about this when I FINALLY get to typology), most so-called universals are more tendencies than absolute rules. You'll always find a language that "sticks it to the man!" :lol:

Okay, now I need to go back and edit the syllable section, because I realised I missed something. :P


Morphosyntax is actually a blend of two things - morphology, the study of wordforms, and syntax, the study of sentence formation. Why they are more and more commonly grouped together is that the relation between the two has become clearer and clearer as more languages become documented and studied.

Here's the thing - languages that make greater use of morphology have far fewer syntactic restrictions, while languages that make very little use of morphology have very rigid syntactic rules. There's actually a scale to determine this, but first, one needs to learn about the building blocks of words - morphemes. The scale entirely depends on morphemes, which are the smallest units that can carry meaning of any sort. And morphemes aren't restricted to just things that can stand alone as words. They include such things as affixes, clitics, "tonemes", "chronemes", or even morphophonological processes such as reduplication, stem vowel changes, or weak suppletion. On occasion, you'll even get strong suppletion, which sees two words with the same core meaning but different inflections look completely different.

But back to the scale for a second. There are rough boundaries as to where each of the categories lies, but one could see two of these four (isolating and polysynthetic) as extremes of the other two (analytic and synthetic). Languages such as modern English that have a lower morpheme-per-word ratio on the whole and rely more on syntax are classified as analytic languages. The extreme of this more basic category is the isolating language, which has no inflectional morphemes used to denote grammatical relations. The example I often see given for a purely isolating language is Vietnamese. Mandarin is still isolating in the sense that it doesn't use inflectional morphemes, but it is starting to trend away from being isolating because of how much it (and indeed other Chinese languages) is using compounding.

But then you have synthetic languages, which many Indo-European languages still are to one degree or another, which rely on inflectional morphology to some extent to denote grammatical relations. Some have more rigid syntax (like French for example) while others have freer word order (Finnish, Baltic languages, to a lesser extent East and West Slavic languages). But there's actually another split within the ranks here - while morphemes-per-word is the main component of the scale, synthetic languages (and even analytic languages that still use inflectional morphemes) can be described as being either fusional or agglutinative, depending on the meanings encoded in a single morpheme. English is very fusional, for example, and Indo-European languages tend to have a high degree of fusionality. It's a spectrum, though, and not a hard bipolar system. Finnish, for example, is "somewhere in the middle," having some agglutinative morphemes, but other fusional morphemes (the best example of the latter in Finnish is the number-person and tense-aspect-mood affixes - more about these things later). Some more extreme examples of agglutinative languages include indigenous languages such as Na-Dene, Algic, and Salishan languages, and also languages of the North Caucasus.
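Since the whole scale hangs on morphemes per word, here's a minimal Python sketch of how you might compute that ratio. It assumes you've already segmented the text by hand, with spaces between words and hyphens between morphemes; the function name and the sample segmentations are my own illustrations, not data from any corpus.

```python
def morphemes_per_word(segmented_text):
    """Average morphemes per word.

    Assumes words are separated by spaces and the morphemes inside
    a word by hyphens (a hand-made segmentation, taken on trust).
    """
    words = segmented_text.split()
    morpheme_count = sum(len(word.split("-")) for word in words)
    return morpheme_count / len(words)


# English: 4 words, 7 morphemes.
print(morphemes_per_word("the dog-s bark-ed loud-ly"))  # 1.75

# Finnish "taloissani" ("in my houses"): house-PLURAL-INESSIVE-my, one word.
print(morphemes_per_word("talo-i-ssa-ni"))  # 4.0
```

A ratio close to 1 points towards the analytic/isolating end, and the higher it climbs, the more synthetic the language. The sketch says nothing about fusional vs. agglutinative, because that depends on how many meanings each morpheme carries, not on how many morphemes there are.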

Next time around: Inflection vs. Derivation, Free vs. Bound, and different morphological processes.
Christian. Exterminator of Spammers.

Some Important Terms

Before I get into the discussion about different kinds of affixes, it's important to make a distinction between root, stem, and base. I honestly had to look this one up again, because there is so much seeming overlap between the three (especially in English) that sometimes it's a fuzzy distinction to make.

A root is the absolute core of a word, and sometimes it has the same surface form as a base. But a base can also include derivational affixes, which I will talk about in a moment. A stem is simply the form of a word that can be inflected for grammatical purposes. Now a stem has to have a lexical meaning, ie, it has to have meaning standing alone. Not all roots, or even all bases, are stems for this exact reason. With this in mind, let's move ahead.

Inflection vs. Derivation

When dealing with morphological processes, there are categories within categories. The first one I'm going to talk about is function-based. Inflections add grammatical/discursive information while adding very little if anything to the actual core meaning of the word. They also don't typically change the part of speech of a word. (The most controversial "inflectional" category is the gerund, which while not changing the core meaning, does actually change the function of a verb so that it will function nominally.)

In English, we still have a bit of inflectional morphology, as opposed to languages like the Chinese languages and Vietnamese, which don't have any at all, or Japanese, which uses separate words to mark inflection. English is odd in that it actually marks third-person singular on verbs with -(e)s, a form which in most languages that use inflection is either completely unmarked, or the least marked form. Aside from our irregular forms, most of which are from older words, -(e)s is also the noun plural, but in spite of having the same shape, it is considered a different suffix from the third-person-singular of verbs.

Another such "homophonous set" of suffixes is the present participle -ing and the gerund -ing. While both are verb forms, the present participle serves a markedly different function. Gerunds can be modified by the sorts of things you would expect to modify a noun, such as the plural -s (never taking the alternate form -es because -ing ends in /ŋ/ - more on that when I go into morphophonology), articles, and adjectives. Participles, while you can use them as adjectives, serve verbal purposes, and when used adjectivally are basically the equivalent of a super-reduced relative clause - "rolling stones" and "stones that/which roll" mean basically the same thing.

We also have our "regular past tense" -ed. Our irregular past tenses are from older verbs and include an alveolar ending and/or a stem change.

By linguistic definitions, -'s is not actually a suffix but a clitic. I'll talk more about those later.

There's also the less common but still somewhat productive -en past participial suffix. A lot of past-parts these days are -ed instead.

Finally, there are the two degree suffixes for adjectives: -er for comparative and -est for superlative.

Many languages have more complicated inflectional systems, with affixes for case (grammatical/adverbial function), gender/noun class, and animacy in nouns and adjectives, and tense (timing), aspect (state of action), mood (reality vs. intent/wish), negation, and number/person, in verbs.

Derivation, on the other hand, makes a change to the core meaning of the word, and typically - but not always - changes the part of speech. While English inflectional affixes are exclusively suffixes, derivational affixes can be prefixes, although obviously there are derivational suffixes as well, and even one derivational circumfix (think prefix + suffix together) as in the case of en- -en, as with "enlighten." The primary adverbial suffix -ly is also derivational.

In English, there are derivational affixes that can change:

Verbs - into other verbs (ex-, in-, re-, un-, mis-), into nouns (controversially the gerund -ing, -(at)ion, -ance/-ence/-ancy/-ency, -or/-er, -ist, -ism, -ment, -age), and into adjectives (-ant/-ent do count even though they didn't in the original Latin, as they were present participle markers, -able/-ible).

Nouns - into other nouns (anti-, -phobe, -phobia, neo-, proto-, mis- (rare), dys-, -ite, -ist, -ian, -age), verbs (-ise, -ate, de-, un-, dis-, -ify), and adjectives (-ful, -less, -y, -like/-ly (not the same as the adverbial), -al, -ous, -ic/-ac/-iac, -ish).

Adjectives - into other adjectives (-ish), into verbs (en-/em-, -en, en- -en/em- -en, -ise, -ify), and nouns (-ness, -ity)

Next post has morphological processes - THE BIG TEN! 8D
Christian. Exterminator of Spammers.

Morphological Processes

Now how do languages build words? Obviously they have roots and bases and stems and everything like that, but there are also processes that occur to bring about the final word forms. In English we have prefixes and suffixes, ONE circumfix :P, and some stem changes. These are only a few of the possibilities. The main morphological processes are referred to as the "Big Ten" by Payne (2006).

1-4. Affixation
1. Prefixation
2. Suffixation
3. Infixation
4. Circumfixation

5. Reduplication
6. Transfixation
7. Stem change
8. Autosegmental variation
9. Compounding
10. Deletion/Subtractive morphology.

Now affixation comes easy for us English speakers. We do it all the time even in our analytical language, as the examples in my previous post show. Although we do have our fair share of prefixes and our one circumfix going "EN-YAY ME-EN!" in the background, :P English is largely a suffix-heavy language. You can draw a comparison with a language like Finnish, whose affixes are almost completely suffixes, German, which has a much richer overall mix, or, to the other extreme, a Na-Dene language like Tsilhqot'in, which is very prefix-heavy. There are actually typological tendencies that go along with each sort of language. But I digress.

So we know what a prefix is. You take your root/base/stem do, you tack on un-, and you get the new base/stem undo. Not exactly rocket science. In languages like the aforementioned Tsilhqot'in and relatives such as Navajo and Dakelh, there will be whole sequences of prefixes attached to the root, and there's a logic to them that is unique to the language or family, aside from the typological universal (I think) of derivational suffixes/prefixes/circumfixes always being closer to the root, and inflectional ones being farther away.

How about a language other than English for suffixes, though? Indo-European languages, even the most analytical of the lot, have a number of them. Take Russian, for example. Its case system (which I'll talk more about after I've touched on syntax) has a whole host of suffixes to go with it. Let's do something easy to start. The word for "book" is kniga, which is a feminine noun, complete with its set of case endings. Here are the singular forms:

nominative (subject): knig-a
accusative (direct object): knig-u
genitive (possessor): knig-i
dative (indirect object): knig-e
prepositional: knig-e
instrumental: knig-oj

It gets more complicated after this, so for the purposes of this thread, I'm done with this example.
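For the programmers in the room, suffixation like this is just "stem plus ending." Here's a small Python sketch using only the six singular forms listed above; the names are my own, and the table is good for this one noun as given, not for Russian nouns in general.

```python
# Singular endings for kniga, exactly as listed above.
KNIGA_SINGULAR_ENDINGS = {
    "nominative": "a",
    "accusative": "u",
    "genitive": "i",
    "dative": "e",
    "prepositional": "e",
    "instrumental": "oj",
}


def decline(stem, endings):
    """Attach each case ending to the stem (pure suffixation)."""
    return {case: stem + ending for case, ending in endings.items()}


forms = decline("knig", KNIGA_SINGULAR_ENDINGS)
print(forms["accusative"])    # knigu
print(forms["instrumental"])  # knigoj
```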

Infixes are common in Austronesian languages (especially in the Malay Archipelago). They're a bit odd in that they go inside the stem. In Ilocano, for example, the infix -in- marks the perfective aspect - the completeness of an action - which is typically used to express the past tense. So take the root patay for example. It actually means "death" when used alone, but one can add verbal affixes to it, and pinatayko means "I killed (something)." There are a few others in Ilocano.
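Here's a quick Python sketch of what the infix is doing in that Ilocano example. I'm assuming the root starts with a single consonant and that -in- slots in right after it, which matches patay -> pinatay; the function name is made up, and I'm treating the final -ko as the first-person marker purely on the strength of the gloss above.

```python
def infix_after_first_consonant(root, infix="in"):
    """Insert the infix after the root's first segment.

    Assumes the root begins with exactly one consonant, as patay does.
    """
    return root[0] + infix + root[1:]


perfective = infix_after_first_consonant("patay")
print(perfective)         # pinatay
print(perfective + "ko")  # pinatayko
```

The point is that the root gets split open: p...atay is not a morpheme on its own terms, which is what separates an infix from a prefix or suffix.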

Here's where I have to make a note. You know the phenomenon where you can insert one word (usually a vulgarity or invective) into another? Such as fan-flipping-tastic? This "expletive infixation" isn't true infixation, because it's inserting an entire standalone word, and where it goes is determined by stress, whereas in typical infixation stress really has no bearing. It's closer to what's called tmesis, where parts of a semantic word are morphophonologically separated by another semantic word. English periphrasis and German separable prefix verbs are other examples of tmesis. Okay, away from the rabbit trail. :P

Circumfixes are harder to pin down, because sometimes a combination of a separate prefix and suffix can be misconstrued as a circumfix, and sometimes elements of the "circumfix" can change for non-phonological reasons, leading some to analyse it as a separate prefix and suffix. An example of the latter is in German, where ge- -t and ge- -en are the two most common forms of the past participial affix - since ge- never changes, some linguists do analyse it as being separate affixes. The same can be true of the superlative degree "circumfix" of certain Slavic languages and Hungarian, since the suffix part of it, used alone, forms the comparative, and a prefix on top of that forms the superlative.

Less controversial circumfixes can be found in Berber languages, where they are used to create feminine forms, or in languages like some Arabic dialects, Guarani, and Chukchi, as negation.

Reduplication is a process where all or part of a word is duplicated to render grammatical meaning or some other nuance. In terms of stricter grammatical reduplication, this is most common in Austronesian languages and in indigenous languages of North America, and it also happens in Greek and Somali. Partial reduplication is more common, since you'd expect most languages that use morphology to this extent to have polysyllabic words! In Lushootseed for example, pastəd "white person" can be pluralised as paspastəd. In Ilocano, reduplication frequently shows up in the verbal system - taking a verb base and using the same kind of partial reduplication forms the imperfective aspect. (Most roots in Ilocano are better classified as nouns when used alone.)

Total reduplication has a variety of uses as well. In Malay languages within Austronesian, it is used for pluralisation. Orang "person, man" is pluralised as orang-orang "people" in Malay and Indonesian, for example. In Halkomelem, there is a "dispositional" aspect, which functions more like an English adjective but is technically a verb (a number of languages actually form their descriptive words this way rather than having a separate class of adjectives), which is formed by total reduplication of a root and means "prone/inclined to do X." Wikipedia provided a nice example of this: [qʷél] "to speak" becomes [qʷélqʷel] "talkative."

Besides having grammatical functions, reduplication can add lexical nuance. We do this all the time in English. First, there's just straight total reduplication. For example, there are contexts where "home-home" is used to distinguish one's family homestead from one's current place of residence. It can be used to distinguish a word's primary sense from a secondary sense ("funny-funny" vs. "'I-dunno-about-this-bloke'-funny"), literal from figurative, and so forth. We also have what's called "ablaut reduplication," which is a mix of total reduplication with a stem vowel change, and generally indicates repetitiveness of an action (often called "iterative" in linguistic terminology). Think things like "jibber-jabber," "wibbly-wobbly," "pitter-patter," "chit-chat," "clink-clank," and so forth. There are other forms of partial reduplication used lexically even in English.
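Both flavours of reduplication are easy to sketch in Python. This is just a toy under my own assumptions: the function names are invented, partial reduplication is treated as "copy the first three characters (CVC) onto the front," and each character is assumed to be one segment, which happens to work for the Lushootseed and Malay examples above but not for anything written with digraphs or stress marks.

```python
def reduplicate_total(word, separator=""):
    """Repeat the whole word, optionally with a separator in between."""
    return word + separator + word


def reduplicate_partial(word, size=3):
    """Copy the first `size` characters (CVC when size is 3) onto the front."""
    return word[:size] + word


print(reduplicate_partial("pastəd"))     # paspastəd
print(reduplicate_total("orang", "-"))   # orang-orang
print(reduplicate_total("home", "-"))    # home-home
```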

Transfixation is probably the hardest thing to get used to when studying Hebrew or Arabic, or indeed any Semitic language. The basis of verbs and nouns is found in the "triconsonantal root," which is represented by the actual letters in the Hebrew, Arabic, and Ethiopian writing systems. Vowels, which are either written as points, or left out entirely in Hebrew, are grammatical in function and can render either verbs or nouns depending on the root in question. Take the ubiquitous k-t-v root in Hebrew (note: the "v" can change to "b" depending on position in the final word). katav means "(he/she) wrote," and from the same root you get "ketuvim," which is the name of the Jewish Wisdom writings such as Psalms, Proverbs, and Job (in the Christian Old Testament and Jewish Tanakh).
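If it helps, you can think of transfixation as pouring root consonants into a vowel template. Here's a minimal Python sketch; the "C" slot notation and the function name are my own conventions for illustration, the b/v alternation mentioned above is ignored, and the -im on ketuvim is the ordinary masculine plural suffix added after the template is filled.

```python
def interdigitate(root, template):
    """Fill each "C" slot in the template with the next root consonant."""
    consonants = iter(root)
    return "".join(next(consonants) if slot == "C" else slot
                   for slot in template)


print(interdigitate("ktv", "CaCaC"))         # katav
print(interdigitate("ktv", "CeCuC") + "im")  # ketuvim
```

Same three consonants both times; only the vowels woven between them change, and that is where the grammatical information lives.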

Stem changes are interesting, but also very frustrating for initial language learners. West Germanic languages such as English, German, and Dutch, are infamous for it, because the past tenses of these languages often rely on such things. Oftentimes these are historical remnants of old verbs that at one point had a productive system as their explanation, but now these things are considered "irregular verbs" because they are not morphophonologically predictable given the current sound-scheme of the language. Think, for example, goose vs. geese, or more thoroughly, catch vs. caught. (There's a legitimate explanation for how this worked in Old English, but it's a bit lengthy. I'll spare you for now. :whistle: Also, there are so many historical morphophonological processes that cause the stem change that I'll spare you that as well.)

Obviously this isn't unique to Indo-European languages, or even to vowels. Finnish "consonant gradation" falls under this category as well, with the root lahte- surfacing in its "dictionary form" as lahti "bay", but if the case suffix closes or geminates the second syllable (geminates are analysed as being one segment spread across two syllables oftentimes), the /t/ becomes a [d] in some of Finnish's fifteen cases. While the partitive surfaces as lahtea (because consonant gradation doesn't occur), you get case forms such as lahden (genitive), lahdessa (inessive), lahdeksi (translative), and lahdella (adessive). This is true of all Finnish voiceless stops, long or short. But because of Finnish's general lack of voiced stops (even as allophones only [d] actually exists) and the historical Uralic existence of voiced fricatives, which have all but disappeared from Finnic languages, the patterning seems irregular.

/p/ -> [v~ʋ] (historically it was actually [w])
/t/ -> [d]
/k/ -> Ø (it disappears completely :P)
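Here's a deliberately oversimplified Python sketch of the short-stop pattern above, just enough to reproduce the lahti forms. My assumptions, stated loudly: the stem ends in stop + vowel, a suffix "closes" the syllable if it is a lone consonant or starts with two consonants, and geminates and every other wrinkle of real Finnish gradation are ignored. The names are made up for the example.

```python
# Weak-grade counterparts of the short stops, as in the list above.
WEAK_GRADE = {"p": "v", "t": "d", "k": ""}
VOWELS = set("aeiouyäö")


def closes_syllable(suffix):
    """Lone consonant or initial consonant cluster closes the stem's last syllable."""
    if not suffix or suffix[0] in VOWELS:
        return False
    return len(suffix) == 1 or suffix[1] not in VOWELS


def attach(stem, suffix):
    """Attach a case suffix, weakening the stem's last stop if the syllable closes.

    Assumes the stem ends in stop + vowel, like lahte-.
    """
    if closes_syllable(suffix) and stem[-2] in WEAK_GRADE:
        stem = stem[:-2] + WEAK_GRADE[stem[-2]] + stem[-1]
    return stem + suffix


for suffix in ["a", "n", "ssa", "ksi", "lla"]:
    print(attach("lahte", suffix))
# lahtea, lahden, lahdessa, lahdeksi, lahdella
```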

Autosegmental variation involves a change outside of segments to give a change in meaning, dealing with stress, tone, nasalization, and length, among other things. For stress, English has an entire class of bisyllabic noun-verb pairs where the verb (which in a lot of cases was the original form) is stressed finally, while the noun is stressed initially. Adding confusion to the mix for learners, stress is not represented in the spelling system like it is (at times) in Spanish. Take the written word record. /ɹɪ'kʰoɹd/ is the verb, to put data into a physical or digital medium, while /'ɹɛkʰɚd/ is the noun, which is the resulting combination of data and the medium in question.

In the case of tone, many African languages (Niger-Congo, and also some of those historically classified as Nilo-Saharan and Khoisan) actually mark a change in tense with a "toneme," that is, a tone melody that is otherwise unattached to a tone-bearing unit that has an inflectional usage. I remember observing this when I read up on the Dholuo language (which is Nilotic) in my Advanced Field Methods class, because that is the language that we were working on. The example Payne (2006) gives of nasalisation is from a language of Gujarat in India, where nasality of the first vowel determines whether a pronoun is singular or plural. Length can be used in the same sense as tone in some places in Finnish, and is sometimes referred to in this usage as a "chroneme." The third-person singular form of most verbs is formed simply by lengthening the last vowel of the stem. The illative case also has a chroneme in the singular, where you double the last vowel of a noun stem and add /n/.

Compounding is something a large number of languages do. You take two roots and smoosh 'em together to make a new word stem. :P Given the nature of compounding, it's strictly derivational in nature. More synthetic languages such as Finnish (which has the world's longest palindrome when it's in the nominative case, three roots long) or German can go pretty crazy with this, but you can have polysynthetic languages that put even those two to shame! It's worth noting, as well, that not all compounds are written as such, either as a contiguous word or with hyphens. English actually has to write them that way at times, to state in writing what intonation does in speech. Take black bird vs. blackbird. The former has the phrasal head emphasised, so bird. The latter has black stressed. This differentiates between a bird that is black and a particular species of bird.

Even some isolating languages compound. It's about the only morphological process that Mandarin Chinese actually uses (one could make an argument for derivational reduplication in some very specific contexts), doing everything else through syntax and context. But what I was told in Contrastive Linguistics in my undergrad - which focussed on comparing and contrasting Mandarin and English - is that Mandarin loves its two-root compounds. ;)

Subtractive morphology, which is sometimes just called "deletion," is by far the rarest of these, attested in some Nilo-Saharan languages and the Muskogean languages of the USA (originally from The South). This is where one deletes a segment or two that is actually in the root to create meaning. Payne (2006) uses the example of Murle, a Surmic language of South Sudan, where there is no affix for either singular or plural, and the plural is formed by deleting the last consonant, regardless of what it is. In Alabama, a Muskogean language, one can drop the last two segments of the penultimate syllable to indicate that the verb has a plurality of undergoers (transitive object or intransitive subject) in what's called the pluractional aspect. (Not a term I'm going to use very often :lol: )

Besides the Big Ten, there's also the ever-annoying (to prescriptive grammarians) zero-conversion, which means you don't do a perkeleen thing to the word, you just use it as a different part of speech than its prototypical usage! :lol: An infamous example of this is the word derp :herpderp: . Initially an interjection, I have since seen it used (and indeed used it) as a verb, a noun, an adjective (although derpy is probably more common), and even an adverb (although I've never used it in this last fashion)!

Next post: Free vs. bound, and I explain what clitics are.
Christian. Exterminator of Spammers.

Free vs. bound morphemes

So we've looked at a couple dimensions of morphemes - where they fit in a word (morphological processes) and what their function is (derivational vs. inflectional). There's also the issue of whether or not a morpheme can stand on its own. Remember that scale from earlier, as to how one distinguishes isolating, analytic, synthetic, and polysynthetic languages? It's a continuum, based on the average number of morphemes per word. Directly correlating with this (and quite possibly caused by it) is the ratio of free morphemes to bound morphemes. The closer one gets to the extremes of isolating languages (Vietnamese, for example), the more the ratio skews towards free morphemes. The opposite is also true. Polysynthetic languages have a very high proportion of bound morphemes, which cannot stand on their own as words. For example, in a typical Na-Dene language such as Tsilhqot'in or Navajo, verb roots are never free. They will always have tense-aspect-mood markers, and typically will also have pronominal markers (person-number) for at least the subject and sometimes even the object.

English, on the other hand, is a good middle ground. It has a good number of bound morphemes, and also a good number of free morphemes. Obviously, suffixes, prefixes, and our ONE productive circumfix :P are bound morphemes. But even if you discount the number of Latinate roots and affixes that became lexicalised together (ie, the parts have no meaning by themselves in English - ones that are more productive are sometimes called "cranberry morphemes" if Wikipedia is to be believed :P), there are still a few roots in English that could be considered bound. Probably the ones we use most are -cracy and -archy, which Wikipedia very annoyingly lists as suffixes. No, they're not suffixes. They're bound noun roots, coming from the Greek words for "power" and "rule" respectively. "Democracy" could be interpreted as two bound roots, with -dem(o)- ("people-related") also turning up in demographic and epidemic, while -cracy "power/rule of X" can turn up in a number of places. A suffix usually adds to a core meaning. Bound roots have their own. So semantically, democracy is actually a compound - "rule of the people." Now can affixes attach to bound roots? You bet they can. The words acracy and anarchy are both examples of this. (They can mean the same thing, but acracy as a word is usually only used in political philosophy.) -archy is much more frequently prefixed, though. Number prefixes give us a whole host of words, ranging from monarchy (originally meaning rule of one, with mon(o)- being the prefix) to something like decarchy, meaning a nation with ten rulers. Hyperarchy ("excessive government," literally "overgovernment") is another such prefixed use of a bound root.

Now the thing about bound morphemes is that they invariably attach to specific parts of speech. Sure, we have forms that phonologically overlap and even attach to the same part of speech, like the gerundive and present participial forms in English. But what do you do with an "affix" that can supposedly attach to anything and still mean the same thing? From a syntactic and semantic point of view, it isn't bound. "But aren't all affixes bound?" Yes they are. What you're dealing with is a clitic; these are "syntactically free but phonologically bound." They typically play a functional role at the level of the phrase, sentence, or larger unit of discourse, and as such they can attach (phonologically speaking) to any word regardless of part of speech. Probably the most common clitics in English are the contracted forms of the verbs "to be" and "to have," especially when used as auxiliary verbs (more on those when I discuss parts of speech); these can modify/be modified by entire phrases. The English possessive -'s is also a clitic that can modify entire noun phrases, which can include relative clauses post-modifying the head noun, as in "the man I met yesterday's car."

Next post: Morphophonology. After that, possibly in the post following, I'll do parts of speech, as a segue from morphology into syntax.
Member Avatar

Very interesting stuff :D !

Is there any particular reason why English has just the one circumfix?
Member Avatar
Christian. Exterminator of Spammers.

Usually circumfixes develop from prefix-suffix combos, and if not for the fact that the en- -en circumfix has a slightly different meaning than en- or -en by itself (and they basically mean the same thing) it would be analysed that way. It's actually rather idiosyncratic. Then again, English is very idiosyncratic in general! :lol:

There are always historical elements to idiosyncrasies, though. In English's case it often involves borrowing plural forms of foreign nouns as well as the singular forms. This is particularly true of Latin and Greek words, but there's also the Hebrew example of seraph/seraphim that comes to mind. There are also the very old words that go back to Old English and Old West Saxon and the like. Those are idiosyncratic in Modern English because the processes that rendered those forms were lost long ago as the language changed.
Edited by Jarkko, Sep 21 2017, 12:37 AM.
Member Avatar
Christian. Exterminator of Spammers.


Not all phonological processes happen "just because." There is a certain amount of interaction between phonology and morphology in languages that utilise morphology to greater extents, which is referred to as morphophonology or morphophonemics.

English does have a bit of this. But the morphophonology in English can actually be split into two groups - one where the native speaker can actually perceive the difference, and one where (barring linguistic training) they can't. In certain phonological theories that I won't dive into because it is way beyond the scope of a beginning linguistics class, there are actually specific named categories for both kinds of morphophonology.

Let me explain. Consider the /k/ -> [s] phenomenon found in English words of Greek/Latin origin, or the /t/ -> [ʃ] palatalisation process in words of Latinate origin. (Transcriptions represent Western American and Canadian Englishes; this does happen in other Englishes, but the surface forms are slightly different.)

plastic /plæstɪk/ + -ity /ɪti/ -> plasticity [pl̥ˠæstɪsɪɾi:]
reprobate /ɹɛpɹobe:t/ + -ion /jɑn/ -> reprobation [ɹɛpɹ̥əbɛjʃən]

These are examples of words where a speaker can actually discern a change in the sound when the morphophonological process is applied. Now compare that to the forms of the English simple past, where the untrained English speaker actually cannot tell the difference in surface form.

talk /tɑ:k/ + -ed /d/ -> talked [tʰɑ:kt]
slog /slɑ:g/ + -ed /d/ -> slogged [slɑ:gd]

There's a small caveat in examples with alveolars. If the final consonant of the base is an alveolar plosive (/t/ or /d/), which the suffix is as well, this process doesn't occur. Instead, the clash is resolved by epenthesis - remember, that means addition - of a schwa between the base and the suffix. In most North American Englishes, this triggers the flapping process, resulting in waded and waited sounding exactly the same. Flapping doesn't happen in a number of Englishes, including Received Pronunciation (aka Standard British English), meaning that one can tell waded and waited apart in those dialects.

The exact same process occurs with the English plural -/z/ and third-person singular -/z/, where the suffix becomes voiceless when affixed to a base ending in a voiceless consonant, and epenthesis occurs after sibilants (/s/, /z/, /ʃ/, and /ʒ/, plus the affricates /tʃ/ and /dʒ/) to keep the suffix distinct from the base.
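
For anyone who thinks better in code, here is a rough Python sketch of how those two suffixes pick their surface form. It is only an illustration under some assumptions of mine: the function name `surface_suffix` and the segment sets are made up for this post, the voiceless set is not exhaustive, and I've counted the affricates /tʃ/ and /dʒ/ as sibilants since they trigger the same epenthesis (churches, judges).

```python
# Hypothetical helper: choose the surface form of English -/d/ (past)
# or -/z/ (plural / third-person singular) from the base's final segment.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}   # illustrative, not exhaustive
ALVEOLAR_PLOSIVES = {"t", "d"}
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}            # affricates included by assumption


def surface_suffix(final_segment: str, suffix: str) -> str:
    """Return the surface suffix for underlying 'd' or 'z'."""
    if suffix not in ("d", "z"):
        raise ValueError("suffix must be 'd' or 'z'")
    # Segments too similar to the suffix force schwa epenthesis.
    too_similar = ALVEOLAR_PLOSIVES if suffix == "d" else SIBILANTS
    if final_segment in too_similar:
        return "ə" + suffix
    # Otherwise the suffix agrees in voicing with the final segment.
    if final_segment in VOICELESS:
        return {"d": "t", "z": "s"}[suffix]
    return suffix


for word, final, suffix in [("talk", "k", "d"), ("slog", "g", "d"),
                            ("wait", "t", "d"), ("cat", "t", "z"),
                            ("dog", "g", "z"), ("bus", "s", "z")]:
    print(word, "+", "-/" + suffix + "/", "->", surface_suffix(final, suffix))
```

The order of the checks matters: epenthesis has to be tested first, otherwise wait + /d/ would wrongly come out with a voiceless suffix.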

Is this unique to English? Not by a flipping long shot. Finnish is loaded with this stuff. Not only does one have the "weak suppletion" found in most Finnish noun and verb roots, referred to in Finnish grammars and Finnicist linguistics works as "gradation" (and I touched on this in my last post), but the final form of any suffix or clitic ALWAYS harmonises to the last root of a base. Some Finnicists will use shorthand to denote that the front-back feature is left unspecified in the underlying form of the suffix. Others will argue that, since stems with all neutral vowels always take the front-vowel variant of a harmonising suffix, the suffixes are underlyingly front and harmonise when the root has one or more back vowels. Consider:

lahti ("bay") -> lahdessa (inessive), lahdesta (elative), lahtea (partitive)
tyhmä ("stupid") -> tyhmässä, tyhmästä, tyhmää
risti ("cross, sharp-sign (in music)") -> ristissä, rististä, ristiä (/i/, represented by the letter I in Finnish, is a neutral vowel.)
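
The harmony half of this can be sketched in a few lines of Python. This is a simplification with assumptions of my own: the capital letters A, O, U stand in for the "unspecified" shorthand mentioned above, the function name `attach` is made up, the stems are given already in their graded form (lahde-, not lahti), and the whole stem is checked rather than just the last root of a compound.

```python
import unicodedata

# Back vowels trigger back harmony; i and e are neutral and trigger nothing.
BACK_VOWELS = set("aou")
# Capital letters mark a suffix vowel whose front/back value is unspecified.
BACK = {"A": "a", "O": "o", "U": "u"}
FRONT = {"A": "ä", "O": "ö", "U": "y"}


def attach(stem: str, suffix: str) -> str:
    """Attach a harmonising suffix to an already-graded stem."""
    # Normalise so that ä is one code point and never mistaken for plain a.
    stem = unicodedata.normalize("NFC", stem)
    # Any back vowel in the stem selects the back variant; otherwise front.
    table = BACK if any(ch in BACK_VOWELS for ch in stem) else FRONT
    return stem + "".join(table.get(ch, ch) for ch in suffix)


for stem in ("lahde", "tyhmä", "risti"):
    print(stem, "->", attach(stem, "ssA"), attach(stem, "stA"))
```

Notice that risti, with only neutral vowels, falls through to the front variant, which is exactly the fact the "underlyingly front" camp leans on.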

You want a case of wacky morphophonology? Check out the verb tehdä on Wiktionary. But here's the clincher - it is considered a regular verb, because the processes involved in its different forms are very regular in Finnish - the root in its underlying form is actually /tek/- but it is subject to consonant gradation and the /k/ -> [h] / _[t,d] rule.
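
As a toy illustration of that last rule, here it is as a one-line rewrite in Python. The function name `k_to_h` is hypothetical, and the sketch applies only the /k/ -> [h] / _[t,d] rule as stated above, on plain spelling; it ignores gradation and everything else going on in the paradigm.

```python
import re


def k_to_h(form: str) -> str:
    """Rewrite k as h when it stands immediately before t or d."""
    # The lookahead checks the following consonant without consuming it.
    return re.sub(r"k(?=[td])", "h", form)


# Underlying root tek- plus two suffixes that begin with d or t.
print(k_to_h("tek" + "dä"))   # infinitive
print(k_to_h("tek" + "ty"))   # past passive participle
```

A /k/ that is not followed by t or d is left alone, so the root surfaces unchanged wherever this rule has no reason to apply.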

Of course, numerous other languages (most that use morphology) also have morphophonological processes, that is, processes that are triggered by the formation of words. And morphophonology need not be restricted to segments, either - in Chumburung (and many other languages of Africa), for example, tonal changes in a word can occur because of downstep or even upstep, typically that of the automatic variety, because an affix with an underlying tone changes the surface tone melody of the entire word.

So I'll hold off on parts of speech for now, and make a new post for that.
