A Raag is a Language

The concept of a raag in Hindustani classical music seems a lot like a language to me. I’m going to try to explain what a raag is from this perspective and see how much sense the analogy makes.

(Fun fact: this idea was first mentioned to me by my dad, a few years ago, when I was first reading Douglas Hofstadter’s Gödel, Escher, Bach and excitedly explaining to him the mathematical patterns in a Bach piece. He said, “Can’t you describe similar rules and patterns in the notes of a raag?”, of which I was instantly skeptical because what else is the expected attitude of an adolescent boy towards an insightful comment by his father? Anyway, I think it is very true now, and was reminded of it when giving Rujul an impromptu lesson about the seasonal suitability of various raags.)

(Quick notation primer: Hindustani Classical Music (HCM) is based on relative pitch, with a movable tonic/’do’. The basic solfege syllables are sa re ga ma pa dha ni essentially corresponding to do re mi fa sol la ti, and I will also refer to them as degrees of the scale. Another important note is that almost all melodic HCM instruments are continuous-pitch (in the hands of a good performer), especially the ones in a solo performance, even ones like the flute that don’t seem like it.)

Let’s start with phonetics. The entire space of possible raags covers only 12 basic phones, the 12 notes of the chromatic scale; the phonemes of any particular raag are a subset of these. To make this a little more accurate to reality, consider that different raags can use different microtones of the same note, giving more than 12 phones, but there is never a contrast or minimal pair between microtones of the same note in a single raag, so we can consider that note (regardless of microtonal variation) to be a phoneme. This situation is not that different from natural language. A language has considerably more phonemes than a raag has notes, and there are many more phones in general, but a language still uses only a subset of all possible phones, and its phonemes can have ‘microtones’ (e.g. aspiration allophony, velarized articulation, etc.) that are not contrastive within that language.

Phonology (the interactions between phonemes strung together) and morphology (the interactions between word roots and affixes that form an entire word) overlap in a kind of messy way when it comes to raags, because there are no ‘words’ really. Nevertheless I think there are some interesting observations. When a performer moves from one note to the next, it is of course not an instantaneous change, because of the physical properties of their voice or instrument. But that brief, almost unnoticeable period between two notes is not outside the control of the performer; in fact, the type of transition between two notes often defines the nature of a raag, and a raag can have different preferred transitions for different pairs of notes. The transition can be a smooth ease-in-ease-out, or have a lilt at the beginning or end, or hit a very brief grace note en route, or a number of other possibilities. (For those familiar with HCM, a simple example: consider the difference between the dha to sa transitions in Yaman and Bhoop!) This does not have an obvious counterpart in natural language, but I think that it is reminiscent of the sort of phonological/morphological changes that occur at phoneme/morpheme boundaries. In the transition from the root ‘box’ to the suffix ‘s’ (plural), say, there is a ‘grace note’ that makes you say ‘boxes’ with a schwa in the middle and not ‘boxs’.

There is a more straightforward correspondence at the syntax (grammar) level. Every novice HCM student starts learning a raag by learning its aaroha and avaroha, which are a generally ascending and a generally descending sequence of some of the notes of the raag, respectively, that start and end around the tonic (sa) in the appropriate octave. If one is currently singing/playing a note, then the next note can be any note above it in the aaroha or any note below it in the avaroha, such that any valid sequence of notes can be broken into ascending sections agreeing with the aaroha and descending sections agreeing with the avaroha. This is quite literally a grammar that specifies which ‘words’ can be used in what order to form a sentence! A realization or instance of a raag is composed of valid sentences drawn from an infinite space, exactly as the ‘formal’ description of a language’s syntax is realized in actual sentences that obey it. And just like real grammars, the aaroha and avaroha can have context-dependent rules that specify patterns of notes that must be followed, such as ‘if a phrase ends in the tonic, it must almost always do so with the degree sequence 3-4-2-1 (ga ma re sa)’. There is some intersection here with morphology, because the rules also often specify what sort of transitions are to be used in certain phrases, but then again there is considerable interaction between morphology and syntax in natural languages too.
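To make the ‘grammar’ idea concrete, here is a minimal Python sketch of the basic rule described above: a note may be followed by any later note in the aaroha or any later note in the avaroha. Everything in it is my own simplification for illustration: the function names are made up, notes are plain strings (with sa' standing for the upper-octave sa), each note is assumed to appear at most once in each sequence, and context-dependent rules, transitions, and repeated notes are ignored entirely. The Bhoop aaroha and avaroha are the usual textbook pentatonic forms.

```python
# Illustrative only: a toy checker for the basic aaroha/avaroha rule.
# Assumes each note appears at most once in each sequence, and ignores
# context-dependent rules, transitions between notes, and repeated notes.

BHOOP_AAROHA = ["sa", "re", "ga", "pa", "dha", "sa'"]   # generally ascending
BHOOP_AVAROHA = ["sa'", "dha", "pa", "ga", "re", "sa"]  # generally descending


def can_follow(current, nxt, aaroha, avaroha):
    """Return True if `nxt` may directly follow `current`."""
    # Ascending move: nxt comes later than current in the aaroha.
    ascending = (
        current in aaroha
        and nxt in aaroha
        and aaroha.index(nxt) > aaroha.index(current)
    )
    # Descending move: nxt comes later than current in the avaroha.
    descending = (
        current in avaroha
        and nxt in avaroha
        and avaroha.index(nxt) > avaroha.index(current)
    )
    return ascending or descending


def is_valid_phrase(phrase, aaroha, avaroha):
    """Return True if every consecutive pair of notes is an allowed move."""
    return all(
        can_follow(a, b, aaroha, avaroha)
        for a, b in zip(phrase, phrase[1:])
    )


# sa -> ga -> pa ascends along the aaroha; pa -> re -> sa descends along the avaroha.
print(is_valid_phrase(["sa", "ga", "pa", "re", "sa"], BHOOP_AAROHA, BHOOP_AVAROHA))  # True
# ma is not a note of Bhoop, so this phrase is rejected.
print(is_valid_phrase(["sa", "ma", "pa"], BHOOP_AAROHA, BHOOP_AVAROHA))  # False
```

A real raag would need much more than this (the ga ma re sa style of context-dependent ending, for one), but even this toy version accepts infinitely many phrases from a finite description, which is the point of the analogy.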

Finally, we come to semantics and pragmatics. Whether or not a musical phrase produced in the framework of a raag has any meaning is contentious, but there are certainly meta-rules for realizing a raag in the broader context of a performance. For instance, every raag has two special notes called the vaadi and the samvaadi, which are intended to be the primary and secondary focuses (respectively) of phrases in the raag. This doesn’t mean that every phrase must use those notes more often than other notes, nor does it mean that a section of a performance cannot focus on a different note, but the performance as a whole must show a clear emphasis on the vaadi and a less prominent emphasis on the samvaadi over other notes. Another example is the set of idioms of a raag: phrases or smaller note sequences that appear much more frequently than average, and are often building blocks of larger sentences. Again, it is not the case that every phrase must use these idioms, but if they aren’t used frequently enough then the raag does not ‘sound correct’ to most listeners even if the grammar is technically being followed. (Interestingly, these idiomatic phrases vary between gharanas or guilds of musicians, just like dialects!) My last example here is the case of mishra or hybrid raags. If raag A and raag B are in a hybrid, then the performer can switch between a mode in which they realize phrases of raag A and a mode for raag B, but the switching can only happen in certain contexts (e.g. usually not in the middle of an idiomatic phrase); it must be seamless, in the sense that the switch should flow naturally and appear effortless; and both raags’ rules are slightly modified from the versions that would be used for performing purely raag A or B (e.g. maybe using raag A below the dominant (pa) and raag B above). The linguistic counterpart that immediately comes to mind here is code-switching, which has many similar phenomena.

(Warning: another cognitive science reference incoming!) I am now reminded of the units about music and language processing in my Human Brain class last semester. Briefly speaking, the music and language processing pathways are the same for many initial perceptual processes like determining how fast and in which direction pitch is changing, detecting harmonics and the relationships between them, etc., but then the two pathways diverge for higher-level functions specific to language or music that are selectively activated for those stimuli. If the subject listens to songs with lyrics then both are activated. (If I remember correctly, there are also regions that respond selectively to human voices, and it is unclear how these tie into language and music.) However, this research has all been conducted using Western musical frameworks; would the results change with HCM?

In particular, I’m thinking of how my approach to learning a raag changed once I was no longer a novice HCM student. For me, and I believe for most people with more than just basic training in HCM, figuring out which notes a raag uses, and the transitions between them, requires perhaps 15 to 30 seconds of listening to a performance. Within a few minutes, the aaroha, avaroha, and many of the finer morphological rules are reasonably clear. I’d say 30 minutes is almost always sufficient to have a pretty good grasp of the pragmatics of various idioms and such, but to be able to perform something that ‘sounds correct’, I’d have to listen attentively to at least a couple of complete performances. Importantly, most of this learning process is unconscious, because things like phonological and grammatical rules are picked up naturally just by listening to the performer, and even in an instructional setting the most effective way to convey them is having students learn by example. I don’t believe anyone ever actually talks about the various types of transitions between notes; it’s just something you pick up as you get better. This sounds suspiciously similar to learning a language! No amount of theoretical grammar can make a language learner’s sentences flow naturally.

All I need now is access to an fMRI machine (technician included), a few dozen experimental subjects (half of them trained in HCM), an undergraduate research assistant to clean up the data and run the statistics I specify, and some departmental funding. Should be an easy road to publication after that.

