Tuning is hard

I am committed to teaching my students something about the history of tuning in Western European music. I don’t expect them to retain any details or do any math; I just want them to know that the history exists. In preparation, I continue to refine my explanation of this history to myself.

Before the year 1400 or so, Western Europeans mainly tuned their instruments in three-limit just intonation, which they called Pythagorean tuning. (Don’t be fooled by the name; this system was in use in Mesopotamia centuries before the Greeks described it.) Three-limit just intonation is based on the first three harmonics of a vibrating string. Western Europeans really like the pitch ratios produced by these harmonics, as do people from many other cultures (though not all of them). In this post, I will explain why Europeans liked three-limit just intonation, why they nevertheless eventually abandoned it, and what came after.

Let’s start by imagining a guitar string tuned to middle C, and say that it vibrates at a frequency of one Hertz, one vibration cycle per second. (It doesn’t. Middle C’s frequency is 261.626 Hz, 261.626 vibration cycles per second. Multiply all the frequencies in this post by 261.626 if you want real-world numbers.) You can separate a string’s complicated vibration patterns into simple sine wave shaped oscillations called harmonics. Put another way, a string’s movement is the combination of its harmonics.

  • The string’s first harmonic (also known as its fundamental frequency) is its vibration along its entire length. When we say that a string is tuned to C at 1 Hz, we really mean that its first harmonic is a 1 Hz vibration.
  • The string’s second harmonic is its vibration in halves. Each half of the string vibrates twice as fast as the whole string. The second harmonic of C is a 2 Hz vibration, and it produces the pitch C an octave higher than the C at 1 Hz.

The image below shows spectrograms of a violin section playing C4 and C3. Each green peak is a harmonic. Western Europeans, like many (but not all) of the world’s people, consider notes an octave apart to be “the same” note. This is because all of the harmonics of the higher-octave note align with the even-numbered harmonics of the lower-octave note.

For tuning purposes, you can move notes up and down by octaves as much as you want, and it won’t make any difference. (It does make a difference! But not for the purposes of tuning.) You can move a note up an octave by doubling its fundamental frequency, and you can move it down an octave by halving its fundamental frequency.

Let’s move on.

  • The string’s third harmonic is its vibration in thirds. Each third of the string vibrates three times as fast as the whole string. The third harmonic of C is a 3 Hz vibration, and it produces the pitch G that’s a perfect fifth plus an octave higher than the C at 1 Hz.

Given some note, you can find the note that’s a perfect fifth higher like so: multiply the starting note’s fundamental frequency by three, and then divide by two so you keep everything in the same octave. Most people in the world really like the sound of perfect fifths. This is probably because the second, fourth, sixth and eighth harmonics of the higher note align with the third, sixth, ninth and twelfth harmonics of the lower note. That’s not as much alignment as you get in an octave, but it’s still enough to tell you that the two notes must be closely related.
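If you want to check the harmonic alignment for yourself, here is a minimal Python sketch. The helper names (octave_reduce, shared_harmonics) are my own inventions for illustration, and it uses the same pretend frequencies as above, with C at 1 Hz.

```python
from fractions import Fraction

def octave_reduce(ratio):
    """Double or halve a frequency until it lies between 1 Hz and 2 Hz."""
    ratio = Fraction(ratio)
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def shared_harmonics(low, high, count=12):
    """Frequencies found among the first `count` harmonics of both notes."""
    low_series = {low * n for n in range(1, count + 1)}
    high_series = {high * n for n in range(1, count + 1)}
    return sorted(low_series & high_series)

c = Fraction(1)
g = octave_reduce(c * 3)   # third harmonic of C, moved down an octave
print(g)                   # 3/2

# The fifth shares the frequencies 3, 6, 9 and 12 Hz with C.
print([str(f) for f in shared_harmonics(c, g)])

# The octave shares more: 2, 4, 6, 8, 10 and 12 Hz.
print([str(f) for f in shared_harmonics(c, c * 2)])
```

Using exact fractions rather than floating-point numbers keeps the comparison honest: two harmonics either coincide exactly or they don't.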

So, to recap: the first three harmonics of a string tuned to C at 1 Hz produce C at 1 Hz, another C at 2 Hz, and G at 3 Hz. To expand our tuning system, let’s take G at 3 Hz as our starting note, tune a new string to that frequency and look at its first three harmonics. The G string’s harmonics will produce G at 3 Hz, another G at 6 Hz, and D at 9 Hz. The notes G and D have the same perfect fifth relationship as the notes C and G.

We can expand our three-limit tuning system further by taking D at 9 Hz as our starting note, tuning a new string to that D, and looking at its harmonics. The third harmonic of D produces A at 27 Hz. If we tune a new string to A, then its third harmonic produces E at 81 Hz. We now have C, D, E, G and A, which make up the C major pentatonic scale. You can play a lot of great-sounding music with these notes. Cultures around the world like to use the pentatonic scale, and it is incredibly ancient. In Werner Herzog’s documentary Cave of Forgotten Dreams, an experimental archaeologist named Wulf Hein (no relation to me) plays a replica of a 40,000-year-old bone flute that’s tuned to a clear major pentatonic scale.

To extend three-limit just intonation further, you can work backwards, and ask, what note has C among its first three harmonics? The answer is F at 1/3 Hz, whose first three harmonics produce F at 1/3 Hz, another F at 2/3 Hz, and C at 3/3 Hz (so, 1 Hz). You can then work backwards from F to get B-flat at 1/9 Hz. From B-flat, you can get E-flat at 1/27 Hz, and from E-flat you can get A-flat at 1/81 Hz. You can generate the entire chromatic scale this way. Here’s how it looks.
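The same procedure can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are made up, and I arbitrarily stop at F-sharp going up and D-flat going down so that there are twelve notes. Nothing about three-limit forces you to stop there.

```python
from fractions import Fraction

def octave_reduce(ratio):
    """Double or halve a frequency until it lies between 1 Hz and 2 Hz."""
    ratio = Fraction(ratio)
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

# Notes reached by repeatedly taking third harmonics upward from C...
UP = ["C", "G", "D", "A", "E", "B", "F#"]
# ...and by repeatedly asking "whose third harmonic is this?" downward from C.
DOWN = ["F", "Bb", "Eb", "Ab", "Db"]

def three_limit_scale():
    scale = {}
    for steps, name in enumerate(UP):
        scale[name] = octave_reduce(Fraction(3) ** steps)
    for steps, name in enumerate(DOWN, start=1):
        scale[name] = octave_reduce(Fraction(1, 3) ** steps)
    return scale

# Print the notes in ascending pitch order within one octave.
for name, ratio in sorted(three_limit_scale().items(), key=lambda item: item[1]):
    print(f"{name:>2}  {ratio}")
```

Every ratio that comes out has only powers of 3 and powers of 2 in it, which is the whole meaning of "three-limit".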

So, great! Three-limit just intonation makes all the notes you could ever want and then some, and it’s all based on lovely pure fifths. Why did people stop using it? To find out, let’s listen to a MIDI version of the C major prelude from Bach’s Well-Tempered Clavier Book I in three-limit:

This sounds pretty off-kilter! The major thirds you get from three-limit are way too wide, and the minor thirds are too narrow. To get our thirds sounding good, we will need to tune them to pitch ratios from higher harmonics than just the first three.

  • The string’s fourth harmonic is its vibration in quarters. Each quarter of the string vibrates four times as fast as the whole string. The fourth harmonic of C is a 4 Hz vibration, and it produces the pitch C that’s two octaves higher than the C at 1 Hz. That is not very interesting.
  • The string’s fifth harmonic is its vibration in fifths. Each fifth of the string vibrates five times as fast as the whole string. The fifth harmonic of C is a 5 Hz vibration, and it produces the pitch E that’s a major third plus two octaves higher than the C at 1 Hz. Now we’re getting somewhere.

This E that you get from C’s fifth harmonic is the one that your ear likes to hear. If you move five-limit E into the same octave as C and G, you get a lovely pure C major triad. The fourth, eighth, twelfth and sixteenth harmonics of this E align with the fifth, tenth, fifteenth and twentieth harmonics of C.

Also, the sixth, twelfth, eighteenth and twenty-fourth harmonics of this E align with the fifth, tenth, fifteenth and twentieth harmonics of G.

This is not as much audible alignment as you get between C and G, or between C and a higher-octave C, but it still sets off your unconscious pattern-recognition instinct.

Meanwhile, three-limit E is noticeably sharper than five-limit E. When you move them into the same octave as C, three-limit E is tuned to 81/64 Hz, and five-limit E is tuned to 80/64 Hz (5/4 Hz). This ratio of 81 to 80 is going to be coming up a lot in this post. The harmonics of three-limit E don’t line up nicely with the harmonics of C and G. They almost line up, but there is audible friction.
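To put a number on "noticeably sharper", it helps to convert ratios to cents, where 100 cents is one 12-TET semitone. Here is a small sketch; the cents helper is my own, not something from a tuning library.

```python
from fractions import Fraction
from math import log2

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)

three_limit_e = Fraction(81, 64)
five_limit_e = Fraction(5, 4)

gap = three_limit_e / five_limit_e
print(gap)  # 81/80

print(f"three-limit E: {cents(three_limit_e):.1f} cents above C")
print(f"five-limit E:  {cents(five_limit_e):.1f} cents above C")
print(f"difference:    {cents(gap):.1f} cents")
```

The difference comes out to a bit over a fifth of a semitone, which is small on paper but easy to hear in a sustained chord.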

In grad school, I learned that medieval Europeans considered thirds and sixths (inverted thirds) to be dissonant intervals. That seemed nutty to me, but it makes sense: if you are tuning in three-limit, thirds and sixths are dissonant! This also explains why medieval European music uses so many parallel fifths. In three-limit, those sound really great. For example, here’s the Hilliard Ensemble singing Perotin’s “Viderunt Omnes” from 1198.

In the Renaissance, European composers were getting tired of the medieval parallel-fifths sound, and they wanted to have in-tune thirds. They devised some new tuning systems called meantone temperaments that replace the bad three-limit thirds with the good five-limit thirds. There are many different versions of meantone, and the details are very tedious. The important thing is that the thirds in meantone sound great, but as a result, some of the fifths are wildly out of tune. On a practical level, meantone temperaments tend to sound great in the key of C major and nearby keys, but they sound pretty grim when you get to more distant keys. Kyle Gann has audio examples here.

But why use meantone at all? Why not just build a better tuning system entirely out of five-limit just intonation? Unfortunately, five-limit introduces new problems of its own. I made a “family tree” showing the notes you get from the third and fifth harmonics of C, the notes you get from all of those notes’ third and fifth harmonics, the notes that include C in their third and fifth harmonics, the notes that include those notes in their third and fifth harmonics, and so on. In theory, this diagram extends infinitely outward in all directions. See if you can figure out what’s wrong with this tuning system.

You may notice that all of the notes closest to C appear more than once in the diagram. This is because you are seeing both three-limit and five-limit versions of those notes. They conflict with each other, the same way that three-limit E is not the same as five-limit E.

  • There’s a three-limit D at 9 Hz and a five-limit D at 5/9 Hz. If you put them in the same octave, you get 9/8 Hz for the three-limit D and 10/9 Hz for the five-limit D. They differ by a ratio of 81/80.
  • There’s a three-limit A at 27 Hz and a five-limit A at 5/3 Hz. They also differ by a ratio of 81/80.
  • The same goes for three-limit F at 1/3 Hz and five-limit F at 27/5 Hz.
  • The most dramatic problem is the fact that the diagram shows three different tunings for C. In the center, we have C at 1 Hz. In the upper right corner, we have a five-limit C at 5/81 Hz. In the lower left corner, we have a different five-limit C at 81/5 Hz.

That 81/80 ratio separating all of these conflicting note tunings is important enough in music history to have its own name; it’s called the syntonic comma.
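You don't have to take my word for it that the same ratio keeps turning up. The sketch below feeds in the conflicting frequencies listed above and measures the gap between each pair, ignoring octaves. The gap helper is my own hypothetical function; it just picks whichever way of writing the ratio lands closest to 1.

```python
from fractions import Fraction

def octave_reduce(ratio):
    """Double or halve a frequency until it lies between 1 Hz and 2 Hz."""
    ratio = Fraction(ratio)
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def gap(a, b):
    """Smallest ratio (at least 1) separating two notes, ignoring octaves."""
    r = octave_reduce(Fraction(a) / Fraction(b))
    return min(r, 2 / r)

# (three-limit or central tuning, conflicting five-limit tuning), in Hz
conflicts = {
    "D": (Fraction(9), Fraction(5, 9)),
    "A": (Fraction(27), Fraction(5, 3)),
    "F": (Fraction(1, 3), Fraction(27, 5)),
    "C (upper right)": (Fraction(1), Fraction(5, 81)),
    "C (lower left)": (Fraction(1), Fraction(81, 5)),
}

gaps = {name: gap(a, b) for name, (a, b) in conflicts.items()}
for name, g in gaps.items():
    print(f"{name}: {g}")   # every line ends in 81/80
```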

Having multiple tunings for individual notes is not the only problem with five-limit tuning. It also shares one of the problems of three-limit: the endless proliferation of enharmonics. In both systems, F-sharp and G-flat are two different notes. So are C-sharp and D-flat, G-sharp and A-flat, D-sharp and E-flat, and A-sharp and B-flat. Not only that, but E-sharp and F are different notes, as are F-flat and E, and B-sharp and C, and C-flat and B. And don’t even get me started on the double flats and double sharps.

On continuous-pitch instruments like violin, enharmonics are no problem. You can just finger F-sharp and G-flat a little differently. However, on keyboards and fretted string instruments, enharmonics are a major headache. Let’s look at D-sharp and E-flat as an example.

  • In three-limit, D-sharp is 19,683 Hz, while E-flat is 1/27 Hz. If you move them into the same octave, that’s 19,683/16,384 for D-sharp compared to 32/27 for E-flat. The difference between them is 531,441/524,288, a ratio called the Pythagorean comma. It’s pretty close to being 1, but not close enough.
  • In five-limit, D-sharp is tuned to 75 Hz, while E-flat is tuned to 3/5 Hz. Moved into the same octave, that’s 75/64 for D-sharp compared to 6/5 for E-flat. The difference between them is 125/128 (more often written the other way up, as 128/125), a ratio called a diesis. As with the Pythagorean comma, the diesis is close to 1, but not quite.
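Here is a quick Python check of both enharmonic gaps, using the frequencies from the two bullets above. As before, the helper names are mine and the whole thing is a sketch, not anyone's official method.

```python
from fractions import Fraction
from math import log2

def octave_reduce(ratio):
    """Double or halve a frequency until it lies between 1 Hz and 2 Hz."""
    ratio = Fraction(ratio)
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def gap(a, b):
    """Smallest ratio (at least 1) separating two notes, ignoring octaves."""
    r = octave_reduce(Fraction(a) / Fraction(b))
    return min(r, 2 / r)

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)

# Three-limit: D-sharp is nine fifths up from C, E-flat is three fifths down.
pythagorean_comma = gap(Fraction(3) ** 9, Fraction(1, 27))

# Five-limit: D-sharp at 75 Hz, E-flat at 3/5 Hz.
diesis = gap(Fraction(75), Fraction(3, 5))

print(pythagorean_comma, f"{cents(pythagorean_comma):.1f} cents")
print(diesis, f"{cents(diesis):.1f} cents")
```

In cents, the five-limit gap turns out to be the larger of the two, getting on for half a semitone.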

The practical consequence is that in just intonation systems and their related meantone temperaments, you need to have separate keys or frets for D-sharp and E-flat, separate keys or frets for G-sharp and A-flat, for C-sharp and D-flat, and so on. People did actually build some of these instruments! Here’s a 16th-century organ tuned in meantone with separate black keys for flats and sharps.

It seems needlessly cumbersome to have separate keys for D-sharp and E-flat when they are so close together. Maybe we could just average them out, so we would only need one black key for both of them? This idea motivated the development of another family of tuning systems called the well temperaments. In the well temperaments, D-sharp and E-flat are averaged out to the same note, as are G-sharp and A-flat, C-sharp and D-flat, and so on. This simplifies things enormously. Furthermore, there is no need for a B-sharp key, since it’s averaged out to the same note as C. There is no need for an F-flat key, since it’s averaged out to the same note as E.

Twelve-tone equal temperament pushes the logic of well temperament to its extreme. Again, the details are mind-numbing, but the basic point is that the infinitely ramifying universe of similar but conflicting pitch classes gets reduced to twelve pitch classes total.

Nearly every keyboard, fretted instrument, DAW and digital tuner in the modern world uses 12-TET by default. It’s so nice and simple! Unfortunately, it doesn’t sound very good. Octaves are perfectly in tune, but all the other intervals are out of tune. 12-TET fifths are pretty close to three-limit fifths, so that’s nice. But 12-TET major thirds are noticeably sharper than five-limit thirds, and 12-TET minor thirds are noticeably flatter than five-limit minor thirds.
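To see how far off 12-TET is, you can compare each equal-tempered interval (a whole number of 100-cent semitones) against its pure ratio. This is a minimal sketch; the table of intervals and the tet_error function are my own choices for illustration.

```python
from fractions import Fraction
from math import log2

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)

def tet_error(ratio, semitones):
    """How many cents sharp (+) or flat (-) 12-TET is against a pure ratio."""
    return semitones * 100 - cents(ratio)

# interval name: (pure ratio, size in 12-TET semitones)
INTERVALS = {
    "perfect fifth": (Fraction(3, 2), 7),
    "major third": (Fraction(5, 4), 4),
    "minor third": (Fraction(6, 5), 3),
}

for name, (ratio, semitones) in INTERVALS.items():
    print(f"{name:>13}: 12-TET is off by {tet_error(ratio, semitones):+.1f} cents")
```

The fifth is off by about two cents, which almost nobody can hear. The thirds are off by roughly fourteen to sixteen cents, which plenty of people can.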

You can demonstrate the difference between good five-limit thirds and bad 12-TET thirds for yourself using a guitar. If you play the fifth harmonic of the low E string (just shy of the fourth fret), it will play a lovely pure G-sharp that’s two octaves plus a major third higher than the open E. In theory, the fourth fret on the high E string should play this same G-sharp. However, if you play that note while letting the low E string’s harmonic ring, they won’t line up at all. It is, in fact, impossible to have the guitar be in tune with itself. If you tune to three-limit intervals (third harmonics), the thirds will be out of tune, but if you tune to five-limit intervals (fifth harmonics), the fifths will be out of tune. If you use a digital tuner (so, tuning in 12-TET), none of the strings will be in tune at all, but they’ll be out of tune by an acceptably small amount… or so we as a civilization have decided to believe.
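Here is the guitar experiment as arithmetic, this time with real-world frequencies. The sketch assumes a few things that are not in the text above: standard tuning, concert pitch of A4 = 440 Hz, open strings tuned with a 12-TET digital tuner, and idealized strings whose harmonics are exact integer multiples. The function name midi_to_hz is my own.

```python
from math import log2

A4_HZ = 440.0  # assumed concert pitch

def midi_to_hz(note_number):
    """12-TET frequency of a MIDI note number (A4 is note 69)."""
    return A4_HZ * 2 ** ((note_number - 69) / 12)

low_e = midi_to_hz(40)      # open low E string (E2)
harmonic = 5 * low_e        # its fifth harmonic, a pure G-sharp
fretted = midi_to_hz(68)    # high E string, fourth fret (G#4) in 12-TET

mismatch = 1200 * log2(fretted / harmonic)
print(f"fifth harmonic of low E: {harmonic:.2f} Hz")
print(f"fretted G-sharp:         {fretted:.2f} Hz")
print(f"fretted note is {mismatch:.1f} cents sharp")

# Where the harmonic's node sits, measured in frets from the nut.
# A fret n leaves 2**(-n/12) of the string sounding; the node leaves 4/5.
node_fret = 12 * log2(5 / 4)
print(f"node is at fret position {node_fret:.2f}")
```

The two G-sharps are about three hertz apart, so when they ring together you hear a slow, queasy beating.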

I think it’s important that undergrads hear about all of this early in their music making lives, complicated though it is, for several reasons.

  1. Guitarists can stop feeling crazy for not being able to get their instruments sounding the way they want.
  2. When you look at r/musictheory and other places where theory novices gather, someone always talks about A-sharp in the key of F, someone always corrects them, and someone else asks what difference it makes. Or, like, why do scores show C-flats and B-sharps when those are not real things? Or why do classical theorists sometimes call major sixths “diminished sevenths”, or call flat sevenths “augmented sixths”? That stuff made me crazy in school. Now that I know that C-flat used to be an actual different note from B, and that a major sixth used to sound different from a diminished seventh, I’m much more at peace.
  3. The kids should know that there is a world beyond five-limit. For example, there’s seven-limit just intonation, which is not nearly as exotic as you might think. When barbershop quartets sing those lovely resonant seventh chords, they are singing them in seven-limit.
  4. There is also reason to believe that the blues emerges out of seven-limit just intonation, specifically, the first seven harmonics of I and IV (e.g. the E and A strings on a guitar). This hasn’t been conclusively proven, but it makes too much sense to dismiss. For one thing, it’s quite easy to hear the first seven harmonics of the guitar’s E and A strings, and the resulting intervals make highly idiomatic-sounding blue notes. For another thing, there is no remotely plausible alternative explanation for blues harmony.

I know why people don’t usually teach the history of tuning systems to beginner music theory students (or to anyone). It’s so arcane, and you can’t just retune your piano when you need to demonstrate something. But it’s fairly easy (and getting easier) to retune your MIDI keyboard using software like MTS-ESP, and it’s very easy to play harmonics on the guitar.

I wanted to compare three-limit and five-limit tuning systems for myself, so I used Ableton Live and MTS-ESP to alternate various intervals and chords between their three-limit and five-limit versions. I put it all over the beat from “When The Levee Breaks” by Led Zeppelin.

On first listen, I found that the three-limit thirds actually sounded better than the more in-tune five-limit thirds. The in-tune thirds sound weirdly flat. This is because I, like you, have grown up in a world dominated by 12-TET, whose major thirds are closer to three-limit than to five-limit. If you listen to five-limit for a while, its smoother sound grows on you, and then you can hear how anxiety-producing those wide three-limit thirds are. But it’s hard to shake familiar things, even when they are worse. There’s a not very subtle metaphor here.

5 replies on “Tuning is hard”

  1. Thanks for this post. It clears things up to know that C flat is not an “imaginary note” as I once saw it described, but was once a real thing. Is it still a real thing on fretless instruments, such as a violin or fretless bass?

  2. Thanks for this post! Very useful and insightful to see how the intersection of theory, perception, culture, and practical constraints has resulted in a system that can seem arbitrary at times when you don’t understand the history and context.

  3. Nice summary article. There are two other things that you might find interesting to consider.

    * The overtone graphs you show are theoretical. For most actual instruments, the 2nd harmonic is not exactly 2X the first, and the 3rd and higher are even less close to the theoretical integer multiples. Generally speaking, the actual overtones are a little sharper than the integer multiple, and get sharper as you go up the series.

    * When tuning two different notes, our ears are not at all comparing the relationship of the fundamental frequencies, but we are tuning the overtones that do coincide.

    The result of this is that, for example when tuning a piano, the lower notes are tuned lower than as would be expected by the integer relationships, so that the overtones line up. This is called stretched tuning. https://en.wikipedia.org/wiki/Stretched_tuning

    1. Those overtone graphs are not theoretical, they are screencaps of a spectrogram from actual string instrument samples. Stretched tuning is a factor in piano tuning because of the physics of the low strings, but not every instrument has these same constraints. You only need to do very slight stretch tuning for the low strings of a guitar, and for violin it’s not a thing.
