This pie chart displays the smallest number of consecutive notes required to uniquely identify a tune in the Skiptune database.
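To make the measure concrete, here is a minimal Python sketch of how such a number could be computed. It is an illustration only, not a description of how Skiptune actually does it: the function names `ngrams` and `min_unique_length`, the toy tunes, and the use of single letters as note tokens are all assumptions made for this example.

```python
from collections import defaultdict


def ngrams(seq, n):
    """Return the set of all runs of n consecutive tokens in seq."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def min_unique_length(tunes):
    """Map each tune name to the smallest n for which some run of n
    consecutive tokens occurs in no other tune (None if there is none)."""
    result = {name: None for name in tunes}
    max_len = max((len(seq) for seq in tunes.values()), default=0)
    for n in range(1, max_len + 1):
        # Record which tunes contain each run of length n.
        owners = defaultdict(set)
        for name, seq in tunes.items():
            for gram in ngrams(seq, n):
                owners[gram].add(name)
        # A run owned by exactly one tune identifies that tune.
        for names in owners.values():
            if len(names) == 1:
                (name,) = names
                if result[name] is None:
                    result[name] = n
        if all(value is not None for value in result.values()):
            break
    return result


# Hypothetical toy "tunes"; each letter stands for one note token.
toy_tunes = {"x": "CDEC", "y": "CDEF", "z": "EFG"}
print(min_unique_length(toy_tunes))  # {'x': 2, 'y': 3, 'z': 1}
```

In this toy data, "z" is identified by a single note (G appears nowhere else), "x" by two notes (E followed by C), and "y" only by three (D, E, F), because all of its shorter runs also occur in another tune.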

A few observations are in order. First, it is somewhat surprising that as many as 1.4 percent of the tunes can be uniquely identified by only two consecutive notes. About 1,000 tunes out of 82,000 can be differentiated from every other tune in the database by just a 2-note pattern (one note followed by another note, though either may involve tied notes). We would have thought that by the time the database had nearly a hundred thousand melodies, there would be precious few, if any, 2-note patterns that a composer, any composer, would have used only once. Of course, there are many more tunes to add to the database, and we do expect this number to drop slowly. You can see examples of the types of patterns that result in some of these unique 2-note patterns in our page on surprisingly rare patterns.
Second, it is notable that a fifth of the tunes (19 percent) can be identified by just three consecutive notes. Given that 44 percent of the tunes can be uniquely determined by four notes, we find that nearly two out of three melodies can be distinguished from all other tunes by four or fewer consecutive notes.
Third, we expected far more tunes in the long tail of the distribution. We find only about 8,500 tunes needing six or more notes to be identified. Here’s a table breaking down the 10 percent of tunes that require six or more notes to be identified uniquely.
Number of Tunes Requiring at Least 6 Notes to Be Uniquely Identified
| Number of Notes Needed to Identify Tune Uniquely | Number of Tunes |
|---|---|
| 6 | 5932 |
| 7 | 1382 |
| 8 | 473 |
| 9 | 203 |
| 10 | 115 |
| 11 | 70 |
| 12 or more | 344 |
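As a quick check on the figures quoted above, the table rows can be added up and compared with the database size. This is only a sanity-check sketch: the counts are copied from the table, 82,000 is the approximate database size given in the text, and the variable names are ours.

```python
# Counts copied from the table above, keyed by notes needed.
tunes_by_notes_needed = {
    "6": 5932,
    "7": 1382,
    "8": 473,
    "9": 203,
    "10": 115,
    "11": 70,
    "12 or more": 344,
}
database_size = 82_000  # approximate size quoted in the text

long_tail_total = sum(tunes_by_notes_needed.values())
long_tail_share = 100 * long_tail_total / database_size

print(long_tail_total)            # 8519
print(round(long_tail_share, 1))  # 10.4
```

The total of 8,519 agrees with the "about 8,500" figure, and the share of roughly 10.4 percent agrees with the "10 percent" figure, given that the database size is itself approximate.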
You might be curious about the tunes that are really hard to identify without a lot of notes. Here is a selection of a few of them by name, with the minimum number of notes needed to identify each.
We start with a tune that needs eight notes (seven consecutive 2-note patterns) to distinguish it from every other tune:

The chant is a 1552 hymn written by Richard Farrant. As you can see, most of the notes are half notes following each other. The eight circled notes are the only eight consecutive notes in the tune that uniquely identify it. Observe that the song itself isn’t much longer than eight notes. In case you’re wondering what tune also contains the first eight notes (starting with the F whole note), the tune is another hymn, “Christian Hearts in Love United,” written by Johannes Thommen in 1735. Here are the first few bars of that tune:

You can see the similarity between this hymn and Farrant’s chant. The circled notes form the same pattern as the first eight notes in the above chant, though in a different key. Thommen wrote this hymn nearly two centuries after Farrant wrote his chant, and the odds are that Thommen borrowed a bit. He borrowed well, as indicated by the number of names this tune goes by. The ones listed above the notes are only about half the hymns using this melody.
Now let’s look at the Irish folk song, Robber, aka Charley Reilly and Croppy Boy, a tune collected by Bunting in the late 1700s and published in 1840. The shortest unique pattern of notes is the final eight, circled in red.

This old tune evolved into “Charles O’Reilly,” or in Gaelic, “Catal Ua Ragallaig,” by 1909 and was published in this form by Captain Francis O’Neill:

Note the similarity in titles: there was no attempt to hide the fact that this tune was “borrowed”. Every 8-note pattern in “Robber” can be found in “Charles O’Reilly” except for the section encircled in red in “Robber”. You can see this for yourself because the first section of “Charles O’Reilly” is an exact restatement of “Robber” except for a change of key. The note patterns themselves are identical.
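One way to see why a change of key leaves a pattern untouched is to describe a melody by the steps between successive notes rather than by the notes themselves. The Python sketch below does this under assumptions of our own: pitches are given as MIDI note numbers, each step is recorded as a pitch interval in semitones plus a ratio of durations, and the phrase is made up for illustration (it is not taken from either tune). The actual Skiptune encoding may differ.

```python
from fractions import Fraction


def transitions(notes):
    """Describe a melody by the steps between successive notes.

    notes is a list of (midi_pitch, duration) pairs. Each step is
    (interval in semitones, ratio of the second duration to the first).
    """
    return [
        (p2 - p1, Fraction(d2) / Fraction(d1))
        for (p1, d1), (p2, d2) in zip(notes, notes[1:])
    ]


# A made-up four-note phrase; durations are in quarter notes.
phrase = [(60, 1), (62, 1), (64, 2), (60, 1)]

# The same phrase moved up a perfect fifth (7 semitones).
transposed = [(pitch + 7, dur) for pitch, dur in phrase]

# The same phrase with every note value halved.
halved = [(pitch, Fraction(dur, 2)) for pitch, dur in phrase]

print(transitions(phrase) == transitions(transposed))  # True
print(transitions(phrase) == transitions(halved))      # True
print(len(transitions(phrase)))                        # 3
```

Under this encoding a run of n notes always yields n - 1 steps, which is why eight notes correspond to seven 2-note patterns.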
We also use these two tunes to illustrate a compromise we had to make in undertaking the Skiptune project. In the original The Robber, the tune is played with a repeat sign at the end, indicating that the player is to replay the pick-up notes and repeat the entire tune, ending with an implied rest. When O’Neill added a second part to the Bunting version of The Robber, renaming it Charles O’Reilly, he eliminated the final rest and added two different pick-up notes to transition to the second tune (the B and G eighth notes in the ninth measure).
If we had recorded the original Bunting version with its repeat, the first half of Charles O’Reilly would not have been included in the database because it would have been an exact replication of The Robber, and only the second half would have been included. But because of the way we enter tunes into the database, The Robber ends with a rest rather than the repeat, whereas Charles O’Reilly ends with the beginning pick-up notes. Consequently, the first half of Charles O’Reilly gets classified as a new tune when in fact it’s identical to The Robber. This is a necessary compromise because if we included all the repeats and all their endings, the database would be too unwieldy. We must use other means, notably our metrics and AI, to identify such equivalencies.
More Examples of Tunes Identified by Long Pattern Strings
Many songs that require a large number of notes to be uniquely identified have been around for a long time and have undergone many revisions and variations. Because such tunes bear a strong resemblance to each other, they need more notes to differentiate them. A good example is Hey, Ho, Nobody Home, a round written by Thomas Ravenscroft in 1609 that has evolved over the centuries. Here’s a relatively recent 1964 version of the tune:

The encircled notes are the ones needed to identify this tune from other versions of this children’s song, including the original. The notes not circled are contained in other versions of the tune, such as Paul Stookey’s A-Soalin’, written in 1963.
Brahms Example
Sometimes composers reuse their themes with some variation, resulting in long strings of notes needed to uniquely identify them. The following Brahms example is one tune that requires 10 notes to identify it uniquely among all the tunes in the database. Brahms wrote this folk melody in 1856, a tune he had heard in his home country:

The tune as Brahms wrote it has the name Marias Wallfahrt, No. 22 from Deutsche Volkslieder, Op. 28. Nearly four decades later, in 1893, Brahms wrote Maria ging aus wandern, No. 14, WoO 33, and that piece is shown below.

One can easily see the small changes Brahms made to the original folk tune to expand the melody and evolve it. For instance, the last note of the encircled notes in No. 22 is a dotted eighth note, but a quarter note in No. 14. If you examine the rest of the earlier piece, No. 22, and compare it to the notes in No. 14, you will see that Brahms carried the un-circled notes unchanged into the piece he wrote much later. Because Brahms added complexity to the simple folk tune, the later 1893 version does not need as many consecutive notes to identify it uniquely. For instance, the last two notes of the encircled passage followed by the next two notes (the “A” eighth note and “E” sixteenth note) occur only in this piece in our entire database, a total of only four notes to identify this tune as unique.
Repeating Notes
Sometimes a musical work requires a long set of notes to set it apart as unique because of lots of repetitive notes. That is true of our next example, variation No. 44 of a ground by a Mr. Henry Eccles in 1684:

One’s first reaction might be that this example is not much of a melody, but that would be because it has been lifted out of context in a set of many variations. This particular variation breaks the simple melody into many sixteenth notes and is highly repetitive. Any shorter set of notes in this piece can be found in other tunes. For instance, the opening notes of Vivaldi’s Winter from the Four Seasons, the allegro non molto section, contain the exact same pattern as the last nine notes encircled in red above (see the first set of encircled notes below). The key is different from Eccles’s variation and Vivaldi uses eighth notes, but the pattern is exactly the same: eight eighth notes followed by another eighth note a major second higher in interval. A few notes later in the same piece, Vivaldi uses the same pattern as the first nine notes encircled above (the second set of encircled notes):

The two encircled passages demonstrate identical patterns found in the Eccles variation. Both these patterns can be found in a couple dozen other themes throughout the centuries, even in modern times. “The Lion Sleeps Tonight,” written in 1939 but popularized a few decades later, has this 9-note pattern in the section with the words “o-weem-o-way”. The point is that songs with many repeating notes are strong candidates for requiring a long set of notes to identify a melody uniquely.
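The effect of repetition can be illustrated with a small Python sketch. The two interval sequences below are invented for the purpose (they are not transcriptions of any tune mentioned here), and `distinct_windows` is a hypothetical helper of our own. The idea is simply that a highly repetitive line offers few different short patterns, so it has fewer chances of containing one that no other tune shares.

```python
def distinct_windows(seq, n):
    """Count the different runs of n consecutive items found in seq."""
    return len({tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)})


# Invented pitch steps in semitones; 0 means the note is repeated.
repetitive = [0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, -2]
varied = [2, 2, 1, 2, 2, 2, 1, -1, -2, -2, -2, -1, -2, -2, 3, 4]

print(distinct_windows(repetitive, 4))  # 6
print(distinct_windows(varied, 4))      # 13
```

Both sequences have the same length, yet the repetitive one contains fewer than half as many different four-step patterns as the varied one.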