The Skiptune database consists of 82,000 melodies as of January 2026. Unless otherwise noted, we use the words “tune,” “melody,” “musical phrase,” and “song” interchangeably, and any of those terms may refer to just a portion of the full melodic line of a song. In short, we are trying to capture a “musical thought.” Only the melody is entered: no percussion, and no harmonies or bass unless they are carrying the melody. Each entry in the database consists of the following items:
1. ID. A unique ID number to distinguish each tune from every other tune.
2. Name(s). All the names by which the tune has been known. Some tunes have dozens of names. This happens for a variety of reasons explained here.
3. Key Signature. Indicates the number of sharps or flats the tune has as entered in the database. This may not be the key that the tune was originally written in, but simply reflects whatever source we used. Tunes written in modes, such as Dorian or Mixolydian, are not identified as such, but rather simply by the number of flats or sharps. For older tunes, then, our recorded key may be inaccurate. We don’t use the key signature in any of the calculations or metrics; we recorded it only in case it proves useful at some future point for more modern tunes, which are rarely modal.
4. Year Written. The year in which the tune was composed or, failing that, first published. For many folk tunes of international origin, the first-published year can be wildly off from the composition year. For instance, Chinese folk tunes go back centuries, but are mostly recorded in the Skiptune database as having been first published in the 20th century. Such tunes may not have been written down in Western-style notation for most of their lives, so we can’t be sure that the tune has been unchanged for all that time. To be conservative, we use the later, first-published year. While date errors introduce some inaccuracy into our findings, it’s better to include the tunes with known errors in the year than to exclude them entirely, because such tunes help us answer questions about music evolution and commonality between cultures. In some cases we know the composer’s name but not the composition date. We then use the historical record to estimate the date the tune was written or, if nothing else is known about the composition date, we use the halfway point of the composer’s adult life. Choosing the midpoint keeps the largest possible error between our estimated date of composition and the actual date as small as it can be.
5. Tags. “Tags” are descriptors that link the tune to genres or styles, or otherwise link it to other tunes. Tags may reflect any of the following classifications:
a. Genres of music, such as romantic, country western, jazz, or rock
b. Country or area of origin, such as Irish, American, or Latvian
c. Derivation, indicating that the tune is a variation on a main theme or derived from some earlier tune
d. Associated human activities, such as dance, military, or sports
e. Popularity of the tune, such as pop or standard
6. Composer. The name of the composer, if known. If the composer is unknown, we use “anon” (short for “anonymous”).
7. Tempo. The number of beats per minute at which the tune is typically played. For older tunes, the tempo has to be estimated based on the historical record. For some tunes, the typical tempo varies a great deal, so this field is of limited use in analyzing them.
8. Time Signature. The meter in which the tune is written, such as 4/4, 6/8, or 2/4.
9. Source. The source of the tune as entered in the Skiptune database. Recording the source allows us to return to where we obtained the tune in case there’s some question about its authenticity or accuracy. Sources are generally represented by using four capital letters, with some exceptions for shorter or longer acronyms.
10. Tune. Last, the tune itself, recorded as an alternating sequence of MIDI pitches and durations.
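To make the layout of an entry concrete, here is a minimal Python sketch of how the ten items might be held in memory. It is an illustration only, not the actual Skiptune storage format. The names `TuneRecord`, `split_tune`, and `estimate_year` are hypothetical, as are the assumptions that the key signature is stored as a signed count (sharps positive, flats negative), that durations are whole numbers in some unspecified unit, and that “adult life” begins at age 18.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TuneRecord:
    """Hypothetical container mirroring the ten items listed above."""
    tune_id: int
    names: List[str]
    key_signature: int   # assumed encoding: sharps positive, flats negative
    year_written: int
    tags: List[str]
    composer: str        # "anon" when the composer is unknown
    tempo_bpm: int
    time_signature: str  # e.g. "4/4", "6/8", "2/4"
    source: str          # usually a four-letter acronym
    tune: List[int]      # alternating: pitch, duration, pitch, duration, ...


def split_tune(flat: List[int]) -> List[Tuple[int, int]]:
    """Pair an alternating pitch/duration list into (pitch, duration) notes."""
    if len(flat) % 2 != 0:
        raise ValueError("expected pitch/duration pairs (even number of values)")
    return list(zip(flat[0::2], flat[1::2]))


def estimate_year(birth: int, death: int, adult_age: int = 18) -> int:
    """Midpoint of the composer's adult life (adult_age is our assumption)."""
    adult_start = birth + adult_age
    if death < adult_start:
        raise ValueError("composer did not reach the assumed adult age")
    return (adult_start + death) // 2


# Invented example values; none of this is taken from the real database.
record = TuneRecord(
    tune_id=1,
    names=["Example Tune"],
    key_signature=0,
    year_written=estimate_year(1800, 1870),
    tags=["dance"],
    composer="anon",
    tempo_bpm=120,
    time_signature="4/4",
    source="XXXX",
    tune=[60, 4, 62, 4, 64, 8],  # MIDI 60 = middle C, 62 = D, 64 = E
)
notes = split_tune(record.tune)
print(notes)
print(record.year_written)
```

Pairing the flat list into (pitch, duration) tuples makes it easy to check that no value is missing, since an odd-length list is rejected outright.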