Music ID Apps – How Do They Work?

How Music ID Apps Work

Our phones can do just about anything these days, occasionally producing some small miracle that seems like nothing short of magic. In the case of music ID apps, what they do for us can often seem unexplainable.

These music recognition apps can listen to a short, 4-5 second clip of a song, or in some cases even to the user’s spoken or singing voice, and correctly identify the song in a matter of seconds. For years people have struggled with hearing things they enjoyed but couldn’t identify. The invention of music ID apps has helped.

The Beginning

With the invention of the smartphone, potential solutions to common problems increased dramatically. No longer cut off from the world, people now enjoy a level of connection to one another far greater than at any point in history. Information of all sorts is right at our fingertips, and that includes an extensive library of essentially every song recorded in modern history. With all that data out there, it can be daunting to decide how best to use that library to find the information you need.

Music ID apps paved the way for sound recognition software. Instead of forcing a user to search manually through data, the app generates a unique fingerprint of the recorded clip and compares it to a database of similarly generated fingerprints. The first such service launched in the UK in 2002, but became popular later when it was brought to the iPhone in the US. Soon after, others came along to try to get in on the budding market for music recognition apps.

How It Works

While it may feel like magic is fetching your data, the reality is far less grandiose. Some assume that an evolved version of voice recognition software must be at work. However, that would be impractical because of barriers beyond recognizing the voice, such as identifying the song itself and the version of the song, as many songs have several versions by the same singer.

The apps work instead by using proprietary formulas for translating song data into unique numerical codes. A library is generated for known songs. When the app is given a sample recording it creates a fingerprint for that sample. It then compares it to the library. Essentially, the music recognition software does for sound what Google does for words and images.
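To make the library-and-lookup idea concrete, here is a minimal sketch in Python. The real formulas are proprietary, so everything here is an assumption for illustration: the peak lists are made-up toy data, and the names fingerprint, build_library, and identify are hypothetical. Each code combines two nearby peaks and the time gap between them, so a clip taken from the middle of a song still produces codes that appear in that song's entry.

```python
from collections import Counter

def fingerprint(peaks):
    """Turn time-sorted (time, frequency_bin) peaks into a set of integer codes.

    Assumes frequency bins below 4096 and time gaps below 256 so the packed
    fields do not overlap. The packing scheme is invented for this sketch.
    """
    codes = set()
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 4]:  # pair each peak with the next few
            codes.add((f1 << 20) | (f2 << 8) | (t2 - t1))
    return codes

def build_library(songs):
    """Map every code to the set of song titles that contain it."""
    library = {}
    for title, peaks in songs.items():
        for code in fingerprint(peaks):
            library.setdefault(code, set()).add(title)
    return library

def identify(library, sample_peaks):
    """Return the title sharing the most codes with the sample, or None."""
    votes = Counter()
    for code in fingerprint(sample_peaks):
        for title in library.get(code, ()):
            votes[title] += 1
    return votes.most_common(1)[0][0] if votes else None

# Toy data: two "songs", each a handful of made-up peaks.
SONGS = {
    "Song A": [(0, 40), (1, 75), (3, 52), (4, 90), (6, 33), (7, 61)],
    "Song B": [(0, 12), (2, 48), (3, 80), (5, 27), (6, 66), (8, 19)],
}
library = build_library(SONGS)

# A clip from the middle of Song A; its clock starts at zero.
clip = [(0, 52), (1, 90), (3, 33), (4, 61)]
print(identify(library, clip))  # Song A
```

Counting shared codes is a simplification chosen to keep the sketch short. It shows why a lookup can be fast: the app only needs to check which songs contain the handful of codes in the clip, not scan every recording.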

While that sounds simple enough, the complexity of the software is the most common source of problems. For a long time it was considered impractical to boil a song down into a set of digits, as there is simply too much information within a song to abridge into an easy-to-use fingerprint.

Instead, the software creates a three-dimensional plot of the song in order to compare three different measurements simultaneously, typically time, frequency, and intensity. This allows the song recognition software to ignore the insignificant portions of the data and focus only on the high-energy, intense moments.
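As a rough illustration of keeping only the high-energy moments, the sketch below assumes the three dimensions are time, frequency, and intensity, which is how a spectrogram is usually described. The function names, the frame size, and the threshold are all hypothetical choices for this example, and the slow, naive transform is used only so the code needs nothing beyond the Python standard library.

```python
import cmath
import math

def spectrogram(samples, frame_size):
    """Split the signal into frames and measure each frequency's strength.

    Returns one list of magnitudes per frame: the frame index is time,
    the position in the list is frequency, and the value is intensity.
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start : start + frame_size]
        magnitudes = []
        for k in range(frame_size // 2):  # naive DFT, fine for a demo
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames

def strongest_peaks(frames, threshold):
    """Keep the loudest (time, frequency) point per frame, if loud enough."""
    peaks = []
    for t, magnitudes in enumerate(frames):
        f = max(range(len(magnitudes)), key=magnitudes.__getitem__)
        if magnitudes[f] >= threshold:
            peaks.append((t, f))
    return peaks

def tone(bin_index, frame_size=32):
    """One frame of a pure sine wave that lands exactly on a frequency bin."""
    return [math.sin(2 * math.pi * bin_index * n / frame_size)
            for n in range(frame_size)]

# Test signal: a tone at bin 4, a tone at bin 9, then a frame of silence.
signal = tone(4) + tone(9) + [0.0] * 32
print(strongest_peaks(spectrogram(signal, 32), threshold=1.0))
# [(0, 4), (1, 9)]
```

The silent frame contributes nothing, which is the point: quiet or featureless stretches drop out, and only the loud, distinctive points survive to feed the fingerprint.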

These distinct data points are generated for each song at a rate of roughly three per second, well within the range of acceptable complexity for use as a fingerprint.
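To get a feel for how small that is, here is a quick back-of-the-envelope calculation. The three-per-second rate and the 4-5 second clip length come from the text above; the 210-second song length is just an assumed example, and points_in_clip is a hypothetical helper name.

```python
POINTS_PER_SECOND = 3  # approximate rate quoted in the text

def points_in_clip(seconds, rate=POINTS_PER_SECOND):
    """Approximate number of fingerprint data points in a recording."""
    return seconds * rate

for seconds in (4, 5, 210):  # two short clips and an assumed 3.5-minute song
    print(seconds, "seconds ->", points_in_clip(seconds), "points")
# 4 seconds -> 12 points
# 5 seconds -> 15 points
# 210 seconds -> 630 points
```

So a typical clip boils down to only a dozen or so data points, and even a full song stays in the hundreds.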