When we like a song we hear in a cafe, at an event, or in a cool video edit, we pull out our phone, open Shazam, and it identifies that song within seconds.
Being able to do this even in places with heavy background noise, such as cafes and events, makes you stop and think: “How does it find the song in seconds?” Of course, once the song is found, we don’t pay much attention to what goes on behind the scenes, because we don’t open the app again until we need to Shazam the next song. But that is the more impressive part.
First, let’s briefly talk about Shazam.
The app, which reached more than 225 million users per month in 2022, is very simple to use. When you open it, all you have to do is tap the big logo once. Shazam then listens to the music through your microphone and identifies it in seconds, like black magic.
Of course, let’s leave the Middle Ages behind and not call Shazam black magic. Here’s the logic behind it:
When you play music to Shazam, the sound waves of that music are converted into data that a computer can process. Because each song produces its own unique pattern, this data works as a fingerprint for the song. The process is built on simplification: instead of storing every detail of where the sound gets louder, higher or lower in pitch, only the peaks of these events in the sound waves are taken into account. This simplification is what lets the process finish in seconds.
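To make the idea of "keeping only the peaks" concrete, here is a minimal sketch in Python. It is an illustration only, not Shazam's code: the function name `local_peaks` and the loudness readings are made up for this example, and a real system looks for peaks in a spectrogram rather than in a single list of numbers.

```python
def local_peaks(samples):
    """Return (index, value) for every sample larger than both of its neighbours."""
    return [
        (i, samples[i])
        for i in range(1, len(samples) - 1)
        if samples[i] > samples[i - 1] and samples[i] > samples[i + 1]
    ]


# Hypothetical loudness readings taken at regular intervals.
loudness = [0, 2, 5, 3, 1, 4, 9, 6, 2, 3, 7, 4, 1]

peaks = local_peaks(loudness)
print(peaks)                       # [(2, 5), (6, 9), (10, 7)]
print(len(loudness), "values reduced to", len(peaks), "peaks")
```

Thirteen readings shrink to three points, yet those three points still describe the overall shape of the signal. Less data to store and compare is what keeps the later search fast.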
Now that the basic logic is clear, let’s move on to a practical example. Let’s say you liked the song playing while you were sitting in a cafe and you Shazamed it. The moment you tap the logo, Shazam records the sound and creates a spectrogram, a map of which frequencies are present at each moment. All the sound reaching the microphone within a certain time window (max. 20 seconds for Shazam) is recorded in this spectrogram. The peaks we just mentioned are then picked out of it, and the complexity of the full spectrogram is reduced to a sparse set of points.
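The spectrogram step can be sketched in a few lines of Python. This is a toy version under stated assumptions, not Shazam's implementation: the sample rate, the frame size, and the helper names `spectrogram`, `strongest_bins`, and `tone` are all chosen for this example, the transform is a plain (slow) discrete Fourier transform, and only the single loudest frequency per time slice is kept, whereas a real system would use a fast Fourier transform and keep several peaks.

```python
import cmath
import math

SAMPLE_RATE = 800   # samples per second (assumed, kept tiny so the example runs fast)
FRAME_SIZE = 80     # samples per time slice, so each frequency bin is 10 Hz wide


def spectrogram(samples, frame_size):
    """Cut the signal into frames and measure the strength of each frequency bin."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        magnitudes = []
        for k in range(frame_size // 2):  # bins below half the sample rate
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


def strongest_bins(spec):
    """Simplify the spectrogram: keep only the loudest bin of every frame."""
    return [
        (t, max(range(len(mags)), key=lambda k: mags[k]))
        for t, mags in enumerate(spec)
    ]


def tone(freq_hz, count):
    """Generate `count` samples of a pure sine tone (stand-in for recorded audio)."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(count)]


# A made-up recording: a 100 Hz tone followed by a 200 Hz tone.
signal = tone(100, 160) + tone(200, 160)

peaks = strongest_bins(spectrogram(signal, FRAME_SIZE))
peaks_hz = [(t, k * SAMPLE_RATE // FRAME_SIZE) for t, k in peaks]
print(peaks_hz)  # [(0, 100), (1, 100), (2, 200), (3, 200)]
```

The 320 recorded samples end up as four (time, frequency) points. That short list of points is the kind of simplified data the next stage works with.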
In the final stage, these peaks, now turned into data, are paired with each other, and the pairs are compared with the pairs stored in Shazam’s huge library. If enough pairs match, Shazam reports that it has found the song and shows the song name and artist. Of course, this process, which we have explained at length, can be completed in seconds because it is carried out by a computer.
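The pairing and matching stage can be sketched like this. Again, this is a simplified illustration and not Shazam's actual code: the song library, the peak values, the `fan_out` and `min_matches` settings, and all function names are hypothetical. Each peak is paired with the next few peaks, every pair becomes a small key (first frequency, second frequency, time gap), and a song wins when enough of its keys line up with the recording at one consistent time offset.

```python
from collections import Counter, defaultdict


def pair_hashes(peaks, fan_out=3):
    """Pair each (time, freq) peak with the next few peaks.

    Returns a list of ((freq1, freq2, time_gap), time_of_first_peak).
    """
    peaks = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            hashes.append(((f1, f2, t2 - t1), t1))
    return hashes


def build_index(library):
    """Map every pair key to the songs (and positions) where it occurs."""
    index = defaultdict(list)
    for title, peaks in library.items():
        for key, t in pair_hashes(peaks):
            index[key].append((title, t))
    return index


def identify(index, sample_peaks, min_matches=3):
    """Return the best-matching title, or None if too few pairs agree."""
    votes = Counter()
    for key, t_sample in pair_hashes(sample_peaks):
        for title, t_song in index.get(key, []):
            # Pairs from the right song all share the same time offset.
            votes[(title, t_song - t_sample)] += 1
    if not votes:
        return None
    (title, _offset), count = votes.most_common(1)[0]
    return title if count >= min_matches else None


# Hypothetical library: each song is a list of (time, frequency) peaks.
library = {
    "Song A": [(0, 100), (1, 250), (2, 180), (3, 300), (4, 120), (5, 410), (6, 220)],
    "Song B": [(0, 300), (1, 90), (2, 400), (3, 150), (4, 330), (5, 200), (6, 270)],
}
index = build_index(library)

# A short clip from the middle of "Song A", plus one stray peak from cafe noise.
clip = [(0, 180), (1, 300), (1, 999), (2, 120), (3, 410)]
print(identify(index, clip))  # Song A
```

The stray noise peak produces a few pairs that match nothing, but the pairs that do match all agree on the same offset into "Song A", which is why a noisy recording can still be identified in this sketch.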
Seen from this perspective, the Shazam algorithm is as simple, and as intricate, as breathing.