Bass Detection for sound reactive patterns

@Sunandmooncouture

Whoa - I cannot believe you just posted this today!

I just finished a really helpful pattern for you, Music Sequencer. It tackles a lot of the problems mentioned in this thread.

SB limits

I have just enough background to be dangerous. I took a few audio engineering courses in undergrad, and I also wrote the Sensor Board reference page. Here are our limiting factors with the current sensor board:

  1. Updates come into PB at 40 Hz - patterns running at a higher FPS will reuse the last values received.
  2. The bin spacing is processing-limited in the low frequencies. This means a sine wave is only distinguishable from the nearest semitone at C5 and above (the C above middle C, about 523 Hz). A lot of instrument detection relies on detecting harmonics of a fundamental, which isn't always possible with these bins.
  3. As Scruffynerf noticed, this means maxFrequency is sometimes a harmonic; and in most music, bass lines have much more energy density than even a synth lead.
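To sanity-check point 2: a semitone spans about 5.95% of its frequency (a ratio of 2^(1/12)), so with fixed-width FFT bins, two pitches a semitone apart only land in separate bins once that span exceeds one bin width. A quick sketch in plain JS (the ~30 Hz bin width here is my assumption for illustration - check the actual spacing on your board):

```javascript
// A semitone up from frequency f is f * 2^(1/12), so the gap it spans is
// f * (2^(1/12) - 1), about 5.95% of f.
const SEMITONE_RATIO = Math.pow(2, 1 / 12);

function semitoneWidthHz(freq) {
  return freq * (SEMITONE_RATIO - 1);
}

// Lowest frequency where one semitone is wider than one FFT bin,
// i.e. where neighboring semitones become distinguishable.
function semitoneResolvableFrom(binWidthHz) {
  return binWidthHz / (SEMITONE_RATIO - 1);
}

// With an assumed ~30 Hz bin width, resolution starts around 504 Hz -
// just below C5 (523.25 Hz), matching the claim above.
const threshold = semitoneResolvableFrom(30);
```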

Beat and tempo detection steps

After trying a bunch of approaches (and I can’t wait to see what you’ve got going, @zranger1), I settled on:

  1. Sum 3 of the low frequency bins
  2. Keep track of the highest sum seen, and decay that maximum when bass is present but it's been a long time since a new maximum
  3. Maintain a fast and slow exponential moving average of the bass sum
  4. Store the first derivatives of the fast moving average in a circular buffer. If the average of this derivative buffer is above a threshold, bass has been rising on net for several samples, and the code declares a beat event.
  5. Debounce the events to be no more frequent than 16th notes
  6. Store the intervals between the last 8 beat events. If the standard deviation of these sampled intervals is low, compute the estimated BPM and mark the confidence.
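The steps above can be sketched in plain JavaScript (not PBScript, and not the actual Music Sequencer source - every constant here, the EMA alphas, the rise threshold, the debounce window, is an illustrative guess):

```javascript
// Tunables - all assumptions, not the pattern's real values
const FAST_ALPHA = 0.5, SLOW_ALPHA = 0.05;
const DERIV_LEN = 4;          // derivative samples averaged for "net rising"
const RISE_THRESHOLD = 0.01;  // mean derivative that counts as a beat
const DEBOUNCE_MS = 125;      // roughly 16th notes at 120 BPM
const INTERVALS_LEN = 8;      // beat intervals kept for tempo estimation

let fastEMA = 0, slowEMA = 0, prevFast = 0;
let derivs = new Array(DERIV_LEN).fill(0), derivIdx = 0;
let lastBeatMs = -Infinity;
let intervals = [];

// Call once per 40 Hz frame with the summed low-frequency bins.
function onBassSample(bassSum, nowMs) {
  fastEMA = FAST_ALPHA * bassSum + (1 - FAST_ALPHA) * fastEMA;
  slowEMA = SLOW_ALPHA * bassSum + (1 - SLOW_ALPHA) * slowEMA;

  // Circular buffer of first derivatives of the fast average
  derivs[derivIdx] = fastEMA - prevFast;
  derivIdx = (derivIdx + 1) % DERIV_LEN;
  prevFast = fastEMA;

  const meanDeriv = derivs.reduce((a, b) => a + b, 0) / DERIV_LEN;
  // Net rise over several samples + debounce = beat event
  const isBeat = meanDeriv > RISE_THRESHOLD &&
                 nowMs - lastBeatMs >= DEBOUNCE_MS;
  if (isBeat) {
    if (lastBeatMs > -Infinity) {
      intervals.push(nowMs - lastBeatMs);
      if (intervals.length > INTERVALS_LEN) intervals.shift();
    }
    lastBeatMs = nowMs;
  }
  return isBeat;
}

// If the recent intervals agree (low standard deviation), report a BPM.
function estimateBPM() {
  if (intervals.length < INTERVALS_LEN) return null;
  const mean = intervals.reduce((a, b) => a + b, 0) / intervals.length;
  const variance = intervals.reduce((a, b) => a + (b - mean) ** 2, 0) / intervals.length;
  if (Math.sqrt(variance) > 0.1 * mean) return null; // not confident
  return 60000 / mean;
}
```

Feeding this a clean 120 BPM pulse (a bass spike every 500 ms, sampled at 40 Hz) settles on 120 BPM once eight consistent intervals have accumulated.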

The other “instrument” detectors (claps and hi-hats) work similarly, but use different frequency bins, and simply compare their current sample to their exponential moving average. If it’s high enough above the recent signal, debounce that as the detection event.
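That compare-to-recent-average idea is compact enough to sketch as a reusable factory - again in plain JS with made-up constants (ratio, alpha, debounce), not the pattern's actual values:

```javascript
// Fires when the current energy is well above its own slow moving
// average, with a debounce. Feed it the summed bins for the instrument
// you care about (e.g. mid/high bins for claps or hi-hats).
function makeTransientDetector(ratio = 2.0, alpha = 0.05, debounceMs = 100) {
  let ema = 0.001;           // small seed avoids a divide-by-nothing start
  let lastHitMs = -Infinity;
  return function (energy, nowMs) {
    // Compare against the average *before* folding in this sample,
    // so the spike itself doesn't mask the detection
    const hit = energy > ratio * ema && nowMs - lastHitMs >= debounceMs;
    ema = alpha * energy + (1 - alpha) * ema;
    if (hit) lastHitMs = nowMs;
    return hit;
  };
}

const clapDetector = makeTransientDetector();
// Each 40 Hz frame: clapDetector(clapBinsSum, nowMs)
```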

Known opportunities for improvement

Beat detection: The technique is somewhat successful at identifying bass drums among the competing basslines that exist in the same frequency bins; the most common situation where a beat is missed is when the bass line is much louder than the beat itself. In this scenario, detection would be better if we could also detect a declining fundamental over time.

Bass drums (both real and synthetic) almost always have a “gulping” quality to them which represents the fundamental frequency tracking a downward glissando. Bass lines, by contrast, commonly stay on semitones. Even with just 40Hz updates, it should be possible to better discriminate a beat from bass lines by tracking a declining center of mass on the sonogram.

Tempo inference: I think the key here, according to some papers I read, is to convolve the time domain record of instrument detection events with comb filters of different candidate tempos. This kind of makes sense: It’s computing, “Which ‘1e&a 2e&a’ tempo best coincides with my percussion?” The highest convolution sum indicates the most likely tempo. This is a lot more processing and should probably only be attempted outside of PBScript, in C - maybe on a next iteration of the sensor board. Or, one of you geniuses will figure out a more clever hack :wink: One idea bouncing around in my head is storing the most common non-4:4 beat intervals as a set of detection signatures, an approach not unlike how old matrix metering apparently worked on film cameras.
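To make the comb-filter idea concrete, here's a toy version in plain JS (so, outside PBScript, as suggested above): given a 40 Hz record of onset events, score each candidate BPM by how much onset energy lines up with that tempo's comb teeth at the best phase. The BPM range and scoring are my assumptions for illustration:

```javascript
const SAMPLE_RATE_HZ = 40; // onset record resolution, per the sensor board

// Sum the onsets that fall on a comb of teeth spaced one beat apart,
// trying every phase offset within one beat and keeping the best.
function combScore(onsets, bpm) {
  const period = (60 / bpm) * SAMPLE_RATE_HZ; // frames per beat (may be fractional)
  let best = 0;
  for (let phase = 0; phase < period; phase++) {
    let sum = 0;
    for (let k = 0; ; k++) {
      const t = Math.round(phase + k * period);
      if (t >= onsets.length) break;
      sum += onsets[t];
    }
    if (sum > best) best = sum;
  }
  return best;
}

// "Which tempo best coincides with my percussion?" - highest comb score wins.
function bestTempo(onsets, bpmCandidates) {
  let bestBpm = bpmCandidates[0], bestScore = -1;
  for (const bpm of bpmCandidates) {
    const s = combScore(onsets, bpm);
    if (s > bestScore) { bestScore = s; bestBpm = bpm; }
  }
  return bestBpm;
}
```

Note the classic octave ambiguity: a 240 BPM comb scores just as well on a 120 BPM onset train, so real implementations constrain the candidate range or penalize harmonics of a candidate tempo.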
