Bass Detection for sound reactive patterns

Has anyone done bass detection on sound patterns? I’m looking for something that would detect whether most of the sound was high treble or low bass, maybe a ratio. I understand the frequency data, and could probably sum those into buckets for low/med/high and get a ratio?
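A minimal sketch of the bucket-summing idea, written as plain JavaScript so it can be tried off-device (a pattern would need it ported to Pixelblaze built-ins). The band edges (bins 0-7, 8-19, 20-31) and the function names are assumptions chosen for illustration, not properties of the sensor board.

```javascript
// Sketch: split a 32-bin spectrum into low/mid/high bands and compare them.
// The band edges below are assumed, not taken from the sensor board docs.
var LOW_END = 8;    // bins 0..7 count as "low"
var MID_END = 20;   // bins 8..19 count as "mid", the rest as "high"

// Returns [low, mid, high] energy sums for one frame of frequency data.
function bandSums(frequencyData) {
  var low = 0, mid = 0, high = 0;
  for (var i = 0; i < frequencyData.length; i++) {
    if (i < LOW_END) low += frequencyData[i];
    else if (i < MID_END) mid += frequencyData[i];
    else high += frequencyData[i];
  }
  return [low, mid, high];
}

// Fraction of the total energy that sits in the low band (0 when silent).
function bassRatio(frequencyData) {
  var sums = bandSums(frequencyData);
  var total = sums[0] + sums[1] + sums[2];
  return total > 0 ? sums[0] / total : 0;
}
```

A ratio near 1 means the frame is mostly bass and a ratio near 0 means mostly mids and treble. As the replies note, bass tends to leak into higher bins, so the raw ratio probably needs smoothing over time.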

Here are some examples
Bass’d Ralph

Bass’d Mario

Sadly, the frequency buckets aren’t really good at specifically detecting bass sound, as the higher buckets tend to be sensitive, and the lower buckets not. You can isolate high noises to a bucket, but the low sounds tend to have some resonance in higher frequency buckets.

I’ve been playing with this, using tools like

https://www.szynalski.com/tone-generator/

and https://drumbit.app

I think you’re on the right track: look at lots of buckets for some energy signature, such as every bucket having some energy at once, which tends to indicate lower sounds like the bass beat.

Quick and dirty code I was using for testing:

Rough attempt to just display brightness relative to frequency use on 1d line of lights
export var frequencyData
export var energyAverage
export var maxFrequencyMagnitude
export var maxFrequency

export var gain = 1
export function sliderGain(v){
  gain = v * 4
}

export var buckets = array(32)

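// Subtract a gain-scaled noise floor (based on energyAverage) from each
// frequency bin, clamp at zero, and scale up for use as brightness.
function bucketUpdate(v,i,a){
  l = frequencyData[i] - ((0.001+energyAverage)*gain)
  if (l < 0) { l = 0 }
  return l * 1000
}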
export function beforeRender(delta) {
  buckets.mutate(bucketUpdate)
}

export function render(index) {
  h = index/pixelCount
  s = 1
  v = buckets[index%32]
  hsv(h, s, v)
}

I copied the frequency items to a new bucket array so I could play with adjusting values (like reducing them all by the energyAverage, and using a gain slider, which is what the example does) and do the math once per render loop.

With the tone generator, you usually can’t get low buckets, while your voice can, for example.
Some beats are around buckets 7-9, so those are likely the key buckets. Then again, bass tends to add values to many buckets, so your approach of looking at a mass of buckets at once might be better.

Obligatory:

ooh, I found an mp3
https://www.televisiontunes.com/uploads/audio/Close%20Encounters%20of%20the%20Third%20Kind%20-%20Wild%20Signals.mp3 (thanks to archive.org)

With my example, you can see how much of the low bucket activity is clump-y, while the high noises are distinct. Of the ‘5 tones’, when they drop an octave, I get no bucket at all.

Ideally, I want this particular set of sounds to end up making far more of a range of lights display than the way the low buckets light up all at once. I don’t think it’s possible with the sound board as is though.

Far as I can tell, the sound board seems to behave pretty well with both low and high frequencies.

As part of my in-progress beat detection project, I’ve written an averaging spectrum analyzer. It shows each bucket’s relative contribution to the total energy, averaged over an adjustable window of a minimum of 20 samples.

Here’s a shot from mid-Guns N’ Roses “Welcome to the Jungle”. The energy distribution curve is about what you’d expect – most of it is in the low to mid-bass, falling off as you go up, with occasional spikes for high guitar and cymbal hits and such.

When testing, try using a line input rather than the built-in mic. That wee tiny mic is going to be reasonably good at picking up things in the human vocal range (about 100-1500 Hz), but I’d bet its response curve falls off quickly around the edges. It’s easier to calibrate things with the line input too, because you can give the board a consistent signal.

You might be correct about the mic being a factor… But I still feel like the lower range (and using the tone generator, I seem to see it), tends to have lots of higher bucket energy thrown in, while higher tones don’t have lower buckets. This is consistent with what’s expected by both the method used in the FFT and sound waves in general. It’s not that you won’t see the bass, but it’s got lots of harmonics so if you reduce the gain to avoid those, you also lose much of the lower signal.

Happy to be wrong here, of course.

I just learned about the ColorChord .net software (as opposed to the esp8266 or STM code), and will be trying that out… in part because I’m still trying to understand the algo. And I continue to wish changing the FFT code in the sensor board was easier.

I think you’re right - to detect instruments in specific ranges, you’ll have to look at more than one band, and average over time too. Instantaneous sampling in the pattern – at a very low rate, at irregular intervals – is going to miss stuff, or land on weird outliers. And there’s really no getting away from low frequencies having more energy. It’s just physics. The clumping thing too. Going up an octave - doubling the frequency - at 4000 Hz moves you a lot farther than it does at 40 Hz.

Rather than fixed scaling, you could maybe keep a moving average for the bands you’re interested in and test vs. that.
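One way to read the moving-average suggestion, sketched in plain JavaScript rather than pattern code. `ALPHA`, `THRESHOLD`, and `makeBandTracker` are hypothetical names and values; an exponential moving average is used here as one possible kind of moving average.

```javascript
// Sketch: flag a band as "active" when it rises above its own recent average.
var ALPHA = 0.05;      // assumed smoothing factor; smaller means longer memory
var THRESHOLD = 1.5;   // assumed ratio over the average that counts as active

function makeBandTracker() {
  var average = -1;    // -1 means "no samples seen yet"
  // Feed one energy reading per frame; returns true when it stands out.
  return function update(energy) {
    if (average < 0) {           // seed the average with the first reading
      average = energy;
      return false;
    }
    var active = energy > average * THRESHOLD;
    average += (energy - average) * ALPHA;   // exponential moving average
    return active;
  };
}
```

Because the threshold follows the signal, the same code should behave similarly at different volumes, which fixed scaling cannot do. One tracker per band of interest would be needed.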

Here’s my spectrum display tool – nothing fancy, just handy. It makes a pretty interesting display w/the Close Encounters thing.

Simple Averaging Spectrum Display
// super simple energy averaging spectrum analyzer
// default window size of 20 is about 312ms on PB3
// max size (64) is about a second.
// 09/05/21 ZRanger1

export var frequencyData
export var energyAverage
export var maxFrequencyMagnitude
export var maxFrequency

var displayWidth = 16;  //  Set this for your display
var MAXWINDOWSIZE =  64
export var windowSize = 20;

// timers so we can figure out how long our windows are
var timer1 = 0;
export var windowTime = 0;

// per frequency bin storage
var bandHistory = array(32);
var bandAverage = array(displayWidth);
var avgIndex = 0;
export var volScale = 4;

export function sliderSensitivity(v) {
  volScale = 1 + (10 * v * v);
}

export function sliderWindowSize(v) {
  windowSize = 3+floor(v * (MAXWINDOWSIZE-3));
}

// matrix for energy history
for (var i = 0; i < 32; i++) {
  bandHistory[i] = array(MAXWINDOWSIZE);
}

function calcAvgBandEnergy() {
  arrayMutate(bandAverage,(v,i,r)=>{ return 0; })
  
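  for (i = 0; i < 32; i++) {
    bandHistory[i][avgIndex] = frequencyData[i];
    // sum only the active window, so stale samples left over from a
    // larger windowSize don't skew the average
    var sum = 0;
    for (var j = 0; j < windowSize; j++) { sum += bandHistory[i][j]; }
    bandAverage[(i/32) * displayWidth] += sum / windowSize;
  }
  avgIndex = (avgIndex + 1) % windowSize;
  // each display column accumulates about 32/displayWidth bins; average them
  arrayMutate(bandAverage,(v,i,r)=>{ return v / (32 / displayWidth); })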
}

export function beforeRender(delta) {
  timer1 += delta;
  
  // how much time does our window cover?
  // enable watch vars and check windowTime to find out.
  if (avgIndex == 0) {
    windowTime = timer1;
    timer1 = 0;
  }
  calcAvgBandEnergy();
}

export function render2D(index,x,y) {
  e = bandAverage[(1-x) * (displayWidth - 1)]  // follow displayWidth instead of a hard-coded 15
  if (((e * volScale) >= (1-y))) {
    hsv(x,1,1);
  }
  else {
    rgb(0,0,0);
  }
}

@jeff - new music sequencing pattern is killer AWESOME! So many tools for building a lights+audio display, and all in one pattern. Just… nice job, man!

@zranger, hmm… not working on v2, reduced window size to 20 to fit ram, but no reactivity, bug in code? (my code still worked)

And then I tried Jon’s new Music pattern (in the library), and the v2 has stopped responding… sigh. Time to debug.

Can’t seem to get v2 out of a weird state… it’ll go blue, then go orange, and back and forth… this is a v2 updated to most recent firmware.

@jeff / @wizard you might want to be sure that Music pattern won’t run on v2 right now. I think it bricked this. No leds, on usb, no SB, reset button isn’t helping. Sigh, now I have two units to figure out.

It eventually went to setup (after crashing a bunch?), and I set wifi and restarted it and it’s crashing again. I’ll have to delete that pattern I think, after it settles down again. (Reset the entire thing? It would be nice to not have to do that but…) Finally calmed down again and I got back to wifi setup, hit reset, and it’s rebooting madly again. Resetting wifi might have been a mistake; now it’s crashing and never seems to get far enough along to start up wifi again. Sigh.

@Sunandmooncouture

Whoa - I cannot believe you just posted this today!

I just finished a really helpful pattern for you, Music Sequencer. It tackles a lot of the problems mentioned in this thread.

SB limits

I have just enough background to be dangerous. I took a few audio engineering courses in undergrad. I also wrote the Sensor Board reference page. Here are our limiting factors with the current sensor board:

  1. Updates come in to PB at 40Hz - higher FPS patterns will reuse the last values received.
  2. The bin spacing is processing limited in the low frequencies. This means a sine wave is only distinguishable from the nearest semitone at C5 and above (the C above middle C, about 520 Hz). A lot of instrument detection relies on detecting harmonics of a fundamental, which is not always possible with these bins.
  3. As Scruffynerf noticed, this means maxFrequency is sometimes a harmonic, and like most music, bass lines usually have much more energy density than even a synth lead.
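To put numbers on the semitone point, here is a short plain-JavaScript calculation using standard equal temperament (A4 = 440 Hz). It says nothing about the sensor board's actual bin edges; it only shows how quickly the gap between neighboring semitones shrinks as pitch drops.

```javascript
// Frequency of the note `n` semitones above A4 (negative n goes down).
function noteFrequency(n) {
  return 440 * Math.pow(2, n / 12);
}

// Distance in Hz from a note to the next semitone up.
function semitoneGap(freq) {
  return freq * (Math.pow(2, 1 / 12) - 1);
}

var c5 = noteFrequency(3);     // C5, roughly 523 Hz
var c2 = noteFrequency(-33);   // C2, three octaves lower, roughly 65 Hz
console.log(semitoneGap(c5).toFixed(1) + " Hz between semitones at C5");
console.log(semitoneGap(c2).toFixed(1) + " Hz between semitones at C2");
```

The gap is roughly 31 Hz at C5 and roughly 4 Hz at C2, so telling bass notes apart three octaves down would need about eight times finer frequency resolution.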

Beat and tempo detection steps

After trying a bunch of approaches (and I can’t wait to see what you’ve got going, @zranger1), I settled on:

  1. Sum 3 of the low frequency bins
  2. Keep track of the highest sum seen, and reduce that maximum if bass is present but it’s been a long time since seeing a maximum
  3. Maintain a fast and slow exponential moving average of the bass sum
  4. Store the first derivatives of the fast moving average in a circular buffer. If the average of this derivatives buffer is above a threshold, bass has been net rising for several samples and the code decides that this is a beat event.
  5. Debounce the events to be no more frequent than 16th notes
  6. Store the intervals between the last 8 beat events. If the standard deviation of these sampled intervals is low, compute the estimated BPM and mark the confidence.
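The six steps above could be sketched roughly as follows. This is plain JavaScript for experimenting off-device, not the Music Sequencer code: every constant, the choice of bins, the `fast > slow` gate, and the 10% confidence rule are assumptions made for illustration.

```javascript
// Sketch of the six listed steps. All constants are assumed tuning values.
var BASS_BINS = [1, 2, 3];    // step 1: which low bins to sum
var FAST = 0.5, SLOW = 0.05;  // step 3: smoothing factors for the two averages
var DERIV_LEN = 3;            // step 4: length of the derivative buffer
var RISE_THRESHOLD = 0.03;    // step 4: mean rise per frame that signals a beat
var MIN_GAP_MS = 125;         // step 5: a 16th note at an assumed 120 BPM
var STALE_MS = 2000;          // step 2: age at which the maximum starts decaying
var MAX_DECAY = 0.999;        // step 2: per-frame decay of a stale maximum

function makeBeatDetector() {
  var maxSum = 0.0001, sinceMax = 0, fast = 0, slow = 0;
  var derivs = [], intervals = [], sinceBeat = MIN_GAP_MS, seenBeat = false;
  var state = { beat: false, bpm: 0, confident: false };

  return function update(frequencyData, deltaMs) {
    var i, sum = 0, meanRise = 0, mean = 0, variance = 0;
    // Step 1: sum a few low-frequency bins.
    for (i = 0; i < BASS_BINS.length; i++) sum += frequencyData[BASS_BINS[i]];

    // Step 2: track the highest sum; let it decay once it is stale.
    sinceMax += deltaMs;
    if (sum > maxSum) { maxSum = sum; sinceMax = 0; }
    else if (sum > 0 && sinceMax > STALE_MS) maxSum *= MAX_DECAY;
    var level = sum / maxSum;

    // Step 3: fast and slow exponential moving averages of the bass level.
    var previousFast = fast;
    fast += (level - fast) * FAST;
    slow += (level - slow) * SLOW;

    // Step 4: buffer the fast average's first derivative and average it.
    derivs.push(fast - previousFast);
    if (derivs.length > DERIV_LEN) derivs.shift();
    for (i = 0; i < derivs.length; i++) meanRise += derivs[i] / derivs.length;

    // Step 5: debounce so beats cannot fire faster than MIN_GAP_MS.
    sinceBeat += deltaMs;
    state.beat = meanRise > RISE_THRESHOLD && fast > slow && sinceBeat >= MIN_GAP_MS;
    if (!state.beat) return state;

    // Step 6: keep the last 8 intervals; a low spread means a usable tempo.
    if (seenBeat) intervals.push(sinceBeat);
    if (intervals.length > 8) intervals.shift();
    seenBeat = true;
    sinceBeat = 0;
    if (intervals.length >= 4) {
      for (i = 0; i < intervals.length; i++) mean += intervals[i] / intervals.length;
      for (i = 0; i < intervals.length; i++) {
        variance += Math.pow(intervals[i] - mean, 2) / intervals.length;
      }
      state.bpm = 60000 / mean;
      state.confident = Math.sqrt(variance) < 0.1 * mean;
    }
    return state;
  };
}
```

Calling the returned function once per sensor update with the 32-bin array and the elapsed milliseconds yields a `beat` flag plus a BPM estimate once enough evenly spaced beats have been seen.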

The other “instrument” detectors (claps and hi-hat) work similarly, but use different frequency bins, and simply compare their current sample to their exponential moving average. If it’s high enough above recent signal, debounce that as the detection event.

Known opportunities for improvement

Beat detection: The technique is somewhat successful at identifying bass drums among the competing basslines that exist in the same frequency bins; the most common situation where a beat is missed is when the bass line is much louder than the beat itself. In this scenario, detection would be better if we could also detect a declining fundamental over time.

Bass drums (both real and synthetic) almost always have a “gulping” quality to them which represents the fundamental frequency tracking a downward glissando. Bass lines, by contrast, commonly stay on semitones. Even with just 40Hz updates, it should be possible to better discriminate a beat from bass lines by tracking a declining center of mass on the sonogram.
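A rough sketch of what tracking a declining center of mass might look like, in plain JavaScript. The number of low bins, the silence marker, and both function names are assumptions. Requiring a strict fall on every frame is only one possible test, and probably too strict for noisy input.

```javascript
// Sketch: energy-weighted mean bin index over the low bins, tracked over time.
var LOW_BINS = 8;   // assumed number of bins treated as the bass region

// Returns the "center of mass" bin index, or -1 when the low bins are silent.
function lowCentroid(frequencyData) {
  var weighted = 0, total = 0;
  for (var i = 0; i < LOW_BINS; i++) {
    weighted += i * frequencyData[i];
    total += frequencyData[i];
  }
  return total > 0 ? weighted / total : -1;
}

// True when the centroid dropped on every step of the recent history,
// which would hint at a drum-like downward glide instead of a held note.
function isFalling(centroids) {
  if (centroids.length < 2) return false;
  for (var i = 1; i < centroids.length; i++) {
    if (centroids[i] < 0 || centroids[i - 1] < 0) return false;
    if (centroids[i] >= centroids[i - 1]) return false;
  }
  return true;
}
```

In use, the last few centroid values would be kept in a short buffer and checked whenever the bass detector fires.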

Tempo inference: I think the key here, according to some papers I read, is to convolve the time domain record of instrument detection events with comb filters of different candidate tempos. This kind of makes sense: It’s computing, “Which ‘1e&a 2e&a’ tempo best coincides with my percussion?” The highest convolution sum indicates the most likely tempo. This is a lot more processing and should probably only be attempted outside of PBScript, in C - maybe on a next iteration of the sensor board. Or, one of you geniuses will figure out a more clever hack :wink: One idea bouncing around in my head is storing the most common non-4:4 beat intervals as a set of detection signatures, an approach not unlike how old matrix metering apparently worked on film cameras.
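The comb-filter idea can be illustrated with a small brute-force version in plain JavaScript. It assumes onsets have already been detected and stored one value per 40 Hz frame. The candidate range, the 1 BPM step, and normalizing by the number of comb teeth are choices made for this sketch, not something taken from the papers mentioned.

```javascript
// Sketch: score candidate tempos by laying a comb over the onset record.
var FRAME_RATE = 40;   // sensor board updates per second, as noted in the thread

// Best average onset strength under comb teeth spaced `periodFrames` apart,
// trying every whole-frame phase offset within one period.
function combScore(onsets, periodFrames) {
  var best = 0;
  for (var phase = 0; phase < Math.floor(periodFrames); phase++) {
    var sum = 0, teeth = 0;
    for (var k = 0; phase + k * periodFrames < onsets.length; k++) {
      sum += onsets[Math.floor(phase + k * periodFrames)];
      teeth++;
    }
    if (teeth > 0 && sum / teeth > best) best = sum / teeth;
  }
  return best;
}

// Returns the candidate BPM whose comb lines up best with the onsets.
function estimateTempo(onsets, minBpm, maxBpm) {
  var bestBpm = 0, bestScore = 0;
  for (var bpm = minBpm; bpm <= maxBpm; bpm++) {
    var score = combScore(onsets, FRAME_RATE * 60 / bpm);
    if (score > bestScore) { bestScore = score; bestBpm = bpm; }
  }
  return bestBpm;
}
```

Even this toy version loops over every tempo, phase, and tooth, which gives a feel for why the post suggests doing it outside of PBScript. Half and double tempos score similarly under a comb, so the candidate range has to be kept narrow or the tie broken some other way.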

@scruffynerf, on the spectrum analyzer pattern, it should be ok on v2. I’m running it on one of mine now. Just make sure that windowSize isn’t greater than MAXWINDOWSIZE, as happens if you (as I did) reduce MAXWINDOWSIZE to get it down to an acceptable amount of memory for PB2, and then leave windowSize’s initializer (or the control) set to a higher value.
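One defensive option is to clamp the window size wherever it gets set, so it can never run past the history arrays. `clampWindowSize` is a hypothetical helper, shown in plain JavaScript; in the pattern it would presumably be applied to `windowSize` after the initializer and inside the slider handler.

```javascript
// Sketch: keep a requested window size within 1..maxSize (whole samples).
function clampWindowSize(requested, maxSize) {
  return Math.max(1, Math.min(Math.floor(requested), maxSize));
}
```

For example, `clampWindowSize(20, 16)` returns 16, which covers the case where MAXWINDOWSIZE was lowered to save memory but the initializer still says 20.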

@jeff, getting beat detection working in a pattern is fantastic! Tempo detection across a wide range of genres is an annoyingly subtle problem. Especially given how easy it seems to be for humans.

Mine is going into special sensor board firmware. I’ll stuff the data where it hopefully won’t interfere with most other applications – the last couple of analogInputs[] entries for the moment.

It works pretty much as you describe: tracking the timing of peaks in several frequency bands, then comb filtering by convolution to find the most reasonable tempo candidate.

Early attempts got really confused by complicated rock drumming. The current version is better at rock, but isn’t doing well at less loud and thumpy genres yet. Finding “1” in a measure would also be nice, but is proving to be not so easy. Always something…

What about those of us who like our beats on 2 and 4?

Ha! That’ll be a real challenge! (And it’s amazing how Steve Martin still looks mostly the same as he did back then.)

This is an EXTREMELY helpful thread for the project I’m doing now – Does Firestorm requires individual sensor (sound) boards for lamp project? - #4 by ZacharyRD – Anyone in here happen to have a “good cheap” line-in microphone they’ve settled on for tasks like this? There’s roughly a million of them on Amazon, but finding one that’s going to play nice with the Sensor Board seems like it’s going to be a “buy a bunch of stuff trial and error” and I feel like there has to be a SKU that works for <$10.

Well, I just want to point out that any external mic connected to a preamp to raise it to line-in levels will still have most of the same challenges as the built-in mic: 1) It’ll pick up reverb and reflections 2) It’ll be very sensitive to proximity to the sound source in terms of measuring volume 3) Different frequencies are even more subject to different proximity effects (this is why any time you get the same vocal cords really close to a mic, the vocal gets more bassy, or why mic popping and sibilance depend on proximity).

Any chance you can get a real audio signal to the line-in jack?

The main tricks that external microphones like the Snowball use to increase perceived quality are mass, vibration isolation, careful design against resonances, larger ECM capsules, choice of a cardioid capsule, and internal compression. Many of these will still be missing in the line-level mics you might be shopping for.

If you still want to try, I would look into Behringer stuff. Performance for the price is incredible.