Best way to avoid repeatedly computing the pixels' spherical coordinates?

GeekMomProjects · September 12, 2022, 7:19am

Apologies if this has been discussed before. I searched for “polar” and “spherical” but didn’t see a definitive answer to my question.

I’d like to create a playlist of patterns for pixels that sit on the surface of a sphere (project described in “Show and Tell” here). Some patterns use Cartesian coordinates via render2D(index, x, y) and others (particularly ones which use the sensor board’s accelerometer) will use each pixel’s spherical coordinates.

The Cartesian pixel map works fine. For the patterns which use spherical coordinates, I’m trying to avoid recomputing them in every single call to render2D because I assume the inverse trig calls are computationally expensive.

Is there a best practice for storing the spherical coordinates somewhere in the pattern code or elsewhere?

I’ve seen the trick of simply storing the spherical coordinates in the mapper instead of the Cartesian coordinates, but I’ll need to use both Cartesian AND spherical coordinates depending on the pattern.

My plan is to create global arrays for normalized spherical coordinates in the pattern code and initialize them with negative values. Then, in each call to render2D(index, x, y), check to see if the values have already been computed for the specific pixel index. If they don’t exist, compute and save them in the global array before returning from render2D.

This seems workable, but I was wondering if there might be a better way within the Pixelblaze system?

Thanks,
Debra

sorceror · September 12, 2022, 4:05pm

That would work, but it would probably be more efficient to initialize the whole array at startup. Also it would avoid choppy framerates at startup.

In a 2D project, I needed x, y, and θ, so I just precalculated them in the pixelmap (that is, θ was stored in z). If you need x,y,z and r,θ,Φ then … yeah, put x,y,z in the pixelmap and calculate the other 3 at startup.

I think it would be great if we could store an arbitrary-length vector in the pixelmap

jeff · September 12, 2022, 5:15pm

Hi Debra!

What you describe is very close to how I would do it. In addition, I might swap out the cartesian-to-polar renderer with a polar-lookup renderer after all polar coordinates are computed and cached. This approach made it about 50% faster for me vs computing hypot() and atan2() on each render, and 7% faster than always checking for an “uncached” flag value (like the negative number)*.

Here’s an example. Note I didn’t normalize r and phi for this demo, but you can.

// Polar cacher - compute and cache polar coords from a cartesian map

var rs = array(pixelCount)    // Polar radius from center, positive values
var phis = array(pixelCount)  // Polar angle phi from center in [-PI..PI]
var cachedCoordCount = 0      // How many pixels have had their polar coordinates calculated and cached

export function beforeRender(delta) {
  resetTransform()
  translate(-.5, -.5)         // Center at (0,0)
  t1 = time(.1) 
  
  // After all polar coords are cached, skip computing them
  if (cachedCoordCount >= pixelCount) render2D = render2DFromPolarCache
}

export function cacheAndRender2D(index, x, y) {
  var r = rs[index] = hypot(x, y)
  var phi = phis[index] = atan2(y, x) // Reminder that output range of atan2 is in [-PI..PI]
  cachedCoordCount++
  renderPolar(index, r, phi)
}

// Use cacheAndRender2D() as render2D() until the cache is populated (should take one frame)
export function render2D(index, x, y) {}
render2D = cacheAndRender2D 


export function render2DFromPolarCache(index, unusedX, unusedY) {
  renderPolar(index, rs[index], phis[index])
}


export function renderPolar(index, r, phi) {
  h = r + t1 + phi / PI2
  hsv(h, 1, 1)
}

Being able to redefine the function pointer for `beforeRender()`, `render()`, `render2D()`, etc is such a useful trick.

@sorceror I agree, except I’ve had a few cases where pixelCount was large and PixelBlaze ran out of time / resources (“Execution steps exhausted”) to precompute the entire cache before getting to the first run of beforeRender(). That’s why I sometimes do progressive caching like this.

* Benchmark conditions: simple 2D renderer, 600 pixels, preview data disabled

wizard · September 12, 2022, 7:20pm

See also this thread

Generally, the trig math is extremely fast, its all optimized for speed, and the rest of your pattern code will down out the cost of the trig calls.

There’s also overhead in doing the cache, unless you do a trick with swapping out render functions like @jeff mentions, you can easily erase any gains with extra if() checks, thats how fast they are.

You might want to use the “Performance test framework” pattern if you want to experiment with micro-optimizations. (Currently on page 5):
Performance test framework.epe (3.8 KB)

Using this I measure that calling atan2 and hypot 384 times will add about 0.57ms per animation frame more than looking up a value from a cache array (assuming perfectly optimized with no checks). Given all the added complexity of a cache, I would personally opt for the much shorter simpler code.

GeekMomProjects · September 12, 2022, 8:03pm

Thanks for the feedback. I agree it makes more sense to initialize the array at startup. I just don’t know how to access the pixel map from inside the pattern mapper aside from the coordinates that get passed into render2D. What’s the mechanism for accessing the pixel map from inside the pattern editor?

GeekMomProjects · September 12, 2022, 8:11pm

Got it. Thank you so much, Jeff, that is extremely helpful and redefining the function pointer is a very clever way to eliminate the initialization check for each pixel!

I also hadn’t thought to put translate(-.5, -.5) inside beforeRender, but it makes so much sense. This is great - I really appreciate it!

GeekMomProjects · September 12, 2022, 8:21pm

Got it. Thank you so much! The explanation of what changes are actually useful for optimizing speed and which changes aren’t is very helpful. I had wondered how costly the array lookups are, but it seems like general coding canon that inverse trig functions are computationally expensive and to be avoided whenever possible. It’s good to have a bit of a peek behind the curtain sometimes!

I haven’t (yet!) written any patterns fancy enough that I’ll need to worry about optimizing at that level, but I do want to be prepared for the possibility. I’m sure I’m going to want to make an even bigger sphere (or something like it) in the future, in which case the optimizations may well be necessary.

I will definitely be sure to check out the test framework. I’d like to understand how my code changes affect speed on a more granular level, and that seems like the best to go. Thanks for that reference.

Thank you so much! Gonna go play with some patterns now. (Woo hoo!!!)

Best,
Debra

GeekMomProjects · September 12, 2022, 8:27pm

One quick question about your code. Inside the cacheAndRender2D function, you’re just counting the number of times the function is called. I assume, then, that it is guaranteed that each of the first pixelCount calls has a different value for index? It would make sense, but I wasn’t sure that I could assume that.

Thanks!
Debra

sorceror · September 12, 2022, 8:31pm

The only mechanism is that the values are passed to render functions like Render3D(index,x,y,z).

GeekMomProjects · September 12, 2022, 10:20pm

Oh, so you’re suggesting that I copy the pre-computed coordinates into an array in the pattern editor then. I thought you were saying there was a way to compute them from the stored Pixel Map before any of the calls to the render functions. Got it. Thanks!

jeff · September 12, 2022, 10:35pm

Yeah, it happens to work out OK in the current Pixelblaze architecture, though I think you’re right that it’s not guaranteed in most shader frameworks; they reserve the right to call render with some pixel indices more frequently than others, and not necessarily in any order.

So it seems also reasonable to pre-populate the uncached values with a marker value, then only cachedCoordCount++ when a marker is replaced.

GeekMomProjects · September 13, 2022, 1:30am

Hi All,

Just to follow up, I followed @jeff 's example and made a test using the code below, which can either compute a pixel’s spherical values in every call to render3D -or- use cached spherical values it computes once per pixel. Switching between the two modes requires commenting or uncommenting the line render3D = cacheAndRender3D

The pattern itself is just a simple moving “bullseye” with hues determined by the distance from a point which oscillates in two dimensions.

For 320 pixels (I’ve only got 5 of the 6 8x8 matrices in my build assembled ATM) the uncached version yields about 55 FPS and the cached version gives about 72 FPS. Which is significant, but I had expected a bigger difference based on my impressions of the computational expense of inverse trig functions. Either I was quite wrong about that or you’ve done some very impressive optimization, @wizard!

Anyhow, thanks again all for the excellent advice, and I learned some very useful concepts from the responses here.

-Debra

// Try caching spherical coordinates. All points on a sphere so can ignore rho
function initVals(v,i,a) {a[i] = -1;}
var phiVals = array(pixelCount)     // Normalized phi values
phiVals.forEach(initVals)           // Initialize phi to -1
var thetaVals = array(pixelCount)   // Normalized theta values
export var cachedCount = 0          // Number of cached pixel values

export function beforeRender(delta) {
  t1 = time(.06)
  t2 = time(.023)
  wx = wave(t1)
  wy = wave(t2)
  
  // Have cached all values. Can pull them from arrays instead of doing computation
  if (cachedCount >= pixelCount) render3D = render3DfromCache
}

export function cacheAndRender3D(index, x, y, z) {
  var phi = phiVals[index]
  var theta = thetaVals[index]
  if (phi < 0) {
    x -= .5
    y -= .5
    z -= .5
    theta = thetaVals[index] = 1 + atan2(y,x)/PI2
    phi = phiVals[index] = acos(z/hypot3(x,y,z))/PI
    cachedCount++
  }
  render2D(index, theta, phi)
}

export function render3DfromCache(index, x, y, z) {
  render2D(index, thetaVals[index], phiVals[index])
}

// This function computes spherical values with each call
export function render3D(index, x, y, z) {
  x -= 0.5
  y -= 0.5
  z -= 0.5
  render2D(index, 1 + atan2(y,x)/PI2, acos(z/hypot3(x,y,z))/PI)
}

// Comment out the next line to make the code compute pixel spherical values in 
// each call to render3D. Uncomment it to make the code use cached values instead
render3D = cacheAndRender3D

// Select hue by distance from a point oscillating in x and y
export function render2D(index, x, y) {
  hsv(2*hypot(x-wx,y-wy)+t1,1,1)
}