Lessons learned updating code that uses Web Audio

As I mentioned in my previous post, I have been working on getting my audio and Web Audio based experiments working again, after years (in some cases, more than a decade) of not touching the code. Note the distinction: some of my experiments do not use the Web Audio API at all, yet they still play audio, via the HTML audio element.

The fact that I mostly had to touch only a few lines of code in each case was quite remarkable, but there were still some bits I wish I had known before I started. So, here they are:

1. You can't start an audio context, or play an audio element, without user interaction any more

The action will fail if you do something such as:

window.addEventListener("load", function () {
    let ac = new AudioContext();
});

For example, Firefox outputs this message:

An AudioContext was prevented from starting automatically. It must be created or resumed after a user gesture on the page.

So what I had to do was change the onload handlers to listen for a click or tap, and only then start whatever would cause audio to be generated. Something like:

window.addEventListener("load", function () {
    let startButton = document.getElementById("startButton");
    startButton.addEventListener("click", onStart);
});

function onStart() {
    let ac = new AudioContext();
}
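
Alternatively, if the context has already been created before any interaction, browsers will leave it in the "suspended" state, and you can resume it from the gesture handler instead of creating it there. A minimal sketch of that variant:

let ac = new AudioContext(); // created early, so it starts out "suspended"

function onStart() {
    // resume() returns a promise that resolves once audio can flow
    if (ac.state === "suspended") {
        ac.resume();
    }
}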

As a user, this makes a lot of sense: you don't want to suddenly be surprised with audio playing when you open a link.

2. You can probably let go of AudioContext monkey patching libraries now

When I started writing code that used Web Audio, the API was still in flux and support was quite patchy: Chrome exposed the constructor as webkitAudioContext, then dropped the prefix, but Safari kept it for quite a long time; meanwhile, Firefox did not implement the API at all for a couple of years, until it eventually shipped it unprefixed.

Thus, I often recommended pulling in a brilliant library called AudioContextMonkeyPatch, started by Chris Wilson, which made your code a lot more predictable and consistent. It not only gave you a way to refer to the AudioContext uniformly, it also aliased some node names, as those kept changing at the time along with the spec.

But as of today, all the modern browsers have finally converged on using unprefixed constructors and the same node names. So if you are writing new Web Audio based code and do not expect to run it in old (unsupported) browsers, you shouldn't need this anymore. Yay for one less dependency!

In any case, it is always good practice to check whether an API is present before using it. So something like this would be advisable:

if (window.AudioContext) {
    // ... do audio magic
} else {
    // ... let the user know in a nice way
}
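
And if you do still need to reach an old prefixed Safari, a single alias line covers the constructor (the webkit prefix being the one that actually shipped, as far as I know), replacing the whole monkey patching library:

let AudioContextClass = window.AudioContext || window.webkitAudioContext;
let ac = new AudioContextClass();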

3. Some methods have been removed and replaced with attributes

Many of my experiments used positioning in one way or another, as I built a virtual 3D world to explore or played with panning and colours to hypnotise you.

In the past you could set the position of the listener with the setPosition method, which you used like this:

// ac is the audio context
ac.listener.setPosition(x, y, z);

But that method has been deprecated and replaced with the corresponding positionX, positionY and positionZ attributes on the listener, which are also AudioParams. That means you don't set the value directly, but through the value attribute of each attribute (yes, that also confused me initially!)

It's easier to see with an example. The code above becomes (where listener is ac.listener):

listener.positionX.value = x;
listener.positionY.value = y;
listener.positionZ.value = z;

That doesn't look like a big win in itself, but it becomes more interesting when you take advantage of the fact that they are AudioParams, and so you can use methods such as these to schedule a change of value at a given time:

// when is a time based value, relative to the audio context creation time
listener.positionX.setValueAtTime(x, when);
listener.positionY.setValueAtTime(y, when);
listener.positionZ.setValueAtTime(z, when);

You can schedule the values you'd like each attribute to take, and when, then just let the audio engine work out what needs to happen at each moment. You can even program interpolations using exponentialRampToValueAtTime, linearRampToValueAtTime, etc. It's like telling your studio engineer at the beginning of the day which faders need moving when, and then forgetting about it because the engineer takes care of it all for you.
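
For example, here's a minimal sketch (assuming ac is the audio context and listener is ac.listener) that glides the listener from left to right over five seconds:

let now = ac.currentTime;
// pin the starting value, then interpolate linearly towards the target
listener.positionX.setValueAtTime(-10, now);
listener.positionX.linearRampToValueAtTime(10, now + 5);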

In addition to setPosition, setOrientation has also been deprecated and replaced, with six (!) attributes representing the x, y, z components of the forward and up vectors. You need to take a similar approach to the one above, setting values through the AudioParams.

But at the time of writing, Firefox still has not fully implemented this spec change, which means that we need to detect what the browser supports before accessing things; otherwise our code will either break or produce no audible effect whatsoever. A solution is to use something like this (taken from the code of 3400 miles below, where now is the context's currentTime):

// The listener is in the ship, so that's the position we place them in
// The listener is looking at the shipTarget point
// The up vector is calculated based on the target and other values
if (listener.positionX) {
    listener.positionX.setValueAtTime(ship.position.x, now);
    listener.positionY.setValueAtTime(ship.position.y, now);
    listener.positionZ.setValueAtTime(ship.position.z, now);
} else {
    listener.setPosition(ship.position.x, ship.position.y, ship.position.z);
}

if (listener.forwardX) {
    listener.forwardX.setValueAtTime(shipTarget.x, now);
    listener.forwardY.setValueAtTime(shipTarget.y, now);
    listener.forwardZ.setValueAtTime(shipTarget.z, now);
    listener.upX.setValueAtTime(up.x, now);
    listener.upY.setValueAtTime(up.y, now);
    listener.upZ.setValueAtTime(up.z, now);
} else {
    listener.setOrientation(
        shipTarget.x,
        shipTarget.y,
        shipTarget.z,
        up.x,
        up.y,
        up.z
    );
}

This covers both bases and lets you keep spatialising your sounds.

4. Other methods have been removed entirely: setVelocity

Speaking of spatialising, there was a nice feature in early Web Audio implementations where you could specify the velocity of listeners and panner nodes, and a sort of Doppler effect would be applied. You could hear a bit of that wooossh as your listener (or the panner nodes) moved in space: not only did volumes diminish as things got further away, but the frequencies also shifted, altering the sound with the movement, much like what you hear when a vehicle with loud music passes you by fast enough.

This has been removed, and I think this is the GitHub thread tracking the discussion, but the summary is:

  • don't use setVelocity on either AudioListener or PannerNode
  • don't use listener.dopplerFactor or listener.speedOfSound (mostly because they do nothing now)

I'm a bit sad we don't get this built in anymore, but I suppose it could be brought back with some sort of custom spatialiser code in an AudioWorklet.
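
In the meantime, a much cruder approximation (nothing like a proper AudioWorklet spatialiser, and the function here is made up for illustration) is to nudge the playbackRate of a buffer source according to the classic Doppler formula:

// f' = f * c / (c - v), with v positive when the source approaches
function applyDoppler(source, relativeSpeed, speedOfSound = 343) {
    source.playbackRate.value = speedOfSound / (speedOfSound - relativeSpeed);
}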

5. ScriptProcessorNodes should be replaced with AudioWorklets

And speaking of AudioWorklets, they are the recommended way of generating custom audio, rather than the old ScriptProcessorNode.

Using ScriptProcessorNode in new code is not recommended, but the feature has not been entirely removed yet. If you can, it would be a good idea to start thinking about how to migrate your code to the new system.

In my case, I still have an experiment using ScriptProcessorNode, as MACCHINA I generates its audio with a custom synthesiser that feeds into an instance of ScriptProcessorNode. I need to move it to AudioWorklets at some point, but I haven't looked into that yet.

Since AudioWorklets run in a separate scope (much like Web Workers), I need to figure out how to pass messages to the node, and the synth has a lot of messages to pass (to configure each parameter, which note to play, etc.), so the message passing isn't necessarily trivial!
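
For a flavour of what that migration involves, here is a minimal sketch of the message passing, using a hypothetical noise generator rather than MACCHINA I's synth. First, the processor, which lives in its own file and runs in the worklet's scope:

// noise-processor.js
class NoiseProcessor extends AudioWorkletProcessor {
    constructor() {
        super();
        this.gain = 0.1;
        // messages from the main thread arrive through the port
        this.port.onmessage = (event) => {
            if (event.data.type === "gain") {
                this.gain = event.data.value;
            }
        };
    }

    process(inputs, outputs) {
        let output = outputs[0];
        for (let channel of output) {
            for (let i = 0; i < channel.length; i++) {
                channel[i] = (Math.random() * 2 - 1) * this.gain;
            }
        }
        return true; // keep the processor alive
    }
}

registerProcessor("noise-processor", NoiseProcessor);

Then, on the main thread, you load the module, create the node and post messages to it:

// main.js (after a user gesture, of course!)
let ac = new AudioContext();
ac.audioWorklet.addModule("noise-processor.js").then(() => {
    let node = new AudioWorkletNode(ac, "noise-processor");
    node.connect(ac.destination);
    // configure the "synth" by posting messages to the worklet
    node.port.postMessage({ type: "gain", value: 0.25 });
});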

6. Firefox supports MP3 now, but Safari still does not support OGG

At the time I built my experiments, Firefox did not support MP3, and Safari did not support OGG files.

I did not use Safari for anything then, and since I was high on ideological purity at the time, I only worried about providing OGG files. This meant that when I got to test some of these experiments all these years later, none of them worked on my phone or tablet, which are Apple devices.

This annoyed me because I knew they had a fully capable Web Audio engine, and all that was missing was an MP3 version of the reverb response! So I placed my ideological purity on hold and generated MP3 versions of the OGG files; now I either simply load the MP3 files everywhere (since Firefox supports that format too) or feature detect and load a specific set of files:

// canPlayType returns "", "maybe" or "probably", and "" is falsy
if (document.createElement("audio").canPlayType("audio/ogg")) {
    soundURLs = soundURLsOGG;
} else {
    soundURLs = soundURLsMP3;
}
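
From there, a minimal loader sketch (assuming soundURLs holds the chosen file names) could fetch and decode each file into an AudioBuffer; in modern browsers decodeAudioData returns a promise:

async function loadSound(ac, url) {
    let response = await fetch(url);
    let encoded = await response.arrayBuffer();
    // decodeAudioData sniffs the format, so the same code handles OGG and MP3
    return ac.decodeAudioData(encoded);
}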

This still irritates me, because of reasons, but it's probably a matter for another post.

7. Making things work in iOS

If you thought Safari on desktop was weird, wait until you hear about Safari on iOS.

Apart from the lack of OGG support I already mentioned, I realised that even after I changed my code to start playing audio only after a user gesture, it sometimes would not emit any sound at all.

I was at a total loss: I attached the JS console and could see no error. What was going on? After some digging I figured out that if you have the "silent" switch ON on your phone, no Web Audio based code will emit any sound at all, but audio coming from an audio element will still play, as long as the phone volume is non-zero. The volume being zero is not the same as the phone being in "silent mode"... ok 🤯

Guess what? I always have my phone on silent; I don't need it vibrating because I already have a system that delivers notifications (read: distractions) to my wrist. I wasn't even sure what the silent switch was for, as I so rarely use it!

On the other hand, I am positively impressed by both the experience of remote debugging Safari and the performance of my experiments on my battered, four-year-old phone. Both Web Audio and WebGL were as smooth as it gets.

8. Replacing pseudo-CDNs: https and CORS

Some parts of the web are considerably safer than they were 10+ years ago. Thanks to initiatives like Let's Encrypt, anyone can run a website delivered with encryption through https, which is great!

Once it was easy to serve websites that way, browser makers felt empowered to restrict certain features to secure contexts only (roughly: pages served via https, or from a 'safe' environment like localhost), for greater user protection.

This is great and cumbersome at the same time!

While you are way more protected as a user, it can be a bit trickier to develop things, especially if you are trying to migrate code you wrote in 2010 that assumed much looser restrictions:

My earlier experiments quite happily mixed content of different levels of security: some websites had migrated to https, but I had not updated the links, which were sometimes absolute rather than relative, so even if the experiment was served from an https context, other code came from an insecure (http) URL. This can break things easily; for example, the browser might refuse to give you access to the webcam if the source code was not served securely (to avoid anyone snooping on your images).

The main reason for this mixing is that back in 2010 we thought it was a really good idea to use "pseudo-CDNs" to host our JS. So for example, I had one experiment that loaded the main script from my server, while the UI library was served from some Google CDN server and some samples were hosted on another (insecure) server. I'm glad that node/npm helped us think in terms of modules, and that ES modules and tooling such as esbuild exist nowadays, because that was a hot mess!

So, my advice if you find yourself like me trying to make old code work again (see note below) would be:

  • start by making everything local and don't rely on someone else's CDN (you don't know when/if it will go down, or which content it is actually serving you)
    • this might require painfully digging out old library versions by looking at old library websites, but eventually you'll get there
  • use the network panel in your Developer Tools and see which errors you get (I find the errors more readable and actionable in Firefox's DevTools, especially when it comes to CORS errors)
  • use server based redirections (e.g. with Apache's .htaccess or nginx config, as in the sketch after the note below) when you move content to https servers, to make sure that
    • everything is served under https,
    • and also, that if someone reaches an http:// url they are still redirected to the right place
  • if you're worried about file sizes, use a minifier at the end: modern build tools can do tree shaking, so you don't need to "optimise" by delegating your code delivery to a third party

NOTE: this assumes that your code is so ancient it was not even using Bower or any other early package manager with some sort of dependency tracking
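
As an example of that redirection, here is the classic Apache .htaccess recipe that sends any http:// request to its https:// equivalent (adapt as needed if you run nginx or something else):

RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]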

9. A couple of final reflections

Things are much more mobile-first these days

I realised that I did not want to show the experiments to people on desktop. I instinctively wanted to show them on my phone, or send people a link (and they would often try to see the thing on their phone, because that's where they read my message).

This poses a few interesting challenges, as many of these experiments were not designed to run on a phone. I had to add some code for dealing with touch, rather than mouse, events, for example (a minimal sketch follows below). It wasn't so much that the phones were underpowered, which was the assumption we frequently made back in the day; it was more about how you design an interface that either works in both cases or adapts enough to stay functional.
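
For the input part at least, pointer events now cover mouse, touch and pen with a single listener; a minimal sketch, where canvas and startInteraction are placeholders for your own element and handler:

// fires for mouse clicks, touches and pen taps alike
canvas.addEventListener("pointerdown", (event) => {
    startInteraction(event.clientX, event.clientY);
});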

I don't have all the answers to that... yet (?).

Modern JavaScript is great!!!

As I looked at the constructs we had to use to gain some semblance of safety, like wrapping our main.js in an anonymous function that executed immediately on load, or the prototype-style classes, or the files that contained everything, because this was written before even require.js became a thing... as I looked at all those things, I experienced such pain...!

I really had to restrain myself to keep the changes to the bare minimum so that the experiments would work again.

On the other hand, it is so good that we can build stuff in a more repeatable and pleasant way now. It's one of those cases where you only realise how much you valued something when it's suddenly taken away.

Even much misunderstood and maligned features such as CORS can enable developers to achieve goals we couldn't reach before: for example, letting you load content from other origins without restrictions (if configured correctly!).

When I look back and forth in this way, I feel very optimistic!

10. Try my experiments!