Songbird: Spatial Audio Encoding on the Web
Hear Songbird in action (currently the examples only work on laptops/desktops):
The implementation of Songbird is based on the Google spatial media specification. It expects mono input to its Source instances and outputs an ambisonic (multichannel) signal in ACN channel layout with SN3D normalization. Detailed documentation is provided with the project.
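To make the output format concrete, here is a minimal sketch of textbook first-order ambisonic encoding in ACN channel order with SN3D normalization. It is not taken from Songbird's source. The function names are hypothetical, and the angle convention (azimuth measured from the front, positive toward the left) is an assumption.

```javascript
// Textbook first-order ambisonic gains (ACN order, SN3D normalization).
// Illustrative only; this is not Songbird's internal encoder.
function encodeFirstOrder(azimuthDeg, elevationDeg) {
  const az = (azimuthDeg * Math.PI) / 180;
  const el = (elevationDeg * Math.PI) / 180;
  return [
    1,                           // ACN 0 (W): omnidirectional
    Math.sin(az) * Math.cos(el), // ACN 1 (Y): left/right
    Math.sin(el),                // ACN 2 (Z): up/down
    Math.cos(az) * Math.cos(el), // ACN 3 (X): front/back
  ];
}

// Scale one mono sample by each gain to get the four ambisonic channels.
function encodeSample(sample, gains) {
  return gains.map((gain) => sample * gain);
}
```

A source directly in front (azimuth 0, elevation 0) contributes only to the W and X channels. Moving it to the left shifts energy from X into Y.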
Table of Contents
- How it works
- Differences to PannerNode
- Related Resources
How it works
Songbird supports spatial audio encoding for the Web using Higher-Order Ambisonics (HOA). This is accomplished by attaching audio input to a Source, which has associated spatial object parameters. Source objects are attached to a Songbird instance, which models the listener as well as the room environment the listener and sources are in. Binaurally-rendered ambisonic output is generated using Omnitone, and raw ambisonic output is exposed as well.
Songbird is designed for web front-end projects, so installing it through NPM is recommended. You can also clone this repository and use the library file directly.
npm install songbird-audio
The first step is to include the library file in an HTML document.
<!-- Use Songbird from installed node_modules/ -->
<!-- if you prefer to use CDN -->
<!-- or -->
Spatial encoding is done by creating a Songbird scene using an associated AudioContext and then creating any number of associated Source objects with Songbird.createSource(). Any number of AudioNodes can be connected to the Source objects. The Songbird scene models a physical listener while adding room reflections and reverberation. The Source instances model acoustic sound sources. The library is designed to be easily integrated into an existing WebAudio audio graph.
"Hello World" Example
Let's see how we can create a scene and generate some audio. Let's begin by constructing a Songbird scene and connecting it to the audio output. A live demo of this example is also available.
var audioContext = new AudioContext();

// Create a (1st-order Ambisonic) Songbird scene.
var songbird = new Songbird(audioContext);

// Send songbird's binaural output to stereo out.
songbird.output.connect(audioContext.destination);
Next, let's add a room. By default, the room size is 0m x 0m x 0m (i.e. there is no room and we are in free space). To define a room, we simply need to provide the dimensions in meters (the room's center is the origin). We can also define the material of each of the 6 surfaces (4 walls + ceiling + floor). A range of materials is predefined in Songbird, each with different reflection properties. To create hidden/missing walls, select the 'transparent' material.
// Set room acoustics properties.
var dimensions = {
  width: 3.1,
  height: 2.5,
  depth: 3.4
};
var materials = {
  left: 'brick-bare',
  right: 'curtain-heavy',
  front: 'marble',
  back: 'glass-thin',
  down: 'grass',
  up: 'transparent'
};
songbird.setRoomProperties(dimensions, materials);
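Because the room is centered on the origin, each wall sits at plus or minus half the matching dimension. The sketch below makes that geometry explicit, which helps when checking whether a source or listener position falls inside the room. Both helpers are hypothetical and not part of Songbird. Mapping width, height and depth to the x, y and z axes is an assumption.

```javascript
// Hypothetical helpers, not part of the Songbird API.
// Assumes width maps to x, height to y and depth to z.
function roomBounds(dimensions) {
  return {
    x: [-dimensions.width / 2, dimensions.width / 2],
    y: [-dimensions.height / 2, dimensions.height / 2],
    z: [-dimensions.depth / 2, dimensions.depth / 2],
  };
}

// Returns true when position [x, y, z] lies within the room (walls included).
function isInsideRoom(position, dimensions) {
  const bounds = roomBounds(dimensions);
  return ['x', 'y', 'z'].every(function (axis, i) {
    return position[i] >= bounds[axis][0] && position[i] <= bounds[axis][1];
  });
}
```

For a room with `width: 4`, the left and right walls would sit at x = -2 and x = 2 under these assumptions.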
Next, we create an audio element, load some audio and feed the audio element
into the audio graph as an AudioNode. We then create a
Source and connect the
AudioNode to it. The default position for a
Source is the origin.
// Create an audio element. Feed into audio graph.
var audioElement = document.createElement('audio');
audioElement.src = 'resources/SpeechSample.wav';

// Create an AudioNode from the audio element.
var audioElementSource = audioContext.createMediaElementSource(audioElement);

// Create a Source, connect desired audio input to it.
var source = songbird.createSource();
audioElementSource.connect(source.input);
Finally, we can position the source relative to the listener and then play back the audio with the familiar .play() method. This will binaurally render the scene we have just created.
// The source position is relative to the origin (center of the room).
source.setPosition(x, y, z);

// Playback the audio.
audioElement.play();
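The steps of this walkthrough can be gathered into a single function. This is a sketch under stated assumptions. `createHelloWorldScene` and its `options` shape are hypothetical. The Songbird member names used (`output`, `setRoomProperties`, `createSource`, `input`, `setPosition`) are assumptions that should be checked against the documentation. The constructor is passed in as a parameter so the function does not depend on how the library was loaded.

```javascript
// Sketch only. SongbirdClass is the Songbird constructor, passed in explicitly.
// The Songbird member names used here are assumptions; verify them in the docs.
function createHelloWorldScene(audioContext, SongbirdClass, audioElement, options) {
  // Create the scene and send its binaural output to the speakers.
  const songbird = new SongbirdClass(audioContext);
  songbird.output.connect(audioContext.destination);

  // Describe the room (dimensions in meters, one material per surface).
  songbird.setRoomProperties(options.dimensions, options.materials);

  // Feed the audio element into a new Source and position it.
  const elementSource = audioContext.createMediaElementSource(audioElement);
  const source = songbird.createSource();
  elementSource.connect(source.input);
  source.setPosition(...options.sourcePosition);

  return { songbird: songbird, source: source };
}
```

In a browser this would be called with a real `AudioContext`, the global `Songbird` constructor and an `<audio>` element, followed by `audioElement.play()`.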
Positioning Sources and the Listener
Source objects can be placed with cartesian coordinates relative to the origin
(center of the room). Songbird uses a right-handed coordinate system, similar
to OpenGL and Three.js.
// Set Source and Listener positions.
source.setPosition(x, y, z);
songbird.setListenerPosition(x, y, z);
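When a position is known as a direction and a distance rather than as cartesian coordinates, it has to be converted first. The sketch below shows one such conversion for an OpenGL-style right-handed frame. The helper is hypothetical and not part of Songbird. The axis and angle conventions (+x right, +y up, -z forward, azimuth positive toward the left) are assumptions.

```javascript
// Hypothetical helper, not part of the Songbird API.
// Assumes +x is right, +y is up and -z is forward (OpenGL-style).
// Azimuth is measured from the front, positive toward the left.
// Elevation is positive upward. Both angles are in degrees.
function sphericalToCartesian(azimuthDeg, elevationDeg, distance) {
  const az = (azimuthDeg * Math.PI) / 180;
  const el = (elevationDeg * Math.PI) / 180;
  return [
    -distance * Math.sin(az) * Math.cos(el), // x: left is negative
    distance * Math.sin(el),                 // y: up is positive
    -distance * Math.cos(az) * Math.cos(el), // z: front is negative
  ];
}
```

The resulting array can be spread into a three-argument position setter.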
Source and Listener orientations can be set using forward and up vectors:
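// Set Source and Listener orientation.
source.setOrientation(forwardX, forwardY, forwardZ, upX, upY, upZ);
songbird.setListenerOrientation(forwardX, forwardY, forwardZ, upX, upY, upZ);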
The position and orientation of a Source and of the Listener can also be set using Three.js Matrix4 objects.
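A Matrix4 carries position, forward vector and up vector in one object. The sketch below shows how those three pieces can be read from the 16-element, column-major `elements` array that Three.js uses. The helper names are hypothetical. Treating the negative z axis as "forward" is an assumption based on the OpenGL-style convention, and how Songbird itself interprets the matrix is not confirmed here.

```javascript
// Hypothetical helpers, not part of Songbird or Three.js.
function normalize(v) {
  const length = Math.hypot(v[0], v[1], v[2]);
  return v.map(function (component) { return component / length; });
}

// `elements` is a column-major 4x4 matrix, as in THREE.Matrix4#elements.
// Assumes the matrix is a rigid transform and that -z is "forward".
function decomposeMatrix4(elements) {
  return {
    position: [elements[12], elements[13], elements[14]],
    up: normalize([elements[4], elements[5], elements[6]]),
    forward: normalize([-elements[8], -elements[9], -elements[10]]),
  };
}
```

With Three.js loaded, `decomposeMatrix4(object.matrixWorld.elements)` would give the pose of a scene object in world space.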
Room properties can be set to control the characteristics of spatial reflections and reverberation. Songbird supports a set of named materials, including 'transparent', 'brick-bare', 'curtain-heavy', 'marble', 'glass-thin', 'grass', and 'uniform'.
When constructing a
Songbird scene, optional scene-related arguments may be
provided to override default values.
var songbirdOptions = {
  ambisonicOrder: 1,
  listenerPosition: [1, 0, 0],
  listenerForward: [1, 0, 0],
  listenerUp: [0, 1, 0],
  dimensions: { width: 3, height: 4, depth: 5 },
  materials: {
    left: 'uniform', right: 'uniform', down: 'uniform',
    up: 'uniform', front: 'uniform', back: 'uniform'
  },
  speedOfSound: 340
};
var songbird = new Songbird(audioContext, songbirdOptions);
Likewise, when creating a new Source, source-related optional arguments may be provided to override default values.
var sourceOptions = {
  position: [0, 10, 10],
  forward: [0, 0, -1],
  up: [0, 1, 0],
  minDistance: 0.1,
  maxDistance: 1000,
  rolloff: 'logarithmic',
  gain: 0.1,
  alpha: 0,
  sharpness: 1,
  sourceWidth: 0
};
var source = songbird.createSource(sourceOptions);
See the documentation for more details on all optional arguments.
Differences to PannerNode
There are several advantages to using Songbird over WebAudio's PannerNode.
- Room Acoustics
PannerNode requires two convolutions per encoded source. But because we employ ambisonics, there is a fixed cost associated with rendering from Songbird, with nominal costs per source. Developers can adjust the desired ambisonic order (from 1 to 3) to control the majority of computational costs.
In addition to controlling computational costs, adjusting the ambisonic order controls the quality of spatialization (a higher order typically yields better direct source localization). Additionally, we offer direct ambisonic output (which bypasses the rendering), allowing developers total control over how to render their content.
Songbird comes with room modelling effects, which include both an early and a late reflection model based on the room properties. These effects are likewise ambisonically encoded and fully spatialized.
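The link between ambisonic order and cost comes from the channel count: a full-sphere ambisonic signal of order N has (N + 1)^2 channels. A minimal sketch follows, with a hypothetical helper name.

```javascript
// Number of channels in a full-sphere ambisonic signal of the given order.
function ambisonicChannelCount(order) {
  return (order + 1) * (order + 1);
}

// Channel counts for the orders mentioned above (1 to 3).
const channelCounts = [1, 2, 3].map(ambisonicChannelCount);
console.log(channelCounts); // [4, 9, 16]
```

Going from first to third order therefore quadruples the number of channels that have to be encoded and rendered.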
Porting PannerNode projects to Songbird
For projects already employing PannerNode, it is fairly simple to switch to Songbird. Below is a basic PannerNode example:
// Create a "PannerNode."
var panner = audioContext.createPanner();

// Initialize properties
panner.panningModel = 'HRTF';
panner.distanceModel = 'inverse';
panner.refDistance = 0.1;
panner.maxDistance = 1000;

// Connect input to "PannerNode".
audioElementSource.connect(panner);

// Connect "PannerNode" to audio output.
panner.connect(audioContext.destination);

// Set "PannerNode" and Listener positions.
panner.setPosition(x, y, z);
audioContext.listener.setPosition(x, y, z);
And below is the same example converted to Songbird:
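// Create a Songbird "Source" with properties.
var source = songbird.createSource({
  rolloff: 'logarithmic',
  minDistance: 0.1,
  maxDistance: 1000
});

// Connect input to "Source."
audioElementSource.connect(source.input);

// Connect Songbird’s output to audio output.
songbird.output.connect(audioContext.destination);

// Set "Source" and Listener positions.
source.setPosition(x, y, z);
songbird.setListenerPosition(x, y, z);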
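When porting, it can help to know what attenuation the original PannerNode applied, so the two versions can be compared by ear. The sketch below reproduces the 'inverse' distance model as defined in the Web Audio API specification. The function name is hypothetical, and nothing here describes how Songbird's own rolloff models are computed.

```javascript
// Gain applied by PannerNode's 'inverse' distance model (Web Audio API spec).
// Default values mirror PannerNode's defaults for refDistance and rolloffFactor.
function inverseDistanceGain(distance, refDistance = 1, rolloffFactor = 1) {
  // Distances closer than refDistance are clamped, so the gain never exceeds 1.
  const clamped = Math.max(distance, refDistance);
  return refDistance / (refDistance + rolloffFactor * (clamped - refDistance));
}
```

With `refDistance = 0.1`, as in the PannerNode example, a source 1 m from the listener is attenuated to roughly one tenth of its original amplitude.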
Songbird uses WebPack to build the minified library and to manage dependencies.
npm install         # install dependencies.
npm run build       # build a non-minified library.
npm run watch       # recompile whenever any source file changes.
npm run build-all   # build a minified library.
npm run build-doc   # generate documentation.
npm run eslint      # lint code for ES6 compatibility.
Note that unit tests require the promisified version of OfflineAudioContext, so they might not run on non-spec-compliant browsers. Songbird's Travis CI uses the latest stable version of Chrome.
Testing Songbird Locally
Local testing with the Karma test runner requires a Chrome/Chromium-based browser. Windows and Linux platforms are supported.
Special thanks to Alper Gungormusler, Hongchan Choi, Marcin Gorzel, and Julius Kammerl for their help on this project.
If you have found an error in this library, please file an issue at: https://github.com/Google/songbird/issues.
Patches are encouraged, and may be submitted by forking this project and submitting a pull request through GitHub. See CONTRIBUTING for more detail.
Copyright © 2017 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.