Playing sound from a location in a 3D scene

Position sound from a specific direction and automatically raise or lower volume based on the environment.

Overview

Audio that adjusts its volume based on distance to a specific reference point is called spatial audio. Spatial audio transmits sound from a particular location, or source, in a specific direction that you define by describing the sound’s 3D position and orientation. Your app can leverage spatial audio to define the point of reference, or define a listener as a player in a game. For example, to indicate that a horn resides on the left-hand side of a scene, PHASE outputs audio more through the left speaker.

PHASE spatial audio accommodates a scene layout in several ways: by observing objects in the scene, the shape of the sound’s source, obstructions, and the point of reference of a listener. To complement obstacles and indoor locations in your scene, you can use spatial audio to layer environmental effects, for example, a reflection or reverberation, on top of spatial sounds.

After describing your app’s scene, you queue sound to play by creating an event node asset that instructs PHASE on what to play and when. At runtime, PHASE checks your app’s state through the event node asset and plays sounds at the right moment that react in accordance to your app’s visual appearance.

Set an object’s scene position and orientation

Before PHASE plays sound in 3D, you create several objects to model your app’s visual scene. Initialize and add the following classes to rootObject, the engine’s root object:

PHASESource for sound sources
PHASEListener for a listener
PHASEOccluder for optional occluders

Each class derives from PHASEObject, which defines a transform to provide it with a unique 3D position and orientation in the scene.

An app can define the scene as a level in a game that’s composed of a detailed set of visual properties, for example, textured pixel data, lighting, and animation. However, to play sound in accordance with the visual scene, PHASE only needs to know the scene’s geometric shape. PHASE interprets the shape through a tranform (of type simd_float4x4), and in particular, its array of four-column vectors. The following transform defines a zeroed orientation and position for a board-game piece:

var chessPiecePose: simd_float4x4 = matrix_identity_float4x4

Define the location from which sound plays

The first type of object you define in the PHASE scene describes a location from which sound originates, such as a chess piece that makes a shuffling noise as it slides to a location on the board. A source emanates sound from a single point in space, or a voluminous area. To produce sound at a point in the scene, create a PHASESource with the engine parameter, for example:

let chessPiecePointSource = PHASESource(engine: engine)

Then, add the source to the scene by handing it to the engine. You can add an object as a child to any other object by calling addChild(_:) on the parent. The result models a scene as a connected graph of objects, or an object hierarchy. The following code adds the chess piece as child of the engine’s root:

do { try engine.rootObject.addChild(chessPiecePointSource) } 
catch { print ("Failed to add a child object to the scene.") }

PHASE interprets object positions in the coordinate space of the parent, which for the root object is the coordinate space of the scene, or the world space. Set an object’s position by assigning the transform’s first three elements of the last column. The following code sets a chess piece’s position to (0,0,-6), which is 6 meters in front of the world origin (0,0,0):

chessPiecePose.columns.3.z -= 6.0
chessPiecePointSource.transform = chessPiecePose

Originate sound from a geometric area

For spatial sounds, the framework requires a 3D position from which the sound originates. To emanate sound from an area larger than a point, for example, the full length of an electric fence in a game, describe the region by configuring a PHASEShape object. The following code models a fence by using a Model I/O plane:

let fenceMesh = MDLMesh(
   planeWithExtent: vector3(0.1, 0.1, 0.1), 
   segments: vector2(1, 1), 
   geometryType: .triangles, 
   allocator: nil)

let fenceShape = PHASEShape(engine: engine, mesh: fenceMesh)
let volumetricFenceSource = PHASESource(engine: engine, shapes: [fenceShape])

do { try engine.rootObject.addChild(volumetricFenceSource) } 
catch { print ("Failed to add a child object to the scene.") }

Create an object that hears sound

A PHASEListener is an object within an app that hears sound; it’s the central position and orientation at which the user experiences spatial audio. The framework adjusts the volume of a sound source based on its unique position and orientation with respect to the listener. For example, a sound that plays far off in the distance from the listener plays quietly, and a closer sound plays more loudly. The following code creates a listener and defines its position and orientation in the scene:

let origin: simd_float4x4 = matrix_identity_float4x4
let listener = PHASEListener(engine: engine)
listener.transform = origin

Add the listener to the scene by inserting it into the object hierarchy. The following code adds the listener as a child of the engine’s root object:

do { try engine.rootObject.addChild(listener) } 
catch { print ("Failed to add child object to the scene.") }

Set up occluder options for obstructed sound

The framework lowers the volume of a sound source when an obstacle blocks its path to the listener. The result creates a realistic effect in cases where, for example, the player takes cover behind a gallery pillar, which reduces the volume of commotion from the other side of the room.

To define an obstacle, create a PHASEOccluder with a shape and position that matches the pillar’s visual counterpart. The following code defines an occluder’s geometric shape using Model I/O:

defaultOccluderSize: Float = 1.0

let pillarOccluderMesh = MDLMesh.newCylinder(withHeight: 10.0, 
    radii: vector_float2(defaultOccluderSize, defaultOccluderSize), 
    radialSegments: 9, 
    verticalSegments: 1, 
    geometryType: MDLGeometryType.triangles, 
    inwardNormals: false, 
    allocator: nil)

You can tweak the reflectivity of a particular occluder by choosing a physical material that makes up the occluder. The material you choose implements a blend between how much an occluder reflects sound and how much sound passes through it. To select a material, pass its corresponding preset, PHASEMaterialPreset, into a new PHASEMaterial object:

let occluderMaterial = PHASEMaterial(engine: engine, preset: .concrete)
let occluderShape = PHASEShape(engine: engine,
   mesh: pillarOccluderMesh,
   materials: [occluderMaterial])

let occluder = PHASEOccluder(engine: engine,
   shapes: [occluderShape])

Position an occluder in the scene by setting its transform. The following code places an occluder midway between the source and listener by dividing the custom defaultSourceDistance variable by two:

let defaultSourceDistance: Float = 10.0

var occluderTransform: simd_float4x4 = origin
occluderTransform.columns.3.z -= defaultSourceDistance / 2.0
occluder.transform = occluderTransform

Activate the occluder by adding it into the scene. The following code adds the occluder as a child of the root object:

do { try engine.rootObject.addChild(occluder) } 
catch { print ("Failed to add a child object to the scene.") }

Describe the output pipeline

As one of the final stages in audio playback configuration, the app specifies the particular object, or mixer, that combines in-flight audio signals for transmission to the output device. For spatial audio, the app creates a spatial mixer, PHASESpatialMixerDefinition. The following code defines a spatial mixer for two sources:

let chessPieceSpatialMixer = PHASESpatialMixerDefinition(
    spatialPipeline: PHASESpatialPipeline(
        flags: .directPathTransmission)!)

let fenceSpatialMixer = PHASESpatialMixerDefinition(
    spatialPipeline: PHASESpatialPipeline(
        flags: .directPathTransmission)!)

The spatial mixer can add environmental layers to the output, such as reflections or reverb, by including the earlyReflections or lateReverb cases, in addition to directPathTransmission, to the flags argument.

Adjust volume based on distance

PHASE attenuates sound over the distance between a source and a listener by observing the distance model you define on the spatial mixer. As a source emits sound, the spatial mixer adjusts its volume based on the distance from the listener. The farther away the source is from the listener, the more the volume attenuates and the quieter the sound gets with respect to the listener.

For more information about distance modeling and its various types, geometric and envelope, see Spatial Mixing.

PHASEGeometricSpreadingDistanceModelParameters is a model that simulates sound loss over distance realistically. The distance model rolloffFactor emphasizes or deemphasizes this model. At 1.0, sound that emanates between the source and listener loses 6 dB every time distance doubles. At 2.0, the loss doubles. At 0.5, the loss halves, and so on. The following code defines a geometric spatial model for the scene’s spatial mixers:

let simpleModel = PHASEGeometricSpreadingDistanceModelParameters()
simpleModel.rolloffFactor = 1.0

chessPieceSpatialMixer.distanceModelParameters = simpleModel
fenceSpatialMixer.distanceModelParameters = simpleModel

Generate a sound event

When PHASE understands the layout and configuration of the scene, sound reacts in accordance when your app plays a sound event by using PHASESoundEvent. You generate a sound event from a sound-event node that describes your particular playback needs.

First, identify the sound assets for your sound events and register them with the engine’s asset registry. The following code loads two files bundled with the project titled “shuffling.wav” and “buzzing.wav”:

let shufflingSoundUrl = Bundle.main.url(forResource: "shuffling", withExtension: "wav")!
var shufflingSoundAsset:PHASESoundAsset!
do {
    shufflingSoundAsset = try engine.assetRegistry.registerSoundAsset(
    url: shufflingSoundUrl,
    identifier: "shufflingSound",
    assetType: PHASEAsset.AssetType.resident,
    channelLayout: nil,
    normalizationMode: .dynamic)
} catch {
    print("Failed to register the sound asset.")
}

let buzzingSoundUrl = Bundle.main.url(forResource: "buzzing", withExtension: "wav")!
var electricBuzzingSoundAsset:PHASESoundAsset!
do {
    electricBuzzingSoundAsset = try engine!.assetRegistry.registerSoundAsset(
    url: buzzingSoundUrl,
    identifier: "buzzingSound",
    assetType: .resident,
    channelLayout: nil,
    normalizationMode: .dynamic)
} catch {
    print("Failed to register the sound asset.")
}

The following code creates sampler nodes for both the shuffling chess piece and electrified fence:

// Define the shuffling chess piece one-shot sound.
let shufflingSamplerNode = PHASESamplerNodeDefinition(
    soundAssetIdentifier: shufflingSoundAsset.identifier,
    mixerDefinition: chessPieceSpatialMixer,
    identifier: shufflingSoundAsset.identifier + "_SamplerNode")

shufflingSamplerNode.playbackMode = .oneShot

// Define the buzzing electric fence looping sound.
let electricBuzzingSamplerNode = PHASESamplerNodeDefinition(
    soundAssetIdentifier: electricBuzzingSoundAsset.identifier,
    mixerDefinition: fenceSpatialMixer,
    identifier: electricBuzzingSoundAsset.identifier + "_SamplerNode")

electricBuzzingSamplerNode.playbackMode = .looping

Give PHASE information about the sound-event nodes by registering their assets with the engine:

var shufflingSoundEventAsset: PHASESoundEventNodeAsset!
do { shufflingSoundEventAsset = try 
    engine.assetRegistry.registerSoundEventAsset(  
    rootNode: shufflingSamplerNode,
    identifier: shufflingSoundAsset.identifier + "_SoundEventAsset")
} catch { print("Failed to register a sound event asset.") } 

var buzzingSoundEventAsset: PHASESoundEventNodeAsset!
do { buzzingSoundEventAsset = try 
    engine.assetRegistry.registerSoundEventAsset(  
    rootNode: electricBuzzingSamplerNode,
    identifier: electricBuzzingSoundAsset.identifier + "_SoundEventAsset")
} catch { print("Failed to register a sound event asset.") }

Define which sound source plays the audio and the listener that hears it by configuring a PHASEMixerParameters object for each spatial mixer:

let chessPieceSpatialMixerParams = PHASEMixerParameters()
chessPieceSpatialMixerParams.addSpatialMixerParameters(
    identifier: chessPieceSpatialMixer.identifier,
    source: chessPiecePointSource,
    listener: listener)

let fenceSpatialMixerParams = PHASEMixerParameters()
fenceSpatialMixerParams.addSpatialMixerParameters(
    identifier: fenceSpatialMixer.identifier,
    source: volumetricFenceSource,
    listener: listener)

To play the sounds, generate a PHASESoundEvent instance for each node and call start(completion:) on the sound event. Associate the source to the sound event by passing its spatialMixerParams through the PHASESoundEvent initializer mixerParameters argument:

let buzzingSoundEvent = try! PHASESoundEvent(engine: engine,
    assetIdentifier: buzzingSoundEventAsset.identifier,
    mixerParameters: fenceSpatialMixerParams)

buzzingSoundEvent.start(completion: nil)

let shufflingSoundEvent = try! PHASESoundEvent(engine: engine,
    assetIdentifier: shufflingSoundEventAsset.identifier,
    mixerParameters: chessPieceSpatialMixerParams)

shufflingSoundEvent.start(completion: nil)

Because electricBuzzingSamplerNode and shufflingSamplerNode have no children, both buzzingSoundEventAsset and shufflingSoundEventAsset contain a node hierarchy of only one node. When the app invokes a sound event from these assets, the same audio plays consistently.