Audio

Chapter 5

Get your games rocking!

Introduction

We often focus on the visual aspects of games, but the audio aspects can really make a game shine. Consider that many game soundtracks are now presented as orchestral performances, and consider how important sound effects can be for conveying what is happening in a game.

In this chapter, we will explore both sound effects and music, and how to implement them within MonoGame.

Sound Effects

From the “bing” of a coin box in Super Mario Bros to the reveal chimes of the Zelda series, sound effects provide a powerful mechanism for informing the player of what is happening in your game world.

SoundEffect Class

MonoGame represents sound effects with the SoundEffect class. Like other asset types, we don’t normally construct it directly; instead, we load it through the content pipeline. Usually, a sound effect will start as a .wav file, though a handful of other file formats are acceptable.

Once loaded, the SoundEffect can be played with the SoundEffect.Play() method. This is essentially a fire-and-forget method: you invoke it, and the framework takes care of loading and playing the sound.

You can also use the SoundEffect.Play(float volume, float pitch, float pan) overload to customize the playback:

  • volume ranges from $ 0.0 $ (silence) to $ 1.0 $ (full volume)
  • pitch adjusts the pitch from $ -1.0 $ (down an octave) to $ 1.0 $ (up an octave), with $ 0.0 $ indicating no change
  • pan pans the sound in stereo, with $ -1.0 $ entirely on the left speaker, and $ 1.0 $ on the right, and $ 0.0 $ centered.

Note that the per-sound-effect volume is multiplied by the static SoundEffect.MasterVolume property. This allows for the adjustment of all sound effects in the game, separate from music.
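
As a minimal sketch of these calls, assume a sound effect has been added to the content project under the hypothetical asset name "coin". The CoinSounds class and its method names are illustrative only and are not part of MonoGame:

```csharp
using Microsoft.Xna.Framework.Audio;
using Microsoft.Xna.Framework.Content;

// Hypothetical helper that wraps one loaded sound effect
public class CoinSounds
{
    private readonly SoundEffect coin;

    public CoinSounds(ContentManager content)
    {
        // "coin" is an assumed asset name; use the name from your content project
        coin = content.Load<SoundEffect>("coin");
    }

    // Fire-and-forget playback at the effect's default settings
    public void PlayDefault()
    {
        coin.Play();
    }

    // Half volume, unchanged pitch, entirely on the left speaker
    public void PlayQuietOnLeft()
    {
        coin.Play(0.5f, 0.0f, -1.0f);
    }

    // Scales every sound effect in the game, independent of music
    public static void SetEffectsVolume(float volume)
    {
        SoundEffect.MasterVolume = volume;
    }
}
```

With SoundEffect.MasterVolume set to 0.5, the PlayQuietOnLeft() call above would be heard at a quarter of full volume, since the two values are multiplied.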

Warning

If you invoke Play() on a sound effect multiple frames in a row, it will start playing another copy of the sound effect on each frame. The result will be an ugly mash of sound. So be sure that you only invoke Play() once each time you want to use the sound!
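
One common way to avoid this is to trigger the sound only on the frame where an input changes, rather than on every frame it is held. The sketch below assumes the sound is tied to the space bar; JumpSoundTrigger is a hypothetical class name chosen for illustration:

```csharp
using Microsoft.Xna.Framework.Audio;
using Microsoft.Xna.Framework.Input;

// Hypothetical helper that plays a sound once per key press
public class JumpSoundTrigger
{
    private readonly SoundEffect jumpSound;
    private KeyboardState previous;

    public JumpSoundTrigger(SoundEffect jumpSound)
    {
        this.jumpSound = jumpSound;
        previous = Keyboard.GetState();
    }

    // Call once per frame, e.g. from Game.Update()
    public void Update()
    {
        KeyboardState current = Keyboard.GetState();

        // True only on the frame the key goes from up to down
        bool justPressed = current.IsKeyDown(Keys.Space)
                           && previous.IsKeyUp(Keys.Space);
        if (justPressed)
        {
            jumpSound.Play();
        }

        // Remember this frame's state for the next comparison
        previous = current;
    }
}
```

Holding the key down produces a single sound, because on every later frame the previous state already reports the key as down.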

SoundEffectInstance Class

If you need finer control of your sound effects, you can create a SoundEffectInstance with SoundEffect.CreateInstance(). This represents a single playing copy of the sound effect, which you start, pause, and stop yourself (essentially, SoundEffect.Play() creates a SoundEffectInstance that plays and disposes of itself automatically).

The SoundEffectInstance exposes properties that can be used to modify its behavior:

  • IsLooped is a boolean that when set to true, causes the sound effect to loop indefinitely.
  • Pan pans the sound in stereo, with $ -1.0 $ entirely on the left speaker, and $ 1.0 $ on the right, and $ 0.0 $ centered.
  • Pitch adjusts the pitch from $ -1.0 $ (down an octave) to $ 1.0 $ (up an octave), with $ 0.0 $ indicating no change
  • Volume ranges from $ 0.0 $ (silence) to $ 1.0 $ (full volume)
  • State returns a SoundState enumeration value, one of (SoundState.Paused, SoundState.Playing, or SoundState.Stopped)

The SoundEffectInstance also provides a number of methods:

  • Play() plays or resumes the sound effect
  • Pause() pauses the sound effect
  • Resume() resumes a paused sound effect
  • Stop() immediately stops the sound effect (so the next time it is played, it starts from the beginning)
  • Stop(bool immediate) also stops the sound effect, immediately if true, or after its authored release phase (i.e. a fade) if false
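
The properties and methods above might be combined as in the following sketch of a looping engine sound. EngineHum and its members are hypothetical names, and the SoundEffect passed in is assumed to have been loaded through the content pipeline already:

```csharp
using Microsoft.Xna.Framework.Audio;

// Hypothetical wrapper around one looping sound effect instance
public class EngineHum
{
    private readonly SoundEffectInstance hum;

    public EngineHum(SoundEffect humEffect)
    {
        hum = humEffect.CreateInstance();
        hum.IsLooped = true;   // configure looping before the first Play()
        hum.Volume = 0.6f;     // 0.0 is silence, 1.0 is full volume
    }

    // Call whenever the engine turns on or off
    public void SetRunning(bool running)
    {
        if (running && hum.State != SoundState.Playing)
        {
            hum.Play();        // starts the sound, or resumes it if paused
        }
        else if (!running && hum.State == SoundState.Playing)
        {
            hum.Pause();       // keeps the playback position for later
        }
    }

    // Call when the sound is no longer needed
    public void Shutdown()
    {
        hum.Stop();
        hum.Dispose();
    }
}
```

Checking State before calling Play() or Pause() is a design choice made here for clarity; it keeps the intent of each branch explicit.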

Perhaps the strongest reason for creating a SoundEffectInstance is to be able to create positional sound. We’ll discuss this next.

Positional Sounds

Positional sounds provide the illusion of depth and movement by using panning, Doppler shift, and other techniques to emulate the effect movement and distance have on sounds. Positional sounds can convey important information in games, especially when combined with surround-sound speakers and headphones.

To create positional sound effects, we need to place the sound in a 3D (or pseudo 2D) soundscape, which incorporates both a listener (i.e. the player) and an emitter (the source of the sound). Consider the example soundscape below:

[Figure: An example soundscape]

We have two sound effects, one played by emitter A and one by emitter B, and the player is represented by the listener. If we imagine the listener is facing downwards, we would expect that the sound from emitter A will play more on the right speaker, and emitter B on the left (given stereo speakers). For a surround sound system, these would be further distinguished by playing on the front speakers.

In addition to determining which speaker(s) a sound is played with, positional sounds also usually incorporate attenuation and the Doppler effect.

Attenuation in this context means that sound waves get softer the farther they travel (as some of the energy in the wave is absorbed by the air as heat). Thus, as emitter B is farther from the listener than emitter A, we would expect that if the same sound were played by both emitters, emitter B would be softer.

The Doppler effect refers to the change in pitch of a sound when either the emitter or listener is moving. When the distance between the emitter and listener is getting smaller, the sound waves emitted by the emitter are closer together (higher frequency), resulting in a higher pitch. And when they are moving apart, the waves are farther apart, resulting in a lower frequency and pitch.
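
For the simple case where the emitter and listener move along the line between them, the classical Doppler relation from physics can be written as below. The symbols are introduced here for illustration: $ f $ is the emitted frequency, $ f' $ the frequency heard, $ c $ the speed of sound, $ v_L $ the listener's speed toward the emitter, and $ v_E $ the emitter's speed toward the listener. This is the textbook formula, and an audio library's internal calculation may differ from it:

```latex
f' = f \cdot \frac{c + v_L}{c - v_E}
```

For example, with $ c = 343 $ m/s, a stationary listener ($ v_L = 0 $), and an emitter approaching at $ v_E = 34.3 $ m/s, a 440 Hz tone would be heard at $ 440 \cdot 343 / 308.7 \approx 488.9 $ Hz. Negative speeds describe movement apart and give a lower frequency.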

Info

Position, attenuation, and doppler effect represent some of the easiest-to-implement aspects of the physics of sound, which is why they are commonly implemented in video game audio libraries. More complex is the interaction of sound with the environment, i.e. absorption and reflection by surfaces in the game world. This parallels the early days of 3D rendering, when the Phong illumination model (which we’ll talk about soon) provided a simplistic but adequate technique for handling lights in a 3D scene.

The MonoGame framework provides two classes for establishing positional sound, the AudioEmitter and AudioListener.

AudioListener Class

The AudioListener class represents the player (or microphone) in the game world, and all position, attenuation, and doppler effects are calculated relative to its position, orientation, and velocity. It exposes four properties:

  • Position is a Vector3 defining the position of the listener in the game world
  • Forward is a Vector3 defining the direction the listener is facing in the game world.
  • Up is a Vector3 defining the direction up relative to the direction the player is facing (generally it would be Vector3.Up). It is used as part of the 3D math calculating the effects.
  • Velocity is a Vector3 defining the velocity at which the listener is moving in the game world.

When using an AudioListener instance, you would set these each update to reflect the corresponding position, orientation, and velocity of the player.

AudioEmitter Class

The AudioEmitter class represents the source of a sound in the game world. It exposes the same properties as the AudioListener:

  • Position is a Vector3 defining the position of the emitter in the game world
  • Forward is a Vector3 defining the direction the emitter is facing in the game world.
  • Up is a Vector3 defining the direction up relative to the direction the emitter is facing (generally it would be Vector3.Up). It is used as part of the 3D math calculating the effects.
  • Velocity is a Vector3 defining the velocity at which the emitter is moving in the game world.

Playing Positional Sound

Positional sounds are played by a SoundEffectInstance, not by the actual emitter; instead, the emitter serves to locate the sound source. Thus, to calculate and apply the 3D effects on a sound effect we would use something like:

// Load the effect and create a controllable instance of it
SoundEffect sfx = Content.Load<SoundEffect>("sound");
var instance = sfx.CreateInstance();
var listener = new AudioListener();
// TODO: Position and orient listener
var emitter = new AudioEmitter();
// TODO: Position and orient emitter
// Calculate and apply the 3D effects to the instance
instance.Apply3D(listener, emitter);
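
Because the effects depend on where the listener and emitter are, a moving sound source would typically be repositioned and have Apply3D() called again as the game updates. The following is a sketch under that assumption; MovingSoundSource is a hypothetical class, and the listener is assumed to be maintained elsewhere in the game:

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Audio;

// Hypothetical wrapper pairing a sound instance with its emitter
public class MovingSoundSource
{
    private readonly SoundEffectInstance instance;
    private readonly AudioEmitter emitter = new AudioEmitter();

    public MovingSoundSource(SoundEffect effect, Vector3 startPosition)
    {
        instance = effect.CreateInstance();
        instance.IsLooped = true;
        emitter.Position = startPosition;
    }

    // Call once per frame with the game's shared listener
    public void Update(GameTime gameTime, AudioListener listener, Vector3 velocity)
    {
        float dt = (float)gameTime.ElapsedGameTime.TotalSeconds;

        // Move the emitter and record its velocity for the Doppler calculation
        emitter.Velocity = velocity;
        emitter.Position += velocity * dt;

        // Recalculate the 3D effects for the new positions
        instance.Apply3D(listener, emitter);

        // Start the loop the first time through
        if (instance.State == SoundState.Stopped)
        {
            instance.Play();
        }
    }
}
```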

Using Positional Sound in a 2D Game

The positional sound support in MonoGame is for 3D soundscapes, but just as we can render 2D sprites using 3D hardware, we can create 2D soundscapes in 3D. The easiest technique for this is to position all our emitters and listeners in the plane $ z=0 $.

The Vector3 constructor actually has support for this built-in as it can take a Vector2 for the X and Y components, and a separate scalar for the Z component. Consider a game where we represent the player’s position with a Vector2 position, direction with a Vector2 direction, and velocity with a Vector2 velocity. We can update our AudioListener listener with:

// Update listener properties
listener.Position = new Vector3(position, 0);
listener.Forward = new Vector3(direction, 0);
listener.Velocity = new Vector3(velocity, 0);

Since the Up vector will never change, we can just set it to Vector3.UnitZ (which is the vector $ <0,0,1> $) when we first create the listener.

The emitters would be set up the same way.
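
As a sketch of that emitter setup, assume each sound source tracks its own Vector2 position, direction, and velocity. Audio2D and its method names are hypothetical, and direction is assumed to be a non-zero unit vector:

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Audio;

// Hypothetical helpers for placing emitters in the z = 0 plane
public static class Audio2D
{
    // Create an emitter whose Up vector points out of the 2D plane
    public static AudioEmitter CreateEmitter()
    {
        return new AudioEmitter { Up = Vector3.UnitZ };
    }

    // Copy 2D game state into the emitter, with every Z component set to 0
    public static void UpdateEmitter(AudioEmitter emitter, Vector2 position,
                                     Vector2 direction, Vector2 velocity)
    {
        emitter.Position = new Vector3(position, 0);
        emitter.Forward = new Vector3(direction, 0);
        emitter.Velocity = new Vector3(velocity, 0);
    }
}
```

Each emitter would then be passed, along with the listener, to Apply3D() on the SoundEffectInstance it locates.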

Music

Music also has a powerful role to play in setting the mood. It can also be used to convey information to the player, as Super Mario Bros does when the remaining time to finish the level falls below 1 minute.

Song Class

While it is possible to play music using a SoundEffect, MonoGame supports music through the Song class. This represents a song loaded from a .wav or .mp3 file.

In addition to the audio data, the Song defines properties for accessing the audio file’s metadata:

  • Name is the name of the song
  • Album is the album the song is from
  • Artist is the song’s artist
  • Duration is the length of the song.
  • Genre is the genre of the song
  • TrackNumber is the song’s track number on its album

Note that for these properties to be populated, the original audio file would need to have the corresponding metadata set.

Unlike the SoundEffect, the Song class does not have a Play() method. Instead, it is played with the static MediaPlayer class, i.e.:

Song song = Content.Load<Song>("mysong");
MediaPlayer.Play(song);

SongCollection Class

Invoking MediaPlayer.Play() will immediately end the current song, so if you want your game to transition between songs smoothly, you’ll probably want to use the SongCollection class.

As you might expect, this is a collection of Song objects, and it implements the following methods:

  • Add(Song song) adds a song to the collection
  • Clear() clears the collection.

SongCollections can also be played with the static MediaPlayer.Play(SongCollection collection) method:

Song song1 = Content.Load<Song>("song1");
Song song2 = Content.Load<Song>("song2");
Song song3 = Content.Load<Song>("song3");
SongCollection songCollection = new SongCollection();
songCollection.Add(song1);
songCollection.Add(song2);
songCollection.Add(song3);
MediaPlayer.Play(songCollection);

The MediaPlayer Class

The static MediaPlayer class is really an interface to the Windows Media Player. Unlike sound effects, which are played by communicating directly with the sound card and manipulating audio buffers, songs are piped through the Windows Media Player. This is why MediaPlayer can only play a single song at a time.

Some of the most useful properties of the MediaPlayer for games are:

  • IsMuted - A boolean property that can be used to mute or unmute the game’s music
  • Volume - A number between 0 (silent) and 1 (full volume) that the music will play at
  • IsRepeating - A boolean property that determines if the song or song list should repeat
  • IsShuffled - A boolean property that determines if a song list should be played in a shuffled order
  • State - A value of the MediaState enum, describing the current state of the media player, which can be MediaState.Paused, MediaState.Playing, or MediaState.Stopped.

Much like you would expect from a media playing device, the MediaPlayer also implements some familiar controls as methods:

  • Play(Song song) and Play(SongCollection collection) play the specified song or song collection.
  • Pause() pauses the currently playing song
  • Resume() resumes a paused song
  • Stop() stops playing the current song
  • MoveNext() moves to the next song in the song list
  • MovePrevious() moves to the previous song in the song list
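
A few of these properties and methods could be combined as in the sketch below, which starts looping background music and toggles it when the game is paused. MusicControls is a hypothetical helper class, and the Song is assumed to have been loaded already:

```csharp
using Microsoft.Xna.Framework.Media;

// Hypothetical helpers for background music
public static class MusicControls
{
    // Start a song that loops at half volume
    public static void StartBackgroundMusic(Song song)
    {
        MediaPlayer.IsRepeating = true;
        MediaPlayer.Volume = 0.5f;
        MediaPlayer.Play(song);
    }

    // Pause the music if it is playing, or resume it if it is paused
    public static void TogglePause()
    {
        if (MediaPlayer.State == MediaState.Playing)
        {
            MediaPlayer.Pause();
        }
        else if (MediaPlayer.State == MediaState.Paused)
        {
            MediaPlayer.Resume();
        }
    }
}
```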

In addition, the MediaPlayer implements two events that may be useful:

  • ActiveSongChanged - triggered when the active song changes
  • MediaStateChanged - triggered when the media state changes
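
Subscribing to these events might look like the following sketch. MusicEvents is a hypothetical class, and writing to the console stands in for whatever the game would actually do, such as showing the song title on screen:

```csharp
using System;
using Microsoft.Xna.Framework.Media;

// Hypothetical helper that reports music changes
public static class MusicEvents
{
    public static void Subscribe()
    {
        MediaPlayer.ActiveSongChanged += OnActiveSongChanged;
        MediaPlayer.MediaStateChanged += OnMediaStateChanged;
    }

    public static void Unsubscribe()
    {
        MediaPlayer.ActiveSongChanged -= OnActiveSongChanged;
        MediaPlayer.MediaStateChanged -= OnMediaStateChanged;
    }

    private static void OnActiveSongChanged(object sender, EventArgs e)
    {
        // The queue exposes the song that is currently active, if any
        Song current = MediaPlayer.Queue.ActiveSong;
        Console.WriteLine(current != null
            ? "Now playing: " + current.Name
            : "No active song");
    }

    private static void OnMediaStateChanged(object sender, EventArgs e)
    {
        Console.WriteLine("Media state: " + MediaPlayer.State);
    }
}
```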

Info

This section only touches on the classes, methods and properties of the Microsoft.Xna.Framework.Media namespace most commonly used in games. Because it is a wrapper around the Windows Media Player, it is also possible to access and play the users’ songs and playlists that have been added to Windows Media Player. Refer to the MonoGame documentation for more details.