Chapter 5
Audio
Get your games rocking!
We often focus on the visual aspects of games, but the audio aspects can really make a game shine. Consider that many game soundtracks are now presented as orchestral performances, and how important sound effects can be for conveying what is happening in a game.
In this chapter, we will explore both sound effects and music, and how to implement them within MonoGame.
From the “bing” of a coin box in Super Mario Bros to the reveal chimes of the Zelda series, sound effects provide a powerful mechanism for informing the player of what is happening in your game world.
MonoGame represents sound effects with the SoundEffect class. Like other asset types, we don’t normally construct one directly; instead, we load it through the content pipeline. Usually, a sound effect will start as a .wav file, though a handful of other file formats are acceptable.
Once loaded, the SoundEffect can be played with the SoundEffect.Play() method. This is essentially a fire-and-forget method: you invoke it, and the framework takes care of loading and playing the sound.
You can also use the SoundEffect.Play(float volume, float pitch, float pan) overload to customize the playback:
volume ranges from $ 0.0 $ (silence) to $ 1.0 $ (full volume)
pitch adjusts the pitch from $ -1.0 $ (down an octave) to $ 1.0 $ (up an octave), with $ 0.0 $ indicating no change
pan pans the sound in stereo, with $ -1.0 $ entirely on the left speaker, $ 1.0 $ entirely on the right, and $ 0.0 $ centered
Note that the per-sound-effect volume is multiplied by the static SoundEffect.MasterVolume property. This allows for the adjustment of all sound effects in the game, separate from music.
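As a minimal sketch of these parameters, the hypothetical helper below plays a sound effect at half volume, slightly raised in pitch, and panned mostly to the left, and adjusts all sound effects at once through MasterVolume. The class name SfxDemo, its method names, and the particular values are assumptions chosen for illustration; only SoundEffect.Play, SoundEffect.MasterVolume, and MathHelper.Clamp come from the framework.

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Audio;

// Hypothetical helper for illustration; not part of MonoGame.
public static class SfxDemo
{
    // Plays the effect quieter, a little higher, and mostly on the left speaker.
    public static void PlayCoin(SoundEffect coin)
    {
        float volume = 0.5f;  // half volume (still multiplied by MasterVolume)
        float pitch = 0.25f;  // a quarter of an octave up
        float pan = -0.75f;   // mostly on the left speaker
        coin.Play(volume, pitch, pan);
    }

    // Scales every sound effect in the game, e.g. from an options menu.
    public static void SetEffectsVolume(float level)
    {
        // Keep the value inside the valid 0.0 to 1.0 range.
        SoundEffect.MasterVolume = MathHelper.Clamp(level, 0f, 1f);
    }
}
```

With MasterVolume set to 0.5 and the per-call volume of 0.5 above, the coin sound would play at roughly a quarter of full volume.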
Note that if you invoke Play() on a sound effect multiple frames in a row, it will start playing another copy of the sound effect on each frame. The result will be an ugly mash of sound. So be sure that you only invoke Play() once each time you want to use the sound!
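One common way to avoid this is to play the sound only on the frame where an input changes state, rather than on every frame it is held. The sketch below assumes the sound is triggered by the space bar; the JumpSound class and its members are hypothetical names, while Keyboard.GetState(), IsKeyDown(), and IsKeyUp() are the framework's input calls.

```csharp
using Microsoft.Xna.Framework.Audio;
using Microsoft.Xna.Framework.Input;

// Hypothetical helper: plays a sound once per key press, not once per frame.
public class JumpSound
{
    private readonly SoundEffect _jump;
    private KeyboardState _previous;

    public JumpSound(SoundEffect jump)
    {
        _jump = jump;
        _previous = Keyboard.GetState();
    }

    // Call once per frame, e.g. from Game.Update().
    public void Update()
    {
        KeyboardState current = Keyboard.GetState();

        // Play only on the frame where the key goes from up to down.
        if (current.IsKeyDown(Keys.Space) && _previous.IsKeyUp(Keys.Space))
        {
            _jump.Play();
        }

        // Remember this frame's state for the next comparison.
        _previous = current;
    }
}
```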
If you need finer control of your sound effects, you can also create a SoundEffectInstance from one with SoundEffect.CreateInstance(). This represents a single instance of a sound effect, so invoking its Play() method will restart the sound from the beginning (essentially, SoundEffect.Play() creates a SoundEffectInstance that plays and disposes of itself automatically).
The SoundEffectInstance exposes properties that can be used to modify its behavior:
IsLooped is a boolean that, when set to true, causes the sound effect to loop indefinitely
Pan pans the sound in stereo, with $ -1.0 $ entirely on the left speaker, $ 1.0 $ entirely on the right, and $ 0.0 $ centered
Pitch adjusts the pitch from $ -1.0 $ (down an octave) to $ 1.0 $ (up an octave), with $ 0.0 $ indicating no change
Volume ranges from $ 0.0 $ (silence) to $ 1.0 $ (full volume)
State returns a SoundState enumeration value, one of SoundState.Paused, SoundState.Playing, or SoundState.Stopped
The SoundEffectInstance also provides a number of methods:
Play() plays or resumes the sound effect
Pause() pauses the sound effect
Resume() resumes a paused sound effect
Stop() immediately stops the sound effect (so when started again, it starts from the beginning)
Stop(bool immediate) also stops the sound effect, immediately if true, or after playing its authored release phase (i.e. a fade) if false
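To show how these properties and methods might fit together, here is a sketch of a looping engine sound whose pitch follows a throttle value. The EngineSound class, its method names, and the chosen volume are hypothetical; the SoundEffectInstance members are the ones listed above. IsLooped is set before the instance is first played, which is the safe order.

```csharp
using System;
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Audio;

// Hypothetical wrapper around a single looping sound effect instance.
public class EngineSound : IDisposable
{
    private readonly SoundEffectInstance _instance;

    public EngineSound(SoundEffect engine)
    {
        _instance = engine.CreateInstance();
        _instance.IsLooped = true;  // configure looping before playing
        _instance.Volume = 0.6f;
    }

    // Starts the loop unless it is already playing.
    public void Start()
    {
        if (_instance.State != SoundState.Playing)
        {
            _instance.Play();
        }
    }

    // Maps a throttle in the range 0..1 onto a pitch of 0..1 (up to one octave up).
    public void SetThrottle(float throttle)
    {
        _instance.Pitch = MathHelper.Clamp(throttle, 0f, 1f);
    }

    // Pauses a playing sound, or resumes a paused one.
    public void TogglePause()
    {
        if (_instance.State == SoundState.Playing)
        {
            _instance.Pause();
        }
        else if (_instance.State == SoundState.Paused)
        {
            _instance.Resume();
        }
    }

    public void Stop() => _instance.Stop();

    public void Dispose() => _instance.Dispose();
}
```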
Perhaps the strongest reason for creating a SoundEffectInstance is to be able to create positional sound. We’ll discuss this next.
Positional sounds provide the illusion of depth and movement by using panning, Doppler shift, and other techniques to emulate the effect movement and distance have on sounds. Positional sounds can convey important information in games, especially when combined with surround-sound speakers and headphones.
To create positional sound effects, we need to place the sound in a 3D (or pseudo 2D) soundscape, which incorporates both a listener (i.e. the player) and an emitter (the source of the sound). Consider the example soundscape below:
We have two sound effects, one played by emitter A and one by emitter B, and the player is represented by the listener. If we imagine the listener is facing downwards, we would expect that the sound from emitter A will play more on the right speaker, and emitter B on the left (given stereo speakers). For a surround sound system, these would be further distinguished by playing on the front speakers.
In addition to determining which speaker(s) a sound is played with, positional sounds also usually incorporate attenuation and the Doppler effect.
Attenuation in this context means that sound waves get softer the farther they travel (as some of the energy in the wave is absorbed by the air as heat). Thus, as emitter B is farther from the listener than emitter A, we would expect that if the same sound were played by both emitters, emitter B would be softer.
Doppler effect refers to the change in pitch of a sound when either the emitter or listener is moving. When the distance between the emitter and listener is getting smaller, the sound waves emitted by the emitter are closer together (higher frequency), resulting in a higher pitch. And when they are moving apart, the waves are farther apart, resulting in a lower frequency and pitch.
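For reference, the standard physics formula for this shift (a textbook result, not necessarily the exact computation an audio library performs) can be written as:

$$ f' = f \, \frac{c + v_l}{c - v_e} $$

Here $ f $ is the emitted frequency, $ f' $ is the frequency heard, $ c $ is the speed of sound, $ v_l $ is the listener’s speed toward the emitter, and $ v_e $ is the emitter’s speed toward the listener (both are negative when moving away). For example, an emitter approaching a stationary listener at one tenth the speed of sound gives $ f' = f / 0.9 \approx 1.11 f $, a noticeably higher pitch.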
Position, attenuation, and the Doppler effect represent some of the easiest-to-implement aspects of the physics of sound, which is why they are commonly implemented in video game audio libraries. More complex is the interaction of sound with the environment, i.e. absorption and reflection by surfaces in the game world. This parallels the early days of 3D rendering, when the Phong illumination model (which we’ll talk about soon) provided a simplistic but adequate technique for handling lights in a 3D scene.
The MonoGame framework provides two classes for establishing positional sound, the AudioEmitter and AudioListener.
The AudioListener class represents the player (or microphone) in the game world, and all position, attenuation, and Doppler effects are calculated relative to its position, orientation, and velocity. It exposes four properties:
Position is a Vector3 defining the position of the listener in the game world.
Forward is a Vector3 defining the direction the listener is facing in the game world.
Up is a Vector3 defining the direction up relative to the direction the player is facing (generally it would be Vector3.Up). It is used as part of the 3D math calculating the effects.
Velocity is a Vector3 defining the velocity at which the listener is moving in the game world.
When using an AudioListener instance, you would set these each update to reflect the corresponding position, orientation, and velocity of the player.
The AudioEmitter class represents the source of a sound in the game world. It exposes the same properties as the AudioListener:
Position is a Vector3 defining the position of the emitter in the game world.
Forward is a Vector3 defining the direction the emitter is facing in the game world.
Up is a Vector3 defining the direction up relative to the direction the emitter is facing (generally it would be Vector3.Up). It is used as part of the 3D math calculating the effects.
Velocity is a Vector3 defining the velocity at which the emitter is moving in the game world.
Positional sounds are played by a SoundEffectInstance, not by the actual emitter; the emitter rather serves to locate the sound source. Thus, to calculate and apply the 3D effects on a sound effect we would use something like:
SoundEffect sfx = Content.Load<SoundEffect>("sound");
var instance = sfx.CreateInstance();
var listener = new AudioListener();
// TODO: Position and orient listener
var emitter = new AudioEmitter();
// TODO: Position and orient emitter
instance.Apply3D(listener, emitter);
The positional sound support in MonoGame is for 3D soundscapes, but just as we can render 2D sprites using 3D hardware, we can create 2D soundscapes in 3D. The easiest technique for this is to position all our emitters and listeners in the plane $ z=0 $.
The Vector3 constructor actually has support for this built in, as it can take a Vector2 for the X and Y components and a separate scalar for the Z component. Consider a game where we represent the player’s position with a Vector2 position, direction with a Vector2 direction, and velocity with a Vector2 velocity. We can update our AudioListener listener with:
// Update listener properties
listener.Position = new Vector3(position, 0);
listener.Forward = new Vector3(direction, 0);
listener.Velocity = new Vector3(velocity, 0);
Since the Up vector will never change, we can just set it to Vector3.UnitZ (which is the vector $ <0,0,1> $) when we first create the listener.
The emitters would be set up the same way.
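As a sketch of that emitter setup, the hypothetical class below pairs a looping SoundEffectInstance with an AudioEmitter kept in the plane $ z=0 $, and reapplies the 3D calculation every frame so the panning and attenuation track movement. The class name PositionalSound2D and its Update signature are assumptions for illustration; the listener is assumed to be maintained elsewhere as described above.

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Audio;

// Hypothetical 2D sound source built from an instance and an emitter.
public class PositionalSound2D
{
    private readonly SoundEffectInstance _instance;
    private readonly AudioEmitter _emitter = new AudioEmitter();

    public PositionalSound2D(SoundEffect effect)
    {
        _instance = effect.CreateInstance();
        _instance.IsLooped = true;

        // In a 2D soundscape "up" points out of the screen and never changes.
        _emitter.Up = Vector3.UnitZ;
    }

    // Call once per frame with the source's current 2D state.
    public void Update(AudioListener listener, Vector2 position,
                       Vector2 direction, Vector2 velocity)
    {
        // Place the emitter in the z = 0 plane.
        _emitter.Position = new Vector3(position, 0);
        _emitter.Forward = new Vector3(direction, 0);
        _emitter.Velocity = new Vector3(velocity, 0);

        // Recalculate the positional effects relative to the listener.
        _instance.Apply3D(listener, _emitter);

        // Start the loop the first time through (after Apply3D has been applied).
        if (_instance.State == SoundState.Stopped)
        {
            _instance.Play();
        }
    }
}
```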
Music also has a powerful role to play in setting the mood. It can also be used to convey information to the player, as Super Mario Bros does when the remaining time to finish the level falls below 1 minute.
While it is possible to play music using a SoundEffect, MonoGame supports music through the Song class. This represents a song loaded from a wav or mp3 file.
In addition to the audio data, the Song defines properties for accessing the audio file’s metadata:
Name is the name of the song
Album is the album the song is from
Artist is the song’s artist
Duration is the length of the song
Genre is the genre of the song
TrackNumber is the song’s track number on its album
Note that for these properties to be populated, the original audio file would need to have the corresponding metadata set.
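As a small illustration of reading this metadata, the sketch below builds a display string from a song’s Name and Duration (a TimeSpan), which might be shown in a music menu. The SongInfo class and Describe method are hypothetical names introduced here.

```csharp
using System;
using Microsoft.Xna.Framework.Media;

// Hypothetical helper for showing song information in a menu.
public static class SongInfo
{
    // Formats the song as "Name (minutes:seconds)".
    public static string Describe(Song song)
    {
        TimeSpan length = song.Duration;
        int minutes = (int)length.TotalMinutes;
        return $"{song.Name} ({minutes}:{length.Seconds:D2})";
    }
}
```

If the metadata is missing from the source file, Name may simply reflect the asset name, so treat the result as a best effort.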
Unlike the SoundEffect, the Song class does not have a play method. Instead, it is played with the static MediaPlayer class, i.e.:
Song song = Content.Load<Song>("mysong");
MediaPlayer.Play(song);
Invoking MediaPlayer.Play() will immediately end the current song, so if you want your game to transition between songs smoothly, you’ll probably want to use the SongCollection class.
As you might expect, this is a collection of Song objects, and implements methods:
Add(Song song) adds a song to the collection
Clear() clears the collection
A SongCollection can also be played with the static MediaPlayer.Play(SongCollection collection) method:
Song song1 = Content.Load<Song>("song1");
Song song2 = Content.Load<Song>("song2");
Song song3 = Content.Load<Song>("song3");
SongCollection songCollection = new SongCollection();
songCollection.Add(song1);
songCollection.Add(song2);
songCollection.Add(song3);
MediaPlayer.Play(songCollection);
The static MediaPlayer class is really an interface to the Windows Media Player. Unlike the SoundEffect class, which communicates directly with the sound card and manipulates audio buffers, songs are piped through the Windows Media Player. This is why the MediaPlayer can only play a single song at a time.
Some of the most useful properties of the MediaPlayer for games are:
IsMuted - A boolean property that can be used to mute or unmute the game’s music
Volume - A number between 0 (silent) and 1 (full volume) that the music will play at
IsRepeating - A boolean property that determines if the song or song list should repeat
IsShuffled - A boolean property that determines if a song list should be played in a shuffled order
State - A value of the MediaState enum, describing the current state of the media player, which can be MediaState.Paused, MediaState.Playing, or MediaState.Stopped
Much like you would expect from a media playing device, the MediaPlayer also implements some familiar controls as methods:
Play(Song song) and Play(SongCollection collection) play the specified song or song collection
Pause() pauses the currently playing song
Resume() resumes a paused song
Stop() stops playing the current song
MoveNext() moves to the next song in the song list
MovePrevious() moves to the previous song in the song list
In addition, the MediaPlayer implements two events that may be useful:
ActiveSongChanged - triggered when the active song changes
MediaStateChanged - triggered when the media state changes
This section only touches on the classes, methods and properties of the Microsoft.Xna.Framework.Media namespace most commonly used in games. Because it is a wrapper around the Windows Media Player, it is also possible to access and play the users’ songs and playlists that have been added to Windows Media Player. Refer to the MonoGame documentation for more details.
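Pulling the properties, methods, and events above together, a game’s background music handling might look something like the following sketch. The Music class and its method names are hypothetical; it assumes a single looping background song, and it writes state changes to the console purely for demonstration.

```csharp
using System;
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Media;

// Hypothetical helper for simple background music control.
public static class Music
{
    // Starts a song on repeat at the given volume (clamped to 0..1).
    public static void StartBackground(Song song, float volume)
    {
        MediaPlayer.IsRepeating = true;
        MediaPlayer.Volume = MathHelper.Clamp(volume, 0f, 1f);
        MediaPlayer.Play(song);
    }

    // Pauses playing music, or resumes paused music (e.g. for a pause menu).
    public static void TogglePause()
    {
        if (MediaPlayer.State == MediaState.Playing)
        {
            MediaPlayer.Pause();
        }
        else if (MediaPlayer.State == MediaState.Paused)
        {
            MediaPlayer.Resume();
        }
    }

    // Subscribes to state changes; call once during initialization.
    public static void LogStateChanges()
    {
        MediaPlayer.MediaStateChanged += (sender, args) =>
            Console.WriteLine($"Music state: {MediaPlayer.State}");
    }
}
```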