 

Analysis of the Microsoft Audio Codec

Overview
Microsoft claims that the MS Audio codec is both a replacement for the MP3 format for high-quality music and superior to the RealAudio G2 Music Codec for internet streaming.

Both claims are false, as outlined in this document:
Introduction: Why listening tests matter
Section One: MS Audio is not superior to MP3
Section Two: RealAudio G2 is a better streaming solution
Section Three: Other factors that make RealAudio G2 a better streaming solution

Introduction: Why listening tests matter
Listening tests are how audio formats are evaluated. For example, established formats like AC-3 (from Dolby Laboratories) undergo extensive tests to establish whether listeners can tell the difference between an original and a compressed signal. These tests are conducted extremely carefully to confirm the "transparency" (similarity to the original signal) of a given format. Such tests must adhere to several critical criteria, and Microsoft has not confirmed that it followed them:

Test content must be "difficult"
For critical listening, samples must be chosen which are both hard to compress and likely to create artifacts that are easy to hear. Examples of such content include isolated instruments (like castanets or a harpsichord), snare drums, acoustic guitars, and solo vocal passages. Many types of content are very easy to compress and sound great to a listener even at very low bitrates, such as orchestral music and loud/bright popular music (where there are many instruments and relatively quiet vocals).

Short sample lengths can mask significant objectionable characteristics
When evaluating audio codec performance, it is especially important to use sample clips long enough to reveal objectionable artifacts that would become annoying after sustained listening. This is critical for music codecs, since songs and albums span minutes and hours of sustained listening, respectively. Short clips can be used to downplay or minimize truly objectionable artifacts: what seems only mildly objectionable in a short clip can become truly unlistenable over sustained listening.

What follows is a brief assessment of the performance of RealAudio G2, MP3, and the recently introduced MS Audio. These assessments evaluate each audio technology against parameters that are grounded in real-world applications, and highlight the capabilities required for sustained, pleasurable listening.

Section One: MS Audio is not superior to MP3
Several types of audio artifacts (audible differences between the compressed signal and the original) introduced by MS Audio can easily be discerned in careful listening. These artifacts are not present in MP3, nor are they present in any of several other currently available high-quality audio format choices (such as Liquid Audio, Lucent's EPAC, A2B Music, and the MPEG-4 AAC standard). An audio format seeking to deliver a listening experience to the user which is identical to the original CD must be absolutely free of such artifacts, and must perform well enough to produce a satisfactory listening experience over minutes and hours of listening.

Note that it is much easier to hear these artifacts if you listen using headphones.

NOTE: All files are available for download only; they are not configured to stream. Right-click a clip and select "save link" to download the file, or left-click to download and play it immediately.

Pre-echo artifacts in vocals
Listen to the original uncompressed file, and then listen to the MP3 and MS Audio clips. Note that in the original signal and in the MP3 file, the voice is very clear and has no echo. Then listen to the MS Audio clips and notice the echoing, especially at the very beginning of the speaker's words.

           Female Speech
.wav       Original
128kbps    MP3
64kbps     MS Audio
128kbps    MS Audio

Microsoft specifically claims (and actually cites some test data to corroborate) that the MS Audio format at 64kbps is equivalent or superior to MP3 at 128kbps. In listening to the above samples, notice that even at 128kbps, MS Audio cannot reproduce the sound quality of MP3 on this type of signal. This is also a good example of an artifact that is very easy to hear.

Muddiness and blurring on drums and percussion
Listen to the original uncompressed file, and then listen to the MP3 and MS Audio clips. Note that in the original signals and in the MP3 files, the snare drum and hi-hat are very clear and dry. Then listen to the MS Audio clips and notice that the drum and hi-hat are so badly blurred and muddy that they almost sound like different instruments, which is a deficiency in the encoding.

           Dave Matthews Band   Dire Straits
.wav       Original             Original
128kbps    MP3                  MP3
64kbps     MS Audio             MS Audio
128kbps    MS Audio             MS Audio

Transient artifacts on acoustic guitars
Note that in the original signal and in the MP3 file, the initial pluck of the guitar string is clear, and a sharp click (probably the performer's fingernail on the string) is clearly audible. Then listen to the MS Audio clips and notice that the click and beginning of the note have blurred to become a kind of rushing noise that precedes the note.

           Acoustic Guitar
.wav       Original
128kbps    MP3
64kbps     MS Audio
128kbps    MS Audio

Both the acoustic guitar and the snare drum clips illustrate the artifacts that the MS Audio codec introduces when compressing a broader class of instruments: those with a very sharp "attack". These signals are often referred to as "transient" signals, because their sounds are very isolated and do not repeat over time. The MS Audio codec distorts these signals by creating "echoes" of the transient which actually begin before the original sound or note. This results in the unusual artifacts heard here, which often sound like the instrument being played backwards, particularly with drums.
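To make the idea of an "echo" that begins before the note more concrete, here is a minimal sketch of how a generic block-transform codec can smear noise ahead of a transient. It is an illustration under stated assumptions, not a description of MS Audio's internals, which this document does not specify: the 64-sample block, the DCT, the uniform quantizer step, and the synthetic burst are all hypothetical choices.

```python
import math

def _scale(k, n):
    # Orthonormal DCT scaling factor for coefficient index k.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct(block):
    """Orthonormal DCT-II of a list of samples."""
    n = len(block)
    return [_scale(k, n) * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block))
            for k in range(n)]

def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]

def quantize(coeffs, step):
    """Uniform quantizer: round every coefficient to a multiple of step."""
    return [step * round(c / step) for c in coeffs]

def energy(samples):
    return sum(s * s for s in samples)

BLOCK = 64    # hypothetical transform block length
ATTACK = 48   # the transient starts at this sample; everything before is silence

# Silence followed by a sharp, quickly decaying burst (a stand-in for a pluck).
signal = [0.0] * ATTACK + [math.cos(1.9 * i) * 0.8 ** i
                           for i in range(BLOCK - ATTACK)]

# "Encode" and "decode": quantizing in the transform domain spreads the
# quantization error over the whole block, including the silent part.
decoded = idct(quantize(dct(signal), step=0.1))
pre_echo = energy(decoded[:ATTACK])

print("energy before the attack, original:", energy(signal[:ATTACK]))
print("energy before the attack, decoded: ", pre_echo)
```

The original is exactly silent before the attack, while the decoded block is not, because each transform coefficient affects every sample in the block. In this simplified model, a shorter block would confine that noise to a shorter stretch before the note.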

Section Two: RealAudio G2 is a better streaming solution

G2 has a far lower level of audio artifacts at low bitrates
Compressing audio to low bitrates always involves a tradeoff between frequency response (the highest notes that can be reproduced by the format) and artifact level (the amount by which the frequencies that are reproduced differ from the original signal). Although increased frequency response makes the music sound better/brighter, it comes at the cost of higher levels of artifacts. In tuning a low bitrate codec, designers must set this tradeoff to best handle the widest range of music. Certain types of music may sound fantastic at a given setting, but others may sound so bad as to be unlistenable. RealAudio codecs have always been designed to give acceptable performance for ALL musical types.
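A rough way to see this tradeoff is to count the bits available per spectral value at a fixed bitrate. The sketch below assumes a mono signal and a critically sampled transform, which has to code about twice the audio bandwidth in coefficients per second. This is an illustrative simplification, not a description of either codec, and the function name and bandwidth values are hypothetical.

```python
def bits_per_coefficient(bitrate_bps, bandwidth_hz):
    """Average bits available per spectral coefficient.

    Assumes a critically sampled transform on a mono signal, which must
    code roughly 2 * bandwidth_hz coefficients per second.
    """
    return bitrate_bps / (2.0 * bandwidth_hz)

# Widening the coded bandwidth at a fixed 20 kbps leaves fewer bits for
# each coefficient, so each one is reproduced less accurately.
for bandwidth in (4000, 5500, 8000, 11000):
    bits = bits_per_coefficient(20_000, bandwidth)
    print(f"{bandwidth:>6} Hz at 20 kbps -> {bits:.2f} bits/coefficient")
```

Real codecs do not spread bits evenly across the spectrum, but the direction of the tradeoff is the same: extending the frequency response at a given bitrate leaves less precision for everything that is coded.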

As with the evaluations in the prior section of this document, it is especially important when making audio codec performance evaluations to ensure sample clip lengths are long enough to identify objectionable artifacts that would become annoying after sustained listening.

Listen to the RealAudio and MS Audio samples below. In many cases, you will note that while the frequency response (brightness) of the MS Audio codec may be slightly higher than that of RealAudio G2, MS Audio introduces easily heard artifacts.

NOTE: All files are available for download only; they are not configured to stream. Right-click a clip and select "save link" to download the file, or left-click to download and play it immediately.

           Dave Matthews Band   Dire Straits   Acoustic Guitar   Female Speech
.wav       Original             Original       Original          Original
20kbps     RealAudio G2         RealAudio G2   RealAudio G2      RealAudio G2
20kbps     MS Audio             MS Audio       MS Audio          MS Audio

           Dave Matthews Band   Dire Straits   Acoustic Guitar   Female Speech
.wav       Original             Original       Original          Original
32kbps     RealAudio G2         RealAudio G2   RealAudio G2      RealAudio G2
32kbps     MS Audio             MS Audio       MS Audio          MS Audio

Section Three: Other factors that make RealAudio G2 a better streaming solution

Lost packet tolerance
Internet transmission is inherently lossy, meaning that packets of audio are sometimes lost. An audio codec must recover gracefully by minimizing the audibility of such losses. RealAudio G2 tolerates packet loss far better than MS Audio, making it a better choice for internet streaming. See a demonstration of RealAudio G2 performing under lossy conditions here:
http://www.real.com/showcase/tech/music_quality.html
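As a general illustration of what "minimizing the audibility" of a lost packet can mean, here is a minimal sketch of one simple concealment strategy: replacing a missing frame with a faded copy of the last frame that was played. This is an assumed textbook technique, not the method used by RealAudio G2 or MS Audio, and the function name, frame layout, and fade factor are hypothetical.

```python
def conceal_losses(frames, frame_size, fade=0.5):
    """Fill each lost frame (None) with a faded copy of the previous output frame.

    A loss before any frame has arrived is filled with silence.
    """
    previous = [0.0] * frame_size
    output = []
    for frame in frames:
        if frame is None:
            # Lost packet: repeat the last output at reduced level
            # instead of dropping abruptly to silence.
            frame = [sample * fade for sample in previous]
        output.append(frame)
        previous = frame
    return output

# Two consecutive losses between good frames (tiny 2-sample frames for clarity).
received = [[0.8, -0.8], None, None, [0.4, -0.4]]
print(conceal_losses(received, frame_size=2))
```

Consecutive losses fade progressively toward silence, which tends to be less jarring than a hard gap. A real decoder would typically work on decoded frames or codec parameters rather than raw sample lists.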

SureStream
When used in RealSystem G2, RealAudio G2 offers another powerful feature: the ability to shift up and down in bitrate in response to the measured bitrate of the internet connection. This results in a continuous playback experience without rebuffering. The latest release of Windows Media Technologies does not support this type of dynamic variability in audio streaming. To see a demonstration of SureStream's seamless bandwidth negotiation capabilities for both audio and video, go here: http://www.real.com/showcase/tech/surestream.html
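The general idea of shifting between bitrates can be sketched as a simple selection rule: given several encodings of the same clip, play the highest one that fits the bandwidth currently measured. This is only an illustration of the concept, not SureStream's actual algorithm, which is not described here. The function name and the set of encoded bitrates are hypothetical.

```python
def choose_stream(encoded_bitrates_bps, measured_bandwidth_bps):
    """Pick the highest encoded bitrate that fits the measured bandwidth.

    Falls back to the lowest available bitrate when nothing fits.
    """
    fitting = [b for b in encoded_bitrates_bps if b <= measured_bandwidth_bps]
    return max(fitting) if fitting else min(encoded_bitrates_bps)

# Hypothetical set of encodings for one clip, and a connection whose
# measured bandwidth drops and then partly recovers.
streams = [20_000, 32_000, 64_000]
for bandwidth in (70_000, 40_000, 25_000, 15_000, 50_000):
    print(bandwidth, "->", choose_stream(streams, bandwidth))
```

Re-evaluating a rule like this as conditions change lets playback step down rather than stall to rebuffer, then step back up when bandwidth returns.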