LAME MP3 Encoder :: Mid/Side Stereo

LAME - Mid/Side Stereo

For years, what is called joint stereo has been misunderstood.
Joint stereo in MP3 is a mechanism that selectively chooses between three modes of storing stereo information: Simple Stereo, Mid-Side Stereo, and Intensity-Stereo.

In Simple Stereo, the encoder analyzes the left and right channels independently and stores the information as-is, without checking for similarities between the two signals.¹

In Mid-Side Stereo, the encoder analyzes the left, right², mid (l+r) and side (l-r) channels. It then gives more bits to the mid channel than to the side channel (the side channel is usually less complex) and stores only the mid and side channels in the resulting MP3.
Because the side channel needs fewer bits, the mid channel can be encoded as if the frame were bigger, which gives higher quality at the same bitrate.
Note: Mid/side in MP3 is switched frame-by-frame. In AAC, it can be switched band by band.
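
The mid/side conversion described above can be sketched in a few lines of Python. This is only an illustration, not LAME's code: the function names are hypothetical, and the 1/sqrt(2) scaling is an assumption chosen so that the transform preserves total energy (the text above gives the unscaled forms l+r and l-r).

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # assumed normalization; keeps total energy unchanged

def ms_encode(left, right):
    """Turn left/right samples into mid (sum) and side (difference) samples."""
    mid = [(l + r) * SCALE for l, r in zip(left, right)]
    side = [(l - r) * SCALE for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Recover left/right samples from mid/side samples."""
    left = [(m + s) * SCALE for m, s in zip(mid, side)]
    right = [(m - s) * SCALE for m, s in zip(mid, side)]
    return left, right

def energy(samples):
    """Sum of squared sample values."""
    return sum(v * v for v in samples)

# A nearly centered signal: the right channel is a slightly quieter copy of the left.
left = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
right = [0.9 * v for v in left]

mid, side = ms_encode(left, right)
side_share = energy(side) / (energy(mid) + energy(side))
print(f"share of energy in the side channel: {side_share:.4f}")
```

For material like this, almost all of the energy ends up in the mid channel, which is why the side channel can be given far fewer bits without an audible penalty.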

Intensity-Stereo (not supported in LAME) uses a technique known as joint frequency encoding, which is based on the principle of sound localization.
Human hearing is predominantly less acute at perceiving the direction of certain audio frequencies. By exploiting this 'limitation', intensity stereo coding can reduce the data rate of an audio stream with little or no perceived change in apparent quality.
It works by merging the upper spectrum into just one channel (thus reducing overall differences between channels) and transmitting a small amount of side information about how to pan certain frequency regions.
This type of coding does not perfectly reconstruct the original audio because of the loss of information and can cause unwanted artifacts. However, for very low bitrates this tool usually provides a gain in perceived quality.³
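
The merge-and-pan idea can be shown with a deliberately simplified sketch for a single frequency band. It is not the MP3 bitstream format: the function names and the way the pan value is computed (the left channel's share of the band amplitude) are assumptions made for illustration only.

```python
import math

def band_amplitude(band):
    """Root of the summed squares of the coefficients in one band."""
    return math.sqrt(sum(v * v for v in band))

def intensity_encode(left_band, right_band):
    """Merge one band into a single carrier plus one pan value in [0, 1]."""
    carrier = [l + r for l, r in zip(left_band, right_band)]
    al, ar = band_amplitude(left_band), band_amplitude(right_band)
    pan = al / (al + ar) if (al + ar) > 0 else 0.5  # hypothetical pan definition
    return carrier, pan

def intensity_decode(carrier, pan):
    """Split the carrier back into two channels using only the pan value."""
    left = [pan * c for c in carrier]
    right = [(1.0 - pan) * c for c in carrier]
    return left, right

def max_error(a, b):
    """Largest absolute difference between two coefficient lists."""
    return max(abs(x - y) for x, y in zip(a, b))

# Case 1: the same sound panned toward the left; only the level differs.
left_a = [0.8, -0.4, 0.2]
right_a = [0.2, -0.1, 0.05]
dec_left_a, dec_right_a = intensity_decode(*intensity_encode(left_a, right_a))

# Case 2: unrelated content in the two channels.
left_b = [0.8, -0.4, 0.2]
right_b = [-0.1, 0.3, 0.5]
dec_left_b, dec_right_b = intensity_decode(*intensity_encode(left_b, right_b))

print("panned copy, left error:", max_error(left_a, dec_left_a))
print("unrelated content, left error:", max_error(left_b, dec_left_b))
```

When the channels differ only in level, the carrier plus a pan value is enough to rebuild both. When they carry unrelated content, the decoded channels are just two scaled copies of the same carrier, which is the information loss the text refers to.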

The LAME mid/side switching criterion and mid/side masking thresholds are taken from Johnston and Ferreira, "Sum-Difference Stereo Transform Coding", Proc. IEEE ICASSP (1992), pp. 569-571.

The MPEG AAC standard claims to use mid/side encoding based on this paper.

¹ Simple stereo is not the same as dual mono. Dual mono should be used when the left and right channels of the input file contain two different streams from which the listener chooses one (for example, two different languages).
² If one channel has much less noise masking in a certain band than the other, the noise spread by mid/side stereo may no longer be masked for that channel. If both channels have the same masking, the noise spread between both channels will be equally well masked. To prevent unmasked noise, the left and right channels are analyzed to determine the noise masking thresholds so that the noise is properly masked.
³ Quoted from the Wikipedia article Joint_stereo.
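
The noise spreading that motivates the left/right analysis in mid/side mode can be illustrated numerically. The sketch below is not LAME code: the noise values are made-up stand-ins for quantization error, and the 1/sqrt(2) scaling is an assumed normalization.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # assumed normalization of the mid/side transform

# A loud left channel and a silent right channel (so the right offers no masking).
left = [1.0, -1.0, 1.0, -1.0]
right = [0.0, 0.0, 0.0, 0.0]

mid = [(l + r) * SCALE for l, r in zip(left, right)]
side = [(l - r) * SCALE for l, r in zip(left, right)]

# Made-up error added to the mid channel only, standing in for quantization noise.
noise = [0.05, -0.03, 0.04, -0.02]
noisy_mid = [m + n for m, n in zip(mid, noise)]

# Decode back to left/right.
dec_left = [(m + s) * SCALE for m, s in zip(noisy_mid, side)]
dec_right = [(m - s) * SCALE for m, s in zip(noisy_mid, side)]

err_left = [d - l for d, l in zip(dec_left, left)]
err_right = [d - r for d, r in zip(dec_right, right)]

print("error in left: ", [round(e, 4) for e in err_left])
print("error in right:", [round(e, 4) for e in err_right])
```

Both decoded channels receive the same error even though only the mid channel was disturbed. In the loud left channel that error may be masked, but in the silent right channel nothing hides it, which is why the thresholds of the original left and right channels have to be taken into account.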