When newer technologies are properly applied, the CD can deliver stunning results that closely rival the high-resolution formats. A significant improvement can be gained through the use of high-headroom interpolators in D/A and SRC devices. This paper shows how intersample peaks can be reconstructed and rendered without clipping.
The Compact Disc (CD) was introduced in 1982. Thirty-five years later, the 44.1 kHz 16-bit CD audio format is still the most popular digital format. In spite of the push for high-resolution audio formats, the vast majority of new recordings are released only in this format (or in files derived from this format).
The CD format delivers a DC to 22 kHz frequency response and a signal-to-noise ratio of about 96 dB. In 1982, the 44.1 kHz sample rate and 16-bit word length were selected to match the limitations of the human ear, the limitations of transducers, and the limitations of typical recording and playback environments. At that time, the 16-bit word length exceeded the capabilities of converters by about 2 bits. Anything beyond 16 bits would have seemed unreasonable in 1982.
In recent years, converters, power amplifiers, and transducers have improved significantly. For example, the Benchmark DAC3 converters and AHB2 power amplifiers achieve A-weighted signal-to-noise ratios ranging from 128 dB to 132 dB. This is equivalent to the performance of a 21- to 22-bit digital system. Components of this quality justify a move to high-resolution formats, but most listeners will find that the vast majority of their favorite tracks are still available only in CD format.
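The bit-depth equivalences quoted above are consistent with the common rule of thumb of roughly 6.02 dB of dynamic range per bit. The short sketch below is only an illustration of that arithmetic; it assumes the simple 20·log10(2) dB-per-bit rule (without the additional 1.76 dB sine-wave term), and the function names are hypothetical.

```python
import math

# Assumption: the simple "6.02 dB per bit" rule, i.e. 20 * log10(2) dB per bit.
DB_PER_BIT = 20 * math.log10(2)

def dynamic_range_db(bits):
    """Approximate dynamic range of an ideal PCM system with the given word length."""
    return bits * DB_PER_BIT

def equivalent_bits(snr_db):
    """Word length whose ideal dynamic range matches the given signal-to-noise ratio."""
    return snr_db / DB_PER_BIT

print(round(dynamic_range_db(16), 1))   # 96.3
print(round(equivalent_bits(128), 1))   # 21.3
print(round(equivalent_bits(132), 1))   # 21.9
```

Under this rule, 16 bits corresponds to about 96 dB, and 128 dB to 132 dB falls between 21 and 22 bits.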
Fortunately, we have learned how to extract more performance from the CD format. Newer recordings use noise-shaped dither to reduce the 16-bit quantization noise to a level that is equivalent to that of a 20-bit format. Oversampling converters with specialized filters have improved the high-frequency performance in the region between 18 kHz and 22 kHz while reducing distortion at all frequencies. These oversampling converters can also accurately reconstruct the original analog waveform between samples.
These spectacular improvements have brought the theoretical performance of the 35-year-old CD format very close to that of the new high-resolution formats. Old recordings benefit from the improvements in D/A converters and power amplifiers. Newer recordings add the benefits of the noise shaping and improved A/D conversion used to produce the recordings.
When all of these newer technologies are properly applied, the CD can deliver stunning results that closely rival those of the high-resolution formats. In theory, the frequency response and noise performance of the best CD recordings should be good enough to eliminate any and all audible defects related to the limitations of the 44.1 kHz 16-bit PCM system. Unfortunately, this theory is an oversimplification. Here at Benchmark, we believe that most oversampling D/A converters produce audible artifacts when playing 44.1 kHz recordings. These artifacts are caused by inadequate headroom in the oversampling interpolation filters. Fortunately, these artifacts are preventable.
PCM digital systems sample audio waveforms at discrete instants in time. Samples may occur at a waveform peak, but in most cases, the samples will miss the peaks. The diagram below shows an analog waveform being sampled by a digital system. "1" and "-1" represent the maximum and minimum digital codes. In a 16-bit system, these codes would be +32,767 and -32,768. In the diagram below, note that the peaks exceed the maximum and minimum codes by a factor of 1.414 (3.01 dB). This worst-case example occurs when the audio tone is 1/4 of the sample rate and the samples fall 45 degrees on either side of each peak. In a 44.1 kHz system, this worst-case scenario occurs near 11 kHz. High sample rate systems are much less prone to this problem because the worst case occurs at ultrasonic frequencies where there is very little audio content.
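The worst case described above can be checked numerically. The following is a minimal sketch, assuming a tone at exactly one quarter of the 44.1 kHz sample rate with the samples offset 45 degrees from the peaks; the constant and variable names are illustrative only.

```python
import math

SAMPLE_RATE = 44100.0
TONE_HZ = SAMPLE_RATE / 4      # 11.025 kHz, the worst-case frequency
PHASE = math.pi / 4            # samples land 45 degrees either side of each peak
ANALOG_PEAK = math.sqrt(2)     # analog peak relative to digital full scale (1.0)

# Sample the analog sine at the 44.1 kHz sample instants.
samples = [
    ANALOG_PEAK * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE + PHASE)
    for n in range(8)
]

largest_sample = max(abs(s) for s in samples)
intersample_peak_db = 20 * math.log10(ANALOG_PEAK / largest_sample)

print([round(s, 3) for s in samples])   # [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
print(round(intersample_peak_db, 2))    # 3.01
```

Every sample sits at digital full scale, yet the waveform that those samples describe peaks 3.01 dB higher between them.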
If the digital samples are interpolated to a high sample rate (as is done in oversampled D/A converters) the resulting samples typically look like this:
In the above diagram, the interpolator in the oversampled D/A converter produces a severely clipped sine wave when attempting to reproduce the original analog waveform. Some D/A converters will produce additional distortions that are caused by numeric overflows in the digital signal processing. In either case, these conventional oversampled D/A converters produce bursts of distortion whenever intersample overs occur. Please note that SRC and ASRC devices (sample rate converters) use interpolators and these have similar overload problems. In our opinion, the audible signature of many SRC and ASRC chips can be attributed to this overload problem.
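The clipping behavior can be illustrated with a simplified model. The sketch below assumes an ideal 8X reconstruction of the worst-case tone (computed directly from the sine function rather than with a real interpolation filter) followed by a hard limit at digital full scale. It is not a model of any particular D/A or SRC chip, and all names are hypothetical.

```python
import math

OVERSAMPLE = 8
POINTS_PER_CYCLE = 4 * OVERSAMPLE   # an fs/4 tone spans 4 input samples per cycle
ANALOG_PEAK = math.sqrt(2)          # +3.01 dBFS worst-case intersample peak
FULL_SCALE = 1.0
EPS = 1e-9                          # tolerance for floating-point rounding

def hard_clip(value, limit=FULL_SCALE):
    """Limit a value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))

# Ideal reconstruction of one cycle on the 8X grid (assumption: perfect interpolation).
ideal = [
    ANALOG_PEAK * math.sin(2 * math.pi * k / POINTS_PER_CYCLE + math.pi / 4)
    for k in range(POINTS_PER_CYCLE)
]

# A stage with no headroom cannot represent anything above full scale.
clipped = [hard_clip(v) for v in ideal]

flattened_points = sum(1 for v in ideal if abs(v) > FULL_SCALE + EPS)
lost_peak_db = 20 * math.log10(
    max(abs(v) for v in ideal) / max(abs(v) for v in clipped)
)

print(flattened_points, "of", POINTS_PER_CYCLE, "points are flattened")
print(round(lost_peak_db, 2), "dB of peak level is lost")
```

In this model, a large share of each cycle is flattened at full scale, which is the clipped sine wave described above. Real devices may behave worse if the overload causes numeric overflow rather than a clean limit.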
Every D/A chip and SRC chip that we have tested here at Benchmark has an intersample clipping problem! To the best of our knowledge, no chip manufacturer has adequately addressed this problem. For this reason, virtually every audio device on the market has an intersample overload problem. This problem is most noticeable when playing 44.1 kHz sample rates.
It is possible to build interpolators that will not clip or overload, but this is not being done by the D/A and SRC chip manufacturers. For this reason, Benchmark has moved some of the digital processing outside of the D/A chip. In the Benchmark DAC2 and DAC3 converters we have an external interpolator that has 3.5 dB of headroom above 0 dBFS. This means that the worst-case +3.01 dBFS intersample peaks can be processed without clipping. We also drive the ESS D/A converter chips at -3.5 dB so that no clipping will occur inside the ES9018 and ES9028PRO converter chips. The results are represented in the following diagram:
If we run a spectrum analysis of the output of a conventional D/A converter and a high-headroom D/A converter we can see the difference:
The green trace was produced by a high-headroom Benchmark DAC2. The DAC2 correctly reproduced an 11.025 kHz sine wave having an amplitude of +3.01 dBFS. The red trace shows the output of the Benchmark DAC1 under the same test conditions. Like most products on the market, the older DAC1 uses conventional interpolation and it will overload whenever intersample peaks exceed 0 dBFS. Note that the red plot shows many distortion products that are not harmonically related to the 11.025 kHz tone. These intermodulation distortion (IMD) products are produced by interactions between the test tone, the sampling frequency, and the oversampling frequency. This distortion is not musical and does not occur in analog systems. It produces a sound that is unique to digital systems. Note that most of the distortion occurs between 5 kHz and 22 kHz. This can produce a false brightness or brittleness when overloads occur. This raises the question: how often do intersample overs occur?
We have frequently used Steely Dan's Gaslighting Abbie from Two Against Nature in our listening tests. At trade shows, it is not unusual to hear this track being played. This is a spectacular CD recording with lots of dynamics and a low noise floor. Nevertheless, in a little over 5 minutes, this track has 559 intersample overs on the left channel and 570 on the right channel for a total of 1129. This means that there are about 3.7 intersample overs per second. The highest intersample over measures +0.8 dBFS. The track itself is not clipped; the 44.1 kHz sampling has simply captured peaks that exceed 0 dBFS. Of the 1129 overs, a total of 993 intersample peaks measure at least +0.5 dBFS. The following image shows the track with the intersample overs highlighted in red:
This track can be played cleanly by the Benchmark DAC2 and DAC3 converters. These converters accurately render the intersample peaks that were captured in the recording process. In contrast, conventional converters will clip each of the peaks highlighted in red. In this track the peaks coincide with hits to the snare drum. Converters that clip these peaks add a false brightness to the snare drum and alter its sound.
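The difference between the two cases comes down to simple gain arithmetic. The sketch below compares an interpolator with no headroom against one with the 3.5 dB of headroom described earlier, using the +0.8 dBFS peak measured on this track and the +3.01 dBFS worst case. The function names are hypothetical, and the model only checks levels; it does not simulate a converter.

```python
import math

def db_to_gain(db):
    """Convert a level in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

def clipped_amount_db(peak_dbfs, headroom_db):
    """dB by which a peak exceeds the interpolator's ceiling (0.0 if it fits)."""
    return max(0.0, peak_dbfs - headroom_db)

WORST_CASE_DBFS = 20 * math.log10(math.sqrt(2))   # about +3.01 dBFS
TRACK_PEAK_DBFS = 0.8                             # highest over reported for this track
HEADROOM_DB = 3.5                                 # headroom quoted for the DAC2 and DAC3

for label, peak in (("track peak", TRACK_PEAK_DBFS), ("worst case", WORST_CASE_DBFS)):
    conventional = clipped_amount_db(peak, 0.0)           # no headroom above 0 dBFS
    high_headroom = clipped_amount_db(peak, HEADROOM_DB)  # 3.5 dB above 0 dBFS
    print(label, round(conventional, 2), round(high_headroom, 2))

# Driving the D/A chip 3.5 dB lower keeps even the worst-case peak below
# the chip's own full-scale level of 1.0.
level_at_chip = math.sqrt(2) * db_to_gain(-HEADROOM_DB)
print(round(level_at_chip, 3))
```

With no headroom, both peaks exceed the ceiling. With 3.5 dB of headroom, neither does, and the worst-case peak reaches only about 0.95 of full scale at the converter chip.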
To arrive at these counts we reduced the amplitude by 3 dB and then applied 8X interpolation to render the intersample peaks without clipping. Working at the 8X (352.8 kHz) sample rate, we then measured the amplitudes of the reconstructed intersample peaks. Next we restored the track to its original amplitude by increasing the gain by 3 dB. This allowed us to highlight the 1129 intersample overs in red. We used Audacity to perform these functions.
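The measurement steps above can be sketched in a few lines of code. This is a rough illustration only: it uses a simple Hann-windowed sinc interpolator written for this example (not Audacity's resampler), it assumes that each contiguous excursion above 0 dBFS counts as one over, and every function name is hypothetical. In floating point the 3 dB pad is not strictly required, but it is kept here to mirror the procedure.

```python
import math

def db_to_gain(db):
    """Convert a level in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

def interpolate(samples, factor=8, half_width=32):
    """Upsample by `factor` with a Hann-windowed sinc kernel (illustrative only)."""
    out = []
    n_in = len(samples)
    for m in range(n_in * factor):
        t = m / factor                       # output time in input-sample units
        base = int(math.floor(t))
        first = max(0, base - half_width + 1)
        last = min(n_in - 1, base + half_width)
        acc = 0.0
        for n in range(first, last + 1):
            d = t - n
            sinc = 1.0 if d == 0 else math.sin(math.pi * d) / (math.pi * d)
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            acc += samples[n] * sinc * window
        out.append(acc)
    return out

def intersample_overs(samples, factor=8, pad_db=3.0):
    """Return (number of overs, highest reconstructed peak in dBFS)."""
    pad = db_to_gain(-pad_db)
    upsampled = interpolate([s * pad for s in samples], factor)  # attenuate, then 8X
    restored = [v / pad for v in upsampled]                      # restore original gain
    overs, peak, in_over = 0, 0.0, False
    for v in restored:
        mag = abs(v)
        peak = max(peak, mag)
        if mag > 1.0 and not in_over:
            overs += 1                       # start of a new excursion above 0 dBFS
        in_over = mag > 1.0
    peak_db = 20 * math.log10(peak) if peak > 0 else float("-inf")
    return overs, peak_db

# Demonstration on the worst-case fs/4 pattern: every sample is at full scale.
test_signal = [1.0, 1.0, -1.0, -1.0] * 64
overs, peak_db = intersample_overs(test_signal)
print(overs, round(peak_db, 2))
```

On the worst-case test pattern the reconstructed peak comes out close to +3 dBFS, even though no individual sample exceeds full scale. A real track would be read from a file and processed one channel at a time.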
The following segment from Gaslighting Abbie compares conventional interpolation to high-headroom interpolation. The top two tracks were 8X interpolated with a conventional interpolator. The middle two tracks were 8X interpolated with a high-headroom interpolator. Note how the peaks were properly rendered by the high-headroom interpolator. The bottom two tracks show where the clipping occurs in the conventional interpolator. The segment starts at 2:30.46880 and ends at 2:30.47015 seconds, and is typical of many of the intersample overs contained in this recording.
Please note that the 44.1 kHz CD sample rate captured the intersample peaks. When high-headroom interpolation was applied, the original peaks contained in the music were recovered and accurately rendered. This high-headroom interpolation technique delivers an accurate rendition of the analog signal captured by the A/D converter in the studio.
Intersample overs are not unique to the Gaslighting Abbie track or to Steely Dan's Two Against Nature CD. Most CD recordings contain intersample overs.
Here are some other examples of tracks with intersample peaks that exceed 0 dBFS:
Intersample overs are shown in red, highest intersample peak = +0.486 dBFS.
Intersample overs are shown in red, highest intersample peak = +0.565 dBFS.
Intersample overs are shown in red, highest intersample peak = +1.49 dBFS. This track has most of the intersample overs concentrated in the right channel. They seem to be caused by percussion that is slightly panned to the right.
Intersample overs are shown in red, highest intersample peak = +0.485 dBFS.
Note: This is an 88.2 kHz recording that has been hard limited to -0.01 dBFS. The limiting produces intersample overs when interpolating 4X to 352.8 kHz. This means that a conventional interpolator would add distortion at the peaks shown in red.
The following CD track is loud but, surprisingly, it has no intersample overs. It was mastered by Gavin Lurssen in 2007. This track demonstrates that it is possible to create CDs without intersample overs, although this should not be necessary if converters and SRC devices have high-headroom interpolators.