We have frequently used Steely Dan's Gaslighting Abbie from Two Against Nature in our listening tests. This is a spectacular CD recording with lots of dynamics and a low noise floor. Nevertheless, in a little over 5 minutes, this track has 559 intersample overs on the left track and 570 on the right track for a total of 1129. This means that there are about 3.7 intersample overs per second. The highest intersample over measures +0.8 dBFS. The track itself is not clipped; the 44.1 kHz sampling has simply captured peaks that exceed 0 dBFS. The image above shows the track with the intersample overs highlighted in red.
If we zoom in on this track, we can see what it should look like without clipping (rendered with a high-headroom interpolator) and what it looks like when rendered with a conventional interpolator (such as those found in most oversampled D/A converters). The vertical red lines have been added to show where clipping occurred in the conventional interpolator.
When newer technologies are properly applied, the CD can deliver stunning results that closely rival the high-resolution formats. A significant improvement can be gained through the use of high-headroom interpolators in D/A and SRC devices. This paper shows how intersample peaks can be reconstructed and rendered without clipping.
The Compact Disc (CD) was introduced in 1982. Thirty-five years later, the 44.1 kHz 16-bit CD audio format is still the most popular digital format. In spite of the push for high-resolution audio formats, the vast majority of new recordings are only released in this format (or in files derived from this format).
The CD format delivers a DC to 22 kHz frequency response and a signal-to-noise ratio of about 96 dB. In 1982, the 44.1 kHz sample rate and 16-bit word length were selected to match the limitations of the human ear, the limitations of transducers, and the limitations of typical recording and playback environments. At that time, the 16-bit word length exceeded the capabilities of converters by about 2 bits. Anything beyond 16 bits would have seemed unreasonable in 1982.
In recent years, converters, power amplifiers, and transducers have improved significantly. For example, the Benchmark DAC3 converters and AHB2 power amplifiers achieve A-weighted signal-to-noise ratios ranging from 128 dB to 132 dB. This is equivalent to the performance of a 21- to 22-bit digital system. Components of this quality justify a move to high-resolution formats, but most listeners will find that the vast majority of their favorite tracks are still only available in CD format.
Fortunately, we have learned how to extract more performance from the CD format. Newer recordings use noise-shaped dither to reduce the 16-bit quantization noise to a level that is equivalent to that of a 20-bit format. Oversampling converters with specialized filters have improved the high-frequency performance in the region between 18 kHz and 22 kHz while reducing distortion at all frequencies. These oversampling converters can also accurately reconstruct the original analog waveform between samples.
These spectacular improvements have brought the theoretical performance of the 35-year-old CD format very close to that of the new high-resolution formats. Old recordings benefit from the improvements in D/A converters and power amplifiers. Newer recordings add the benefits of the noise shaping and improved A/D conversion used to produce the recordings.
When all of these newer technologies are properly applied, the CD can deliver stunning results that closely rival those of the high-resolution formats. In theory, the frequency response and noise performance of the best CD recordings should be good enough to eliminate any and all audible defects related to the limitations of the 44.1 kHz 16-bit PCM system. Unfortunately, this theory is an oversimplification. Here at Benchmark, we believe that most oversampling D/A converters produce audible artifacts when playing 44.1 kHz recordings. These artifacts are caused by inadequate headroom in the oversampling interpolation filters. Fortunately, they are preventable.
PCM digital systems sample audio waveforms at discrete instants in time. Samples may occur at a waveform peak, but in most cases, the samples will miss the peaks. The diagram below shows an analog waveform being sampled by a digital system. "1" and "-1" represent the maximum and minimum digital codes. In a 16-bit system, these codes would be +32,767 and -32,768. In the diagram below, note that the peaks exceed the maximum and minimum codes by a factor of 1.414 (3.01 dB). This worst-case example occurs when the audio tone is at 1/4 of the sample rate and the samples fall 45 degrees away from the waveform peaks. In a 44.1 kHz system, this worst-case scenario occurs near 11 kHz. High sample rate systems are much less prone to this problem because the worst case occurs at ultrasonic frequencies where there is very little audio content.
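The worst case described above can be reproduced numerically. The following is a minimal sketch, not a measurement tool: it samples a sine wave at 1/4 of a 44.1 kHz sample rate with a 45 degree phase offset and compares the largest sample value to the true waveform peak. The function and variable names are illustrative assumptions.

```python
import math

def sample_sine(peak, freq_hz, rate_hz, phase_rad, count):
    """Return `count` samples of a sine wave with the given peak amplitude."""
    return [peak * math.sin(2 * math.pi * freq_hz * n / rate_hz + phase_rad)
            for n in range(count)]

RATE_HZ = 44_100
TONE_HZ = RATE_HZ / 4      # 11.025 kHz, the worst-case frequency
PEAK = math.sqrt(2)        # true waveform peak, relative to full scale (1.0)

# A 45 degree phase offset places every sample halfway between a zero
# crossing and a peak, so no sample ever lands on the peak itself.
samples = sample_sine(PEAK, TONE_HZ, RATE_HZ, math.pi / 4, 16)

largest_sample = max(abs(s) for s in samples)
over_db = 20 * math.log10(PEAK / largest_sample)

print(f"largest sample magnitude: {largest_sample:.4f} (full scale = 1.0)")
print(f"true peak above largest sample: {over_db:+.2f} dB")
```

Every sample sits at full scale, so the sample data look unclipped, yet the waveform they describe peaks about 3.01 dB higher.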
If the digital samples are interpolated to a high sample rate (as is done in oversampled D/A converters) the resulting samples typically look like this:
In the above diagram, the interpolator in the oversampled D/A converter produces a severely clipped sine wave when attempting to reproduce the original analog waveform. Some D/A converters will produce additional distortions that are caused by numeric overflows in the digital signal processing. In either case, these conventional oversampled D/A converters produce bursts of distortion whenever intersample overs occur. Please note that SRC and ASRC devices (sample rate converters) use interpolators and these have similar overload problems. In our opinion, the audible signature of many SRC and ASRC chips can be attributed to this overload problem.
Every D/A chip and SRC chip that we have tested here at Benchmark has an intersample clipping problem! To the best of our knowledge, no chip manufacturer has adequately addressed this problem. For this reason, virtually every audio device on the market has an intersample overload problem. This problem is most noticeable when playing material recorded at a 44.1 kHz sample rate.
It is possible to build interpolators that will not clip or overload, but this is not being done by the D/A and SRC chip manufacturers. For this reason, Benchmark has moved some of the digital processing outside of the D/A chip. In the Benchmark DAC2 and DAC3 converters we have an external interpolator that has 3.5 dB of headroom above 0 dBFS. This means that the worst-case +3.01 dBFS intersample peaks can be processed without clipping. We also drive the ESS D/A converter chips at -3.5 dB so that no clipping will occur inside the ES9018 and ES9028PRO converter chips. The results are represented in the following diagram:
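As a quick check of the headroom figures quoted above, the sketch below converts between decibels and linear gain and confirms that a worst-case +3.01 dBFS peak fits within 3.5 dB of headroom. It models headroom simply as attenuation ahead of a stage that clips at full scale, which is an assumption made for illustration and not a description of Benchmark's implementation.

```python
import math

def db_to_gain(db):
    """Convert decibels to a linear amplitude ratio."""
    return 10 ** (db / 20)

def gain_to_db(gain):
    """Convert a linear amplitude ratio to decibels."""
    return 20 * math.log10(gain)

def clips(peak_linear, headroom_db, full_scale=1.0):
    """True if a peak still exceeds full scale after `headroom_db` of attenuation."""
    return peak_linear * db_to_gain(-headroom_db) > full_scale

HEADROOM_DB = 3.5                  # headroom figure quoted in the text
worst_case_peak = math.sqrt(2)     # +3.01 dBFS intersample peak

margin_db = HEADROOM_DB - gain_to_db(worst_case_peak)

print("clips with no headroom:", clips(worst_case_peak, 0.0))
print("clips with 3.5 dB headroom:", clips(worst_case_peak, HEADROOM_DB))
print(f"remaining margin: {margin_db:.2f} dB")
```

The remaining margin is roughly half a decibel, which is why 3.5 dB is enough to pass the worst-case sine wave without clipping.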
If we run a spectrum analysis of the output of a conventional D/A converter and a high-headroom D/A converter we can see the difference:
The green trace was produced by a high-headroom Benchmark DAC2. The DAC2 correctly reproduced an 11.025 kHz sine wave having an amplitude of +3.01 dBFS. The red trace shows the output of the Benchmark DAC1 under the same test conditions. Like most products on the market, the older DAC1 uses conventional interpolation and it will overload whenever intersample peaks exceed 0 dBFS. Note that the red plot shows many distortion products that are not harmonically related to the 11.025 kHz tone. These IMD (intermodulation distortion) products are produced by interactions between the test tone, the sampling frequency, and the oversampling frequency. This IMD is not musical and does not occur in analog systems. It produces a sound that is unique to digital systems. Note that most of the distortion occurs between 5 kHz and 22 kHz. This can produce a false brightness or brittleness when overloads occur. This raises the question: how often do intersample overs occur?
We have frequently used Steely Dan's Gaslighting Abbie from Two Against Nature in our listening tests. At trade shows, it is not unusual to hear this track being played. This is a spectacular CD recording with lots of dynamics and a low noise floor. Nevertheless, in a little over 5 minutes, this track has 559 intersample overs on the left track and 570 on the right track for a total of 1129. This means that there are about 3.7 intersample overs per second. The highest intersample over measures +0.8 dBFS. The track itself is not clipped; the 44.1 kHz sampling has simply captured peaks that exceed 0 dBFS. Of the 1129 overs, a total of 993 intersample peaks measure at least +0.5 dBFS. The following image shows the track with the intersample overs highlighted in red:
As stated above, this track has 1129 intersample overs, but the track itself is not clipped. These intersample overs contain dynamics that can be recovered accurately if they are rendered with a high-headroom interpolator.
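The figures quoted for this track can be cross-checked with a few lines of arithmetic. This sketch only restates numbers from the text; the implied duration is derived from the quoted rate and is an approximation, since the exact track length is not given here.

```python
left_overs, right_overs = 559, 570
total_overs = left_overs + right_overs

overs_per_second = 3.7     # rate quoted in the text (rounded)
implied_duration_s = total_overs / overs_per_second

def dbfs_to_linear(db):
    """Convert a level in dBFS to a linear amplitude (full scale = 1.0)."""
    return 10 ** (db / 20)

highest_over = dbfs_to_linear(0.8)         # highest intersample over, +0.8 dBFS
percent_over = (highest_over - 1.0) * 100

print(f"total overs: {total_overs}")
print(f"implied duration: about {implied_duration_s:.0f} s")
print(f"+0.8 dBFS is about {percent_over:.1f}% above full scale")
```

The implied duration of roughly 305 seconds is consistent with "a little over 5 minutes", and a +0.8 dBFS peak exceeds full scale by a little under 10 percent.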
Unfortunately, these dynamic peaks, which were captured in the recording process, cannot be rendered by most D/A converters. These peaks will overload conventional interpolators and each over will be rendered as a burst of percussive noise. On this track, an overloaded interpolator tends to add a false brightness to the snare drum while changing its sound.
Benchmark DAC2 and DAC3 converters are among the very few converters that have high-headroom interpolators. These converters accurately render the intersample peaks that were captured in the recording process. The dynamics of this track can only be fully appreciated when it is rendered without clipping.
To arrive at these counts we reduced the amplitude by 3 dB and then applied 8X interpolation to render the intersample peaks without clipping. Working at the 8X (352.8 kHz) sample rate, we then measured the amplitudes of the reconstructed intersample peaks. Next we restored the track to its original amplitude by increasing the gain by 3 dB. This allowed us to highlight the 1129 intersample overs in red. We used Audacity to perform these functions.
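The measurement steps described above can be sketched in code. This is a minimal illustration under stated assumptions, not the Audacity workflow itself: the interpolator is a simple Hann-windowed sinc written for clarity, the test signal is a synthetic faded 11.025 kHz tone standing in for a real track, and counting each contiguous run above full scale as one "over" is an assumed definition. In floating point the 3 dB pad is not strictly needed, but it is kept to mirror the procedure.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def interpolate(samples, factor=8, half_width=32):
    """Upsample by `factor` with a Hann-windowed sinc kernel (illustrative)."""
    n = len(samples)
    out = []
    for m in range(n * factor):
        t = m / factor                  # output time in input-sample units
        center = math.floor(t)
        acc = 0.0
        for k in range(center - half_width + 1, center + half_width + 1):
            if 0 <= k < n:
                d = t - k
                window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                acc += samples[k] * sinc(d) * window
        out.append(acc)
    return out

def count_overs(samples, full_scale=1.0):
    """Count contiguous runs of samples whose magnitude exceeds full scale."""
    overs, inside = 0, False
    for s in samples:
        above = abs(s) > full_scale
        if above and not inside:
            overs += 1
        inside = above
    return overs

def db_to_gain(db):
    return 10 ** (db / 20)

# Hypothetical test signal: a tone at 1/4 of the sample rate, 45 degrees off
# the peaks, with a Hann fade in and out. No sample exceeds 0.98 of full scale.
N = 256
tone = [0.98 * math.sqrt(2)
        * 0.5 * (1.0 - math.cos(2 * math.pi * n / N))
        * math.sin(math.pi * n / 2 + math.pi / 4)
        for n in range(N)]

PAD_DB = 3.0
attenuated = [s * db_to_gain(-PAD_DB) for s in tone]       # step 1: pad by 3 dB
upsampled = interpolate(attenuated, factor=8)              # step 2: 8X interpolation
peak_dbfs = 20 * math.log10(max(abs(s) for s in upsampled)) + PAD_DB  # step 3
restored = [s * db_to_gain(PAD_DB) for s in upsampled]     # step 4: restore gain
overs = count_overs(restored)                              # step 5: count overs

print(f"largest sample: {max(abs(s) for s in tone):.3f}")
print(f"reconstructed peak: {peak_dbfs:+.2f} dBFS, overs: {overs}")
```

Although every sample in the synthetic tone is below full scale, the reconstructed waveform peaks well above 0 dBFS, which is the same effect the procedure reveals in the recording.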
The following segment from Gaslighting Abbie compares conventional interpolation to high-headroom interpolation. The top two tracks were 8X interpolated with a conventional interpolator. The middle two tracks were 8X interpolated with a high-headroom interpolator. Note how the peaks were properly rendered by the high-headroom interpolator. The bottom two tracks show where the clipping occurs in the conventional interpolator. The segment starts at 2:30.46880 and ends at 2:30.47015 seconds, and is typical of many of the intersample overs contained in this recording.
Please note that the 44.1 kHz CD sample rate captured the intersample peaks. When high-headroom interpolation was applied, the original peaks contained in the music were recovered and accurately rendered. This high-headroom interpolation technique delivers an accurate rendition of the analog signal captured by the A/D converter in the studio.
Intersample overs are not unique to the Gaslighting Abbie track or to Steely Dan's Two Against Nature CD. Most CD recordings contain intersample overs.
Here are some other examples of tracks with intersample peaks that exceed 0 dBFS:
Intersample overs are shown in red, highest intersample peak = +0.486 dBFS.
Intersample overs are shown in red, highest intersample peak = +0.565 dBFS.
Intersample overs are shown in red, highest intersample peak = +1.49 dBFS. This track has most of the intersample overs concentrated in the right channel. They seem to be caused by percussion that is slightly panned to the right.
Intersample overs are shown in red, highest intersample peak = +0.485 dBFS.
Note: This is an 88.2 kHz recording that has been hard limited to -0.01 dBFS. The limiting produces intersample overs when interpolating 4X to 352.8 kHz. This means that a conventional interpolator would add distortion at the peaks shown in red.
The following CD track is loud but, surprisingly, it has no intersample overs. It was mastered by Gavin Lurssen in 2007. This track demonstrates that it is possible to create CDs without intersample overs, although this should not be necessary if converters and SRC devices have high-headroom interpolators.