
DAC, Preamp, Headphone Amp: Corda Soul and Oppo HA-1 (part 1 of 8)

This is part 1 of an 8 part series comparing the Meier Corda Soul and Oppo HA-1.

I’ve used an Oppo HA-1 as my DAC, preamp and headphone amp for nearly 4 years. The reason I still have it is because I enjoy listening to it so much. Yet I wonder whether better sound could be had. The Oppo is primarily designed as a headphone amp; its linestage, while very good, is a secondary feature. I listen on speakers at least as much as I do on headphones, and my speakers are more transparent than most headphones. Also its volume control is a potentiometer; a high quality Alps, but still not the ultimate in transparency and perfect channel balance that a well implemented stepped attenuator can provide.

My audio system at work uses a Meier Corda Jazz amp. I bought it over 4 years ago. It has a wonderful sound: detailed, smooth and sweet without euphonics. And some nice features, like a stepped attenuator volume control, selectable L-R cross-feed and balanced ground drive. It’s hard to find anything this nice for twice the price, and the build quality is great. So when I heard that Jan Meier recently built a SOTA DAC/headphone amp/preamp called the Soul, I was intrigued. The Soul has come out with some anticipation in the headphone audiophile community. I read that Jan brought the Soul to CanJam in Europe and it was judged “best in show”. Given some of the very nice (and expensive!) gear at CanJam, that says a lot. I also found a few people online who had auditioned it, nothing but rave reviews.

Jan built 2 prototypes of this device. The prototype resembles a science project but is solid, if not elegant, and electrically equivalent to the production version (which as of Dec 19, 2018 is yet to be released). If you contact Jan, you can arrange with him to borrow it for a listening session. When I did so, he told me he was running a pre-order to gauge interest in this device and estimate production volume (which influences the price). People who pre-order pay a deposit up front and get a lower price when the production version of the Soul is released. It turns out I contacted him just as this pre-order was ending. Jan has a generous policy of not charging to borrow the Soul prototype (though shipping it back to him cost about $65), and a 14-day return policy for the final product.

It may seem unfair to compare the Oppo with the Soul, as the latter will probably cost several times as much. But I’ve always believed, based on comparison with other headphone amps and DACs, that the Oppo punches well above its weight class when it comes to performance per dollar. Also, my goal was not a head-to-head comparison of these 2 preamps, but rather to find out whether the Soul would be a relative improvement. That is only a subtle distinction, but an important one.

Two weeks later, just before Christmas, the Soul arrived at my door having crossed an ocean, a continent, and customs inspectors. I carefully unpacked it, connected it to my system, centered/zeroed all the knobs and made a quick function test. Music! Success! I swallowed my anticipation and left both it and my Oppo HA-1 powered on overnight for listening sessions the next day.

Next: system summary and setup

DAC / DA Conversion / Linear vs Minimum Phase

Digital audio requires an anti-aliasing filter to suppress high frequencies (at or above Nyquist, or half the sampling frequency). Without this, an infinite number of different analog waves could pass through the digital sampling points. With this, there is only 1 unique analog wave that passes through them. The anti-aliasing filter is essential to ensure the analog wave that the DAC constructs from the bits is the same one that was recorded and encoded (assuming that the original analog mic feed was properly anti-alias filtered, preventing frequencies above Nyquist from leaking through).
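To make the "infinite number of analog waves" point concrete, here is a minimal Python sketch. The 44,100 Hz rate and the 5 kHz tone are illustrative choices, not taken from any particular recording. It samples one tone below Nyquist and a second tone mirrored above Nyquist; the two sets of sample values coincide, so without band-limiting the samples alone cannot tell the tones apart.

```python
import math

FS = 44_100  # sampling rate in Hz (assumed: the CD rate)


def sample_cosine(freq_hz, n_samples, fs=FS):
    """Return n_samples of a unit-amplitude cosine at freq_hz, sampled at fs."""
    return [math.cos(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]


low = sample_cosine(5_000, 64)         # below Nyquist (22,050 Hz)
alias = sample_cosine(FS - 5_000, 64)  # 39,100 Hz, above Nyquist

# Both tones pass through the same sampling points (up to rounding error).
max_diff = max(abs(a - b) for a, b in zip(low, alias))
print(f"largest difference between the two sample sets: {max_diff:.2e}")
```

The anti-aliasing filter removes the second tone before sampling, which is what leaves only one candidate wave.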

Audiophiles debate about whether linear or minimum phase anti-aliasing filters are ideal for sound reproduction and perception. Linear phase has the lowest overall distortion, but its symmetric response around transients (a bit of ripple just before and after a transient pulse), often called the Gibbs effect, means there is a “pre-echo” or “pre-ring”. In the diagram below, the red line is the signal and the black wave is the analog wave constructed from it using a linear phase filter.

If the X axis is t for time, this black curve is the function sinc(t). It is symmetric before and after the transient, which means it starts wiggling before the transient actually happens. This is unnatural; in the real world, all of the sound happens after the actual event. This pre-ringing is an artifact of linear phase anti-aliasing filters. Many audiophiles claim this is audible, smearing transients and adding “digital glare”.

Here’s what the audio books don’t always tell you. According to the Whittaker-Shannon interpolation formula, this sinc(t) response represents the “perfect” reconstruction of the bandwidth limited analog signal encoded by the sampling points. The pre-ring is very low level, and it rings at the Nyquist frequency (half the sampling frequency). That is at least 22,050 Hz (octaves higher if the digital signal is oversampled, as it virtually always is). This makes it unlikely for anyone to hear it even under ideal conditions of total silence followed by a sudden percussive SMACK.
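For readers who want to see the symmetry in numbers, here is a rough Python sketch of Whittaker-Shannon interpolation applied to a single impulse. The 17-sample buffer and the function names are my own illustrative choices, and a real DAC uses a finite approximation of this sum rather than the ideal formula.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, t):
    """Whittaker-Shannon interpolation; t is in units of the sample period."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))


# A single full-scale sample (an impulse) at index 8, silence elsewhere.
samples = [0.0] * 17
samples[8] = 1.0

before = reconstruct(samples, 6.5)  # 1.5 sample periods before the impulse
after = reconstruct(samples, 9.5)   # 1.5 sample periods after the impulse
print(f"before: {before:.4f}, after: {after:.4f}")
```

The reconstructed wave is non-zero between the sampling points before the impulse, and it mirrors the ripple after it. At the sampling points themselves it still passes exactly through the recorded values.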

NOTE: I say “unlikely” not “impossible” because even though humans can’t hear 22,050 Hz (let alone frequencies octaves higher), it is at least feasible that somebody could still hear the difference. Under the right conditions, removing frequencies we can’t hear as pure tones causes audible changes to the wave in the time domain. That doesn’t make sense mathematically, but human perception of the frequency & time domains is not as symmetric as Fourier transforms.

Some audiophiles suggest minimum phase filters as an alternative to solve this problem. But I believe this cure is worse than the disease. Minimum phase filters have an asymmetric response around transients with no pre-ringing. A picture is worth 1,000 words, so here’s what that same impulse looks like when a minimum phase filter is used.

You can see that the impulse strikes instantly without any pre-ringing. Well it actually rings louder and longer than the linear phase filter, but that ringing happens after the transient.

This has the added benefit that the ringing is masked by the sound itself for the simple reason that loud sounds psychoacoustically mask quiet ones. So what’s not to like here?

The problem is, minimum phase filters actually have more distortion (more ringing, more phase shift) than linear phase. So you get more distortion overall, but it’s time-delayed so you get cleaner initial transients with more distorted decay. And the phase shift caused by minimum phase filters happens all the time, not just in transients. So it seems you can have clean transients, or good phase response, but not both. Choose your poison.

At this point a purist audiophile might hang his head in sadness. But there’s a better solution to the digital bogeyman of pre-ring: oversampling (or higher sampling rates). The phase distortion and ringing of any filter is related to its slope, or the width of its transition band. Oversampling raises the frequency of the pre-ring (which was already ultrasonic) and widens the transition band, which allows a shallower filter slope and reduces distortion.

For example, consider CD, sampled at 44,100 Hz. Nyquist is 22,050 and some people can hear 20,000 so the transition band is from 20,000 to 22,050. That’s very narrow (only 0.14 octaves) and requires a steep filter with Gibbs effect pre-ring at 22,050 Hz. Oversample it 8x and Nyquist is now 176.4 kHz, so your transition band is now 20k to 176.4k, which is 3.14 octaves (actually, you’d use a lower cutoff frequency, but it’s still at least a good octave above 22,050 Hz). Absolutely inaudible; go ahead and use linear phase with no worries.
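The octave figures above are easy to check. A small Python sketch, assuming a 20 kHz passband edge as in the example (the function name is mine):

```python
import math


def transition_band_octaves(passband_edge_hz, nyquist_hz):
    """Width of the filter's transition band, in octaves."""
    return math.log2(nyquist_hz / passband_edge_hz)


cd_nyquist = 44_100 / 2               # 22,050 Hz
oversampled_nyquist = 8 * 44_100 / 2  # 176,400 Hz

cd_octaves = transition_band_octaves(20_000, cd_nyquist)
oversampled_octaves = transition_band_octaves(20_000, oversampled_nyquist)

print(f"CD: {cd_octaves:.2f} octaves")
print(f"8x oversampled: {oversampled_octaves:.2f} octaves")
```

Each doubling of the sampling rate adds exactly one octave to the transition band, which is why 8x oversampling adds three.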

In short, use higher sampling frequencies (or oversample) not because you need to capture higher frequencies, but because it gives you a more gradual anti-aliasing filter which means faster transient response without any time or phase distortion.

This idea is nothing new. Most D-A converters already oversample, and have been doing so for decades. The pre-ring or ripple of a well-engineered DAC is negligibly small, ultrasonic and inaudible. However, some people prefer minimum phase filters! How can we explain that? Minimum phase filters have no pre-ripple, yet they also have phase distortion, they ring louder and longer, and in some cases they allow higher frequencies to be aliased into the signal.

First, if this preference comes from a non-blind test, we can’t be sure they really heard any difference at all. Maybe they did, maybe they didn’t. A negative result from a blind test doesn’t mean they can’t hear a difference, it only means we can’t be sure they hear a difference.

Along these lines of non-blind testing, Keith Howard wrote a good article for Stereophile a few years ago: https://www.stereophile.com/reference/106ringing
I love their experimental attitude: test and discover! But when they talk about how hard it was to tell the filters apart, it is kinda funny thinking about a bunch of middle-aged guys wondering why they can’t hear an ultrasonic ripple octaves above the range of their hearing. Especially when most of them understand math & engineering well enough to know why.

Second, consider if this preference comes from a blind test. Blind tests only reveal whether people can hear differences; they don’t qualify exactly what differences they were hearing. My guess is that people who prefer minimum phase filters are simply finding some of these distortions to be euphonic. This seems reasonable, given that preferences for vinyl records and tube amps are also common.

This topic has been endlessly debated in audiophile circles for years. Here’s an article showing some actual measurements: http://archimago.blogspot.com/2013/06/measurements-digital-filters-and.html

A couple years later he followed up with a listening test: http://archimago.blogspot.com/2015/04/internet-blind-test-linear-vs-minimum.html

So what do I think about all this? Like the Stereophile reviewers, listening to music, I find it difficult to hear a difference between the “sharp” (linear phase) and “slow” (minimum phase) filters. Test signals highlight the differences, but I don’t enjoy listening to test signals, and since they’re not natural sounds, even if you can tell them apart there’s no reference for what they should sound like. I know the sharp filter is “correct” from a math & engineering perspective, so that’s the one I use.

How Loud Does it Get?

Magnepan 3.6/R specs don’t give efficiency, but they give voltage sensitivity. That’s 86 dB @ 500 Hz @ 2.83 V. From this we can determine efficiency. 500 Hz is carried by the midrange panel which has 4.2 Ohm impedance, so 2.83 V drives 2.83/4.2 = 0.674 A of current, which makes 2.83 * 0.674 = 1.907 Watts.

So, 1.9 W of power makes 86 dB SPL at 1 meter. That’s low efficiency for a speaker.

The Adcom 5800 is rated at 400 W continuous in each channel with 2.1 dB of headroom. 400 W is 10 * log (400 / 1.9) = 23 dB louder than 1.9 W, which makes 86+23 = 109 dB SPL in each speaker. 2 speakers is twice the power which is +3 dB making 112 dB SPL from both speakers. Plus 2.1 for headroom makes 114 dB SPL peak.
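Here is the same arithmetic as a short Python sketch, in case you want to plug in your own speaker and amp numbers. The constants are the figures quoted above; the variable names are mine, and the +3 dB for two speakers follows the text's assumption that doubling the power adds 3 dB.

```python
import math

SENSITIVITY_DB = 86.0  # dB SPL at 1 m with 2.83 V applied (speaker spec)
DRIVE_VOLTS = 2.83
IMPEDANCE_OHMS = 4.2   # midrange panel impedance used above
AMP_WATTS = 400.0      # continuous power per channel
HEADROOM_DB = 2.1


def db_power_ratio(p1, p2):
    """Level difference in dB between two powers."""
    return 10 * math.log10(p1 / p2)


reference_watts = DRIVE_VOLTS ** 2 / IMPEDANCE_OHMS  # P = V^2 / R
one_speaker = SENSITIVITY_DB + db_power_ratio(AMP_WATTS, reference_watts)
two_speakers = one_speaker + db_power_ratio(2, 1)    # double the power
peak = two_speakers + HEADROOM_DB

print(f"reference power: {reference_watts:.3f} W")
print(f"one speaker:   {one_speaker:.1f} dB SPL")
print(f"two speakers:  {two_speakers:.1f} dB SPL")
print(f"with headroom: {peak:.1f} dB SPL")
```

The unrounded results land within a few tenths of a dB of the rounded figures in the text.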

Subjectively, I can say this is VERY loud. Over the 26 years I’ve owned this amp I can count on the fingers of 1 hand the number of times I’ve seen its yellow 1% distortion warning lights briefly flicker during a transient peak.

NOTE: I tested this last night by holding an SPL meter while listening to a test CD. A full scale (0 dB) digital signal, passing through my preamp (Oppo HA-1) at 0 dB measures 104 dB SPL at the listening position. The power amp (Adcom 5800) warning lights do not even flicker. The preamp goes up to +6 dB output, which would be 110 dB SPL. That’s pretty close to the theoretical measurement–within 2 dB!

That 2.1 dB of headroom means peak power is 10^(2.1/10) = 1.62 times higher than continuous, making 400 * 1.62 = 648 Watts.

Also we can sanity check the amp’s overall efficiency. The 5800’s peak continuous power draw is rated at 1800 VA (Watts). While delivering 800 W to a pair of speakers, that’s 44% efficient. It’s actually less efficient at lower volumes because it’s biased to run in symmetric class A up to about 10 Watts output. The max theoretical efficiency of class A is 25%. It draws about 250 W when idle!
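The headroom and efficiency figures can be checked the same way. This is only a sketch: it treats the 1800 VA rating as watts, as the text does, and the variable names are mine.

```python
HEADROOM_DB = 2.1
CONTINUOUS_WATTS = 400.0  # per channel
RATED_DRAW_VA = 1800.0    # treated as watts, as in the text

# Convert dB of headroom back into a power ratio.
peak_ratio = 10 ** (HEADROOM_DB / 10)
peak_watts = CONTINUOUS_WATTS * peak_ratio

# Both channels driven at continuous power, divided by the rated draw.
efficiency = (2 * CONTINUOUS_WATTS) / RATED_DRAW_VA

print(f"peak power ratio: {peak_ratio:.2f}")
print(f"peak power: {peak_watts:.0f} W per channel")
print(f"efficiency at full output: {efficiency:.0%}")
```

The peak power comes out a fraction of a watt above 648 because the text rounds the ratio to 1.62 before multiplying.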

Next question: if the Adcom 5800 operates in symmetric class A up to 10 W, how loud can it play these speakers while in class A, before transitioning to class AB?

From above, the speakers play at 86 dB SPL when consuming 1.907 watts. 10 watts is 7.2 dB louder, plus 86 = 93.2 dB SPL. That’s per side, so +3 dB makes 96.2 dB SPL. That’s pretty loud.

Double-check the answer: 400 watts is 16 dB louder than 10, so add 16 dB to 96.2 and you get 112.2 dB. The math checks: same answer as above.
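A compact Python version of this class A check, using the 1.907 W reference power derived earlier (the helper name is illustrative):

```python
import math

SENSITIVITY_DB = 86.0    # dB SPL at the reference power
REFERENCE_WATTS = 1.907  # power drawn at 2.83 V into 4.2 Ohms
CLASS_A_WATTS = 10.0     # approximate class A limit per channel
AMP_WATTS = 400.0        # continuous rating per channel


def spl_both_speakers(watts_per_channel):
    """SPL from two speakers, each driven with watts_per_channel."""
    per_speaker = SENSITIVITY_DB + 10 * math.log10(watts_per_channel / REFERENCE_WATTS)
    return per_speaker + 10 * math.log10(2)  # two speakers: double the power


class_a_spl = spl_both_speakers(CLASS_A_WATTS)
full_spl = spl_both_speakers(AMP_WATTS)

print(f"class A limit: {class_a_spl:.1f} dB SPL")
print(f"full power:    {full_spl:.1f} dB SPL")
print(f"difference:    {full_spl - class_a_spl:.1f} dB")
```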

Bits and Dynamic Range

When digital audio came out I wondered how the number of bits per sample correlated to the amplitude of waves. I imagined that the total expressible range was independent of the size of the smallest discernible gradation. Since this appeared to be a trade-off, I wondered how anyone decided what was a good balance.

Later I realized this is a false distinction. First: the number of bits per sample determines the size of the smallest gradation. Second: total expressible range is not a “thing” in the digital domain. Third: if the total range is a pie of arbitrary size, dynamic range is the number of slices. The smaller the slices, the bigger the dynamic range.

Regarding the first: to be more precise, bits per sample determines the size of the smallest amplitude gradation, as a fraction of full scale. Put differently: what % of full scale is the smallest amplitude gradation. But full scale is the amplitude of the analog wave, which is determined after D/A conversion, so it’s simply not part of the digital specification.

Amplitude swings back and forth. Half the values are used for negative, the other half for positive, amplitudes. Thus 16 bit audio gives 2^16 = 65,536 amplitudes, which is 32,768 for positive and negative each (actually one of the 65,536 values is zero, which leaves an odd number of values to cover the + and – amplitude swings, making them asymmetric by 1 value, which is a negligible difference). Measuring symmetrically from zero, we have 32,768 amplitudes in either direction. So the finest amplitude gradation is 1/32,768 of full scale in either direction, or 1/65,536 of peak-to-peak. 16-bit slices the amplitude pie into 65,536 equal pieces.

Here’s another way to think about this: the first bit gives you 2 values and each additional bit doubles the number of values. Amplitude is measured as voltage, and doubling the voltage is 6 dB. So each bit gives 6 dB of range, and 16 bits gives 96 dB of range. But this emphasizes the total range of amplitude, which can be misleading because what we’re really talking about is the size of the finest gradation.

So let’s follow this line of reasoning but think of it as halving, rather than doubling. We start with some arbitrary amplitude range (defined in the analog domain after the D/A conversion). It can be anything; you can suppose it’s 1 Volt but it doesn’t matter. The first digital bit halves it into 2 bins, and each additional bit doubles the number of bins, slicing each bin to half its size. Each of these halving operations shrinks the size of the bins by 6 dB. So 16 bits gives us a bin size 96 dB smaller than full scale. Put differently, twiddling the least significant bit creates noise 96 dB quieter than full scale.

To check our math, let’s work it backward. For any 2 voltages V1 and V2, the definition of voltage dB is:

20 * log(V1/V2) = dB

So 96 dB means for some ratio R,

20 * log R = 96

where R is the ratio of full scale to the smallest bin. This implies that

R = 10 ^ (96/20) = 63,096

That’s almost the 65,536 we expected. The reason it’s slightly off is that doubling the voltage is not exactly 6 dB. That’s just a convenient approximation. To be more precise:

20 * log 2 = 6.0206

So doubling (or halving) the voltage changes the level by 6.0206 dB. If we use this more precise figure, then 16 bits gives us 96.3296 dB of dynamic range. If we compute:

20 * log R = 96.3296

We get

R = 10 ^ (96.3296 / 20) = 65,536

When the math works, it’s always a nice sanity check.
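The round trip above takes only a few lines of Python. The function names are mine; the formulas are the voltage dB definition used in this section.

```python
import math


def ratio_to_db(ratio):
    """Voltage ratio expressed in dB."""
    return 20 * math.log10(ratio)


def db_to_ratio(db):
    """Inverse of ratio_to_db."""
    return 10 ** (db / 20)


bits = 16
exact_db = ratio_to_db(2 ** bits)  # about 6.0206 dB per bit
approx_db = 6 * bits               # the "6 dB per bit" shortcut

print(f"exact:  {exact_db:.4f} dB -> ratio {db_to_ratio(exact_db):,.0f}")
print(f"approx: {approx_db} dB -> ratio {db_to_ratio(approx_db):,.0f}")
```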

Summary

The term dynamic range implies how “big” the signal can be. But it is both more precise and more intuitive to imagine the concept of dynamic range as the opposite: the size of the smallest amplitude gradation or “bin”, relative to full scale. Put differently: dynamic range is defined as the ratio of full scale, to the smallest amplitude bin.

With 16 bits, that smallest bin is 1/65,536 of full scale, which is 96 dB quieter. With 16-bit amplitudes, if you randomly wiggle the least significant bit, you create noise that is 96 dB below full scale.

With 24 bits, that smallest bin is 1/16,777,216 of full scale, which is 144 dB quieter. With 24-bit amplitudes, if you randomly wiggle the least significant bit, you create noise that is 144 dB below full scale.

Typically, the least significant bit is randomized with dither, so we get half a bit less dynamic range, so for 16-bit we get 93 dB and 24-bit we get 141 dB.
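The summary figures can be tabulated with a short Python sketch. The 3 dB dither penalty is the "half a bit" rule of thumb from the paragraph above, taken here as an assumption rather than a property of any particular dither scheme.

```python
import math

DITHER_PENALTY_DB = 3.0  # assumption: the text's "half a bit" rule of thumb


def dynamic_range_db(bits):
    """Ratio of full scale to the smallest bin, in dB."""
    return 20 * math.log10(2 ** bits)


def dithered_range_db(bits):
    """Dynamic range after giving up roughly half a bit to dither."""
    return dynamic_range_db(bits) - DITHER_PENALTY_DB


for bits in (16, 24):
    print(f"{bits}-bit: {2 ** bits:,} bins, "
          f"{dynamic_range_db(bits):.1f} dB, "
          f"about {dithered_range_db(bits):.1f} dB with dither")
```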

Practical Dynamic Range

Virtually nothing we record, from music to explosions, requires more than 93 dB of dynamic range, so why does anyone use 24-bit recording? With more bits, you slice the amplitude pie into a larger number of smaller pieces, which gives more fine-grained amplitude resolution–and, consequently, a larger range of amplitudes to play with. This can be useful during live recording, when you aren’t sure exactly how high peak levels will be. More bits gives you the freedom to set levels conservatively low, so peaks won’t overload, but without losing resolution.

However, once that recording is completed, you know what the peak level recorded was. You can up-shift the amplitude of the entire recording to set the peak level to 0 dB (or something close like -0.1 dB). So long as the recording had less than 93 dB of dynamic range, this transforms the recording to 16-bit without any loss of information (such as dynamic range compression).
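As a rough illustration of that up-shift, here is a Python sketch that computes the gain needed to move a recording's peak to a target level. The 0.25 peak (about -12 dB) is a made-up example, and the function name is mine.

```python
import math


def normalization_gain_db(peak, target_dbfs=-0.1):
    """Gain in dB that moves a recording's peak to target_dbfs.

    peak is the largest absolute sample value, as a fraction of full scale.
    """
    peak_dbfs = 20 * math.log10(peak)
    return target_dbfs - peak_dbfs


# Hypothetical recording whose levels were set conservatively low.
recorded_peak = 0.25
gain_db = normalization_gain_db(recorded_peak)
scale = 10 ** (gain_db / 20)  # multiply every sample by this factor

print(f"gain needed: {gain_db:.2f} dB")
print(f"new peak: {recorded_peak * scale:.4f} of full scale")
```

Every sample gets the same multiplier, so the relationship between loud and quiet passages is unchanged.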

In the extremely rare case that the recording had more than 93 dB of dynamic range, you can keep it in 24-bit, or you can apply a slight amount of dynamic range compression while in the 24-bit domain, to shrink it to 93 dB before transforming it. There are sound engineering reasons to use compression in this situation, even for purist audiophiles!

To put this into perspective: 93 dB of dynamic range is beyond what most people can pragmatically enjoy. Consider: a really quiet listening room has an ambient noise level around 30 dB SPL. If you listened to a recording with 93 dB of dynamic range, and you wanted to hear the quietest parts, the loud parts would peak at 93 + 30 = 123 dB SPL. That is so loud as to be painful; the maximum safe exposure is only a couple of seconds. And whether your speakers or amplifier can do this at all, let alone without distortion, is a whole ‘nuther question. You’d have to apply some amount of dynamic range compression simply to make such a recording listenable.

High Bit Rate Audio

When CDs first came out in the 1980s they sounded lifeless. I still have several in my collection from those years and they still sound bad. In some ways they were better than LPs: no background rumble or hiss, much cleaner and tighter bass, uncolored midrange, and consistent sound quality unlike LPs that sound best in the outer groove with sound quality gradually deteriorating as the record plays and the needle moves toward the inner groove. At the end of the record, just when the orchestra is reaching its crescendo finale, you hear audible distortion or dynamic range compression because the inner groove can’t handle the dynamic range. CD avoided these issues, yet by “lifeless” I mean the high frequencies and transient response on CD were quite poor, much worse than LP.

Over the 1990s, CDs improved until around the year 2000 I found the best CDs had surpassed LPs. CD high frequency and transient response had vastly improved, plus CDs retained the other advantages they had all along. By this point, the best CDs of live acoustic music sounded more natural and real, where the best LPs sounded like an artistically euphonic sonic portrayal.

Looking back, the reason for this transformation of CD audio quality seems to be the use of poorly implemented anti-aliasing filters in the early days. Over the 1990s, we owe the improvement in CD quality mainly to digital oversampling and more transparent anti-aliasing filters, and partially to better implementations of dither and noise shaping.

At the same time, around the turn of the century, high bit rate formats came out: SACD and DVD-Audio. Various engineering and acoustic reasons were given for these high bit rates, most of which were based on well-intended yet fallacious understanding of digital audio, some on blatant pseudo-science.

The best explanation I’ve seen comes from a video by Monty Montgomery, and on his website, where he debunks the most common misunderstandings about digital audio. However, in his zeal to shed the light of math and engineering on this subject, he overstates the case in a few areas. Here I describe those areas. However, while I dispute these points, generally I do agree with him. He’s essentially got it right and is worth reading.

Audible Spectrum

Monty says, “Thus, 20Hz – 20kHz is a generous range. It thoroughly covers the audible spectrum, an assertion backed by nearly a century of experimental data.” This is mostly, yet not quite true. The range of human hearing is closer to 18 Hz to 18 kHz. It’s common for people to hear below 20 Hz, but almost nobody above the age of 15 can hear 20 kHz. For example, at age 50 as I write this, my personal hearing range is from around 16 Hz to 15 kHz.

Ironically, this actually strengthens Monty’s case. Digital audio has no problem going lower than 20 Hz, and we only need to go up to around 18 kHz to be transparent.

The Human Ear: Time vs. Frequency Domain

The ear is a strange device. Highly sensitive, yet also non-linear, and it can also be inconsistent and unreliable. Our keen perception of transient response is more sensitive than one would expect, given the upper threshold of frequency tones we can hear.

For example, consider castanets. They have lots of high frequency energy, to 20 kHz and above. If you listen to real castanets–not an audio recording, but an actual person snapping them in front of you–the “snap” or “click” has an incredibly crisp, yet light and clean sound. Most recordings of them sound artificial with smeared transients, because these recordings don’t capture those high frequencies well. They’re lost somewhere in the microphone, the position of the mic to the musician, or the audio processing.

I have an excellent CD recording of castanets (it’s a flute quintet, but several tracks feature castanet accompaniment) that has energy up to 20 kHz. It’s one of the best, most realistic castanet recordings I have heard: clean, crisp yet light. Almost perfect sounding. As a test, I’ve applied EQ to this recording to attenuate frequencies above 15 kHz. I can differentiate this from the original in an A/B/X test. In the filtered version, the castanets don’t sound as crisp or clean. It’s hard to describe, but they sound slightly “smeared” for lack of a better word. The effect is subtle, but consistently noticeable when you know what to listen for, and listen carefully.

Yet as mentioned above, I can’t hear frequencies above 15 kHz, so I can’t hear the frequencies I attenuated. How is that possible? I believe it’s because the ear can detect transient response timing that requires higher frequencies to resolve, than it can hear as tones. Put differently: take a musical signal of castanets (or anything else with very high frequencies) and apply a Fourier Transform to convert to the frequency domain. The highest frequencies you cannot hear as pure tones. But if you filter them out, it distorts the original waveform in the time domain, rounding off sharp transients and causing pre-echo. The ear can detect these artifacts.
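To illustrate how removing high frequencies slows a transient, here is a toy Python sketch. It is not a model of castanets or of my EQ: it builds a 1 kHz square wave from its Fourier series, with an ideal brick-wall cutoff as a simplifying assumption, and measures how long the edge takes to reach 90% of full level.

```python
import math

F0 = 1_000.0  # fundamental of the test square wave in Hz (illustrative)


def band_limited_square(t, cutoff_hz, f0=F0):
    """Fourier series of a unit square wave, keeping harmonics up to cutoff_hz."""
    total = 0.0
    k = 1
    while k * f0 <= cutoff_hz:
        total += math.sin(2 * math.pi * k * f0 * t) / k
        k += 2  # a square wave contains only odd harmonics
    return 4 / math.pi * total


def rise_time(cutoff_hz, step=1e-7):
    """Seconds after the zero crossing until the edge first reaches 90%."""
    t = 0.0
    while band_limited_square(t, cutoff_hz) < 0.9:
        t += step
    return t


rise_15k = rise_time(15_000)
rise_20k = rise_time(20_000)

print(f"15 kHz cutoff: {rise_15k * 1e6:.1f} microseconds")
print(f"20 kHz cutoff: {rise_20k * 1e6:.1f} microseconds")
```

The edge is slower with the 15 kHz cutoff even though the harmonics that were removed sit above what I can hear as pure tones. Whether a difference of this size is audible is a separate question that only listening tests can answer.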

The moral of this story: well-engineered digital audio does perfectly capture any analog signal that has been bandwidth-limited to the Nyquist frequency. But, some caveats apply:

  1. Bandwidth-limiting the signal can create audible distortion. Anti-alias filtering with a steep slope creates audible time domain distortion in the pass-band.
  2. Higher sampling rates (alternately, oversampling) give a wider transition band, making a gradual filter slope, reducing this pass-band distortion.
  3. The frequencies needed for transient response to sound transparent, may be higher than the frequencies that people can hear as pure tones.

Of course, these points are not unique to digital audio. To get transparent transient response, every step in the recording chain must preserve high frequencies. You must use microphones with extended high frequency response, position them close enough to the musicians to capture the frequencies, etc.

Lossy Compression

Monty says: “a properly encoded Ogg file (or MP3, or AAC file) will be indistinguishable from the original at a moderate bitrate.” Whether that is true depends on one’s definition of “moderate”. Trained listeners of high quality recordings on high quality equipment may need bit rates higher than Monty suggests.

A/B/X testing the highest quality recordings in my collection, I can reliably distinguish MP3 up to about 200 kbps rates, using LAME 3.99.5, which is one of the best encoders. Most MP3s are done at 128 to 160 kbps thus could be differentiated from the original.

There is some truth to the “moderate bitrates are sufficient” viewpoint. Most MP3s are of rock, pop or electronic music, inferior quality master recordings that are compressed, clipped, and heavily EQed. The low 128 to 160 kbps rates may be transparent for this content. But that’s not relevant to us; here we’re talking about high end.

In short, if you are a trained critical listener of high quality recordings on high quality equipment, you can hear the difference of MP3 and other lossy compression, unless you use the best modern encoders at higher than average bit rates (say 256 kbps minimum to be safe).

I’ve also got a few thoughts on dynamic range and 16 vs 24-bit. That’s a whole ‘nuther discussion.

Conclusion

What Monty says about digital audio is true, generally speaking. He’s done a great job of debunking common myths. High bit rate recordings are over-hyped and can actually be counterproductive. However, there are some caveats to keep in mind:

  1. High bit rate recordings often do sound better, because when they are being made, extra care and attention is used throughout the entire recording process.
    • But if you took that recording and down-sampled it to CD quality using properly implemented methods, it is likely to be indistinguishable from the original.
  2. High bit rate recordings may be sold as “studio masters”, not having dynamic range compression, equalization or other processing often applied to CDs.
    • This is related to (1), and the same comment applies.
  3. High bit rates can offer subtle improvements to transient (impulse) response.
    1. This benefit is intrinsic to high bit rate audio
    2. However, it is not always realized because the limiting factor for transient response may be the microphones or other parts of the recording process.
  4. High bit rates can sound worse, because they may capture ultrasonic frequencies that increase intermodulation distortion.
  5. The differences that high bit rates make (improvement or detriment) are subtle and most people don’t have good enough equipment or recordings to hear the differences.

Parting Words

Engineers may want to record at higher sampling rates with more bit depth to give headroom for setting levels and other processing. But their final result can virtually always be transformed to 44-16 without any audible compromises (distortion, compression, or loss of information). Yet in some areas, 44-16 while sufficient, is barely sufficient, which means it requires careful well engineered re-sampling, anti-aliasing filters, noise-shaped dither, etc.

High bit rate recordings, when done carefully, can offer slightly better transient response for certain types of music. But to the extent they actually do achieve this by accurately capturing higher frequencies that improve transient response (which is rare), this HF content is a double-edged sword that brings the risk of higher IMD. Of course, high quality well-engineered audio gear (DAC, amp, speakers, etc.) mitigates this risk.

Some practical guidelines:

  • If the original recording was made in the 1980s or earlier, there is no point to high bit rates. Ultra high frequencies are already non-existent or rolled off, transient response is already imperfect, dynamic range is already limited. Here, the 44-16 standard is higher fidelity than the original.
  • If it’s rock, pop, electronic, there’s probably no point to high bit rates. It’s already heavily processed and there is no absolute reference for what this kind of music is supposed to sound like. Classic rock/pop albums get re-released every few years with different re-masterings that all sound different. One version may have better bass or smoother mids, but that is not a 44-16 limitation. Which release is “best” is not a limitation of digital bit rate, but only a matter of opinion.
  • If it is acoustic music recorded in natural spaces, a high bit rate recording may be useful, especially if the recording has very high frequencies (castanets, bagpipes, trumpets) or transient impulses. Even if the bit rate alone doesn’t help things, the entire recording is probably (though not always) made with more careful attention to detail and high engineering standards.

Overall, I don’t worry about it. The quality of a music recording depends far more on the mics used, their placement, the room it was recorded in, mixing and mastering, than it does on the bit rate. And 44-16 is either completely transparent, or so close to transparent that even on the highest quality equipment with the most discerning listener, limitations in other areas of the recording process make the differences moot.

Back to the HD-580 – For a While

My Audeze LCD-2 fell off my desk at work and got pranged so they’re going back to Audeze for repair and, incidentally, upgrade to the 2016 drivers. My home pair has these drivers and they are a subtle improvement over the 2014.

In the meantime, I’m listening to my trusty old HD-580s. Original 18 year old drivers, though I’ve replaced the headband and ear pads, and the cable, a few times over the years. They’re clean and play, fit and look like new.

First impression: these HD-580s are nice headphones! Smooth mids, nice timbres, well balanced. They really were the very first audiophile headphone, SOTA for 1999, a whole different league apart from Grados and the like. But compared to the Audeze:

  • The low bass is rolled off
  • The bass is not as tight
  • The mids are a tad boxy, not as open sounding
  • The high treble is rolled off

Overall, they sound a tad muffled and slow compared to the LCD-2. Conversely, the LCD-2 has:

  • Wider bandwidth: deeper bass, higher treble
  • Better detail & articulation throughout the range
  • More natural, realistic voicing

A gentle parametric EQ helps widen the HD-580’s apparent bandwidth:

  • +3 dB @ 25 Hz, Q=0.67
  • +3 dB @ 14 kHz, Q=1.5
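
For anyone who wants to try these two bands in software, here is a minimal sketch of a peaking filter. It rests on several assumptions that are not from my setup: the gains are taken to be in dB, the sample rate is 44.1 kHz, and Q follows the widely used Audio EQ Cookbook (Robert Bristow-Johnson) definition. Different equalizers define Q slightly differently, so treat the exact curve shape as approximate. The function names are only illustrative.

```python
import cmath
import math

FS = 44100.0  # assumed sample rate in Hz


def peaking_biquad(f0, gain_db, q, fs=FS):
    """Peaking-EQ biquad coefficients (Audio EQ Cookbook form), normalized so a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [x / a[0] for x in b], [x / a[0] for x in a]


def gain_db_at(b, a, f, fs=FS):
    """Magnitude response of one biquad at frequency f, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


# The two bands listed above: (center Hz, gain dB, Q)
bands = [(25.0, 3.0, 0.67), (14000.0, 3.0, 1.5)]
filters = [peaking_biquad(f0, g, q) for f0, g, q in bands]


def total_gain_db(f):
    """Combined response of both bands in series, in dB."""
    return sum(gain_db_at(b, a, f) for b, a in filters)


for f in (25.0, 100.0, 1000.0, 8000.0, 14000.0):
    print(f"{f:8.0f} Hz: {total_gain_db(f):+.2f} dB")
```

The printout should show roughly +3 dB at each center frequency and very little change through the midrange, which is the point of keeping the EQ gentle.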

I’m enjoying this trip down memory lane. I listened to these same HD-580s during most of the 10,000 hours I put into Octane Software back in the day. They sound nice, but I will be very happy to get my Audeze back.

Audio History

I have loved music and been fascinated with audio electronics since I was a little kid. Later I became interested in the physics of sound.

I bought my first audio component in the 1980s in college, a Harman Kardon integrated amplifier. It was simple and cheap, had no tuner, only 40 WPC output, but it did have a phono amp (MM only) and decent gain stage. To find good speakers, my friend Shawn and I visited the local audio store and listened to several different speakers (Klipsch, Polk, and a few others) with a variety of music. We both liked the Polk 10Bs best. They had the smoothest, least colored sound for my limited budget. My musical taste at the time was about half classical, half rock.

Back in those days digital audio and headphones were not an audiophile option. Good headphones simply didn’t exist and digital audio was so new, consumer CD players were expensive and tended to have poor reproduction of high frequencies and transient response. Because of this, there were no good cheap paths to high quality sound, like we have today.

I didn’t have a turntable; they were too expensive. But I did get a good CD player, an Onkyo DX-530, which was one of the first CD players to use oversampling. Oversampling improved the high frequency and transient response by allowing reconstruction filters with a more gradual slope.

This little system lasted me through college with many hours of satisfying listening. Then, my junior year in college, the local audio store went out of business and I got their used demo pair of Polk SDA-2 speakers. This was a big upgrade from the 10Bs, and the price was so good it was almost an even trade when I sold the 10Bs.

After graduating from college I was ready for a decent turntable. I visited the local audio store and auditioned a couple of different turntables & cartridges for several hours, picking a Thorens TD-318 MK II with an Ortofon MC-3 high output MC. That was in 1991. That HK integrated amp only had a low-gain MM phono amp, and my budget didn’t allow for a low output MC. The high output MC was a little on the bright side, but it had the smoothest, least colored sound compared to the MMs.

This little system lasted me for several years, until around 1995 I got a new job and promotion and my budget was ready for an upgrade. I auditioned a couple of different power amps and pre amps at the local audio store and ended up taking home an Adcom 5800 power amp with a Rotel RC-990BX pre amp, which had a dual-stage phono amp, so I could now try low output MC phono cartridges. And I had enough power to fully drive those Polk SDA-2 speakers.

At this time, digital was improving but to my ears, good vinyl still had more natural sounding high frequencies and transient response. But only good vinyl – like heavy 180-220 gram pressings, half-speed masters, etc. I started collecting MoFi half-speed masters, Cheskys, Audioquest, Telefunken, Wilson Audio, Classic, Water Lily, and other audiophile vinyl. I didn’t have the budget for much, so I carefully selected and treasured each new addition to the collection.

In the late 90s I replaced my Onkyo DX-530 with a Rega Planet CD player. I had read so many good things about it that I thought it must be great. I never really got into this CD player; I think the old Onkyo was actually better. The Rega had a distinct sound that grabbed one’s attention at first. But upon further listening it was, to my ears, congested, and the high frequencies were all wrong. I ended up selling the Rega about a year later. It was so popular, it was easy to sell. I replaced it with a Rotel RCD-1070. Nothing special, but a solid, well engineered, good sounding player.

Fast forward a few years to 2000, when I sold my first startup (Octane Software) and was ready for another audio upgrade. I already had reference quality amplification so this time it was the speakers. I visited the local audio store with my best albums and spent all day listening to every fine speaker system they had. I also did a bunch of research in audiophile channels. I ended up picking Magnepan 3.6/R speakers, as they had the most natural, linear, uncolored midrange and treble of any speaker I listened to. The Adcom 5800 had plenty of power with enough refined clarity to make these excellent speakers really sing.

About a year later I designed and built my own ladder stepped attenuator to replace the preamp. This added a level of clarity and transparency to the system — no active preamp is cleaner than a single metal film resistor in the signal path! And I learned a little about analog audio circuits, grounding and soldering. Now I didn’t have a phono amp anymore. I did a bunch of research and picked up a DACT CT100, which is an excellent reference quality flexible phono amp, but just a circuit card. I designed and built a power supply for it (dual 12V batteries), with a small chassis, cabling & grounding & connectors. I was delighted with the sound, a noticeable upgrade from the Rotel pre amp’s phono amp, which was quite good to begin with.

This new level of transparency revealed the limitations of the Rotel CD player so I looked for alternatives, knowing that DACs were constantly improving. I ended up with another Onkyo, a DX-7555. It had a more refined sound with more natural midrange voicing.

After we moved from Orcas Island to Seattle my listening room changed. I used test tones, microphones and measurements to tune my new audio room. I built floor-to-ceiling height 22″ diameter tube traps for the rear corners, RPG acoustic foam 4 layers thick strategically located on the wall behind the listener, careful room and speaker arrangement, and ended up with a great sounding room that was within 4 dB of flat from 40 Hz to 20 kHz. It wasn’t perfect though. There was a small rise in the mids around 1 kHz, likely inherent to the Mag 3.6 speakers, and the lowest bass octave was from 6 to 12 dB down. Notwithstanding these limitations, it was a great sounding room.

I kept this system for about 10 years, from 2005 to around 2015. Then I replaced the ladder stepped attenuator with an Oppo HA-1 DAC, using the digital outputs from my source components. And I got a Behringer DEQ 2496 and used its pure digital parametric EQ to tame the 1 kHz bump and lift the bottom bass octave. This put the in-room system response within 3 dB of flat from 30 Hz to 20 kHz, which is comparable to a good recording studio. The sound is fantastically natural: detailed yet smooth and not bright, bass is deep, yet controlled and fast, natural voicing through the mids with seamless transition to high frequencies.

Finally, in Jan 2018 I sold my turntable, vinyl, and related analog equipment. I just wasn’t using it anymore, since I had all those recordings on digital. And the sound quality of digital had improved so much that, while great LPs do sound great, I no longer felt that they sounded any better than great digital.

Mike’s Best Vinyl LP Records

UPDATE: Mar 2018: These are all sold!

As I’m liquidating my vinyl and playback equipment, I’ve sorted through all my LPs and found about 100 of them to be half-speed masters, heavy vinyl, 45 RPM single sided, Japanese Press, Mobile Fidelity, Chesky, Wilson Audio, Telefunken limited edition pressings, or other such. Many are out of print, all are in mint condition – no scratches, cleaned with the Nitty Gritty 2.5FI, played only on properly aligned high end equipment.

I’ve got a few hundred more LPs not shown in this list, many of which are nice, but they’re standard quality. I’ll probably sell them in bulk for $1 each somewhere.

Here’s the list of my best LPs. Items already sold are highlighted in RED: lpListHighQuality-1712

Vinyl LP Cleaning Solution Recipe

I covered this topic about 10 years ago, offering a recipe for fluid to clean vinyl LPs. I still use that recipe in my Nitty Gritty; here’s a summary and a few more tips.

It has 3 ingredients, one of which is optional:

  • Distilled Water
  • Isopropyl Alcohol
  • Wetting Agent (optional)

Most wetting agents are soaps which contain fragrances and other non-essential ingredients that you don’t want polluting your record cleaning fluid. I’ve stopped using the wetting agent and it still works just fine. If you use a wetting agent, all it takes is a couple of drops for a small batch.

Alcohol is a solvent that may degrade the seals of record cleaning machines. To avoid damaging the machine, keep the alcohol below 20%. That seems to be a conservatively safe level, and it doesn’t take much alcohol to do the job so adding more won’t necessarily get records any cleaner.

Two kinds of isopropyl alcohol are commonly available: 70% and 91%.

  • Recommended: Conservative formula (< 20% alcohol)
    • With 70%: 1 part alcohol to 3 parts water = 17.5% alcohol
    • With 91%: 1 part alcohol to 4 parts water = 18.2% alcohol
  • Aggressive formula (< 25% alcohol)
    • With 70%: 1 part alcohol to 2 parts water = 23.3% alcohol
    • With 91%: 1 part alcohol to 3 parts water = 22.8% alcohol
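
The percentages above are simple volume ratios, and they are easy to check or adapt to other mixes. Here is a small sketch; the function name is just for illustration, and it ignores the slight volume shrinkage that happens when alcohol and water are mixed, so real-world figures will differ a little.

```python
def diluted_strength(stock_percent, alcohol_parts, water_parts):
    """Approximate percent alcohol by volume after mixing stock alcohol with water."""
    total_parts = alcohol_parts + water_parts
    return stock_percent * alcohol_parts / total_parts


# (label, stock strength in %, parts alcohol, parts water)
mixes = [
    ("conservative, 70% stock", 70, 1, 3),
    ("conservative, 91% stock", 91, 1, 4),
    ("aggressive, 70% stock", 70, 1, 2),
    ("aggressive, 91% stock", 91, 1, 3),
]

for label, stock, alcohol, water in mixes:
    strength = diluted_strength(stock, alcohol, water)
    print(f"{label}: 1 to {water} gives {strength:.2f}% alcohol")
```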

As for cost (as of Jan 2018):

You can buy 91% isopropyl for about $3.50 per quart, and distilled water for about $1 per gallon. One quart of alcohol mixed with one gallon of water (the 1-to-4 conservative formula) makes 1.25 gallons of fluid for about $4.50. Nitty Gritty charges about $80 for 1 gallon of their solution, which is for all practical purposes the same thing.

Audio: Balanced and Unbalanced

The term “balanced” is somewhat ambiguous when it comes to audio. In audio circles it usually means differential signalling, and that’s the sense I describe here.

Below is what a standard unbalanced audio signal looks like. The Y axis is volts, the X axis is time. The red line is the + signal, the black horizontal line is the – signal. The + signal carries the music, the – signal is ground. This is sometimes called “single-ended” because only one wire carries the musical signal.

[Figure: unbalanced audio signal]

Below is what the same audio signal looks like when balanced (signalled differentially). The red line is the + signal, the blue line is the – signal. Here, neither wire carries ground. Each wire carries the same signal, but with reversed polarity (inverted phase). The difference between them is a signal having twice the amplitude. At every instant in time, the voltage sum of the + and – wires is zero, so the overall cable (containing both + and – wires insulated from each other) has a net field of zero and radiates very little. And because interference picked up along the cable lands almost equally on both wires, it cancels when the receiver takes the difference between them, which makes the connection highly resistant to interference.

[Figure: balanced audio signal]

This gives balanced signals 2 advantages: an S/N ratio up to 6 dB higher (twice the voltage = 6 dB), and rejection of interference.
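
Both advantages can be seen in a few lines of arithmetic. The sketch below is an idealized model, not a simulation of any real gear: it assumes the interference lands identically on both wires and that the receiver subtracts perfectly, whereas real cables and receivers only approximate this. The signal, noise level and variable names are made up for the demonstration.

```python
import math
import random

random.seed(0)  # repeatable "interference" for the demo

n = 1000
signal = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]  # the music
noise = [random.uniform(-0.5, 0.5) for _ in range(n)]           # interference picked up by the cable

# Unbalanced: one wire carries signal plus interference, referenced to ground.
unbalanced = [s + e for s, e in zip(signal, noise)]
unbalanced_error = max(abs(u - s) for u, s in zip(unbalanced, signal))

# Balanced: the + wire carries the signal, the - wire carries the inverted signal.
# The same interference is assumed to appear on both wires.
plus_wire = [s + e for s, e in zip(signal, noise)]
minus_wire = [-s + e for s, e in zip(signal, noise)]

# The receiver takes the difference between the two wires.
received = [p - m for p, m in zip(plus_wire, minus_wire)]
balanced_error = max(abs(r - 2 * s) for r, s in zip(received, signal))

gain_db = 20 * math.log10(2)  # doubling the voltage, expressed in dB

print(f"unbalanced: worst-case interference left in the signal = {unbalanced_error:.3f}")
print(f"balanced:   worst-case deviation from 2x the signal    = {balanced_error:.2e}")
print(f"doubling the voltage = {gain_db:.2f} dB")
```

In this ideal case the interference disappears completely from the balanced result and the signal comes out at twice the amplitude. Real equipment rejects most, not all, of the interference.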

Balanced audio was designed for microphones, which have low level signals carried on long wires. In this application, noise isolation is important and you need all the S/N you can get. Consumer audio analog line levels are in the range of 1-2 Volts, about 1,000 times or 60 dB stronger than microphones. And cable runs tend to be shorter.

Thus, balanced audio doesn’t make much if any difference in consumer audio applications. It’s a superior engineering design, but it doesn’t necessarily make any audible difference, especially in top notch gear that already has S/N ratios over 100 dB. It’s nice to have, but I would not pay extra, or choose one piece of equipment over another, for this feature alone. Sound quality comes first; balanced vs. single ended is a secondary concern. Some single ended amps are better than some balanced amps.