
DAC, Preamp, Headphone Amp: Corda Soul and Oppo HA-1 (5 of 8)

This is part 5 of an 8 part series comparing the Meier Corda Soul and Oppo HA-1. Click here for the introduction.

Wed 12/26; LCD-2 headphones; direct, no EQ

  • New config for faster switching
    • Oppo BDP-83 coax output to HA-1
    • Oppo BDP-83 toslink output to Soul
    • Or reverse of the above; coax and toslink output levels match
    • Level matched using white noise & SPL meter (as before) to < ½ dB
    • Simply replug the headphones back & forth, nothing else
    • Both amps continually playing the same signal
  • Many of the above tracks played repeatedly… also
  • Bruce Katz; Three Feet off the Ground: an excellent Bernie Grundman master
  • Clementi; Demidenko; Helios
  • Doug MacLeod; Brand New Eyes; One Eyed Owl
    • This is a superbly recorded track: deep, tight bass, light fast transients and near-perfect natural vocal reproduction
    • Oppo & Soul almost the same, but the Soul had slightly deader space between the notes, tighter bass
  • Michael Hedges; Aerial Boundaries: fast transients with extreme HF
    • Soul & Oppo: equal speed, crisp transients
  • Tuck & Patty; Love Warriors; Little Wing
    • This is a nice recording, uncompressed and natural sounding
    • Soul & Oppo: sound the same, bass & voice have same timbre, bass plucks are equally fast & light
  • Julian Bliss Quartet; Hyperion:
    • Almost the same
    • Oppo slightly more air, Soul a touch more mid bass
  • Gillian Welch; Harrow & the Harvest: compressed but very detailed with subtle timbres
    • Both Oppo & Soul capture the very delicate shades of timbre in the voices, the guitar work and micro-detail of breathing & movement
  • Ronnie Earl; Maxwell Street: crunchy & compressed, how well do they portray a bad recording?
    • Oppo & Soul sound the same.
  • Dream Theater; Systematic Chaos: dynamically compressed but otherwise clear with full, wide bandwidth: how well do they rock out?
    • The Oppo has slightly more air, but the difference is so small I can’t be sure
    • Otherwise both sound the same: the bass hits down to 20 Hz, the midrange tonality, the layers of background detail, all identical.
  • Also played several tracks from Steven Wilson’s Yes re-mix
  • These are so similar that even for a picky detail-oriented guy like me, even if I could tell them apart in a blind test (not sure I could), I could love either one.
  • This is beyond splitting hairs. That said…
    • The Soul seems a bit more tight, pure, punchy
    • The Oppo seems to have more depth & breadth

Wed 12/26; speakers, direct, no EQ

  • New config for more fair comparison
    • Both players running as above (Oppo from coax, Soul from toslink, or vice versa)
    • Both preamps running in balanced mode (no more unbalanced Oppo output)
    • Swap the balanced XLR outputs to the power amp
    • Balanced cables = quiet hot swap, no need to power off amplifier
    • This swap is about as fast as before
  • Several of the above tracks played again, plus:
  • Tabula Rasa; Fleck, Bhatt, Chen; 88/24
  • Bourbon & Rosewater; Meyer, Bhatt; 88/24
  • The Oppo’s balanced output is a slight improvement; a bit of the veil is lifted, the bass tightens up a smidge and it’s dynamically punchier.
  • Some of the differences I was hearing were limitations of the Oppo’s unbalanced line out.
    • As mentioned earlier, the Oppo’s primary signal path is internally balanced; the unbalanced inputs and outputs have an additional conversion
  • The Soul still sounds slightly different from the Oppo; it’s more pure and tight where the Oppo gives the impression of breadth & depth.
  • But much like the headphone observations above, the Oppo’s balanced output shrinks these differences.
  • NOTE: from this point forward, all speaker comparisons were done in this way using exclusively balanced outputs from both devices.

Wed 12/26; LCD-2 with EQ

  • Now that I know what the Soul & Oppo sound like, how they’re different, it’s time to listen for enjoyment across a variety of music and see which I want to live with.
  • Listened to the first 2 albums of the Steven Wilson Yes remix
    • Not exactly audiophile material, the original recordings are limited and flawed.
    • But it sounds way better than the originals, which I could never listen to because they gave me a headache.
    • This good music deserved a better recording, and now it has one.
  • Used the Soul’s first notch of crossfeed to fix some of the absolute hard L-R separation. Very nice, a subtle effect that doesn’t eliminate it but makes it less annoying.
  • NOTE: this crossfeed seems more transparent than the one on my Jazz amp. It does the same thing, but with less impact to tone and resolution.

Next, subjective listening notes part 6 (day 5)

DAC, Preamp, Headphone Amp: Corda Soul and Oppo HA-1 (4 of 8)

This is part 4 of an 8 part series comparing the Meier Corda Soul and Oppo HA-1. Click here for the introduction.

Subjective Listening Comparison, continued…

Tue 12/25; speakers; direct, no EQ

  • Beethoven; Piano Sonatas Op 13; Appassionata; Brendel; Decca
    • Soul: purity, timbral accuracy
    • Oppo: +HF, more earthy, dirty tone
  • Chopin Op 58; Mapleshade, Gampel (Fazioli piano): this is a good but slightly flawed recording of a huge “in your face” sounding piano with a bit too much midrange presence.
    • Soul: bright, slight edge on HF, some distortion in upper RH dynamic peaks (sounds like analog tape overload)
    • Oppo: virtually indistinguishable
  • Schubert; flute/piano songs D911; Naxos/Grodd: one of few truly excellent Naxos recordings
    • Soul: slightly rounder flute tone; less air, but extreme HF information is there (lip/air overtones & light whistletones)
    • Oppo: a touch more air
  • Doppler; Andante & Rondo; Rampal, Arimany, Ritter; Delos
    • both excellent: tone, dynamics, voicing, virtually identical

Tue 12/25; headphones LCD-2; with EQ (+3 dB @ 4500 Hz, Q=0.67)

  • Volume test
    • Soul click 31 / 12:00 noon → -7.5 dB on Oppo
    • set to -8 dB by ear, fine tuned to -7.5 dB with SPL meter and white noise
  • Doppler (from above): indistinguishable
  • Taheke track 13 (from above): indistinguishable
  • Chieftains 7: indistinguishable
  • Notes
    • Are the Magnepans more revealing than the headphones? Probably.
    • Is the Oppo’s headphone amp slightly different sounding than its line level outputs? Certainly.
    • Either way, the Soul & Oppo seem indistinguishable on headphones. Both are excellent!
  • Before today, I’d say the HA-1 is the best sounding headphone amp I’ve ever heard; the Soul is its equal.

Tue 12/25; speakers; direct, no EQ

  • Brahms piano quartet; Belcea 96/24: despite its high bit rate, this recording is imperfect with slightly edgy voicing and the piano sounds distant.
    • Soul: purifies the tone, smooths the edge
    • Oppo: detailed yet slight grain/edge
  • Krall; Quiet Nights; 96/24; track 4: this recording is good overall but like most of her albums, it adds an edgy presence to Krall’s voice
    • Soul: smooths the vocal edge, but all the highs, cymbal brush, still there. Three dimensional imaging.
    • Oppo: voice is just a bit over the top with edge, HF more present but dirtier. Image has depth but is a touch less three dimensional than the Soul.
    • Which is more true to the original master is unknown, but the Soul sounds cleaner.
  • Krall; Girl in the Other Room; Temptation: another good but edgy Krall recording
    • Soul: surprisingly, not apparently smoother or more pure sounding
    • Oppo: bass during solo has slightly greater perceived depth
  • Mokave; first album; Audioquest; tracks 1, 3, 5: this is a near-perfect recording!
    • Soul: rounder, more pure piano tone; smoother extreme HF transients, may be slightly rounded off. A touch more mid-bass, less bottom depth.
    • Oppo: extreme transients sharper, slightly accentuated.
  • Lily & the Rose; Binchois, Kirkman; 96/24; tracks 16-17: a superb recording from Hyperion
    • Soul: slightly more pure midrange voicing, a tiny tad less sibilant
  • Monteverdi; book 7; Naxos: one of few truly excellent Naxos recordings
    • Disc 1 track 2
      • Soul slightly more distinct and pure
      • Oppo more emphasis on harmonics / overtones
      • Differences very slight, nearly identical
    • Disc 2 track 3
      • Soul: a very thin slight veil lifted from the music
  • Schubert; Schiller-Lieder vols 3 & 4; Naxos: very good but too much midrange edge on the voices (why do mastering engineers feel this is necessary!?)
    • Soul: may be smoother but so slight I can’t be sure; virtually identical
    • Because this recording has a bit of edgy midrange presence, I expected the Soul to smooth it to a more natural presentation. To my surprise, it was virtually identical to the Oppo.

Next, subjective listening notes part 5 (day 4)

DAC, Preamp, Headphone Amp: Corda Soul and Oppo HA-1 (3 of 8)

This is part 3 of an 8 part series comparing the Meier Corda Soul and Oppo HA-1. Click here for the introduction.

Subjective Listening Notes

Before we dive into my notes, I must say that these are my personal subjective observations. I reliably detected these differences in level matched blind tests, so they are real. But I don’t claim they relate to advantages or flaws in any measurable engineering sense. I do my best to describe these differences as a neutral observer without judging which is “better” or “worse”.

For example, even terms like “pure” and “dirty” aren’t necessarily praise or criticism. “Pure” can be good, meaning free of distortion. Pure can be bad, meaning the reproduction of a natural sound is more pure than it sounds in reality (such as its complex timbre sounding filtered or simplified).

The Soul and Oppo are both high quality, well engineered DACs with no obvious measurable flaws; they both have a neutral solid state sound without obvious euphonics or colorations. So the differences are necessarily subtle. We’re splitting hairs here, but that’s what high end audio is all about!

Also: I mention the recordings used but I only rarely give CD catalog numbers. You can probably find the exact recordings from the descriptions, but if you can’t, contact me and I’ll be happy to provide them.

Sun 12/23; speakers; direct, no EQ

  • Level testing
    • Set by ear using music and white noise without emphasis (equal energy all freqs 20 Hz – 20 kHz).
    • Tested with SPL meter @ listening position: subjective level matching was about ½ dB off.
    • Soul 12:00 Yellow (click 31 from minimum) = Oppo -14.5 to -15.0 dB (unbalanced line output).
    • Soul clicks measured as SPL, average 0.5 dB per click around the center position
      • From 10:00 to 3:00 position, clicks 19 through 51.
      • Different from manual, which says 0.8 dB per click.
      • Perhaps the manual averages all clicks, which gives bigger number because the first few clicks are bigger jumps.
  • Brahms Clarinet Trio; Ax, Ma, Stoltzman; Sony; track 1:
    • The Soul resolves the instruments so you can hear slightly better what each is doing even during dynamic crescendos.
    • The clarinet and piano are voiced ever so slightly differently through the Soul, just a touch more pure to my ears.
  • The Elfin Knight; Frederiksen; track 3:
    • The Soul resolves the flute & string instruments slightly better especially when they’re in the background with other instruments playing.
  • Soul’s bottom octave (< 30 Hz) sounds weaker.
    • It’s there if you turn it up, but sounds attenuated relative to the Oppo at the same overall volume level.
    • This is perception, not measurement. Both Soul & Oppo have ruler flat frequency response, so this perception is probably related to something else going on in the sound.
  • Taheke; McGee/Krutzen; track 13: the harp’s lowest 25 Hz tones subtly push the air in the room from the Oppo, yet are less noticeable from the Soul.
  • Roots and Sprouts; Abou-Khalil; track 2: the double-bass solo is more audible from the Oppo, and the Soul portrays it with slightly less depth, but more subtle timbre.
  • Barley Moon; Ayreheart; 96/24; track 4: when the drum enters about 50 seconds into the track, it sounds slightly deeper and more compelling from the Oppo.
    • Soul’s mid-upper bass is slightly emphasized relative to the Oppo.
    • Soul’s midrange is slightly more pure than the Oppo.
  • Dowland First Booke of Songes; Grace Davidson, David Miller; Hyperion 96/24; all tracks: at first the Soul and Oppo sound identical, but deeper listening reveals that Soul renders Davidson’s voice as ever so slightly more pure, with sibilants just a hint softer.
  • Soul’s extreme HF (> 10 kHz) is slightly less than the Oppo.
    • ?? initial impression, more listening to confirm
    • Is it possible that this relative attenuation (however slight) contributes to the observed midrange purity?

Mon 12/24; speakers; direct, no EQ

  • Soul’s extreme HF (> 10 kHz) is slightly less than the Oppo.
    • Confirmed. The Soul isn’t lacking these frequencies, but they sound very slightly attenuated compared to the Oppo. Which sounds “best” depends on the recording.
  • Tarab; Abou-Khalil; tracks 2-4: the top overtones of the instruments are slightly more evident with the Oppo. The difference in tonality is so subtle, it’s like the difference in live listening just a few feet further away.
  • Eeg & Fonnesbaek; tracks 1, 3, 6: this great recording has a bit of edge on Eeg’s voice. This is more apparent on the Oppo than the Soul. The Soul sounds slightly more natural, yet still with more “edge” than reality. Which is more true to the slightly edgy original master is unknown.
  • Vivaldi Concerto for violin, flutes, oboes, bassoons; RV577; McGegan, Philharmonia Baroque; tracks 7-9: this fine recording is on the airy side of reality. The Oppo slightly accentuates this airiness while the Soul slightly de-emphasizes it. Unknown which is more true to original master but the Soul gives a more natural presentation for this excessively airy recording.
  • Soul’s bottom octave (< 30 Hz) is weaker.
    • Eeg & Fonnesbaek; tracks 1, 3, 6: the bass on the Soul sounds a tad tighter, perhaps just a hint more speed, grip & control, yet not quite as much depth and richness as the Oppo.
    • Saint-Saëns Symphony 3; Stern, Kansas City; tracks 4, 6: about 15 seconds into track 4 the organ hits a deep soft 20-30 Hz tone that pushes the air in the room. Both Soul & Oppo portray this, but the Oppo has a touch more depth and energy.
  • Soul’s midrange is ever so slightly more pure than the Oppo.
  • The Elfin Knight; Frederiksen; several tracks: the Soul has a touch more midrange purity. This could be related to its relatively attenuated HF, but the impression is that it is slightly more damped, as if the brief pauses of silence in the music are quieter.
    • Note: usually, a perceived attenuation of HF (however slight) relates to less clarity, not more. The Soul’s character is enigmatic.
  • After a few hours of the above, listener fatigue set in… resume later

Same Day, Hours later…

  • Measured FR to see if it shows any hint of my above observations–probably not, but why not check?
    • Recorded warble tones from Stereophile test CD #2, analog line-level balanced XLR outputs of each device (Soul, Oppo) to Tascam recorder.
    • Frequency response matches within 0.1 dB from 20 Hz to 20 kHz (1/3 octave spacing) except at extremes
      • Matched levels at 1 kHz.
      • 20 Hz: Soul is +0.05dB relative to Oppo (-1.35 vs. -1.4 recorded on Tascam)
      • 20 kHz: Soul is -0.25dB relative to Oppo (-1.35 vs. -1.1 recorded on Tascam)
    • These differences should be inaudible
    • NOTE: these levels are relative to each other, not absolute (the Tascam doesn’t have perfectly flat response).
    • HD unmeasurable; both below -90 dB
  • As expected.
  • The above subjective listening impressions are subtle.
    • Subtle changes near the threshold of hearing can be perceived differently from what they actually are (slight difference in loudness perceived not as loudness but as sounding “fuller” etc.).
  • In light of this, how to explain the differences I’m hearing?
  • They’re not psychosomatic; I can differentiate them blind.
  • It sounds as if the Oppo has a slight touch of extra frequency content in the upper mids to treble, and a hint more low bass energy.
  • Sometimes it sounds like a touch of extra detail, other times it sounds like a touch of glare or grain; depending on the music.
  • Could it be a slight difference in frequency response? Could it be harmonic or intermodulation distortion?
  • Unlikely, the measurements are so similar.
  • But subjectively, that describes what it sounds like.
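
The relative figures quoted in the measurement notes above are plain differences between the levels recorded on the Tascam. As a minimal sketch of that arithmetic (the function names are illustrative, not from any tool used in the test), this also converts the largest difference to a linear amplitude ratio to show how small it is:

```python
def relative_db(soul_db: float, oppo_db: float) -> float:
    """Soul level relative to Oppo in dB (positive means the Soul is louder)."""
    return round(soul_db - oppo_db, 2)


def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude (voltage) ratio."""
    return 10 ** (db / 20)


# Recorded levels in dB (Soul, Oppo), taken from the notes above
readings = {
    "20 Hz": (-1.35, -1.40),
    "20 kHz": (-1.35, -1.10),
}

for band, (soul, oppo) in readings.items():
    diff = relative_db(soul, oppo)
    ratio = db_to_amplitude_ratio(diff)
    print(f"{band}: Soul is {diff:+.2f} dB relative to Oppo (amplitude ratio {ratio:.3f})")
```

The 20 kHz difference of a quarter dB works out to an amplitude difference of roughly 3%, which is consistent with the note that it should be inaudible.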

Next, subjective listening notes part 4 (day 3)

DAC, Preamp, Headphone Amp: Corda Soul and Oppo HA-1 (2 of 8)

This is part 2 of an 8 part series comparing the Meier Corda Soul and Oppo HA-1. Click here for the introduction.

Before diving into the listening sessions, let me summarize a few things:

Overview

  • Both have DAC, preamp and headphone amp.
  • Both operate natively in balanced differential mode.
    • Technically speaking, the terms “balanced” and “differential” are two different things–often, but not always, used together.
    • Here, I use the word “balanced” to mean both, as is commonly done in audio circles.
  • Both have digital inputs (toslink, coax, USB) and analog outputs.
  • Both are well engineered and built.

Functional Differences: Summary

  • The Soul has DSP features; Oppo doesn’t.
  • The Oppo has additional inputs and outputs that the Soul doesn’t have.

Functional Differences: Details

  • Oppo also has unbalanced inputs and outputs (line level & headphone); Soul doesn’t.
    • The Oppo’s internal signal path is balanced.
    • Its unbalanced inputs and outputs go through an extra conversion.
    • This is completely internal and automatic: just plug it in.
  • Soul has multiple digital inputs (3 toslink, 3 coax), Oppo has only 1 each.
  • Soul has DSP: selectable DA reconstruction filter, L-R balance, EQ, channel mixing. Oppo doesn’t.
  • Soul has digital output (to use its DSP with another DA converter), Oppo doesn’t (it doesn’t have any DSP effects to do this with).
  • Soul has a high (120 Ohm) impedance headphone output — in addition to a standard low (< 1 Ohm) impedance output. Jan describes the reason here. Summary:
    • The low Z output is normally used with most headphones, especially high impedance and planar magnetics.
    • The high Z output can dampen oscillation (e.g. tame a “hot” response) for certain headphones having low impedance.
  • Soul has a ground lift switch. You shouldn’t need it but it’s nice to have. Oppo doesn’t have one.
  • Soul has switchable high/low gain for its analog input. Oppo has switchable high/low gain for its headphone output, which applies in the final analog stage to all inputs both analog & digital.
  • Soul is custom built gear and Jan will make whatever individual adjustments you want to your unit: changing the analog gain, custom DSP, whatever. Just ask him!
  • Oppo has Bluetooth input, Soul doesn’t.
  • Oppo has AES/EBU digital input, Soul doesn’t.
  • Oppo has mobile USB input (Apple only), Soul doesn’t.
  • Oppo USB input accepts PCM and DSD, Soul is PCM only.
    • The Soul handles the following sampling freqs: 32, 44.1, 48, 88.2, 96, 192.
    • Oppo handles a few sampling frequencies that Soul doesn’t: 176.4, 352.8 and 384.
    • These aren’t used much, but if you have a source using them, Jan recommends resampling them to a frequency the Soul handles. This can be done in software on a PC. Just make sure you use good software that converts properly in 24 bit or greater with frequency-shaped dither.
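
For the sampling frequencies the Soul does not accept, a resampling target has to be picked. The sketch below is one possible rule, and it is an assumption rather than Jan's recommendation: choose the highest Soul-supported rate that divides the source rate evenly, so the conversion is a simple integer-ratio downsample. The function name is hypothetical.

```python
# Sampling rates (Hz) the Soul accepts, per the list above
SOUL_RATES = (32000, 44100, 48000, 88200, 96000, 192000)


def resample_target(source_rate: int) -> int:
    """Pick a Soul-compatible rate for a given source rate.

    Assumption (not from the source): prefer the highest supported rate
    that divides the source rate evenly.
    """
    if source_rate in SOUL_RATES:
        return source_rate  # already supported, no resampling needed
    candidates = [r for r in SOUL_RATES if r < source_rate and source_rate % r == 0]
    if not candidates:
        raise ValueError(f"no integer-ratio target for {source_rate} Hz")
    return max(candidates)


for rate in (176400, 352800, 384000):
    print(rate, "->", resample_target(rate))
```

Under this rule 176.4 and 352.8 kHz material would go to 88.2 kHz, and 384 kHz material to 192 kHz. The quality of the result still depends on the resampling software, as noted above.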

Functional Differences I care about

  • Unbalanced analog inputs and outputs are nice to have (though not essential).
  • An unbalanced headphone output is nice to have (though not essential).
  • Channel mixing crossfeed is nice to have when headphone listening to recordings having absolute L-R separation (though not essential).
    • Because this feature slightly changes tonal balance and resolution, I use it only when hard channel separation is annoying.

Other Equipment

  • Disc player: Oppo BDP-83 playing CDs and DVD-Audio, using Toslink and Coax PCM output direct to preamp (Soul or Oppo). Varying bit rates from 44-16 to 192-24.
  • Headphones
    • Audeze LCD-2 Fazor, version 2016 upgraded drivers
    • Sennheiser HD-580 with fresh ear & headband pads
  • Speakers
    • Adcom 5800 amp
    • Magnepan 3.6/R speakers
    • Tuned listening room (floor to ceiling tube traps, acoustic foam, etc.)
  • EQ: Behringer DEQ-2496
    • Not always used; details below

The equipment with electrons flowing through it.

The Oppo looking all shiny & black (hard to believe it’s had 4 years of regular duty).

The Adcom 5800 pushes the electrons through my Magnepans. Hard to believe it’s over 25 years old, still going strong (I have it tested every few years).

It’s amazing how 21″ diameter floor-to-ceiling tube traps clean up the bass response (yes they’re home built)! The dark stuff on the wall is 4-layers thick of RPG acoustic foam strategically located to clean up the midrange response.

These Magnepan 3.6/R have given me over 15 years of musical enjoyment. Being dipoles, they are very sensitive to room setup, but when set up right they are downright magical. With the room treatment and positional setup, at the listener position they measure within 3 dB of flat from 30 Hz to 20 kHz. I love their midrange voicing, so natural and free of resonances; with extended and detailed yet silky smooth treble, distortion lower than most headphones, and bass having the tautness, control and timbral accuracy that is unique to planar magnetics.

Listening Configuration

  • All Soul DSP features disabled and standard linear phase sinc(t) AA filter used, except where noted.
    • I normally use a Behringer DEQ-2496 for mild parametric EQ to correct headphone & speaker room response.
    • When using LCD2, I listened both with, and without, EQ. The mild EQ I use (+3 dB @ 4500 Hz, Q=0.67) partially corrects the LCD2 response dip and makes it more neutral and resolving.
    • When using speakers, I disabled EQ. The room treatments give good clean response making the speaker EQ mild and unnecessary for critical listening comparisons.
  • Both Oppo & Soul left ON all week to ensure they were fully warmed up and stabilized.
  • The Adcom 5800 powered off at night (it draws 250 W idle), but on for at least 30 minutes before each listening session–long enough for the fans to be running.
  • Headphones: the Soul’s low Z output; the Oppo’s balanced output.
  • Speakers: the Soul’s XLR output to Adcom 5800; the Oppo’s unbalanced output to Adcom 5800.
    • This slightly favors the Soul, because the Oppo is internally balanced so the unbalanced output goes through an additional conversion. Its balanced output has slightly better specs than unbalanced.
    • While imperfect, this allows faster switching (no need to plug/unplug analog cables).
    • I figured it was probably fair enough because the cables are short (1 meter), high quality (Blue Jeans Cable), and both the Adcom and Oppo have excellent measurements for both inputs, single-ended and balanced.
    • I changed this later (described in notes) and found the Oppo’s balanced outputs sound slightly better.
  • Level matching
    • All comparisons level matched within ½ dB.
    • White noise, equal energy all frequencies, used for level matching.
    • Matching done subjectively by ear, then confirmed and fine tuned with an SPL meter.
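
To give a feel for how mild the LCD2 correction (+3 dB @ 4500 Hz, Q=0.67) is, here is a minimal sketch of a peaking filter with those settings. It uses the widely published "Audio EQ Cookbook" biquad formulas; the 48 kHz sampling rate is an assumption for illustration, and the Behringer's internal implementation and Q definition may well differ.

```python
import cmath
import math


def peaking_biquad(f0: float, gain_db: float, q: float, fs: float):
    """Peaking-EQ biquad coefficients (Audio EQ Cookbook form)."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp)
    a = (1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp)
    return b, a


def gain_db_at(freq: float, b, a, fs: float) -> float:
    """Magnitude response of the biquad at freq, in dB."""
    z1 = cmath.exp(-2j * math.pi * freq / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))


FS = 48000.0  # assumed sampling rate, for illustration only
b, a = peaking_biquad(4500.0, 3.0, 0.67, FS)
for f in (20, 1000, 4500, 10000, 20000):
    print(f"{f:>6} Hz: {gain_db_at(f, b, a, FS):+.2f} dB")
```

With a Q this low the boost is broad: it peaks at 3 dB at 4500 Hz and tapers gently on either side, leaving the bass essentially untouched.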

Observed Soul notes not primarily listening related

  • Soul occasionally emits a “click” to the analog outputs (speakers or headphones). Not a huge “ogre slurping breakfast” click that could damage speakers, just a light to medium volume audible click.
    • After no music input for a few seconds.
    • Occasionally when starting to play a new disc.
    • Occasionally when hitting play after the disc was stopped for a while.
    • Seems to be a minor bug in the Soul firmware/software.
    • NOTE: if it implements a volume fade-in to avoid the click, it would have to be very fast (say, 30 ms) to avoid clipping some tracks that start immediately.
  • Soul’s volume knob
    • A better design
      • No potentiometers in signal path
      • Changes gain rather than attenuating fixed gain
    • But those relays are physically loud! Not in the output signal, but a mechanical clicking noise in the room.
      • Are the relays this loud on the production version?
      • The volume control relays on my Corda Jazz are much quieter.
      • Jan says: production unit has same relays, but the box is more solid and damps the sound
    • How long do these relays last (relay life in terms of MTBF/MCBF)?
      • Jan says: veeery long, been using them for years and yet to replace one.
    • Output profile
      • About 0.5 dB / click (from 10:00 to 3:00)
      • About -15 dB from full scale at 12:00 (click 31)
      • Larger steps per click for the first 15 or so clicks.
      • Not remote controllable; confirm that the production version is?
        • Jan says: confirmed
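
The output profile above can be turned into a rough lookup for level matching. This is a sketch based only on my measurements of the prototype (about 0.5 dB per click between clicks 19 and 51, about -15 dB at click 31); it is a linear approximation, not a specification, and the function name is made up for illustration.

```python
def soul_level_db(click: int) -> float:
    """Approximate Soul output level (dB relative to full scale) at a volume click.

    Linear approximation from the prototype measurements: about -15 dB at
    click 31 and about 0.5 dB per click. Only valid for clicks 19 through 51;
    the first clicks are larger steps and are not modeled here.
    """
    if not 19 <= click <= 51:
        raise ValueError("approximation only covers clicks 19 through 51")
    return -15.0 + 0.5 * (click - 31)


for click in (19, 31, 41, 51):
    print(f"click {click}: about {soul_level_db(click):+.1f} dB")
```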

NOTE: my setup is a bit unusual, in that my speakers in combination with the carefully tuned listening room are higher resolution than most headphones. Normally, good headphones are more resolving than good speakers. So my observations and conclusions may also be a bit unusual.

Next: subjective listening notes, part 3 (days 1 & 2)

DAC, Preamp, Headphone Amp: Corda Soul and Oppo HA-1 (part 1 of 8)

This is part 1 of an 8 part series comparing the Meier Corda Soul and Oppo HA-1.

Click here if you want to cut to the chase and read the summary.

I’ve used an Oppo HA-1 as my DAC, preamp and headphone amp for nearly 4 years. The reason I still have it is because I enjoy listening to it so much. Yet I wonder whether better sound could be had. The Oppo is primarily designed as a headphone amp; its linestage, while very good, is a secondary feature. I listen on speakers at least as much as I do on headphones, and my speakers are more transparent than most headphones. Also its volume control is a potentiometer; a high quality Alps, but still not the ultimate in transparency and perfect channel balance that a well implemented stepped attenuator can provide.

My audio system at work uses a Meier Corda Jazz amp. I bought it over 4 years ago. It has a wonderful sound: detailed, smooth and sweet without euphonics. And some nice features, like a stepped attenuator volume control, selectable L-R cross-feed and balanced ground drive. It’s hard to find anything this nice for twice the price, and the build quality is great. So when I heard that Jan Meier recently built a SOTA DAC/headphone amp/preamp called the Soul, I was intrigued. The Soul has come out with some anticipation in the headphone audiophile community. I read that Jan brought the Soul to CanJam in Europe and it was judged “best in show”. Given some of the very nice (and expensive!) gear at CanJam, that says a lot. I also found a few people online who had auditioned it, nothing but rave reviews.

Jan built 2 prototypes of this device. The prototype resembles a science project but is solid, if not elegant, and electrically equivalent to the production version (which as of Dec 19, 2018 is yet to be released). If you contact Jan, you can arrange with him to borrow it for a listening session. When I did so, he told me he was running a pre-order to gauge interest in this device and estimate production volume (which influences the price). People who pre-order pay a deposit up front and get a lower price when the production version of the Soul is released. It turns out I contacted him just as this pre-order was ending. Jan has a generous policy of not charging to borrow the Soul prototype (though shipping it back to him cost about $65), and a 14-day return policy for the final product.

It may seem unfair to compare the Oppo with the Soul, as the latter will probably cost several times as much. But I’ve always believed, based on comparison with other headphone amps and DACs, that the Oppo punches well above its weight class when it comes to performance per dollar. Also, my goal was not a head-to-head comparison of these 2 preamps, but rather to find out whether the Soul would be a relative improvement. That is only a subtle distinction, but an important one.

Two weeks later, just before Christmas, the Soul arrived at my door having crossed an ocean, a continent, and customs inspectors. I carefully unpacked it, connected it to my system, centered/zeroed all the knobs and made a quick function test. Music! Success! I swallowed my anticipation and left both it and my Oppo HA-1 powered on overnight for listening sessions the next day.

Next: system summary and setup

Note: I joined the Soul pre-order just as it was ending. Owning Meier’s Jazz amp for the past few years told me he builds high quality products that last, and stands behind them with excellent support. If the Soul doesn’t work out for me and I decide to keep the Oppo HA-1, I can forfeit my deposit knowing it supports Jan’s efforts, or I can re-sell the Soul without loss, considering its retail price will be $1000 more than the preorder price. Either way seemed worth the risk if the Soul lives up to its promise.

DAC / DA Conversion / Linear vs Minimum Phase

Digital audio requires an anti-aliasing filter to suppress high frequencies (at or above Nyquist, or half the sampling frequency). Without this, an infinite number of different analog waves could pass through the digital sampling points. With this, there is only 1 unique analog wave that passes through them. The anti-aliasing filter is essential to ensure the analog wave that the DAC constructs from the bits is the same one that was recorded and encoded (assuming that the original analog mic feed was properly anti-alias filtered, preventing frequencies above Nyquist from leaking through).

Note: what happens if the filter is not used at all?

  • As I just mentioned, without a bandwidth limit, many different analog waves could be constructed from the same sampling points – which one is correct?
  • Without a bandwidth limit, the DAC will produce an analog wave with frequencies above Nyquist, which must be distortion, since they could not be in the analog wave that was encoded.

Pragmatically, one might ask what is the problem, since the difference is all in frequencies above Nyquist, which we can’t hear? The problem is aliasing. Passing these high frequencies will cause the D-A conversion process to misinterpret samples, creating an analog wave with spurious noise in the audible spectrum through a phenomenon known as aliasing. So you get distortion in the audible spectrum – not just at supersonic frequencies. Intuitively, this effect is similar to watching a wheel spin in a movie; it sometimes appears to spin backward when it’s really spinning forward, because the frame rate (typically 24 / second) captures it at just the right moments. The wheel is spinning faster than “Nyquist” for 24 frames per second, which is aliased into the illusion of motion in the opposite direction happening slower than 24 frames per second.
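
The wheel analogy can be made concrete with a few lines of arithmetic. This is a minimal sketch of how a frequency above Nyquist folds back when sampled; the function name and the example numbers (a pattern passing 23 times per second, a 30 kHz tone) are illustrative choices, not measurements.

```python
def signed_alias(f: float, fs: float) -> float:
    """Apparent frequency of a component at f when sampled at rate fs.

    The result lies in [-fs/2, fs/2); a negative value means the motion
    appears reversed, like the film wheel that seems to spin backward.
    """
    return ((f + fs / 2) % fs) - fs / 2


# Wheel pattern passing 23 times per second, filmed at 24 frames per second
print(signed_alias(23, 24))       # appears to turn backward, slowly

# 30 kHz tone sampled at 44.1 kHz folds into the audible band
print(abs(signed_alias(30000, 44100)))

# A tone below Nyquist is unaffected
print(signed_alias(1000, 44100))
```

The second case is the one that matters for audio: energy that started out ultrasonic ends up at 14.1 kHz, inside the audible spectrum.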

So the DAC definitely needs a low pass filter to suppress frequencies above Nyquist. The question is – what kind of filter?

Audiophiles debate whether linear or minimum phase anti-aliasing filters are ideal for sound reproduction and perception. Linear phase has the lowest overall distortion, but its symmetric response around transients (a bit of ripple just before and after a transient pulse), often called the Gibbs effect, means there is a “pre-echo” or “pre-ring”. In the diagram below, the red line is the signal and the black wave is the analog wave constructed from it using a linear phase filter.

If the X axis is t for time, this black curve is the function sinc(t). It is symmetric before and after the transient, which means it starts wiggling before the transient actually happens. This is unnatural; in the real world, all of the sound happens after the actual event. This pre-ringing is an artifact of linear phase anti-aliasing filters. Many audiophiles claim this is audible, smearing transients and adding “digital glare”.

Here’s what the audio books don’t always tell you. According to the Whittaker-Shannon interpolation formula, this sinc(t) response represents the “perfect” reconstruction of the bandwidth limited analog signal encoded by the sampling points. The pre-ring is very low level, and it rings at the Nyquist frequency (half the sampling frequency). That is at least 22,050 Hz (octaves higher if the digital signal is oversampled, as it virtually always is). This makes it unlikely for anyone to hear it even under ideal conditions of total silence followed by a sudden percussive SMACK.
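For readers who want to see the interpolation formula at work, here is a minimal sketch in plain Python. It assumes an idealized single-sample impulse and time measured in units of the sample period; the names sinc, reconstruct and impulse are illustrative only. It is the textbook formula, not the filter inside any particular DAC.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Band-limited wave through the samples, evaluated at time t (in sample periods)."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

# A single full-scale sample surrounded by silence.
impulse = [0.0] * 8 + [1.0] + [0.0] * 8

print(reconstruct(impulse, 8))     # 1.0 at the impulse itself
print(reconstruct(impulse, 6.5))   # about -0.212, ripple BEFORE the impulse
print(reconstruct(impulse, 9.5))   # about -0.212, the mirror-image ripple after it
```

The ripple on either side is identical, which is the symmetry described above. A lone full-scale sample is the extreme case; real music rarely contains anything like it.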

NOTE: I say “unlikely” not “impossible” because even though humans can’t hear 22 kHz (let alone frequencies octaves higher), it is at least feasible that somebody could still hear the difference. Under the right conditions, removing frequencies we can’t hear as pure tones causes audible changes to the wave in the time domain. That doesn’t make sense mathematically, but human perception of the frequency & time domains is non-linear and not as symmetric as Fourier transforms.

Some audiophiles suggest minimum phase filters as an alternative to solve this problem. But this cure may be worse than the disease. Minimum phase filters have an asymmetric response around transients with no pre-ringing. A picture is worth 1,000 words, so here’s what that same impulse looks like when a minimum phase filter is used.

You can see that the impulse strikes instantly without any pre-ringing. Well it actually rings louder and longer than the linear phase filter, but that ringing happens after the transient.

This has the added benefit that the ringing is masked by the sound itself for the simple reason that loud sounds psychoacoustically mask quiet ones. So what’s not to like here?

The problem is, minimum phase filters actually have more distortion (more ringing, more phase shift) than linear phase. So you get more distortion overall, but it’s time-delayed so you get cleaner initial transients with more distorted decay. And the phase shift caused by minimum phase filters happens all the time, not just in transients. So it seems you can have clean transients, or good phase response, but not both. Choose your poison.

At this point a purist audiophile might hang his head in sadness. But there’s a better solution to the digital bogeyman of pre-ring: oversampling (or higher sampling rates). The phase distortion and ringing of any filter is related to its slope, or the width of its transition band. Oversampling further increases the frequency of the pre-ring (which was already ultrasonic) and allows a shallower slope with a wider transition band, which reduces distortion.

For example consider CD, sampled at 44,100 Hz. Nyquist is 22,050 and some people can hear 20,000 so the transition band is from 20,000 to 22,050. That’s very narrow (only 0.14 octaves) and requires a steep filter with Gibbs effect pre-ring at 22,050 Hz. Oversample it 8x and Nyquist is now 176.4 kHz, so your transition band is now 20k to 176.4k, which is 3.14 octaves (actually, you’d use a lower cutoff frequency, but it’s still at least a good octave above 22,050 Hz). Absolutely inaudible; go ahead and use linear phase with no worries.
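The octave figures above are easy to check. This is a small sketch; the helper name octaves is made up for the example.

```python
import math

def octaves(f_low, f_high):
    """Width of the band from f_low to f_high, in octaves."""
    return math.log2(f_high / f_low)

cd_nyquist = 44_100 / 2                  # 22,050 Hz
oversampled_nyquist = 8 * 44_100 / 2     # 176,400 Hz

print(round(octaves(20_000, cd_nyquist), 2))           # 0.14
print(round(octaves(20_000, oversampled_nyquist), 2))  # 3.14
```

Each doubling of the sample rate adds exactly one octave to the transition band, which is why 8x oversampling adds three.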

In short, use higher sampling frequencies (or oversample) not because you need to capture higher frequencies, but because it gives you a more gradual anti-aliasing filter which means faster transient response without any time or phase distortion.

This idea is nothing new. Most D-A converters already oversample, and have been doing so for decades. The pre-ring or ripple of a well-engineered DAC is negligibly small, ultrasonic and inaudible. However, some people prefer minimum phase filters! How can we explain that? Minimum phase filters have no pre-ripple, yet they also have phase distortion, they ring louder and longer, and in some cases they allow higher frequencies to be aliased into the signal.

First, if this preference comes from a non-blind test, we can’t be sure they really heard any difference at all. Maybe they did, maybe they didn’t. A negative result from a blind test doesn’t mean they can’t hear a difference, it only means we can’t be sure they hear a difference.

Along these lines of non-blind testing, Keith Howard wrote a good one for Stereophile a few years ago: https://www.stereophile.com/reference/106ringing
I love their experimental attitude: test and discover! But when they talk about how hard it was to tell the filters apart, it is kinda funny thinking about a bunch of middle-aged guys wondering why they can’t hear an ultrasonic ripple well above the range of their hearing. Especially when most of them understand math & engineering well enough to know why.

Second, consider whether this preference comes from a blind test. Blind tests only reveal whether people can hear differences; they don’t qualify exactly what differences they were hearing. Perhaps people who prefer minimum phase filters are simply finding some of these distortions to be euphonic. This seems reasonable, given that preferences for vinyl records and tube amps are also common. However, it could also be that some DAC chips implement one filter better than the other.

This topic has been endlessly debated in audiophile circles for years. Here’s an article showing some actual measurements: http://archimago.blogspot.com/2013/06/measurements-digital-filters-and.html

A couple years later he followed up with a listening test: http://archimago.blogspot.com/2015/04/internet-blind-test-linear-vs-minimum.html

So what do I think about all this? Like the Stereophile reviewers, listening to music, I find it difficult to hear a difference between the “sharp” (linear phase) and “slow” (minimum phase) filters. Test signals highlight the differences (I can hear the difference clearly with a square wave) but I don’t enjoy listening to test signals, and since they’re not natural sounds, even if you can tell them apart there’s no reference for what they should sound like. I know the sharp filter is correct from a math & engineering perspective, as long as it is properly implemented. The sharp filter in my DAC is only -6 dB at Nyquist, so it might not be properly implemented, though its slope is very steep at that point, so it’s probably not leaking ultrasonic noise which can be aliased into the audible spectrum. Since there’s essentially no audible difference, I prefer the sharp filter in my DAC. I made some measurements and this seems justified from a technical perspective.

How Loud Does it Get?

Magnepan 3.6/R specs don’t give efficiency, but they give voltage sensitivity. That’s 86 dB @ 500 Hz @ 2.83 V. From this we can determine efficiency.  500 Hz is carried by the midrange panel which has 4.2 Ohm impedance, so 2.83 V drives 2.83/4.2 = 0.674 A of current, which makes 2.83 * 0.674 = 1.907 Watts.

So, 1.9 W of power makes 86 dB SPL at 1 meter. That’s lowish efficiency for a speaker.

The Adcom 5800 is rated at 400 W continuous in each channel with 2.1 dB of headroom. 400 W is 10 * log (400 / 1.9) = 23 dB louder than 1.9 W, which makes 86+23 = 109 dB SPL in each speaker. 2 speakers is twice the power which is +3 dB making 112 dB SPL from both speakers. Plus 2.1 for headroom makes 114 dB SPL peak.

At this power level, what is the voltage and current? We have 400 W into 4 ohms, which is 40 Volts and 10 Amps. That’s per channel.
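The arithmetic so far can be sketched in a few lines. The function names are made up for this example, and the +3 dB for two speakers assumes their outputs add as power (uncorrelated), as in the text.

```python
import math

def power_from_voltage(volts, ohms):
    """Power dissipated in a resistive load: P = V^2 / R."""
    return volts ** 2 / ohms

def spl_at_power(ref_spl_db, ref_watts, watts):
    """Scale a reference SPL by the power ratio, in dB."""
    return ref_spl_db + 10 * math.log10(watts / ref_watts)

ref_watts = power_from_voltage(2.83, 4.2)           # about 1.907 W
one_speaker = spl_at_power(86, ref_watts, 400)      # about 109.2 dB SPL
both_speakers = one_speaker + 10 * math.log10(2)    # about 112.2 dB SPL
peak = both_speakers + 2.1                          # about 114.3 dB SPL

volts = math.sqrt(400 * 4)                          # 40.0 V into 4 ohms
amps = volts / 4                                    # 10.0 A

print(f"{ref_watts:.3f} W, {one_speaker:.1f} dB, {both_speakers:.1f} dB, {peak:.1f} dB")
print(f"{volts:.0f} V, {amps:.0f} A per channel")
```

Carrying the unrounded 23.2 dB through gives results a few tenths of a dB above the rounded figures in the text, which makes no practical difference.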

I’m ignoring distance mainly because (A) dispersion is line source not spherical so it decays less with distance and (B) it’s in a room so some energy is not lost but reflected back, and (C) listener position is close, only about 2 meters from the speakers.

Subjectively, I can say this is VERY loud. Over the 26 years I’ve owned this amp I can count on the fingers of 1 hand the number of times I’ve seen its yellow 1% distortion warning lights briefly flicker during a transient peak.

NOTE: I tested this last night by holding an SPL meter while listening to a test CD. A full scale (0 dB) digital signal, passing through my preamp (Oppo HA-1) with volume at 0 dB measures 104 dB SPL at the listening position. The power amp (Adcom 5800) warning lights do not even flicker. The preamp goes up to +6 dB output, which would be 110 dB SPL. That’s pretty close to the theoretical measurement–within 2 dB.

That 2.1 dB of headroom means peak power is 10^(2.1/10) = 1.62 times higher than continuous, making 400 * 1.62 = 648 Watts.

Also we can sanity check the amp’s overall efficiency. The 5800’s max continuous power draw is rated at 1800 VA (Watts). While delivering 800 W to a pair of speakers (400 per speaker), that’s 44% efficient. It’s actually less efficient at lower volumes because it’s biased to run in symmetric class A up to about 10 Watts output. The max theoretical efficiency of class A is 25%. It’s rated to draw about 250 W when idle.

Next question: if the Adcom 5800 operates in symmetric class A up to 10 W, how loud can it play these speakers while in class A, before transitioning to class AB?

From above, the speakers play at 86 dB SPL when consuming 1.907 watts. 10 watts is 7.2 dB louder, plus 86 = 93.2 dB SPL. That’s per side, so +3 dB makes 96.2 dB SPL. That’s very loud. But most likely, the transition from A to AB depends on voltage not current so the power level will vary depending on speaker impedance.

Double-check the answer: 400 watts is 16 dB louder than 10, so add 16 dB to 96.2 and you get 112.2 dB. The math checks: same answer as above.
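The class A estimate and its cross-check can be sketched the same way. The helper db_power is a made-up name; the 10 W class A limit and the 1.907 W reference are the figures used above, and the +3 dB for two speakers follows the text.

```python
import math

def db_power(p1_watts, p2_watts):
    """Level difference in dB between two power levels."""
    return 10 * math.log10(p1_watts / p2_watts)

REF_SPL = 86        # dB SPL at the reference power
REF_WATTS = 1.907   # watts drawn at 2.83 V into 4.2 ohms

class_a_one = REF_SPL + db_power(10, REF_WATTS)    # about 93.2 dB SPL
class_a_both = class_a_one + 3                     # about 96.2 dB SPL

# Cross-check: going from 10 W up to the rated 400 W should land on the
# same 112 dB figure computed earlier.
full_power_both = class_a_both + db_power(400, 10) # about 112.2 dB SPL

print(f"{class_a_one:.1f} {class_a_both:.1f} {full_power_both:.1f}")
```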

What Input is Required?

As mentioned above, the Adcom 5800 max continuous power both channels driven into 4 ohms, is 400 W, which is 40 V and 10 A. What voltage output from the preamp is needed to achieve this?

The Adcom 5800 gain is 29 dB with unbalanced inputs and 26 dB with balanced. These are ratios of 28.18:1 and 19.95:1 respectively. So the input voltage needed to achieve those output levels is 40 V divided by these ratios. That is 1.419 V unbalanced, or 2.0 V balanced. This is also known as the Adcom’s input voltage sensitivity.

However, those are continuous ratings and the Adcom has 2.1 dB of headroom, and 2.1 dB is a voltage ratio of 1.274:1. So to really crank it to max rated levels you’d need 1.274 times that voltage, which is 1.81 V unbalanced or 2.55 V balanced.

Let’s do the math the other way to double check. 2.55 V input, times 19.95:1 gain ratio is 50.8 V. A 4 ohm load draws 12.7 A at that voltage, and the product of voltage and current is 50.8 * 12.7 = 645 Watts. This jibes with the Adcom 5800’s rated peak power of 648 Watts.
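Here is a short sketch of the sensitivity calculation. The function names are made up for this example; the gains, the 40 V output and the 2.1 dB headroom are the figures quoted above.

```python
def db_to_voltage_ratio(db):
    """Convert a gain in dB to a voltage ratio."""
    return 10 ** (db / 20)

def input_for_output(v_out, gain_db):
    """Input voltage needed to reach v_out through an amp with the given gain."""
    return v_out / db_to_voltage_ratio(gain_db)

V_OUT = 40.0                            # volts for 400 W into 4 ohms
HEADROOM = db_to_voltage_ratio(2.1)     # about 1.274

for label, gain_db in (("unbalanced", 29), ("balanced", 26)):
    v_in = input_for_output(V_OUT, gain_db)
    print(f"{label}: {v_in:.3f} V continuous, {v_in * HEADROOM:.2f} V peak")

# Working forward again: peak output voltage into 4 ohms.
v_peak_out = V_OUT * HEADROOM
print(f"peak power: {v_peak_out ** 2 / 4:.0f} W")
```

Without intermediate rounding the peak power comes out within a watt of the 648 W figure.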

Bits and Dynamic Range

When digital audio came out I wondered how the number of bits per sample correlated to the amplitude of waves. I imagined that the total expressible range was independent of the size of the smallest discernible gradation. Since this appeared to be a trade-off, I wondered how anyone decided what was a good balance.

Later I realized this is a false distinction. First: the number of bits per sample determines the size of the smallest gradation. Second: total expressible range is not a “thing” in the digital domain. Third: if the total range is a pie of arbitrary size, dynamic range is the number of slices. The smaller the slices, the bigger the dynamic range.

Regarding the first: to be more precise, bits per sample determines the size of the smallest amplitude gradation, as a fraction of full scale. Put differently: what % of full scale is the smallest amplitude gradation. But full scale is the amplitude of the analog wave, which is determined after D/A conversion, so it’s simply not part of the digital specification.

Amplitude swings back and forth. Half the bits are used for negative, the other for positive, values. Thus 16 bit audio gives 2^16 = 65,536 amplitudes, which is 32,768 for positive and negative each (actually one of the 65,536 values is zero, which leaves an odd number of values to cover the + and – amplitude swings, making them asymmetric by 1 value, which is a negligible difference). Measuring symmetrically from zero, we have 32,768 amplitudes in either direction. So the finest amplitude gradation is 1/32,768 of full scale in either direction, or 1/65,536 of peak-to-peak. 16-bit slices the amplitude pie into 65,536 equal pieces.

Here’s another way to think about this: the first bit gives you 2 values and each additional bit doubles the number of values. Amplitude is measured as voltage, and doubling the voltage is 6 dB. So each bit gives 6 dB of range, and 16 bits gives 96 dB of range. But this emphasizes the total range of amplitude, which can be misleading because what we’re really talking about is the size of the finest gradation.

So let’s follow this line of reasoning but think of it as halving, rather than doubling. We start with some arbitrary amplitude range (defined in the analog domain after the D/A conversion). It can be anything; you can suppose it’s 1 Volt but it doesn’t matter. The first digital bit halves it into 2 bins, and each additional bit doubles the number of bins, slicing each bin to half its size. Each of these halving operations shrinks the size of the bins by 6 dB. So 16 bits gives us a bin size 96 dB smaller than full scale. Put differently, twiddling the least significant bit creates noise 96 dB quieter than full scale.

To check our math, let’s work it backward. For any 2 voltages V1 and V2, the definition of voltage dB is:

20 * log(V1/V2) = dB

So 96 dB means for some ratio R,

20 * log R = 96

where R is the ratio of full scale to the smallest bin. This implies that

R = 10 ^ (96/20) = 63,096

That’s almost the 65,536 we expected. The reason it’s slightly off is that doubling the voltage is not exactly 6 dB. That’s just a convenient approximation. To be more precise:

20 * log 2 = 6.0206

So doubling (or halving) the voltage changes the level by 6.0206 dB. If we use this more precise figure, then 16 bits gives us 96.3296 dB of dynamic range. If we compute:

20 * log R = 96.3296

We get

R = 10 ^ (96.3296 / 20) = 65,536

When the math works, it’s always a nice sanity check.
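The same check takes a few lines of code. The function names are made up for this example.

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to the smallest bin, expressed in dB."""
    return 20 * math.log10(2 ** bits)

def ratio_from_db(db):
    """Voltage ratio corresponding to a level difference in dB."""
    return 10 ** (db / 20)

print(round(dynamic_range_db(16), 4))               # 96.3296
print(round(dynamic_range_db(24), 2))               # 144.49
print(round(ratio_from_db(96)))                     # 63096
print(round(ratio_from_db(dynamic_range_db(16))))   # 65536
```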

Summary

The term dynamic range implies how “big” the signal can be. But it is both more precise and more intuitive to imagine the concept of dynamic range as the opposite: the size of the smallest amplitude gradation or “bin”, relative to full scale. Put differently: dynamic range is defined as the ratio of full scale, to the smallest amplitude bin.

With 16 bits, that smallest bin is 1/65,536 of full scale, which is 96 dB quieter. With 16-bit amplitudes, if you randomly wiggle the least significant bit, you create noise that is 96 dB below full scale.

With 24 bits, that smallest bin is 1/16,777,216 of full scale, which is 144 dB quieter. With 24-bit amplitudes, if you randomly wiggle the least significant bit, you create noise that is 144 dB below full scale.

Typically, the least significant bit is randomized with dither, so we get half a bit less dynamic range, so for 16-bit we get 93 dB and 24-bit we get 141 dB.

Practical Dynamic Range

Virtually nothing we record, from music to explosions, requires more than 93 dB of dynamic range, so why does anyone use 24-bit recording? With more bits, you slice the amplitude pie into a larger number of smaller pieces, which gives more fine-grained amplitude resolution–and, consequently, a larger range of amplitudes to play with. This can be useful during live recording, when you aren’t sure exactly how high peak levels will be. More bits gives you the freedom to set levels conservatively low, so peaks won’t overload, but without losing resolution.

Another reason that 24 bits can be useful is related to the frequency spectrum of musical energy. Most music has its maximum energy at or near its lowest frequencies, and from the lower midrange upward, energy usually drops by around 6 dB / octave. By the time you get to the top octave, the level is down 30 dB or more, so you’ve lost at least 5 bits of resolution — sometimes more. You might think that 16 – 5 = 11 bits is enough. But since the overall level was below full scale to begin with, you don’t have 16 bits. You typically have only 8 bits for these high frequencies, which is only 48 dB. Recording in 24-bit gives you 8 more bits which solves the problem, giving you 16 bits in this top octave.

Back in the late 80s there was another solution to this, part of the CD Redbook standard called “emphasis”. They applied an EQ to boost the top octave by about 10 dB before digitally encoding it, giving about 2 bits more resolution. After decoding, they cut it by the same amount. In principle, it’s Dolby B for digital audio. However, this is never used anymore because ADCs and DACs are so much better now than they used to be.

However, once that recording is completed, you know what the peak level recorded was. You can up-shift the amplitude of the entire recording to set the peak level to 0 dB (or something close like -0.1 dB). So long as the recording had less than 93 dB of dynamic range, this transforms the recording to 16-bit without any loss of information (such as dynamic range compression).

In the extremely rare case that the recording had more than 93 dB of dynamic range, you can keep it in 24-bit, or you can apply a slight amount of dynamic range compression while in the 24-bit domain, to shrink it to 93 dB before transforming it. There are sound engineering reasons to use compression in this situation, even for purist audiophiles!

To put this into perspective: 93 dB of dynamic range is beyond what most people can pragmatically enjoy. Consider: a really quiet listening room has an ambient noise level around 30 dB SPL. If you listened to a recording with 93 dB of dynamic range, and you wanted to hear the quietest parts, the loud parts would peak at 93 + 30 = 123 dB SPL. That is so loud as to be painful; the maximum safe exposure is only a couple of seconds. And whether your speakers or amplifier can do this at all, let alone without distortion, is a whole ‘nuther question. You’d have to apply some amount of dynamic range compression simply to make such a recording listenable.

High Bit Rate Audio

When CDs first came out in the 1980s they sounded lifeless. I still have several in my collection from those years and they still sound dull. In some ways they were better than LPs: no background rumble or hiss, much cleaner and tighter bass, uncolored midrange, and consistent sound quality unlike LPs that sound best in the outer groove with sound quality gradually deteriorating as the record plays and the needle moves toward the inner groove. At the end of the record, just when the orchestra is reaching its crescendo finale, you hear audible distortion or dynamic range compression because the inner groove can’t handle the dynamic range. CD avoided these issues. Yet by “lifeless” I mean the midrange detail, high frequencies and transient response on CD sounded worse than LP.

Over the 1990s, CDs improved until around the year 2000 I thought the best CDs had surpassed LPs, with better sounding high frequency and transient response, while retaining the other advantages they had all along. By this point, the best CDs of live acoustic music sounded more natural and real, where the best LPs sounded like an artistically euphonic sonic portrayal.

Looking back, one contributing factor to this transformation of CD audio quality may be the use of poorly implemented anti-aliasing filters in the early days. Over the 1990s, we owe the improvement in CD quality largely to digital oversampling and more transparent anti-aliasing filters, and partially to better implementations of dither and noise shaping.

At the same time, around the turn of the century, high bit rate formats came out: SACD and DVD-Audio. Various engineering and acoustic reasons are given for these high bit rates, most of which are based on well-intended yet fallacious understanding of digital audio, some on blatant pseudo-science.

The best explanation I’ve seen comes from a video by Monty Montgomery, and on his website, where he debunks the most common misunderstandings about digital audio. However, in his zeal to shed the light of math and engineering on this subject, he overstates the case in a few areas. Here I describe those areas. However, while I dispute these points, generally I do agree with Monty. He’s essentially got it right and is worth reading.

Audible Spectrum

Monty says, “Thus, 20Hz – 20kHz is a generous range. It thoroughly covers the audible spectrum, an assertion backed by nearly a century of experimental data.” This is mostly, yet not quite true. The range of human hearing is closer to 18 Hz to 18 kHz. It’s common for people to hear below 20 Hz, but almost nobody above the age of 15 can hear 20 kHz. For example, at age 50 as I write this, my personal hearing range is from around 16 Hz to 15 kHz.

Ironically, this actually strengthens Monty’s case. Digital audio has no problem going lower than 20 Hz, and we only need to go up to around 18 kHz to be transparent for 99% of people.

The Human Ear: Time vs. Frequency Domain

The ear is a strange device. Highly sensitive, yet inconsistent and unreliable. Our keen perception of transient response is more sensitive than one would expect, given the upper threshold of frequency tones we can hear.

For example, consider castanets. They have lots of high frequency energy, to 20 kHz and above. If you listen to real castanets–not an audio recording, but an actual person snapping them in front of you–the “snap” or “click” has an incredibly crisp, yet light and clean sound. Most recordings of them sound artificial with smeared transients, because these recordings don’t capture those high frequencies well. They’re lost somewhere in the microphone, the position of the mic to the musician, or the audio processing.

I have an excellent CD recording of castanets (it’s a flute quintet, but several tracks feature castanet accompaniment) that has energy up to 20 kHz. It’s one of the best, most realistic castanet recordings I have heard: clean, crisp yet light. Almost perfect sounding. As a test, I’ve applied EQ to this recording to attenuate frequencies above 15 kHz. I can differentiate this from the original in an A/B/X test. In the filtered version, the castanets don’t sound as crisp or clean. It’s hard to describe, but they sound slightly “smeared” for lack of a better word. The effect is subtle, but consistently noticeable when you know what to listen for, and listen carefully.

Yet as mentioned above, I can’t hear frequencies above 15 kHz, so I can’t hear the frequencies I attenuated. How is that possible? It may be that the ear is more sensitive to timing than it is to frequency. That is, it can detect transient response requiring higher frequencies to resolve, than it can hear as pure tones. Put differently: take a musical signal of castanets (or anything else with very high frequencies) and apply a Fourier Transform to convert to the frequency domain. The highest frequencies you cannot hear as pure tones. But if you filter them out, it distorts the original waveform in the time domain, rounding off sharp transients and causing pre-echo. The ear can detect these artifacts.

The moral of this story: well-engineered digital audio does perfectly capture any analog signal that has been bandwidth-limited to the Nyquist frequency. But, some caveats apply:

  1. Bandwidth-limiting the signal can create audible distortion. Anti-alias filtering with a steep slope creates audible time domain distortion in the pass-band.
  2. Higher sampling rates (alternately, oversampling) give a wider transition band, making a gradual filter slope, reducing this pass-band distortion.
  3. The frequencies needed for transient response to sound transparent, may be higher than the frequencies that people can hear as pure tones.

Of course, these points are not unique to digital audio. To get transparent transient response, every step in the recording chain must preserve high frequencies. You must use microphones with extended high frequency response, position them close enough to the musicians to capture the frequencies, etc.

Anti-Alias Filtering

The CD standard of 44.1 kHz sampling is not high enough to implement proper anti-aliasing filters that run on normal hardware (DAC chips) in real time. The proof of this assertion is in the specifications of nearly all common DAC chips: at the 44.1 kHz sample rate, their digital filter stop band is 24.1 kHz, which is above Nyquist.

The reason has to do with how aliasing works. Every frequency in the passband has an alias above Nyquist, and these frequencies are always mirrored around Nyquist. For example, at 44.1 kHz sampling the alias of 17 kHz is (22,050 – 17,000) + 22,050 = 27,100 Hz. If we stretch the stop band from 22,050 to 24,100, then we allow frequencies from 22,050 to 24,100 to leak through. These are above Nyquist, so they are always noise. But since Nyquist (22,050) is exactly halfway between 20,000 (top of passband) and 24,100 (filter stop band), the passband aliases of this ultrasonic noise must necessarily all be above 20,000, thus inaudible to humans.
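The mirroring is simple enough to verify in code. In this minimal sketch the function name image_about_nyquist is made up, and it only covers frequencies between zero and the sample rate, which is all the example needs.

```python
def image_about_nyquist(f_hz, fs_hz):
    """Frequency that produces the same samples as f_hz, mirrored around fs_hz / 2."""
    return fs_hz - f_hz

FS = 44_100

# The alias of 17 kHz sits the same distance above Nyquist as 17 kHz sits below it.
print(image_about_nyquist(17_000, FS))   # 27100

# Anything leaking through between Nyquist and the 24,100 Hz stop band
# folds back into this range:
low = image_about_nyquist(24_100, FS)
high = image_about_nyquist(22_050, FS)
print(low, "to", high)                   # 20000 to 22050
```

So the worst case for the leaked noise is an alias at exactly 20,000 Hz, the top edge of the passband.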

This engineering trick or kludge is clever, but the engineers designing these DAC chips would not resort to it unless it were necessary. At 44.1 kHz sampling, the transition band (20,000 to 22,050) is so narrow it’s impossible to implement a proper digital filter, so they bend the rules. Further proof is that the digital filters at higher sampling rates (88.2, 96, etc.) are properly implemented, with stop bands at Nyquist or lower.

Lossy Compression

Monty says: “a properly encoded Ogg file (or MP3, or AAC file) will be indistinguishable from the original at a moderate bitrate.” This is downright false, though it depends on one’s definition of “moderate”. Trained listeners of high quality recordings on high quality equipment can reliably differentiate lossy compressed audio even at high bit rates (like 320 kbps MP3).

A/B/X testing the highest quality recordings in my collection, I can reliably distinguish MP3 up to about 200 kbps rates, using LAME 3.99.5, which is one of the best encoders. With some specialized recordings (jangling keys, castanets) I can differentiate them at the max 320 kbps rate. Most MP3s are done at 128 to 160 kbps thus could be differentiated from the original.

However, there is some truth to the “moderate bitrates are sufficient” viewpoint. Most MP3s are of rock, pop or electronic music, inferior quality master recordings that are compressed, clipped, and heavily EQed. The low 128 to 160 kbps rates may be transparent for this content. But that’s not relevant to us; here we’re talking about high end.

In short, if you are an experienced critical listener of high quality recordings on high quality equipment, you can hear the difference of MP3 and other lossy compression.

I’ve also got a few thoughts on dynamic range and 16 vs 24-bit. That’s a whole ‘nuther discussion.

Conclusion

What Monty says about digital audio is true, generally speaking. He’s done a great job of debunking common myths. High bit rate recordings are over-hyped and can actually be counterproductive. However, there are some caveats to keep in mind:

  1. High bit rate recordings often do sound better, because when they are being made, extra care and attention is used throughout the entire recording process.
    • But if you took that recording and down-sampled it to CD quality using properly implemented methods, it is likely to be indistinguishable from the original.
  2. High bit rate recordings may be sold as “studio masters”, not having dynamic range compression, equalization or other processing often applied to CDs.
    • This is related to (1), and the same comment applies.
  3. High bit rates can offer subtle improvements to transient (impulse) response.
    1. This benefit is intrinsic to high bit rate audio
    2. However, it is not always realized because the limiting factor for transient response may be the microphones or other parts of the recording process.
  4. High bit rates can sound worse, because they may capture ultrasonic frequencies that increase intermodulation distortion.
  5. The differences that high bit rates make (improvement or detriment) are subtle and most people don’t have good enough equipment or recordings to hear the differences.

Parting Words

Engineers may want to record at higher sampling rates with more bit depth to give headroom for setting levels and other processing. But their final result can virtually always be transformed to 44-16 without any audible compromises (distortion, compression, or loss of information). Yet in some areas, 44-16 while sufficient, is barely sufficient, which means it requires careful well engineered over-sampling, anti-aliasing filters, noise-shaped dither, etc.

High bit rate recordings, when done carefully, can offer slightly better transient response for certain types of music. But to the extent they actually do achieve this by accurately capturing higher frequencies that improve transient response (which is rare), this HF content is a double-edged sword that brings the risk of higher IMD distortion. Of course, high quality well-engineered audio gear (DAC, amp, speakers, etc.) mitigates this risk.

If you do use high bit rates, it doesn’t take much more than 44,100 to get the benefits. You don’t need 192k or higher. Most likely, 64k sampling would be enough to get all the advantages. But since that rate is never actually used, we’d go to 88,200 (twice the normal CD rate).

Some practical guidelines:

  • If the original recording was made in the 1980s or earlier, there is no point to high bit rates. Ultra high frequencies are already non-existent or rolled off, transient response is already imperfect, dynamic range is already limited. Here, the 44-16 standard is higher fidelity than the original.
  • If it’s rock, pop or electronic (whether old or new) there’s probably no point to high bit rates. It’s already heavily processed and there is no absolute reference for what this kind of music is supposed to sound like. Classic rock/pop albums get re-released every few years with different re-masterings that all sound different. One version may have better bass or smoother mids, but that is not a 44-16 limitation. Which release is “best” is not a limitation of digital bit rate, but only a matter of opinion.
  • If it is acoustic music recorded in natural spaces, a high bit rate recording may be useful, especially if the recording has very high frequencies (castanets, bagpipes, trumpets) or transient impulses. Even if the bit rate alone doesn’t help things, the entire recording is probably (though not always) made with more careful attention to detail and high engineering standards.

Overall, I don’t worry about it. The quality of a music recording depends far more on the mics used, their placement, the room it was recorded in, mixing and mastering, than it does on the bit rate. And 44-16 is either completely transparent, or so close to transparent that even on the highest quality equipment with the most discerning listener, limitations in other areas of the recording process make the differences mostly moot. However, for these rare special excellent recordings, I will get high bit rate versions if they’re available, if they haven’t been remastered and reprocessed to squeeze the life out of the music, and they don’t cost more than the CD.

Back to the HD-580 – For a While

My Audeze LCD-2 fell off my desk at work and got pranged so they’re going back to Audeze for repair and, incidentally, upgrade to the 2016 drivers. My home pair has these drivers and they are a subtle improvement over the 2014.

In the meantime, I’m listening to my trusty old HD-580s. Original 18 year old drivers, though I’ve replaced the headband and ear pads, and the cable, a few times over the years. They’re clean and play, fit and look like new.

First impression: these HD-580s are nice headphones! Smooth mids, nice timbres, well balanced. They really were the very first audiophile headphone, SOTA for 1999, a whole different league apart from Grados and the like. But compared to the LCD-2:

  • The low bass is rolled off
  • The bass is not as tight
  • The mids are a tad boxy, not as open sounding
  • The high treble is rolled off

Overall, they sound a tad muffled and slow compared to the LCD-2. Conversely, the LCD-2 has:

  • Wider bandwidth: deeper bass, higher treble
  • Better detail & articulation throughout the range

One advantage the HD-580 has over the LCD-2 is comfort. The HD-580 are lighter and breathe better. That better breathing is due to having velour earpads instead of leather, which is more comfortable but it doesn’t seal as well which likely contributes to the bass attenuation.

Another advantage of the HD-580 is their midrange linearity. The LCD-2 has a response bump in the mids (600-1200 Hz) and a dip in treble (3-4 kHz). If you don’t have EQ to correct this, the HD-580 can actually be better than the LCD-2.

A gentle parametric EQ helps widen the HD-580’s apparent bandwidth:

  • +6 dB low shelf @ 100 Hz, Q=0.67

I’m enjoying this trip down memory lane. I listened to these same HD-580s during most of the 10,000 hours I put into Octane Software back in the day. They sound nice, but I will be happy to get my Audeze back.