All posts by Mike Clements

Ubuntu Upgrade: 16 to 18

Introduction

My upgrade from Ubuntu 16 to 18 was anything but smooth. Here’s the story, in case others find it useful or I need to refer back to what happened.

Why Upgrade?

If it ain’t broke, don’t fix it. My desktop had been running Ubuntu 16 for years and was happy. Since one of the things I use it for is listening to music, I wanted the later version of Pulseaudio that can play music at its native bit rate. Ubuntu 16 resamples everything to one of 2 system bit rates.

The Upgrade

The upgrade itself went smoothly. In short, these commands ran just fine:

sudo apt updatesudo apt dist-upgradesudo do-release-upgrade

After that, all hell broke loose.

Problem 1: No Boot

The system failed in the Grub boot loader and dropped the Grub rescue prompt:

Error: Symbol 'grub_calloc' not found. 
Entering rescue mode...
grub_rescue>

To fix this, I used Ubuntu Boot Repair. I flashed its ISO to a USB stick, booted to that stick, then ran the standard boot repair, which reinstalls Grub on the PC.

Problem fixed: the computer now boots!

Problem 2: Laggy Desktop

The mouse pointer lagged by a second or two, reminding me of using a mouse on my old 4.77 MHz IBM XT years ago. Turns out the upgrade hosed my video driver, making it not entirely the Nvidia driver, not entirely the open source driver, but some kind of zombie hybrid of the two.

I uninstalled the Nvidia driver and rebooted. But I couldn’t do that, because apt-get was broken. I had to fix that first.

Problem 3: Apt-Check

After every Ubuntu install or upgrade I always re-enable whatever repos were disabled for the upgrade, then update all the software so it can install new versions. This problem manifested as an error message whenever I ran “apt”:

libdvd_pkg: 'apt-get check' failed. You may have broken packages. Aborting...

This turned out to be caused by the libdvd package not completing its installation steps. To fix this, we can complete them manually:

sudo dkpg-reconfigure libdvd-pkg

After that, make sure apt-get is really fixed:

sudo apt update
sudo apt dist-upgrade
sudo apt autoremove

Done!

Problem 2: Laggy Desktop: Reprise

With apt-get fixed, I can now uninstall & reinstall the NVidia drivers to fix the laggy desktop:

sudo apt purge nvidia*
sudo apt-get check
sudo apt update
sudo apt upgrade
sudo apt autoremove
sudo reboot

Problem fixed! So I reinstalled the NVidia binary driver:

sudo ubuntu-drivers autoinstall
sudo reboot

Problem 4: Python 3 Broken

I use Python virtual environments and virtualenvwrapper. The upgrade broke these and I was getting a virtualenv related error message every time I opened a terminal. To fix this, I had to reinstall pip and virtualenvwrapper:

python -m pip uninstall pip
python -m pip install pip
pip install virtualenvwrapper

I did this in the system default python (python 2) because that I run virtualenv from the command line.

Fixed!

Problem 5: Desktop Crashes

At this point, my desktop was working pretty well in Ubuntu 18. But I was disappointed to find that periodically, from every 10 minutes to every hour, the computer would crash entirely and revert to a text mode screen. After such a crash I could ssh to it from another machine, so the machine was still running, it was just the desktop that crashed. From ssh, I could kill Xorg and the login prompt would appear on that desktop.

I found this bug report, which provides a workaround: add a parameter to the linux kernel when booting

GRUB_CMDLINE_LINUX_DEFAULT="scsi_mod.scan=sync"

This seems to have fixed it, though only time will tell.

Update: the system still crashes. This does not fix the problem. More specifically, the desktop suddenly goes black and I must ssh from another computer, kill Xorg and it restores the login prompt.

As I’m experiencing these Xorg crash/hangs, I’ll list the various things I’ve tried that don’t fix it:

  • adding kernel param “scsi_mod.scan=sync”
  • disabling display power management in XFCE power Manager (screen never powers off)
  • disabling screen saver (screen saver never kicks in)
  • disabling screen blank (screen never goes blank)
    • … PENDING … no crash yet, this might actually fix it!

Road Biking: Vashon Island

Vashon Island is a fun road bike ride. It’s rural and scenic, and hilly making a great workout. Here you will find suggestions to make this ride a smooth experience.

Ride Overview

The full ride is about 42 miles with 3900′ of climbing. The hills are steep and there’s lots of them, making it feel more like a 50-60 mile ride. Here is a GPX of the full route. Here is a picture:

On the route you can see 2 black dots. These are decision points to change the ride.

Decision point 1: about mile 13; whether to ride down to the Tahlequah ferry terminal at the S tip of Vashon Island. This is an out-and-back 3.7 mile round trip descending 350′ then back up.

Decision point 2: about mile 22; whether to ride the Maury Island loop. This is 11 miles with about 1100′ of climb. It has scenic sections, and the toughest climbs on the route.

Here are some photos from a ride in Oct 2020.

Getting There (and Back!)

From Seattle, the easiest way to get to Vashon is the WA state ferry. Here’s the schedule. It’s different on weekends & weekdays. You’ll depart from and return to Fauntleroy. As of March 2023, the fare is $7.25 for a walk-on with a bike and is paid only westbound. Masks are no longer required. As you approach the Fauntleroy ferry terminal, if you are on the walkway, then walk (don’t ride) your bike. If you want to ride your bike, take the car lanes. If your forget this, the ferry workers will chide you when you arrive.

Parking: you can park a car at Lincoln Park parking lot 1. This is about 1/4 mile up the street from the ferry terminal. There are no fees for parking. From there, ride down to the ferry terminal to walk onto the ferry.

Ride Notes

Vashon’s pavement is often rough. Autumn leaves fall onto the road, get rained on and decompose, making the corners slippery. Debris also washes into the roads, and some of the downhills are steep. So take extra care!

Most Vashon roads don’t have bike lanes, but most cars drive slow. It’s similar to biking the San Juan Islands.

The ride starts with a 1 mile climb from the ferry landing that is sure to warm you up on a cold morning (on a good day you can hit 40+ mph on the way down at the end of the ride). Then the road levels off for a while before you enter the rolling hills of Vashon.

For typical riders in decent physical condition, due to road conditions & hills, plan on lowish average speeds around 14 mph which makes 3+ hours of riding. Bring food & water accordingly. Close to halfway through the ride there’s a little town with a coffee shop, hotel, and a deli called Harbor Mercantile, where you can get food & water. Near the end of the ride you’ll pass through Vashon’s downtown which has several good coffee shops and restaurants.

Ride Scenario

Here’s a typical scenario for a Vashon ride in Spring 2023:

  • Meet @ 7:45am @ Lincoln Park
  • Park, prep, ride to Ferry
  • Pay fare & walk onto 8:15am ferry
  • Disembark on Vashon around 8:45am
  • Bike: 4 hours (full route, with extra time to eat or fix a flat)
  • Walk onto return ferry: 12:40 or 13:30
  • Ride back to Lincoln Park

Car Performance & Handling: Swaybars!

It’s been several years since I autocrossed or owned a high performance car. I still like fun to drive cars, I just don’t have the time for it anymore. I finally got around to doing the first performance upgrades that I’ve done in years. Back in my SCCA days, the first 2 mods anybody did was (1) tires, and (2) swaybar.

Swaybar 101

When a car has a swaybar, in order for the body to roll L or R it must twist the swaybar. Stiffer swaybars reduce L-R body roll without affecting the spring or shock rates. If both wheels hit a bump and move together, the swaybar does nothing. It only kicks in when the L and R sides try to move differently. When one side (L or R) tries to move up or down, the swaybar forces the other side to also move up or down too. How much, depends on the swaybar’s stiffness.

A stiffer swaybar reduces body roll, which reduces weight transfer, which reduces overall traction. Yet at the same time, it improves response and agility. So it’s a trade-off. Put differently, you need to allow some body roll to get good traction, but too much of it reduces response.

Adjustable Swaybars

When the car’s body roll twists the swaybar, it does so through connecting arms. Longer arms give the body roll forces a longer moment arm, making it easier to twist the bar, making a softer bar (lower rate) with less resistance to body roll. Adjustable swaybars usually have several points along their arms where you can connect the end links.

Must both of the swaybar arms have the same length? Imagine if one arm is longer than the other; that is, connecting the end links to different mounting points on each arm. Newton’s 3rd law says the torques exerted are always equal and opposite, which seems to suggest that an asymmetric setup could give symmetric roll response. In this case, an adjustable bar with 2 holes actually has 3 different rates, or with 3 holes has 5 different rates.

However, while the torques are always equal and opposite, even when connected asymmetrically, the moment arms are not. And the torque on the bar exerts a force on the opposite side through its moment arm. So connecting the swaybar asymmetrically would create asymmetric roll rates: stiffer to the L than to the R, or vice versa.

The conclusion: always connect the end links to the swaybar using equal length arms.

Tuning the Response

Most cars are designed to understeer: that is, under most conditions the front slides before the rear does. This is easy to control, especially for unskilled drivers. But skilled drivers find excessive understeer to be less fun and even annoying. Excessive understeer makes a car less responsive. As a general rule:

  • A stiffer rear swaybar reduces understeer, increases oversteer.
  • A stiffer front swaybar reduces oversteer, increases understeer.

This is all relative. Most factory cars are too soft overall and also understeer, so a stiffer rear swaybar is ideal. But if the car is too stiff overall and also understeers, you might use a softer front swaybar.

Mazda 3

My 2014 Mazda 3 is actually pretty fun to drive, for a FWD economy car. At 36000 miles I finally had to replace the OEM tires. While I was doing this I figured I’d also install a stiffer rear swaybar.

This is such a common car there are many options. I used a 22mm Progress bar. It was relatively inexpensive and came with new bushings and brackets to handle the larger forces. The OEM rear swaybar is 18mm diameter with a rate of 334 in-lbs. The Progress is 22mm diameter with rates of 772 in-lbs (about twice as stiff), and 1,015 in-lbs. (about 3 times as stiff). This bar has excellent build quality and perfect fit with the end links pointing straight up and down just like they do with the OEM bar.

The soft setting made a noticeable yet not a huge reduction in body roll. I quickly shifted to the stiff setting which was completely different. Less body roll, quicker turn-in and more precise handling. But, it got twitchy. The swaybar was too stiff for the rest of the suspension. So I kept it at the soft setting (still twice as stiff as stock).

A few months later, I replaced the shocks & springs all around. With the stiffer shocks (Koni yellows) and springs (+20% rates from Racing Beat), the swaybar’s stiff setting was better matched to the rest of the suspension. The twitchiness was gone, now it was just a precise, sharp, flat tracking, great handling car.

Subaru Forester

My wife’s 2018 Forester is a lot softer than our 2004 Forester. It drives more like a bus than a car. Over the years, Subaru softened the suspensions. Subaru has OEM swaybar replacements in 2 sizes: 19mm or 20mm. Stock is 16mm, so the 19mm is about twice the rate. I got this bar from Subaru Online Parts, cost about $100 and included new brackets and bushings.

This makes the new Forester handle more like the old one. It’s less like a bus, more like a car. It feels tighter and more controlled. But not too tight. If you go too stiff with an AWD vehicle it can impair traction off road. It’s perfect for my wife, who wanted less body roll but is not a performance car enthusiast.

NOTE: the end links on this Subaru were quite nearly frozen. The end link attachment bolt was corroded to the nut. And the car is only 2 years old, 4700 miles, and has not been driven on salted roads. I removed the end links from the car, so I could remove the entire bar with end links attached to it. Soaked the end link bolts in liquid wrench and moved the nut back and forth chasing the threads until it finally came free.

More generally, a stiffer swaybar applies greater forces to the end links. So if you more than double the rates, don’t be surprised if the end links eventually break. Keep an eye on them and be ready to replace them with more robust aftermarket end links.

High Res Audio on Ubuntu: Part 2

In part 1 we saw recommended settings for bit depth and sample rates, why these are recommended, how they work, and how to set them. Here, we’ll talk about glitch-free audio.

If you want to check your configuration, skip ahead to part 3. Or return to part 1.

In Ubuntu you may notice occasional audio glitches. They can be obvious or subtle. For example, here is one I encountered recently that is not quite obvious, but you can definitely hear if you are paying attention:

Clean
Dirty

The higher the resolution of the audio, the increased demand for data flow & processing, the more likely these glitches are to occur. These glitches arise from the way Pulseaudio buffers audio data and schedules interrupts for itself to process and flow that data. Many systems don’t glitch with CD quality and lower, but start to glitch at higher rates. Or they may glitch only when the PC is busy doing other work.

Fortunately, this can be configured so almost any computer can play high resolution audio glitch-free. I’ve experimented with these settings on a 15-year-old PC running Ubuntu 18 that was seriously glitchy even at CD quality, using default settings. By changing settings I got this PC to play local audio files up to 192-24, and stream audio in the browser up to 96-24. Then I applied these settings to a fast modern PC running Ubuntu 16. This PC played CD quality audio just fine with the system defaults, but glitched when playing back high resolution audio. This PC now plays back audio seamlessly at all bit rates.

There are 4 basic settings to configure. You may not need to do them all. Try each individually in turn to see if it fixes the problem.

Pulseaudio Process Priority

The Pulseaudio process normally runs at nice level -11. This gives it priority over normal system processes. But increasing its priority even more can help. That means a numerically smaller number (you’re being “less nice” to the rest of the system).

File: /etc/pulse/daemon.conf

; nice-level = -11
nice-level = -15

Comment out the default nice-level and set it a bit lower. It doesn’t seem like much, but it does make a difference.

Pulseaudio Timer Based Scheduling

A few years ago, Pulseaudio switched to timer based scheduling. This is a better way to reduce audio latency while keeping audio streams running smoothly. But Linux is not a real time operating system; it doesn’t give processes guarantees when they will get CPU time. So timer based scheduling sometimes causes buffer under-runs, which is one cause of audio glitches. The timer based scheduling system is supposed to detect when this happens and increase buffers & latency to compensate. But even if it does, you may still get occasional audio glitches as it detects and compensates.

File /etc/pulse/daemon.conf has settings for audio buffers:

; default-fragments = 4
default-fragments = 4
; default-fragment-size-msec = 25
default-fragment-size-msec = 50

The total buffer is fragment size * count, so the above example is 4 * 50 = 200 ms of audio buffer, which is 200 ms of latency. This is more than twice the default value.

Note: while the setting says milliseconds, it actually sets the buffer size in bytes. The conversion is based on the default sample rate. So if you set it to 200 ms and the default rate is 44.1 kHz, at 96 kHz it will be about 92 ms, as 200 * (44.1 / 96) = 91.875.

However, if you simply increase these values and restart Pulseaudio, nothing will change. That’s because Pulseaudio by default uses timer based scheduling, which ignores these buffer settings. For these settings to take effect — to increase the buffer size — you must disable timer based scheduling.

Open this file: /etc/pulse/default.pa

Look for this section of the file (around line 50):

### Automatically load driver modules depending on the hardware available
.ifexists module-udev-detect.so
load-module module-udev-detect
.else
### Use the static hardware detection module (for systems that lack udev support)
load-module module-detect
.endif

See the line I marked in bold face above? Add a parameter at the end, like this:

load-module module-udev-detect tsched=0

Adding this parameter and setting it to 0 disables timer based scheduling, and makes Pulseaudio use the fragment settings shown above.

Don’t go too crazy with huge audio buffers. Increase it just enough to eliminate audio glitching, and add maybe 50% more to account for system load or high sample rates. Big buffers increase latency which becomes problematic in applications like gaming and video calls.

Resampling

In part 1 we mentioned using the highest quality resampler, soxr-vhq or speex-float-10. You can use a faster, lower quality resampler like speex-float-3.

However, if it becomes necessary to make this change, then there’s no point to high resolution audio because your system is so slow it can’t handle the necessary data & processing. So if you must resort to this, you should also set sample rates back to their defaults (44100 and 48000), and set avoid-resampling to false (its default). This way, Pulseaudio will downsample higher rates to something your system can handle, and it will use a fast lower quality resampling algorithm when doing so.

The benefit is, if your system can’t handle high resolution audio, at least you can configure it to play CD quality audio glitch-free.

CPU Governor

Ubuntu’s default CPU governor is “ondemand”, which sometimes throttles back the CPU when it shouldn’t. For example, playing audio is considered a background task, and it may think the PC is not busy and throttle back the CPU, causing audio glitches.

It’s worth trying the “performance” governor instead. If it doesn’t improve things, you can easily revert back. To try this, first disable the service that always sets the “ondemand” governor, because this will override any other settings you make:

sudo update-rc.d ondemand disable

Next, install package cpufrequtils:

sudo apt install cpufrequtils

Then edit the config file: /etc/init.d/cpufrequtils

Find the commented-out section that looks like this, around line 40

# eg: ENABLE="true"
# GOVERNOR="ondemand"
# MAX_SPEED=1000
# MIN_SPEED=500

After this section, add a line like this:

ENABLE="true"
GOVERNOR="performance

Ensure to comment out or replace any existing lines that set these same settings. Then reboot.

Conclusion

Make sure to restart Pulseaudio after every config change. Use “ps” to ensure only 1 copy of the pulseaudio process is running at a time. When you find settings that work, try them under different conditions of system load to see how robust they are. Sometimes they’ll work when the system is idle then you’ll have problems when it gets busy, as other processes take computing time away from Pulseaudio.

Glitch-free audio is easier to achieve when playing back local files, than when streaming. This is because streaming presents more system load. Thus, you may find settings that work fine for playing back local files but glitch when streaming. If so, try increasing the buffer size.

Note: this method uses bigger audio buffers to ensure smooth playback. This increases latency, which can negatively impact other applications like video calling, movies, and gaming. So, experiment with different buffer sizes and uses the smallest buffers that work reliably.

Above I mentioned an alternative approach. Instead of increasing buffering, you can enable the Linux kernel "threadirqs" feature and increase the IRQ priority of the sound card. This may provide glitch-free playback without increasing latency. I have not tried this approach.

Now, we can jump to part 3, where we check how things work while playing audio.

High Res Audio on Ubuntu: Part 1

People sometimes criticize Ubuntu, more specifically, Pulseaudio and any Linux variants that use it, for not being audiophile friendly. Not surprisingly, this criticism has a thread of truth to it. Yet Ubuntu can be configured to support high quality audio.

The settings are simple, but explaining them takes space, making this a multi-part series.

Note: I made these changes on Ubuntu, though they probably work on any Linux variant that uses Pulseaudio.

Click to skip this and jump to part 2 or part 3.

Pulseaudio Versions

Pulseaudio is an audio layer on top of ALSA. One of its key benefits is enabling different apps to share the audio hardware (e.g. the sound card). ALSA works without Pulseaudio, but in this case only 1 app at a time can use the audio hardware.

Yet one of the essential parts to sharing audio is converting formats: sample rates and bit depths. Pulseaudio tends to do this all the time, even when it’s not necessary because only 1 app is using audio. This unnecessary resampling gives Pulseaudio a bad reputation among audiophiles.

Pulseaudio Resampling

In days of yore, Pulseaudio had a single sample rate and resampled everything to this rate. Since DVDs use 48000 and CDs use 44100, however you configure Pulseaudio, one or the other would always be resampled.

About 10 years ago Pulseaudio introduced the alternate-sample-rate config setting. This gave it 2 sample rates, for example the default /etc/pulse/daemon.conf file says:

default-sample-rate = 44100
alternate-sample-rate = 48000

The first is for CD, the second is for DVD, the 2 most common audio sources. This means Pulseaudio uses whichever rate provides the minimum effort / cleanest  conversion. Resampling between rates that are integer multiples is simple and transparent: less math and cleaner audio. For example, if the audio stream is at 96000, then downsampling to 48000 is cleaner and easier than to 88200; even though 88200 is numerically closer. So Pulseaudio has these defaults (44100 and 48000) for good reason, and when it must resample, it chooses the rate intelligently. Every audio rate commonly used for music and movies is one of these, or an integer multiple of it.

So the good news is this feature is really useful. The bad news is that it doesn’t always work. Here’s a super important limitation of Pulseaudio: It doesn’t change the sample rate while sounds are playing, so it can only change the rate while audio isn’t being used. So if you start playing a DVD, Pulseaudio sets the system sample rate to 48k. If you start a CD while the DVD is playing, the audio rate will remain at 48k and Pulseaudio will resample the CD’s 44.1k audio to 48k — and keep it there even if you stop the DVD and keep the CD going. The reverse happens if you start the CD first, then start the DVD while the CD is playing.

So to take advantage of the alternate sample rate, you must stop all apps from playing.

Avoiding Resampling

In version 1.11 Pulseaudio added a new config setting. In the /etc/pulse/daemon.conf file it looks like this:

avoid-resampling = true

Pulseaudio still uses the default and alternate sample rates. So this new setting controls what Pulseaudio does when it encounters an audio stream using a sample rate that is neither the default nor the alternate. If this setting is false (the default), Pulseaudio will resample the stream to one of the 2 configured rates, as described above. If this setting is true, Pulseaudio will use the stream’s native sample rate without resampling it.

Essentially, this new setting enables Pulseaudio to play every audio stream at its native rate, avoiding all resampling. The configured rates (default and alternate) become entirely optional, rather than mandatory.

However, Pulseaudio still won’t change the sampling rate while sounds are playing. And it still forces resampling of a new audio stream, if another audio stream is already playing when it starts. So this new feature to avoid resampling only works when no other audio is already playing, when we start a new audio stream.

Bit Depth and Reample Method

For bit depth, I recommend using at least s24le (signed 24-bit little endian), or s32le or float32le.  That’s because converting to larger sizes is harmless, but going the opposite way reduces resolution.

Pulseaudio supports several different methods for resampling. This command lists the available resamplers:

pulseaudio --dump-resample-methods

There is no reason not to use the highest quality: soxr-vhq. If it isn’t available on your system, use speex-float-10.

Summary

Overall, I recommend the following settings in Pulseaudio. When you make these changes to the config file, make sure to comment out the default settings you are replacing.

Version 1.8 (Ubuntu 16.04 or earlier)

/etc/pulse/daemon.conf

resample-method = soxr-vhq
default-sample-format = float32le
default-sample-rate = 44100
alternate-sample-rate = 48000

Version 1.11 (Ubuntu 18.04 or later)

Same as above, but with 1 extra line to avoid resampling.

/etc/pulse/daemon.conf

resample-method = soxr-vhq
default-sample-format = float32le
default-sample-rate = 44100
alternate-sample-rate = 48000
avoid-resampling = true

Now you’ve configured the system to set preferred sample rates, avoid resampling, and you know how to allow the system to change sampling rates. In part 2 we will set audio system buffers and priority to avoid audio playback glitches.

Digital Audio: Bit Depth vs. Resolution

It’s commonly said that digital audio’s resolution depends on the bit depth of each sample. Each bit doubles the range of amplitudes that can be stored, and a doubling of voltage is about 6 dB, so 16-bit audio is said to have 16 * 6 = 96 dB of resolution.

However, I believe that resolution is the wrong word. Here I will show that digital audio actually has virtually* infinite resolution at any bit depth. But first, let’s explore the common belief with an example.

Use REW to generate a single-tone sin wave, say 622 Hz at -114 dB. It sounds like this:

Of course you probably can’t hear it because -114 dB is very quiet. So let’s amplify it by +113 dB:

OK, that’s it. Yet experienced listeners may notice this doesn’t sound like a pure tone. It sounds a bit dirty. Let’s take a look at it:

You can see that the curve isn’t smooth. It has jagged jumps. This is called quantization distortion. We’ll get to this later. But the point is, the wave is there.

Now that we know this wave really exists, let’s take it at its original level of -114 dB and convert it to 16-bit. Here’s what that sounds like:

Nothing to hear, folks. It pure digital zeroes. No matter how high you turn it up, the only noise you’ll hear is from your sound card or amp.

Intuitively this makes sense. This wave’s peaks are too small; they never get anywhere near as loud as -96 dB, which is the smallest signal that 16-bit audio can capture. In fact, their peaks are a full 18 dB below that minimum threshold.

So, doesn’t this prove that 16-bit audio has only 96 dB of resolution? That is, it can’t capture anything below -96 dB? It seems so, but no — it doesn’t.

The reason for this is because I did the above transformations without using dither.  But dither is an essential part of digital audio. When dithered, digital audio can capture signals well below -96 dB.

Here’s that -114 dB signal converted to 16-bit, with dither:

If that is too quiet to hear, here’s the same signal boosted by +90 dB (this is loud, so turn down the volume before playing):

That noise like tape hiss is the dither. You can clearly hear the sin wave in the noise. For comparison, here’s the above non-dithered transformation, boosted to the same level with dither:

This is pure noise/hiss without any signal. Comparing it to the above, the difference is obvious.

Conclusion

Here we’ve captured a -114 dB signal with 16-bit audio, which supposedly has only 96 dB of resolution. That’s 18 dB below its supposed minimum. Yet there’s nothing special about 18 dB. If it can go 18 dB below, there’s no arbitrary limit how much lower it can go. Eventually it will get masked by the noise so you won’t hear it anymore, but that happen far below 16-bit’s oft-quoted “resolution”.

This might seem like a contradiction, but it’s not. That’s because resolution is the wrong way to think about bit depth, leading to wrong notions about what actually is limited by bit depth.

Dither is what makes this possible, so it’s an essential part of digital audio. It enables us to capture signals well below the 6 dB / bit levels that are often quoted. Dither is not about psychoacoustics, it is about physics (or math, if you prefer).

What exactly is dither? Essentially, it’s randomizing the LSB (least significant bit) of each sample. Yes “random” means noise, so this adds noise to the signal. The irony is, adding noise increases the resolution. How much noise you get by randomizing the LSB depends on how “small” the LSB is. That is, it depends on the bit depth. With 16-bit audio, the LSB is -90 to -96 dB. With 24-bit audio, the LSB is -138 to -144 dB. In this sense, higher bit depths are like better quality analog tape having less hiss (though of course even 16-bit has far less noise than any analog tape ever invented).

Alternative Explanation

So how exactly does randomizing the LSB enable the samples to detect tiny signals below the bit depth? Here’s an intuitive way to think about this: every sample’s LSB is randomized, so 0 and 1 are equally likely. But when you add a tiny signal to this, it slightly biases the outcome. When the signal swings positive, the sum of signal + random LSB is slightly biased toward 1, meaning it’s slightly more likely to be 1 than 0. When the signal swings negative, the opposite happens.

Conclusion

In summary, digital audio can capture extremely low level signals well below its bit depth. The limiting factor for the smallest encodable signal is determined not by the bit depth, but by the noise level. At some point the dither noise will mask low level signals, but this happens well below the bit depth.

Phone, Tablet Measurements

I’ve read that most mobile devices (phones and tablets) have surprisingly good audio quality from their analog headphone outputs. To test this, I decided to measure mine and found that this is not necessarily the case.

Method

I used Room EQ Wizard to generate frequency sweep files at 44 kHz and 96 kHz. Copied the files to my phone (Galaxy Note 4 SM-N910T) and tablet (Galaxy Tab S SM-T700). Connected the device’s analog headphone output to my sound card’s analog input. Played the sweep files on the device at max volume, recorded using Audacity on my PC. Then used REW to “import sweep” and analyze the files.

The results showed audible discrepancies in both frequency response and distortion. So I played the files back using 2 different apps: USB Audio Pro (in bit perfect mode, all DSP disabled), and VLC. Both measured the same.

Baseline Loopback

I made these measurements with my sound card, so its performance is the baseline. To measure that, I used RCA cables to connect its outputs directly to its inputs to measure its loopback performance.

As you can see below, the Juli@ measures quite well for a sound card. It should be audible transparent.

Loopback Frequency Response

At both sampling frequencies, frequency response is flat with less than 0.1 dB variation through the audible spectrum. Phase response and group delay are equally flat.

Loopback Distortion

First 44.1 kHz, then 96 kHz. As you can see, distortion around -96 dB with a few peaks into the -80 range at 30, 60 and 180 Hz, probably related to 60 Hz power regulation.

Device Measurements

The baseline having been set, here are how my phone & tablet measured. These are raw, uncorrected so they are relative to the baseline.

Results: Frequency Response

The frequency response is nowhere near flat, with deviations plenty big enough to hear.

The top lines (purple/blue) are the phone, bottom lines (brown/teal) are the tablet. 44 kHz and 96 kHz are right on top of each other, so the sampling rate didn’t make any difference.

These response curves are so far off from flat I thought I measured it wrong. I double checked the apps playing back the frequency sweeps (USB Audio Pro and VLC), made sure they weren’t applying any EQ. Both were set to “bit perfect” or flat, and had the same response.

Results: Distortion

The phone’s distortion rises in the low frequencies to about -50 dB. That’s nowhere near as good as I expected and worse than inexpensive dedicated DACs. But it should be below perceptible thresholds. Especially since even good headphones typically have between 1% (-40 dB) and 10% (-20 dB) distortion in the bass.

The tablet’s distortion is significantly higher: -20 dB in the lows and about -40 in the mids and treble. This close to perceptible thresholds and may be audible. It’s dominated by 3rd harmonic.

Conclusion

The take-away here is to bust the myth that phones & tables produce decent sound quality from their headphone jacks; their main limitation is they have only enough power to drive sensitive IEMs, not full size headphones. They certainly do have this power limitation, but their sound quality may be compromised even when driving easy loads. Of course, other phones and tables may perform better than the ones I measured.

Frequency response varies by around 6 dB which is not only audible, but obvious. My old cassette tape deck had flatter frequency response! Distortion is “OK” but I’d like to see lower.

However, the phone or tablet can still be used as a musical source. All of the above limitations are in the built-in DAC and headphone amp. Instead, you can use an app like USB Audio Player to stream the musical data bits out its USB port to a dedicated DAC and headphone amp. This bypasses the above distortions. For portable listening you could use a USB dongle; some of them have surprisingly good measurements, far superior to what I saw above. For desktop/home listening you have a lot more options, using any DAC having a USB input.

Corda Soul Measurements

I was curious about my Corda Soul, so I measured a few things. My measurement setup is pretty basic, which limits what I can measure.

Setup

This PC has a Juli@ XTE sound card. It’s a great sound card, but it’s not professional test equipment. But it does have balanced inputs & outputs. So here’s the setup:

Source: PC playing test signals through USB output
Test Device: Corda Soul, USB input, Analog output (balanced XLR)
Measurement: PC sound card, Analog input (balanced TRS)

Update: I also used Tascam SS-R1 and DA-3000 recorders to explore distortion & noise, see below.

Baseline Loopback

I made these measurements with my sound card, so its performance is the baseline. To measure that, I used TRS cables to connect its balanced outputs directly to its inputs to measure its loopback performance.

As you can see below, the Juli@ measures quite well for a sound card. It should be audibly transparent.

Loopback Frequency Response

At both sampling frequencies, frequency response is flat with less than 0.1 dB variation through the audible spectrum. Phase response and group delay are equally flat.

Loopback Distortion

The Juli@’s distortion was the same at 44.1 and 96 kHz sampling. So I’ll show the graph for 96, measured with a -1 dB digital signal:

We can see 60 Hz power at -86 dB and its harmonics nearly as strong. Overall, this is good performance for a sound card (especially one nearly 10 years old) and should be audibly transparent. The baseline now completed, let’s look at the Corda Soul.

Frequency Response

I expected to see perfectly flat response, but it wasn’t. At 96 kHz sampling with the filter in the “sharp” position, the Soul is entirely ripple-free, yet shows slow rolloff, down 0.5 dB at 20 kHz.

I measured the Soul’s frequency response at different volume settings. Why? Because it has 2 unique design features that might make its response vary with volume.

  1. Its unique volume control
  2. Its frequency-shaped gain-feedback

The Soul has a uniquely designed volume control. Instead of attenuating a fixed gain ratio like most preamps do, it changes the gain ratio. It has 64 discrete positions, each applying different resistors in the gain-feedback loop. As you reduce volume from full, it has less gain and more negative feedback. Theoretically, this gives lower volume settings lower noise and distortion, and wider bandwidth, which could impact frequency response.

The Soul’s frequency-shaped gain feedback means it digitally attenuates low frequencies before DA conversion, then it boosts them back to normal level in the final analog stage (after DA conversion and analog gain/volume control). These shaped curves are applied in separate steps, one digitally, one analog, so any imperfections in the matching of these curves should appear as variations in frequency response.

To see if the above features had any measurable impact, I tested frequency response at different volume settings:

The grey line is the sound card, for reference. I made all lines equal at about 600 Hz, which is the perceptual midrange. Note the Y scale is only 1/2 dB per division to exaggerate the differences. At lower volumes the Soul has a small lift in the bass and the treble. This is only 1 or 2 tenths of a dB, so it is inaudible. Also, it has a gradual rolloff in the treble that is down from 0.2 to 0.5 dB at 20 kHz, also inaudible. Note also that the Soul’s frequency response is perfectly smooth, free from the Juli@’s ripples.

At higher sampling rates (48, 88, 96 and 192), the Soul applies a slow rolloff that starts just above 20 kHz. This minimizes passband distortion.

Note: when using an external DAC, the Soul's frequency response is ruler flat in the passband and still applies a slow rolloff above 20 kHz. So these slight frequency response variations are caused by its DA converters (within spec for the WM8741 chip), not by FF curve matching. More on this later...

Below, frequency response at half volume at sample rates 44, 48, 88, 96 and 192. It’s essentially ruler flat so I zoomed the Y scale to 0.1 dB per division to see the differences. The 0.2 dB LF attenuation at 192k is the Juli@ sound card, not the Soul.

Now the same, but from 10 kHz on up. They’re identical up to 20 kHz. Each sample rate is free of ripples and uses the full available transition band to make the smoothest, gentlest attenuation.

In chart form:

RateFilter20k (max; half)-1 dB Fr-3 dB Fr-3 dB %Fs
44.1lin-0.5; -0.220,96021,3500.484
44.1min-4.4; -4.119,50019,8500.450
48lin-0.5; -0.222,75023,2500.484
48min-0.5; -0.221,20021,6200.450
88.2lin-0.5; -0.225,35028,3500.322
88.2min-0.5; -0.225,81028,6500.325
96lin-0.5; -0.227,35030,7000.320
96min-0.5; -0.227,85031,0500.324
192lin-0.5; -0.236,00045,8000.238

Note: the Soul’s output is non-inverting, so readers with EE knowledge may wonder: if the Soul’s volume knob changes the gain, how can it have less than unity gain? The Soul uses an inverting topology in the gain-feedback loop, so gain is simply Rf/Rin and can be less than unity. Its final fixed gain stage is also inverting, so it does not invert overall.

The high frequency rolloff starts a little lower sampling at 44.1 kHz with the filter in “slow” mode, due to its internal WM8741 DAC chip’s filter implementation. More on that subject here.

Noise & Distortion

Here’s a -1 dB digital signal, with Soul at max analog volume:

Here we see the Soul is much cleaner with a noise floor lower than the loopback connector. But the Soul does have an interesting distortion profile, peaking around -70 dB between 1 and 2 kHz. This is surprisingly high distortion. But look closer: this is dominated by 3rd harmonic with a little 5th (green). This pattern of odd harmonic distortion is sometimes seen in balanced (differentially signalled) systems, which tend to squash even harmonic distortion.

This unusual result was worth another test, so I played the same test signal through the Soul and recorded its analog output with my Tascam SSR1 instead of the sound card.

Wow – what a difference! The Tascam is professional equipment and we can see it is cleaner than the Juli@ sound board. The distortion is 10 dB lower and matches the spec for the Tascam recorder (-80 dB). Also, there is no hint of any 60 Hz or its harmonics. This is to be expected since the Soul uses a switched power supply. Yet the distortion hump is still there, even if smaller.  What’s up with that?

Bypassing the DAC

Where exactly is that distortion hump coming from? To find out, I played test signals through the Soul, bypassing its DAC. This is possible due to a unique feature of the Soul. It has a switch on the front panel to listen to its analog input, but this switch is separate from the digital input selector and does not disable its digital processing. It still receives and processes the digital input, and sends it out via toslink SPDIF. You can route the Soul’s digital output to an external DAC, and connect that DAC’s analog output to the Soul’s analog input. Of course, the external DAC must have balanced XLR output.

Like this:
(Digital Source) --d--> (Soul for DSP) --d--> (external DAC) --a--> (Soul for volume control) --a--> (headphones or power amp)

In this mode, the Soul becomes a DSP processor and analog balanced preamp. DAC is handled externally. Why? Future proofing. DSP is purely digital which at 32-bit precision is near mathematically perfect, so it’s not going to improve over time. Analog preamp technology fully peaked and optimized years ago so that’s not improving either. But DACs are constantly evolving, so the Soul enables you to use an external DAC while keeping the rest of the unit.

The signal chain: UAPP on my phone, playing in bit perfect mode (at 48k sampling) to the Soul’s USB input, the Soul’s digital output sent to the Tascam SS-R1, which performed D/A conversion and its analog balanced output sent to the Soul’s analog input, then record the Soul’s analog output with a Tascam DA-3000.

Here’s the distortion plot using the DA-3000 for D/A conversion:

Using an external DAC, the distortion hump entirely disappears. The noise is so low I can’t measure it, and distortion is at the limits of the 16-bit recorder. Conclusion: the Soul’s distortion hump is caused by its DA converters.

In the comparison plot below, the solid lines are using the Soul’s internal DA, and the semi-transparent lines are using the Tascam DA3000 as an external DA:

Soul vs. JDS Atom

I happen to have a JDS Atom headphone amp, which is one of the best (lowest noise & distortion) that Amir has measured at ASR. Subjectively, the Atom is a great sounding amp, a little “giant killer”. It’s as good as amps in the kilobuck price range. One impressive aspect of the Atom is how well it performs as you turn down the volume. Its SNR at 50 mV is 92 dB, which is phenomenally high. This is important because SNR is usually measured at full-scale max volume. But nobody ever listens that loud, so this is an example of measurements that are pointless because they don’t reflect actual listening conditions. When you turn the volume down to actual listening levels, the SNR in most amps typically drops by 30 to 40 dB.

So let’s get a comparative measurement at actual listening levels. I measured the Atom and the Soul at a typical listening level with my LCD-2F headphones, which is the 10:00 knob position on both (low gain on the Atom).

Here is the Soul at the 10:00 knob position (about 15 clicks up from the min):

Here is the Soul at the 10:00 position, using the Tascam DA3000 DA converters:

Here is the JDS Atom (low gain, 10:00 position):

We can still see that the Soul has lower noise, and about the same distortion, as the JDS Atom. When an external DA converter is used, the Soul simply blows away the JDS Atom. REW says the Soul’s noise is at least 8 dB lower than the Atom, which would put the Soul’s 50 mV SNR at least 100 dB, higher than anything measured at ASR.

In summary, the Soul’s performance looks “good” for distortion and “great” for noise. The WM8741 DACs that it uses were great for their time, but that was several years ago and DAC technology has improved. Its limitations are most likely inaudible, but if you use a high quality external DAC, the Soul is truly state of the art.

Note: using an external DAC with the Soul is not a decision to be taken lightly. The Soul's noise floor is extremely low, so you may end up eliminating distortion that can't hear, at the cost of introducing noise that you can hear.

Headphone Notch Filters

Many headphones have a resonance causing a bump in frequency response between 6 and 12 kHz. The Soul has a notch filter to correct this. The manual says it ranges from 6 to 11 kHz, each is -6 dB, Q=2.0. Specifically, the frequencies should be spaced 6.3% apart which is 1/11 of an octave, or slightly further apart than a musical half-step.

Here’s how they measured. The grey line is the frequency response with all controls disabled.

Here’s a closer in look:

Each measures spot-on to what the Soul’s manual says: in frequency, amplitude and width.

Tone Controls

The Soul has 4 tone controls. Meier customized mine to be equally spaced in octaves. That is, the corner frequencies should be 80, 320, 1,250 and 5,000 Hz. All 4 are shelf controls; the bottom two are low pass, the top two are high pass. Each control has 5 clicks up and 5 clicks down, each click should be 0.8 dB. I measured each at click positions -5, -3, +3, and +5.

Note: I measured these with a digital frequency sweep at full scale / 0 dB. This should cause digital clipping when the tone controls are set in the positive range. But due to Meier’s “FF” or frequency shaped feedback, the lower frequency controls don’t clip. That is, “FF” is reducing low frequencies more than 4 dB, which is the tone control range. More on this later.

In each of the following graphs, the vertical marker is at the corner frequency.

Knob 1, low bass.

Knob 2, mid bass

Knob 3, mid treble

Aha! In the above we finally see clipping, so we get some idea of the shape of the FF response curve. To compensate, I lowered the frequency sweep to -6 dB:

Knob 4, high treble

You can see that the lowest position attenuates a lot more. Let’s zoom out a bit to see the full curve:

What we see here is that the lowest position on knob 4 triggers the CD redbook de-emphasis curve, which is a gradual cut that starts at 1 kHz and becomes -10 dB at 20 kHz. This feature was rarely used, but if you have any old CDs using it, and they sound too bright, it means your playback equipment failed to detect it. The Soul enables you to apply the proper de-emphasis manually.

Here are all the tone control knobs seen at once

You can see they are spaced symmetrically. Also, their combined effects are cumulative, which enables a lot of flexibility when setting them. Because they are shelf controls, you won’t get amplitude ripples when combining them.

Crossfeed

The Soul has DSP to narrow (for headphones), and widen (for speakers), the stereo image. This is a common feature for headphone amps, having several different implementations. Meier’s is one of the best: it reduces the “blobs in my head” effect that headphones can have, especially with recordings that have instruments hard-panned fully L or R. And it does this without any perceptual sonic side effects like changes to frequency response, which is what sets it apart from others.

I measured the Soul’s frequency response in all 10 modes (5 narrow, 5 wide), plus its frequency response with all DSP disabled. As you can see below, all 11 curves are exactly the same, even with the Y scale zoomed in to 0.1 dB per division.

For example, the crossfeed in the Headroom amps from 15-20 years ago attenuated mids & treble due to comb filter effects from their inter-channel time delay. These amps had a gentle high pass filter to compensate for this. Meier’s crossfeed is free of these effects.

This doesn’t necessarily mean the crossfeed will be perceptually transparent. Measuring the same doesn’t imply that it sounds the same, because crossfeed is mixing some L into R and vice versa, with time shifts. Percepetually, this may make it sound like the FR has changed, to some people.

Meier FF

The Corda Soul uses Meier’s Frequency Adaptive Feedback. I’ve written about this here and here. Essentially, it shapes the frequency response to attenuate low frequencies in order to “unload” the digital and analog stages of the DAC and preamp, and brings the bass level back to normal for the final output stage, so the overall frequency response remains flat. This improves the midrange & treble where our hearing perception is most sensitive.

Meier customized my Soul’s firmware to make some changes I requested. These changes are:

  • Auto-Mute: the Soul auto-mutes whenever the digital input signal drops below a threshold for more than a brief time. This prevents the outputs from carrying a DC offset. The threshold is just above digital zero, so digital dither won’t prevent auto-mute from triggering. Auto-Mute is a standard Soul feature, not something Meier did just for me.
    • Extend the auto-mute delay
      • The original delay at 44 kHz was only a couple of seconds. This caused the Soul to auto-mute, then turn back on, on some CDs that had between-track silence. When doing this, the Soul emitted an audible “click”.
      • The new delay is about 20 seconds at 44 kHz, so this never happens anymore.
    • Disable auto-mute entirely
      • The Soul has a 3-way gain switch: high, medium, low. It’s implemented digitally. I never used the high position, so on my Soul, this switch position disables auto-mute entirely (mine has no high gain mode).
      • The medium and low settings are unchanged.
    • Silent auto-mute
      • The Soul emitted an audible click when auto-mute triggered. Meier changed my firmware so this does not happen; the auto-mute is completely silent in both ways, coming on and off.
  • Tone control changes
    • Space the corner frequencies at equal octave intervals (80, 320, 1250, 5000).

After doing these customizations, Meier sent me the firmware code so I can keep a backup copy, in case my Soul ever needs maintenance. From this code, I have the actual frequency response curve he uses for FF. Meier asked to keep this confidential, so I do not publish it here. Suffice to say, like the rest of the measurements above, it is truth in advertising. The implementation is exactly what he says it is.

DACs and Digital Filters, Pushing the Limits

I’ve discussed this topic before, here and here. A recent discussion at ASR led me to think about this further, devise some practical examples, and gain a deeper understanding, which I share here.

44-16 is a Tough Nut to Crack

It all started with the digital filters of the WM8741, which my DAC uses (article linked above). We tend to think of CD audio as being “perfect” for all practical purposes. It certainly is higher quality than lossless streaming, and perceptually transparent for most people. Yet at 44.1 kHz, none of the WM8741’s 5 filters was perfect from an engineering perspective. The closest were filters #3 and #5, which it labels “sharp linear phase” and “slow linear phase”, respectively.

Filter #3 has perfectly flat frequency response up to 20,021 Hz (0.454 fs at 44,100 kHz sampling) and no phase distortion. Problem is, it is too weak. At Nyquist (22,050) it is attenuated only 6.43 dB and the stopband (-110 dB) is 24,079 Hz (0.546 fs at 44,100 kHz sampling). The stopband being above Nyquist, it could allow high frequency noise to leak through.

Filter #5 is fully attenuated by Nyquist – the stopband (-110 dB) is 22,050 Hz. And it has no phase distortion. But the passband only goes up to 18,390 Hz, so it begins to attenuate below 20 kHz.

Neither of these filters is perfect, each is a compromise. Why is that? The problem is, the CD standard of 44.1 kHz sampling is so low, it forces a filter transition band that is very narrow (20,000 to 22,050; only 0.14 octaves). Even with modern hardware, it’s hard to implement digital filters that are correct from an engineering perspective and run in real-time, with these constraints. Something’s got to give: frequency response, phase response, or Nyquist attenuation.

Note that at 48 kHz, the WM8741’s filter is perfect. Fully attenuated at Nyquist, with no attenuation or phase shift below 20 kHz. So while 44.1 kHz may not be quite sufficient for implementing perfect real-time filters, it’s almost sufficient. It only takes just a little more “room” to make it perfect. By “room” I mean a wider filter transition band.

So which of these filters, #3 or #5, is better? At first I thought filter #5 was better because I considered full attenuation at Nyquist to be the most important feature of any digital reconstruction filter. Few people (not me) can hear above 18 kHz, so that is a small price to pay for full attenuation. But on further thought, I believe that filter #3 is better. To explain why, I’ll start with aliasing.

Aliasing

Most audiophiles have heard of aliasing and have some idea what it means. Yet surprisingly few have a solid grasp on the math behind what it actually is. I was one of them, so I did a little exploring to rectify that.

The Nyquist-Shannon theorem says if we sample at least twice as fast as the highest frequency we want to capture, our sampling points capture the wave with mathematical perfection. The Whittaker Shannon formula provides a method to perfectly reconstruct the analog wave from the digital sampling points. In both cases, limiting the bandwidth to frequencies below half the sampling rate (the Nyquist limit) is critical.

Note: the Whittaker-Shannon interpolation formula provides mathematically perfect reconstruction, but it is not the only way to reconstruct the analog wave. It requires summing an infinite series for every sampling point, and even when the series is truncated it is too computationally expensive to be practical for real-time decoding. Two common methods that DACs use are delta-sigma and R2R, which provide similar results. One can think of these as engineering compromises: mathematically imperfect, but requiring fewer computations.

For any frequency (below Nyquist) we encode digitally into sampling points, an alias is a different frequency (above Nyquist) that passes through the exact same sampling points. We can derive a mathematical relationship between frequencies and their aliases. Intuitively, each frequency and its alias are reflected across Nyquist. Put differently, they are equidistant from Nyquist, or that Nyquist is always the arithmetic average of a frequency and its alias.

At CD sampling at 44,100 Hz, Nyquist is 22,050 Hz, so we can encode any frequency below this. Examples:

  • The alias of 18,000 Hz is 22,050 + (22,050 – 18,000) = 26,100 Hz. That is: 18,000 and 26,100 are each 4,050 away from 22,050: one below it, one above it.
  • The alias of 1 kHz is 43,100 Hz; each is 21,050 away from Nyquist
  • The alias of 100 Hz is 44,000 Hz; each is 21,950 away from Nyquist

A picture’s worth a thousand words. In the following graphs, I use small numbers to keep it all simple, but it all extends to any sampling frequency. The entire X axis is 1 second, and we sample at 10 Hz, so Nyquist is 5 Hz.

Here is a 3 Hz wave.

At 10 Hz sampling, the alias of this 3Hz wave is 7 Hz, in red below.

Now recall what exactly it means to say that these 2 waves are aliases of each other at 10 Hz sampling: it means either of these waves can perfectly match the same sampling points.

We can see this below:

Hmmm… is that not obvious? OK try this:

The green shows the points where these waves intersect. Of course, intersecting means they are equal. Observe that these intersection points are perfectly evenly spaced in time. If you sampled either of these waves at these points, you would get the exact same thing. Both waves perfectly fit the sampling points. That is what aliasing means.

Note: the astute reader may notice that the above 2 waves intersect more often than the points noted in green. For purposes of digital sampling and reconstruction, it is sufficient that they pass through the same sampling points, and it's irrelevant whether they intersect more often than that.

Now suppose all you have are these sampling points, and you must construct the analog wave. You could construct either one! So the solution is ambiguous: how do you know which is the correct one — meaning the one that was recorded and encoded?

Recall the primary rule of digital recording: you must filter the analog wave to remove all frequencies above Nyquist. The same rule applies when reconstructing the wave from the sampling points. Alias pairs are always symmetrically centered around Nyquist; one above, one below. Thus, filtering to only frequencies below Nyquist eliminates the ambiguity during reconstruction.

A Simple Yet Clever Trick

One conclusion we can draw from the above is that frequencies close to Nyquist have aliases close to Nyquist. Grokking the fullness of this symmetry leads to a simple, yet clever trick when implementing digital reconstruction filters.

As we’ve seen above, the filter’s stopband should be no higher than Nyquist. But squashing the signal from full scale at 20,000 Hz to negative infinity (say -100 dB) at 22,050 Hz will cause passband artifacts, given real-time hardware limitations.

Yet consider what happens if we break the rules and shift the filter stopband a little above Nyquist. Remember how aliases reflect across Nyquist? We want the top of our passband to be 20 kHz, and Nyquist is 22,050. The difference is 2,050 Hz. Add that to Nyquist and we have 24,100 Hz. This is the alias of 20 kHz, when sampled at 44.1 kHz. What if we make this the filter stopband?

Any frequency below 20 kHz will have an alias above 24,100 Hz, so it will be fully attenuated. Conversely, any frequency between Nyquist and the stopband will have an alias above 20 kHz. And we stretched our filter transition twice as wide, making a gentler slope, easier to implement.

Thus, our digital filter will be imperfect from a math or engineering perspective, but perceptually transparent. It may leak some frequencies above Nyquist, which is by definition noise or distortion (call it “junk”). But all this “junk” and its aliases must be all above 20 kHz which is inaudible.

In this case, we shifted the filter stopband just a bit above Nyquist, to widen its transition band. We took advantage of aliasing symmetry, or the fact that frequencies near Nyquist have their aliases near Nyquist.

Of course, TANSTAAFL and this is no exception. This filter may leak some supersonic junk from 20 kHz to 24 kHz. This is inaudible in itself, but when it passes through analog circuits (preamps, power amps, speakers), harmonic and intermodulation distortion will create artifacts in the passband. However, this filter transition band from 20 to 24 kHz is strongly attenuated and most music has little or no energy up there to begin with. So pragmatically speaking, it should not be a problem. Even so, one can see why Wolfson’s engineers provided filter #5 as an alternative – being fully attenuated at Nyquist, it cannot leak any supersonic junk. So the engineers building devices that use the WM8741 can choose which filter makes the best compromise for their needs.

The WM8741 Uses this Trick

Now let’s take another look at the WM8741’s filter #3, at 44.1 kHz sampling. The passband goes up to .454 * fs, which is 20,021 Hz. The stopband is .546 * fs, which is 24,079 Hz. The range between them is the transition band.

Notice anything interesting about these numbers? The transition band is perfectly centered around Nyquist! By sampling frequency ratio, it’s .046 below and .046 above. By frequency, it’s 2,029 below and 2,029 above. Any frequency below 20,021 will alias above 24,049, so aliases of all passband frequencies are fully attenuated. This is the filter we just described above!

BTW, I don’t think this trick is unique to the WM8741. At ASR, reviews of various DACs show their “sharp, linear phase” digital filters down only 6 dB at Nyquist (22,050 Hz), and their stopband around 24 kHz. So it seems like common engineering practice, creative rule-breaking to stretch the limits and provide the best implementation possible given the constraints of 44.1 kHz sampling. Now I know why, and so do you!

If audio standardized on a higher sampling frequency (even only slightly higher like 48 kHz which is already used for DVD), or as DAC chips gain more processing power, these engineering compromises would become unnecessary.

Ubuntu 18 and Slow Network

Recently my Ubuntu 18 laptop had intermittent very slow internet/network. After checking the usual causes (router, WiFi, etc.), I found this helpful article. Turns out what fixed it was one of those suggestions.

Edit this file:

sudo vi /etc/nsswitch.conf

Then change the line that says:

hosts:          files mdns4_minimal [NOTFOUND=return] dns mdns

To this:

hosts:          files dns mdns4_minimal [NOTFOUND=return] mdns

So, in hindsight, it seems to have been a DNS problem.

Update 1

While this seemed to fix the problem temporarily, network problems started again. After an hour or two of reading about the issue, I made another change. I learned that Ubuntu 18 changed the way network is configured, using a new service called “netplan”. The /etc/netplan directory should have config files, but mine was empty! I don’t know how it got emptied; I certainly didn’t do it. But I needed to create a default config file. I created a file called 01-netcfg.yaml:

network:
  version: 2
  renderer: networkd
#  ethernets:
#    wlan0:
#      dhcp4: true
#      nameservers:
#        addresses: [192.168.1.1]

After some experimentation I commented out the last few lines. This tells Ubuntu that the network manager service will control network. Then I configured it in the network WiFi GUI, enabling DHCP wasn’t enough; I also added my home router (192.168.1.1) to the DNS list.

Networking (more specifically, DNS resolution) was still slow and occasionally intermittent. Finally, I had to flush the DNS cache:

sudo systemd-resolve --flush-caches

Now, everything seems to be working again.

Update 2

This too only fixed things temporarily. I still was getting intermittent network problems – but only on WiFi, not when wired. It looks like Ubuntu 18’s new “netplan” was conflicting with the “networking” service. I configured netplan and stopped the networking service. That is:

In the /etc/netplan directory, I had this file named 10-netcfg.yaml, which tells the system that the desktop Network Manager app will control network setup:

network:
version: 2
renderer: NetworkManager

I created additional files AFTER the ones that are there, so they override this. These files look like this:

20-wlan0.yaml

# Enable this only if you don't want to use the desktop GUI
network:
  ethernets:
    wlan0:
      addresses: []
      dhcp4: true
      optional: true
  version: 2

30-eth0.yaml

# Enable this only if you don't want to use the desktop GUI
network:
  ethernets:
    wlan0:
      addresses: []
      dhcp4: true
      optional: true
  version: 2

Because they sort after the first file, they override it. Next, run this command:

mclement@clements6:~$ sudo netplan --debug generate
DEBUG:command generate: running ['/lib/netplan/generate']
** (generate:27016): DEBUG: 09:12:23.854: Processing input file /etc/netplan/10-netcfg.yaml..
** (generate:27016): DEBUG: 09:12:23.854: starting new processing pass
** (generate:27016): DEBUG: 09:12:23.854: Processing input file /etc/netplan/20-wlan0.yaml..
** (generate:27016): DEBUG: 09:12:23.854: starting new processing pass
** (generate:27016): DEBUG: 09:12:23.854: Processing input file /etc/netplan/30-eth0.yaml..
** (generate:27016): DEBUG: 09:12:23.854: starting new processing pass
** (generate:27016): DEBUG: 09:12:23.854: wlan0: setting default backend to 2
** (generate:27016): DEBUG: 09:12:23.854: Configuration is valid
** (generate:27016): DEBUG: 09:12:23.854: eth0: setting default backend to 2
** (generate:27016): DEBUG: 09:12:23.854: Configuration is valid
** (generate:27016): DEBUG: 09:12:23.855: Generating output files..
** (generate:27016): DEBUG: 09:12:23.855: networkd: definition wlan0 is not for us (backend 2)
** (generate:27016): DEBUG: 09:12:23.855: networkd: definition eth0 is not for us (backend 2)

Note the last 2 lines, which is “networkd” saying it won’t be managing these network connections.

Next, apply this configuration

mclement@clements6:~$ sudo netplan apply

Now, disable the system networking service

sudo service networking stop

At this point, my WiFi networking started working again, and was not slow anymore.

Update 3

Sigh… this again was only a temporary fix. Even after all of the above, network was still slow! Then I remembered that I was using 5 GHz WiFi, which has different frequencies/channels in different regions, so requires the device to know what country it is in. So I changed one more thing. Edit the file /etc/default/crda and set my country code. That is, the file originally looked like this:

# Set REGDOMAIN to a ISO/IEC 3166-1 alpha2 country code so that iw(8) may set
# the initial regulatory domain setting for IEEE 802.11 devices which operate
# on this system.
#
# Governments assert the right to regulate usage of radio spectrum within
# their respective territories so make sure you select a ISO/IEC 3166-1 alpha2
# country code suitable for your location or you may infringe on local
# legislature. See `/usr/share/zoneinfo/zone.tab' for a table of timezone
# descriptions containing ISO/IEC 3166-1 alpha2 country codes.

REGDOMAIN=

Note that REGDOMAIN was not set. I changed that last line to this:

REGDOMAIN=US

Since “US” is the code for my country.

What is interesting, is that I tested and 2.4 GHz WiFi worked fine all along. It was only 5 GHz WiFi that was intermittently broken. This would be consistent with not having the region set.

Given all this, I reverted the netplan changes above, so the desktop NetworkManager controls my networking.

Update 4

Again, this still didn’t fix the problem. However, the problem may have been the channel I was using on 5 GHz. This thread was helpful. Channel 149 was listed by “iw list”, but not by “iwlist chan”. I changed the router to use channel 48, which appears in both lists, and it is now working.

Conclusion

There’s a lot here, some of which wasn’t necessary. In summary, the fix was:

Configure /etc/netplan to tell the desktop Network Manager GUI to manage network connections. This is the default Ubuntu desktop setup.

Set /etc/default/crda to set the system country code (needed for 5 GHz).

Run iw list and iwlist chan to see which 5 GHz channels the WiFi card supports.

Configure my router to use one of these 5 GHz channels.

BANG! It’s still working overnight, fast and reliable. Problem solved.