High Res Audio on Ubuntu: Part 1

People sometimes criticize Ubuntu, or more specifically Pulseaudio and any Linux variant that uses it, for not being audiophile friendly. Not surprisingly, this criticism has a grain of truth to it. Yet Ubuntu can be configured to support high quality audio.

The settings are simple, but explaining them takes space, making this a multi-part series.

Note: I made these changes on Ubuntu, though they probably work on any Linux variant that uses Pulseaudio.


Pulseaudio Versions

Pulseaudio is an audio layer on top of ALSA. One of its key benefits is enabling different apps to share the audio hardware (e.g. the sound card). ALSA works without Pulseaudio, but in this case only 1 app at a time can use the audio hardware.

Yet one of the essential parts of sharing audio is converting formats: sample rates and bit depths. Pulseaudio tends to do this all the time, even when it’s not necessary because only 1 app is using audio. This unnecessary resampling gives Pulseaudio a bad reputation among audiophiles.

Pulseaudio Resampling

In days of yore, Pulseaudio had a single sample rate and resampled everything to this rate. Since DVDs use 48000 and CDs use 44100, however you configure Pulseaudio, one or the other would always be resampled.

About 10 years ago Pulseaudio introduced the alternate-sample-rate config setting. This gave it 2 sample rates. For example, the default /etc/pulse/daemon.conf file says:

default-sample-rate = 44100
alternate-sample-rate = 48000

The first is for CD, the second is for DVD, the 2 most common audio sources. Pulseaudio uses whichever rate gives the least effort and the cleanest conversion. Resampling between rates that are integer multiples is simple and transparent: less math and cleaner audio. For example, if the audio stream is at 96000, then downsampling to 48000 is cleaner and easier than downsampling to 88200, even though 88200 is numerically closer. So Pulseaudio has these defaults (44100 and 48000) for good reason, and when it must resample, it chooses the rate intelligently. Every audio rate commonly used for music and movies is one of these, or an integer multiple of one of them.
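
The integer-multiple rule can be sketched in a few lines of Python. This only illustrates the reasoning and is not Pulseaudio's actual source code. The function name pick_sink_rate and the fallback to the default rate are assumptions made for the example.

```python
def pick_sink_rate(stream_rate, default_rate=44100, alternate_rate=48000):
    """Return the configured rate that gives the cleanest conversion.

    Hypothetical helper: prefers a configured rate related to the stream
    rate by a whole-number ratio, and otherwise falls back to the default.
    """
    for rate in (default_rate, alternate_rate):
        high, low = max(stream_rate, rate), min(stream_rate, rate)
        if high % low == 0:  # e.g. 96000 / 48000 == 2
            return rate
    return default_rate


for stream in (44100, 48000, 88200, 96000, 192000):
    print(stream, "->", pick_sink_rate(stream))
```

With the default settings, 88200 maps to 44100, while 96000 and 192000 map to 48000.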

So the good news is this feature is really useful. The bad news is that it doesn’t always work. Here’s a super important limitation of Pulseaudio: it can only change the sample rate while no audio is playing. So if you start playing a DVD, Pulseaudio sets the system sample rate to 48k. If you start a CD while the DVD is playing, the audio rate remains at 48k and Pulseaudio resamples the CD’s 44.1k audio to 48k. It stays that way even if you stop the DVD and keep the CD going. The reverse happens if you start the CD first, then start the DVD while the CD is playing.

So to take advantage of the alternate sample rate, you must first stop every app that is playing audio, then start the new stream.
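
To see which rate each output device is currently running at, one option is to read the output of the pactl list short sinks command. The sketch below assumes that output is tab-separated with the fields index, name, driver, sample spec and state. The helper names and the sample line are illustrative, not taken from a real system.

```python
import subprocess


def sink_sample_specs(pactl_output):
    """Map sink name -> sample spec from `pactl list short sinks` text."""
    specs = {}
    for line in pactl_output.splitlines():
        fields = line.split("\t")
        if len(fields) >= 4:  # index, name, driver, sample spec, [state]
            specs[fields[1]] = fields[3]
    return specs


def current_sink_specs():
    """Run pactl and parse its output (needs a running Pulseaudio)."""
    result = subprocess.run(
        ["pactl", "list", "short", "sinks"],
        capture_output=True, text=True, check=True,
    )
    return sink_sample_specs(result.stdout)


# Made-up sample line, used only to show the parsing.
sample = "0\talsa_output.example\tmodule-alsa-card.c\ts16le 2ch 44100Hz\tSUSPENDED\n"
print(sink_sample_specs(sample))
```

On a live system, calling current_sink_specs() before and after starting playback shows whether the rate actually changed.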

Avoiding Resampling

In version 11 Pulseaudio added a new config setting. In the /etc/pulse/daemon.conf file it looks like this:

avoid-resampling = true

Pulseaudio still uses the default and alternate sample rates. So this new setting controls what Pulseaudio does when it encounters an audio stream using a sample rate that is neither the default nor the alternate. If this setting is false (the default), Pulseaudio will resample the stream to one of the 2 configured rates, as described above. If this setting is true, Pulseaudio will use the stream’s native sample rate without resampling it.

Essentially, this new setting enables Pulseaudio to play an audio stream at its native rate, without resampling, as long as the hardware supports that rate. The configured rates (default and alternate) become entirely optional, rather than mandatory.

However, Pulseaudio still won’t change the sampling rate while sounds are playing. If another audio stream is already playing when a new one starts, the new stream is still resampled to the current rate. So avoid-resampling only takes effect when a stream starts while no other audio is playing.
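
The behavior described in this section and the previous one can be captured in a toy model. This is a simplification for illustration only, not Pulseaudio's real logic. The class SinkModel and its methods are hypothetical, and it ignores whether the hardware supports a given rate.

```python
class SinkModel:
    """Toy model of one output device and its current sample rate."""

    def __init__(self, default_rate=44100, alternate_rate=48000,
                 avoid_resampling=False):
        self.configured = (default_rate, alternate_rate)
        self.avoid_resampling = avoid_resampling
        self.rate = default_rate
        self.active_streams = 0

    def _related_rate(self, stream_rate):
        # Prefer a configured rate with a whole-number ratio to the stream.
        for rate in self.configured:
            if max(rate, stream_rate) % min(rate, stream_rate) == 0:
                return rate
        return self.configured[0]

    def start_stream(self, stream_rate):
        """Start a stream; return True if it has to be resampled."""
        if self.active_streams == 0:  # the rate may change only while idle
            if self.avoid_resampling or stream_rate in self.configured:
                self.rate = stream_rate
            else:
                self.rate = self._related_rate(stream_rate)
        self.active_streams += 1
        return stream_rate != self.rate

    def stop_stream(self):
        self.active_streams -= 1


sink = SinkModel(avoid_resampling=True)
print(sink.start_stream(96000))  # sink is idle, so it can switch to 96000
print(sink.start_stream(44100))  # sink is busy, so this stream is resampled
```

In the model, as in the text, the second stream is resampled only because the first one is still playing.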

Bit Depth and Resample Method

For bit depth, I recommend using at least s24le (signed 24-bit little endian); s32le and float32le also work. That’s because converting to a larger size is harmless, but going the opposite way reduces resolution.
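
A small integer example shows why widening is harmless and narrowing is not. It uses plain bit shifts with no dithering, which is a simplification of what a real converter does. The function names are made up for the illustration.

```python
def widen_16_to_24(sample):
    """Pad a signed 16-bit sample to 24 bits by adding 8 zero bits."""
    return sample << 8


def narrow_24_to_16(sample):
    """Cut a signed 24-bit sample to 16 bits by dropping the low 8 bits."""
    return sample >> 8


# Every 16-bit value survives a round trip through 24 bits.
assert all(narrow_24_to_16(widen_16_to_24(s)) == s
           for s in range(-32768, 32768))

# A 24-bit value with detail in its low byte does not survive 16 bits.
original = 0x123456
restored = widen_16_to_24(narrow_24_to_16(original))
print(hex(original), "->", hex(restored))  # 0x123456 -> 0x123400
```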

Pulseaudio supports several different methods for resampling. This command lists the available resamplers:

pulseaudio --dump-resample-methods

On a desktop there is little reason not to use the highest quality, soxr-vhq, even though it costs more CPU than the other methods. If it isn’t available on your system, use speex-float-10.
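
The choice can be automated by reading the output of the command above. The sketch assumes the command prints one method name per line. The helper names and the preference order (taken from the recommendation here) are otherwise hypothetical.

```python
import subprocess

PREFERRED = ("soxr-vhq", "speex-float-10")


def choose_resample_method(dump_output, preferred=PREFERRED):
    """Return the first preferred method found in the dump, else None."""
    available = {line.strip() for line in dump_output.splitlines()
                 if line.strip()}
    for method in preferred:
        if method in available:
            return method
    return None


def dump_resample_methods():
    """Run pulseaudio --dump-resample-methods and return its text."""
    result = subprocess.run(
        ["pulseaudio", "--dump-resample-methods"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Shortened, made-up listings used only to show the selection.
print(choose_resample_method("trivial\nspeex-float-10\nsoxr-vhq\n"))
print(choose_resample_method("trivial\nspeex-float-10\n"))
```

On a real system, choose_resample_method(dump_resample_methods()) gives the value to put in the resample-method setting.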

Summary

Overall, I recommend the following settings in Pulseaudio. When you make these changes to the config file, make sure to comment out the default settings you are replacing.

Version 8 (Ubuntu 16.04 or earlier)

/etc/pulse/daemon.conf

resample-method = soxr-vhq
default-sample-format = float32le
default-sample-rate = 44100
alternate-sample-rate = 48000

Version 11 (Ubuntu 18.04 or later)

Same as above, but with 1 extra line to avoid resampling.

/etc/pulse/daemon.conf

resample-method = soxr-vhq
default-sample-format = float32le
default-sample-rate = 44100
alternate-sample-rate = 48000
avoid-resampling = true
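
After editing, it can help to confirm which settings are really active, since a line left commented out is easy to miss. The sketch below assumes that lines starting with ; or # are comments. The helper names and the sample text are illustrative.

```python
RECOMMENDED = {
    "resample-method": "soxr-vhq",
    "default-sample-format": "float32le",
    "default-sample-rate": "44100",
    "alternate-sample-rate": "48000",
    "avoid-resampling": "true",
}


def active_settings(conf_text):
    """Return the uncommented key/value pairs from daemon.conf-style text."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in ";#" or "=" not in line:
            continue  # skip blanks, comments and lines without a value
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def mismatches(conf_text, wanted=RECOMMENDED):
    """Return the wanted keys whose active value differs or is missing."""
    found = active_settings(conf_text)
    return {key: found.get(key) for key, value in wanted.items()
            if found.get(key) != value}


# Made-up fragment: one setting is still commented out.
sample_conf = """\
; resample-method = speex-float-1
resample-method = soxr-vhq
default-sample-format = float32le
default-sample-rate = 44100
alternate-sample-rate = 48000
; avoid-resampling = false
"""
print(mismatches(sample_conf))  # {'avoid-resampling': None}
```

To check the real file, pass the contents of /etc/pulse/daemon.conf to mismatches() instead of the sample text.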

Now you’ve configured the system to use preferred sample rates and avoid resampling, and you know when the system is able to change sampling rates. In part 2 we will set audio system buffers and priority to avoid audio playback glitches.