Zoyd W.

Smashing against the reality glass

Saturday, May 31, 2008

ALSA: the sounds of silence

Open source has been a wonderful thing for most technologies. Alas, this is not one of them. I'm referring to the chaotic state of the "art" in Linux sound systems. Of course, we all know that manufacturers are to blame for most of the driver issues. But how we ended up with such a crappy bunch of audio subsystems is beyond my understanding. Maybe myopic architectures, oversized egos, a perverse geeky desire to frustrate the poor lusers... well, I don't know the cause. Some subsystems cannot share their audio devices between several applications, so we end up listening to just one at a time (this is the sad case of OSS). Most of the rest force their users through a thousand hoops to interface with each other. And don't get me started on the configuration issues...! If only Linux sound systems had been designed with the quality of the Linux printing system CUPS!

Nowadays we have ALSA, although we still have the occasional OSS application. ESD and Jack may be a bit more rewarding but they're pretty rare these days. ALSA is quite powerful, but not powerful enough, and its configuration is a terrible mess. Simple things are REALLY hard to do. Defaults are mostly useless. It's gotten better (it has a builtin mixer now), but it's not gonna get far enough. Ever. And the distribution programmers have not done enough to hide ALSA's shortcomings either. Even window system developers have their share of the blame, since mixers and configuration menus are usually quite obscure.

Anyway, we got a lemon, so we'd better start sucking it while we wait for some promising system (PulseAudio?) to rescue us a few years from now. Let's talk about making ALSA actually work as it should: we are going to (try to) set it up so that we can play and record sounds from all applications at the same time. The simplicity of this objective should make Windows or Mac users laugh their heads off, but it's a very hard thing to do with ALSA. (Note: ALSA will do this just fine when the sound card supports hardware mixing, which most onboard cards do not; but of course then ALSA is doing no mixing at all since it is being done in hardware, so this is no virtue of ALSA's.)

Our kernel should, of course, support our sound hardware. Most distributions will get this done, but some may not play well with certain sound systems. If one is recompiling the Linux kernel, the correct kernel modules should be built. The kernel itself has support for OSS and ALSA applications, so this must also be included. But the old OSS kernel code should not be used: pick the OSS-over-ALSA emulation code instead. The kernel can also be configured to play nice with Jack and other realtime subsystems: just enable the HZ_1000, PREEMPT, PREEMPT_RT, PREEMPT_BKL and SECURITY_CAPABILITIES configuration options.

Then we need to load the kernel module during startup. Most distributions will do this too, although they sometimes will fail to pass some configuration parameters to the modules because they lack hardware detection code. The user is left with the chore of learning her hardware's specifics and its relation to Linux kernel drivers. For example, Asus' P5B Deluxe motherboard comes with an "HD Audio" onboard sound controller from Intel's ICH8 chipset. This controller is based on the Analog Devices AD1988B chip. It's supported by the snd_hda_intel module, and this module takes a "model=" parameter to know how many physical connectors it has. This particular motherboard needs the "model=6stack-digout" parameter, which should be manually added to /etc/modprobe.conf in the correct place.
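To make that last step concrete, here is a minimal sketch of adding the option line by hand. The helper name add_hda_model is made up for this example; the module name and the "model=" value are the ones from the paragraph above, and the "options <module> <parameters>" line is the usual modprobe configuration syntax. It only appends the line when it is not already in the file, so it should be safe to run twice.

```shell
#!/bin/sh
# Sketch only: add_hda_model is a hypothetical helper, not part of any package.
# It appends "options snd_hda_intel model=<model>" to a modprobe config file
# unless that exact line is already present.
add_hda_model() {
    conf_file=$1   # e.g. /etc/modprobe.conf (writing there needs root)
    model=$2       # e.g. 6stack-digout
    line="options snd_hda_intel model=$model"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if ! grep -qxF "$line" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf_file"
    fi
}
```

As root you would call it as "add_hda_model /etc/modprobe.conf 6stack-digout" and then reload the module (or simply reboot) for the parameter to take effect.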

The window manager of choice must also detect and use the existing sound system. Be it Gnome, KDE, Enlightenment, or whatever, it should work but it may simply not, or it might work suboptimally. For example, CentOS 5.1 with KDE detects the aforementioned sound controller, but it cannot open the capture device in full-duplex mode, and it doesn't have the best performance by default. If you want the best performance, you must set the SETUID bit on /usr/bin/artswrapper (chmod 4755 artswrapper). If you want to capture audio in full-duplex mode, you'll have to mess with the ~/.asoundrc file. And there be dragons (but we'll try to slay them below).

All ALSA applications are equal and use the same configuration file. But they won't play nice with each other unless that configuration file tells them how. The window manager is just another ALSA application, but it's running most of the time. In the case of KDE, all KDE applications use sound devices via a KDE component called aRts, and aRts is an ALSA application. To configure aRts you should go to KDE's Control Center, on the Sound & Multimedia --> Sound System menu. There are two tabs. On the General tab you should check the "Enable the sound system" and "Run with the highest possible priority" checkboxes, and clear the others. The Sound buffer should be fine for most cards with 10 fragments and 4096 bytes. On the Hardware tab, "Advanced Linux Sound System" should be selected with the Full-duplex checkbox set, and the rest all cleared. But this setup gives errors unless the .asoundrc file is correct.

The autogenerated .asoundrc file often ranges between nonexistent and inadequate. So the user must edit its contents, but its syntax is pretty arcane and the user will have to figure out what this file is for. So let me save you some work here: the .asoundrc file is read by the ALSA libraries every time they are loaded. And they are loaded every time an application gets loaded that needs sound input/output. If you change that file, you should restart all sound-related applications in order to make it work.

Let me introduce you to a hard-earned .asoundrc file that may be useful to you:


# See http://gentoo-wiki.com/HOWTO_ALSA_sound_mixer_aka_dmix
# Set default sound card
# Useful so that all settings can be changed to a different card here.
pcm.snd_card {
     type hw
     card 0
     #device 0
}
# Allow mixing of multiple output streams to this device
pcm.output {
     type dmix
     ipc_key 1024
     ipc_perm 0660 # Sound for everybody in your group!
     #ipc_key_add_uid false   # also share with other users
     slave.pcm "snd_card"

     slave {
          # This stuff provides some fixes for latency issues.
          # buffer_size should be set for your audio chipset.
          # If limit exceeded, aRts will show a popup message indicating
          # the limit as the "can_write" value.
          period_time 0
          period_size 1024 # must be power of 2 for OSS compatibility
          buffer_size 4096 # must be power of 2 for OSS compatibility
          # Other adjustments:
          #format "S32_LE" # in case you want to fix it
          #periods 128 # must be power of 2 for OSS compatibility
          rate 48000 # needed for Qemu (same as QEMU_AUDIO_DAC_FIXED_FREQ)
     }
     # This says that only the first
     # two channels are to be used by dmix, which is
     # enough for (most) OSS apps and also lets
     # multichannel chips work much faster:
     bindings {
          0 0  # from 0 => to 0
          1 1  # from 1 => to 1
     }
}
ctl.output {
     type hw
     card 0
}
# Allow reading from the default device.
# NOTE: For some reason, ALSA is unable to share its input devices among
#       different applications. So dsnoop is mostly useless and we cannot use
#       the following definition, which would be correct (and in fact it works,
#       albeit for only one application at a time).
#pcm.input {
#     type dsnoop
#     ipc_key_add_uid true
#     ipc_key 1099
#     slave.pcm "snd_card"
#
#     slave {
#          # aRts full duplex fix (see more comments above):
#          period_time 0
#          period_size 1024
#          buffer_size 4096
#          rate 48000 # needed for Qemu (same as QEMU_AUDIO_ADC_FIXED_FREQ)
#     }
#
#     bindings {
#          0 0
#          1 1
#     }
#}
# Thus we need to share the underlying hardware directly, which fortunately
# is smart or simple enough to let us do that:
pcm.input {
      type plug
      slave.pcm "snd_card"
}

# This is what we want as our default device
# a fully duplex (read/write) audio device.
pcm.duplex {
     type asym
     playback.pcm "output"
     capture.pcm "input"
}

###################
# CONVERSION PLUG #
###################
# Setting the default pcm device allows the conversion
# rate to be selected on the fly.
# duplex mode allows any alsa enabled app to read/write
# to the dmix plug (Fixes a problem with wine).
pcm.!default {
     type plug
     slave.pcm "duplex"
}

# Apparently this is wrong (breaks mplayer for me opening the device)
#ctl.!default {
#     type plug
#     slave.pcm "snd_card"
#}

########
# AOSS #
########
# OSS dsp0 device (OSS needs only output support, duplex will break some stuff)
pcm.dsp0 {
     type plug
     slave.pcm "output"
}

This configuration file sets up ALSA to mix all (well, most) of its outputs into one so they can be heard on the same speakers at the same time. Unfortunately ALSA cannot share its inputs so that several applications record at the same time. This shortcoming is ugly, but we'll get by thanks to the fact that most sound cards (even onboard ones) can do this in hardware. So this configuration creates a new "virtual" sound card with the real sound card's inputs and the mixed sound card outputs. Then it sets that "virtual" sound card as default, so all ALSA-compatible applications will use it. If you really need to mix inputs, go for JACK or PulseAudio instead of ALSA, and good luck to you. [Edit: Also look into OpenAL, and see http://blogs.adobe.com/penguin.swf/2007/05/welcome_to_the_jungle.html for a nice graph showing dependencies between audio libraries]
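Since a single typo in .asoundrc can silently drop one of these pieces, a quick sanity check is to list which PCM devices the file actually defines. The following is only a rough sketch: list_pcms is a name I made up, and it just pattern-matches uncommented "pcm.NAME {" lines at the start of a line; it does not parse ALSA's configuration syntax properly.

```shell
#!/bin/sh
# Sketch only: list_pcms is a hypothetical helper, not an ALSA tool.
# It prints the NAME of every uncommented top-level "pcm.NAME {" line
# found in the given file (commented-out and ctl.* entries are skipped).
list_pcms() {
    sed -n 's/^pcm\.\([^ {]*\)[ ]*{.*$/\1/p' "$1"
}
```

Run against the file above ("list_pcms ~/.asoundrc"), I would expect it to print snd_card, output, input, duplex, !default and dsp0, one per line. If one of them is missing, that is the block to look at first.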

But what about non-ALSA-compatible applications? We'll deal with OSS applications here, because that's the vast majority of them all. Well... those OSS applications are used to hogging the sound device. Our configuration file creates a special dsp0 output for them to share, but we need to run them with the special command aoss, which is part of the alsa-oss package on some distributions. If your distribution doesn't include this package (CentOS doesn't, for example) you'll need to compile it yourself.
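In practice this just means prefixing the application's command line with aoss. Here is a small sketch of a wrapper that does so when aoss is installed and falls back to running the program directly (with a warning) when it is not. The function name run_oss_app is made up for this example, and the fallback is my own choice: without aoss the application will go back to hogging the device.

```shell
#!/bin/sh
# Sketch only: run_oss_app is a hypothetical wrapper name.
# Runs the given command through aoss when it is available,
# otherwise warns on stderr and runs the command directly.
run_oss_app() {
    if command -v aoss >/dev/null 2>&1; then
        aoss "$@"
    else
        echo "aoss not found; running '$1' without it" >&2
        "$@"
    fi
}
```

For example, "run_oss_app some-oss-player some.mp3", where some-oss-player stands for whatever OSS-only program you need to run.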

Most ALSA applications will be able to use that configuration file, although they may need some specific adjustments (or they may behave suboptimally). Some applications support several audio systems, but they tend to be difficult to set up precisely because of this complexity. There are no general rules for this, and it can get quite tedious to solve each case. For example Qemu (and its derivative KVM) supports ALSA and OSS, although you may have it packaged without ALSA support. In that case you'll have to recompile it. You'll need to run the following commands prior to launching Qemu:

export QEMU_AUDIO_DAC_FIXED_FREQ=48000
export QEMU_AUDIO_ADC_FIXED_FREQ=48000
export QEMU_ALSA_DAC_BUFFER_SIZE=16384
export QEMU_ALSA_ADC_BUFFER_SIZE=16384

These parameters should match the ones given in the ALSA configuration file. The frequency should match the card's native one. For example HD Audio uses 48000 Hz (most do), but some use 44100 Hz instead. You can choose any frequency, but if it doesn't match the card's frequency then sound quality will be degraded because sound will be resampled at your chosen frequency. The buffer size should be less than the card's maximum buffer size, and it should be a power of 2 for OSS compatibility. If the card's buffer is too small sound quality may be bad and there's not much one can do about it sometimes: Qemu may experience buffer xruns because of this, and we cannot elevate its CPU priority, so our only chance will be to use its OSS audio interface and invoke Qemu through aoss, which usually gives better sound performance. But ALSA's low buffer limit may disappear if the right parameter is passed to the kernel module (e.g., our "model=6stack-digout" parameter mentioned above).
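The power-of-2 rule is easy to get wrong when experimenting with buffer sizes, so here is a small sketch that checks it before printing the four export lines. Both function names (is_power_of_two and qemu_audio_env) are made up for this example; the QEMU_* variable names are the ones shown above. It does not check the value against the card's maximum buffer size, which you still have to find out yourself.

```shell
#!/bin/sh
# Sketch only: is_power_of_two and qemu_audio_env are hypothetical helpers.

# Succeeds (exit status 0) when $1 is a positive integer power of two.
is_power_of_two() {
    n=$1
    # Reject empty, non-numeric, zero and negative values.
    [ "$n" -gt 0 ] 2>/dev/null || return 1
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    [ $(( n & (n - 1) )) -eq 0 ]
}

# Prints the export lines for a given rate and buffer size,
# or fails if the buffer size is not a power of two.
qemu_audio_env() {
    rate=$1
    buf=$2
    if ! is_power_of_two "$buf"; then
        echo "buffer size '$buf' is not a power of 2" >&2
        return 1
    fi
    printf 'export QEMU_AUDIO_DAC_FIXED_FREQ=%s\n' "$rate"
    printf 'export QEMU_AUDIO_ADC_FIXED_FREQ=%s\n' "$rate"
    printf 'export QEMU_ALSA_DAC_BUFFER_SIZE=%s\n' "$buf"
    printf 'export QEMU_ALSA_ADC_BUFFER_SIZE=%s\n' "$buf"
}
```

To apply the settings in the current shell before launching Qemu, something like eval "$(qemu_audio_env 48000 16384)" should do.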

Another problem is that the window manager's mixer interface can be deceiving. For example, KDE's Kmix has 3 tabs (Output, Input and Switches), but the items' names are quite confusing. In the HD Audio case, you have to set the three "Input Sources" on the Switches tab to your input devices (e.g., Mic, Line and Front Mic) while on the Input tab you'll have to enable all three nondescript "Capture" inputs and raise their levels; finally you'll have to turn off the "Analog Mix" in the Output tab, enable your input devices (Mic, Line and Front Mic) and raise their levels. All this is needed in order to be able to record sound from your inputs, but it's not at all obvious by looking at Kmix. Other mixer UIs (Gnome's, alsamixer...) are equally obscure. This ambiguity stems from the kernel driver's generality (it needs to drive several different cards with the same chip), from the distribution installer which should detect or ask for hardware details and from the mixers' graphical user interface's own disorganization and lack of help.

All in all, sound can be a major roadblock for the adoption of Linux as a desktop OS.

Some debugging commands:


artsdsp 
aoss 
mpg123 -a dsp0 some.mp3
aplay -D default some.wav
alsaplayer -d default some.mp3
qemu-kvm -audio-help
amixer
aplay -l
fuser /dev/snd/*
alsactl names -f /dev/stdout
arecord -M -fdat | aplay -vv

A few references:


http://www.sabi.co.uk/Notes/linuxSoundALSA.html
http://gentoo-wiki.com/HOWTO_ALSA_sound_mixer_aka_dmix
http://ubuntuforums.org/archive/index.php/t-205924.html
http://ubuntuforums.org/showthread.php?p=3464546

posted by ZoydW # 10:05 PM 1 Comments
