Interactive section of the ELECINF344/ELECINF381 course website at Télécom ParisTech (2011 edition).


[Casper] Audio in/out on the beagleboard, and drivers

Audio in/out

As I said in a previous post, we can now synthesize speech from a text input and play the result directly on the audio output jack, using a home-made interface between the synthesis engine and ALSA.

We also had to port our speech recognition hello-world to the beagleboard. We first compiled the CMU Pocketsphinx library for the board, that is to say for an ARM target, and then the hello-world program.

The program successfully recognized commands that we had recorded and then played back on the laptop, with the laptop's headset output connected to the beagleboard's audio input by an appropriate cable.

We now have to connect our microphones electrically to the beagleboard's audio input.



Apart from the progress on audio, we also managed to compile a hello-world Linux kernel module on the board, despite the current custom kernel's lack of certain header files.
The hello world ran properly: we were able to write a string to it and read it back.
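
The write-then-read check described above can also be done from a small user-space program. The following is only a minimal sketch: it assumes the module exposes a character device node (for example a hypothetical /dev/hello, a name that is not from our actual module) and that the driver returns the stored string from the start on each new open. The function name write_then_read is made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write `msg` to the file at `path`, then reopen it and read the data back
 * into `buf` (at most len - 1 bytes, always NUL-terminated).
 * Returns the number of bytes read back, or -1 on error.
 * `path` is expected to be the module's device node (hypothetical name,
 * e.g. /dev/hello); the node must already exist.
 */
ssize_t write_then_read(const char *path, const char *msg, char *buf, size_t len)
{
    assert(path != NULL && msg != NULL && buf != NULL && len > 0);

    /* First open: send the string to the driver. */
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    size_t msg_len = strlen(msg);
    ssize_t written = write(fd, msg, msg_len);
    close(fd);
    if (written < 0 || (size_t)written != msg_len)
        return -1;

    /* Second open: the read position starts at zero again. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t got = read(fd, buf, len - 1);
    close(fd);
    if (got < 0)
        return -1;
    buf[got] = '\0';
    return got;
}
```

Calling write_then_read("/dev/hello", "hello", buf, sizeof buf) and comparing buf with the string that was sent would reproduce the manual test, provided the device node has that name.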

The next step will be to start developing our custom Linux device driver, which will be responsible for Casper's mechanical control.


After some discussions over the internet, we finally agreed on the following list of goals and milestones:

An interface task is placed between SVOX Pico and ALSA on the beagleboard - 18/04 Thibault
The beagleboard and Casper PCB communicate over a UART link - 19/04 Alain
We store and read images to and from the flash memory on the Casper PCB - 20/04 Alain
We are capable of having the beagleboard read a text file out loud - 20/04 Thibault
The beagleboard is able to recognize predetermined vocal commands - 21/04 Thibault
The LCD display adapts itself according to the robot's orientation - 22/04 Alain
We are capable of controlling the robot's tilt angle - 22/04 Thomas
We are capable of controlling the robot's orientation (azimuth) - 22/04 Thomas
The robot has been assembled, and the devices (camera, PCB, …) are on board - 23/04 Thomas
We can make the robot track a face (the robot's orientation follows slow movements) - 25/04 Alain
The API we previously defined is accessible from Lua scripts - 27/04 Thibault
The robot is capable of accessing a simple mail box (without encryption protocols) - 28/04 Alain
The robot has a simple remote HTTP setup interface - 29/04 Thomas
We use a Lua script, using the API, to drive the robot for the demo - 29/04 Thomas
The servomotors are being driven by a Linux device driver on the beagleboard - 29/04 Thibault

[CASPER] : text-to-speech on the beagleboard

Here is a short summary of what was done today on text-to-speech on the beagleboard.

Audio with alsa on the beagleboard

First, I would like to explain the steps we followed to get the audio output working on the beagleboard without damaging the TPS65950, which handles the audio but also the power supply (so you can see why we would rather not burn this chip out).

Our SD card holds a bootstrapped version of the Ubuntu Linux distribution, with ALSA installed.

To use ALSA without being the superuser, add your normal user to the audio group and reboot the beagleboard.

Then, open the alsamixer program.

Here is what you SHOULD NOT do, even though some forums advise it: enable each and every device in alsamixer.
This will cause the TPS65950 chip to overheat, and may damage it.


What you should do is enable only what is necessary:

  • Increase the volume of DAC2 analog, DAC2 digital coarse and DAC2 digital fine.
  • Increase the volume of the headset.
  • Enable headsetL2 and headsetR2.

You should now have a working audio output.



For our whole application to run properly on the board, we decided not to use pulseaudio (which requires up to 40% of the CPU on the board). Instead, we decided to implement our own interface for the audio output, which handles all the write requests from internal threads such as the text-to-speech engine's thread. This interface stores the corresponding samples, pre-processes them to fit ALSA's interleaved PCM format, and plays them on the audio output.
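
To illustrate the pre-processing step, here is a minimal sketch of what fitting samples to an interleaved layout can look like. It is not our actual interface code: it assumes the synthesis engine delivers 16-bit mono samples and that the output device is opened as 16-bit stereo with interleaved access, and the function name mono_to_interleaved_stereo is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Duplicate each mono sample into a left/right pair, producing the
 * interleaved frame layout L0 R0 L1 R1 ... in `out`.
 * `out` must have room for 2 * n_frames samples.
 * Returns the number of stereo frames written (equal to n_frames).
 * Assumption: input is 16-bit mono, output is 16-bit stereo.
 */
size_t mono_to_interleaved_stereo(const int16_t *mono, size_t n_frames,
                                  int16_t *out)
{
    assert(mono != NULL && out != NULL);

    for (size_t i = 0; i < n_frames; i++) {
        out[2 * i]     = mono[i]; /* left channel  */
        out[2 * i + 1] = mono[i]; /* right channel */
    }
    return n_frames;
}
```

The returned value is a frame count rather than a byte count, which is the unit that ALSA's snd_pcm_writei expects for its size argument when the buffer is handed over for playback.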

We successfully tested this interface today by synthesizing speech with SVOX Pico on the beagleboard and playing it on the audio output at the same time.

The whole process requires 30% of the CPU for a short period (synthesis and posting of the samples) and then between 0 and 0.7% of the CPU for the rest of the process, which is good news compared to the 40% CPU minimum required during the whole process in our previous experiments.

The next step will be to port the CMU Sphinx recognition helloworld we designed to the beagleboard.