ELECINF344/381

Interactive section of the ELECINF344/ELECINF381 teaching site of Télécom ParisTech (2011 edition).


[CASPER]: text-to-speech on the BeagleBoard

Here is a short summary of what has been done today regarding text-to-speech on the BeagleBoard.

Audio with ALSA on the BeagleBoard

First, I would like to explain the steps we followed to get the audio output working on the BeagleBoard without damaging the TPS65950, which manages not only the audio but also the power supply (so now you can see why we really should not burn this one down).

Our SD card holds a bootstrapped Ubuntu Linux distribution with ALSA installed.

To use ALSA without being the superuser, add the normal user to the audio group and reboot the BeagleBoard.

Then, open the alsamixer program.

Here is what you SHOULD NOT do, even though it is advised on some forums: enable each and every device in alsamixer.
This will cause the TPS65950 chip to overheat, and may damage it.


What you should do is enable only what is necessary:

  • Increase the volumes of DAC2 Analog, DAC2 Digital Coarse and DAC2 Digital Fine
  • Increase the volume of the headset
  • Enable HeadsetL2 and HeadsetR2

You should now have a working audio output.
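The setup above can also be done from the command line. This is only a sketch: the control names below are assumptions based on the TWL4030/TPS65950 codec as ALSA typically exposes it, and the exact names and levels may differ on your kernel, so check `amixer scontrols` first.

```shell
# Let the normal user access ALSA devices (log out/in or reboot afterwards)
sudo usermod -aG audio "$USER"

# Raise only the controls we actually need (names assumed, verify on your board)
amixer set 'DAC2 Analog' 80%
amixer set 'DAC2 Digital Coarse' 80%
amixer set 'DAC2 Digital Fine' 80%
amixer set 'Headset' 70%

# Route DAC2 to the headset output; leave every other path disabled
# to avoid overheating the power-management chip
amixer set 'HeadsetL Mixer AudioL2' on
amixer set 'HeadsetR Mixer AudioR2' on
```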


Text-to-speech

In order for our whole application to work properly on the board, we decided not to use PulseAudio (which requires up to 40% of the CPU on the board). Instead, we implemented our own interface for the audio output, which handles all the write requests from internal threads such as the text-to-speech engine’s thread. This interface stores the incoming samples, pre-processes them to fit ALSA’s interleaved PCM format, and plays them on the audio output.

We successfully tested this interface today by synthesizing speech with SVOX Pico on the BeagleBoard and playing it on the audio output at the same time.

The whole process requires 30% of the CPU during a short period (synthesis and sample post-processing) and then 0 to 0.7% of the CPU during the rest of the process, which is good news compared to the 40% CPU minimum required during the whole process in our previous experiments.

The next step will be to port the CMU Sphinx speech-recognition “hello world” we designed to the BeagleBoard.

On the same topic:

  1. [CASPER] Reconnaissance vocale et synthèse
  2. [CASPER] Beagleboard, pilotes et pwm
  3. [CASPER] Reconnaissance et synthèse vocale / Solutions Wi-Fi
  4. [Casper] OpenCV et Beagleboard
  5. [CASPER] Dernières avancées…

Comments are closed.