Saturday, April 17, 2010

Goodbye Portaudio! Long live QtMultimedia!

The sound stack of simon has long been a source of many issues. This was mainly because it relied on Portaudio, which sadly isn't well supported by the sound configuration of e.g. Ubuntu because it interferes with their Pulseaudio setup. Long story short: users of Ubuntu often had completely unusable simon installations because simon crashed often and seemingly at random. Because those crashes happened in Portaudio's code and not in simon's, there was little we could do.

Last week, I finally found some time to throw out all the old sound handling code and replace it with a completely new, QtMultimedia-based system. QtMultimedia is still a very young library and has issues of its own, but I suspect that those will get fixed pretty quickly.

While I was at it, I also implemented a much cleaner way to stream audio to simond. Older versions used Julius' libsent to do this because of its voice activity detection implementation. We have now implemented a similar system (configurable, level-based voice activity detection) in simon itself and have complete control over the audio stream. Thanks to the new implementation, I also added a feature to keep recognition samples, complete with their recognition results, on the server. This could, for example, be used to gather training data during normal usage. All you'd need to do is check whether the words were recognized correctly and add them to the model.
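
To give a rough idea of what level-based voice activity detection means, here is a minimal Python sketch. It is not simon's actual code: the names `peak_level` and `detect_segments`, the `threshold` and `hangover` parameters, and the assumption of signed 16-bit samples are all made up for illustration.

```python
from typing import List, Sequence, Tuple


def peak_level(frame: Sequence[int], full_scale: int = 32768) -> float:
    """Peak of a frame of signed 16-bit samples, scaled to the range 0..1."""
    return max((abs(s) for s in frame), default=0) / full_scale


def detect_segments(
    frames: Sequence[Sequence[int]],
    threshold: float = 0.1,  # hypothetical default, would be configurable
    hangover: int = 2,       # quiet frames tolerated before a segment ends
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices (end exclusive) of loud segments."""
    segments = []
    start = None
    quiet = 0
    for i, frame in enumerate(frames):
        if peak_level(frame) >= threshold:
            if start is None:
                start = i  # level rose above the threshold: segment begins
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= hangover:
                # Enough silence: close the segment at the last loud frame.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # The input ended while a segment was still open.
        segments.append((start, len(frames) - quiet))
    return segments
```

The hangover keeps a short pause between two words from splitting one utterance into two; only the frames inside a segment would have to be streamed to the server.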

Because all sound input and output is now handled through a central point, I implemented a fairly primitive sound server that handles multiple simultaneous streams correctly. Recordings made while simon is activated now start much faster (because the sound device handle simply stays open) and are of course completely stable. You even get automatic pausing and unpausing for interrupted streams: if, for example, you start to record one sample and then start to record another while the first is still running, the first one will pause until you are done recording the second.
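
The pausing behaviour can be pictured as a simple stack of streams. The following Python sketch only tracks state and does no real audio I/O; the class `SoundServer` and its methods are hypothetical names for illustration, not simon's API.

```python
from typing import List, Optional


class SoundServer:
    """Toy model: only the most recently started stream is active."""

    def __init__(self) -> None:
        self._streams: List[str] = []  # oldest first, newest last

    def start(self, name: str) -> None:
        # Starting a new stream implicitly pauses the one below it.
        self._streams.append(name)

    def stop(self, name: str) -> None:
        # Removing the top stream implicitly resumes the previous one.
        self._streams.remove(name)

    @property
    def active(self) -> Optional[str]:
        return self._streams[-1] if self._streams else None

    @property
    def paused(self) -> List[str]:
        return self._streams[:-1]


server = SoundServer()
server.start("sample 1")
server.start("sample 2")  # "sample 1" is now paused
server.stop("sample 2")   # "sample 1" becomes active again
```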

The new implementation also has a much better level meter integrated into the recording widget so you can check your current microphone volume while you record. If you start to clip, simon will now automatically display a warning message telling you to re-record the sample.
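
A clipping check of this kind can be very simple. The sketch below assumes signed 16-bit samples and counts how many of them sit at full scale; the function names and the `max_clipped` parameter are made up for illustration and say nothing about how simon actually decides.

```python
from typing import Sequence

FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample


def level_percent(samples: Sequence[int]) -> float:
    """Peak level of the samples as a percentage of full scale (for a meter)."""
    peak = max((abs(s) for s in samples), default=0)
    return min(100.0, 100.0 * peak / FULL_SCALE)


def is_clipped(samples: Sequence[int], max_clipped: int = 1) -> bool:
    """True if at least `max_clipped` samples reached full scale."""
    clipped = sum(1 for s in samples if abs(s) >= FULL_SCALE)
    return clipped >= max_clipped
```

A recording widget could call `level_percent` on every incoming buffer to drive the meter and show the warning as soon as `is_clipped` returns true.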

Btw, QtMultimedia also works on e.g. Symbian devices, so a simond client on a mobile phone should be trivial now.

All this has already been merged to the master branch and works very well in my tests. However, just like any new code, it might contain bugs, so try it at your own risk :).
