How To Do Stuff I Found Difficult: 2011

Saturday, August 6, 2011

Project Schliemann / GLF / Generic Languages Framework with Netbeans

I've been trying for some time to get editing support in Netbeans for my custom language. There are many different APIs for doing this, depending on the version of Netbeans and the level of integration you want. For my purposes, only one of them was not ridiculously over-complex: Project Schliemann (also known as ".nbs files" or the "Generic Languages Framework"). The basic idea is that you write a single file containing the grammar and rules for your language, and a generic parser reads that file and can then do some basic syntax checking, etc. on your code. Unfortunately, it has been removed from recent versions of Netbeans. On top of that, version 6.0 has a bug which prevented me from using it. That bug is fixed in 6.1, which is the version I ended up using.

After many tutorials and much searching online, as well as several attempts to write or copy an NBS file myself, I still had made no progress. As a last-ditch effort, I decided to try to find a complete working GLF Netbeans project, to see if perhaps the entire API was broken. I discovered a Prolog language support module, written by Rosa Gutierrez, on this page: http://edu.netbeans.org/courses/nbplatform-certified-training/linz.html. Unfortunately, the link to the source was dead! So I emailed her, and she was able to send me a copy of the source, which you can now download here.

Once I had a working module, I was able to incrementally modify it to do what I wanted. However, there is one catch which I didn't see mentioned anywhere else: every piece of text in your file must match one of the tokens you defined. If something in the file does not match any token, you get an error, even if the grammar lists that exact text as acceptable. That's a little confusing, but the basic idea is that even if your grammar defines

Statement = "Hello" | "Goodbye";

if "Hello" and "Goodbye" do not somehow fall under the category of one of the tokens you defined, you will get an error.
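To make the token quirk concrete, here is a toy Python model of a lexer that runs before any grammar is consulted. This is only a sketch of the general idea, not how GLF is actually implemented; the token names and patterns in TOKEN_PATTERNS are made up for illustration and do not come from any real .nbs file.

```python
import re

# Hypothetical token definitions, for illustration only.
TOKEN_PATTERNS = [
    ("identifier", re.compile(r"[A-Za-z]+")),
    ("whitespace", re.compile(r"\s+")),
]


def tokenize(text):
    """Split text into (token_type, value) pairs.

    Raises ValueError as soon as some input matches no token pattern,
    which happens before a grammar would ever see the text.
    """
    pos = 0
    tokens = []
    while pos < len(text):
        for name, pattern in TOKEN_PATTERNS:
            match = pattern.match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            # No pattern matched: the lexer fails, whatever the grammar says.
            raise ValueError(f"no token matches {text[pos]!r} at position {pos}")
    return tokens


if __name__ == "__main__":
    print(tokenize("Hello Goodbye"))
```

With these patterns, "Hello" lexes fine because it falls under the identifier token, while "Hello!" fails in the lexer, even if a grammar rule were to list "Hello!" as acceptable.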

Another quirk of the grammar is that its "root" must be labeled "S". So for example, a simple grammar might be defined entirely as

S = "hello world" | "goodbye world";

I'll post any more quirks or tips I encounter here.

Sunday, June 19, 2011

Fixing bad voices produced with festvox (or "Hey! My voice don't work!") (or "How to fix bad labellings")

Someone asked on a mailing list for possible ways to fix a bad voice they produced using festvox. I realized that my answer took me quite some time to figure out without any help, so I thought I'd post my response here.

Basically, if your voice is bad, chances are your labeling of some of the prompts is bad. (Even if it isn't, it doesn't hurt to make sure the labels are good.) You want to fix the bad labellings.

To do so, copy the contents of your wav folder and the contents of your lab folder into the same directory (or set up links to make it seem that way). Once you've done that, open up the wav files with wavesurfer, and choose the "transcription" view for all of them. Now you can go through them one by one and check whether the labellings are right. For each bad one, your options are:

Re-record the prompt. Remember to run bin/make_labs again before checking the labels again. I made this mistake once, and kept re-recording and thinking that the autolabeller sucked. To save time, you can run bin/make_labs prompt-wav/test001.wav to relabel just test001.wav instead of all the recordings, which can be time-consuming.
Hand-correct the labels. You can literally just drag the labels around within wavesurfer. Remember to copy your changes back to the lab/ directory.
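Setting up the links by hand gets tedious with many prompts. Here is a minimal Python sketch of that step, assuming the usual festvox layout with wav/*.wav and lab/*.lab inside the voice directory. The function name link_for_review and the review directory are my own inventions, not part of festvox.

```python
import os
from pathlib import Path


def link_for_review(voice_dir, review_dir):
    """Symlink every wav/*.wav and lab/*.lab into one review directory.

    Returns the names of the files now present in review_dir.
    Assumes a platform where os.symlink works (e.g. Linux).
    """
    voice_dir = Path(voice_dir).resolve()
    review_dir = Path(review_dir)
    review_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for subdir, extension in (("wav", ".wav"), ("lab", ".lab")):
        for source in sorted((voice_dir / subdir).glob("*" + extension)):
            destination = review_dir / source.name
            # Skip names that already exist so the function can be re-run.
            if not os.path.lexists(destination):
                os.symlink(source, destination)
            linked.append(destination.name)
    return linked
```

Calling link_for_review(".", "review") from the voice directory should leave a review/ folder you can open in wavesurfer. Whether an edited label is written through the link or replaces it depends on how the editor saves files, so it is still worth checking that your changes actually ended up in lab/.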

Once you've got all the labels as perfect as you care to have them, just repeat all the steps after "bin/make_labs prompt-wav/*.wav" from whatever tutorial you are following and you should get the voice built with proper labeling (Come on, I know that if you knew how to do anything with festvox without a tutorial in front of you, there's no way you would need to be reading this post).

Thursday, June 9, 2011

Switching between multiple grammars with pocketsphinx

I was having difficulty understanding the pocketsphinx API, specifically when it comes to switching between multiple grammars.

Here's how it works:

Pocketsphinx actually keeps track of a set of grammars at all times. Normally, this set has only one element. However, it can contain multiple grammars, with only one switched on at a time. The basic method is:

Get this set of grammars using ps_get_fsgset()
Add your grammar to the set using fsg_set_add()
Select your grammar from the set as the active one using fsg_set_select()
Notify the recognizer that you have updated the grammar using ps_update_fsgset()

(This assumes that the recognizer was initially instantiated with an FSG rather than an N-gram model. Otherwise, you first need to switch it to an FSG model.)

Example code:

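ps_decoder_t *p = ...; // Decoder already initialized somehow
fsg_model_t *m = ...; // Load the model using fsg_model_read, or jsgf_parse_file and jsgf_build_fsg
fsg_set_t *fsgset = ps_get_fsgset(p); // The decoder's current set of grammars
fsg_set_add(fsgset, "newgrammarname", m); // Register the new grammar under a name
fsg_set_select(fsgset, "newgrammarname"); // Make it the active grammar
ps_update_fsgset(p); // Tell the decoder that the set has changed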

NOTE: I realize that even jsgf_build_fsg is confusing. Here's how you should handle it:

jsgf_build_fsg(jsgfmodel, rule, ps_get_logmath(p), 6.5);

where jsgfmodel is the JSGF grammar loaded using jsgf_parse_file, p is the decoder from the example above, and "rule" is a rule chosen from the grammar (use the jsgf_* functions to select the rule). Also, free the JSGF grammar using jsgf_grammar_free once the FSG has been created.

Oh yeah, and the 6.5 just seems to be a magic number. I've seen it used in two places without any explanation, and the documentation says nothing about what the parameter "lw" does. (It is most likely the language weight: 6.5 is also the default value of pocketsphinx's -lw option.) So I'd just stick to the value 6.5 and hope for the best...

Friday, March 4, 2011

Pointyclicky for festival/festvox

There are references scattered throughout the festvox documentation to a GUI program for recording speech prompts, called "pointyclicky". Sounds nice, right?

Well, if you try to find the sources, you find that it was last updated in 2000 (!), and it's in disrepair. (http://festvox.org/pointyclicky/)

It won't compile any more, but Kev 'Kyrian' Green at the Red Hat Bugzilla patched it up so it's closer to compiling. I also made some more changes so it would work for me on Debian, but I don't remember exactly what I did. Anyway, you can download my patched version of the sources from here, which should be up at least for the next four years.

Tuesday, February 8, 2011

Building (Open) AstroMenace on Debian

To start, download the sources from http://sourceforge.net/projects/openastromenace/, install the required dependencies (ReadMe.txt will tell you, or you can try just building it and see what it complains about missing), and build (cd build; cmake ..; make;).

While trying to build (Open) AstroMenace on Debian I encountered the following error:

OpenAstroMenaceSVN/AstroMenaceSource/Core/RendererInterface/OGL_Draw3D.cpp:38: error: ‘PFNGLCLIENTACTIVETEXTUREPROC’ does not name a type

Googling only turned up this, which was exactly the problem I encountered, but offered no solution. I figured out it was missing a definition from some OpenGL header. So I traced back the includes, and found the following text in OpenAstroMenaceSVN/AstroMenaceSource/Core/Base.h

#ifdef WIN32
...
#endif
#if defined(__APPLE__) && defined(__MACH__)
...
#else
#define __glext_h_ // Don't let gl.h include glext.h
#include <GL/gl.h> // Header File For The OpenGL32 Library
#include <GL/glu.h> // Header File For The GLu32 Library
#undef __glext_h_
#endif

Now, it has the comment "Don't let gl.h include glext.h". However, I don't see why not, because when I comment out the #define and #undef statements, it compiles fine! It should look like:

#else
//#define __glext_h_ // Don't let gl.h include glext.h
#include <GL/gl.h> // Header File For The OpenGL32 Library
#include <GL/glu.h> // Header File For The GLu32 Library
//#undef __glext_h_
#endif

So then it compiled correctly, but I still got a very strange error when linking:

Linking CXX executable AstroMenace
c++: `sdl-config: No such file or directory
make[2]: *** [AstroMenace] Error 1
make[1]: *** [CMakeFiles/AstroMenace.dir/all] Error 2
make: *** [all] Error 2

It looks like some sort of quotation mismatch: the stray backtick suggests that a `sdl-config ...` command substitution was handed to c++ literally instead of being expanded by a shell. I went through the CMake setup file, and it seemed fine. So instead, I just tried to link it manually. I found the linker command it was trying to execute in CMakeFiles/AstroMenace.dir/link.txt, but I couldn't find anything wrong with it, so I copied the contents, pasted them into a terminal, and pressed enter. This worked, presumably because the terminal's shell expands the backticks itself. For reference, this is what the file contained for me: http://pastebin.com/U8UifLAq
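If you would rather not copy and paste, the same workaround can be scripted. This is a minimal Python sketch, assuming a POSIX shell is available and that the command is meant to be run from the build directory. run_link_command is a hypothetical helper of mine, not something CMake provides. It simply hands the contents of link.txt to a shell, just like pasting them into a terminal.

```python
import subprocess
from pathlib import Path


def run_link_command(link_file, cwd="."):
    """Run the linker command stored in a CMake link.txt through a shell.

    Using shell=True means backticks and other shell syntax in the
    command are expanded, as they would be in an interactive terminal.
    Raises subprocess.CalledProcessError if the command fails.
    """
    command = Path(link_file).read_text().strip()
    return subprocess.run(
        command,
        shell=True,
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
```

From the build directory, that would be run_link_command("CMakeFiles/AstroMenace.dir/link.txt"). Only do this with a link.txt you generated yourself, since whatever is in the file gets executed.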

Once it built, I then went to download the necessary data files (under the vfs section of openastromenace downloads on sourceforge, download the data, and a language), and extracted their contents into the build directory.

Then run ./AstroMenace and .... well...it works for me at this point!

Sunday, February 6, 2011

Creating a voice with Festvox for Festival in debian/linux

Okay, Debian provides packages for speech-tools, but they aren't complete. On top of that, the latest package versions provided at festvox.org are incompatible with each other. Instead of trying to figure that out, I suggest checking out the sources from SVN: http://developer.berlios.de/svn/?group_id=3272

All we care about are speech_tools and

./configure and make them both.

Then start following this tutorial:

http://festvox.org/bsv/x1003.html

However, there might be a few problems. Recording audio is a little bit funky, as the scripts use na_record, which doesn't support ALSA. Hopefully you have OSS support enabled. When you get to the point where you are going to execute bin/prompt_them, you might want to modify it first.

The line

$ESTDIR/bin/na_record -f 16000 -t $duration wav/$f.wav

can be replaced with

$ESTDIR/bin/na_record -audiodevice /dev/dsp1 -f 16000 -time $duration wav/$f.wav

if you want to record from the OSS device /dev/dsp1.

Or, you can replace it with arecord (which uses ALSA) like so:

arecord -Dplughw:1 -f S16_LE -r 16000 -d $duration wav/$f.wav

(this will record 16-bit audio from the ALSA device plughw:1; without -f S16_LE, arecord defaults to 8-bit samples).

You can also uncomment various features inside this file, such as waiting for you to press return before recording, or playing back your recording immediately after you record it.

I'll update this as I encounter more problems.

Friday, January 14, 2011

Merging in git while ignoring newlines (or, Convert newline style to match file-by-file)

I wanted to merge two commits that are mostly the same, except that some of the files have different newline endings. My problem looked like this:

    First commit with strange line endings
    |
    V
A-B-C-D-E-F-G

    H
    ^
    |
    Second commit with strange line endings

I wanted to rebase H on top of C, and then merge with G. But C and H had differing newline endings, and there was no pattern to which files had which type. So my solution was to hack together an absolutely horrible script that goes through each file in H and converts it to UNIX or DOS newline style, depending on which one gives it the smaller diff against the same file in C.

Here it is:

   find . -path ./.git -prune -o ! -type d -exec sh -c 'unix2dos -q "$1"; DOS=$(git diff COMMITC -- "$1" | wc -c); dos2unix -q "$1"; UNIX=$(git diff COMMITC -- "$1" | wc -c); if [ $DOS -lt $UNIX ]; then unix2dos -q "$1"; fi' sh '{}' \;

You execute it with commit H checked out, and COMMITC replaced with the name of your "commit C" in my diagram.
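The one-liner is hard to read, so here is the same decision rule spelled out as a minimal Python sketch. It works on file contents in memory and uses difflib instead of git diff to measure how different two versions are, so the numbers will not match git's exactly. All function names here are my own. As in the script, DOS endings win only if they give a strictly smaller diff.

```python
import difflib


def to_unix(data: bytes) -> bytes:
    """Convert CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


def to_dos(data: bytes) -> bytes:
    """Convert line endings to CRLF, normalizing to LF first."""
    return to_unix(data).replace(b"\n", b"\r\n")


def diff_size(candidate: bytes, reference: bytes) -> int:
    """Count the bytes on lines that differ between the two versions."""
    ref_lines = reference.splitlines(keepends=True)
    cand_lines = candidate.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, ref_lines, cand_lines, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += sum(len(line) for line in ref_lines[i1:i2])
            changed += sum(len(line) for line in cand_lines[j1:j2])
    return changed


def match_newlines(candidate: bytes, reference: bytes) -> bytes:
    """Return candidate in whichever newline style is closer to reference."""
    unix = to_unix(candidate)
    dos = to_dos(candidate)
    if diff_size(dos, reference) < diff_size(unix, reference):
        return dos
    return unix
```

Here candidate would be the contents of a file in commit H and reference the contents of the same file in commit C. Reading the two versions and writing the result back is left out to keep the sketch short.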