Google Speech API now supports Greek in speech recognition

The speech recognition available in the Google Speech API converts recorded voice into text.

Here is an example of how to do this for Greek on Ubuntu.

First, let's record a short sample of speech.

$ arecord --format=S16_LE --rate=16000 --duration=3 myvoicerecording.wav

We use the arecord command (from the alsa-utils package), which records from the microphone into a WAV file with the given sample format and rate. The duration is set to 3 seconds so that we do not need to press Ctrl+C to stop the recording. The output file is myvoicerecording.wav.
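To confirm that the recording really has the format that the request to the API will declare, one option is to read the fields straight from the WAV header. The sketch below assumes the canonical 44-byte PCM header (RIFF, followed by a 16-byte fmt chunk). The function names wav_field, wav_channels, wav_sample_rate and wav_bits are made up for this example.

```shell
# Sketch only: assumes a canonical 44-byte PCM WAV header.
# Read an unsigned little-endian integer from a file.
# usage: wav_field FILE OFFSET NBYTES
wav_field() {
    value=0
    shift_bits=0
    # od prints each byte as an unsigned decimal number.
    for byte in $(od -An -tu1 -j"$2" -N"$3" "$1"); do
        value=$(( value + (byte << shift_bits) ))
        shift_bits=$(( shift_bits + 8 ))
    done
    echo "$value"
}

# Field offsets within the assumed header layout.
wav_channels()    { wav_field "$1" 22 2; }
wav_sample_rate() { wav_field "$1" 24 4; }
wav_bits()        { wav_field "$1" 34 2; }
```

For the recording above, wav_sample_rate myvoicerecording.wav should print 16000 and wav_bits myvoicerecording.wav should print 16.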

Next, we create an API key to access the Google Speech API. Follow the instructions at http://www.chromium.org/developers/how-tos/api-keys up to Step 7 to enable the Speech API for your Google account.

After completing Step 7, click Create new Key (in the Public API Access section). You will be prompted to select a type of key.

Among the different types of keys, select Server key. You will be prompted for an IP address that is allowed to access this API. For this test, you can add your current IP address (if you do not know your IP address, visit http://www.whatismyip.com/). Then, copy the API key that is generated. We will use it in the next command.

In order to send our myvoicerecording.wav to the Google Speech API, we use curl:

curl -X POST --data-binary @'myvoicerecording.wav' --header 'Content-Type: audio/l16; rate=16000;' 'https://www.google.com/speech-api/v2/recognize?output=json&lang=el&key=mykey'

Note that the ISO 639-1 language code for Greek is el. More than 40 languages are supported, so there is a good chance that yours is among them. For this to work, you need to replace mykey with your own API key.
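If you expect to send more than one recording, the curl call can be wrapped in a small shell function. This is only a sketch: the function names speech_url and recognize are hypothetical, while the endpoint, the query parameters and the Content-Type header are the ones from the command above.

```shell
# Build the request URL for a language code and an API key.
# usage: speech_url LANG KEY
speech_url() {
    printf 'https://www.google.com/speech-api/v2/recognize?output=json&lang=%s&key=%s\n' "$1" "$2"
}

# Send a recording and print the raw response.
# usage: recognize FILE LANG KEY [RATE]   (RATE defaults to 16000)
recognize() {
    curl --silent -X POST --data-binary @"$1" \
        --header "Content-Type: audio/l16; rate=${4:-16000};" \
        "$(speech_url "$2" "$3")"
}
```

With these defined, recognize myvoicerecording.wav el mykey sends the same request as the curl command above, again with mykey replaced by your own API key.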

The output is

{"result":[]}
{"result":[{"alternative":[{"transcript":"1 2 3 4 5","confidence":0.92287904},{"transcript":"ένα δύο τρία τέσσερα πέντε"},{"transcript":"1 2 3 4 και 5"},{"transcript":"ένα δυο τρία τέσσερα πέντε"}],"final":true}],"result_index":0}

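which is correct: the recording counted from one to five. The response consists of two JSON lines. The first is an empty result, and the second lists the alternative transcripts, with a confidence score on the first one.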
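To use the result in a script, we need the transcript text rather than the raw JSON. A minimal sketch with standard tools follows. It assumes that the response looks like the output above and that transcripts contain no escaped double quotes; a real JSON parser would be more robust. The function names all_transcripts and best_transcript are made up for this example.

```shell
# Print every alternative transcript, one per line.
# Reads the API response from standard input.
all_transcripts() {
    # -a makes grep treat the input as text even when it is not ASCII.
    grep -a -o '"transcript":"[^"]*"' | cut -d '"' -f 4
}

# Print only the first alternative, which is the one that
# carries the confidence score in the output above.
best_transcript() {
    all_transcripts | head -n 1
}
```

Piping the curl command above into best_transcript should then print only 1 2 3 4 5 for this recording.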

These steps show how to test the Speech API.

Ideally, you would write a program that makes use of the Speech API. The page at http://www.noobslab.com/2014/06/control-your-ubuntulinux-mint-system.html shows a set of scripts that use the Google Speech API to add voice recognition to Ubuntu.
