Tag : translation

Towards a GNOME CLI translation management tool

Update 7/June/2009: The repository was moved to https://github.com/simos/gnome-i18n-manage-vcs/. There was some confusion between this script and intltools, which now is a general localisation tool, not tied to GNOME.

In Designing a command-line translation tool for GNOME, I described how a CLI translation management tool would be used to ease the work of a translator with commit access. The discussion was continued with Leonardo‘s post Parsing damned-lies’ releases.xml.in in the command line.

The stage we are now is that we have a tool (not official GNOME tool, but rather at beta testing phase!) that can manage the repositories for us, so that the checking out and committing can be fairly automated. The source is available at https://github.com/simos/gnome-i18n-manage-vcs/.

We show two working examples.

Let’s say we want to update the documentation for gcalctool. We run

$ ./intltool-manage-vcs --language el --release gnome-2-26 \
--username simos --module gcalctool --transtype doc --init
Release  : gnome-2-26
Language : el
    Category: admin-tools
    Category: dev-tools
    Category: dev-platform
    Category: desktop
        Module:              gcalctool, Branch: gnome-2-26

Download completed successfully.
$ _

In the PO/ subdirectory there is a PO file for gcalctool. We update it using our favourite translation tool, and then

$ ./intltool-manage-vcs --language Greek --commit
Sending        el/el.po
Transmitting file data .
Committed revision 2475.
$ _

Let’s see another example. We want to update the gnome-games documentation. These are several individual PO files, for each of the games.

$ ./intltool-manage-vcs --language el --release gnome-2-26 \
--username simos --module gnome-games --transtype doc --init
Release  : gnome-2-26
Language : el
    Category: admin-tools
    Category: dev-tools
    Category: dev-platform
    Category: desktop
        Module:            gnome-games, Branch: gnome-2-26

Download completed successfully.
$ _

There are several files,

$ ls PO
aisleriot.gnome-2-26.el.po  gnibbles.gnome-2-26.el.po
gnotravex.gnome-2-26.el.po  README
blackjack.gnome-2-26.el.po  gnobots2.gnome-2-26.el.po
gnotski.gnome-2-26.el.po    same-gnome.gnome-2-26.el.po
glchess.gnome-2-26.el.po    gnome-sudoku.gnome-2-26.el.po
gtali.gnome-2-26.el.po      START
glines.gnome-2-26.el.po     gnometris.gnome-2-26.el.po
iagno.gnome-2-26.el.po      gnect.gnome-2-26.el.po
gnomine.gnome-2-26.el.po    mahjongg.gnome-2-26.el.po
$ _

We enter the PO/ subdirectory and we update those files we wish. We can also run scripts on the PO files. For example, all these documentation files contain the same fragment of the FDL license, so we can translate the license once, and then merge automatically to all translations.

Finally,

$ ./intltool-manage-vcs --language Greek --commit
Sending        el/el.po
Transmitting file data .
Committed revision 9014.
Sending        el/el.po
Transmitting file data .
Committed revision 9015.
Sending        el/el.po
Transmitting file data .
Committed revision 9016.
$ _

In the above example, we updated the documentation of three of the games.

Here are tips when using this tool

  • There is a –dry-run option that is useful when experimenting or trying for the first time.
  • You can filter which group of a release to download, based on category. Existing categories are desktop, admin-tools, dev-tools, dev-platform. Also, on translation type, either documentation or UI (if you do not specify, we get both). On module, by providing the module name.

And the current limitations

  • We currently only support SVN. This will change once the repositories move to git.gnome.org, in about two weeks time.
  • You need to have at least an initial translation (currently, the script does not svn add files). To be fixed once we move to git.
  • We do not currently update ChangeLog files. That’s why gnome-games is so cool for these experiments. Due to the git move, we would not need to mess with ChangeLog files.
  • We are dependent on the http://l10n.gnome.org/languages/el/gnome-2-26/xml URLs (replace el with your language). These URLs expose the release modules information in a nice XML file. Previously, the information used to exist in an XML file in the repository of damned-lies. Now, the information lies in the mysql database of damned-lies+vertimus, and is exposed through the above type of URL.
  • Due to the previous point, we commit to branch or trunk, depending on what is available in the latest release (gnome-2-26). That means, my translation fixes in gnome-games have not made it to trunk (HEAD). This is something that can be fixed with a workaround. It would be actually cool to use this tool to commit to both gnome-2-xx and master at the same time.
  • We currently do not deal with figures.

Considering that damned-lies+vertimus will be having commit functionality soon, I think that having more than one option for easy commiting translations is good.

Designing a command-line translation tool for GNOME

One messy task with GNOME translations is the whole workflow of getting the PO files, translating/updating/fixing them, and then uploading them back. One would need to use command line, and several different commands to accomplish this.

KDE and KBabel has a nice feature that allows you to easily grab all translation files, work on them, then commit through SVN. All through the GUI! It helps a bit here that the translation files for a specific language are located under a single directory.

The current workflow in GNOME translations typically consists of

  1. Getting the PO file from the L10n server (for example, GNOME 2.22 Greek) (also possible to use intltool-update within po/)
  2. Translate using KBabel, POEdit, GTranslator, vim, emacs, etc.
  3. svn co the package making sure you have the correct branch. One may limit to the po/ directory.
  4. Put the updated file in po/
  5. Update the ChangeLog (either with emacs, or with that Perl script)
  6. Commit the translation.
  7. (If you committed on a branch, also commit on HEAD)

Tools such as Transifex (used currently in Fedora) take away altogether the use of command line tools, and one works here through a web-based interface. Apparently, Transifex is having a command-line tool in the TODO list.
What I would like to see in GNOME translations, is a tool that one can use to

  1. Grab all or a section of the PO files from GNOME 2.22. Put them in a local folder.
  2. Use the tools of my preference (translation tools, scripts, etc) to update those translations I need to update.
  3. Commit those translation files that changed (using my SVN account), automatically add ChangeLog entries, also commit to HEAD if required.

I would prefer to have a command-line tool for this, for now, though it would be great if GUI tools would get the same functionality at some point. For a command line tool, the workflow would look like

The workflow would be something like

$ ssh-add
Enter passphrase for /home/simos/.ssh/id_rsa:
Identity added: /home/simos/.ssh/id_dsa (/home/simos/.ssh/id_dsa)
$ tsfx --project=gnome-2.22 --language=el --collection=gnome-desktop --user=simos --action=checkout
Reading from http://svn.gnome.org/svn/damned-lies/trunk/releases.xml.in... done.
Getting alacarte (HEAD)... done.
Getting bug-buddy (branch: xyz)... done.
...
Completed in 4:11s.
$ _

Now we translate any of the files we downloaded, and we push back upstream (of course, only those files that were changed).

$ tsfx --action=commit
Found local repository, Project: gnome-2.22, Language: el, Collection: gnome-desktop, User: simos
 Reading local files...
Found 6 changed files.
Uploading alacarte (HEAD)... done.
Uploading bug-buddy (branch:xyz, HEAD)... done.
...
Completed uploading translation files to gnome-2.22, language el.
$ _

FOSDEM ’08, summary and comments

I attended FOSDEM ’08 which took place on the 23rd and 24th of February in Brussels.

Compared to other events, FOSDEM is a big event with over 4000 (?) participants and over 200 lectures (from lightning talks to keynotes). It occupied three buildings at a local university. Many sessions were taking place at the same time and you had to switch from one room to another. What follows is what I remember from the talks. Remember, people recollect <8% of the material they hear in a talk.

The first keynote was by Robin Rowe and Gabrielle Pantera, on using Linux in the motion picture industry. They showed a huge list of movies that were created using Linux farms. The first big item in the list was the movie Titanic (1997). The list stopped at around 2005 and the reason was that since then any significant movie that employs digital editing or 3D animation is created on Linux systems. They showed trailers from popular movies and explained how technology advanced to create realistic scenes. Part of being realistic, a generated scene may need to be blurred so that it does not look too crisp.

Next, Robert Watson gave a keynote on FreeBSD and the development community. He explained lots of things from the community that someone who is not using the distribution does not know about. FreeBSD apparently has a close-knit community, with people having specific roles. To become a developer, you go through a structured mentoring process which is great. I did not see such structured approach described in other open-source projects.

Pieter Hintjens, the former president of the FFII, talked about software patents. Software patents are bad because they describe ideas and not some concrete invention. This has been the view so that the target of the FFII effort fits on software patents. However, Pieter thinks that patents in general are bad, and it would be good to push this idea.

CMake is a build system, similar to what one gets with automake/autoconf/makefile. I have not seen this project before, and from what I saw, they look quite ambitious. Apparently it is very easy to get your compilation results on the web when you use CMake. In order to make their project more visible, they should make effort on migration of existing projects to using CMake. I did not see yet a major open-source package being developed with CMake, apart from CMake itself.

Richard Hughes talked about PackageKit, a layer that removes the complexity of packaging systems. You have GNOME and your distribution is either Debian, Ubuntu, Fedora or something else. PackageKit allows to have a common interface, and simplifies the workflow of managing the installation of packages and the updates.

In the Virtualisation tracks, two talks were really amazing. Xen and VirtualBox. Virtualisation is hot property and both companies were bought recently by Citrix and Sun Microsystems respectively. Xen is a Type 1 (native, bare metal) hypervisor while VirtualBox is a Type 2 (hosted) hypervisor. You would typically use Xen if you want to supply different services on a fast server. VirtualBox is amazingly good when you want to have a desktop running on your computer.

Ian Pratt (Xen) explained well the advantages of using a hypervisor, going into many details. For example, if you have a service that is single-threaded, then it makes sense to use Xen and install it on a dual-core system. Then, you can install some other services on the same system, increasing the utilisation of your investment.

Achim Hasenmueller gave an amazing talk. He started with a joke; I have recently been demoted. From CEO to head of virtualisation department (name?) at Sun Microsystems. He walked through the audience on the steps of his company. The first virtualisation product of his company was sold to Connectix, which then was sold to Microsoft as VirtualPC. Around 2005, he started a new company, Innotek and the product VirtualBox. The first customers were government agencies in Germany and only recently (2007) they started selling to end-users.

Virtualisation is quite complex, and it becomes more complex if your offering is cross platform. They manage the complexity by making VirtualBox modular.

VirtualBox comes in two versions; an open-source version and a binary edition. The difference is that with the binary edition you get USB support and you can use RDP to access the host. If you installed VirtualBox from the repository of your distribution, there is no USB support. He did not commit whether the USB/RDP support would make it to the open-source version, though it might happen since Sun Microsystems bought the company. I think that if enough people request it, then it might happen.

VirtualBox uses QT 3.3 as the cross platform toolkit, and there is a plan to migrate to QT 4.0. GTK+ was considered, though it was not chosen because it does not provide yet good support in Win32 (applications do not look very native on Windows). wxWidgets were considered as well, but also rejected. Apparently, moving from QT 3.3 to QT 4.0 is a lot of effort.

Zeeshan Ali demonstrated GUPnP, a library that allows applications to use the UPnP (Universal Plug n Play) protocol. This protocol is used when your computer tells your ADSL model to open a port so that an external computer can communicate directly with you (bypassing firewall/NAT). UPnP can also be used to access the content of your media station. The gupnp library comes with two interesting tools; gupnp-universal-cp and gupnp-network-light. The first is a browser of UPnP devices; it can show you what devices are available, what functionality they export, and you can control said devices. For example, you can use GUPnP to open a port on your router; when someone connects from the Internet to port 22 on your modem, he is redirected to your server, at port 22.

You can also use the same tool to figure out what port mapping took place already on your modem.

The demo with the network light is that you run the browser on one computer and the network light on another, both on the local LAN (this thing works only on the local LAN). Then, you can use the browser to switch on/off the light using the UPnP protocol.

Dimitris Glezos gave a talk on transifex, the translation management framework that is currently used in Fedora. Translating software is a tedious task, and currently translators spent time on management tasks that have little to do with translation. We see several people dropping from translations due to this. Transifex is an evolving platform to make the work of the translator easier.

Dimitris talked about a command-line version of transifex coming out soon. Apparently, you can use this tool to grab the Greek translation of package gedit, branch HEAD. Do the translation and upload back the file.

What I would like to see here is a tool that you can instruct it to grab all PO files from a collection of projects (such as GNOME 2.22, UI Translations), and then you translate with your scripts/tools/etc. Then, you can use transifex to upload all those files using your SVN account.

The workflow would be something like

$ tfx --project=gnome-2.22 --collection=gnome-desktop --action=get
Reading from http://svn.gnome.org/svn/damned-lies/trunk/releases.xml.in... done.
Getting alacarte... done.
Getting bug-buddy... done.
...
Completed in 4:11s.
$ _

Now we translate any of the files we downloaded, and we push back upstream (of course, only those files that were changed).

$ tfx --project=gnome-2.22 --collection=gnome-desktop --user=simos --action=send
 Reading local files...
Found 6 changed files.
Uploading alacarte... done.
...
Completed uploading translation files to gnome-2.22.
$ _

Berend Cornelius talked about creating OpenOffice.org Wizards. You get such wizards when you click on File/Wizards…, and you can use them to fill in entries in a template document (such as your name, address, etc in a letter), or use to install the spellchecker files. Actually, one of the most common uses is to get those spellchecker files installed.

A wizard is actually an OpenOffice.org extension; once you write it and install it (Tools/Extensions…), you can have it appear as a button on a toolbar or a menu item among other menus.

You write wizards in C++, and one would normally work on an existing wizard as base for new ones.

When people type in a word-processor, they typically abuse it (that’s my statement, not Berend’s) by omitting the use of styles and formatting. This makes documents difficult to maintain. Having a wizard teach a new user how to write a structured document would be a good idea.

Perry Ismangil talked about pjsip, the portable open-source SIP and media stack. This means that you can have Internet telephony on different devices. Considering that Internet Telephony is a commodity, this is very cool. He demonstrated pjsip running two small devices, a Nintendo DS and an iPhone. Apparently pjsip can go on your OpenWRT router as well, giving you many more exciting opportunities.

Clutter is a library to create fast animations and other effects on the GNOME desktop. It uses hardware acceleration to make up for the speed. You don’t need to learn OpenGL stuff; Clutter is there to provide the glue.

Gutsy has Clutter 0.4.0 in the repositories and the latest version is 0.6.0. To try out, you need at least the clutter tarball from the Clutter website. To start programming for your desktop, you need to try some of the bindings packages.

I had the chance to spend time with the DejaVu guys (Hi Denis, Ben!). Also met up with Alexios, Dimitris x2, Serafeim, Markos and others from the Greek mission.

Overall, FOSDEM is a cool event. In two days there is so much material and interesting talks. It’s a recommended technical event.

OpenOffice Writer training notes (request: make training video plz!)

OpenOffice.org is one of the most important layers of the open-source stack. Although it does a superb job, we really need to make effort to get more users working on it.

Here we present training notes for the use of Writer, the word processor component of OpenOffice.org. We aim to make the best use of styles by creating well-structured documents. What we show here is built on work of others, including the OpenOffice Linux.com articles by Bruce Byfield, the amazing OpenOffice.org documentation and the spot-on article of Christian Paratschek at osnews.com. Actually, the following follow more or less Christian’s article.

When training in OpenOffice.org, it is important to create a fluid workflow that starts from the basics and increases gradually in complexity. It would be great if someone could turn the notes in a training video.

  1. We start of with running OpenOffice.org Writer. The default windows appears. Compared with other word processors, in OOo we see this text boundary in the document (the dim rectangle that shows the area we can write in). We mention we can show/hide it with View/Text boundaries.
  2. When creating a document, it is good to set the properties such as Title and Subject. We do that from File/Properties/Description. It may look too much effort now, but it will help us later wherever we want to write the document title or subject. Use Using OpenOffice.org Writer for title and How to write nice document in OpenOffice.org Writer for subject.
  3. Writer supports styles which makes life much easier. You probably have used styles before; using Heading 1, Heading 2 for headings so that you can create easily the Table of Contents. Writer has a Styles and Formatting window that is accessible from the icon/button near the File menu. The icon looks like a hand clicking on a 3×3 grid. You can also get the windows from Format/Styles and Formatting, or by simply pressing F11. Once you do that, you get a floating window. You can dock it by dragging it to the right edge of the Writer window. If you are into 3D desktop, it may not be easy to dock (it automatically switches to another side of the desktop cube). In this case, use the key combination Ctrl-Shift-F10 to dock the Styles and Formatting window. It is good here to resize the document (that is, change the magnification) so that it appears centered with little empty space around.
  4. Writer supports styles, not only for Paragraphs (like Heading 1) but also for Pages. See the status bar at the bottom of the Writer window; it mentions Default which is the default page style. When we write a document, the first page is good to have a distinct style that is appropriate to the properties of a first page. This includes, making sure the second page appears empty, the page gets no page numbering and so on. On the Styles and Formating dock we select the Page styles tab and we double-click on the First Page style. This will set the current page to the First Page style, and we can verify visually by looking at the status bar (Now First Page instead of the old Default).
  5. We are not writing yet; lets create the subsequent pages first. To do so, we insert manual breaks in our document. Click on Insent/Manual Break…/ and select to insert a Page Break. As style for the page after the break choose the Index page style, tick on Change page number, and make sure the numbering starts from 1. Click OK. Proper documents start numbering from the Index page. The Index page is the page we put the Table of Contents, Table of Figures and so on.
  6. Make sure the cursor is on the new page with the Index style. We need to create a new page break, so that we can get writing the actual document. Click on Insert/Manual Break…/ and select a Page Break. As style for the page after the break you can choose Default. Leave any page numbering settings as is because it inherits from before. Click OK.
  7. Now, to view what we have achieved, let’s go to Print Preview, and choose to see four pages at a time. We can see the first page, another page which is intentionally left blank, the Index page and the Default page. Close Print preview and return to the document.
  8. Now let’s go back to the first page. We want to put the title on the first page. Nothing extravagant, at least yet. What we do is we visit the Paragraph styles and find the Title style. While the cursor is on the first page at the start, we double-click on the Title style. The cursor moves the the center of the document and we can verify that the Title paragraph style has been applied; see on the right of the Styles and Formating icon on the top-left of the Writer window. Shall we write the title of the document now? Not so fast. We can insert the title as a field, because we already wrote it in the properties at the beginning in Step 2. Click Insert/Fields/Title.
  9. Now press Enter; the cursor moves down and it somehow automatically changes to the Subtitle style. Styles in OpenOffice allow you to choose a Next style (a followup style) and in this case, when someone presses Enter on the Title style, they get a new paragraph in the Subtitle style. While in the line/paragraph with Subtitle style, click on Insert/Field…/Subject. Fields in OpenOffice.org appear with a dark gray background; this does not appear in printing, it is just there to help you identify where the fields are.
  10. Now lets move to the last page, the page with Default style and write something. Select the Heading 1 paragraph style and type Introduction. Press enter and you notice that the next style is Text body. Text body is the natural paragraph style for text in Writer (most documents have the default Default paragraph style which is wrong). Now write something in Text Body such as I love writing documents in OpenOffice.org Writer. Copy the line and paste several times so that we get a nice paragraph of at least five lines. Make sure when pasting that after a full stop there should be a single space, then the new sentence starts.
  11. Press Enter and now we are ready to add a new heading. Type Writing documents and set the Heading 1 paragraph style. Press Enter and fill up a paragraph with more of I love writing documents in OpenOffice.org Writer.
  12. Press Enter and create a new section (add a Heading 2, name it Writing documents in style and fill up a corresponding paragraph).
  13. Press Enter and create a last section (add a Heading 1, name it Conclusion, and fill up a corresponding paragraph style).
  14. Now we are ready to place the cursor at the Index page we created before, and go for the Table of Contents. Click on Insert/Indexes and Tables/Indexes and Tables. The default index type is Table of Contents. We keep the default settings and click OK. We get a nice looking table of contents.
  15. At this stage we have a complete basic document, with first page, index page and default page.

The next set of steps include more polishing and adding extra elements to our document.

  1. The text body style is configured to have the left alignment by default. Normally, one would select paragraphs and click on a paragraph alignment button on the toolbar to change the alignment. Because we are using styles, we can modify the Text Body style to have another alignment, and presto the whole document with text in the same style follow suit. In the Styles and Formating dock, at the paragraph styles tab, select the Text Body style. Right-click on the Text Body style and choose to Modify style. Find the Alignment tab and choose Justified as the new alignment for Text Body paragraphs. Click Ok and observe the document changing to the new configuration.
  2. It is nice to the section numbers on the headings, such as 2.1 Writing documents in style. To do this, we need to change the default outline numbering. Click on Tools/Outline numbering… and select to modify the numbering for all levels (under Level, click 1-10). Then, under the Numbering group, change the Number option from the default None to 1, 2, 3, …. Click OK and the number is changed in the document.
  3. Go back to the Table of Contents. You notice that the numbering format does not look nice; some section numbers are too close to the section names. To fix, right click on the gray area of the table of contents and select Edit Index/Table. In the new dialog box, select the Entries tab. Under Structure and Formatting you can see the structure of each line of line in the table of contents table. The button labeled E# is the placeholder for the chapter number. After that there is a placeholder that you can actually type text. In our case we simply click and press the space bar to add another space. We then click the All button and finally click OK. Now, all entries in the Table of contents will have a space between the chapter number and chapter title.
  4. In order to add a footer with the current page number, click on Insert/Footer and pick Index, then Default. Both the Index and the Default style of pages get to show page numbers. Then, place the cursor in the footer area and Insert/Field/Page Number. You can modify the Footer paragraph style so that the text alignment is centered. You have to insert the field in both an Index page and a Default page.
  5. The page number in the Index page is commonly shown in Roman lowercase numbers. How can we fix that? We simply have to modify the Index page style accordingly; click on the Page Styles tab in Styles and Formatting, click to modify the Index page style, and at the Page tab in Layout Settings select the i, ii, iii, … format. Click OK.
  6. It would be nice to have the title on the header of each page, either Index or Default. Click on Insert/Header and add a header for Index and Default. Then, place the cursor in the header for both styles and click to add the Title field (Insert/Field/Title). Would it be nice to put a line under the header? The header text has the Header paragraph style. In the Styles and Formatting, click the Paragraph styles tab and select the Header paragraph style. Right-click and choose to Modify. In the Borders tab enable a bottom line and click OK.

OpenOffice.org Writer in Style

You can download this sample document (.odt) from the link Using OpenOffice.org Writer.

I’ll stop here for now. There are more to put such as Table of Figures, Index of Tables and Bibliography.
It would be good to leave feedback if there is interest to work on this direction.

Update 15Mar2008: This appears to be a Farsi translation/adaptation of the article.

The Google Highly Open Participation Contest

One more initiative by Google to reach out to the community and promote free and open-source software is the The Google Highly Open Participation Contest 2007/2008.

The purpose of the competition is to enable young students older than 13 years old but have not entered yet the tertiary education, to participate in open-source development.

To get started, read the Official Contest Rules and the Frequently Asked Questions (FAQ) pages.

There are ten projects to choose to work from, one of which is GNOME, the desktop environment found in Linux distributions such as Ubuntu Linux and Fedora.

The current list of items to work on for GNOME include several documentation and translation tasks. If there is interest to work on the Greek localisation, leave a comment at this post. The direction I propose is to help with translating the documentation of GNOME applications.

Open-source software progresses by having more people contributing. This effort by Google and also previous efforts (Google Summer of Code) help tremendously towards the wider participation.

Greek OLPC localisation status

The Greek OLPC localisation effort is ongoing and here is a report of the current status.

For discussions, reading discussion archives and commenting, please see the Greek OLPC Discussion Group.

We are localising two components, the UI (User Interface) and applications of the OLPC, and the main website at http://www.laptop.org/

The UI is currently being translated at the OLPC Wiki, at OLPC_Greece/Translation. At this page you can see the currently available packages, what is pending and which is the page that you also can help translate.

At this stage we need people with skills in music terminology to help out with the localisation of TamTam. In addition, there are more translations that need review and comments before they are sent upstream.

Moreover, if you find a typo and a better suggestion for a term in the submitted translations, feel free to tell us at the Greek OLPC Discussion Group.

The other project we are working on is the localisation of the Greek version of www.laptop.org. The pages are not 100% translated yet, so if you want to finish the difficult parts, see the Web translation page of laptop.org.

The translators that helped up to now have done an amazing job.

Important MO file optimisation for en_* locales, and partly others

During GUADEC, Tomas Frydrych gave a talk on exmap-console, a cut-down version of exmap that can work well on mobile devices.

During the presentation, Tomas showed how to use the tool to find the culprits in memory (ab)use on the GNOME desktop. One issue that came up was that the MO files taking up space though the desktop showed English. Why would the MO translation files loaded in memory be so big in size?

gtk20.mo                             : VM   61440  B, M   61440  B, S   61440  B

atk10.mo                      	     : VM    8192  B, M    8192  B, S    8192  B

libgnome-2.0.mo			: VM   28672  B, M   24576  B, S   24576  B

glib20.mo			     : VM   20480  B, M   16384  B, S   16384  B

gtk20-properties.mo           : VM     128 KB, M     116 KB, S     116 KB

launchpad-integration.mo  : VM    4096  B, M    4096  B, S    4096  B

A translation file looks like

msgid “File”

msgstr “”

When translated to Greek it is

msgid “File”

msgstr “Αρχείο”

In the English UK translation it would be

msgid “File”

msgstr “File”

This actually is not necessary because if you leave those messags untranslated, the system will use the original messages that are embedded in the executable file.

However, for the purposes of the English UK, English Canadian, etc teams, it makes sense to copy the same messages in the translated field because it would be an indication that the message was examined by the translation. Any new messages would appear as untranslated and the same process would continue.

Now, the problem is that the gettext tools are not smart enough when they compile such translation files; they replicate without need those messages occupying space in the generated MO file.

Apart from the English variants, this issue is also present in other languages when the message looks like

msgid “GConf”

msgstr “GConf”

Here, it does not make much sense to translate the message in the locale language. However, the generated MO file contains now more than 10 bytes (5+5) , plus some space for the index.

Therefore, what’s the solution for this issue?

One solution is to add to msgattrib the option to preprocess a PO file and remove those unneeded copies. Here is a patch,

— src.ORIGINAL/msgattrib.c 2007-07-18 17:17:08.000000000 +0100
+++ src/msgattrib.c 2007-07-23 01:20:35.000000000 +0100
@@ -61,7 +61,8 @@
REMOVE_FUZZY = 1 << 2,
REMOVE_NONFUZZY = 1 << 3,
REMOVE_OBSOLETE = 1 << 4,
– REMOVE_NONOBSOLETE = 1 << 5
+ REMOVE_NONOBSOLETE = 1 << 5,
+ REMOVE_COPIED = 1 << 6
};
static int to_remove;

@@ -90,6 +91,7 @@
{ “help”, no_argument, NULL, ‘h’ },
{ “ignore-file”, required_argument, NULL, CHAR_MAX + 15 },
{ “indent”, no_argument, NULL, ‘i’ },
+ { “no-copied”, no_argument, NULL, CHAR_MAX + 19 },
{ “no-escape”, no_argument, NULL, ‘e’ },
{ “no-fuzzy”, no_argument, NULL, CHAR_MAX + 3 },
{ “no-location”, no_argument, &line_comment, 0 },
@@ -314,6 +316,10 @@
to_change |= REMOVE_PREV;
break;

+ case CHAR_MAX + 19: /* –no-copied */
+ to_remove |= REMOVE_COPIED;
+ break;
+
default:
usage (EXIT_FAILURE);
/* NOTREACHED */
@@ -436,6 +442,8 @@
–no-obsolete remove obsolete #~ messages\n”));
printf (_(“\
–only-obsolete keep obsolete #~ messages\n”));
+ printf (_(“\
+ –no-copied remove copied messages\n”));
printf (“\n”);
printf (_(“\
Attribute manipulation:\n”));
@@ -536,6 +544,21 @@
: to_remove & REMOVE_NONOBSOLETE))
return false;

+ if (to_remove & REMOVE_COPIED)
+ {
+ if (!strcmp(mp->msgid, mp->msgstr) && strlen(mp->msgstr)+1 >= mp->msgstr_len)
+ {
+ return false;
+ }
+ else if ( strlen(mp->msgstr)+1 < mp->msgstr_len )
+ {
+ if ( !strcmp(mp->msgstr + strlen(mp->msgstr)+1, mp->msgid_plural) )
+ {
+ return false;
+ }
+ }
+ }
+
return true;
}
However, if we only change msgattrib, we would need to adapt the build system for all packages.

Apparently, it would make sense to change the default behaviour of msgfmt, the program that compiles PO files into MO files.

An e-mail was sent to the email address for the development team of gettext regarding the issue. The development team does not appear to have a Bugzilla to record these issues. If you know of an alternative contact point, please notify me.

Update #1 (23Jul07): As an indication of the file size savings, the en_GB locale on Ubuntu in the installation CD occupies about 424KB where in practice it should have been 48KB.

A full installation of Ubuntu with some basic KDE packages (only for the basic libraries, i.e. KBabel – (ls k* | wc -l = 499)) occupies about 26MB of space just for the translation files. When optimising in the MO files, the translation files occupy only 7MB. This is quite important because when someone installs for example the en_CA locale, all en_?? locales are added.

The reason why the reduction is more has to do with the message types that KDE uses. For example,

msgid “”
“_: Unknown State\n”
“Unknown”
msgstr “Unknown”

I cannot see a portable way to code the gettext-tools so that they understand that the above message can be easily omitted. For the above reduction to 7MB, KDE applications (k*) occupy 3.6MB. The non-KDE applications include GNOME, XFCE and GNU traditional tools. The biggest culprits in KDE are kstars (386KB) and kgeography (345KB).

Update #2 (23Jul07): (Thanks Deniz for the comment below on gweather!) The po-locations translations (gnome-applets/gweather) of all languages are combined together to generate a big XML file that can be found at usr/share/gnome-applets/gweather/Locations.xml (~15MB).

This file is not kept in memory while the gweather applet is running.
However, the file is parsed when the user opens the properties dialog to change the location.
I would say that the main problem here is the file size (15.8MB) that can be easily reduced when stripping copied messages. This file is included in any Linux distribution, whatever the locale.

The po-locations directory currently occupies 107MB and when copied messages are eliminated it occupies 78MB (a difference of 30MB). The generated XML file is in any case smaller (15.8MB without optimisation) because it does not include repeatedly the msgid lines for each language.

I regenerated the Locations.xml file with the optimised PO files and the resulting file is 7.6MB. This is a good reduction in file space and also in packaging size.

Update #3 (25Jul07): Posted a patch for gettext-tools/msgattrib.c. Sent an e-mail to the kde-i18n-doc mailing list and got good response and a valid argument for the proposed changes. Specifically, there is a case when one gives custom values to the LANGUAGE variable. This happens when someone uses the LANGUAGE variable with a value such as “es:fr” which means show me messages in Spanish and if something is untranslated show me in French. If a message has msgid==msgstr for Spanish but not for French, then it would show in French if we go along with the proposed optimisation.

GUADEC Day #2

(see http://www.guadec.org/schedule/warmup)

At the first presentation, Quim Gil talked about GNOME marketing, what have been done, what is the goal of marketing. He showed a focused mind on important marketing tasks; it is easy to get carried away and not be effective, a mistake that happens in several projects.

The next session was by Tomas Frydrych (Open Hand – I have their sticker on my laptop!) on memory use in GNOME applications. Many people complain that XYZ is bloated. However, this does not convey what exactly happens; pretty useless. In addition, the common tools that show memory use do not show the proper picture because of the memory management techniques. That is, due to shared libraries, the total memory occupied by an application appears very big. A tool examined is exmap. This tool uses a kernel module that shows memory use of applications by reading in /proc. It takes a snapshot of memory use; it’s not real-time info. It comes with a GTK+ front-end (gexmap) that requires a big screen (oops, PDAs). However, it is not suitable for internet tablets and other low-spec devices. Therefore, they came up with exmap-console which addresses the shortcommings. It has a console interface based on the readline library.

Here are the rest of my notes. Hope they make sense to you.

. exmap –interactive
. ?: help
. Head: quite useful (dynamic allocation)
. Mapped:
. Sole use: memory that app is using on its own (rss?)
. “sort vm”
. “print” or “p”
. “add nautilus”
. “clear”
. “detail file” (what executables/libs loaded and how much consume)
. “detail none”

Sole use
. valgrind, to analyse Sole Use memory?
. “detail ????”

Lots of small libraries: overhead

Looking ahead
. Pagemap: by Matt Macall
. http://projects.o-hand.com/exmap-console/

Python
. Sole use: ~18MB ;-(

Tomas was apparently running Ubuntu with the English UK locale. The English UK translation team is doing an amazing job at the translation stats. Actually, most messages are copied, however with a script one can pick up words such as organization and change to organisation. The problem here is that, for example, the GAIM mo file is 215KB (?), however for the British English translation the actual changes should be less than 2-3KB. Messages that are missing from a translation mean that the original US English messages will be used. I’ll have to find how to use msgfilter to make messages untranslated if msgid == msgstr. Where is Danilo?

After lunch time (did not go for lunch), I went to the Accerciser session. Pretty cool tool, something I have been look for. Accerciser uses the accessibility framework of GNOME in order to inspect the windows of running applications and see into the properties. A good use is to identify if elements such as text boxes come with description labels; they are important to be there for accessibility purposes (screen reader), as a person that depends on software to read (text to speech) the contents of windows.

The next session was GNOME accessibility for blind people. Jan Buchal gave an excellent presentation.

My notes,

. is from Chech republic, is blind himself. has been using computers for 20+ years

. from user perspective
. users, regular and irregular 😉
. software
. firefox 3.0beta – ok for accessibility other versions no
. gaim messenger ok
. openoffice.org ok but did not try
. orca screenreader ^^^ works ok.
. generally ready for prime time
. ubuntu guy for accessibility was there
. made joke about not having/needing display slides ;-]
. synthesizer: festival, espeak, etc – can choose
. availability of voices
. javascript: not good for accessibility
. links/w3m: just fine!
. firefox3 makes accessibility now possible.
. web designer education, things like title=””, alt=”” for images.
. OOo, not installed but should work, ooo-gnome
. “braillcom” company name
. “speech dispatcher”
. logical events
. have short sound event instead of “button”, “input form”
. another special sound for emacs prompt, etc.
. uses emacs
. have all events spoken, such as application crashing.
. problems of accessibility
. not money main factor, but still exists.
. standard developers do not use accessibility functions
. “accessor” talk, can help
. small developer group on accessiblity, may not cooperate well
. non-regular users (such as blind musician)
. musicians
. project “singing computer”
. gtk, did not have good infrastructure
. used lilypond (music typesetter, good but not simple to use)
. singing mode in festival
. use emacs with special mode to write music scores (?)
. write music score and have the computer sing it (this is not “caruso”)
. gnome interface for lilypond would be interesting
. chemistry for blind
. gtk+
. considering it
. must also work, unfortunately, on windows
. gtk+ for windows, not so good for accessibility
. conclusion: free accessibility
. need users so that applications can be improved
. have festival synthesizer, not perfect but usable
. many languages, hindi, finnish, afrikaans
. endinburgh project, to reimplement festival better
. proprietary software is a disadvantage
. q: how do you learn to use new software?
. a: has been a computer user for 20+ years, is not good candidate to say
. a: if you are dedicated, you can bypass hardles, old lady emacs/festival/lilypond
. brrlcom, not for end-users(?)
. developer problem?
. generally there is lack of documentation; easy to teach what a developer needs to know
. so that the application is accessible
. HIG Human Interface Guidelines, accessible to the developers
. “speakup” project
. Willy, from Sun microsystems, working on accessibility for +20 years, Lead of Orca.
. developers: feel accessibility is a hindrance to development
. in practice the gap is not huge
. get tools (glade) and gtk+ to come with accessibility on by default
. accessibility
. is not only for people with disabilities
. can do amazing things like 3d interfaces something

These summaries are an important example of the rule that during presentation, participants tend to remember only about 8% of the material. In some examples, even less is being recollected.

Using SVN for GNOME Translators

Update 3rd June 2009: This is a very old post when GNOME was using SVN for the VCS (now we use git). My blog theme does not show the year, so I am writing this in case you are confused by the post.

Now GNOME uses SVN to manage the development of the software.
To use SVN, the basic relevant commands are described at Getting the most out of Subversion in GNOME.

If you are a translator, the work is further simplified. You would normally new SVN to get a copy of the source code of a package so that you can extract the translation messages of the UI or the documentation. In addition, in some cases you can provide localised images and screenshots.

First of all, if you do not have an account on SVN yet, you need to connect using Anonymous access. You still have all access, however if you want to upload any translations would need to give them to someone else who has such an SVN account.

Furthermore, the source code of a package is often branched during a GNOME release so that when there is ongoing development, the released version of the package is not affected. Branches usually have a name similar to gnome-2-18. The not-branched branch is called trunk (or HEAD, in CVS lingo), where all cutting-edge development usually happens.

To checkout (here checkout means to obtain a copy) the source code of a package.

Checkout trunk as anonymous

svn checkout http://svn.gnome.org/svn/gnome-utils/trunk my-trunk-gnome-utils

Checkout trunk as simos

svn checkout svn+ssh://simos@svn.gnome.org/svn/gnome-utils/trunk my-trunk-gnome-utils

Checkout branch called “gnome-2-18” as anonymous

svn checkout http://svn.gnome.org/svn/gnome-utils/branches/gnome-2-18/ gnome-utils-stable

Checkout branch called “gnome-2-18” as simos

svn checkout svn+ssh://simos@svn.gnome.org/svn/gnome-utils/branches/gnome-2-18 gnome-utils-stable

To commit you changes means that you send your changes upstream to the project.
In order to commit, you enter the directory you checked out and you run

svn commit -m “Updated Greek translation”

The changes you make typically include updated your language’s LL.po file, and also updating the ChangeLog file.

You cannot commit in a anonymous checkout. The system knows that it’s you when you are commiting because the checkout command saved the username you used earlier.

In the SVN commands, you can abbreviate checkout with co, and commit with ci. Sometimes this leads to the most common newbie error; you tend to think that co is for commit. In practice you cannot make a mess though, as the command line parameters between the two actions are very different, and the command will fail.

Νέες προθεσμίες για τη μετάφραση του GNOME 2.18(.1)

Κατά το http://live.gnome.org/TwoPointSeventeen η προθεσμία για την υποβολή μεταφράσεων για το GNOME 2.18 έχει περάσει.
Τώρα επικεντρώνουμε την προσοχή μας στο GNOME 2.18.1 που θα γίνει διαθέσιμο στις αρχές Απριλίου. Θα προσπαθήσουμε να τελειώσουμε τη δουλειά νωρίτερα από τις αρχές Απριλίου.

Έχω την εντύπωση ότι οι διανομές Fedora και Ubuntu θα πάρουν από το 2.18.1.

Έτσι, για να έχουμε ολοκληρωμένο τον εξελληνισμό των διανομών Fedora και Ubuntu (+ΟpenSuse;, κτλ), βοηθούμε στη μετάφραση του GNOME. Είναι δε σημαντικό να μεταφραστεί και να διορθωθεί το GNOME διότι αυτό είναι που βλέπει ο τελικός χρήστης στην καθημερινή χρήση.

Από το http://l10n.gnome.org/languages/el/gnome-2-18 μεταφράζουμε από τη δεξιά στήλη.
Για να είμαστε σίγουροι ότι κάποιος άλλος δεν μεταφράζει το ίδιο
πρόγραμμα, το καταγράφουμε στη σελίδα
http://www.ubuntu-gr.org/Wiki/Community/Translation/Upstream/POReservation

Για τη μετάφραση μπορούμε να χρησιμοποιήσουμε KBabel (sudo apt-get
install kbabel). Για την ενεργοποίηση της ελληνικής ορθογραφίας δείτε
πρόσφατο γράμμα στη λίστα ubuntu-gr.
Το τελικό αρχείο μπορείτε να το στείλετε στο team ατ gnome τελεία gr.

Για περισσότερες τεχνικές πληροφορίες στη μετάφραση, δείτε και το γράμμα του Αθανάσιου Λευτέρη.

Αν χρησιμοποιείτε τη διανομή Fedora και ελληνικά, τότε είναι καλό να γραφτείτε στη λίστα συνδρομητών
https://www.redhat.com/mailman/listinfo/fedora-trans-el

Αν χρησιμοποιείτε τη διανομή Ubuntu και ελληνικά, τότε είναι καλό να γραφτείτε στη λίστα συνδρομητών
https://lists.ubuntu.com/mailman/listinfo/ubuntu-gr

Translating OLPC software

The core OLPC software is developed at http://dev.laptop.org/ using the GIT source code management system.
For the tasks of the translator, one needs to look into the different projects and locate any po/ subdirectory. The existence of this subdirectory show that the piece of software is internationalised (=can be translated).

For example, the core component sugar can be translated. In the main sugar page, and locate the po/ subdirectory that shows up. Click on it and you get the sugar po/ subdirectory with a few translations. Specifically, Yoruba, Hausa, Igbo and Italian. The italian translation is sadly useless. The translator made a mistake; he saw

msgid “Hello”

msgstr “”

and changed to (WRONG)

msgid “Ciao”

msgstr “”

instead of (CORRECT)

msgid “Hello”

msgstr “Ciao”

Normally, one would need to regenerate the Template PO (POT) file before translating, instead of working on one of the existing translated files. To do so, one needs to download the source code of sugar using the git tool and then use intltool-update -P to create the fresh sugar.pot file.

Translating the OLPC

In a previous post, we covered how to install fonts and enabling writing support on the OLPC. The OLPC contains a limited number of applications that are available to be translated. These applications include

  1. NetworkManager, part of the GNOME project (HEAD, extras)
  2. alsa-utils, ???
  3. aspell, external
  4. atk10, part of the GNOME project (GNOME 2.18, developer)
  5. chkconfig, part of the Fedora Project
  6. diffutils, external
  7. glib20, part of the GNOME project (GNOME 2.18, developer)
  8. gst-plugins-base-0.10, external
  9. gst-plugins-good-0.10, external
  10. gstreamer-0.10, external
  11. gtk20-properties, part of the GNOME project (GNOME 2.18, developer)
  12. gtk20, part of the GNOME project (GNOME 2.18, developer)
  13. hal, external
  14. initscripts, part of the Fedora Project
  15. libc, part of the Translation Project (reduced version?)
  16. libuser, part of the Fedora Project
  17. libwnck, part of the GNOME Project (GNOME 2.18, desktop)
  18. stardict, external
  19. vte, part of the GNOME Project (GNOME 2.18, desktop)
  20. wget, part of the Translation Project

The links provided point to the latest available version. The versions that the OLPC using are not the latest with the upstream project, therefore keep in mind that the translated files may differ. It would be good to establish the exact .PO files from the OLPC project (URL to source?).