GNU Gettext - Yet Another Tutorial
Well, developing a simple C program is easy, but developing it in an internationalized way (yeah, all those l10n, m17n and i18n thingies) is not so easy unless you understand Autotools. However, understanding Autotools may take some time (at least it took me some time). In this post, I'm trying to explain how I learnt it. It may be the wrong way, but at least I will be able to recollect in the future what I did today.
Let's create a simple C project. Without any doubt, it should be called helloworld. Let's create the directory tree first.
$ mkdir -p helloworld/{src,man}
Switch to helloworld/src and create two files, helloworld.h and helloworld.c.
/* helloworld/src/helloworld.h */
#ifndef __HELLOWORLD__
#define __HELLOWORLD__
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <libintl.h>
#include <locale.h>
#define _(STRING) gettext(STRING)
#endif
We have to include libintl.h to get the bindtextdomain(3), textdomain(3) and gettext(3) functions, and locale.h to get the setlocale(3) function. Let's see why we need these functions.
setlocale()
Every glibc executable starts in the default locale, called C. We use the setlocale() function to switch to a different locale. It takes two parameters, category and locale: category indicates which part of the locale we want to change, and locale holds the new value. If locale is "", setlocale() takes the value from the corresponding environment variable (see the man page for more details).
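As a small sketch of the behaviour described above, the helpers below query the current locale and then switch to the one requested by the environment. The function names are made up for this illustration; only setlocale() itself comes from the C library.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Name of the locale currently in effect for all categories.
   A NULL locale argument only queries; it changes nothing. */
const char *current_locale_name(void)
{
    const char *name = setlocale(LC_ALL, NULL);
    assert(name != NULL);   /* a query on a valid category does not fail */
    return name;
}

/* True while the program is still in the start-up "C" locale. */
int in_default_c_locale(void)
{
    return strcmp(current_locale_name(), "C") == 0;
}

/* Switch every category to whatever the environment asks for.
   Returns 1 on success, 0 if that locale is not installed, in which
   case the previous locale stays in effect. */
int use_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}
```

Checking the return value of setlocale() is worth the extra line: if the requested locale (say ta_IN.utf8) is not installed on the machine, the call fails and the program silently keeps talking English.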
bindtextdomain()
The gettext framework works as follows:
1. Get all the English output strings from the sources and generate a .pot file
2. Translate the English strings in the .pot file into different languages and create a .po file for each language
3. Generate binary .gmo files from the .po file of each language
4. Make the executable read the corresponding translation from the .gmo file, according to the locale settings, each time it wants to print a message
To do the last step, we have to say where the .gmo files are available. For that purpose we use bindtextdomain(). It takes two arguments, domainname and dirname. domainname is the name we choose to group all our .gmo files in one place; most of the time, we use the name of our project as domainname. dirname is the common directory where the .gmo files of different projects are placed. Usually it is /usr/share/locale.
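To make the role of domainname and dirname concrete, here is a minimal sketch. It binds a domain, reads the binding back, and builds the path where the runtime looks for a catalog. The helper names are hypothetical. Note that at install time each .gmo file is copied as a .mo file under dirname/LANGUAGE/LC_MESSAGES/, and the real lookup may try several spellings of the language (for example ta_IN before ta); the sketch builds only one of them.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>
#include <string.h>

/* Bind `domain` to `dirname`, then read the binding back.
   Calling bindtextdomain() with a NULL dirname only queries. */
int bind_and_verify(const char *domain, const char *dirname)
{
    assert(domain != NULL && dirname != NULL);
    if (bindtextdomain(domain, dirname) == NULL)
        return 0;
    const char *bound = bindtextdomain(domain, NULL);
    return bound != NULL && strcmp(bound, dirname) == 0;
}

/* Build the catalog path for one language, e.g.
   /usr/share/locale/ta/LC_MESSAGES/helloworld.mo
   Returns 1 if the path fitted into the buffer. */
int catalog_path(char *buf, size_t size, const char *dirname,
                 const char *lang, const char *domain)
{
    int n = snprintf(buf, size, "%s/%s/LC_MESSAGES/%s.mo",
                     dirname, lang, domain);
    return n >= 0 && (size_t)n < size;
}
```

bindtextdomain() does not check that the directory exists; it only records the name, so a wrong dirname shows up later as untranslated output rather than as an error.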
textdomain()
We have to set the text domain so that the executable picks up the translated messages from the right .gmo files. textdomain() takes only one argument, domainname, which is the name of our project.
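A short sketch of what textdomain() does, assuming the GNU libintl behaviour where the default domain is called "messages" until the program selects another one. The wrapper names are invented for this example.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>
#include <string.h>

/* The domain that plain gettext() calls currently use.
   A NULL argument only queries. */
const char *current_domain(void)
{
    return textdomain(NULL);
}

/* Make `domain` the default for gettext().
   Returns 1 if it is now the current domain. */
int select_domain(const char *domain)
{
    assert(domain != NULL);
    const char *now = textdomain(domain);
    return now != NULL && strcmp(now, domain) == 0;
}
```

Without the textdomain() call, gettext() would look for a catalog named messages.mo instead of helloworld.mo, which is why bindtextdomain() alone is not enough.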
gettext()
Finally, we have to wrap every output string so that it passes through gettext(), which looks up the correct translated string in the .gmo files. We defined the macro _() as an alias for gettext() because we are too lazy (aren't we!?) to type gettext() every time.
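One detail worth seeing in code: when gettext() finds no translation, it simply returns the string it was given, so a program wrapped in _() keeps working even before any .po file exists. A minimal sketch (the function names are made up; the macro is the one from helloworld.h):

```c
#include <assert.h>
#include <libintl.h>
#include <string.h>

#define _(STRING) gettext(STRING)

/* The message our program prints, routed through gettext(). */
const char *greeting(void)
{
    return _("hello world\n");
}

/* True when gettext() has no translation for `msgid` and therefore
   hands back the original text. */
int is_untranslated(const char *msgid)
{
    assert(msgid != NULL);
    return strcmp(gettext(msgid), msgid) == 0;
}
```

This fallback is the reason the English string itself serves as the lookup key (the msgid) in the .po files.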
So, here is the helloworld.c
/* helloworld/src/helloworld.c */
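#include "helloworld.h"

int main(int argc, char *argv[])
{
    /* Pick up the locale from the environment (LANG, LC_*). */
    setlocale(LC_ALL, "");
#ifdef ENABLE_NLS
    /* PACKAGE and LOCALEDIR are supplied by the build system. */
    bindtextdomain(PACKAGE, LOCALEDIR);
    textdomain(PACKAGE);
#endif
    printf(_("hello world\n"));
    return 0;
}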
Now we have to replace the PACKAGE and LOCALEDIR macros with real values. Here come the Autotools: Automake can supply the real value of PACKAGE at compile time, and it also gives us a way to define LOCALEDIR at compile time. Let's set up Autotools by creating the following files.
# helloworld/src/Makefile.am
bin_PROGRAMS = helloworld
helloworld_SOURCES = helloworld.c helloworld.h
DEFS += -DLOCALEDIR=\"$(localedir)\"
# helloworld/man/helloworld.1
helloworld :) !!! check after sometime
to see the real man page
# helloworld/man/Makefile.am
dist_man_MANS = helloworld.1
# helloworld/Makefile.am
SUBDIRS = src man
We need to run autoscan to generate a configure.scan file. Rename configure.scan to configure.ac and edit it according to the project's needs. I can't explain all the Autoconf macros within this blog post; see the end of the post for links to further reading.
$ cd helloworld
$ autoscan
Here is the customized configure.ac file:
# helloworld/configure.ac
# -*- Autoconf -*-
# Process this file with autoconf to produce a configure script.
AC_INIT([helloworld], [0.1], [mokka at comedysite dot com])
AC_CONFIG_SRCDIR([src/helloworld.c])
# Automake init
AM_INIT_AUTOMAKE([foreign -Wall])
# Checks for programs.
AC_PROG_CC
AM_PROG_CC_C_O
# Gettext init
AM_GNU_GETTEXT_VERSION([0.18])
AM_GNU_GETTEXT([external])
# Checks for libraries.
# Checks for header files.
AC_CHECK_HEADERS([libintl.h locale.h stdlib.h])
# Checks for typedefs, structures, and compiler characteristics.
# Checks for library functions.
AC_CHECK_FUNCS([setlocale])
AC_CONFIG_FILES([Makefile
man/Makefile
src/Makefile])
AC_OUTPUT
Now we have to run gettextize in the helloworld directory to put the gettext settings into configure.ac and Makefile.am.
$ cd helloworld
$ gettextize
If things go well, you will see a helloworld/po directory and modifications to configure.ac and Makefile.am. Now we can run autoreconf to finish the Autotools procedure.
$ cd helloworld
$ autoreconf --force --install --verbose
Now switch to helloworld/po and rename Makevars.template to Makevars. Inside the Makevars file you may have to fill in some variables, at least MSGID_BUGS_ADDRESS. Here is a way to add your email address to that variable:
$ cd helloworld/po
$ mv Makevars.template Makevars
$ sed -i '/^MSGID/s/$/mokka at comedytime dot com/g' Makevars
Now we need to add the source filenames to POTFILES.in. Here is a way:
$ cd helloworld/po
$ find ../src -name '*.c' -o -name '*.h' | sed 's/\.\.\///g' >> POTFILES.in
Time to compile,
$ cd helloworld
$ ./configure
$ make
You can see the PACKAGE and LOCALEDIR macro definitions on the compiler command line when make compiles helloworld.c. As a programmer, your job is almost done.
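If you want to check which values ended up in the binary, a small helper like the one below can print them. This is only a sketch: the fallback values are assumptions for compiling the file by hand without the build system, and the function names are made up.

```c
#include <assert.h>
#include <stdio.h>

/* Fallbacks for a hand-made build only; normally both macros arrive
   as -D options from the generated Makefile. */
#ifndef PACKAGE
#define PACKAGE "helloworld"                   /* assumed: name given to AC_INIT */
#endif
#ifndef LOCALEDIR
#define LOCALEDIR "/tmp/destdir/share/locale"  /* assumed: --prefix=/tmp/destdir */
#endif

/* Write the two compile-time settings to `out`; returns 1 on success. */
int print_build_settings(FILE *out)
{
    assert(out != NULL);
    return fprintf(out, "PACKAGE=%s\nLOCALEDIR=%s\n", PACKAGE, LOCALEDIR) > 0;
}

/* A relative LOCALEDIR would depend on the current working directory,
   so it is worth checking that the configured one is absolute. */
int localedir_is_absolute(void)
{
    return LOCALEDIR[0] == '/';
}
```

With the default configure settings, localedir expands to PREFIX/share/locale, so changing --prefix also changes where the program looks for its translations.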
Now switch roles and become a translator. Go to the helloworld/po directory and generate a .po file for your language using msginit. You have to provide the locale with the -l option, and you need to know the language code and country code to construct the locale string. msginit will ask for your email address to put you into the translators list. Here I'm translating into Tamil (ta_IN.utf8).
$ cd helloworld/po
$ msginit -i helloworld.pot -o ta.po -l ta_IN.utf8
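The locale string passed to -l above follows the pattern language code, underscore, country code, dot, character set. As a tiny illustration of that pattern (the helper name is hypothetical, and real locale names can carry further parts that this sketch ignores):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Compose LANGUAGE_COUNTRY.CODESET, e.g. "ta" + "IN" + "utf8"
   gives "ta_IN.utf8". Returns 1 if the result fitted into buf. */
int make_locale_name(char *buf, size_t size, const char *language,
                     const char *country, const char *codeset)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%s_%s.%s", language, country, codeset);
    return n >= 0 && (size_t)n < size;
}
```

Only the language part (ta) is used for the name of the .po file and for the LINGUAS entry later on.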
I edited the ta.po file with gedit+ibus and translated the string hello world\n to வனக்கம்\n. Now I have to add my language to the LINGUAS file. LINGUAS contains the language codes that have a corresponding translated .po file inside the helloworld/po directory.
# helloworld/po/LINGUAS
ta
Now it's time to generate the binary .gmo file. Before that, we have to re-run autoreconf to regenerate helloworld/po/Makefile.in, because we updated the LINGUAS file.
$ cd helloworld
$ make distclean
$ autoreconf --force --install --verbose
$ ./configure
$ cd po
$ make update-gmo
rm -f ta.gmo && /usr/bin/gmsgfmt -c --statistics --verbose -o ta.gmo ta.po
ta.po: 1 translated message.
$
If your translation doesn't have any errors, you will see 1 translated message. A few more steps remain to reach our goal: creating a distribution tarball and installing our program to see the result.
$ cd helloworld
$ make distclean
$ ./configure
$ make dist-bzip2
$ mkdir -p /tmp/builddir
$ mv helloworld-0.1.tar.bz2 /tmp/builddir
$ cd /tmp/builddir
$ tar xvjf helloworld-0.1.tar.bz2
$ cd helloworld-0.1
$ ./configure --prefix="/tmp/destdir"
$ make install
$ LANG="ta_IN.utf8" /tmp/destdir/bin/helloworld
வனக்கம்
$ /tmp/destdir/bin/helloworld
hello world
$
That's it. My helloworld program can say வனக்கம் now. You can also make it speak your favourite language!!
There is another beautiful tutorial for gettext available at oriya.sarovar.org.