GNU Gettext - Yet Another Tutorial

Updated: Apr 22, 2023

Developing a simple C program is easy, but developing it in an internationalized way (yeah!! all that l10n, m17n and i18n business) is not so easy unless you understand Autotools. Understanding them may take some time (at least it took me some time). In this post, I explain how I learnt it. It may not be the right way, but at least I will be able to recall later what I did today.

Let's create a simple C project. Without any doubt, it should be called helloworld. Create the directory tree first.

$ mkdir -p helloworld/{src,man}

Switch to helloworld/src and create two files, helloworld.h and helloworld.c:

/* helloworld/src/helloworld.h */
#ifndef HELLOWORLD_H
#define HELLOWORLD_H

#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <libintl.h>
#include <locale.h>

#define _(STRING) gettext(STRING)

#endif

We include libintl.h to get the bindtextdomain(3), textdomain(3) and gettext(3) functions, and locale.h to get the setlocale(3) function. Let's see why we need these functions.

setlocale()

Every glibc executable starts in the default locale, called C. We use setlocale() to switch to a different locale. It takes two parameters, category and locale: category selects which part of the locale we want to change, and locale holds the new value. If locale is “”, setlocale() takes the value from the corresponding environment variable (see the man page for more details).

bindtextdomain()

The gettext framework works as follows:

  • Extract all the English output strings from the sources and generate a .pot file

  • Translate the English strings in the .pot file into each target language, creating one .po file per language

  • Generate a binary .gmo file from the .po file of each language

  • Make the executable read the matching translation from the .gmo file, according to the locale settings, each time it wants to print a message

To do the last step, we have to specify where the .gmo files can be found. For that purpose we use bindtextdomain(), which takes two arguments, domainname and dirname. domainname is the name under which all our translations are grouped; most of the time, we use the name of our project. dirname is the common directory under which the catalogs of different projects are installed (as .mo files). Usually it is /usr/share/locale.

textdomain()

We have to set the text domain so that the executable looks up its translated messages in the right .gmo files. textdomain() takes a single argument, domainname, which is the name of our project.

gettext()

Finally, we have to wrap every output string so that it passes through gettext(), which fetches the translated string from the .gmo files. We define a macro _() as an alias for gettext() because we are too lazy (aren't we!?) to type gettext() every time.

So, here is helloworld.c:

/* helloworld/src/helloworld.c */
#include "helloworld.h"

int main(int argc, char *argv[])
{

  setlocale(LC_ALL, "");

#ifdef ENABLE_NLS
  bindtextdomain(PACKAGE, LOCALEDIR);
  textdomain(PACKAGE);
#endif

  printf(_("hello world\n"));

  return(0);
}

Now we have to replace the PACKAGE and LOCALEDIR macros with real values. This is where Autotools come in: Automake defines PACKAGE at compile time, and it also gives us a way to define LOCALEDIR at compile time. Let's set up Autotools by creating the following files:

# helloworld/src/Makefile.am
bin_PROGRAMS = helloworld
helloworld_SOURCES = helloworld.c helloworld.h
# The escaped double quotes make LOCALEDIR a C string literal.
DEFS += -DLOCALEDIR=\"$(localedir)\"

# helloworld/man/helloworld.1
helloworld :) !!! check after sometime
to see the real man page

# helloworld/man/Makefile.am
dist_man_MANS = helloworld.1

# helloworld/Makefile.am
SUBDIRS = src man

Run autoscan to generate a configure.scan file. Rename configure.scan to configure.ac and edit it to suit the project's needs. I can't explain all the Autoconf macros in this post; see the end of the post for further reading.

$ cd helloworld
$ autoscan
$ mv configure.scan configure.ac

Here is the customized configure.ac file:

# helloworld/configure.ac
#                                               -*- Autoconf -*-
# Process this file with autoconf to produce a configure script.

AC_INIT([helloworld], [0.1], [mokka at comedysite dot com])
AC_CONFIG_SRCDIR([src/helloworld.c])

# Automake init
AM_INIT_AUTOMAKE([foreign -Wall])

# Checks for programs.
AC_PROG_CC
AM_PROG_CC_C_O

# Gettext init
AM_GNU_GETTEXT_VERSION([0.18])
AM_GNU_GETTEXT([external])

# Checks for libraries.

# Checks for header files.
AC_CHECK_HEADERS([libintl.h locale.h stdlib.h])

# Checks for typedefs, structures, and compiler characteristics.

# Checks for library functions.
AC_CHECK_FUNCS([setlocale])

AC_CONFIG_FILES([Makefile
                 man/Makefile
                 src/Makefile])
AC_OUTPUT

Now we run gettextize in the helloworld directory to add the gettext settings to configure.ac and Makefile.am.

$ cd helloworld
$ gettextize

If things go well, you will see a helloworld/po directory and modifications to configure.ac and Makefile.am. Now we can run autoreconf to finish the Autotools setup.

$ cd helloworld
$ autoreconf --force --install --verbose

Now switch to helloworld/po and rename Makevars.template to Makevars. Inside Makevars you may have to fill in some variables, at least MSGID_BUGS_ADDRESS. Here is one way to add your email address to that variable:

$ cd helloworld/po
$ mv Makevars.template Makevars
$ sed -i '/^MSGID/s/$/mokka at comedytime dot com/g' Makevars

Now we need to add the source file names to POTFILES.in. Here is one way:

$ cd helloworld/po
$ find ../src -name '*.c' -o -name '*.h' | sed 's/\.\.\///g' >> POTFILES.in

Time to compile,

$ cd helloworld
$ ./configure
$ make

You can see the PACKAGE and LOCALEDIR macro definitions on the compiler command line when make compiles helloworld.c. As a programmer, your job is almost done.

Now switch to the role of translator. Go to the helloworld/po directory and generate a .po file for your language using msginit; you have to provide the locale with the -l option. You need to know the language code and country code to construct the locale string. msginit will ask for your email address to add you to the translators list. Here I'm translating into Tamil (ta_IN.utf8).

$ cd helloworld/po
$ msginit -i helloworld.pot -o ta.po -l ta_IN.utf8

I edited the ta.po file with gedit and ibus and translated the string hello world\n to வனக்கம்\n. Now I have to add my language to the LINGUAS file, which lists the language codes that have a corresponding translated .po file in the helloworld/po directory.

# helloworld/po/LINGUAS
ta

Now it's time to generate the binary .gmo file. Before that, because we updated the LINGUAS file, we have to re-run autoreconf and configure so that the makefile in helloworld/po is regenerated.

$ cd helloworld
$ make distclean
$ autoreconf --force --install --verbose
$ ./configure
$ cd po
$ make update-gmo
rm -f ta.gmo && /usr/bin/gmsgfmt -c --statistics --verbose -o ta.gmo ta.po
ta.po: 1 translated message.
$

If your translation has no errors, you will see “1 translated message”. A few more steps remain to reach our goal: creating a distribution tarball and installing our program to see the result.

$ cd helloworld
$ make dist-bzip2
$ mkdir -p /tmp/builddir
$ mv helloworld-0.1.tar.bz2 /tmp/builddir
$ cd /tmp/builddir
$ tar xvjf helloworld-0.1.tar.bz2
$ cd helloworld-0.1
$ ./configure --prefix="/tmp/destdir"
$ make install
$ LANG="ta_IN.utf8" /tmp/destdir/bin/helloworld
வனக்கம்
$ /tmp/destdir/bin/helloworld
hello world
$

That's it. My helloworld program can say வனக்கம் now. You can make it speak your favourite language too!!

References

There is another beautiful tutorial for gettext available at oriya.sarovar.org.