

13 Other Programming Languages

While the presentation of gettext focuses mostly on C and implicitly applies to C++ as well, its scope is far broader than that: many programming languages, scripting languages and other kinds of textual data, such as GUI resources or package descriptions, can make use of the gettext approach.

13.1 The Language Implementor's View

All programming and scripting languages that have the notion of strings are eligible to support gettext. Supporting gettext means the following:

  1. You should add to the language a syntax for translatable strings. In principle, a function call of gettext would do, but a shorthand syntax helps keep internationalized programs legible. For example, in C we use the syntax _("string"), in bash we use the syntax $"string", and in GNU awk we use the shorthand _"string".
  2. You should arrange that evaluation of such a translatable string at runtime calls the gettext function, or performs equivalent processing.
  3. Similarly, you should make the functions ngettext, dcgettext, dcngettext available from within the language. These functions are less often used, but are nevertheless necessary for particular purposes: ngettext for correct plural handling, and dcgettext and dcngettext for obeying other locale environment variables than LC_MESSAGES, such as LC_TIME or LC_MONETARY. For these latter functions, you need to make the LC_* constants, available in the C header <locale.h>, referenceable from within the language, usually either as enumeration values or as strings.
  4. You should allow the programmer to designate a message domain, either by making the textdomain function available from within the language, or by introducing a magic variable called TEXTDOMAIN. Similarly, you should allow the programmer to designate where to search for message catalogs, by providing access to the bindtextdomain function.
  5. You should either perform a setlocale (LC_ALL, "") call during the startup of your language runtime, or allow the programmer to do so. Remember that gettext will act as a no-op if the LC_MESSAGES and LC_CTYPE locale facets are not both set.
  6. A programmer should have a way to extract translatable strings from a program into a PO file. The GNU xgettext program is being extended to support very different programming languages. Please contact the GNU gettext maintainers to help them do this. If the string extractor is best integrated into your language's parser, GNU xgettext can function as a front end to your string extractor.
  7. The language's library should have a string formatting facility where the arguments of a format string are denoted by a positional number or a name. This is needed because for some languages and some messages with more than one substitutable argument, the translation will need to output the substituted arguments in a different order. See section 3.5 Special Comments preceding Keywords.
  8. If the language has more than one implementation, and not all of the implementations use gettext, but the programs should be portable across implementations, you should provide a no-i18n emulation that makes the other implementations accept programs written for yours, without actually translating the strings.
  9. To help the programmer in the task of marking translatable strings, which is usually performed using the Emacs PO mode, you are welcome to contact the GNU gettext maintainers, so they can add support for your language to `po-mode.el´.
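
From the programmer's side, items 1, 3 and 4 might look roughly as follows. This is only a minimal sketch using Python's standard gettext module; the domain name demo-app, the temporary catalog directory and the report function are hypothetical. Because no catalog exists in that directory, every call simply returns its msgid.

```python
import gettext
import tempfile

# Hypothetical domain name and catalog directory, chosen for illustration only.
DOMAIN = "demo-app"
LOCALEDIR = tempfile.mkdtemp()  # empty directory, so no catalog will be found

# Item 4: designate where catalogs are searched and which domain is the default.
gettext.bindtextdomain(DOMAIN, LOCALEDIR)
gettext.textdomain(DOMAIN)

# Item 1: a shorthand for marking translatable strings.
_ = gettext.gettext


def report(count):
    # Item 3: ngettext selects the singular or plural form for `count`.
    template = gettext.ngettext("%d file removed", "%d files removed", count)
    return template % count


greeting = _("Hello, world!")
```

With a real catalog installed under the bound directory, the same calls would return the translated strings instead.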

On the implementation side, three approaches are possible, with different effects on portability and copyright:

13.2 The Programmer's View

For the programmer, the general procedure is the same as for the C language. The Emacs PO mode supports other languages, and the GNU xgettext string extractor recognizes other languages based on the file extension or a command-line option. In some languages, setlocale is not needed because it is already performed by the underlying language runtime.

13.3 The Translator's View

The translator works exactly as in the C language case. The only difference is that when translating format strings, she has to be aware of the language's particular syntax for positional arguments in format strings.
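
To illustrate why the argument syntax matters, here is a small sketch in Python, whose format strings can refer to arguments by name. The render helper, the sample values and the "translation" (kept in English for readability) are all made up for this example; the point is only that the translated string may use the arguments in a different order than the original.

```python
def render(template, values):
    """Fill a format string whose arguments are referenced by name."""
    return template % values


values = {"user": "alice", "count": 3}

# Original message, as the programmer wrote it.
msgid = "%(user)s deleted %(count)d files"

# Hypothetical translation that needs the arguments in the opposite order.
msgstr = "%(count)d files were deleted by %(user)s"

original = render(msgid, values)
reordered = render(msgstr, values)
```

Both strings are filled from the same set of values, so the program code does not need to change when a translation reorders the arguments.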

13.3.1 C Format Strings

C format strings are described in POSIX (IEEE P1003.1 2001), section XSH 3 fprintf(), http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html. See also the fprintf(3) manual page, http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php, http://informatik.fh-wuerzburg.de/student/i510/man/printf.html.

13.3.2 Python Format Strings

Python format strings are described in Python Library reference / 2. Built-in Types, Exceptions and Functions / 2.2. Built-in Types / 2.2.6. Sequence Types / 2.2.6.2. String Formatting Operations. http://www.python.org/doc/2.2.1/lib/typesseq-strings.html.

13.3.3 Lisp Format Strings

Lisp format strings are described in the Common Lisp HyperSpec, chapter 22.3 Formatted Output, http://www.lisp.org/HyperSpec/Body/sec_22-3.html.

13.3.4 Emacs Lisp Format Strings

Emacs Lisp format strings are documented in the Emacs Lisp reference, section Formatting Strings, http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75. Note that as of version 21, XEmacs supports numbered argument specifications in format strings while FSF Emacs doesn't.

13.3.5 librep Format Strings

librep format strings are documented in the librep manual, section Formatted Output, http://librep.sourceforge.net/librep-manual.html#Formatted%20Output, http://www.gwinnup.org/research/docs/librep.html#SEC122.

13.3.6 Smalltalk Format Strings

Smalltalk format strings are described in the GNU Smalltalk documentation, class CharArray, methods `bindWith:´ and `bindWithArguments:´. http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238. In summary, a directive starts with `%´ and is followed by `%´ or a nonzero digit (`1´ to `9´).

13.3.7 Java Format Strings

Java format strings are described in the JDK documentation for class java.text.MessageFormat, http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html. See also the ICU documentation http://oss.software.ibm.com/icu/apiref/classMessageFormat.html.

13.3.8 awk Format Strings

awk format strings are described in the gawk documentation, section Printf, http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf.

13.3.9 Object Pascal Format Strings

Where is this documented?

13.3.10 YCP Format Strings

YCP sformat strings are described in the libycp documentation file:/usr/share/doc/packages/libycp/YCP-builtins.html. In summary, a directive starts with `%´ and is followed by `%´ or a nonzero digit (`1´ to `9´).

13.3.11 Tcl Format Strings

Tcl format strings are described in the `format.n´ manual page, http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm.

13.4 The Maintainer's View

For the maintainer, the general procedure differs from the C language case in two ways.

13.5 Individual Programming Languages

13.5.1 C, C++, Objective C

RPMs
gcc, gpp, gobjc, glibc, gettext
File extension
For C: c, h.
For C++: C, c++, cc, cxx, cpp, hpp.
For Objective C: m.
String syntax
"abc"
gettext shorthand
_("abc")
gettext/ngettext functions
gettext, dgettext, dcgettext, ngettext, dngettext, dcngettext
textdomain
textdomain function
bindtextdomain
bindtextdomain function
setlocale
Programmer must call setlocale (LC_ALL, "")
Prerequisite
#include <libintl.h>
#include <locale.h>
#define _(string) gettext (string)
Use or emulate GNU gettext
Use
Extractor
xgettext -k_
Formatting with positions
fprintf "%2$d %1$d" (POSIX but not C 99)
Portability
autoconf (gettext.m4) and #if ENABLE_NLS
po-mode marking
yes

13.5.2 sh - Shell Script

RPMs
bash, gettext
File extension
sh
String syntax
"abc", 'abc', abc
gettext shorthand
"`gettext "abc"`"
gettext/ngettext functions
gettext, ngettext programs
textdomain
environment variable TEXTDOMAIN
bindtextdomain
environment variable TEXTDOMAINDIR
setlocale
automatic
Prerequisite
---
Use or emulate GNU gettext
use
Extractor
---
Formatting with positions
---
Portability
---
po-mode marking
---

13.5.3 bash - Bourne-Again Shell Script

RPMs
bash 2.0 or newer, gettext
File extension
sh
String syntax
"abc", 'abc', abc
gettext shorthand
$"abc"
gettext/ngettext functions
gettext, ngettext programs
textdomain
environment variable TEXTDOMAIN
bindtextdomain
environment variable TEXTDOMAINDIR
setlocale
automatic
Prerequisite
---
Use or emulate GNU gettext
use
Extractor
bash --dump-po-strings
Formatting with positions
---
Portability
---
po-mode marking
---

13.5.4 Python

RPMs
python
File extension
py
String syntax
'abc', u'abc', r'abc', ur'abc',
"abc", u"abc", r"abc", ur"abc",
'''abc''', u'''abc''', r'''abc''', ur'''abc''',
"""abc""", u"""abc""", r"""abc""", ur"""abc"""
gettext shorthand
_('abc') etc.
gettext/ngettext functions
gettext.gettext, gettext.dgettext, also ugettext
textdomain
gettext.textdomain function, or gettext.install(domain) function
bindtextdomain
gettext.bindtextdomain function, or gettext.install(domain,localedir) function
setlocale
not used by the gettext emulation
Prerequisite
import gettext
Use or emulate GNU gettext
emulate. Bug: uses only the first found .mo file, not all of them
Extractor
xgettext
Formatting with positions
'...%(ident)d...' % { 'ident': value }
Portability
fully portable
po-mode marking
---
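
As an illustration of the entries above, the following sketch combines gettext.install with the named-argument formatting style. The domain name demo-app and the empty temporary catalog directory are hypothetical; since no .mo file is found there, the msgid is returned untranslated.

```python
import gettext
import tempfile

# Hypothetical domain and (empty) catalog directory, for illustration only.
gettext.install("demo-app", tempfile.mkdtemp())

# gettext.install() has made _() available in the builtins namespace.
message = _("%(count)d messages in %(folder)s") % {"count": 2, "folder": "inbox"}
```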

13.5.5 GNU clisp - Common Lisp

RPMs
clisp 2.28 or newer
File extension
lisp
String syntax
"abc"
gettext shorthand
(_ "abc"), (ENGLISH "abc")
gettext/ngettext functions
i18n:gettext, i18n:ngettext
textdomain
i18n:textdomain
bindtextdomain
i18n:textdomaindir
setlocale
automatic
Prerequisite
---
Use or emulate GNU gettext
use
Extractor
xgettext -k_ -kENGLISH
Formatting with positions
format "~1@*~D ~0@*~D"
Portability
On platforms without gettext, no translation.
po-mode marking
---

13.5.6 GNU clisp C sources

RPMs
clisp
File extension
d
String syntax
"abc"
gettext shorthand
ENGLISH ? "abc" : ""
GETTEXT("abc")
GETTEXTL("abc")
gettext/ngettext functions
clgettext, clgettextl
textdomain
---
bindtextdomain
---
setlocale
automatic
Prerequisite
#include "lispbibl.c"
Use or emulate GNU gettext
use
Extractor
clisp-xgettext
Formatting with positions
fprintf "%2$d %1$d" (POSIX but not C 99)
Portability
On platforms without gettext, no translation.
po-mode marking
---

13.5.7 Emacs Lisp

RPMs
emacs, xemacs
File extension
el
String syntax
"abc"
gettext shorthand
(_"abc")
gettext/ngettext functions
gettext, dgettext (xemacs only)
textdomain
domain special form (xemacs only)
bindtextdomain
bind-text-domain function (xemacs only)
setlocale
automatic
Prerequisite
---
Use or emulate GNU gettext
use
Extractor
xgettext
Formatting with positions
format "%2$d %1$d"
Portability
Only XEmacs. Without I18N3 defined at build time, no translation.
po-mode marking
---

13.5.8 librep

RPMs
librep 0.15.3 or newer
File extension
jl
String syntax
"abc"
gettext shorthand
(_"abc")
gettext/ngettext functions
gettext
textdomain
textdomain function
bindtextdomain
bindtextdomain function
setlocale
---
Prerequisite
(require 'rep.i18n.gettext)
Use or emulate GNU gettext
use
Extractor
xgettext
Formatting with positions
format "%2$d %1$d"
Portability
On platforms without gettext, no translation.
po-mode marking
---

13.5.9 GNU Smalltalk

RPMs
smalltalk
File extension
st
String syntax
"abc"
gettext shorthand
NLS? "abc"
self? "abc"
gettext/ngettext functions
LcMessagesDomain>>#at:, LcMessagesDomain>>#at:plural:with:
textdomain
LcMessages>>#? (returns a LcMessagesDomain object).
Example: Locale default messages ? 'gettext'
bindtextdomain
LcMessages>>#domain:directory: (returns a LcMessagesDomain object)
setlocale
You can obtain any Locale object from Locale class methods such as #fromString: or #default.
Example: Locale default messages gives the LcMessages object for the default locale.
Prerequisite
The gettext code is contained in the `I18N´ package.
Use or emulate GNU gettext
emulate
Extractor
---
Formatting with positions
'%1 %2' bindWith: 'Hello' with: 'world'
Portability
fully portable
po-mode marking
---

13.5.10 Java

RPMs
java, java2
File extension
java
String syntax
"abc"
gettext shorthand
_("abc")
gettext/ngettext functions
GettextResource.gettext, GettextResource.ngettext
textdomain
---, use ResourceBundle.getBundle instead
bindtextdomain
---, use CLASSPATH instead
setlocale
automatic
Prerequisite
---
Use or emulate GNU gettext
---, uses a Java specific message catalog format
Extractor
xgettext -k_
Formatting with positions
MessageFormat.format "{1,number} {0,number}"
Portability
fully portable
po-mode marking
---

Before marking strings as internationalizable, uses of the string concatenation operator need to be converted to MessageFormat applications. For example, "file "+filename+" not found" becomes MessageFormat.format("file {0} not found", new Object[] { filename }). Only after this is done can the strings be marked and extracted.
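
The same kind of conversion applies in other languages that allow string concatenation. As a rough sketch in Python rather than Java (both function names are hypothetical, and NullTranslations merely stands in for a real catalog), the concatenated form splits the message into fragments, while the formatted form yields one complete msgid that can be marked:

```python
import gettext

# Stand-in for a real catalog: returns every msgid unchanged.
_ = gettext.NullTranslations().gettext


def not_found_concatenated(filename):
    # Hard to translate: the message is split into two fragments.
    return "file " + filename + " not found"


def not_found_formatted(filename):
    # One complete, markable msgid with a named placeholder.
    return _("file %(name)s not found") % {"name": filename}
```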

GNU gettext uses the native Java internationalization mechanism, namely ResourceBundles. To convert a PO file to a ResourceBundle, the msgfmt program can be used with the option --java or --java2. To convert a ResourceBundle back to a PO file, the msgunfmt program can be used with the option --java.

Two different programmatic APIs can be used to access ResourceBundles. Note that both APIs work with all kinds of ResourceBundles, whether GNU gettext generated classes, or other .class or .properties files.

  1. The java.util.ResourceBundle API. In particular, its getString function returns a string translation. Note that a missing translation yields a MissingResourceException. This API has the advantage of being the standard one, and it does not require any additional libraries, only the msgfmt generated .class files. But it cannot do plural handling, even if the resource was generated from a PO file with plural handling.
  2. The gnu.gettext.GettextResource API. Reference documentation in Javadoc 1.1 style format is in the javadoc1 directory and in Javadoc 2 style format in the javadoc2 directory. Its gettext function returns a string translation. Note that when a translation is missing, the msgid argument is returned unchanged. This has the advantage of having the ngettext function for plural handling. To use this API, one needs the libintl.jar file which is part of the GNU gettext package and distributed under the LGPL.

13.5.11 GNU awk

RPMs
gawk 3.1 or newer
File extension
awk
String syntax
"abc"
gettext shorthand
_"abc"
gettext/ngettext functions
dcgettext, missing dcngettext in gawk-3.1.0
textdomain
TEXTDOMAIN variable
bindtextdomain
bindtextdomain function
setlocale
automatic, but missing setlocale (LC_MESSAGES, "") in gawk-3.1.0
Prerequisite
---
Use or emulate GNU gettext
use
Extractor
xgettext
Formatting with positions
printf "%2$d %1$d" (GNU awk only)
Portability
On platforms without gettext, no translation. On non-GNU awks, you must define dcgettext, dcngettext and bindtextdomain yourself.
po-mode marking
---

13.5.12 Pascal - Free Pascal Compiler

RPMs
fpk
File extension
pp, pas
String syntax
'abc'
gettext shorthand
automatic
gettext/ngettext functions
---, use ResourceString data type instead
textdomain
---, use TranslateResourceStrings function instead
bindtextdomain
---, use TranslateResourceStrings function instead
setlocale
automatic, but uses only LANG, not LC_MESSAGES or LC_ALL
Prerequisite
{$mode delphi} or {$mode objfpc}
uses gettext;
Use or emulate GNU gettext
emulate partially
Extractor
ppc386 followed by xgettext or rstconv
Formatting with positions
uses sysutils;
format "%1:d %0:d"
Portability
?
po-mode marking
---

The Pascal compiler has special support for the ResourceString data type. It generates a .rst file. This is then converted to a .pot file by use of xgettext or rstconv. At runtime, a .mo file corresponding to translations of this .pot file can be loaded using the TranslateResourceStrings function in the gettext unit.

13.5.13 wxWindows library

RPMs
wxGTK, gettext
File extension
cpp
String syntax
"abc"
gettext shorthand
_("abc")
gettext/ngettext functions
wxLocale::GetString, wxGetTranslation
textdomain
wxLocale::AddCatalog
bindtextdomain
wxLocale::AddCatalogLookupPathPrefix
setlocale
wxLocale::Init, wxSetLocale
Prerequisite
#include <wx/intl.h>
Use or emulate GNU gettext
emulate, see include/wx/intl.h and src/common/intl.cpp
Extractor
xgettext
Formatting with positions
---
Portability
fully portable
po-mode marking
yes

13.5.14 YCP - YaST2 scripting language

RPMs
libycp, libycp-devel, yast2-core-translator
File extension
ycp
String syntax
"abc"
gettext shorthand
_("abc")
gettext/ngettext functions
_() with 1 or 3 arguments
textdomain
textdomain statement
bindtextdomain
---
setlocale
---
Prerequisite
---
Use or emulate GNU gettext
use maps instead
Extractor
xgettext
Formatting with positions
sformat "%2 %1"
Portability
fully portable
po-mode marking
---

13.5.15 Tcl - Tk's scripting language

RPMs
tcl
File extension
tcl
String syntax
"abc"
gettext shorthand
[_ "abc"]
gettext/ngettext functions
::msgcat::mc
textdomain
---
bindtextdomain
---, use ::msgcat::mcload instead
setlocale
automatic, uses LANG, but ignores LC_MESSAGES and LC_ALL
Prerequisite
package require msgcat
proc _ {s} {return [::msgcat::mc $s]}
Use or emulate GNU gettext
---, uses a Tcl specific message catalog format
Extractor
xgettext -k_
Formatting with positions
format "%2\$d %1\$d"
Portability
fully portable
po-mode marking
---

Before marking strings as internationalizable, substitutions of variables into the string need to be converted to format applications. For example, "file $filename not found" becomes [format "file %s not found" $filename]. Only after this is done can the strings be marked and extracted. After marking, this example becomes [format [_ "file %s not found"] $filename] or [msgcat::mc "file %s not found" $filename]. Note that the msgcat::mc function implicitly calls format when more than one argument is given.

13.5.16 Perl

RPMs
perl, perl-gettext
File extension
pl, PL
String syntax
"abc"
gettext shorthand
---
gettext/ngettext functions
gettext, dgettext, dcgettext
textdomain
textdomain function
bindtextdomain
bindtextdomain function
setlocale
Use setlocale (LC_ALL, "");
Prerequisite
use POSIX;
use Locale::gettext;
Use or emulate GNU gettext
use
Extractor
?
Formatting with positions
---
Portability
?
po-mode marking
---

13.5.17 PHP Hypertext Preprocessor

RPMs
mod_php4, phplib, phpdoc
File extension
php, php3, php4
String syntax
"abc"
gettext shorthand
_("abc")
gettext/ngettext functions
gettext, dgettext, dcgettext
textdomain
textdomain function
bindtextdomain
bindtextdomain function
setlocale
setlocale function
Prerequisite
---
Use or emulate GNU gettext
use
Extractor
---
Formatting with positions
---
Portability
On platforms without gettext, the functions are not available.
po-mode marking
---

13.5.18 Pike

RPMs
roxen
File extension
pike
String syntax
"abc"
gettext shorthand
---
gettext/ngettext functions
gettext, dgettext, dcgettext
textdomain
textdomain function
bindtextdomain
bindtextdomain function
setlocale
setlocale function
Prerequisite
import Locale.Gettext;
Use or emulate GNU gettext
use
Extractor
---
Formatting with positions
---
Portability
On platforms without gettext, the functions are not available.
po-mode marking
---

13.6 Internationalizable Data

Here is a list of other data formats which can be internationalized using GNU gettext.

13.6.1 POT - Portable Object Template

RPMs
gettext
File extension
pot, po
Extractor
xgettext

13.6.2 Resource String Table

RPMs
fpk
File extension
rst
Extractor
xgettext, rstconv

13.6.3 Glade - GNOME user interface description

RPMs
glade, libglade, xml-i18n-tools
File extension
glade
Extractor
xgettext, libglade-xgettext

