This is libc.info, produced by makeinfo version 4.2 from libc.texinfo.

INFO-DIR-SECTION GNU libraries
START-INFO-DIR-ENTRY
* Libc: (libc).                 C library.
END-INFO-DIR-ENTRY

   This file documents the GNU C library.

   This is Edition 0.10, last updated 2001-07-06, of `The GNU C Library
Reference Manual', for Version 2.3.x.

   Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002
Free Software Foundation, Inc.

   Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Free Software Needs Free Documentation" and
"GNU Lesser General Public License", the Front-Cover texts being (a)
(see below), and with the Back-Cover Texts being (b) (see below).  A
copy of the license is included in the section entitled "GNU Free
Documentation License".

   (a) The FSF's Front-Cover Text is:

   A GNU Manual

   (b) The FSF's Back-Cover Text is:

   You have freedom to copy and modify this GNU Manual, like GNU
software.  Copies published by the Free Software Foundation raise
funds for GNU development.


File: libc.info,  Node: Setting the Locale,  Next: Standard Locales,  Prev: Locale Categories,  Up: Locales

How Programs Set the Locale
===========================

   A C program inherits its locale environment variables when it starts
up.  This happens automatically.  However, these variables do not
automatically control the locale used by the library functions, because
ISO C says that all programs start by default in the standard `C'
locale.  To use the locales specified by the environment, you must call
`setlocale'.  Call it as follows:

     setlocale (LC_ALL, "");

to select a locale based on the user's choice of the appropriate
environment variables.

   You can also use `setlocale' to specify a particular locale, for
general use or for a specific category.

   The symbols in this section are defined in the header file
`locale.h'.

 - Function: char * setlocale (int CATEGORY, const char *LOCALE)
     The function `setlocale' sets the current locale for category
     CATEGORY to LOCALE.  A list of all the locales the system provides
     can be created by running

            locale -a

     If CATEGORY is `LC_ALL', this specifies the locale for all
     purposes.  The other possible values of CATEGORY specify a single
     purpose (*note Locale Categories::).

     You can also use this function to find out the current locale by
     passing a null pointer as the LOCALE argument.  In this case,
     `setlocale' returns a string that is the name of the locale
     currently selected for category CATEGORY.

     The string returned by `setlocale' can be overwritten by subsequent
     calls, so you should make a copy of the string (*note Copying and
     Concatenation::) if you want to save it past any further calls to
     `setlocale'.  (The standard library is guaranteed never to call
     `setlocale' itself.)

     You should not modify the string returned by `setlocale'.  It might
     be the same string that was passed as an argument in a previous
     call to `setlocale'.  If you later pass the returned string back as
     the LOCALE argument, CATEGORY must be the same as in the call that
     returned the string.

     When you read the current locale for category `LC_ALL', the value
     encodes the entire combination of selected locales for all
     categories.  In this case, the value is not just a single locale
     name.  In fact, we don't make any promises about what it looks
     like.  But if you specify the same "locale name" with `LC_ALL' in
     a subsequent call to `setlocale', it restores the same combination
     of locale selections.

     To be sure you can use the returned string encoding the currently
     selected locale at a later time, you must make a copy of the
     string.  It is not guaranteed that the returned pointer remains
     valid over time.

     When the LOCALE argument is not a null pointer, the string returned
     by `setlocale' reflects the newly-modified locale.

     If you specify an empty string for LOCALE, this means to read the
     appropriate environment variable and use its value to select the
     locale for CATEGORY.

     If a nonempty string is given for LOCALE, then the locale of that
     name is used if possible.

     If you specify an invalid locale name, `setlocale' returns a null
     pointer and leaves the current locale unchanged.

   Here is an example showing how you might use `setlocale' to
temporarily switch to a new locale.

     #include <stddef.h>
     #include <locale.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <string.h>
     
     void
     with_other_locale (char *new_locale,
                        void (*subroutine) (int),
                        int argument)
     {
       char *old_locale, *saved_locale;
     
       /* Get the name of the current locale.  */
       old_locale = setlocale (LC_ALL, NULL);
     
       /* Copy the name so it won't be clobbered by `setlocale'. */
       saved_locale = strdup (old_locale);
       if (saved_locale == NULL)
         {
           /* Cannot continue without the saved name.  */
           fputs ("Out of memory\n", stderr);
           exit (EXIT_FAILURE);
         }
     
       /* Now change the locale and do some stuff with it. */
       setlocale (LC_ALL, new_locale);
       (*subroutine) (argument);
     
       /* Restore the original locale. */
       setlocale (LC_ALL, saved_locale);
       free (saved_locale);
     }
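
   The failure case described above, where `setlocale' returns a null
pointer for an invalid name and leaves the locale unchanged, can be
sketched as follows.  The helper name `select_locale' is hypothetical
and not part of the library; it is only one possible way to wrap the
call.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: try to select the locale NAME for all
   categories.  Returns a malloc'ed copy of the string that
   `setlocale' reported, or NULL if NAME is not a usable locale
   (the current locale is then unchanged) or memory ran out.  */
char *
select_locale (const char *name)
{
  const char *result = setlocale (LC_ALL, name);
  char *copy;

  if (result == NULL)
    return NULL;

  /* Copy the string, since later `setlocale' calls may overwrite it.  */
  copy = malloc (strlen (result) + 1);
  if (copy != NULL)
    strcpy (copy, result);
  return copy;
}
```

   A caller would typically try `select_locale ("")' first and simply
keep running in the `C' locale if the result is a null pointer.  The
caller owns the returned string and should release it with `free'.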

   *Portability Note:* Some ISO C systems may define additional locale
categories, and future versions of the library will do so.  For
portability, assume that any symbol beginning with `LC_' might be
defined in `locale.h'.


File: libc.info,  Node: Standard Locales,  Next: Locale Information,  Prev: Setting the Locale,  Up: Locales

Standard Locales
================

   The only locale names you can count on finding on all operating
systems are these three standard ones:

`"C"'
     This is the standard C locale.  The attributes and behavior it
     provides are specified in the ISO C standard.  When your program
     starts up, it initially uses this locale by default.

`"POSIX"'
     This is the standard POSIX locale.  Currently, it is an alias for
     the standard C locale.

`""'
     The empty name says to select a locale based on environment
     variables.  *Note Locale Categories::.

   Defining and installing named locales is normally a responsibility of
the system administrator at your site (or the person who installed the
GNU C library).  It is also possible for the user to create private
locales.  Both are discussed later, along with the tool used to create
locales.

   If your program needs to use something other than the `C' locale, it
will be more portable if you use whatever locale the user specifies
with the environment, rather than trying to specify some non-standard
locale explicitly by name.  Remember, different machines might have
different sets of locales installed.


File: libc.info,  Node: Locale Information,  Next: Formatting Numbers,  Prev: Standard Locales,  Up: Locales

Accessing Locale Information
============================

   There are several ways to access locale information.  The simplest
way is to let the C library itself do the work.  Several of the
functions in this library implicitly access the locale data, and use
what information is provided by the currently selected locale.  This is
how the locale model is meant to work normally.

   As an example take the `strftime' function, which is meant to nicely
format date and time information (*note Formatting Calendar Time::).
Part of the standard information contained in the `LC_TIME' category is
the names of the weekdays and months.  Instead of requiring the
programmer to take care of providing the translations, the `strftime'
function does this all by itself.  `%A' in the format string is
replaced by the appropriate weekday name of the locale currently
selected by `LC_TIME'.  This is an easy example, and wherever possible
functions do things automatically in this way.
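
   A minimal sketch of this implicit use of locale data follows.  The
helper name `weekday_name' is hypothetical; it only wraps a `strftime'
call with the `%A' conversion so that the library looks up the name in
the current `LC_TIME' locale.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: store the full name of weekday WDAY
   (0 = Sunday ... 6 = Saturday, as in `tm_wday') in BUF, using the
   locale currently selected for LC_TIME.  Returns the number of
   bytes stored, or 0 if the name does not fit in LEN bytes.  */
size_t
weekday_name (char *buf, size_t len, int wday)
{
  struct tm tm = { 0 };

  /* `%A' only consults the `tm_wday' member.  */
  tm.tm_wday = wday;
  return strftime (buf, len, "%A", &tm);
}
```

   The code contains no table of names at all.  After a successful
`setlocale (LC_TIME, "")' the same call yields the name in the user's
language without any change to the program.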

   But there are quite often situations when there is simply no function
to perform the task, or it is simply not possible to do the work
automatically.  For these cases it is necessary to access the
information in the locale directly.  To do this the C library provides
two functions: `localeconv' and `nl_langinfo'.  The former is part of
ISO C and therefore portable, but has a brain-damaged interface.  The
second is part of the Unix interface and is portable in as far as the
system follows the Unix standards.

* Menu:

* The Lame Way to Locale Data::   ISO C's `localeconv'.
* The Elegant and Fast Way::      X/Open's `nl_langinfo'.


File: libc.info,  Node: The Lame Way to Locale Data,  Next: The Elegant and Fast Way,  Up: Locale Information

`localeconv': It is portable but ...
------------------------------------

   Together with the `setlocale' function the ISO C people invented the
`localeconv' function.  It is a masterpiece of poor design.  It is
expensive to use, not extendable, and not generally usable as it
provides access to only `LC_MONETARY' and `LC_NUMERIC' related
information.  Nevertheless, if it is applicable to a given situation it
should be used since it is very portable.  The function `strfmon'
formats monetary amounts according to the selected locale using this
information.

 - Function: struct lconv * localeconv (void)
     The `localeconv' function returns a pointer to a structure whose
     components contain information about how numeric and monetary
     values should be formatted in the current locale.

     You should not modify the structure or its contents.  The
     structure might be overwritten by subsequent calls to
     `localeconv', or by calls to `setlocale', but no other function in
     the library overwrites this value.

 - Data Type: struct lconv
     `localeconv''s return value is of this data type.  Its elements are
     described in the following subsections.

   If a member of the structure `struct lconv' has type `char', and the
value is `CHAR_MAX', it means that the current locale has no value for
that parameter.
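
   The `CHAR_MAX' convention can be handled as in the following sketch.
The helper name `local_frac_digits' is hypothetical, and falling back
to 0 when the locale has no value is a choice made for this example,
not something the interface requires.

```c
#include <limits.h>
#include <locale.h>

/* Hypothetical helper: number of fractional digits to print for a
   monetary value in the local format.  Returns 0 when the current
   locale has no value for this parameter (CHAR_MAX).  */
int
local_frac_digits (void)
{
  const struct lconv *lc = localeconv ();

  /* Read the member right away; the structure may be overwritten by
     later calls to `localeconv' or `setlocale'.  */
  return lc->frac_digits == CHAR_MAX ? 0 : lc->frac_digits;
}
```

   Copying the needed member into a local value, as done here, avoids
keeping a pointer into the structure across calls that might
overwrite it.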

* Menu:

* General Numeric::             Parameters for formatting numbers and
                                 currency amounts.
* Currency Symbol::             How to print the symbol that identifies an
                                 amount of money (e.g. `$').
* Sign of Money Amount::        How to print the (positive or negative) sign
                                 for a monetary amount, if one exists.


File: libc.info,  Node: General Numeric,  Next: Currency Symbol,  Up: The Lame Way to Locale Data

Generic Numeric Formatting Parameters
.....................................

   These are the standard members of `struct lconv'; there may be
others.

`char *decimal_point'
`char *mon_decimal_point'
     These are the decimal-point separators used in formatting
     non-monetary and monetary quantities, respectively.  In the `C'
     locale, the value of `decimal_point' is `"."', and the value of
     `mon_decimal_point' is `""'.

`char *thousands_sep'
`char *mon_thousands_sep'
     These are the separators used to delimit groups of digits to the
     left of the decimal point in formatting non-monetary and monetary
     quantities, respectively.  In the `C' locale, both members have a
     value of `""' (the empty string).

`char *grouping'
`char *mon_grouping'
     These are strings that specify how to group the digits to the left
     of the decimal point.  `grouping' applies to non-monetary
     quantities and `mon_grouping' applies to monetary quantities.  Use
     either `thousands_sep' or `mon_thousands_sep' to separate the digit
     groups.

     Each member of these strings is to be interpreted as an integer
     value of type `char'.  Successive numbers (from left to right)
     give the sizes of successive groups (from right to left, starting
     at the decimal point.)  The last member is either `0', in which
     case the previous member is used over and over again for all the
     remaining groups, or `CHAR_MAX', in which case there is no more
     grouping--or, put another way, any remaining digits form one large
     group without separators.

     For example, if `grouping' is `"\04\03\02"', the correct grouping
     for the number `123456787654321' is `12', `34', `56', `78', `765',
     `4321'.  This uses a group of 4 digits at the end, preceded by a
     group of 3 digits, preceded by groups of 2 digits (as many as
     needed).  With a separator of `,', the number would be printed as
     `12,34,56,78,765,4321'.

     A value of `"\03"' indicates repeated groups of three digits, as
     normally used in the U.S.

     In the standard `C' locale, both `grouping' and `mon_grouping'
     have a value of `""'.  This value specifies no grouping at all.

`char int_frac_digits'
`char frac_digits'
     These are small integers indicating how many fractional digits (to
     the right of the decimal point) should be displayed in a monetary
     value in international and local formats, respectively.  (Most
     often, both members have the same value.)

     In the standard `C' locale, both of these members have the value
     `CHAR_MAX', meaning "unspecified".  The ISO standard doesn't say
     what to do when you find this value; we recommend printing no
     fractional digits.  (This locale also specifies the empty string
     for `mon_decimal_point', so printing any fractional digits would be
     confusing!)
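
   The rules for interpreting `grouping' and `mon_grouping' described
above can be sketched as follows.  The function name `group_digits',
its parameters, and the fixed 256-byte work buffer are assumptions of
this example; the library itself does not provide such a function.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy DIGITS (decimal digits only, no sign or
   fraction) to OUT, inserting SEP between the groups described by
   GROUPING, which uses the `struct lconv' encoding.  Returns 0 on
   success, -1 if the result does not fit.  */
int
group_digits (char *out, size_t outlen, const char *digits,
              const char *grouping, const char *sep)
{
  char tmp[256];                /* Filled from right to left.  */
  size_t pos = sizeof tmp;      /* Index of the first used byte.  */
  size_t n = strlen (digits);
  size_t seplen = strlen (sep);
  const char *g = grouping;
  /* Current group size; 0 means "no further grouping".  */
  size_t size = (*g != '\0' && *g != CHAR_MAX) ? (size_t) *g : 0;
  size_t in_group = 0;

  tmp[--pos] = '\0';
  while (n > 0)
    {
      if (size != 0 && in_group == size)
        {
          /* A group is complete: emit the separator.  */
          if (pos < seplen + 1)
            return -1;
          pos -= seplen;
          memcpy (tmp + pos, sep, seplen);
          in_group = 0;
          if (g[1] == CHAR_MAX)
            size = 0;                   /* One large remaining group.  */
          else if (g[1] != '\0')
            size = (size_t) *++g;       /* Next group size.  */
          /* Otherwise the terminating 0 repeats the previous size.  */
        }
      if (pos == 0)
        return -1;
      tmp[--pos] = digits[--n];
      in_group++;
    }

  if (sizeof tmp - pos > outlen)
    return -1;
  memcpy (out, tmp + pos, sizeof tmp - pos);
  return 0;
}
```

   Called with the digits `123456787654321', a grouping of
`"\04\03\02"' and a separator of `","', this stores
`12,34,56,78,765,4321' in the output buffer, matching the example
given for `grouping'.  With an empty grouping string the digits are
copied unchanged.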


File: libc.info,  Node: Currency Symbol,  Next: Sign of Money Amount,  Prev: General Numeric,  Up: The Lame Way to Locale Data

Printing the Currency Symbol
............................

   These members of the `struct lconv' structure specify how to print
the symbol to identify a monetary value--the international analog of
`$' for US dollars.

   Each country has two standard currency symbols.  The "local currency
symbol" is used commonly within the country, while the "international
currency symbol" is used internationally to refer to that country's
currency when it is necessary to indicate the country unambiguously.

   For example, many countries use the dollar as their monetary unit,
and when dealing with international currencies it's important to specify
that one is dealing with (say) Canadian dollars instead of U.S. dollars
or Australian dollars.  But when the context is known to be Canada,
there is no need to make this explicit--dollar amounts are implicitly
assumed to be in Canadian dollars.

`char *currency_symbol'
     The local currency symbol for the selected locale.

     In the standard `C' locale, this member has a value of `""' (the
     empty string), meaning "unspecified".  The ISO standard doesn't
     say what to do when you find this value; we recommend you simply
     print the empty string as you would print any other string pointed
     to by this variable.

`char *int_curr_symbol'
     The international currency symbol for the selected locale.

     The value of `int_curr_symbol' should normally consist of a
     three-letter abbreviation determined by the international standard
     `ISO 4217 Codes for the Representation of Currency and Funds',
     followed by a one-character separator (often a space).

     In the standard `C' locale, this member has a value of `""' (the
     empty string), meaning "unspecified".  We recommend you simply
     print the empty string as you would print any other string pointed
     to by this variable.

`char p_cs_precedes'
`char n_cs_precedes'
`char int_p_cs_precedes'
`char int_n_cs_precedes'
     These members are `1' if the `currency_symbol' or
     `int_curr_symbol' strings should precede the value of a monetary
     amount, or `0' if the strings should follow the value.  The
     `p_cs_precedes' and `int_p_cs_precedes' members apply to positive
     amounts (or zero), and the `n_cs_precedes' and `int_n_cs_precedes'
     members apply to negative amounts.

     In the standard `C' locale, all of these members have a value of
     `CHAR_MAX', meaning "unspecified".  The ISO standard doesn't say
     what to do when you find this value.  We recommend printing the
     currency symbol before the amount, which is right for most
     countries.  In other words, treat all nonzero values alike in
     these members.

     The members with the `int_' prefix apply to the `int_curr_symbol'
     while the other two apply to `currency_symbol'.

`char p_sep_by_space'
`char n_sep_by_space'
`char int_p_sep_by_space'
`char int_n_sep_by_space'
     These members are `1' if a space should appear between the
     `currency_symbol' or `int_curr_symbol' strings and the amount, or
     `0' if no space should appear.  The `p_sep_by_space' and
     `int_p_sep_by_space' members apply to positive amounts (or zero),
     and the `n_sep_by_space' and `int_n_sep_by_space' members apply to
     negative amounts.

     In the standard `C' locale, all of these members have a value of
     `CHAR_MAX', meaning "unspecified".  The ISO standard doesn't say
     what you should do when you find this value; we suggest you treat
     it as 1 (print a space).  In other words, treat all nonzero values
     alike in these members.

     The members with the `int_' prefix apply to the `int_curr_symbol'
     while the other two apply to `currency_symbol'.  There is one
     peculiarity with `int_curr_symbol', though.  Since all legal
     values contain a space at the end of the string, one either prints
     this space (if the currency symbol must appear in front and must
     be separated) or avoids printing this character at all (especially
     when it would fall at the end of the output).


File: libc.info,  Node: Sign of Money Amount,  Prev: Currency Symbol,  Up: The Lame Way to Locale Data

Printing the Sign of a Monetary Amount
......................................

   These members of the `struct lconv' structure specify how to print
the sign (if any) of a monetary value.

`char *positive_sign'
`char *negative_sign'
     These are strings used to indicate positive (or zero) and negative
     monetary quantities, respectively.

     In the standard `C' locale, both of these members have a value of
     `""' (the empty string), meaning "unspecified".

     The ISO standard doesn't say what to do when you find this value;
     we recommend printing `positive_sign' as you find it, even if it is
     empty.  For a negative value, print `negative_sign' as you find it
     unless both it and `positive_sign' are empty, in which case print
     `-' instead.  (Failing to indicate the sign at all seems rather
     unreasonable.)

`char p_sign_posn'
`char n_sign_posn'
`char int_p_sign_posn'
`char int_n_sign_posn'
     These members are small integers that indicate how to position the
     sign for nonnegative and negative monetary quantities,
     respectively.  (The string used by the sign is what was specified
     with `positive_sign' or `negative_sign'.)  The possible values are
     as follows:

    `0'
          The currency symbol and quantity should be surrounded by
          parentheses.

    `1'
          Print the sign string before the quantity and currency symbol.

    `2'
          Print the sign string after the quantity and currency symbol.

    `3'
          Print the sign string right before the currency symbol.

    `4'
          Print the sign string right after the currency symbol.

    `CHAR_MAX'
          "Unspecified".  All of these members have this value in the
          standard `C' locale.

     The ISO standard doesn't say what you should do when the value is
     `CHAR_MAX'.  We recommend you print the sign after the currency
     symbol.

     The members with the `int_' prefix apply to the `int_curr_symbol'
     while the other two apply to `currency_symbol'.
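
   How the currency symbol, the separating space and the sign position
combine can be sketched as follows.  The function `layout_money' and
its parameters are hypothetical.  It takes an already formatted
quantity and the relevant `struct lconv' values as arguments, treats
`CHAR_MAX' as recommended in the text above, and silently truncates
overlong strings.  It follows only the values `0' and `1' described
here for the `*_sep_by_space' members.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: combine QUANTITY (already formatted digits),
   SYMBOL and SIGN in BUF according to the `struct lconv' style
   parameters.  Returns what `snprintf' returns for the final string.  */
int
layout_money (char *buf, size_t len, const char *quantity,
              const char *symbol, const char *sign,
              char cs_precedes, char sep_by_space, char sign_posn)
{
  /* All nonzero values, including CHAR_MAX, are treated alike.  */
  const char *space = sep_by_space ? " " : "";
  char sym[64];
  char body[128];

  /* Values 3 and 4 attach the sign directly to the currency symbol;
     CHAR_MAX is handled like 4 (sign after the symbol).  */
  switch (sign_posn)
    {
    case 3:
      snprintf (sym, sizeof sym, "%s%s", sign, symbol);
      break;
    case 4:
    case CHAR_MAX:
      snprintf (sym, sizeof sym, "%s%s", symbol, sign);
      break;
    default:
      snprintf (sym, sizeof sym, "%s", symbol);
      break;
    }

  if (cs_precedes)
    snprintf (body, sizeof body, "%s%s%s", sym, space, quantity);
  else
    snprintf (body, sizeof body, "%s%s%s", quantity, space, sym);

  switch (sign_posn)
    {
    case 0:
      return snprintf (buf, len, "(%s)", body);
    case 1:
      return snprintf (buf, len, "%s%s", sign, body);
    case 2:
      return snprintf (buf, len, "%s%s", body, sign);
    default:
      return snprintf (buf, len, "%s", body);
    }
}
```

   In a real program the arguments would come from the members of the
structure returned by `localeconv', choosing the `p_' or `n_' members
according to the sign of the amount.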


File: libc.info,  Node: The Elegant and Fast Way,  Prev: The Lame Way to Locale Data,  Up: Locale Information

Pinpoint Access to Locale Data
------------------------------

   When writing the X/Open Portability Guide the authors realized that
the `localeconv' function is not enough to provide reasonable access to
locale information.  The information which was meant to be available in
the locale (as later specified in the POSIX.1 standard) requires more
ways to access it.  Therefore the `nl_langinfo' function was introduced.

 - Function: char * nl_langinfo (nl_item ITEM)
     The `nl_langinfo' function can be used to access individual
     elements of the locale categories.  Unlike the `localeconv'
     function, which returns all the information, `nl_langinfo' lets
     the caller select what information it requires.  This is very fast
     and it is not a problem to call this function multiple times.

     A second advantage is that in addition to the numeric and monetary
     formatting information, information from the `LC_TIME' and
     `LC_MESSAGES' categories is available.

     The type `nl_item' is defined in `nl_types.h'.  The argument ITEM
     is a numeric value defined in the header `langinfo.h'.  The X/Open
     standard defines the following values:

    `CODESET'
          `nl_langinfo' returns a string with the name of the coded
          character set used in the selected locale.

    `ABDAY_1'
    `ABDAY_2'
    `ABDAY_3'
    `ABDAY_4'
    `ABDAY_5'
    `ABDAY_6'
    `ABDAY_7'
          `nl_langinfo' returns the abbreviated weekday name.  `ABDAY_1'
          corresponds to Sunday.

    `DAY_1'
    `DAY_2'
    `DAY_3'
    `DAY_4'
    `DAY_5'
    `DAY_6'
    `DAY_7'
          Similar to `ABDAY_1' etc., but here the return value is the
          unabbreviated weekday name.

    `ABMON_1'
    `ABMON_2'
    `ABMON_3'
    `ABMON_4'
    `ABMON_5'
    `ABMON_6'
    `ABMON_7'
    `ABMON_8'
    `ABMON_9'
    `ABMON_10'
    `ABMON_11'
    `ABMON_12'
          The return value is the abbreviated name of the month.
          `ABMON_1' corresponds to January.

    `MON_1'
    `MON_2'
    `MON_3'
    `MON_4'
    `MON_5'
    `MON_6'
    `MON_7'
    `MON_8'
    `MON_9'
    `MON_10'
    `MON_11'
    `MON_12'
          Similar to `ABMON_1' etc., but here the month names are not
          abbreviated.  Here the first value `MON_1' also corresponds
          to January.

    `AM_STR'
    `PM_STR'
          The return values are strings which can be used in the
          representation of time as an hour from 1 to 12 plus an am/pm
          specifier.

          Note that in locales which do not use this time representation
          these strings might be empty, in which case the am/pm format
          cannot be used at all.

    `D_T_FMT'
          The return value can be used as a format string for
          `strftime' to represent time and date in a locale-specific
          way.

    `D_FMT'
          The return value can be used as a format string for
          `strftime' to represent a date in a locale-specific way.

    `T_FMT'
          The return value can be used as a format string for
          `strftime' to represent time in a locale-specific way.

    `T_FMT_AMPM'
          The return value can be used as a format string for
          `strftime' to represent time in the am/pm format.

          Note that if the am/pm format does not make any sense for the
          selected locale, the return value might be the same as the
          one for `T_FMT'.

    `ERA'
          The return value represents the era used in the current
          locale.

          Most locales do not define this value.  An example of a
          locale which does define this value is the Japanese one.  In
          Japan, the traditional representation of dates includes the
          name of the era corresponding to the then-emperor's reign.

          Normally it should not be necessary to use this value
          directly.  Specifying the `E' modifier in their format
          strings causes the `strftime' functions to use this
          information.  The format of the returned string is not
          specified, and therefore you should not assume knowledge of
          it on different systems.

    `ERA_YEAR'
          The return value gives the year in the relevant era of the
          locale.  As for `ERA' it should not be necessary to use this
          value directly.

    `ERA_D_T_FMT'
          This return value can be used as a format string for
          `strftime' to represent dates and times in a locale-specific
          era-based way.

    `ERA_D_FMT'
          This return value can be used as a format string for
          `strftime' to represent a date in a locale-specific era-based
          way.

    `ERA_T_FMT'
          This return value can be used as a format string for
          `strftime' to represent time in a locale-specific era-based
          way.

    `ALT_DIGITS'
          The return value is a representation of up to 100 values used
          to represent the values 0 to 99.  As for `ERA' this value is
          not intended to be used directly, but instead indirectly
          through the `strftime' function.  When the modifier `O' is
          used in a format which would otherwise use numerals to
          represent hours, minutes, seconds, weekdays, months, or
          weeks, the appropriate value for the locale is used instead.

    `INT_CURR_SYMBOL'
          The same as the value returned by `localeconv' in the
          `int_curr_symbol' element of the `struct lconv'.

    `CURRENCY_SYMBOL'
    `CRNCYSTR'
          The same as the value returned by `localeconv' in the
          `currency_symbol' element of the `struct lconv'.

          `CRNCYSTR' is a deprecated alias still required by Unix98.

    `MON_DECIMAL_POINT'
          The same as the value returned by `localeconv' in the
          `mon_decimal_point' element of the `struct lconv'.

    `MON_THOUSANDS_SEP'
          The same as the value returned by `localeconv' in the
          `mon_thousands_sep' element of the `struct lconv'.

    `MON_GROUPING'
          The same as the value returned by `localeconv' in the
          `mon_grouping' element of the `struct lconv'.

    `POSITIVE_SIGN'
          The same as the value returned by `localeconv' in the
          `positive_sign' element of the `struct lconv'.

    `NEGATIVE_SIGN'
          The same as the value returned by `localeconv' in the
          `negative_sign' element of the `struct lconv'.

    `INT_FRAC_DIGITS'
          The same as the value returned by `localeconv' in the
          `int_frac_digits' element of the `struct lconv'.

    `FRAC_DIGITS'
          The same as the value returned by `localeconv' in the
          `frac_digits' element of the `struct lconv'.

    `P_CS_PRECEDES'
          The same as the value returned by `localeconv' in the
          `p_cs_precedes' element of the `struct lconv'.

    `P_SEP_BY_SPACE'
          The same as the value returned by `localeconv' in the
          `p_sep_by_space' element of the `struct lconv'.

    `N_CS_PRECEDES'
          The same as the value returned by `localeconv' in the
          `n_cs_precedes' element of the `struct lconv'.

    `N_SEP_BY_SPACE'
          The same as the value returned by `localeconv' in the
          `n_sep_by_space' element of the `struct lconv'.

    `P_SIGN_POSN'
          The same as the value returned by `localeconv' in the
          `p_sign_posn' element of the `struct lconv'.

    `N_SIGN_POSN'
          The same as the value returned by `localeconv' in the
          `n_sign_posn' element of the `struct lconv'.

    `INT_P_CS_PRECEDES'
          The same as the value returned by `localeconv' in the
          `int_p_cs_precedes' element of the `struct lconv'.

    `INT_P_SEP_BY_SPACE'
          The same as the value returned by `localeconv' in the
          `int_p_sep_by_space' element of the `struct lconv'.

    `INT_N_CS_PRECEDES'
          The same as the value returned by `localeconv' in the
          `int_n_cs_precedes' element of the `struct lconv'.

    `INT_N_SEP_BY_SPACE'
          The same as the value returned by `localeconv' in the
          `int_n_sep_by_space' element of the `struct lconv'.

    `INT_P_SIGN_POSN'
          The same as the value returned by `localeconv' in the
          `int_p_sign_posn' element of the `struct lconv'.

    `INT_N_SIGN_POSN'
          The same as the value returned by `localeconv' in the
          `int_n_sign_posn' element of the `struct lconv'.

    `DECIMAL_POINT'
    `RADIXCHAR'
          The same as the value returned by `localeconv' in the
          `decimal_point' element of the `struct lconv'.

          The name `RADIXCHAR' is a deprecated alias still used in
          Unix98.

    `THOUSANDS_SEP'
    `THOUSEP'
          The same as the value returned by `localeconv' in the
          `thousands_sep' element of the `struct lconv'.

          The name `THOUSEP' is a deprecated alias still used in Unix98.

    `GROUPING'
          The same as the value returned by `localeconv' in the
          `grouping' element of the `struct lconv'.

    `YESEXPR'
          The return value is a regular expression which can be used
          with the `regcomp' and `regexec' functions to recognize a
          positive response to a yes/no question.  The GNU C library
          provides the `rpmatch' function for easier handling in
          applications.

    `NOEXPR'
          The return value is a regular expression which can be used
          with the `regcomp' and `regexec' functions to recognize a
          negative response to a yes/no question.

    `YESSTR'
          The return value is a locale-specific translation of the
          positive response to a yes/no question.

          Using this value is deprecated since it is a very special
          case of message translation, and is better handled by the
          message translation functions (*note Message Translation::).

    `NOSTR'
          The return value is a locale-specific translation of the
          negative response to a yes/no question.  What is said for
          `YESSTR' is also true here.

          The use of this symbol is deprecated.  Instead message
          translation should be used.

     The file `langinfo.h' defines a lot more symbols but none of them
     is official.  Using them is not portable, and the format of the
     return values might change.  Therefore we recommend that you not
     use them.

     Note that the return value for any valid argument can be used in
     all situations (with the possible exception of the am/pm time
     formatting codes).  If the user has not selected any locale for the
     appropriate category, `nl_langinfo' returns the information from
     the `"C"' locale.  It is therefore possible to use this function as
     shown in the example below.

     If the argument ITEM is not valid, a pointer to an empty string is
     returned.
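
     The `YESEXPR' item described above could be used as in the
     following sketch.  It assumes the POSIX functions `regcomp',
     `regexec' and `regfree' from `regex.h' and treats the value as an
     extended regular expression; the helper name `is_affirmative' is
     hypothetical.

```c
#include <langinfo.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if RESPONSE is a positive answer
   according to the current locale's YESEXPR, 0 if it is not, and -1
   if the expression could not be compiled.  */
int
is_affirmative (const char *response)
{
  regex_t re;
  int rc;

  if (regcomp (&re, nl_langinfo (YESEXPR), REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  /* REG_NOSUB: only the match status is needed, no submatches.  */
  rc = regexec (&re, response, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

     The same approach works for `NOEXPR'.  A program that asks many
     questions might compile the expression once and reuse it.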

   An example of `nl_langinfo' usage is a function which has to print a
given date and time in a locale-specific way.  At first one might think
that, since `strftime' internally uses the locale information, writing
something like the following is enough:

     size_t
     i18n_time_n_data (char *s, size_t len, const struct tm *tp)
     {
       return strftime (s, len, "%X %D", tp);
     }

   The format contains no weekday or month names and therefore is
internationally usable.  Wrong!  The output produced is something like
`"hh:mm:ss MM/DD/YY"'.  This format is only recognizable in the USA.
Other countries use different formats.  Therefore the function should
be rewritten like this:

     size_t
     i18n_time_n_data (char *s, size_t len, const struct tm *tp)
     {
       return strftime (s, len, nl_langinfo (D_T_FMT), tp);
     }

   Now it uses the date and time format of the locale selected when the
program runs.  If the user selects the locale correctly there should
never be a misunderstanding over the time and date format.
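
   The rewritten function can be made a little more defensive.  The
following is only a sketch: the name `format_local_time' and the
fallback format are assumptions of this example, not part of the
library.  It guards against the empty string that `nl_langinfo' returns
for an invalid item, and it assumes the caller has already run
`setlocale (LC_ALL, "")' if the user's locale is wanted.

```c
#include <langinfo.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format *TP into S (at most LEN bytes) using
   the date and time format of the current LC_TIME locale.  Returns
   what strftime returns: the number of bytes stored, or 0 if the
   result did not fit.  */
size_t
format_local_time (char *s, size_t len, const struct tm *tp)
{
  const char *fmt = nl_langinfo (D_T_FMT);

  /* An empty format would make strftime produce nothing, so fall
     back to a fixed, locale-independent format.  This fallback is
     an assumption of the sketch, not library behavior.  */
  if (fmt == NULL || fmt[0] == '\0')
    fmt = "%Y-%m-%d %H:%M:%S";

  return strftime (s, len, fmt, tp);
}
```

   A caller would typically pass the result of `localtime' together
with a buffer of a few dozen bytes, and treat a return value of 0 as
"buffer too small".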


File: libc.info,  Node: Formatting Numbers,  Next: Yes-or-No Questions,  Prev: Locale Information,  Up: Locales

A dedicated function to format numbers
======================================

   We have seen that the structure returned by `localeconv' as well as
the values given to `nl_langinfo' allow you to retrieve the various
pieces of locale-specific information to format numbers and monetary
amounts.  We have also seen that the underlying rules are quite complex.

   Therefore the X/Open standards introduce a function which uses such
locale information, making it easier for the user to format numbers
according to these rules.

 - Function: ssize_t strfmon (char *S, size_t MAXSIZE, const char
          *FORMAT, ...)
     The `strfmon' function is similar to the `strftime' function in
     that it takes a buffer, its size, a format string, and values to
     write into the buffer as text in a form specified by the format
     string.  Like `strftime', the function also returns the number of
     bytes written into the buffer.

     There are two differences: `strfmon' can take more than one
     argument, and, of course, the format specification is different.
     Like `strftime', the format string consists of normal text, which
     is output as is, and format specifiers, which are indicated by a
     `%'.  Immediately after the `%', you can optionally specify
     various flags and formatting information before the main
     formatting character, in a similar way to `printf':

        * Immediately following the `%' there can be one or more of the
          following flags:
         `=F'
               The single byte character F is used for this field as
               the numeric fill character.  By default this character
               is a space character.  Filling with this character is
               only performed if a left precision is specified.  It is
               not used to pad the output to the given field width.

         `^'
               The number is printed without grouping the digits
               according to the rules of the current locale.  By
               default grouping is enabled.

         `+', `('
               At most one of these flags can be used.  They select
               which format to represent the sign of a currency amount.
               By default, and if `+' is given, the locale equivalent
               of +/- is used.  If `(' is given, negative amounts are
               enclosed in parentheses.  The exact format is determined
               by the values of the `LC_MONETARY' category of the
               locale selected at program runtime.

         `!'
               The output will not contain the currency symbol.

         `-'
               The output will be formatted left-justified instead of
               right-justified if it does not fill the entire field
               width.

     The next part of a specification is an optional field width.  If no
     width is specified 0 is taken.  During output, the function first
     determines how much space is required.  If it requires at least as
     many characters as given by the field width, it is output using as
     much space as necessary.  Otherwise, it is extended to use the
     full width by filling with the space character.  The presence or
     absence of the `-' flag determines the side at which such padding
     occurs.  If present, the spaces are added at the right making the
     output left-justified, and vice versa.

     So far the format looks familiar, being similar to the `printf' and
     `strftime' formats.  However, the next two optional fields
     introduce something new.  The first one is a `#' character followed
     by a decimal digit string.  The value of the digit string
     specifies the number of _digit_ positions to the left of the
     decimal point (or equivalent).  This does _not_ include the
     grouping character when the `^' flag is not given.  If the number
     has fewer digits than this left precision, the field is padded at
     the left side with the fill character, which can be selected
     using the `=' flag and by default is a space.  For example, if
     the left precision is 6, the number is 123, and the fill
     character is `*', the result will be `***123'.

     The second optional field starts with a `.' (period) and consists
     of another decimal digit string.  Its value describes the number of
     characters printed after the decimal point.  The default is
     selected from the current locale (`frac_digits',
     `int_frac_digits', see *note General Numeric::).  If the exact
     representation needs more fractional digits than this precision
     allows, the displayed value is rounded.  If the number of
     fractional digits is selected to be zero, no decimal point is
     printed.

     As a GNU extension, the `strfmon' implementation in the GNU libc
     allows an optional `L' next as a format modifier.  If this modifier
     is given, the argument is expected to be a `long double' instead of
     a `double' value.

     Finally, the last component is a format specifier.  There are three
     specifiers defined:

    `i'
          Use the locale's rules for formatting an international
          currency value.

    `n'
          Use the locale's rules for formatting a national currency
          value.

    `%'
          Place a `%' in the output.  No flag, width specifier or
          modifier may be given; only `%%' is allowed.

     As for `printf', the function reads the format string from left to
     right and uses the values passed to the function following the
     format string.  The values are expected to be either of type
     `double' or `long double', depending on the presence of the
     modifier `L'.  The result is stored in the buffer pointed to by S.
     At most MAXSIZE characters are stored.

     The return value of the function is the number of characters
     stored in S, not including the terminating null byte.  If the
     number of characters to be stored would exceed MAXSIZE, the
     function returns -1 and the content of the buffer S is
     unspecified.  In this case `errno' is set to `E2BIG'.
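
     Because the buffer content is unspecified after a failure, callers
     should check the return value before using S.  The following is a
     minimal sketch of such a check; the wrapper name `format_money' is
     hypothetical, and clearing the buffer on failure is a choice made
     for this example rather than something the library does.

```c
#include <monetary.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical wrapper: format VALUE with the national currency
   format ("%n") into BUF.  Returns the strfmon result on success.
   On failure (for example E2BIG when BUF is too small) it returns
   -1 and stores an empty string in BUF, so that the caller never
   reads the unspecified contents.  */
ssize_t
format_money (char *buf, size_t size, double value)
{
  ssize_t n = strfmon (buf, size, "%n", value);

  if (n < 0)
    {
      if (size > 0)
        buf[0] = '\0';
      return -1;
    }
  return n;
}
```

     A caller that needs to distinguish the cause of a failure can
     still inspect `errno' right after the wrapper returns -1.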

   A few examples should make clear how the function works.  It is
assumed that all the following pieces of code are executed in a program
which uses the USA locale (`en_US').  The simplest form of the format
is this:

     strfmon (buf, 100, "@%n@%n@%n@", 123.45, -567.89, 12345.678);

The output produced is
     "@$123.45@-$567.89@$12,345.68@"

   We can notice several things here.  First, the widths of the output
numbers are different.  We have not specified a width in the format
string, and so this is no wonder.  Second, the third number is printed
using thousands separators.  The thousands separator for the `en_US'
locale is a comma.  The number is also rounded.  .678 is rounded to .68
since the format does not specify a precision and the default value in
the locale is 2.  Finally, note that the national currency symbol is
printed since `%n' was used, not `%i'.  The next example shows how we
can align the output.

     strfmon (buf, 100, "@%=*11n@%=*11n@%=*11n@", 123.45, -567.89, 12345.678);

The output this time is:

     "@    $123.45@   -$567.89@ $12,345.68@"

   Two things stand out.  Firstly, all fields have the same width
(eleven characters) since this is the width given in the format and
since no number required more characters to be printed.  The second
important point is that the fill character is not used.  This is
correct since the white space was not used to achieve a precision given
by a `#' modifier, but instead to fill to the given width.  The
difference becomes obvious if we now add a left precision (`#5').

     strfmon (buf, 100, "@%=*11#5n@%=*11#5n@%=*11#5n@",
              123.45, -567.89, 12345.678);

The output is

     "@ $***123.45@-$***567.89@ $12,345.68@"

   Here we can see that all the currency symbols are now aligned, and
that the space between the currency sign and the number is filled with
the selected fill character.  Note that although the left precision is
selected to be 5 and 123.45 has three digits left of the decimal point,
the space is filled with three asterisks.  This is correct since, as
explained above, the left precision does not include the positions used
to store thousands separators.  One last example should explain the
remaining functionality.

     strfmon (buf, 100, "@%=0(16#5.3i@%=0(16#5.3i@%=0(16#5.3i@",
              123.45, -567.89, 12345.678);

This rather complex format string produces the following output:

     "@ USD 000123.450 @(USD 000567.890)@ USD 12,345.678 @"

   The most noticeable change is the alternative way of representing
negative numbers.  In financial circles this is often done using
parentheses, and this is what the `(' flag selected.  The fill
character is now `0'.  Note that this `0' character is not regarded as
a numeric zero, and therefore the first and second numbers are not
printed using a thousands separator.  Since we used the format
specifier `i' instead of `n', the international form of the currency
symbol is used.  This is a four-character string, in this case `"USD "'.
The last point is that since the precision right of the decimal point
is selected to be three, the first and second numbers are printed with
an extra zero at the end and the third number is printed without
rounding.


File: libc.info,  Node: Yes-or-No Questions,  Prev: Formatting Numbers,  Up: Locales

Yes-or-No Questions
===================

   Some non-GUI programs ask a yes-or-no question.  If the messages
(especially the questions) are translated into foreign languages, be
sure that you localize the answers too.  It would be a very bad habit
to ask a question in one language and request the answer in another,
often English.

   The GNU C library contains `rpmatch' to give applications easy
access to the corresponding locale definitions.

 - Function: int rpmatch (const char *RESPONSE)
     The function `rpmatch' checks whether the string in RESPONSE is a
     valid yes-or-no answer and, if so, which one.  The check uses the
     `YESEXPR' and `NOEXPR' data in the `LC_MESSAGES' category of the
     currently selected locale.  The return value is as follows:

    `1'
          The user entered an affirmative answer.

    `0'
          The user entered a negative answer.

    `-1'
          The answer matched neither the `YESEXPR' nor the `NOEXPR'
          regular expression.

     This function is not standardized, but besides GNU libc it is
     also available in at least the IBM AIX library.

This function would normally be used like this:

       ...
       /* Use a safe default.  */
       _Bool doit = false;
     
       fputs (gettext ("Do you really want to do this? "), stdout);
       fflush (stdout);
       /* Prepare the `getline' call.  */
       line = NULL;
       len = 0;
       while (getline (&line, &len, stdin) >= 0)
         {
           /* Check the response.  */
           int res = rpmatch (line);
           if (res >= 0)
             {
               /* We got a definitive answer.  */
               if (res > 0)
                 doit = true;
               break;
             }
         }
       /* Free what `getline' allocated.  */
       free (line);

   Note that the loop continues until a read error is detected or
until a definitive (positive or negative) answer is read.
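
   The fragment can be wrapped into a reusable function.  The version
below is only a sketch: the name `ask_yes_no' and its parameters are
hypothetical, and it takes the input and output streams as arguments so
that it can be exercised without a terminal.  Unlike the fragment, it
repeats the prompt after an unrecognized reply.

```c
#define _GNU_SOURCE   /* Assumed: exposes getline and rpmatch on glibc.  */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: print PROMPT on OUT and read lines from IN
   until rpmatch recognizes an answer.  Returns true only for an
   affirmative answer; end of file or a read error yields the safe
   default, false.  */
bool
ask_yes_no (FILE *in, FILE *out, const char *prompt)
{
  bool answer = false;
  char *line = NULL;
  size_t len = 0;

  fputs (prompt, out);
  fflush (out);
  while (getline (&line, &len, in) >= 0)
    {
      int res = rpmatch (line);
      if (res >= 0)
        {
          /* We got a definitive answer.  */
          answer = res > 0;
          break;
        }
      /* Unrecognized reply: ask again.  */
      fputs (prompt, out);
      fflush (out);
    }
  /* Free what getline allocated.  */
  free (line);
  return answer;
}
```

   A program would call it as, for example,
`ask_yes_no (stdin, stdout, gettext ("Do you really want to do this? "))'
after selecting the user's locale with `setlocale'.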


File: libc.info,  Node: Message Translation,  Next: Searching and Sorting,  Prev: Locales,  Up: Top

Message Translation
*******************

   The program's interface with the human should be designed to make
the human's task easier.  One of the possibilities is to use messages
in whatever language the user prefers.

   Printing messages in different languages can be implemented in
different ways.  One could add all the different languages in the
source code and choose among the variants every time a message has to
be printed.  This is certainly not a good solution since extending the
set of languages is difficult (the code must be changed) and the code
itself can become really big with dozens of message sets.

   A better solution is to keep the message sets for each language in
separate files which are loaded at runtime depending on the language
selection of the user.

   The GNU C Library provides two different sets of functions to support
message translation.  The problem is that neither of the interfaces is
officially defined by the POSIX standard.  The `catgets' family of
functions is defined in the X/Open standard but this is derived from
industry decisions and therefore not necessarily based on reasonable
decisions.

   As mentioned above, the message catalog handling provides easy
extensibility by using external data files which contain the message
translations.  That is, for each of the messages used in the program
these files contain a translation for the appropriate language.  So
the tasks of the message handling functions are

   * locate the external data file with the appropriate translations.

   * load the data and make it possible to address the messages

   * map a given key to the translated message

   The two approaches mainly differ in the implementation of this last
step.  The design decisions made for it influence everything else.

* Menu:

* Message catalogs a la X/Open::  The `catgets' family of functions.
* The Uniforum approach::         The `gettext' family of functions.


File: libc.info,  Node: Message catalogs a la X/Open,  Next: The Uniforum approach,  Up: Message Translation

X/Open Message Catalog Handling
===============================

   The `catgets' functions are based on the simple scheme:

     Associate every message to translate in the source code with a
     unique identifier.  To retrieve a message from a catalog file
     solely the identifier is used.

   This means for the author of the program that s/he will have to make
sure the meaning of the identifier in the program code and in the
message catalogs is always the same.

   Before a message can be translated the catalog file must be located.
The user of the program must be able to guide the responsible function
to find whatever catalog the user wants.  This choice is independent
of what the programmer had in mind.

   All the types, constants and functions for the `catgets' functions
are defined/declared in the `nl_types.h' header file.
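
   To give a first impression of the scheme, here is a minimal sketch
that locates a catalog, looks up one message by its identifiers, and
falls back to a default text.  The helper name `lookup_message' and its
parameters are hypothetical.  The text is copied into a caller-supplied
buffer because a string returned by `catgets' should not be used after
the catalog is closed.

```c
#include <nl_types.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy message MSG_ID of set SET_ID from the
   catalog named CATALOG into BUF (at most SIZE bytes).  If the
   catalog or the message cannot be found, DEFAULT_TEXT is copied
   instead.  Returns 1 if the text came from the catalog, else 0.  */
int
lookup_message (const char *catalog, int set_id, int msg_id,
                const char *default_text, char *buf, size_t size)
{
  /* NL_CAT_LOCALE asks catopen to honor the LC_MESSAGES category.  */
  nl_catd cat = catopen (catalog, NL_CAT_LOCALE);
  const char *text = default_text;
  int found = 0;

  if (cat != (nl_catd) -1)
    {
      /* catgets hands back DEFAULT_TEXT itself when the message
         is missing, so a pointer comparison detects that case.  */
      text = catgets (cat, set_id, msg_id, default_text);
      found = text != default_text;
    }

  /* Copy before closing: the catalog owns the returned string.  */
  snprintf (buf, size, "%s", text);

  if (cat != (nl_catd) -1)
    catclose (cat);
  return found;
}
```

   Opening and closing the catalog for every message is wasteful; a
real program would normally open it once at startup and keep the
descriptor for its whole run.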

* Menu:

* The catgets Functions::      The `catgets' function family.
* The message catalog files::  Format of the message catalog files.
* The gencat program::         How to generate message catalog files which
                                can be used by the functions.
* Common Usage::               How to use the `catgets' interface.

