Explains working with the GNU gettext i18n framework in Linux environments. The details in these slides are generic and intended for learning purposes only. They do not cover the processes followed at C-DAC (GIST).
4. Why develop internationalized applications?
Localization (l10n) and Internationalization
(i18n) are means of adapting computer
software to different languages, regional
differences and technical requirements of a
target market.
www.cdac.in
Internationalization combines developer tasks with
localization: it enables a product to be used with
multiple scripts and cultures by separating user interface
resources into a localizable format.
This concept is also known as NLS (National
Language Support or Native Language
Support).
5. Localization: Compile Time, Link Time, Run Time
Compile time: create language *.cpp files and convert them into objects; compile the parent CPP with the required library.
Link time: compile the objects to libraries.
Run time: set the locale (LANG/LANGUAGE environment variable) and bind the text domain; trigger 'gettext' to fetch strings from the message catalogs as per the set locale.
6. Translations: Portable Objects & Native Formats
Portable Objects (.po)
A language-independent text file that includes the original texts and the translations.
Native Formats
.sdf, .xml, .properties, .ini, .rc, .yml, .wordfast, .json, .sub
Machine Objects (.mo)
Include exactly the same contents as the PO file, compiled to a binary format that programs read at run time.
7. Tools we need to keep in the toolbox!
For common platforms: Windows / Linux / Mac
No.  Tool              Free/Non-free  Licensed under  Online/Offline  Platform dependency
1.   Pootle            Free           GNU GPL         Online          N/A
2.   Rosetta           Non-free                       Online          N/A
3.   Kartouche         Free           GNU GPL         Online          N/A
4.   KBabel            Free           GNU GPL         Offline         Windows/Linux
5.   poEdit            Free                           Offline         Windows/Linux
6.   Attesoro          Free           GNU GPL         Offline         Linux
7.   passolo           Free           GNU GPL         Offline         Windows
8.   IniTranslator     Free           GNU GPL         Offline         Windows
9.   GTranslator       Free           GNU GPL         Offline         Linux
10.  LocFactoryEditor  Free           GNU GPL         Offline         Mac OS
Poedit: a cross-platform editor for gettext catalogs (.po files); it can also generate .mo files.
Translation Toolkit (http://translate.sourceforge.net/wiki/toolkit/index)
• Converters: moz2po, oo2po, prop2po, php2po, txt2po, po2wordfast, pot2po, csv2po, html2po, ini2po, json2po, rc2po
• Tools: poconflicts, pofilter, pogrep, pomerge, pocompile, poclean
8. Localization Sphere: Desktop, Web, Mobile
i18n support is available in every technology in the form of
APIs, frameworks, libraries, etc., and they all work on a
similar concept: run-time injection, fetching strings from a
native format.
http://www.endlesslycurious.com/2008/10/
An example could be…
GNU gettext for C, C++ and open source tools
Microsoft Localization Framework (resource.dll based)
For Java: Apache Tapestry and International Components for Unicode
BabelFx for Flash and Flex Rich Internet Applications
Rails Internationalization (I18n) API for Ruby on Rails
9. Localization Mechanics
How things are actually linked up to make localization
dynamic: a tool with an English-language UI quickly switches
to Hindi, adding Hindi-speaking regions to its list of fans!
10. Locale – the program – basis of localization
A way to handle localization levels easily…
The locale program writes information about the current locale
environment, or about all available locales, to standard output.
Environment variables available to locale-aware programs:
1. LC_CTYPE (Character classification and case conversion)
2. LC_COLLATE (Collation order)
3. LC_TIME (Date and time formats)
4. LC_NUMERIC (Non-monetary numeric formats)
5. LC_MONETARY (Monetary formats)
6. LC_MESSAGES (Formats of informative, diagnostic messages and interactive
responses)
7. LC_PAPER (Paper size)
8. LC_NAME (Name formats)
9. LC_ADDRESS (Address formats and location information)
10. LC_TELEPHONE (Telephone number formats)
11. LC_MEASUREMENT (Measurement units)
12. LC_IDENTIFICATION (Metadata about the locale information)
LOCPATH: where locale data is stored. Default is /usr/lib/locale
11. Working with GNU Gettext
Things we need in place…
Required programs for GNOME are:
1. gcc (GNU C Compiler)
2. gettext (GNU Internationalization Utilities)
3. gettext-base (GNU Internationalization Utilities for the base
system)
4. libc6 (GNU C Shared Libraries)
5. libc6-dev (GNU C Development Libraries)
6. locales (Common files for locale support)
7. libintl (Message translations system compatible i18n library)
8. php-gettext (read gettext MO files directly through PHP)
9. gtranslator (PO File editor for the GNOME desktop)
10. poedit (gettext catalog editor)
12. Working with GNU Gettext – Implementation thru ‘C’
Internationalized ‘Hello World’ Program
#include <stdio.h>
#include <locale.h>
#include <libintl.h>
int main(void)
{
    /* initializes the entire current locale as per environment variables set by the
       user */
    setlocale(LC_ALL, "");
    /* sets the base directory for the message catalogs */
    bindtextdomain("hello", ".");
    textdomain("hello"); /* set domain for future gettext() calls */
    /* gettext() looks up the translation at run time, which allows the
       translator to work independently from the programmer */
    printf("%s", gettext("Hello World\n"));
    return 0;
}
13. Working with GNU Gettext – Implementation thru ‘C’
Next Steps…
xgettext
• Extract strings from source file
• Create the template for translations
msginit
• Create the files to translate using the template
• Edit and translate file.
• Set Project-Id-Version to {TextDomain}
msgfmt
• Create target directories in the bound Text Domain location.
• Compile and install translations
14. Developers Checklist
Externalize all translatable content – Take the
text out of the code and place it in resource files
Separating the translatable text from the code will avoid code
duplication, will let localizers and developers work on updates
simultaneously and remove the possibility of damaging code during
translation.
15. Developers Checklist
Allow input of international data and foreign scripts
Input fields often do character validation, so make sure to attach
the validation rule to the specific country or have the validation
rule update when country selection changes.
16. Developers Checklist
Avoid string concatenation
Concatenation only works when the content is written for a
specific language. Avoid constructing strings through concatenation
as this makes translation hard – even impossible in certain cases.
17. Developers Checklist
Avoid using a given string variable in more
than one context
This form will not work for many languages as the verb
will be different depending on the product name.
Further, do not use a noun as a parameter in a sentence
and avoid reusing strings. Translation tools let linguists
recycle previously translated strings during the
translation pass.
18. Developers Checklist
Do all string handling with Unicode
Make sure the characters don't get corrupted along the
input > database > output route:
An internationalized application uses Unicode for all handling of
strings and text. This applies to the static text as well as the
dynamic text that is communicated between the application and
the database.
19. Developers Checklist
Provide extra room for text expansion – User Interface
Translated text expands 30% on average with the exception of
some languages where it may shrink. Leave enough room on
the layout for expansion and avoid static sizing. If there are
strings that should not exceed a certain size, always include
comments in the resource file for those items.
20. Developers Checklist
Add context information to strings using comments
A string can be translated to an Indian language in many
different ways. It is very important to provide context
information in the resource file when necessary.
21. Developers Checklist
Use system functions for date/time and numeric formatting
Date/time and numeric formatting differ even between the regions
that speak the same language. Example: dd.mm.yyyy in Bengali;
dd-mm-yyyy in Kannada, Gujarati, Hindi, Marathi, Punjabi, Tamil;
d-m-yyyy in Telugu, no leading zeroes.
22. Developers Checklist
Externalize all styles and formatting
Font face, size and style will be different for some languages. Inline
styling prevents these modifications or requires code duplication.
Always use external style sheets to define styles for a web application.
Avoid using styling tags such as "em", "strong" and "italic" text. Bold
font faces cause problems, as bold strokes may turn into a big blob of
ink when the font size is small in print.
If a string needs to be emphasized with a bold font face, do it by
externalizing the style. This way, localizers can choose the font size
as needed.
24. Developers Checklist
Use system functions for sorting and string comparison
http://msdn.microsoft.com/en-us/goglobal/bb688122
This example has been taken from Microsoft MSDN.
An internationalized application does not use any manual
sorting logic and relies on the underlying framework's API for
string comparison. This applies to database data as well as the
strings that come from resource files, which may be used in
form elements and others such as combo boxes.