This talk lasts 三十分钟
My talk from Swissjs 2015. This is about issues to keep in mind when localising software, showing some WTF moments that people probably don't have in mind when they think about localisation.


  1. This talk lasts Localisation is easy
  2. Administrative Notes
  3. • @pilif on Twitter
     • pilif on GitHub
     • working at Sensational AG
  4. • @pilif on Twitter
     • pilif on GitHub
     • working at Sensational AG
     • warming up to shirts
  5. Thanks Richard for the Recording
  6. About that 💩
  7. Maybe ES6…?
  8. My host name is a horrible spoiler if you're into JRPGs. Disregard
  9. however…
  10. close enough.
  11. Back to the topic at hand
  12. Let’s talk terms
      • Language is a language as it is spoken or written
      • Locale is the name given to a set of parameters that define how things should be done for users speaking a certain language in a certain place
      • There are many more locales than countries
  13. Locale
      • Locales consist of a language…
      • … and a country
      • … and sometimes specific variants
  14. Specifying locales
      • IETF BCP 47 document
      • See RFC 5646 and RFC 4647
      • Use language-script-territory-variant
      • POSIX uses language_territory.encoding@modifier
  15. fr-Latn-CH
  16. fr-CH
  17. fr_CH.utf-8
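The three spellings above can be related with a few lines of code. This is a minimal sketch, assuming a runtime that ships `Intl.Locale` and `Intl.getCanonicalLocales` (both newer than this talk). `posixToBcp47` is a hypothetical helper, not part of any library, and it only handles the simple `language_territory.encoding@modifier` shape.

```javascript
// Parse a BCP 47 tag into its parts with the built-in Intl.Locale.
const full = new Intl.Locale('fr-Latn-CH');
console.log(full.language, full.script, full.region);

// Hypothetical helper: turn a simple POSIX locale name into a BCP 47 tag.
// The encoding and any @modifier are dropped because they have no
// direct counterpart in a BCP 47 tag.
function posixToBcp47(posixName) {
  const withoutModifier = posixName.split('@')[0];
  const withoutEncoding = withoutModifier.split('.')[0];
  const tag = withoutEncoding.replace(/_/g, '-');
  // getCanonicalLocales validates the tag and normalises its casing.
  return Intl.getCanonicalLocales(tag)[0];
}

console.log(posixToBcp47('fr_CH.utf-8'));
```

Dropping the encoding is a deliberate choice in this sketch: JavaScript strings have a fixed internal encoding, so that part of the POSIX name carries no information here.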
  18. The Locale affects many things
  19. Number formatting
      • Probably the most obvious of the bunch.
      • Decimal separator
      • Thousands separator
      • Sign
      • Also: Currency information
  20. Some Samples
                              de-CH   de-DE   en-US
      decimal separator       .       ,       .
      thousands separator     '       .       ,
  21. 12,435
      en-US   twelve thousand four hundred and thirty five
      de-DE   twelve comma four three five
      de-CH   error
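The separator table can be reproduced at runtime instead of being hard-coded. A minimal sketch, assuming a JavaScript engine with full ICU locale data; `separators` is a name made up for this example. The exact apostrophe character used for de-CH depends on the CLDR version that ships with the runtime, so the code does not assume one.

```javascript
const locales = ['de-CH', 'de-DE', 'en-US'];

// Return the decimal and grouping separators a locale uses,
// read from the parts of an actually formatted number.
function separators(locale) {
  const parts = new Intl.NumberFormat(locale).formatToParts(12435.5);
  const find = (type) => (parts.find((p) => p.type === type) || {}).value;
  return { decimal: find('decimal'), group: find('group') };
}

for (const locale of locales) {
  const formatted = new Intl.NumberFormat(locale).format(12435.5);
  console.log(locale, formatted, separators(locale));
}
```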
  22. Date Formatting
      • Obviously names of months and weekdays
      • Order of distinct parts
      • Separator character
      • Commonly used formats in different contexts
  23. Date Formatting
      • Libraries usually provide a generic short/medium/long format
      • Libraries also provide templates
      • If your library’s template language has any characters that are not for replacement, it is doing it wrong
      • Apple has done it right since OS X 10.11 and iOS 9
  24. 2015-07-18 17:47
              Long                               Medium                     Short
      en-US   July 18, 2015 at 4:58:00 PM CEST   Jul 18, 2015, 4:58:00 PM   7/18/15, 4:58 PM
      fr-CA   18 juillet 2015 16:58:00 UTC+2     18 juil. 2015 16:58:00     15-07-18 16:58
      fr-CH   18 juillet 2015 16:58:00 UTC+2     18 juil. 2015 16:58:00     18.07.15 16:58
      fr-FR   18 juillet 2015 16:58:00 UTC+2     18 juil. 2015 16:58:00     18/07/2015 16:58
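A table like the one above can be generated with the generic long/medium/short styles. This is a sketch, assuming a runtime that supports the `dateStyle` and `timeStyle` options of `Intl.DateTimeFormat` (added after this talk). The exact strings depend on the CLDR data of the runtime, so they may differ from the slide. `formatAll` is a name invented for the example.

```javascript
// A fixed instant so the output does not depend on when the code runs.
// 14:58 UTC is 16:58 in Zurich during summer time.
const when = new Date(Date.UTC(2015, 6, 18, 14, 58, 0));
const dateLocales = ['en-US', 'fr-CA', 'fr-CH', 'fr-FR'];
const styles = ['long', 'medium', 'short'];

// Build one row per locale with the long, medium and short rendering.
function formatAll(date) {
  const rows = {};
  for (const locale of dateLocales) {
    rows[locale] = styles.map((style) =>
      new Intl.DateTimeFormat(locale, {
        dateStyle: style,
        timeStyle: style,
        timeZone: 'Europe/Zurich',
      }).format(date)
    );
  }
  return rows;
}

console.log(formatAll(when));
```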
  25. Choice of calendar
      • Most of the world is using the Gregorian calendar
      • The Julian calendar uses the same month names but is off by 13 days (they have July 5th right now)
      • Other calendars use different month names
      • Might affect holiday calculations
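To see the effect of the calendar choice, the same instant can be formatted in several calendars. A minimal sketch, assuming full ICU data and using the `-u-ca-` Unicode extension of BCP 47 to request a calendar. The calendars listed are just examples picked for illustration, and `formatterFor` is a made-up name.

```javascript
const day = new Date(Date.UTC(2015, 6, 18, 12, 0, 0));

// Build a formatter that keeps the English language but swaps the calendar.
function formatterFor(calendar) {
  return new Intl.DateTimeFormat(`en-US-u-ca-${calendar}`, {
    year: 'numeric', month: 'long', day: 'numeric', timeZone: 'UTC',
  });
}

for (const calendar of ['gregory', 'hebrew', 'japanese']) {
  const formatter = formatterFor(calendar);
  // resolvedOptions() reports which calendar the runtime actually used.
  console.log(formatter.resolvedOptions().calendar, formatter.format(day));
}
```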
  26. Collation order
      • How to compare two strings. Which one is first?
      • Where to put the characters with pesky accents?
      • How to deal with case differences?
      • What about non-Latin scripts?
  27. Collation fun*
      • Phonebook German vs. ordinary German vs. Austrian German (dealing with umlauts)
      • Contractions (Spanish ch counts as one letter, ch in Czech sorts after h, but c after b, etc.)
      • Handling of accents is language-dependent
      • Case insensitive is a mess
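A small illustration of language-dependent collation, assuming a runtime with `Intl.Collator` and full ICU data. The word list is a well-known example and `sortFor` is a helper invented here. German treats ä as a variant of a, while Swedish sorts it after z.

```javascript
const letters = ['Z', 'a', 'z', 'ä'];

// Sort a copy of the list with the collation rules of the given locale.
function sortFor(locale, items) {
  return [...items].sort(new Intl.Collator(locale).compare);
}

console.log(sortFor('de', letters)); // [ 'a', 'ä', 'z', 'Z' ]
console.log(sortFor('sv', letters)); // [ 'a', 'z', 'Z', 'ä' ]

// Plain sort() compares UTF-16 code units and ignores language altogether.
console.log([...letters].sort());    // [ 'Z', 'a', 'z', 'ä' ]
```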
  28. Case folding
      • Some languages don’t differentiate between upper- and lowercase
      • Inconsistent mapping between upper- and lowercase (ß => SS, the reverse is not always true)
      • Uppercasing accented characters is language (and sometimes locale) dependent. French characters often lose accents when uppercasing
      • Inconsistent uppercasing for some languages (uppercase Turkish i is İ. Lowercase Turkish I is ı)
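The ß and Turkish i cases can be checked directly with the built-in string methods. A short sketch; `roundTrips` is a helper made up for this example.

```javascript
// ß has no single-letter uppercase mapping here, and SS does not map back.
console.log('ß'.toUpperCase());            // 'SS'
console.log('SS'.toLowerCase());           // 'ss'

// Turkish has a dotted and a dotless i, each with its own case pair.
console.log('i'.toLocaleUpperCase('tr'));  // 'İ'
console.log('I'.toLocaleLowerCase('tr'));  // 'ı'
console.log('i'.toUpperCase());            // 'I' (locale-independent)

// True when uppercasing and lowercasing again returns the original text.
function roundTrips(text) {
  return text.toUpperCase().toLowerCase() === text;
}

console.log(roundTrips('strasse'), roundTrips('straße')); // true false
```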
  29. Double the fun
      • Collation and case folding make an interesting team
      • Depending on locale, upper- and lowercase should be sorted together or apart
      • In some locales, case doesn’t matter at all when sorting
      • In some locales, case always matters when sorting
      • Depends on the use-case
  30. Collation strength
      • ICU created the concept of “collation strength”
      • strength 1 is the most lenient
      • strength 5 is the most exact
      • Example: Strength 2 removes accents unless the language is Danish
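In JavaScript the closest thing to ICU's numeric strength is the `sensitivity` option of `Intl.Collator`, with the values `base`, `accent`, `case` and `variant`. Treat the correspondence to ICU strength levels as a loose one. The sketch below assumes full ICU data; `equalAt` is a helper invented for illustration.

```javascript
// True when the two strings compare as equal at the given sensitivity.
function equalAt(locale, sensitivity, a, b) {
  return new Intl.Collator(locale, { sensitivity }).compare(a, b) === 0;
}

// 'base' ignores accents and case, but what counts as an accent is
// language-dependent: ä is a separate letter in Swedish.
console.log(equalAt('de', 'base', 'a', 'ä')); // true
console.log(equalAt('sv', 'base', 'a', 'ä')); // false

// 'accent' distinguishes accents but not case, 'case' does the opposite.
console.log(equalAt('de', 'accent', 'a', 'ä')); // false
console.log(equalAt('de', 'accent', 'a', 'A')); // true
console.log(equalAt('de', 'case', 'a', 'A'));   // false
```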
  31. ‘nough said RTL
  32. Perspectives matter
  33. Context matters
      • “This slide lasts one minute”
      • “This talk lasts 30 minutes”
      • “Lunch lasted 1:30 hours”
      • “Tomorrow I’ll sleep in”
      • “August, 1th is a national holiday”
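The “1th” in the last bullet is the kind of bug that appears when a suffix is glued onto a number without looking at the number. One way to avoid it, sketched here under the assumption that `Intl.PluralRules` is available (it postdates the talk): ask the runtime for the ordinal category and look up the suffix. The suffix table is a hypothetical, English-only example.

```javascript
const ordinalRules = new Intl.PluralRules('en-US', { type: 'ordinal' });

// English-only suffix table, keyed by the ordinal plural category.
const suffixes = { one: 'st', two: 'nd', few: 'rd', other: 'th' };

// Append the ordinal suffix that matches the number.
function ordinal(n) {
  return `${n}${suffixes[ordinalRules.select(n)]}`;
}

console.log(`August ${ordinal(1)} is a national holiday`);
console.log([2, 3, 4, 11, 21].map(ordinal).join(', '));
```

Other languages need their own table, and often a different sentence structure altogether, which is exactly the point of the slide.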
  34. Let’s get practical
  35. Locale handling is like escaping
      • Always store raw unformatted data
      • Format near the end of the chain
      • Just before you escape
      • Parse user input as early as possible
      • Use native data types
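A sketch of what these rules could look like in code. The record shape and the function names (`serialise`, `deserialise`, `render`) are assumptions made for this example. It also assumes a runtime that supports `dateStyle`.

```javascript
// Stored form: native types only, no locale-specific strings.
const record = {
  amount: 1956.3334,
  currency: 'EUR',
  paidAt: new Date(Date.UTC(2015, 6, 18, 14, 58)),
};

// Serialise for storage or transport: plain number, ISO 8601 timestamp.
function serialise(r) {
  return JSON.stringify({ ...r, paidAt: r.paidAt.toISOString() });
}

// Parse as early as possible, back into native types.
function deserialise(json) {
  const raw = JSON.parse(json);
  return { ...raw, paidAt: new Date(raw.paidAt) };
}

// Format only at the very end of the chain, for one specific locale.
function render(r, locale) {
  const money = new Intl.NumberFormat(locale, {
    style: 'currency', currency: r.currency,
  }).format(r.amount);
  const date = new Intl.DateTimeFormat(locale, {
    dateStyle: 'medium', timeZone: 'UTC',
  }).format(r.paidAt);
  return `${money} (${date})`;
}

console.log(render(deserialise(serialise(record)), 'de-CH'));
```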
  36. UI Language is not locale
      • Users might prefer to use the OS in a different language than what’s inferred by their locale
      • Just because I’m in de_CH it doesn’t mean I want your software to speak German to me
      • UI language is completely different from the user’s locale
  37. Avoid this mess
  38. Avoid this mess
  39. Avoid this mess
  40. Mixing Locales
      • Forming sentences in UI language with locale formatted data is… challenging
      • Be mindful that language might influence some locale formatting.
      • “This talk lasts ”
      • or rather “This talk lasts 30 minutes”
      • It depends. Does the locale also use hours and minutes?
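Keeping UI language and formatting locale as two separate inputs might look like the following. This is only a sketch: the `messages` catalogue and `renderTotal` are hypothetical, and a real application would use a proper message library.

```javascript
// Hypothetical message catalogue keyed by UI language.
const messages = {
  en: (total) => `Total: ${total}`,
  de: (total) => `Summe: ${total}`,
};

// uiLanguage picks the wording, formatLocale picks the number format.
function renderTotal(amount, uiLanguage, formatLocale) {
  const total = new Intl.NumberFormat(formatLocale).format(amount);
  return messages[uiLanguage](total);
}

console.log(renderTotal(1234.5, 'en', 'de-DE')); // Total: 1.234,5
console.log(renderTotal(1234.5, 'de', 'en-US')); // Summe: 1,234.5
```

Numbers are the easy case. As the slide warns, data that contains words (durations, dates with month names) mixes far less gracefully.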
  41. Never be helpful* and translate units
  42. 1 kg in de_CH is not 1 lb in en_US
  43. Btw: Apple’s APIs are really good at this
  44. What about web sites?
      • Never, ever infer UI language by IP Geolocation.
      People from Google: This slide is for you!
  45. What about web sites?
      • Never, ever infer UI language by IP Geolocation.
      • Ever. Ever. EVER.
      People from Google: This slide is for you!
  46. What about web sites?
      • Never, ever infer UI language by IP Geolocation.
      • Ever. Ever. EVER.
      • Promise!
      People from Google: This slide is for you!
  47. What about web sites?
      • Never, ever infer UI language by IP Geolocation.
      • Ever. Ever. EVER.
      • Promise!
      • You may infer Locale from IP Geolocation though
      People from Google: This slide is for you!
  48. Rely on HTTP
      • Trust Accept-Language: by now browsers set it correctly
      • Use the header to determine UI language
      • Use the header to determine default locale
      • But ask the user
      • Same goes for time zones
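Reading `Accept-Language` takes only a few lines. The sketch below is a simplified parser written for illustration, not a complete implementation of the matching rules in RFC 4647. Both function names are made up, and the list of supported languages is an assumption.

```javascript
// Parse an Accept-Language header into tags ordered by descending q-value.
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map((item, index) => {
      const [tag, ...params] = item.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const quality = q === undefined ? 1 : Number(q.slice(2));
      return {
        tag: tag.trim(),
        quality: Number.isNaN(quality) ? 0 : quality,
        index,
      };
    })
    .filter((entry) => entry.tag !== '' && entry.quality > 0)
    // Higher quality first, original order as the tie-breaker.
    .sort((a, b) => b.quality - a.quality || a.index - b.index)
    .map((entry) => entry.tag);
}

// Pick the first supported UI language, matching on the primary subtag only.
function pickUiLanguage(header, supported, fallback) {
  for (const tag of parseAcceptLanguage(header)) {
    const primary = tag.split('-')[0].toLowerCase();
    if (supported.includes(primary)) return primary;
  }
  return fallback;
}

console.log(pickUiLanguage('fr-CH, fr;q=0.9, en;q=0.8, *;q=0.5', ['en', 'de'], 'en'));
```

The result is only a default. As the slide says, the user should still be able to override it.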
  49. SHOW ME SOME CODE ALREADY!!!
  50. The past
      • There has always been date formatting (Date.toLocaleString). Mostly useless
      • People were self-nebling (search YouTube for “ich neble selber”), for example in date pickers and libraries
      • hint: applying substr() to Date.toDateString() is not a correct solution.
      • same goes for using replace('.', ',') on a number
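Why the `replace('.', ',')` trick falls short is easiest to see side by side with a real formatter. A small demonstration, assuming a runtime with `Intl.NumberFormat` and full ICU data:

```javascript
const value = 1234567.891;

// Naive approach: only swaps the decimal point, no grouping at all.
const naive = String(value).replace('.', ',');

// Library approach: the locale data supplies both separators.
const proper = new Intl.NumberFormat('de-DE').format(value);

console.log(naive);  // 1234567,891
console.log(proper); // 1.234.567,891

// Very small numbers stringify in exponent notation, which the
// naive replace passes through unchanged.
console.log(String(1e-7).replace('.', ',')); // 1e-7
```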
  51. The present
      • Microsoft has donated a huge chunk of localisation code to the jQuery project.
      • It’s not integrated into jQuery, but maintained by the jQuery project
      • Check out https://github.com/jquery/globalize
      • Doesn’t support collation
      • The library is big
      • But most of it is data and this problem can only be solved with a huge database of special cases
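  52. // Assumes the required CLDR data has already been loaded into Globalize
      Globalize.locale("fr-CH");

      // Generic medium date and time format
      console.log(Globalize.formatDate(
        new Date(), { datetime: "medium" }
      ));

      // Skeleton: year and full month name, ordered by the locale
      console.log(Globalize.formatDate(
        new Date(), { skeleton: "yMMMM" }
      ));

      console.log(Globalize.formatNumber(12345.6789));

      console.log(Globalize.formatCurrency(1956.3334, "EUR"));

      console.log(Globalize.formatRelativeTime(-35, "second"));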
  59. The future
      • ECMA-402 from 2012
      • Yes. Specs from 2012 are “the future” in JS land
      • Provides the global Intl object
      • Date, Number formatting and Collation
      • see: http://www.ecma-international.org/ecma-402/1.0/
  60. Could be worse
  61. node.js is still bikeshedding because ICU
  62. var f = new Intl.DateTimeFormat('de-CH', {
        weekday: 'long', year: 'numeric',
        month: 'long', day: 'numeric'
      });
      console.log(f.format(new Date()));

      var n = new Intl.NumberFormat('de-CH', {
        style: "decimal",
        minimumFractionDigits: 2
      });
      console.log(n.format(1234.5));

      var currency = new Intl.NumberFormat('de-CH', {
        style: "currency",
        currency: 'EUR'
      });
      console.log(currency.format(1234.5));

      var comp = new Intl.Collator('de-CH');
      var words = [
        "Swissjs", "swissjs", "is",
        "loads", "of", "fun"
      ];
      // sort() expects a comparison function, so pass the collator's compare method
      console.log(words.sort(comp.compare));
  70. Conclusion
      • Proper localisation is part of our job to make the web useful for everybody
      • Use the libraries provided
      • Whenever you think you know better than the library: No. You don’t.
      • Remember that UI language and Locale are not always connected
      • Don’t do IP geolocation for language choice
      • When in doubt: Ask the user. She’ll know for sure.
  71. Before I leave
  72. "👨‍❤️‍💋‍👨".length
      [..."👨‍❤️‍💋‍👨"].length
  73. In case you answered 11 and 8, I salute you
  74. Thanks everyone and enjoy your evening
  75. • U+1F468 (MAN) 👨
      • U+200D (ZERO WIDTH JOINER)
      • U+2764 (HEAVY BLACK HEART) ❤
      • U+FE0F (VARIATION SELECTOR-16)
      • U+200D (ZERO WIDTH JOINER)
      • U+1F48B (KISS MARK) 💋
      • U+200D (ZERO WIDTH JOINER)
      • U+1F468 (MAN) 👨
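The two answers can be verified by building the string from the code points listed above. `length` counts UTF-16 code units, while spreading a string iterates over code points. The grapheme count at the end is an addition that relies on `Intl.Segmenter`, which is newer than this talk, so the sketch guards against its absence.

```javascript
// Build the string from the eight code points on the last slide.
const kiss = String.fromCodePoint(
  0x1F468, 0x200D, 0x2764, 0xFE0F, 0x200D, 0x1F48B, 0x200D, 0x1F468
);

console.log(kiss.length);      // 11 UTF-16 code units
console.log([...kiss].length); // 8 code points

// Count user-perceived characters where the runtime supports it.
function countGraphemes(text) {
  if (typeof Intl.Segmenter !== 'function') return undefined;
  const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
  return [...segmenter.segment(text)].length;
}

console.log(countGraphemes(kiss)); // 1 where Intl.Segmenter is available
```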
