Camomile : A Unicode library for OCaml

                   Yoriyuki Yamagata

  National Institute of Advanced Industrial Science and Technology (AIST)


        ML Workshop, September 18, 2011
Outline

   Overview


   ASCII to Unicode : A challenge of multilingualization


   Example : Unicode normal forms


   ulib


   Conclusion
Overview - functionality
   Camomile - A Unicode library for OCaml
      Unicode character type
      UTF-8, UTF-16, UTF-32 strings
      Conversion to/from approx 200 encodings
      Case mapping
      Collation (sort and search)
Overview - feature
      Supports only “logical” operations
      No support for rendering or formatting
      Purely written in OCaml
      Functors and lazy evaluation play crucial roles
ASCII to Unicode : challenge of multilingualization
   Large number of characters
              code range 0x0 - 0x10ffff
   Multiple representation of strings
                UTF-8, UTF-16 and UTF-32
                legacy encodings
   Combining characters
                 ä = a + ◌̈
                 Nguyễn = Nguyê + ◌̃ + n = Nguye + ◌̂ + ◌̃ + n
                 ậ = a + ◌̣ + ◌̂ = a + ◌̂ + ◌̣
   Diverse cultural conventions
                Case mapping OΣOΣ → oσoς (Greek)
                     Sorting ... < H < CH < I < ... (Slovak)
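
A minimal sketch of the "multiple representations" point, using only the OCaml standard library rather than Camomile. The helper names utf8_of_code and utf16be_of_code are made up for this illustration, and Buffer.add_utf_8_uchar is assumed to be available (OCaml 4.06 or later). It shows that one code point has different byte sequences in UTF-8 and UTF-16, and that the two spellings of ä are different byte strings even though they denote the same text.

```ocaml
(* Encode a single code point with the standard library encoders. *)
let utf8_of_code cp =
  let b = Buffer.create 4 in
  Buffer.add_utf_8_uchar b (Uchar.of_int cp);
  Buffer.contents b

let utf16be_of_code cp =
  let b = Buffer.create 4 in
  Buffer.add_utf_16be_uchar b (Uchar.of_int cp);
  Buffer.contents b

(* Two spellings of "ä": precomposed U+00E4, and "a" followed by the
   combining diaeresis U+0308. *)
let precomposed = utf8_of_code 0x00E4                       (* "\xC3\xA4" *)
let decomposed = utf8_of_code 0x0061 ^ utf8_of_code 0x0308  (* "a\xCC\x88" *)

let () =
  Printf.printf "U+00E4:   %d bytes in UTF-8, %d bytes in UTF-16\n"
    (String.length precomposed)
    (String.length (utf16be_of_code 0x00E4));
  Printf.printf "U+10FFFF: %d bytes in UTF-8, %d bytes in UTF-16\n"
    (String.length (utf8_of_code 0x10FFFF))
    (String.length (utf16be_of_code 0x10FFFF));
  (* Byte-wise comparison cannot tell that both strings mean "ä". *)
  Printf.printf "byte-equal: %b\n" (precomposed = decomposed)
```

Byte-wise equality fails on the last line, which is the problem that normal forms address.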
Unicode normal forms - what are they?


   Unicode has multiple representations of the “same” string.
   E.g. ậ = ạ + ◌̂ = a + ◌̣ + ◌̂ = a + ◌̂ + ◌̣ etc.
   Normal forms give a unique representation
   There are 4 normal forms
    1. NFD
    2. NFC
    3. NFKD
    4. NFKC

   We concentrate on NFD
Unicode normal form - NFD




   1. Decompose characters as much as possible
            ậ ⇒ ạ + ◌̂ ⇒ a + ◌̣ + ◌̂
   2. Do a stable sort on combining characters based on
      combining class (dot below is 220, circumflex is 230)
              a + ◌̂ + ◌̣ ⇒ a + ◌̣ + ◌̂
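
The two steps can be sketched on lists of code points. This is a toy illustration, not Camomile's implementation: the tables decomposition and combining_class contain only the entries needed for the ậ example (real tables come from the Unicode Character Database), Hangul decomposition is left out, and all function names are assumptions made for this sketch.

```ocaml
(* Toy canonical decomposition table: just enough for the example. *)
let decomposition = function
  | 0x1EAD -> Some [0x1EA1; 0x0302]  (* ậ -> ạ + combining circumflex *)
  | 0x1EA1 -> Some [0x0061; 0x0323]  (* ạ -> a + combining dot below *)
  | 0x00E2 -> Some [0x0061; 0x0302]  (* â -> a + combining circumflex *)
  | _ -> None

(* Toy combining class table; 0 means "starter". *)
let combining_class = function
  | 0x0323 -> 220  (* dot below *)
  | 0x0302 -> 230  (* circumflex *)
  | _ -> 0

(* Step 1: decompose as much as possible (recursively). *)
let rec decompose cp =
  match decomposition cp with
  | Some parts -> List.concat_map decompose parts
  | None -> [cp]

(* Step 2: stable sort of each run of combining characters by class.
   Starters are never moved, and marks never cross a starter. *)
let reorder cps =
  let by_class a b = compare (combining_class a) (combining_class b) in
  (* [run] holds the pending marks in reverse; [out] is the reversed output. *)
  let flush run out =
    List.rev_append (List.stable_sort by_class (List.rev run)) out in
  let rec go run out = function
    | [] -> List.rev (flush run out)
    | cp :: rest when combining_class cp = 0 ->
      go [] (cp :: flush run out) rest
    | cp :: rest -> go (cp :: run) out rest
  in
  go [] [] cps

let nfd cps = reorder (List.concat_map decompose cps)

let () =
  (* All three spellings of ậ end up as a, dot below, circumflex. *)
  List.iter
    (fun input ->
       List.iter (Printf.printf "U+%04X ") (nfd input);
       print_newline ())
    [ [0x1EAD]; [0x00E2; 0x0323]; [0x0061; 0x0302; 0x0323] ]
```

The sort has to be stable because marks with the same combining class interact visually, so their relative order is significant and must be preserved.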
Camomile strings - UTF8, UTF16, UCS4
  UTF8
  UTF-8 string as a string

  UTF16
  UTF-16 string as an unsigned 16-bit integer bigarray

  UCS4
  UTF-32 string as a 32-bit integer bigarray

  UnicodeString.Type
  UTF8, UTF16 and UCS4 all conform to UnicodeString.Type
  String operations are functors over UnicodeString.Type
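
The functor pattern can be sketched with a deliberately tiny signature. TEXT below is a hypothetical stand-in with a single iter function; it is not Camomile's UnicodeString.Type, which is richer. Utf8_text, Ucs4_text and Length are likewise names invented for this sketch, and String.get_utf_8_uchar is assumed to be available (OCaml 4.14 or later).

```ocaml
(* Hypothetical, reduced string signature for illustration only. *)
module type TEXT = sig
  type t
  val iter : (Uchar.t -> unit) -> t -> unit
end

(* UTF-8 text stored in an ordinary OCaml string. *)
module Utf8_text : TEXT with type t = string = struct
  type t = string
  let iter f s =
    let rec go i =
      if i < String.length s then begin
        let d = String.get_utf_8_uchar s i in
        f (Uchar.utf_decode_uchar d);
        (* Advance by the number of bytes this character occupied. *)
        go (i + Uchar.utf_decode_length d)
      end
    in
    go 0
end

(* UTF-32-like text: one array cell per code point. *)
module Ucs4_text : TEXT with type t = Uchar.t array = struct
  type t = Uchar.t array
  let iter = Array.iter
end

(* A string operation written once, as a functor over the signature. *)
module Length (T : TEXT) = struct
  let length text =
    let n = ref 0 in
    T.iter (fun _ -> incr n) text;
    !n
end

module L8 = Length (Utf8_text)
module L32 = Length (Ucs4_text)

let () =
  (* "a" + U+0308 is 3 bytes in UTF-8 but 2 code points in either form. *)
  Printf.printf "%d %d\n"
    (L8.length "a\xCC\x88")
    (L32.length [| Uchar.of_int 0x61; Uchar.of_int 0x308 |])
```

Each operation is written once and instantiated per representation, so adding a new string type only requires a new module matching the signature.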
Camomile modules - UNF
  Module for Unicode normal forms
       module type Type =
       sig
         type text
         type index

         (* Conversion to NFD and the other normal forms *)
         val   nfd : text -> text
         val   nfkd : text -> text
         val   nfc : text -> text
         val   nfkc : text -> text

         (* Compare strings by semantic equivalence,
            by lazily building their NFDs and comparing them *)
         val canon_compare : text -> text -> int
       end

       (* Create a module for a given Unicode string type *)
       module Make (Text : UnicodeString.Type) :
         Type with type text = Text.t and
         type index = Text.index
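
The idea of comparing lazily built normal forms can be illustrated with the standard Seq module. This is only a sketch of the general technique, not Camomile's code: compare_seq and counting are hypothetical names, and counting merely stands in for an incremental normalizer so that the amount of work actually performed can be observed.

```ocaml
(* Compare two lazily produced code point sequences, forcing elements
   only until the first difference is found. *)
let rec compare_seq (a : int Seq.t) (b : int Seq.t) : int =
  match a (), b () with
  | Seq.Nil, Seq.Nil -> 0
  | Seq.Nil, Seq.Cons _ -> -1
  | Seq.Cons _, Seq.Nil -> 1
  | Seq.Cons (x, a'), Seq.Cons (y, b') ->
    let c = compare x y in
    if c <> 0 then c else compare_seq a' b'

(* Stand-in for an incremental normalizer: yields the code points one
   by one and counts how many were actually produced. *)
let forced = ref 0

let counting (cps : int list) : int Seq.t =
  Seq.map (fun cp -> incr forced; cp) (List.to_seq cps)

(* The strings differ in their first code point, so the remaining
   elements of both sequences are never computed. *)
let result =
  compare_seq
    (counting [0x61; 0x323; 0x302])
    (counting [0x62; 0x323; 0x302])

let () =
  Printf.printf "result = %d, code points produced = %d\n" result !forced
```

With an eager nfd, both strings would be normalized in full before the comparison starts; the lazy version stops as soon as the order is decided.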
ulib - yet another Unicode library
   Now under development
   ulib is compact
       Minimal functionality
       No data files
       No initialization
   ulib is modern
       Rope for Unicode strings
       Zipper for indexing ropes
       Pluggable code converters using first-class modules
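
One way pluggable converters could be built on first-class modules is sketched below. Nothing here is ulib's actual API: the CONVERTER signature, the registry and the functions register and decode_with are all assumptions made for illustration, and String.get_utf_8_uchar is assumed to be available (OCaml 4.14 or later).

```ocaml
(* Hypothetical converter interface. *)
module type CONVERTER = sig
  val name : string
  val decode : string -> Uchar.t list
end

module Latin1 : CONVERTER = struct
  let name = "ISO-8859-1"
  (* Each Latin-1 byte maps to the code point with the same number. *)
  let decode s =
    List.map (fun c -> Uchar.of_int (Char.code c))
      (List.of_seq (String.to_seq s))
end

module Utf8 : CONVERTER = struct
  let name = "UTF-8"
  let decode s =
    let rec go i acc =
      if i >= String.length s then List.rev acc
      else
        let d = String.get_utf_8_uchar s i in
        go (i + Uchar.utf_decode_length d) (Uchar.utf_decode_uchar d :: acc)
    in
    go 0 []
end

(* Converters are ordinary values, so they can be stored in a table
   and new ones can be registered at run time. *)
let registry : (string, (module CONVERTER)) Hashtbl.t = Hashtbl.create 8

let register (m : (module CONVERTER)) =
  let module M = (val m) in
  Hashtbl.replace registry M.name m

let () = register (module Latin1); register (module Utf8)

(* Raises Not_found if no converter with that name is registered. *)
let decode_with name bytes =
  let (module C : CONVERTER) = Hashtbl.find registry name in
  C.decode bytes

let () =
  (* The same character "ä" arrives as different bytes per encoding. *)
  let show cs = List.iter (fun u -> Printf.printf "U+%04X " (Uchar.to_int u)) cs in
  show (decode_with "ISO-8859-1" "\xE4");
  show (decode_with "UTF-8" "\xC3\xA4");
  print_newline ()
```

Because a converter is a value of type (module CONVERTER), supporting a new encoding needs no change to the code that looks converters up.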
Conclusion
     Unicode is different from ASCII
     Camomile addresses a "logical" part of Unicode
     Functors and laziness play crucial roles
     A simpler library, "ulib", is now under development.
Project URL




   Camomile https://github.com/yoriyuki/Camomile
         ulib https://github.com/yoriyuki/ulib

 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 

Camomile : A Unicode library for OCaml

  • 1. Camomile : A Unicode library for OCaml
       Yoriyuki Yamagata
       National Institute of Advanced Science and Technology (AIST)
       ML Workshop, September 18, 2011
  • 2. Outline
       Overview
       ASCII to Unicode : A challenge of multilingualization
       Example : Unicode normal forms
       ulib
       Conclusion
  • 10. Overview - functionality
       Camomile - A Unicode library for OCaml
          Unicode character type
          UTF-8, UTF-16, UTF-32 strings
          Conversion to/from approx. 200 encodings
          Case mapping
          Collation (sort and search)
  • 15. Overview - features
       Only supports "logical" operations
       No support for rendering or formatting
       Written purely in OCaml
       Functors and lazy evaluation play crucial roles
  • 29. ASCII to Unicode : challenge of multilingualization
       Large number of characters
          code range 0x0 - 0x10ffff
       Multiple representations of strings
          UTF-8, UTF-16 and UTF-32
          legacy encodings
       Combining characters
          ä = a + ◌̈
          Nguyễn = Nguyê + ◌̃ + n = Nguye + ◌̂ + ◌̃ + n
          ậ = a + ◌̣ + ◌̂ = a + ◌̂ + ◌̣
       Diverse cultural conventions
          Case mapping: OΣOΣ → oσoς (Greek)
          Sorting: ... < H < CH < I < ... (Slovak)
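
Two of the points above (several encodings of one string, and a precomposed character versus its combining sequence) can be checked with a minimal sketch that uses only the OCaml standard library (`Buffer.add_utf_8_uchar` and `Buffer.add_utf_16be_uchar`, available from OCaml 4.06). The helper names `encode`, `utf8` and `utf16be` are made up for this illustration and are not part of Camomile.

```ocaml
(* Encode a list of code points with the given Buffer encoder. *)
let encode add cps =
  let b = Buffer.create 16 in
  List.iter (fun cp -> add b (Uchar.of_int cp)) cps;
  Buffer.contents b

let utf8 = encode Buffer.add_utf_8_uchar
let utf16be = encode Buffer.add_utf_16be_uchar

let precomposed = [0x00E4]         (* ä as a single code point *)
let decomposed = [0x0061; 0x0308]  (* a followed by combining diaeresis *)

let () =
  (* The same visible text has different lengths in each encoding ... *)
  Printf.printf "UTF-8:  %d vs %d bytes\n"
    (String.length (utf8 precomposed)) (String.length (utf8 decomposed));
  Printf.printf "UTF-16: %d vs %d bytes\n"
    (String.length (utf16be precomposed)) (String.length (utf16be decomposed));
  (* ... and byte-wise comparison treats the two spellings as different. *)
  Printf.printf "byte-equal: %b\n" (utf8 precomposed = utf8 decomposed)
```

Plain string equality therefore cannot tell that both spellings denote the same text, which is the problem normal forms address.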
  • 35. Unicode normal forms - what are they?
       Unicode has multiple representations of the "same" string.
          E.g. ậ = ạ + ◌̂ = a + ◌̣ + ◌̂ = a + ◌̂ + ◌̣ etc.
       Normal forms give a unique representation.
       There are 4 normal forms
          1. NFD
          2. NFC
          3. NFKD
          4. NFKC
       We concentrate on NFD
  • 38. Unicode normal form - NFD
       1. Decompose characters as much as possible
             ậ ⇒ ạ + ◌̂ ⇒ a + ◌̣ + ◌̂
       2. Do a stable sort on combining characters based on combining class
             a + ◌̂ + ◌̣ ⇒ a + ◌̣ + ◌̂
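
The two NFD steps can be sketched in a few lines of standard-library OCaml. This is a toy illustration, not Camomile's implementation: the tables `decomposition` and `combining_class` cover only the handful of code points used in the example (a real library derives them from the Unicode Character Database), and strings are modelled as plain lists of code points.

```ocaml
(* Toy canonical decomposition table (assumption: only these entries). *)
let decomposition = function
  | 0x00E2 -> Some [0x0061; 0x0302]  (* â -> a + circumflex *)
  | 0x1EA1 -> Some [0x0061; 0x0323]  (* ạ -> a + dot below *)
  | 0x1EAD -> Some [0x1EA1; 0x0302]  (* ậ -> ạ + circumflex *)
  | _ -> None

(* Toy canonical combining class table; 0 means "starter". *)
let combining_class = function
  | 0x0302 -> 230  (* combining circumflex accent *)
  | 0x0323 -> 220  (* combining dot below *)
  | _ -> 0

(* Step 1: decompose recursively, as far as possible. *)
let rec decompose cp =
  match decomposition cp with
  | None -> [cp]
  | Some parts -> List.concat (List.map decompose parts)

(* Step 2: stable-sort every run of combining marks by combining class. *)
let reorder cps =
  let by_class a b = compare (combining_class a) (combining_class b) in
  (* [out] and [run] are accumulated in reverse order. *)
  let flush out run =
    List.rev_append (List.stable_sort by_class (List.rev run)) out in
  let rec go out run = function
    | [] -> List.rev (flush out run)
    | cp :: rest when combining_class cp = 0 ->
      go (cp :: flush out run) [] rest
    | cp :: rest -> go out (cp :: run) rest
  in
  go [] [] cps

let nfd cps = reorder (List.concat (List.map decompose cps))

let () =
  (* All three spellings of ậ reach the same normal form. *)
  let spellings =
    [ [0x1EAD]; [0x0061; 0x0302; 0x0323]; [0x0061; 0x0323; 0x0302] ] in
  List.iter
    (fun s ->
       List.iter (Printf.printf "U+%04X ") (nfd s);
       print_newline ())
    spellings
```

The sort has to be stable because marks with the same combining class must keep their relative order; swapping them would change the text.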
  • 43. Camomile strings - UTF8, UTF16, UCS4
       UTF8   UTF-8 string as a string
       UTF16  UTF-16 string as an unsigned 16-bit integer bigarray
       UCS4   UTF-32 string as a 32-bit integer bigarray
       UnicodeString.Type
          UTF8, UTF16 and UCS4 all conform to UnicodeString.Type
          String operations are functors over UnicodeString.Type
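
The idea of writing a string operation once as a functor and instantiating it for each representation can be sketched as follows. The signature `TEXT` is a hypothetical, much smaller stand-in for `UnicodeString.Type`, and the modules `Ucs4`, `Utf8` and `CountNonAscii` are invented for this example; none of them is Camomile's actual API. The UTF-8 decoder relies on `String.get_utf_8_uchar`, so the sketch assumes OCaml 4.14 or later.

```ocaml
(* Hypothetical minimal interface shared by all string representations. *)
module type TEXT = sig
  type t
  val of_code_points : int list -> t
  val fold : ('a -> int -> 'a) -> 'a -> t -> 'a
end

(* UCS4-like representation: one array cell per code point. *)
module Ucs4 : TEXT = struct
  type t = int array
  let of_code_points = Array.of_list
  let fold = Array.fold_left
end

(* UTF-8 representation: an ordinary OCaml string. *)
module Utf8 : TEXT = struct
  type t = string
  let of_code_points cps =
    let b = Buffer.create 16 in
    List.iter (fun cp -> Buffer.add_utf_8_uchar b (Uchar.of_int cp)) cps;
    Buffer.contents b
  let fold f init s =
    let rec go acc i =
      if i >= String.length s then acc
      else
        let d = String.get_utf_8_uchar s i in
        go (f acc (Uchar.to_int (Uchar.utf_decode_uchar d)))
          (i + Uchar.utf_decode_length d)
    in
    go init 0
end

(* An operation written once, against the interface only. *)
module CountNonAscii (T : TEXT) = struct
  let count text =
    T.fold (fun n cp -> if cp > 0x7F then n + 1 else n) 0 text
end

module CountUcs4 = CountNonAscii (Ucs4)
module CountUtf8 = CountNonAscii (Utf8)

let nguyen = [0x4E; 0x67; 0x75; 0x79; 0x1EC5; 0x6E]  (* "Nguyễn" *)

let () =
  Printf.printf "%d %d\n"
    (CountUcs4.count (Ucs4.of_code_points nguyen))
    (CountUtf8.count (Utf8.of_code_points nguyen))
```

Both instances count code points rather than bytes, so they agree even though the underlying storage differs.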
  • 44-48. Camomile modules - UNF
       Module for Unicode normal forms

          module type Type = sig
            type text
            type index  (* needed by the constraint on Make below; omitted on the slide *)
            val nfd : text -> text   (* conversion to NFD *)
            val nfkd : text -> text
            val nfc : text -> text
            val nfkc : text -> text
            (* Compares strings by semantic equivalence,
               by lazily building their NFDs and comparing them *)
            val canon_compare : text -> text -> int
          end

          (* Creates a module for a given Unicode string type *)
          module Make (Text : UnicodeString.Type) :
            Type with type text = Text.t and type index = Text.index
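
Why laziness helps in a comparison like `canon_compare` can be illustrated with the standard `Seq` module (OCaml 4.07 or later). This is only a sketch of the general technique, not Camomile's code: `compare_lazy`, `traced` and the counter `forced` are invented names, and `traced` merely counts how many elements are produced where a real implementation would emit normalized code points on demand.

```ocaml
(* Compare two lazily produced code point sequences, stopping at the
   first position where they differ. *)
let rec compare_lazy (a : int Seq.t) (b : int Seq.t) =
  match a (), b () with
  | Seq.Nil, Seq.Nil -> 0
  | Seq.Nil, Seq.Cons _ -> -1
  | Seq.Cons _, Seq.Nil -> 1
  | Seq.Cons (x, a'), Seq.Cons (y, b') ->
    let c = compare x y in
    if c <> 0 then c else compare_lazy a' b'

(* Number of elements that were actually computed. *)
let forced = ref 0

(* Stand-in for "normalize on demand": counts each produced element. *)
let traced cps = Seq.map (fun cp -> incr forced; cp) (List.to_seq cps)

let result =
  compare_lazy
    (traced [0x62; 0x61; 0x0323; 0x0302; 0x63])
    (traced [0x61; 0x61; 0x0323; 0x0302; 0x63])

let () = Printf.printf "result = %d, elements forced = %d\n" result !forced
```

Because the sequences differ at their first element, only one element of each is ever produced; an eager approach would normalize both strings in full before comparing.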
  • 50. ulib - yet another Unicode library
       Now under development
  • 54. ulib - yet another Unicode library
       ulib is compact
          Minimum functionality
          No data file
          No initialization
  • 58. ulib - yet another Unicode library
       ulib is modern
          Rope for Unicode strings
          Zipper for indexing ropes
          Pluggable code converters using first-class modules
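
What "pluggable code converter using first class modules" can look like is sketched below. Everything here is hypothetical: the signature `CONVERTER`, the registry `converters` and the functions `register` and `encode_with` are invented for illustration and are not ulib's API. The point is only that a first-class module can be stored in a table and selected by name at run time.

```ocaml
(* Hypothetical converter interface: encode code points into bytes. *)
module type CONVERTER = sig
  val name : string
  val encode : int list -> string
end

module Utf8_conv : CONVERTER = struct
  let name = "UTF-8"
  let encode cps =
    let b = Buffer.create 16 in
    List.iter (fun cp -> Buffer.add_utf_8_uchar b (Uchar.of_int cp)) cps;
    Buffer.contents b
end

module Latin1_conv : CONVERTER = struct
  let name = "ISO-8859-1"
  let encode cps =
    let one cp =
      if cp > 0xFF then invalid_arg "Latin1_conv.encode: not representable"
      else String.make 1 (Char.chr cp)
    in
    String.concat "" (List.map one cps)
end

(* Registry of converters, stored as first-class modules. *)
let converters : (string, (module CONVERTER)) Hashtbl.t = Hashtbl.create 8

let register (module C : CONVERTER) =
  Hashtbl.replace converters C.name (module C : CONVERTER)

(* Look a converter up by name and unpack it at the call site. *)
let encode_with name cps =
  let (module C : CONVERTER) = Hashtbl.find converters name in
  C.encode cps

let () =
  register (module Utf8_conv : CONVERTER);
  register (module Latin1_conv : CONVERTER)

let () =
  Printf.printf "%d %d\n"
    (String.length (encode_with "UTF-8" [0xE4]))
    (String.length (encode_with "ISO-8859-1" [0xE4]))
```

New encodings can be added by registering another module, without touching the lookup code.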
  • 64. Conclusion
       Unicode is different from ASCII
       Camomile addresses the "logical" part of Unicode
       Functors and laziness play crucial roles
       A simpler library, "ulib", is now under development
  • 65. Project URL
       Camomile  https://github.com/yoriyuki/Camomile
       ulib      https://github.com/yoriyuki/ulib