2011-06-17 HiRoshima.R #1@



Saturday, June 18, 2011                                1
Agenda
          1. R             ―       ―

          2. R

          3. R

•
                 •

                 •        A   B

           •

:       “however”



                              109    347        8    493

                          [   ] However, ....
                          [   ] ..., however, ....
                          [   ] ..., however.

    > freq <- c(109,347,8)
    > chisq.test(freq,correct=FALSE)

            Chi-squared test for given probabilities

    data:  freq
    X-squared = 391.7371, df = 2, p-value < 2.2e-16


    #                                               2
    #      http://homepage2.nifty.com/nandemoarchive/toukei_kiso/t_F_chi.htm
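
The statistic reported above can be checked by hand. The sketch below assumes what `chisq.test()` assumes by default for a single vector of counts: all categories are equally likely under the null hypothesis. The names `expected`, `x2`, and `p` are illustrative and do not appear in the slides.

```r
# Observed counts taken from the slide
freq <- c(109, 347, 8)

# Under the default null hypothesis every category is equally likely
expected <- rep(sum(freq) / length(freq), length(freq))

# Pearson's chi-squared statistic: sum of (observed - expected)^2 / expected
x2 <- sum((freq - expected)^2 / expected)

# Upper-tail probability with (number of categories - 1) degrees of freedom
p <- pchisq(x2, df = length(freq) - 1, lower.tail = FALSE)

print(x2)
print(p)
```

The `correct=FALSE` argument only affects 2 by 2 tables, so it should make no difference for a goodness-of-fit test like this one.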




Agenda
          1. R             ―       ―

          2. R

          3. R

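1. Read the data: scan()
2. Inspect and search: head(), tail(), grep(), [ ]
3. Split into words: strsplit(), unlist()
4. Sort and remove duplicates: sort(), unique()
5. Count: table(), length()
6. Write the results to a file: write.table()
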
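1. Read the data
    • Read a file in the working directory
        • ns <- scan("ns_raw.txt", what="character")
    • Or choose the file in a dialog (choose.files() is Windows-only)
        • ns <- scan(choose.files(), what="char")
    • Check the working directory with getwd()!
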
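
How `scan()` divides the input depends on its `sep` argument, which matters for the later steps. The sketch below is only an illustration: the sample file, its contents, and the names `path`, `words`, and `sentences` are made up and are not from the slides.

```r
# Create a small sample file so the example runs anywhere
# (the sentences are invented for illustration)
path <- tempfile(fileext = ".txt")
writeLines(c("I go to school every day .",
             "My school is near the station ."), path)

# Default separator: any white space, so each element is one word
words <- scan(path, what = "character", quiet = TRUE)

# sep = "\n": each element is one whole line
sentences <- scan(path, what = "character", sep = "\n", quiet = TRUE)

print(length(words))
print(length(sentences))
```

If each element should hold a whole sentence, so that `strsplit()` has something to split later, `sep = "\n"` is probably what is needed. On systems other than Windows, `file.choose()` can be used in place of `choose.files()`.
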
2. Inspect the data

    • head(object, n)
        • the first n elements

    • tail(object, n)
        • the last n elements

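2. Search the data
    • grep("pattern", object)
        • returns the positions of the elements that match
            > grep("school", ns)

        • index ns with the result to get the matching elements themselves
            > ns[grep("school", ns)]
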
2. Extract elements
    • object[position]

        • > ns[100]
            • the 100th element
        • > ns[c(98,99,100)]
            • the 98th, 99th, and 100th elements
            • c() combines values into a vector

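
The searching and indexing steps can be tried on a small vector. This is a minimal sketch: the three sentences stand in for the contents of `ns` and are invented for illustration, as are the names `hits`, `matched`, `same`, and `first_two`.

```r
# A tiny stand-in for ns (invented sentences)
ns <- c("I go to school every day .",
        "She likes music .",
        "My school is near the station .")

# Positions of the elements that contain "school"
hits <- grep("school", ns)

# The matching elements themselves, obtained by indexing
matched <- ns[hits]

# value = TRUE returns the elements directly in one step
same <- grep("school", ns, value = TRUE)

# Several positions at once, combined with c()
first_two <- ns[c(1, 2)]

print(hits)
print(matched)
```
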
3. Split into words
    • strsplit(object, "separator")

        > strsplit (ns, " ")

        • splits each element of ns at the spaces
        • the result is a list

3. Split into words
    • Save the result as ns_list
        > ns_list <- strsplit (ns, " ")

    • Flatten ns_list into a single vector
        > unlist (ns_list)

    • Without creating ns_list:
        • unlist(strsplit(ns, " "))

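
A short sketch of what `strsplit()` and `unlist()` return, assuming each element of `ns` is a sentence with words separated by single spaces. The two sentences and the name `ns_words` are illustrative.

```r
# Two invented sentences standing in for ns
ns <- c("I go to school", "She likes music")

# strsplit() returns a list with one character vector per input element
ns_list <- strsplit(ns, " ")
print(class(ns_list))

# unlist() flattens the list into a single character vector of words
ns_words <- unlist(ns_list)
print(ns_words)
```
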
4. Sort

    sort(object)
    > ns2 <- sort(unlist(ns_list))

4. Remove duplicates

    unique(object)
    > ns3 <- unique (sort(unlist(ns_list)))
    # the order can be swapped (same result)
    # sort(unique(unlist(ns_list)))

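
The two orderings of `sort()` and `unique()` can be compared on a small example. The word vector and the names `a` and `b` are made up for illustration.

```r
# An invented word vector with repeated items
words <- c("to", "be", "or", "not", "to", "be")

# Sort first, then drop duplicates
a <- unique(sort(words))

# Drop duplicates first, then sort
b <- sort(unique(words))

# Both orders give the same sorted vector of distinct words
print(identical(a, b))
```

Removing duplicates first means `sort()` has fewer elements to order, which may matter for large corpora.
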
5. Count
    table(object)
    > ns4 <- table(unlist(strsplit (ns, " ")))

    # table() counts how often each word occurs

5. Count

    > ns5 <- length(unlist(strsplit (ns, " ")))

    # total number of words (tokens)

5. Count

    > ns6 <- length(unique(sort(unlist(strsplit (ns, " ")))))

    # number of different words (types)
    # or, in two steps:

    > ns7 <- unique(sort(unlist (ns_list)))
    > length(ns7)

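
The three counts can be seen side by side on a small example. This is a sketch with an invented word vector; the names `freq`, `tokens`, and `types` are illustrative.

```r
# An invented word vector with repeated items
words <- c("to", "be", "or", "not", "to", "be")

# Frequency of each distinct word
freq <- table(words)

# Total number of words (tokens)
tokens <- length(words)

# Number of distinct words (types)
types <- length(unique(words))

# Most frequent words first
print(sort(freq, decreasing = TRUE))
print(tokens)
print(types)
```
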
6. Write the results to a file
    > write.table(ns4, file="freq1.txt")
    > write.table(ns5, file="freq2.txt")
    > write.table(ns6, file="freq3.txt")

    # the files are saved in the working directory (see getwd())
    # they can be opened in Excel

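
If the file is meant for a spreadsheet, a tab-separated layout without quotes or row names is often easier to import. The sketch below is one possible variant, not the method shown in the slides: the word vector and the names `freq_df`, `out`, and `back` are assumptions, and the file goes to a temporary directory instead of the working directory.

```r
# An invented word vector with repeated items
words <- c("to", "be", "or", "not", "to", "be")

# Convert the frequency table to a data frame with columns "words" and "Freq"
freq_df <- as.data.frame(table(words), stringsAsFactors = FALSE)

# Write a tab-separated file without quotes or row names
out <- file.path(tempdir(), "freq1.txt")
write.table(freq_df, file = out, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm the contents
back <- read.delim(out, stringsAsFactors = FALSE)
print(back)
```
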
Agenda
          1. R             ―       ―

          2. R

          3. R

•
                          •
                          •
                          •
                              •   ... orz


RMeCab

RMeCab
                 •
                  •R           MeCab



                          •        R



• RMeCabText() : morphological analysis of a text file
• RMeCabFreq() : frequency table of the morphemes in a file
• Ngram() : N-gram frequencies
• collocate() : collocates of a node word

2,940    1,785   3,780

twitter: @sakaue

                          e-mail: tsakaue<AT>hiroshima-u.ac.jp



