HiRoshima.R #1 1-3 LT
SAKAUE, Tatsuya
2. Agenda
1. R ― ―
2. R
3. R
Saturday, June 18, 2011 2
7. •
•
• A B
•
8. : “however”
109 347 8 493
[sentence-initial] However, ....
[sentence-medial] ..., however, ....
[sentence-final] ..., however.
9.
> freq <- c(109,347,8)
> chisq.test(freq,correct=FALSE)
Chi-squared test for given probabilities
data: freq
X-squared = 391.7371, df = 2, p-value < 2.2e-16
# 2
# http://homepage2.nifty.com/nandemoarchive/toukei_kiso/t_F_chi.htm
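As a cross-check on the output above, the same statistic can be computed by hand. This is a minimal sketch, assuming the default null hypothesis that chisq.test() uses for a plain vector, namely that all categories are equally likely. The names `expected`, `x2`, `p` and `res` are introduced here for illustration only. Note that `correct` only affects 2 by 2 tables, so it makes no difference in this goodness-of-fit case.

```r
# Observed frequencies from the slide
freq <- c(109, 347, 8)

# Under the default null hypothesis every category is equally likely,
# so each expected count is the total divided by the number of categories.
expected <- rep(sum(freq) / length(freq), length(freq))

# Pearson chi-squared statistic and degrees of freedom
x2 <- sum((freq - expected)^2 / expected)
df <- length(freq) - 1

# Upper-tail p-value from the chi-squared distribution
p <- pchisq(x2, df = df, lower.tail = FALSE)

# The built-in test should report the same statistic
res <- chisq.test(freq)
print(c(manual = x2, builtin = unname(res$statistic)))
```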
11. Agenda
1. R ― ―
2. R
3. R
14. 1.
2.
3.
4.
5.
6.
15. 1.
•
• ns <- scan("ns_raw.txt", what="character")
•
• ns <- scan(choose.files(), what="char")
•
• Check the working directory with getwd()!
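The reading step can be tried without the original data. The sketch below is an illustration under assumptions: the file contents are made up, and `path`, `ns_words` and `ns_lines` are hypothetical names. It shows that scan() with what="character" splits on any whitespace by default, so each word becomes one element, whereas sep="\n" keeps each line as one element. choose.files() is available only on Windows; file.choose() is the portable alternative.

```r
# Create a small stand-in for ns_raw.txt (contents are invented)
path <- tempfile(fileext = ".txt")
writeLines(c("the school is near", "the station"), path)

# Default separator is whitespace: one element per word
ns_words <- scan(path, what = "character", quiet = TRUE)

# sep = "\n": one element per line
ns_lines <- scan(path, what = "character", sep = "\n", quiet = TRUE)

print(length(ns_words))
print(length(ns_lines))
```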
16. 2.
• head(object, n)
• tail(object, n)
• /
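A short illustration of head() and tail() on a character vector. The vector `ns` here is a made-up stand-in for the data read in earlier, not the original file.

```r
# Toy stand-in for the word vector
ns <- c("the", "school", "is", "near", "the", "station", "and", "the", "park")

first3 <- head(ns, 3)   # first three elements
last2  <- tail(ns, 2)   # last two elements

print(first3)
print(last2)
```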
17. 2.
• grep("pattern", object)
•
> grep("school", ns)
• ns
> ns[grep("school", ns)]
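The two grep() calls above can be compared on a toy vector (the data below is invented). grep() returns positions, indexing with those positions returns the matching elements, and value = TRUE does both in one step. Because grep() matches substrings, "schoolyard" is also found; an exact comparison with == is one way to restrict the result to the whole word.

```r
ns <- c("the", "school", "is", "near", "the", "schoolyard")

idx   <- grep("school", ns)                 # positions of matching elements
hits  <- ns[idx]                            # the matching elements themselves
hits2 <- grep("school", ns, value = TRUE)   # same result in one call

# Exact match on the whole element rather than a substring
exact <- ns[ns == "school"]

print(idx)
print(hits)
```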
18. 2.
• [ ]
• > ns[100]
• 100
• > ns[c(98,99,100)]
• 98, 99, 100
•c
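Indexing can be tried on a generated vector. The sketch assumes a made-up vector of 100 labels; it also shows that the colon operator gives the same selection as listing the positions with c().

```r
# 100 invented elements: "w1", "w2", ..., "w100"
ns <- paste0("w", 1:100)

one   <- ns[100]               # the 100th element
three <- ns[c(98, 99, 100)]    # elements 98, 99 and 100
range <- ns[98:100]            # same selection written as a range

print(one)
print(three)
```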
19. 3.
•
• strsplit(object, "delimiter")
> strsplit(ns, " ")
•ns
•
• list
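To see why the result is a list, the following sketch splits two invented lines. strsplit() returns one list element per input element, each holding that element's pieces.

```r
ns <- c("the school is near", "the station")

ns_list <- strsplit(ns, " ")

print(class(ns_list))     # the result is a list
print(length(ns_list))    # one entry per input element
print(ns_list[[1]])       # words of the first line
print(lengths(ns_list))   # number of words in each line
```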
20. 3.
•
> ns_list <- strsplit(ns, " ")
• ns_list
> unlist(ns_list)
• ns_list
• unlist(strsplit(ns, " "))
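One detail worth checking when flattening: splitting on a single space leaves empty strings wherever the text has consecutive spaces. The sketch below uses invented input with a double space; splitting on the regular expression " +" (one or more spaces) is one possible way to avoid the empty elements, offered here as a suggestion rather than as the method on the slide.

```r
ns <- c("the  school is near", "the station")   # note the double space

flat       <- unlist(strsplit(ns, " "))    # includes one empty string ""
flat_clean <- unlist(strsplit(ns, " +"))   # split on runs of spaces

print(flat)
print(flat_clean)
```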
21. 4.
sort ( )
> ns2 <- sort(unlist(ns_list))
22. 4.
unique ( )
> ns3 <- unique(sort(unlist(ns_list)))
# ( )
# sort(unique(unlist(ns_list)))
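The two orderings of sort() and unique() noted above give the same result, which can be confirmed on a toy vector (invented data; `a` and `b` are illustrative names). Applying unique() first means sort() has fewer elements to process.

```r
words <- c("school", "the", "is", "the", "near", "school")

a <- unique(sort(words))   # sort first, then drop duplicates
b <- sort(unique(words))   # drop duplicates first, then sort

print(a)
print(identical(a, b))
```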
23. 5.
table ( )
> ns4 <- table(unlist(strsplit(ns, " ")))
# table
#
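A frequency table is often easier to read when ordered by count. The sketch below, on invented data, builds the table and sorts it in decreasing order; `freq` and `freq_sorted` are illustrative names.

```r
words <- c("the", "school", "is", "near", "the", "station", "the")

freq        <- table(words)                  # count of each distinct word
freq_sorted <- sort(freq, decreasing = TRUE) # most frequent first

print(freq_sorted)
print(freq[["the"]])            # count for a single word
print(names(freq_sorted)[1])    # the most frequent word
```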
24. 5.
> ns5 <- length(unlist(strsplit(ns, " ")))
#
25. 5.
> ns6 <- length(unique(sort(unlist(strsplit(ns, " ")))))
#
#
> ns7 <- unique(sort(unlist(ns_list)))
> length(ns7)
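The two counts above correspond to the number of running words (tokens) and the number of distinct words (types). As a sketch on invented data, both can be computed from one intermediate vector, and their ratio gives a type-token ratio. The ratio is an addition for illustration and does not appear on the slides. sort() is not needed when only the count of distinct words is wanted.

```r
ns <- c("the school is near", "the station is near the school")

words  <- unlist(strsplit(ns, " "))
tokens <- length(words)           # running words
types  <- length(unique(words))   # distinct words
ttr    <- types / tokens          # type-token ratio

print(c(tokens = tokens, types = types, ttr = ttr))
```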
26. 6.
> write.table(ns4, file="freq1.txt")
> write.table(ns5, file="freq2.txt")
> write.table(ns6, file="freq3.txt")
# getwd()
# Excel
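For opening the result in a spreadsheet, a tab-separated file without quotes or row names tends to import cleanly. The sketch below is one possible variation, not the slide's exact call: it writes to a temporary directory, and `out` and `back` are hypothetical names. Reading the file back confirms what was written.

```r
words <- c("the", "school", "is", "near", "the", "station", "the")
freq  <- table(words)

# Write to a temporary location instead of the working directory
out <- file.path(tempdir(), "freq1.txt")
write.table(as.data.frame(freq), file = out,
            sep = "\t", quote = FALSE, row.names = FALSE)

# Read the file back to check its contents
back <- read.delim(out)
print(back)
```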
28. Agenda
1. R ― ―
2. R
3. R
30. •
•
•
•
• ... orz
32. RMeCab
•
•R MeCab
• R
33. • RMeCabText() :
• RMeCabFreq() :
• Ngram() : N-gram
• collocate() :
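To illustrate what N-gram counting means, the sketch below counts word bigrams in base R on an invented English word vector. It is not RMeCab's implementation and does not call RMeCab. RMeCab works on Japanese text segmented by MeCab, whereas this example assumes the words are already separated.

```r
words <- c("the", "school", "is", "near", "the", "school")

# Pair each word with the word that follows it
bigrams <- paste(head(words, -1), tail(words, -1))

# Frequency of each bigram, most frequent first
bigram_freq <- sort(table(bigrams), decreasing = TRUE)

print(bigram_freq)
```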
35. 2,940 1,785 3,780
37. twitter: @sakaue
e-mail: tsakaue<AT>hiroshima-u.ac.jp