ComplementaryNaiveBayesClassifier
Naoki Yanai
1.
(Complementary) Naive Bayes
TokyoWebmining #9-2nd, 2011/01/23
3.
Naoki Yanai (@yanaoki)
web, Java, Ruby, Hadoop, TokyoWebmining
4.
- Naive Bayes
5.
Naive Bayes
Supervised learning: Naive Bayes, Complement Naive Bayes
7.
Web
8.
Naive Bayes
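Given features F1,...,Fn and a class variable C, choose the class c with the highest posterior probability, treating the features as conditionally independent given the class:
\hat{c} = \arg\max_{c} P(C = c) \prod_{i=1}^{n} P(F_i \mid C = c)
※ gihyo.jp http://gihyo.jp/dev/serial/01/machine-learning/0002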
9.
Naive Bayes
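w_{ij}: frequency of word j in document i. θ_{cj}: probability of word j in class c.
Document i is assigned to the class c that maximizes \log P(c) + \sum_{j} w_{ij} \log θ_{cj}.
※ Wiki http://ibisforest.org/index.php?complement%20naive%20Bayes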
10.
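Complementary Naive Bayes
Differs from Naive Bayes in how the word parameters are estimated.
NaiveBayes: θ_{cj} is estimated from the occurrences of word j in documents that belong to class c.
CNaiveBayes: the estimate (written θ̂ on the slide) uses the occurrences of word j in documents that do not belong to class c.
※ Wiki http://ibisforest.org/index.php?complement%20naive%20Bayes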
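The difference between the two estimates can be sketched in a few lines of Python. This is a minimal illustration, not Mahout's implementation: the function names, the uniform class prior, and the add-alpha smoothing are assumptions made for the example.

```python
import math
from collections import Counter

def count_words(docs):
    """docs: list of (label, tokens). Returns per-class word counts and the vocabulary."""
    counts = {}
    for label, tokens in docs:
        counts.setdefault(label, Counter()).update(tokens)
    vocab = sorted({w for c in counts.values() for w in c})
    return counts, vocab

def log_theta(word_counts, vocab, alpha):
    """Smoothed log-probabilities of each word (add-alpha smoothing, an assumption)."""
    total = sum(word_counts.values()) + alpha * len(vocab)
    return {w: math.log((word_counts[w] + alpha) / total) for w in vocab}

def classify(docs, tokens, complement=False, alpha=1.0):
    counts, vocab = count_words(docs)
    scores = {}
    for label in counts:
        if complement:
            # Complement NB: pool the counts of every class except this one.
            pooled = Counter()
            for other, c in counts.items():
                if other != label:
                    pooled.update(c)
        else:
            # Plain NB: use only this class's own counts.
            pooled = counts[label]
        theta = log_theta(pooled, vocab, alpha)
        # Words never seen in training are skipped; the class prior is taken as uniform.
        scores[label] = sum(theta[t] for t in tokens if t in theta)
    # Plain NB picks the best-fitting class; complement NB picks the class
    # whose complement fits the document worst.
    return min(scores, key=scores.get) if complement else max(scores, key=scores.get)

docs = [("diy", ["hammer", "nail", "hammer"]),
        ("food", ["rice", "tea"]),
        ("food", ["tea", "cake"])]
print(classify(docs, ["hammer"]))
print(classify(docs, ["hammer"], complement=True))
```

Because the complement estimate pools every other class, it draws on more data for small classes, which is the usual motivation for using it on imbalanced training sets.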
12.
API
Mahout
13.
Mahout
Mahout runs on Hadoop and provides collaborative filtering (CF), clustering, and classification (Classifier). The classifiers include NaiveBayes and ComplementaryNaiveBayes.
14.
bayes
cbayes
15.
1.
Data from the API, tokenized with mecab.
Line format: ID[TAB]tokens
Example:
c100371	1 ! : 45 45 10 SIZE - 02 : 56 B 76 31 SIZE - 03 : 57 B 82
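A line in this format can be produced with a small helper like the one below. It is a sketch under assumptions: `to_training_line` is a hypothetical name, the tokens are assumed to come already split (for example from mecab), and the `c` prefix on the category ID follows the example line on the slide.

```python
def to_training_line(category_id, tokens):
    """Build one training line: label, a TAB, then space-separated tokens."""
    label = "c" + str(category_id)
    # A stray TAB inside a token would break the label/body split, so replace it.
    cleaned = [t.replace("\t", " ").strip() for t in tokens]
    return label + "\t" + " ".join(t for t in cleaned if t)

print(to_training_line(100371, ["SIZE", "-", "02", ":", "56"]))
```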
16.
2.
Options: classifier type (bayes|cbayes), N-gram size, alpha
Bayes/CBayes: 35MB, EC2, 3, 15
$ mahout trainclassifier \
    --gramSize 1 \
    --input /classifier/rakuten/data_search \
    --output /classifier/rakuten/model_searchcbig \
    --classifierType cbayes \
    --dataSource hdfs \
    --alpha 1
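To make the N-gram size option concrete, the sketch below builds contiguous word n-grams from a token list. The function name `word_ngrams` is hypothetical, and whether Mahout also emits shorter grams alongside the requested size is not stated in the slides, so this shows only the general idea.

```python
def word_ngrams(tokens, n):
    """Return contiguous word n-grams of exactly n tokens, joined by spaces."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = ["SIZE", "-", "02"]
print(word_ngrams(tokens, 1))  # the tokens themselves, as with --gramSize 1
print(word_ngrams(tokens, 2))
```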
17.
3.
$ mahout testclassifier \
    --gramSize 1 \
    --testDir /home/yanaoki/classifier/rakuten/data_rank1 \
    --model /classifier/rakuten/model_searchcbig \
    --classifierType cbayes \
    --dataSource hdfs \
    --alpha 1
18.
4.
Classify, HBase
export CLASSPATH=...
$ java org.apache.mahout.classifier.Classify \
    --path /home/yanaoki/classifier/rakuten/model \
    --classify /home/yanaoki/classifier/rakuten/d.doc \
    --encoding UTF-8 \
    --gramSize 1 \
    --classifierType cbayes \
    --dataSource hdfs
21.
ID        Name    Count    Size
100005    DIY     2997     4.8M
100227            2994     3.9M
100371            3000     4.6M
100939            2990     4.7M
101114            2997     3.3M
101381            2905     4.2M
200162            2997     2.6M
216131            2999     3.4M
22.
bayes
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances    : 22442    93.9822%
Incorrectly Classified Instances  :  1437     6.0178%
Total Classified Instances        : 23879
=======================================================
Confusion Matrix
-------------------------------------------------------
a     b     c     d     e     f     g     h     <--Classified as
2802  0     127   3     8     19    4     27    | 2990  a = c100939
0     2982  2     0     0     0     0     13    | 2997  b = c200162
0     7     2948  34    0     1     0     9     | 2999  c = c216131
1     0     133   2846  0     1     0     19    | 3000  d = c100371
233   2     53    0     2434  13    0     259   | 2994  e = c100227
20    0     15    0     0     2935  3     24    | 2997  f = c101114
0     0     39    0     1     139   2753  65    | 2997  g = c100005
0     6     125   4     1     26    1     2742  | 2905  h = c101381
Default Category: unknown: 8
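The summary figures follow directly from the confusion matrix: the correctly classified count is the sum of the diagonal, and the total is the sum of all cells. The short check below recomputes them from the bayes matrix above; the variable and function names are chosen for this example only.

```python
# Rows are the true classes a..h, columns are the classes assigned by the classifier.
BAYES_BALANCED = [
    [2802, 0, 127, 3, 8, 19, 4, 27],
    [0, 2982, 2, 0, 0, 0, 0, 13],
    [0, 7, 2948, 34, 0, 1, 0, 9],
    [1, 0, 133, 2846, 0, 1, 0, 19],
    [233, 2, 53, 0, 2434, 13, 0, 259],
    [20, 0, 15, 0, 0, 2935, 3, 24],
    [0, 0, 39, 0, 1, 139, 2753, 65],
    [0, 6, 125, 4, 1, 26, 1, 2742],
]

def accuracy(matrix):
    """Return (correct, total, accuracy in percent) for a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct, total, 100.0 * correct / total

correct, total, percent = accuracy(BAYES_BALANCED)
print(correct, total, round(percent, 4))
```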
24.
ID        Name    Count    Size
100005    DIY     5997     9.0M
100227            5969     7.1M
100371            3000     4.6M
100939            2990     4.7M
101114            1500     2.4M
101381            1500     1.5M
200162            500      0.48M
216131            500      0.45M
25.
bayes
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances    : 14459    60.5511%
Incorrectly Classified Instances  :  9420    39.4489%
Total Classified Instances        : 23879
=======================================================
Confusion Matrix
-------------------------------------------------------
a     b     c     d     e     f     g     h     <--Classified as
1945  0     0     5     799   0     241   0     | 2990  a = c100939
11    0     0     0     349   0     2632  5     | 2997  b = c200162
6     0     0     833   82    0     2078  0     | 2999  c = c216131
0     0     0     2925  13    0     62    0     | 3000  d = c100371
0     0     0     0     2993  0     1     0     | 2994  e = c100227
3     0     0     0     86    2479  429   0     | 2997  f = c101114
0     0     0     0     39    0     2958  0     | 2997  g = c100005
0     0     1     8     800   2     935   1159  | 2905  h = c101381
Default Category: unknown: 8
26.
complement naive bayes
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances    : 17225    72.1345%
Incorrectly Classified Instances  :  6654    27.8655%
Total Classified Instances        : 23879
=======================================================
Confusion Matrix
-------------------------------------------------------
a     b     c     d     e     f     g     h     <--Classified as
2249  0     0     5     588   0     148   0     | 2990  a = c100939
10    1806  0     3     370   0     808   0     | 2997  b = c200162
2     0     471   1084  252   0     1190  0     | 2999  c = c216131
0     0     0     2985  4     0     11    0     | 3000  d = c100371
0     0     0     0     2994  0     0     0     | 2994  e = c100227
27    0     0     1     69    2422  478   0     | 2997  f = c101114
0     0     0     0     33    0     2964  0     | 2997  g = c100005
1     0     0     30    910   0     630   1334  | 2905  h = c101381
Default Category: unknown: 8
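Overall accuracy hides where the two classifiers differ on the imbalanced training set. Per-class recall (diagonal cell divided by the row total) makes it visible. The sketch below computes recall for the bayes and complement naive bayes matrices from the imbalanced run; the names are chosen for this example, and the class order a..h follows the Mahout output.

```python
CLASSES = ["c100939", "c200162", "c216131", "c100371",
           "c100227", "c101114", "c100005", "c101381"]

BAYES_IMBALANCED = [
    [1945, 0, 0, 5, 799, 0, 241, 0],
    [11, 0, 0, 0, 349, 0, 2632, 5],
    [6, 0, 0, 833, 82, 0, 2078, 0],
    [0, 0, 0, 2925, 13, 0, 62, 0],
    [0, 0, 0, 0, 2993, 0, 1, 0],
    [3, 0, 0, 0, 86, 2479, 429, 0],
    [0, 0, 0, 0, 39, 0, 2958, 0],
    [0, 0, 1, 8, 800, 2, 935, 1159],
]

CBAYES_IMBALANCED = [
    [2249, 0, 0, 5, 588, 0, 148, 0],
    [10, 1806, 0, 3, 370, 0, 808, 0],
    [2, 0, 471, 1084, 252, 0, 1190, 0],
    [0, 0, 0, 2985, 4, 0, 11, 0],
    [0, 0, 0, 0, 2994, 0, 0, 0],
    [27, 0, 0, 1, 69, 2422, 478, 0],
    [0, 0, 0, 0, 33, 0, 2964, 0],
    [1, 0, 0, 30, 910, 0, 630, 1334],
]

def recall_by_class(matrix, classes):
    """Recall per true class: correctly classified / all instances of that class."""
    return {name: row[i] / sum(row) for i, (name, row) in enumerate(zip(classes, matrix))}

bayes_recall = recall_by_class(BAYES_IMBALANCED, CLASSES)
cbayes_recall = recall_by_class(CBAYES_IMBALANCED, CLASSES)
for name in CLASSES:
    print(f"{name}  bayes={bayes_recall[name]:.3f}  cbayes={cbayes_recall[name]:.3f}")
```

The two classes with only 500 training documents each (c200162 and c216131) are never predicted correctly by bayes in this run, while complement naive bayes recovers part of them.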
27.
API
complement naive bayes, Mahout