(Complementary) Naive Bayes
                                15




                   2011/01/23        TokyoWebmining #9-2nd

2011   1   23
naoki yanai           @yanaoki
                web

                   Java Ruby Hadoop




                TokyoWebmining




- Naive Bayes -


Naive Bayes




                     ,Supervised


                Naive Bayes   ComplementNaiveBayes

Web




Naive Bayes
                   F1,...,Fn                 C




                           C




                                         c




       ※        gihyo.jp          http://gihyo.jp/dev/serial/01/machine-learning/0002

Naive Bayes




                w                j          i

                 i

                 c                                      j                  θcj

                 i

       ※        Wiki http://ibisforest.org/index.php?complement%20naive%20Bayes

Complementary Naive Bayes

                Naive Bayes

                   Naive Bayes                       θcj




                   NaiveBayes         c                            j             θcj



                CNaiveBayes      c                                     j           θˆij




           ※            Wiki http://ibisforest.org/index.php?complement%20naive%20Bayes
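The contrast between the two estimators can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Mahout's implementation: the function names are hypothetical, `alpha` is a Laplace-style smoothing constant, and class priors, TF-IDF transforms and weight normalization are left out. Naive Bayes estimates the word weight for class c from the documents of c itself, while Complement Naive Bayes estimates it from the documents of every class except c and flips the sign.

```python
import math
from collections import Counter, defaultdict


def train_counts(docs):
    """docs: iterable of (label, tokens). Returns per-class word counts and the vocabulary."""
    counts = defaultdict(Counter)
    for label, tokens in docs:
        counts[label].update(tokens)
    vocab = {w for wc in counts.values() for w in wc}
    return counts, vocab


def nb_weights(counts, vocab, alpha=1.0):
    """Naive Bayes: log theta_cj estimated from the words of class c itself."""
    weights = {}
    for c, wc in counts.items():
        total = sum(wc.values()) + alpha * len(vocab)
        weights[c] = {w: math.log((wc[w] + alpha) / total) for w in vocab}
    return weights


def cnb_weights(counts, vocab, alpha=1.0):
    """Complement NB: estimate from all classes except c, negated so a larger score is better."""
    weights = {}
    for c in counts:
        comp = Counter()
        for other, wc in counts.items():
            if other != c:
                comp.update(wc)
        total = sum(comp.values()) + alpha * len(vocab)
        weights[c] = {w: -math.log((comp[w] + alpha) / total) for w in vocab}
    return weights


def classify(weights, tokens):
    """Sum the weights of the known tokens per class and return the best class."""
    scores = {c: sum(wt[t] for t in tokens if t in wt) for c, wt in weights.items()}
    return max(scores, key=scores.get)


# Toy data: class "a" has more training text than class "b".
docs = [("a", ["x", "x", "y"]), ("a", ["x", "z"]), ("b", ["y", "y"])]
counts, vocab = train_counts(docs)
print(classify(nb_weights(counts, vocab), ["x"]))
print(classify(cnb_weights(counts, vocab), ["y", "y"]))
```

Because the complement of a small class is large, its estimate rests on more data, which is the usual argument for why the complement variant copes better with skewed class sizes.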
-   -




API



                Mahout
                 Mahout

                Mahout




Mahout

                Mahout


                  CF Clustering

                  Hadoop


                        (Classifier)
                      NaiveBayes   ComplementaryNaiveBayes




bayes   cbayes




1.

                                   API                 API               API



                      mecab

                                         ID[TAB]
                c100371                            1               !




                                                                                        :
                45            45           10      SIZE - 02   :   56 B 76     31 SIZE - 03
                          :    57 B 82
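A minimal sketch of writing one training line, assuming the layout hinted at by `ID[TAB]` and the `c100371` example: a category label (`c` plus the category ID), a tab, then the space-separated words of the document. The helper name `to_training_line` is hypothetical, and the tokens are assumed to come from a tokenizer such as mecab that has already split the text.

```python
def to_training_line(category_id, tokens):
    """Format one document as 'c<ID><TAB>token token ...'."""
    # Drop empty or whitespace-only tokens so the line has single-space separators.
    cleaned = [t.strip() for t in tokens if t.strip()]
    return "c{}\t{}".format(category_id, " ".join(cleaned))


# Placeholder tokens; real input would be the tokenized product text.
line = to_training_line(100371, ["sample", "product", "title"])
print(line)
```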

2.
                                     bayes|cbayes     N-gram     N-gram    alpha


                                 Bayes/CBayes

                      35MB                   EC2                  3       15

                $ mahout trainclassifier \
                --gramSize 1 \
                --input /classifier/rakuten/data_search \
                --output /classifier/rakuten/model_searchcbig \
                --classifierType cbayes \
                --dataSource hdfs \
                --alpha 1



3.




                $ mahout testclassifier \
                 --gramSize 1 \
                 --testDir /home/yanaoki/classifier/rakuten/data_rank1 \
                 --model /classifier/rakuten/model_searchcbig \
                 --classifierType cbayes \
                 --dataSource hdfs \
                 --alpha 1


4.


                                             Classify

                                                                HBase




                export CLASSPATH=...
                $ java org.apache.mahout.classifier.Classify \
                --path /home/yanaoki/classifier/rakuten/model \
                --classify /home/yanaoki/classifier/rakuten/d.doc \
                --encoding UTF-8 \
                --gramSize 1 \
                --classifierType cbayes \
                --dataSource hdfs

ID

                100005    DIY   2997   4.8M
                100227          2994   3.9M
                100371          3000   4.6M
                100939          2990   4.7M
                101114          2997   3.3M
                101381          2905   4.2M
                200162          2997   2.6M
                216131          2999   3.4M

bayes
 =======================================================
 Summary
 -------------------------------------------------------
 Correctly Classified Instances          :      22442       93.9822%
 Incorrectly Classified Instances        :       1437        6.0178%
 Total Classified Instances              :      23879

 =======================================================
 Confusion Matrix
 -------------------------------------------------------
 a       b       c       d       e       f       g         h       <--Classified as
 2802    0       127     3       8       19      4         27       |  2990        a       =   c100939
 0       2982    2       0       0       0       0         13       |  2997        b       =   c200162
 0       7       2948    34      0       1       0         9        |  2999        c       =   c216131
 1       0       133     2846    0       1       0         19       |  3000        d       =   c100371
 233     2       53      0       2434    13      0         259      |  2994        e       =   c100227
 20      0       15      0       0       2935    3         24       |  2997        f       =   c101114
 0       0       39      0       1       139     2753      65       |  2997        g       =   c100005
 0       6       125     4       1       26      1         2742     |  2905        h       =   c101381
 Default Category: unknown: 8
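The summary figures can be rechecked from the confusion matrix itself. The sketch below copies the matrix from the output above, treats each row as the true class and each column as the predicted class (the row totals match the per-class counts printed on the right), and computes overall accuracy and per-class recall. The function names are illustrative only.

```python
labels = ["c100939", "c200162", "c216131", "c100371",
          "c100227", "c101114", "c100005", "c101381"]

# Rows: true class, columns: predicted class (order a..h as in the output above).
matrix = [
    [2802, 0, 127, 3, 8, 19, 4, 27],
    [0, 2982, 2, 0, 0, 0, 0, 13],
    [0, 7, 2948, 34, 0, 1, 0, 9],
    [1, 0, 133, 2846, 0, 1, 0, 19],
    [233, 2, 53, 0, 2434, 13, 0, 259],
    [20, 0, 15, 0, 0, 2935, 3, 24],
    [0, 0, 39, 0, 1, 139, 2753, 65],
    [0, 6, 125, 4, 1, 26, 1, 2742],
]


def accuracy(m):
    """Share of instances on the diagonal."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def recall_per_class(m):
    """Diagonal entry divided by the row total, one value per true class."""
    return [row[i] / sum(row) for i, row in enumerate(m)]


print("accuracy: {:.4%}".format(accuracy(matrix)))
for label, recall in zip(labels, recall_per_class(matrix)):
    print("{}: {:.1%}".format(label, recall))
```

Per-class recall shows where the errors concentrate: c100227 is the weakest class, with 2434 of 2994 instances classified correctly.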




ID

                100005    DIY   5997   9.0M
                100227          5969   7.1M
                100371          3000   4.6M
                100939          2990   4.7M
                101114          1500   2.4M
                101381          1500   1.5M
                200162          500    0.48M
                216131          500    0.45M

bayes
 =======================================================
 Summary
 -------------------------------------------------------
 Correctly Classified Instances          :      14459       60.5511%
 Incorrectly Classified Instances        :       9420       39.4489%
 Total Classified Instances              :      23879

 =======================================================
 Confusion Matrix
 -------------------------------------------------------
 a       b       c       d       e       f       g         h       <--Classified as
 1945    0       0       5       799     0       241       0        |  2990        a       =   c100939
 11      0       0       0       349     0       2632      5        |  2997        b       =   c200162
 6       0       0       833     82      0       2078      0        |  2999        c       =   c216131
 0       0       0       2925    13      0       62        0        |  3000        d       =   c100371
 0       0       0       0       2993    0       1         0        |  2994        e       =   c100227
 3       0       0       0       86      2479    429       0        |  2997        f       =   c101114
 0       0       0       0       39      0       2958      0        |  2997        g       =   c100005
 0       0       1       8       800     2       935       1159     |  2905        h       =   c101381
 Default Category: unknown: 8




complement naive bayes
 =======================================================
 Summary
 -------------------------------------------------------
 Correctly Classified Instances          :      17225       72.1345%
 Incorrectly Classified Instances        :       6654       27.8655%
 Total Classified Instances              :      23879

 =======================================================
 Confusion Matrix
 -------------------------------------------------------
 a       b       c       d       e       f       g         h       <--Classified as
 2249    0       0       5       588     0       148       0        |  2990        a       =   c100939
 10      1806    0       3       370     0       808       0        |  2997        b       =   c200162
 2       0       471     1084    252     0       1190      0        |  2999        c       =   c216131
 0       0       0       2985    4       0       11        0        |  3000        d       =   c100371
 0       0       0       0       2994    0       0         0        |  2994        e       =   c100227
 27      0       0       1       69      2422    478       0        |  2997        f       =   c101114
 0       0       0       0       33      0       2964      0        |  2997        g       =   c100005
 1       0       0       30      910     0       630       1334     |  2905        h       =   c101381
 Default Category: unknown: 8
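To see where the two classifiers differ on the skewed data, the diagonals of the two confusion matrices can be set next to the class sizes. This is a small sketch using only numbers shown above; it assumes the second ID table lists the number of training documents per category, which the surviving slide text does not state explicitly.

```python
labels = ["c100939", "c200162", "c216131", "c100371",
          "c100227", "c101114", "c100005", "c101381"]
test_total = [2990, 2997, 2999, 3000, 2994, 2997, 2997, 2905]

# Correctly classified instances per class (diagonals of the two matrices).
bayes_hits = [1945, 0, 0, 2925, 2993, 2479, 2958, 1159]
cbayes_hits = [2249, 1806, 471, 2985, 2994, 2422, 2964, 1334]

# Assumption: counts from the skewed ID table are training-set sizes.
train_docs = {"c100005": 5997, "c100227": 5969, "c100371": 3000, "c100939": 2990,
              "c101114": 1500, "c101381": 1500, "c200162": 500, "c216131": 500}

print("{:<8} {:>6} {:>8} {:>8}".format("class", "train", "bayes", "cbayes"))
for label, total, b, cb in zip(labels, test_total, bayes_hits, cbayes_hits):
    print("{:<8} {:>6} {:>8.1%} {:>8.1%}".format(label, train_docs[label], b / total, cb / total))

# Classes that plain bayes never predicts correctly.
never_hit = [label for label, b in zip(labels, bayes_hits) if b == 0]
print(never_hit)
```

The two categories with only 500 documents each (c200162 and c216131) are never classified correctly by bayes, while cbayes recovers 1806 of 2997 and 471 of 2999 of them. On c101114 bayes is slightly ahead (2479 against 2422), so the gain is not uniform across classes.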




API


                      complement naive bayes




           Mahout


