Data Analysis with Machine Learning: Practical Edition

The exercise scripts are available at the links below.
Python
http://nbviewer.ipython.org/gist/canard0328/a5911ee5b4bf1a07fbcb/
https://gist.github.com/canard0328/07a65584c134a2700725
R
http://nbviewer.ipython.org/gist/canard0328/6f44229365f53b7bd30f/
https://gist.github.com/canard0328/b2f8aec2b9c286f53400

  1. 1. @canard0328
  3. 3. Exercise scripts. Python: http://nbviewer.ipython.org/gist/canard0328/a5911ee5b4bf1a07fbcb/ https://gist.github.com/canard0328/07a65584c134a2700725 R: http://nbviewer.ipython.org/gist/canard0328/6f44229365f53b7bd30f/ https://gist.github.com/canard0328/b2f8aec2b9c286f53400
  4. 4. SEMMA: Sample, Explore, Modify, Model, Assess
  5. 5. CRISP-DM (Cross-Industry Standard Process for Data Mining): Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment. KDD (Knowledge Discovery in Databases): Selection, Preprocessing, Transformation, Data Mining, Interpretation/Evaluation. KKD: Keiken, Kan and Dokyo (experience, intuition, and guts)
  6. 6. Data: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.csv (Data obtained from http://biostat.mc.vanderbilt.edu/DataSets) R: > data = read.csv("titanic3.csv", stringsAsFactors=F, na.strings=c("","NA")) Python: >>> import pandas as pd >>> data = pd.read_csv('titanic3.csv')
  15. 15. 1-of-K encoding
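As an illustration of the 1-of-K encoding named on this slide, here is a minimal sketch in plain Python. The function name `one_of_k` and the sample values are assumptions made for this example; they are not taken from the slides or the exercise scripts.

```python
def one_of_k(values):
    """Encode categorical values as 1-of-K (one-hot) vectors."""
    # Sort the distinct categories so the column order is reproducible.
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # exactly one position is set per value
        encoded.append(row)
    return categories, encoded


# Illustrative input only (hypothetical category labels).
categories, encoded = one_of_k(["S", "C", "Q", "S"])
```

Each row contains a single 1, in the column of its category, so no artificial ordering is imposed on the categories.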
  16. 16. Feature hashing / Hashing trick: x := new vector[N]; for f in features: h := hash(f); x[h mod N] += 1. http://en.wikipedia.org/wiki/Feature_hashing
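The pseudocode on this slide can be sketched in Python as follows. Using MD5 as the hash function is an assumption made here so that results are stable across runs (Python's built-in `hash` is randomized for strings); the function name `hash_features` and the feature strings are likewise hypothetical.

```python
import hashlib


def hash_features(features, n_buckets):
    """Map a list of string features to a fixed-length count vector."""
    x = [0] * n_buckets  # x := new vector[N]
    for f in features:
        # h := hash(f), using a deterministic hash (assumption: MD5)
        digest = hashlib.md5(f.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        x[h % n_buckets] += 1  # x[h mod N] += 1
    return x


# Illustrative feature strings only.
vec = hash_features(["sex=female", "pclass=1", "embarked=S"], 8)
```

The vector length stays at `n_buckets` no matter how many distinct features appear; the price is that different features may collide in the same bucket.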
  17. 17. Curse of dimensionality
  18. 18. Standardization
  19. 19. Standardization: $z = \frac{x - \mu}{\sigma}$, where $\mu$ is the mean and $\sigma$ the standard deviation of $x$
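A minimal sketch of the standardization formula z = (x - μ) / σ using only the standard library. It assumes the population standard deviation is intended; the function name `standardize` and the sample values are made up for illustration.

```python
from statistics import mean, pstdev


def standardize(xs):
    """Return z-scores: subtract the mean, divide by the standard deviation."""
    mu = mean(xs)
    sigma = pstdev(xs)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(x - mu) / sigma for x in xs]


# Illustrative values only: mean 5, standard deviation 2.
z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

After the transformation the values have mean 0 and standard deviation 1, which puts features measured on different scales on a comparable footing.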
  20. 20. Feature selection: Forward stepwise selection, Backward stepwise selection
  21. 21. Ugly duckling theorem
  23. 23. "Machine learning is the science of getting computers to act without being explicitly programmed." Andrew Ng
  24. 24. Supervised learning: classification, regression. Unsupervised learning: outlier detection
  25. 25. Semi-supervised learning, reinforcement learning, active learning, online learning, transfer learning
  27. 27. K-means, Apriori, One-class SVM
  29. 29. $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_i x_i + \varepsilon$; generalized linear model
  30. 30. K-means; Gaussian mixture model
  34. 34. Mean absolute error; Mean square(d) error; Root mean square(d) error; R² (coefficient of determination)
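The four regression metrics listed on this slide can be computed as in the sketch below, which uses their standard definitions. The function name `regression_metrics` and the sample numbers are assumptions for illustration, and the code assumes the true values are not all identical (otherwise R² is undefined).

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R^2 for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = sqrt(mse)
    y_mean = sum(y_true) / n
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                       # assumes ss_tot > 0
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Illustrative values only.
m = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
```

MAE and RMSE are in the units of the target, while R² is unitless and equals 1 for a perfect fit.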
  35. 35. Accuracy; Error rate (1 - accuracy)
  36. 36. Confusion matrix: True positive (TP), False negative (FN), False positive (FP), True negative (TN)
  37. 37. Precision: TP/(TP + FP). Recall: TP/(TP + FN). F-measure (F1 score): 2 × (precision × recall) / (precision + recall)
  38. 38. True Positive Rate: TP/(TP + FN). False Positive Rate: FP/(FP + TN)
  39. 39. Example with 100 Positive and 9900 Negative samples, all predicted Negative: TP = 0, FN = 100, FP = 0, TN = 9900; accuracy 0.99, precision 0, recall 0, F-measure 0
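The numbers on this slide (0, 100, 0, 9900 giving accuracy 0.99 but precision, recall and F-measure 0) can be reproduced with the sketch below. The function name `classification_metrics` is hypothetical, and treating a 0/0 ratio as 0 is an assumption chosen to match the slide's values.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""

    def ratio(num, den):
        # Assumption: report 0 when the denominator is 0 (e.g. no positive predictions).
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fn + fp + tn)
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    return accuracy, precision, recall, f1


# Every sample is predicted Negative on a 100 / 9900 class split.
accuracy, precision, recall, f1 = classification_metrics(tp=0, fn=100, fp=0, tn=9900)
```

Accuracy looks excellent here only because the Negative class dominates; precision, recall and F1 expose that not a single Positive sample was found.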
  40. 40. SMOTE
  41. 41. ROC curve; AUC (area under the ROC curve), 1.0 for a perfect classifier
  42. 42. ROC, AUC
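One way to compute the AUC without plotting the ROC curve is the pairwise-ranking formulation: the AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below is a simple quadratic-time version; the function name `roc_auc` and the sample scores are assumptions for illustration.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Illustrative labels and scores only.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Unlike accuracy, this value does not depend on a single decision threshold, which makes it useful for the imbalanced cases discussed on the preceding slides.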
  43. 43. > clf = SVC().fit(X, y)
  44. 44. > clf = SVC(kernel='rbf', C=1.0, gamma=0.1).fit(X, y)
  46. 46. Candidate parameter values: (10^-2, 10^-1, 10^0, 10^1, 10^2)
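A rough sketch of how exponentially spaced values such as 10^-2 to 10^2 might be searched exhaustively for the `C` and `gamma` parameters shown on the earlier SVC slide. The helper `grid_search` and the scoring function `toy_score` are hypothetical stand-ins: in practice the score would come from evaluating a trained model (for example with cross validation), which is not done here.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Try every parameter combination and keep the best-scoring one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Exponentially spaced candidates: 10^-2, 10^-1, 10^0, 10^1, 10^2.
grid = {
    "C": [10.0 ** k for k in range(-2, 3)],
    "gamma": [10.0 ** k for k in range(-2, 3)],
}


def toy_score(C, gamma):
    # Hypothetical stand-in for a real validation score; peaks at C=1, gamma=0.1.
    return -((C - 1.0) ** 2 + (gamma - 0.1) ** 2)


best_params, best_score = grid_search(toy_score, grid)
```

The number of combinations grows multiplicatively with each added parameter (25 here), which is why coarse exponential spacing is a common starting point.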
  49. 49. Overfitting
  50. 50. Regularization: Lasso, SVM. Underfitting
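  51. 51. Cross validation (data split into blocks A to E): 1. Train on B-E, evaluate on A. 2. Train on A, C-E, evaluate on B. 3. Train on A, B, D, E, evaluate on C. 4. Train on A-C, E, evaluate on D. 5. Train on A-D, evaluate on E. 6. Average the 5 results (5-fold cross validation)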
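The 5-fold procedure on this slide, where each block in turn is held out while the remaining blocks are used for training, can be sketched as an index generator. The function name `k_fold_indices` is hypothetical, and the sketch deliberately does not shuffle or stratify the data.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(indices[start:start + size])
        start += size
    for i in range(k):
        test = folds[i]  # the held-out block
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Illustrative: 10 samples split into 5 blocks of 2.
splits = list(k_fold_indices(10, k=5))
```

Every sample is used for evaluation exactly once, and the final estimate is the average of the k per-fold scores.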
  52. 52. Leave-one-out cross validation; Stratified cross validation
  54. 54. $\varepsilon \sim N(0, \sigma^2)$; error = $\sigma^2 + \mathrm{Bias}^2 + \mathrm{Variance}$
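For reference, the decomposition on this slide can be written out as follows. This is the standard result for squared-error loss and rests on assumptions not visible in the surviving slide text: the data follow $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and the fitted model $\hat{f}$ depends on a random training set that is independent of $\varepsilon$.

```latex
\begin{aligned}
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  &= \sigma^2
   + \big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2
   + \mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big] \\
  &= \sigma^2 + \mathrm{Bias}^2 + \mathrm{Variance}
\end{aligned}
```

The first term is irreducible noise, the second measures how far the average fit is from the true function, and the third measures how much the fit changes from one training set to another.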
  58. 58. Overfitting; Underfitting
  64. 64. Ensemble learning: Stacking, Bagging, Boosting. Deep learning: Neural networks
  65. 65. https://www.linkedin.com/pulse/inconvenient-truth-data-science-kamil-bartocha
  66. 66. MALSS (Machine Learning Support System); Python
  67. 67. MALSS > pip install -U malss > from malss import MALSS > clf = MALSS('classification', lang='jp') > clf.fit(X, y, 'report_output_dir') > clf.make_sample_code('sample_code.py')
  68. 68. MALSS
  69. 69. MALSS
  70. 70. F. Provost; Coursera: Machine Learning, Andrew Ng, https://www.coursera.org/course/ml; scikit-learn Tutorials, http://scikit-learn.org/stable/tutorial/; Tutorial: Machine Learning for Astronomy with Scikit-learn, http://www.astroml.org/sklearn_tutorial/
  71. 71. MALSS (Machine Learning Support System): https://pypi.python.org/pypi/malss/ https://github.com/canard0328/malss; Qiita articles on MALSS with Python: http://qiita.com/canard0328/items/fe1ccd5721d59d76cc77 http://qiita.com/canard0328/items/5da95ff4f2e1611f87e1 http://qiita.com/canard0328/items/3713d6758fe9c045a19d
  72. 72. SEMMA, CRISP-DM, KDD, KKD