How data science works and how can customers help

This slide deck is designed primarily with a data science customer in mind. The data science process is explained so that the customer knows how to ensure the success of a data science project. I also think junior data scientists can profit from better understanding the process of creating models customized for customers, and avoid some pitfalls. In addition, sales people and managers should be able to better grasp the jobs of data scientists.

Published in: Data & Analytics

  1. 1. DATA SCIENCE – how we do the magic and how the customer can help. Prof. Danko Nikolic, PhD
  2. 2. MAIN GOAL: Explain how CSC does data science and what we need from the customer. How do we become successful together?
  3. 3. PRIMARY AUDIENCE: technical personnel on the customer side
  4. 4. How technically deep is this document (scale 1 – 10)?: 5 For effective data science, how essential is collaboration with the customer (scale 1-10)?: 10
  5. 5. INTRO
  6. 6. Textbooks use simplified data to explain how you apply statistical methods and do not say much about how you deal with data in real life. They are thus misleading.
  7. 7. Textbooks make you believe that an appropriate model for your data already exists. “You just need to select the right model and apply it.”
  8. 8. Unfortunately, data science is not that simple. Data scientists do not just pick models.
  9. 9. Misconception: a data scientist applies a model. Correction: a data scientist creates a model.
  10. 10. Each data set has its own oddities, quirks, issues, … Each phenomenon that we want to model lives in its own world. The job of a data scientist is to understand this world, and to tailor a model accordingly.
  11. 11. Rarely will an off-the-shelf model be outright optimal for a real-life problem.
  12. 12. What a customer buys: a unique model optimized for the customer’s needs.
  13. 13. Skills and experience of a data scientist translate into the ability to create customized models. It may take 10 or 20 years to stack up a skill set to effectively build customized models.
  14. 14. At CSC, we offer that experience.
  15. 15. Creation of a model requires one to master: statistics, coding, optimization, storytelling, visualization, experimental design, Big Data technology, clustering, business models, regression, handling databases, probability …
  16. 16. … scientific thinking, deep learning, intuition, distributions, overfitting, information theory, cross-correlation, fractal geometry, computation, multivariate analysis, statistical biases, no-free-lunch theorem, support-vector machine, normalization, regularization, matrix algebra, graph theory, … Boltzmann machine, dropout, entropy, auto-associative networks, reinforcement learning, Lasso, Kohonen network, backpropagation, … natural language processing, scientific publishing, Bayes theorem, genetic algorithms, swarm intelligence, boosting, Markov process, softmax, power spectrum, good regulator theorem, presentation skills, … + keeping up with 100s of new models and tools announced every year.
  17. 17. Hence, a team of experienced data scientists can often navigate this world more effectively and creatively than an individual alone. Experience + Team is what gets the customer the best model in the end.
  18. 18. Examples of notable team efforts:
  19. 19. US$ 1,000,000
  20. 20. These are all unique, newly created models tailored for a particular purpose. No existing off-the-shelf model could simply be applied.
  21. 21. But what will a data scientist do? How does one create a new model?
  22. 22. Important to distinguish model architecture from a complete model. Architecture: model specified but without training. Equations and interactions between equations are defined, but parameter values are not yet known. Complete model: trained model. Parameter values are known. Machine learning has been applied. The model has been fully trained and tested, and is ready to be deployed.
  23. 23. Example architecture: A wiring diagram, defined data flow, topology, equations,… but parameter values are not yet specified.
  24. 24. Example complete model: W1,1 = 0.12 W1,2 = 0.03 W2,4 = -0.45 … …+ Optimal values of parameters are found through machine learning (training) process.
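The distinction between an architecture and a complete model can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: the linear equation, the gradient-descent settings, and the four training points are all assumptions chosen for the example. The function `predict` fixes the equation (the architecture), while `train` lets the machine find the parameter values that turn it into a complete model.

```python
def predict(x, weights, bias):
    """Architecture: one fixed equation, y = w1*x1 + w2*x2 + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train(xs, ys, lr=0.1, epochs=2000):
    """Training: gradient descent on mean squared error finds the parameter values."""
    n, n_features = len(xs), len(xs[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        errors = [predict(x, weights, bias) - y for x, y in zip(xs, ys)]
        grad_w = [2 * sum(e * x[j] for e, x in zip(errors, xs)) / n
                  for j in range(n_features)]
        grad_b = 2 * sum(errors) / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Hypothetical training data, generated from y = 0.12*x1 - 0.45*x2 + 0.03.
xs = [(0, 0), (1, 0), (0, 1), (1, 1)]
ys = [0.03, 0.15, -0.42, -0.30]

# Architecture + learned parameter values = complete model.
weights, bias = train(xs, ys)
print([round(w, 2) for w in weights], round(bias, 2))
```

Before `train` runs, only the form of the equation is known; afterwards the weights and bias are concrete numbers, as on the slide above.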
  25. 25. Architecture + Training. Architecture: a human does the work. Training: the machine does the work.
  26. 26. IMPORTANT: A data scientist works with a tradeoff between the effort invested in designing a model’s architecture and the effort of training the model. The more specialized the architecture for a given problem, the less training is needed.
  27. 27. Advantages of a specialized architecture: - smaller datasets for training - more resilient to over-fitting - closer to the global optimum - fewer resources - cost effective - better overall performance
  28. 28. The opposite is an eclectic architecture. An eclectic architecture can be applied to many different kinds of data but needs more training. As a result: - larger amounts of data needed - intensive computation - easily over-fitted - likely ending in local minima - higher development costs - weaker performance
  29. 29. Architecture versus training: a specialized architecture weighs heavily in the performance of a model.
  30. 30. Why does specialized architecture enhance learning? - The architecture already possesses a part of the needed knowledge, so less is left to be learned. - The learning space becomes smaller (reduced dimensionality). - During learning, a specialized architecture raises the signal above the noise.
  31. 31. Big Data, due to their mass, allow working with more general (eclectic) architectures.
  32. 32. Relative contributions to a model’s knowledge: Highly specialized architecture + “small” data: this is the ratio we prefer. Eclectic architecture + Big Data: this tradeoff is often successful.
  33. 33. Example of specialized architecture: general linear model (GLM). Regression based on GLM can already work well with as few as 100 data points. The architecture of GLM already contains knowledge about: - Gaussian distributions, - linear relationships, - independent sampling, - pairwise correlations, - …
  34. 34. The specialization of GLM is founded in the discoveries by generations of statisticians. Over years, they discovered a set of properties that tended to repeat in real-life data sets. The result is GLM.
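As a rough illustration of why a specialized architecture gets by with little data, the sketch below fits a straight line to 100 points by ordinary least squares. It is a minimal example under stated assumptions: the data are synthetic (generated from y = 2x + 1 with Gaussian noise), and simple linear regression stands in for the full GLM family. Because the architecture already assumes a linear relationship with Gaussian noise, only two parameters remain to be learned.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data (an assumption for illustration): 100 points from
# y = 2x + 1 with Gaussian noise of standard deviation 0.5.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

With these settings the estimates should land close to the true slope of 2 and intercept of 1, even though only 100 noisy points were used.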
  35. 35. Example of an eclectic architecture: multi-layer perceptron (aka artificial neural network). A neural net can learn the same linear relations as GLM + many other relations that GLM cannot. This makes neural nets more eclectic: they can learn a lot of different things. However, much larger data sets are needed. The price for the generality of architecture is data size and training time.
  36. 36. Big Data also profit from specialized architectures! Small architecture (general) + Big Data versus bigger architecture (more specific) + (less) Big Data. More data cannot always replace architecture (curse of dimensionality).
  37. 37. Example of Big Data combined with specialized architecture: convolutional NN. Only local connectivity; the same weights are repeated across all neurons of one layer. Convolutional layers in a neural network contain specific knowledge on how the visual world is organized. Addition of convolutional layers improves learning.
  38. 38. Better NN architecture; more suitable for processing images; the model ‘knows’ that local pixels are correlated and that they contain information on visual features. Consequences: A deep neural network with convolutional layers will perform more effectively than either an all-to-all connected deep network or any other “shallow” network.
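The effect of local connectivity and shared weights can be made concrete with a small sketch. The code below is illustrative only: the toy image, the edge kernel, and the 28 x 28 layer size are assumptions, and the sliding-window operation is the cross-correlation form that deep learning libraries commonly call convolution. It applies one shared kernel across an image and compares the parameter count of an all-to-all layer with that of a convolutional layer.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def dense_param_count(n_in, n_out):
    """All-to-all layer: one weight per input-output pair, plus one bias per output."""
    return n_in * n_out + n_out


def conv_param_count(kh, kw, n_filters=1):
    """Convolutional layer: the same kh x kw weights are reused at every position."""
    return n_filters * (kh * kw + 1)


# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1]] * 4
edge_kernel = [[-1, 1]]            # responds where brightness rises left to right
response = conv2d_valid(image, edge_kernel)

print(response[0])                          # [0, 1, 0]
print(dense_param_count(28 * 28, 28 * 28))  # 615440
print(conv_param_count(3, 3))               # 10
```

The kernel detects the edge wherever it occurs in the image, and the layer needs 10 parameters instead of several hundred thousand. That reduction is the knowledge about the visual world that the architecture carries before any training.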
  39. 39. A customer can help data scientists develop an architecture that is as specialized as possible.
  40. 40. Can there exist an eclectic model that also learns easily, like a specialized model? No! Because of the no-free-lunch theorem: “Any two optimization algorithms are equivalent when their performance is averaged across all possible problems.”
  41. 41. This is what machine learning is not - even with Big Data. Any data science problem will require working on an appropriate model architecture.
  42. 42. Various off-the-shelf models can be approximately sorted according to how specialized they are. [Figure: training effort against architecture type. Specialized architecture: low training effort, often high performance (fast learners). Eclectic architecture: high training effort, lower performance (slow learners). Models shown: laws of physics, linear regression, deep learning, genetic algorithms, SVM, decision tree, random forest, Naïve Bayes. Annotations: “The slope of optimal model application”; “The black triangle of unreality due to the no-free-lunch theorem”; “If you are in this corner, you may be using a wrong model for the given data.”]
  43. 43. Off-the-shelf models usually are not end architectures. More often, they are only components of specialized models. The more eclectic an off-the-shelf model, the more room for adding specializations there is.
  44. 44. A data scientist will often combine off-the-shelf models with other components to build a model specialized for the customer’s data.
  45. 45. Commonly used specialization tool: data wrangling. Data wrangling extracts from the data what is important (the signal!) in a way that is suitable for an off-the-shelf model. Example: Data -> equations for data wrangling -> neural net. Extensive thought is given to data wrangling. Less thought may be needed to apply a neural net, because a neural net alone provides an eclectic architecture. Specific wrangling steps + neural net together form a highly specialized model. Here, data wrangling plays a role similar to that of convolution in deep neural nets.
  46. 46. Remember: A data scientist CREATES a model.
  47. 47. Where does a data scientist operate? [Figure: training effort (low to high) against architecture (specialized to eclectic). Annotations: “An inexperienced data scientist may spend a lot of time in this corner.”; “A naïve ‘data scientist’ would hope to end up here.”]
  48. 48. How does a data scientist do that? Three main steps for building a specialized architecture:
  49. 49. 1. UNDERSTAND! - Analyze data, dependencies between variables, distributions, etc. - Study the (physical) system that generated the data.
  50. 50. A data scientist will perform calculations with the goal of understanding the data. Various tools help understanding: descriptive statistics, distribution plots, visualizations, scatter plots, time series, cross-correlation, fractal dimension, … A data scientist will also talk to experts, ask questions, read literature, go for a walk to think. By doing so, a data scientist will seek the insights necessary to implement novel model architectures.
  51. 51. 2. FORMALLY DESCRIBE! Describe the insight by drawing a graph, writing equations, listing the rules, …
  52. 52. 3. Implement into software (code)
  53. 53. Understand -> Formalize -> Code
  54. 54. Various software tools are at a data scientist’s disposal.
  55. 55. There is no simple recipe for which parts of a model to work on first: it’s a creative process!
  56. 56. Therefore, iterations: Understand, Formalize, Code model, Test, Train, Evaluate. Important help from the customer comes here.
  57. 57. Examples of successful specialized models created by the Data Science team:
  58. 58. Example I: Predictive maintenance—fan operations Vibration analysis
  59. 59. Goal: Detect healthy and unhealthy operations of a fan + classify the source of disturbances. 3-axis vibration sensor mounted on the fan. Data wrangling and insights: power spectrum to identify frequency bands carrying signals. Anomaly detection: An auto-associative neural network on full power spectrum. Disturbance classification: Logistic regression on selected frequency bands. Performance: 100% on new data sets. Data Science tools:
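The wrangling step described above, computing a power spectrum and summarizing selected frequency bands, can be sketched as follows. This is not the team's actual pipeline: the signal is synthetic, and the sampling rate, the component frequencies, and the two bands are hypothetical values picked for the example. A naive discrete Fourier transform keeps the sketch dependency-free; in practice an FFT routine would be used instead.

```python
import cmath
import math


def power_spectrum(signal, sample_rate):
    """Return (frequency in Hz, power) pairs from a naive discrete Fourier transform."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append((k * sample_rate / n, abs(coeff) ** 2 / n ** 2))
    return spectrum


def band_power(spectrum, low_hz, high_hz):
    """Sum the power of all frequency bins inside [low_hz, high_hz]."""
    return sum(p for f, p in spectrum if low_hz <= f <= high_hz)


# Synthetic one-axis vibration signal (hypothetical): a strong 50 Hz component
# and a weaker 120 Hz component, sampled at 1000 Hz for 0.2 s.
sample_rate = 1000
signal = [math.sin(2 * math.pi * 50 * t / sample_rate)
          + 0.3 * math.sin(2 * math.pi * 120 * t / sample_rate)
          for t in range(200)]

spectrum = power_spectrum(signal, sample_rate)
features = [band_power(spectrum, 40, 60), band_power(spectrum, 100, 140)]
peak_hz = max(spectrum, key=lambda fp: fp[1])[0]
print(peak_hz, features)
```

The two band powers form a compact feature vector that a simple classifier such as logistic regression could consume, which is the sense in which wrangling adds specialization to an otherwise generic model.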
  60. 60. Example II: Mind reading Brain signals
  61. 61. Goal: Reconstruct what the animal sees (stimulus) from the activity of neurons in the visual cortex. Data wrangling and insights: Spike sorting; convolution of neuronal spiking activity. Stimulus identification: Support vector machine fed with convolved neural activity. Stimulus reconstruction: An array of naïve Bayes classifiers. Performance: Up to 90%, 10-fold cross-validation. Data Science tools: Reference: Nikolić, D.*, S. Häusler*, W. Singer and W. Maass (2009) Distributed fading memory for stimulus properties in the primary visual cortex. PLoS Biology, 7: e1000260.
  62. 62. Example III: Predictive maintenance—Coffee machines Visits from a service technician
  63. 63. Goal: Predict whether a coffee machine will be visited by a technician within the next 3 months. Data: telemetric data on machine usage. Data wrangling and insights: cumulative variables, cross-correlation, heat map. Model: 4-layer artificial neural network on wrangled data. Performance: 14.1% above chance, 10-fold cross-validation. Best performance among 10 competitors. Data Science tools:
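Performance figures such as "above chance, 10-fold cross-validation" can be read with the following sketch in mind. It is a generic illustration under assumptions: the toy data, the nearest-mean classifier, and the use of a majority-class baseline as the chance level are all hypothetical, and the slides do not state how chance was defined in the project.

```python
import random
from collections import Counter


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def cross_val_accuracy(fit, xs, ys, k=10):
    """Mean test accuracy over k folds; fit(xs, ys) must return a predict(x) function."""
    scores = []
    for train, test in kfold_indices(len(xs), k):
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(sum(predict(xs[i]) == ys[i] for i in test) / len(test))
    return sum(scores) / len(scores)


def fit_majority(xs, ys):
    """Chance-level baseline: always predict the most common training label."""
    label = Counter(ys).most_common(1)[0][0]
    return lambda x: label


def fit_nearest_mean(xs, ys):
    """Toy model: predict the class whose training mean is closest to x."""
    means = {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c)
             for c in set(ys)}
    return lambda x: min(means, key=lambda c: abs(x - means[c]))


# Hypothetical one-feature data set: class 1 whenever the feature is at least 50.
xs = list(range(100))
ys = [int(x >= 50) for x in xs]

baseline = cross_val_accuracy(fit_majority, xs, ys)
model = cross_val_accuracy(fit_nearest_mean, xs, ys)
print(f"above chance: {model - baseline:.1%}")
```

Each sample is used for testing exactly once and never appears in the training set of its own fold, so the reported accuracy reflects data the model has not seen.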
  64. 64. Example IV: Train departure and arrival time
  65. 65. Goal: Compute new timetables in real time depending on the current traffic situation. Model specialization: Railway network implemented as a graph; nodes and edges executed as neural nets. Predictions: individual delays; departure, arrival and waiting times. Performance: We could predict with 68% accuracy a 3-minute window in which a train will arrive/depart, up to 48 hours into the future. Data Science and Big Data tools:
  66. 66. How exactly does a customer help?
  67. 67. The customer does not only deliver data.
  68. 68. WHAT WE NEED FROM THE CUSTOMER IS: Make us understand your world!
  69. 69. You need to do everything in your power to transfer model-relevant knowledge to us. (We’ll do the rest.)
  70. 70. Customer’s homework: - Know your economics. - Describe the process that created the data. - Formulate hypotheses. - Ensure access to relevant experts in your company.
  71. 71. Your economics: Which model could possibly make you money, or bring other benefits? Costs increase with Data Science and analytics effort. As a result, savings and profits rise, but not linearly. Sweet spot: Data Science costs are low, benefits are large. Data Science can cost you more than what it saves.
  72. 72. The process that created the data: Be it a single machine or an entire factory floor, a hospital ward or a marketing campaign, the more we understand about the process, the more specialization we can insert into the model.
  73. 73. Where do you think the signal in the data is? What is your hypothesis? Good specialized architecture extracts signal over noise. Point us in the direction you think is right. We’ll check whether there is a signal.
  74. 74. The person we may need to talk to
  75. 75. CSC + Customer form a full team.
  76. 76. The difference between taking an off-the-shelf model and investing time and expertise to create a specialized model translates into a difference between mediocre results and excellent results.
  77. 77. At CSC, we are after excellent results.
  78. 78. CSC provides top Data Science expertise for developing specialized model architectures in industry.
  79. 79. Contacts: Dr. Günter Koch, Senior Manager, gkoch@csc.com; Davor Andric, Principal Solution Architect, dandric@csc.com; Christian Kaupa, Director BD&A, ckaupa@csc.com; Prof. Dr. Danko Nikolic, Lead Data Scientist, dnikolic3@csc.com