This presentation highlights the 10 things women should focus on when building a career in Data Science. Starting with the business question is key: talking to the business users, business managers, and stakeholders to understand the question, and how the results will affect different employee roles, matters most. Next comes using only the data relevant to the business problem. After that, sound evaluation methods are needed to ensure the analytical solution holds up. Last but not least, show how the analytical results and models affect the business in terms of revenue, profitability, and operational efficiency.
2. 1. Need to be Comfortable with Maths / Stats
Types of Mathematics & Statistics necessary for data science:
• Probability Theory
• Probability Distributions
• Hypothesis Testing
• Bayesian Analysis
• Supervised Machine Learning (SVM, Random Forests, Neural Networks, Logistic Regression, Decision Trees)
• Unsupervised Learning (K-Means Clustering, Factor Analysis)
• NLP/Information Retrieval
• Model Validation & Comparison
3. 2. Learn to Code in R or Python
• Data Structures
• Algorithms
• Data Visualization (Tableau, QlikView)
• Data Munging
• Distributed Computing
4. 3. Understand Business Questions and Problems
• What is the business problem?
• What do you think is causing the problem?
• How do you know the problem exists?
• What are your business goals and objectives?
• How will you know you have achieved your business goal?
• What success factors would you like to see and achieve?
5. 4. Work with Industry Domain Experts
• Check your early results with the domain expert.
• Ask the domain expert more questions based on the insights you now have.
6. 5. Ask the Right Questions for Business Results
• Ask questions before starting the analysis.
• Keep asking yourself questions even after you see some results.
• Explore the data further as more questions come up.
• Ensure that the results make business sense; work closely with the domain expert.
7. 6. Spend Time Cleaning & Preparing the Data
• Are there outliers?
• Are there too many missing values?
• Is there multicollinearity?
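The three checks on this slide can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the dataset, the 1.5 × IQR outlier fence, and the 0.9 correlation cut-off are hypothetical choices, not values from the presentation, and pairwise correlation is a simple first screen for multicollinearity rather than a full diagnostic.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, quantiles

# Hypothetical dataset; None marks a missing value.
DATA = {
    "age":    [25, 31, 28, None, 40, 35, 29, 33],
    "income": [30, 36, 33, 38, 45, 40, 34, 250],      # thousands
    "spend":  [15, 18, 16.5, 19, 22.5, 20, 17, 125],  # thousands
}


def missing_share(values):
    """Fraction of entries that are missing."""
    return sum(v is None for v in values) / len(values)


def iqr_outliers(values):
    """Values lying more than 1.5 * IQR outside the quartiles."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    fence = 1.5 * (q3 - q1)
    return [v for v in present if v < q1 - fence or v > q3 + fence]


def pearson(xs, ys):
    """Pearson correlation over rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mx = mean([x for x, _ in pairs])
    my = mean([y for _, y in pairs])
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x, _ in pairs)
    var_y = sum((y - my) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


missing = {name: missing_share(col) for name, col in DATA.items()}
outliers = {name: iqr_outliers(col) for name, col in DATA.items()}
# Assumed cut-off: flag pairs with |r| > 0.9 as potentially collinear.
collinear = [
    (a, b) for a, b in combinations(DATA, 2)
    if abs(pearson(DATA[a], DATA[b])) > 0.9
]

print("missing share:", missing)
print("outliers:", outliers)
print("highly correlated pairs:", collinear)
```

In this made-up data, `spend` is exactly half of `income`, so the pair is flagged; in practice the domain expert should help decide whether a flagged value is an error or a genuine extreme.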
8. 7. Find the Right Mentors
• Mentors help you learn much faster and help ensure that you are doing the right checks and applying the right techniques to the right type of data.
• If a model does not run, a mentor can quickly help you debug it and get it running smoothly.
9. 8. Find a Job in a Company with an Experienced Data Scientist
• Many companies do not have an analytics department and therefore rely on junior data analysts to make sense of their data and to make recommendations based on analysis results.
• Junior analysts often focus only on running models quickly and getting good overall accuracy.
• Experienced mentors link the results to the business and identify errors that a junior analyst would miss.
10. 9. Validate your Analysis using the Triangular System
Apply three different methods and check that they lead to a similar outcome and the same decision.
• Cluster Analysis
• Principal Component Analysis
• Decision Tree
All three methods classify the customer correctly.
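As a rough illustration of this cross-checking idea, the sketch below segments one customer three ways and tests whether the answers agree. Everything in it is an assumption made for the example: the customer data and segment labels are invented, and each method is a deliberately simplified stand-in (two-cluster k-means, the closed-form first principal component for two standardized features, and a depth-one decision tree) rather than the author's actual procedure.

```python
import math
from statistics import mean, pstdev

# Hypothetical customers: (annual_spend, visits_per_month, known_segment).
CUSTOMERS = [
    (200, 1, "low"), (250, 2, "low"), (300, 1, "low"), (350, 2, "low"),
    (900, 8, "high"), (950, 9, "high"), (1000, 7, "high"), (1100, 9, "high"),
]
NEW_CUSTOMER = (980, 8)  # hypothetical customer to segment
FEATURES = [c[:2] for c in CUSTOMERS]
LABELS = [c[2] for c in CUSTOMERS]


def standardize(rows, point):
    """Scale features to zero mean and unit variance, based on rows."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]

    def scale(r):
        return tuple((v - m) / s for v, m, s in zip(r, mus, sds))

    return [scale(r) for r in rows], scale(point)


def cluster_vote(z_rows, z_new, iters=10):
    """Two-cluster k-means; the cluster with higher mean spend is 'high'."""
    centroids = [min(z_rows), max(z_rows)]  # deterministic start
    for _ in range(iters):
        groups = [[], []]
        for p in z_rows:
            idx = min((0, 1), key=lambda i: math.dist(p, centroids[i]))
            groups[idx].append(p)
        centroids = [tuple(mean(c) for c in zip(*g)) if g else old
                     for g, old in zip(groups, centroids)]
    high = max((0, 1), key=lambda i: centroids[i][0])
    nearest = min((0, 1), key=lambda i: math.dist(z_new, centroids[i]))
    return "high" if nearest == high else "low"


def pca_vote(z_rows, z_new):
    """Score on the first principal component, which for two standardized
    features lies along (1, sign(r)) / sqrt(2), r being their correlation."""
    r = mean([a * b for a, b in z_rows])
    sign = 1.0 if r >= 0 else -1.0
    score = (z_new[0] + sign * z_new[1]) / math.sqrt(2)
    return "high" if score > 0 else "low"


def stump_vote(rows, labels, new):
    """Depth-one decision tree: the single threshold with fewest errors."""
    best = None  # (errors, feature index, threshold)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            errors = sum((r[f] > t) != (y == "high")
                         for r, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    return "high" if new[f] > t else "low"


z_rows, z_new = standardize(FEATURES, NEW_CUSTOMER)
votes = {
    "cluster analysis": cluster_vote(z_rows, z_new),
    "principal component analysis": pca_vote(z_rows, z_new),
    "decision tree": stump_vote(FEATURES, LABELS, NEW_CUSTOMER),
}
methods_agree = len(set(votes.values())) == 1
print(votes, "agree:", methods_agree)
```

If the three votes disagree, that is a prompt to go back to the data and the domain expert before acting on any single model.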
11. 10. Understand the Business Value of your Analysis Results
When the analytics models are used in the future:
• How much time will the company save?
• By how much will costs be reduced?
• By how much will revenue increase?
• By how much will profitability increase?
• How many more jobs can be completed?
• How many more new customers will buy from you?
• How many fewer errors will be made?
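One way to make these questions concrete is to compare a baseline period with a period in which the model is in use. The sketch below is a minimal example of that arithmetic; every figure in it, including the assumed cost per hour, is a hypothetical placeholder and not a result from the presentation.

```python
# All figures are hypothetical placeholders for illustration only.
BASELINE = {"hours_per_month": 400, "monthly_revenue": 200_000.0,
            "errors_per_month": 120}
WITH_MODEL = {"hours_per_month": 300, "monthly_revenue": 212_000.0,
              "errors_per_month": 90}
COST_PER_HOUR = 50.0  # assumed labour cost per hour


def business_value(before, after, cost_per_hour):
    """Summarize time, cost, revenue, and error differences per month."""
    hours_saved = before["hours_per_month"] - after["hours_per_month"]
    revenue_gain = after["monthly_revenue"] - before["monthly_revenue"]
    return {
        "hours_saved": hours_saved,
        "cost_saved": hours_saved * cost_per_hour,
        "revenue_gain": revenue_gain,
        "revenue_gain_pct": 100 * revenue_gain / before["monthly_revenue"],
        "errors_avoided": before["errors_per_month"] - after["errors_per_month"],
    }


value = business_value(BASELINE, WITH_MODEL, COST_PER_HOUR)
for name, amount in value.items():
    print(f"{name}: {amount}")
```

Reporting results in these terms lets stakeholders judge the model by its effect on the business rather than by accuracy alone.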
12. Carol Hargreaves (PhD)
Founder & Chief Analytics Officer, Business Data Analytics Solutions
• 28 years of experience working in the data analytics field
• PhD in Statistics and a Master of Business Administration (MBA)
• Chief of Business Analytics at the National University of Singapore (ISS)
• Quantitative Methods Manager at Cegedim Strategic Data (IMS)
• Statistical Analyst at Foxtel
• Advanced Analytics Statistician at Aztec (IRI)
• Biostatistician at NHMRC Clinical Trials Centre