Intro to Python: Build a Predictive Model
Introductions
➔ What's your name?
➔ What brought you here today?
➔ What is your programming experience?
We train developers and data scientists through 1x1 mentorship and project-based learning. Guaranteed.
About Thinkful
Learn by Doing
➔ Why is Data Science a thing?
➔ What is Python?
➔ How do we use it with a real-world project?
➔ How do I learn more?
What is a Data Scientist?
“[LinkedIn] was like arriving at a conference
reception and realizing you don’t know anyone. So
you just stand in the corner sipping your drink —
and you probably leave early.”
— LinkedIn Manager, June 2006
Example: LinkedIn 2006
➔ Joined LinkedIn in 2006, when it had only 8M users (450M in 2016)
➔ Started experiments to predict people’s networks
➔ Engineers were dismissive: “you can already import your address book”
Enter: Data Scientist
➔ Frame the question
➔ Collect the raw data
➔ Process the data
➔ Explore the data
➔ Communicate results
The Process: LinkedIn Example
➔ What questions do we want to answer?
◆ Who?
◆ What?
◆ When?
◆ Where?
◆ Why?
◆ How?
Case: Frame the Question
➔ What connections (type and number) lead to higher user engagement?
➔ Which connections do people want to make but currently cannot?
➔ How might we predict these types of connections with limited data from the user?
Case: Frame the Question
➔ What data do we need to answer these questions?
Case: Collect the Data
➔ Connection data (who is connected to whom?)
➔ Demographic data (what is the profile of the connection?)
➔ Engagement data (how do they use the site?)
Case: Collect the Data
➔ How is the data “dirty” and how can we clean it?
Case: Process the Data
➔ User input
➔ Redundancies
➔ Feature changes
➔ Data model changes
Case: Process the Data
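As a rough illustration of cleaning "dirty" user input and redundancies, here is a minimal sketch. The records, field names, and the `normalize_company` and `clean` helpers are all hypothetical and are not taken from the LinkedIn case; the idea is only that free-text fields are normalized first, so that records which mean the same thing can be recognized as duplicates.

```python
# Hypothetical raw records: the field names and values are invented for illustration.
raw_profiles = [
    {"name": "Ada Lovelace", "company": "  Acme Inc. "},
    {"name": "Ada Lovelace", "company": "acme inc"},
    {"name": "Alan Turing", "company": "ACME, Inc."},
]


def normalize_company(text):
    """Lowercase, drop punctuation, and collapse whitespace in free-text input."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def clean(profiles):
    """Normalize each record and drop records that become exact duplicates."""
    seen = set()
    cleaned = []
    for profile in profiles:
        name = profile["name"].strip()
        company = normalize_company(profile["company"])
        key = (name.lower(), company)
        if key not in seen:
            seen.add(key)
            cleaned.append({"name": name, "company": company})
    return cleaned


cleaned_profiles = clean(raw_profiles)
print(cleaned_profiles)
```

The three spellings of the company collapse to one value, and the repeated profile is dropped.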
➔ What are the meaningful patterns in the data?
Case: Explore the Data
➔ Triangle closing
➔ Time Overlaps
➔ Geographic Overlaps
Case: Explore the Data
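Triangle closing can be sketched in a few lines: if two members share a connection but are not connected to each other, suggesting that they connect would close the triangle. The graph and the `suggest_connections` helper below are hypothetical, made up for illustration rather than taken from LinkedIn's actual approach.

```python
from collections import Counter

# Hypothetical connection graph: each member maps to the set of members they are connected to.
connections = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "cy", "dee"},
    "cy": {"ana", "ben", "dee"},
    "dee": {"ben", "cy"},
}


def suggest_connections(graph, member):
    """Rank members not yet connected to `member` by number of shared connections."""
    shared = Counter()
    for friend in graph[member]:
        for friend_of_friend in graph[friend]:
            already_connected = friend_of_friend in graph[member]
            if friend_of_friend != member and not already_connected:
                shared[friend_of_friend] += 1
    return shared.most_common()


print(suggest_connections(connections, "ana"))
```

Here "dee" is suggested to "ana" because they share two connections ("ben" and "cy"). Time and geographic overlaps could be counted in a similar way, given the relevant data.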
➔ How do we communicate this?
➔ To whom?
Case: Communicate Findings
➔ Marketing - sell X more ad space, resulting in X more impressions per day
➔ Product - build X more features
➔ Development - grow our team by X
➔ Sales - attract X more premium accounts
➔ C-Level - more revenue, from 8M to 450M users in 10 years
Case: Communicate Findings
The Result
Python for Programming
➔ Great for Data Science
➔ Robotics
➔ Web Development (Python/Django)
➔ Automation
Let’s Learn Python
➔ Our model is going to be a Decision Tree
➔ Decision Trees predict the most likely outcome based on input
➔ Like a computer building a version of 20 questions
The Model
Decision Trees: Golf?
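A decision tree for "should we play golf?" can be read as a series of nested questions. The sketch below writes one out by hand. The features (`outlook`, `humidity`, `windy`) and every threshold are illustrative assumptions and are not taken from the slide.

```python
def play_golf(outlook, humidity, windy):
    """Hand-written decision tree; every rule here is an illustrative assumption."""
    if outlook == "overcast":
        return True
    if outlook == "sunny":
        # On sunny days the (assumed) deciding question is humidity.
        return humidity <= 70
    # Remaining case: rainy days, where the (assumed) deciding question is wind.
    return not windy


print(play_golf("sunny", 85, False))
print(play_golf("rainy", 60, False))
```

A library such as scikit-learn builds this kind of question chain automatically from labeled examples instead of having a person write the rules.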
➔ We’ll build this model in Colaboratory, a Google-hosted Python notebook
➔ Go to: Colab.research.google.com
➔ Click New Python 3 Notebook
The Notebook
from sklearn import tree
➔ Import the tree module from the scikit-learn (sklearn) Python package
➔ bit.ly/sklearn-python
Code Block 1
X = [[181, 80], [177, 70], [160, 60], [154, 54], [166, 65],
     [190, 90], [175, 64], [177, 70], [159, 55], [171, 75], [181, 85]]
Y = ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'female',
     'male', 'female', 'male']
➔ Load in our seed data
➔ X is a list of inputs; each input is itself a list that contains Height (in cm) and Weight (in kg)
➔ Y is a list of strings that map, in order, to the inputs in X so we can train the model
Code Block 2
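Because each label in Y is matched to an input in X by position, it can help to check that the two lists line up before training. This is a minimal sketch using the seed data from the slide; the variable names `height_cm`, `weight_kg`, and `label_counts` are introduced here for illustration.

```python
from collections import Counter

X = [[181, 80], [177, 70], [160, 60], [154, 54], [166, 65],
     [190, 90], [175, 64], [177, 70], [159, 55], [171, 75], [181, 85]]
Y = ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'female',
     'male', 'female', 'male']

# Every input needs exactly one label.
assert len(X) == len(Y), "X and Y must have the same number of entries"

# Pair each input with its label to see what the model will be trained on.
for (height_cm, weight_kg), label in zip(X, Y):
    print(height_cm, weight_kg, label)

# Count how many examples there are per label.
label_counts = Counter(Y)
print(label_counts)
```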
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)
#print(tree.export_graphviz(clf, out_file=None))
➔ We create an untrained DecisionTreeClassifier and assign it to the variable clf
➔ We fit the decision tree with our X and Y seed data
➔ SKLearn is automatically creating our Decision Tree questions for us (Example: Is height > 177? Yes - Male)
➔ Uncomment the last line and paste the printed string into: webgraphviz.com
Code Block 3
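As an alternative to pasting Graphviz output into a website, the learned questions can also be printed as plain text with scikit-learn's `tree.export_text`. This is a sketch, not part of the original deck; the feature names `height_cm` and `weight_kg` and the fixed `random_state` are choices made here for readability and repeatability.

```python
from sklearn import tree

X = [[181, 80], [177, 70], [160, 60], [154, 54], [166, 65],
     [190, 90], [175, 64], [177, 70], [159, 55], [171, 75], [181, 85]]
Y = ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'female',
     'male', 'female', 'male']

# random_state is fixed here only so repeated runs build the same tree.
clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(X, Y)

# Print the tree's questions as indented text, one threshold test per line.
rules = tree.export_text(clf, feature_names=["height_cm", "weight_kg"])
print(rules)
```

The exact thresholds depend on the data and the scikit-learn version, so they may differ from the "height > 177" example on the slide.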
prediction = clf.predict([[183, 76]])
print(prediction)
➔ Now we give our inputs, in the same format
➔ Height (cm), Weight (kg)
➔ Print our prediction
Code Block 4
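Putting the four code blocks together, a self-contained version might look like the sketch below. The call to `predict_proba` is an addition not shown in the deck; it reports how the training examples in the matching leaf are split between the classes. The fixed `random_state` is also an assumption, added only for repeatability.

```python
from sklearn import tree

X = [[181, 80], [177, 70], [160, 60], [154, 54], [166, 65],
     [190, 90], [175, 64], [177, 70], [159, 55], [171, 75], [181, 85]]
Y = ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'female',
     'male', 'female', 'male']

clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(X, Y)

# predict expects a list of inputs, so a single input is still wrapped in a list.
new_input = [[183, 76]]
prediction = clf.predict(new_input)
probabilities = clf.predict_proba(new_input)

print(prediction[0])
# classes_ gives the label order used for the probability columns.
print(dict(zip(clf.classes_, probabilities[0])))
```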
Our model has a few weaknesses:
➔ Limited inputs
➔ Assumptions
Shortcomings
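One way to see the effect of limited inputs is to hold out part of the seed data and test on it. The sketch below uses scikit-learn's `cross_val_score` with 3 folds; the fold count and `random_state` are arbitrary choices made here, not something the deck prescribes.

```python
from sklearn import tree
from sklearn.model_selection import cross_val_score

X = [[181, 80], [177, 70], [160, 60], [154, 54], [166, 65],
     [190, 90], [175, 64], [177, 70], [159, 55], [171, 75], [181, 85]]
Y = ['male', 'female', 'female', 'female', 'male', 'male', 'male', 'female',
     'male', 'female', 'male']

clf = tree.DecisionTreeClassifier(random_state=0)

# Train on two thirds of the data and score on the remaining third, three times.
scores = cross_val_score(clf, X, Y, cv=3)
print(scores)
print(scores.mean())
```

With only 11 examples, each fold holds just three or four people, so the scores can swing widely. That instability is itself a sign that the model needs more data and more informative features.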
Ways to Learn Data Science
➔ Start with Python and Statistics
➔ Personal Program Manager
➔ Unlimited Q&A Sessions
➔ Student Slack Community
➔ bit.ly/freetrial-ds
Thinkful Two-Week Free Trial
The Student Experience
Marnie Boyer, Thinkful Graduate
Capstone
Wolfgang Hall, Thinkful Graduate
Capstone
➔ bit.ly/tf-event-feedback
Survey