jeroen at strata in ~ $ learn-shell-for-data-science --title
50 reasons to
learn the shell
for doing data science
jeroen at strata in ~ $ learn-shell-for-data-science --speaker
Jeroen Janssens
@jeroenhjanssens
CEO at Data Science Workshops B.V.
Author of Data Science at the Command Line
The shell makes you look
like a 1337 hacker.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 01
jeroen at strata in ~ $ learn-shell-for-data-science --reason 02
When it comes to
hacking, the shell
is indispensable.
Source: Drew Conway
jeroen at strata in ~ $ learn-shell-for-data-science --osemn
Data science is OSEMN:
Obtaining data
Scrubbing data
Exploring data
Modelling data
iNterpreting data
Source: Mason & Wiggins (2010)
jeroen at strata in ~ $ learn-shell-for-data-science --reason 03
$ pip install scikit-learn
Requirement already satisfied: scikit-learn in /usr/lib/python3.6/site-packages
$ cd ~/.ssh
$ ssh-keygen
$ cat ~/.ssh/id_rsa.pub | pbcopy
$ curl 'http://api.citybik.es/v2/networks/santander-cycles' |
> jq '.network.stations[].free_bikes' |
> paste -sd+ | bc
9525
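The last pipeline in this session sums one number per line by joining the lines with `+` and evaluating the result. A minimal offline sketch of just that summing step follows. The function name `sum_lines` and the sample numbers are made up for illustration, and shell arithmetic stands in for `bc` so the sketch needs no extra tool (integers only).

```shell
# sum_lines: add up integers arriving one per line on standard input.
# Hypothetical helper; shell arithmetic replaces the bc step from the slide.
sum_lines() {
    # paste -s joins all lines into one, -d+ puts "+" between them.
    # The trailing "-" names stdin explicitly, which BSD/macOS paste requires.
    echo $(( $(paste -sd+ -) ))
}

# Made-up sample values instead of live free_bikes counts.
total=$(printf '%s\n' 3 12 7 | sum_lines)
echo "$total"
```

For decimal numbers, piping the joined expression to `bc` as in the session above is the better fit.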
The shell, with its read-eval-print-loop,
enables you to play with your data.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 04
The shell is very close to the
filesystem, which makes it very
convenient to work with files on a
large scale.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 05
Velociraptors.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 06
Plenty of great resources
are available to learn the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 07
There's a fantastic book
about using the shell
for doing data science.
Read it for free at:
datascienceatthecommandline.com
jeroen at strata in ~ $ learn-shell-for-data-science --reason 08
The shell has a
vast and interesting history.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 09
Like wine, the shell takes time to be
appreciated. Good thing the shell also
ages like wine.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 10
There's always something new to learn
about the shell and its many tools.
And learning is fun.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 11
Docker containers are great
for safely learning the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 12
The shell gives you access to
man pages, which are like an
offline Stack Overflow.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 13
explainshell.com explains a given
command line by matching each
argument to the relevant help text in
the man page.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 14
The shell is free.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 15
The shell doesn't care whether a tool
has been implemented in Bash, C, Go,
Java, JavaScript, Lisp, Perl, Python, R,
Rust, or Scala.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 16
You can
customize
the hell
out of
the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 17
The shell uses text as the universal
interface, which enables tools from all
over the world to work together and
solve problems.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 18
Most command-line tools do one thing
and do it well.
The shell is there to let these tools
work together in various ways.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 19
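To make "do one thing and do it well" concrete, here is a small sketch that chains six single-purpose tools into a word counter. The function name `top_words` and the sample sentence are assumptions for illustration only.

```shell
# top_words N: read text on stdin, print the N most frequent words as word=count.
# Hypothetical helper; each stage does exactly one job.
top_words() {
    tr -s ' ' '\n' |      # split into one word per line
        sort |            # bring identical words together
        uniq -c |         # count each run of identical words
        sort -rn |        # highest count first
        head -n "$1" |    # keep the top N
        awk '{print $2 "=" $1}'
}

result=$(printf '%s\n' 'to be or not to be that is the question to ask' | top_words 2)
echo "$result"
```

None of these tools knows about the others. The pipe is the only glue, and the order of words with equal counts is left to `sort`.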
The shell never bothers you
about software updates.
Unless you want it to.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 20
The shell gives you great
control over your system.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 21
When shit hits the fan with git,
the shell is the only interface
that can clean up the mess.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 22
You can also program in the shell.
A simple for-loop can do miracles.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 23
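A minimal sketch of such a for-loop, assuming a directory of small CSV files with one header line each. The file names and contents are made up, and everything happens in a temporary directory.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
printf 'id,value\n1,10\n2,20\n' > "$workdir/a.csv"
printf 'id,value\n1,30\n' > "$workdir/b.csv"

# Report the number of data rows (lines minus the header) for every CSV file.
report=$(
    for f in "$workdir"/*.csv; do
        rows=$(( $(wc -l < "$f") - 1 ))
        echo "$(basename "$f"): $rows rows"
    done
)
echo "$report"
rm -r "$workdir"
```

The same loop body works unchanged whether the glob matches two files or two thousand.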
Want to parallelize or distribute your
task to multiple cores or machines?
Use the shell with a pinch of parallel.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 24
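The slide refers to the parallel tool. Because that tool may not be installed everywhere, the sketch below uses `xargs -P` as a stand-in to show the same idea on a single machine: the same command run on many inputs, several at a time. The task (squaring numbers) is a made-up placeholder.

```shell
# Square the numbers 1..8 using up to 4 worker processes at once.
# NOTE: xargs -P is a stand-in here, not the parallel tool named on the slide.
# Workers finish in any order, so the output is sorted for a stable result.
squares=$(seq 1 8 |
    xargs -P 4 -n 1 sh -c 'echo $(( $1 * $1 ))' _ |
    sort -n)
echo "$squares"
```

Distributing work across several machines is beyond what `xargs` offers and is where a dedicated tool earns its place.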
The shell: come for the tools,
stay for the environment.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 25
By default, the shell comes with many
great tools such as find, grep, and cut.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 26
Package managers such as apt-get,
brew, and pacman make it a pleasure to
install additional command-line tools.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 27
New tools are being developed
every day for the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 28
The shell keeps a history.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 29
You can easily extend the shell with
your own tools, making you a more
efficient and effective data scientist.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 30
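One way to read "your own tools" is a shell function of a line or two. The sketch below defines a hypothetical helper, `colnames`, which assumes a simple CSV header without quoted commas.

```shell
# colnames FILE: print the column names of a CSV file, one per line.
# Hypothetical helper; assumes the header contains no quoted commas.
colnames() {
    head -n 1 "$1" | tr ',' '\n'
}

# Try it on a made-up sample file.
sample=$(mktemp)
printf 'name,age,city\nAda,36,London\n' > "$sample"
cols=$(colnames "$sample")
echo "$cols"
rm "$sample"
```

Placed in a shell startup file, or saved as an executable script on the PATH, such a helper behaves like any other command-line tool.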
The shell lets you quickly find out
things like: the size of a directory, the
encoding of a CSV file, and the
resolution of an image.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 31
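A rough sketch of two of these quick checks, run against a made-up file in a temporary directory. The encoding check assumes the `file` utility is installed and is skipped otherwise. Image resolution needs an extra tool (ImageMagick's `identify` is one option) and is left out here.

```shell
workdir=$(mktemp -d)
printf 'name,age\nAda,36\n' > "$workdir/people.csv"

# Size of a directory in kilobytes (-s: summary only, -k: 1024-byte units).
size_kb=$(du -sk "$workdir" | cut -f1)
echo "directory size: ${size_kb} KB"

# Encoding of a CSV file, only if the file utility is available.
if command -v file >/dev/null 2>&1; then
    file --mime-encoding "$workdir/people.csv"
fi
rm -r "$workdir"
```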
The shell lets you query databases,
access APIs, open remote sheets, and
even scrape websites.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 32
With tools like
csvkit, jq, and xmlstarlet,
you can easily wrangle
CSV, JSON, and XML in the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 33
csvsql allows you to perform
SQL queries directly on
CSV files in the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 34
telnet towel.blinkenlights.nl
lets you watch Star Wars Episode IV.
Use the shell, Luke.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 35
The shell isn’t just available on UNIX
machines and supercomputers. It can
also be found on macOS, Raspberry Pi,
and even Windows 10.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 36
Sometimes the shell outperforms
fancy big data technologies.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 37
You can easily invoke Python and R
from the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 38
Want to continue working in your
favourite programming language or
statistical environment? The shell is
totally cool with that.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 39
You can easily invoke the shell from
Jupyter Notebook and RStudio.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 40
$ echo data science at the command line | cowsay
 __________________________________
< data science at the command line >
 ----------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
jeroen at strata in ~ $ learn-shell-for-data-science --reason 41
These days, many
frontend developers
also use the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 42
Invoke sudo and
the shell will make
you a sandwich.
Source: XKCD
Note: Do not try on frontend developers
jeroen at strata in ~ $ learn-shell-for-data-science --reason 43
You can automate just about
everything using the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 44
Good luck managing a gazillion
instances on AWS, Azure, and Google
Cloud using the mouse.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 45
The shell often requires less typing
than a programming language.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 46
The shell allows you to rename
750 files with just three lines of code.
Or one, if you have the right tool.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 47
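A sketch of what those three lines could look like, assuming the task is to change a file extension. The 750 sample files are created in a temporary directory, and their names are made up.

```shell
workdir=$(mktemp -d)
# Create 750 empty sample files: file_1.txt ... file_750.txt
for i in $(seq 1 750); do : > "$workdir/file_$i.txt"; done

# The three-line rename: change every .txt extension to .csv.
for f in "$workdir"/*.txt; do
    mv "$f" "${f%.txt}.csv"
done

# Count the results before cleaning up.
renamed=$(find "$workdir" -name '*.csv' | wc -l)
leftover=$(find "$workdir" -name '*.txt' | wc -l)
echo "renamed: $renamed, left over: $leftover"
rm -r "$workdir"
```

The `${f%.txt}` expansion strips the old suffix. Dedicated renaming tools can shorten this to a single line, though their syntax differs between systems.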
Your wrists will thank you
for using the shell.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 48
jeroen at strata in ~ $ learn-shell-for-data-science --reason 49
The shell has been around for almost
50 years, and probably will be around
for the rest of your career.
Because
Tim
says so.
jeroen at strata in ~ $ learn-shell-for-data-science --reason 50
jeroen at strata in ~ $ learn-shell-for-data-science --thank-you
Jeroen Janssens
@jeroenhjanssens
CEO at Data Science Workshops B.V.
Author of Data Science at the Command Line
