SlideShare ist ein Scribd-Unternehmen logo
1 von 31
Downloaden Sie, um offline zu lesen
The Linux command line Writing and running Python code Next meetings
Big Data and Automated Content Analysis
Week 1 – Wednesday
»First steps in the VM«
Damian Trilling
d.c.trilling@uva.nl
@damian0604
www.damiantrilling.net
Afdeling Communicatiewetenschap
Universiteit van Amsterdam
8 February 2017
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Today
1 The Linux command line
2 Writing and running Python code
3 Next meetings
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
When point-and-click doesn’t help you further:
The Linux command line
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
The tools
General idea
Whereever possbile, we use tools that are
• platform-independent
• free (as in beer and as in speech)
• open source
.
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
The tools
General idea
Whereever possbile, we use tools that are
• platform-independent
• free (as in beer and as in speech)
• open source
To make things easier, we work with a virtual machine in which
everyone runs the same Linux version.
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Let’s switch to Linux!
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Tools: The linux command line
a.k.a. the terminal, shell or, more specifically, bash
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Tools: The linux command line
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Tools: The linux command line
Why?
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Tools: The linux command line
Why?
• Direct access to your computer’s functions
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Tools: The linux command line
Why?
• Direct access to your computer’s functions
• In contrast to point-and-click programs, command line
programs can easily be linked to each other, scripted, . . .
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Tools: The linux command line
Why?
• Direct access to your computer’s functions
• In contrast to point-and-click programs, command line
programs can easily be linked to each other, scripted, . . .
• Suitable for handling even huge files
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Tools: The linux command line
Why?
• Direct access to your computer’s functions
• In contrast to point-and-click programs, command line
programs can easily be linked to each other, scripted, . . .
• Suitable for handling even huge files
• You simply cannot open them in many GUI programs
• . . . or it takes ages
• The command line allows you to do such things without
problems
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Tools: The linux command line
Why?
• Direct access to your computer’s functions
• In contrast to point-and-click programs, command line
programs can easily be linked to each other, scripted, . . .
• Suitable for handling even huge files
• You simply cannot open them in many GUI programs
• . . . or it takes ages
• The command line allows you to do such things without
problems
• It is reproducible (ever tried to explain to your parents on the
phone where they have to click?)
Big Data and Automated Content Analysis Damian Trilling
There are endless tutorials, cheat
sheets, videos . . . online. Google it!
The Linux command line Writing and running Python code Next meetings
Exercise
Take the book.
Follow the instructions in Chapter 2.
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
A language, not a program:
Python
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Python
What?
• A language, not a specific program
• Huge advantage: flexibility, portability
• One of the languages for data analysis. (The other one is R.)
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Python
What?
• A language, not a specific program
• Huge advantage: flexibility, portability
• One of the languages for data analysis. (The other one is R.)
But Python is more flexible—the original version of Dropbox was written in Python. I’d say: R for
numbers, Python for text and messy stuff. But that’s just my personal view.
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Python
What?
• A language, not a specific program
• Huge advantage: flexibility, portability
• One of the languages for data analysis. (The other one is R.)
Which version?
We use Python 3.
http://www.google.com or http://www.stackexchange.com still offer a lot
of Python2-code, but that can easily be adapted. Most notable difference: In
Python 2, you write print "Hi", this has changed to print ("Hi")
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
If it’s not a program, how do you work with it?
Interactive mode
• Just type python3 on the command line, and you can start
entering Python commands (You can leave again by entering quit())
• Great for quick try-outs, but you cannot even save your code
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
If it’s not a program, how do you work with it?
Interactive mode
• Just type python3 on the command line, and you can start
entering Python commands (You can leave again by entering quit())
• Great for quick try-outs, but you cannot even save your code
An editor of your choice
• Write your program in any text editor, save it as myprog.py
• and run it from the command line with ./myprog.py or
python3 myprog.py
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
If it’s not a program, how do you start it?
An IDE (Integrated Development Environment)
• Provides an interface
• Both quick interactive try-outs and writing larger programs
• We use spyder, which looks a bit like RStudio (and to some
extent like Stata)
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
If it’s not a program, how do you start it?
An IDE (Integrated Development Environment)
• Provides an interface
• Both quick interactive try-outs and writing larger programs
• We use spyder, which looks a bit like RStudio (and to some
extent like Stata)
Jupyter Notebook
• Runs in your browser
• Stores results and text along with code
• Great for interactive playing with data and for sharing results
Big Data and Automated Content Analysis Damian Trilling
Let’s start up a Python environment and write a
Hello-world-program!
The Linux command line Writing and running Python code Next meetings
Start playing!
Exercises 1. Run a program that greets you.
The code for this is
1 print("Hello world")
After that, do some calculations. You can do that in a similar way:
1 a=2
2 print(a*3)
Just play around.
Additional ressources
Codecademy course on Python
https://www.codecademy.com/learn/python
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Next meetings
Big Data and Automated Content Analysis Damian Trilling
The Linux command line Writing and running Python code Next meetings
Week 2: Getting started with Python
Monday, 13–2
Lecture.
Chapter 4.
Wednesday, 15–2
Lab Session.
Exercise A1.
Big Data and Automated Content Analysis Damian Trilling

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

BDACA1516s2 - Lecture6
BDACA1516s2 - Lecture6BDACA1516s2 - Lecture6
BDACA1516s2 - Lecture6
 
BD-ACA week3a
BD-ACA week3aBD-ACA week3a
BD-ACA week3a
 
BDACA - Lecture5
BDACA - Lecture5BDACA - Lecture5
BDACA - Lecture5
 
BDACA1516s2 - Lecture8
BDACA1516s2 - Lecture8BDACA1516s2 - Lecture8
BDACA1516s2 - Lecture8
 
BDACA1617s2 - Lecture7
BDACA1617s2 - Lecture7BDACA1617s2 - Lecture7
BDACA1617s2 - Lecture7
 
BD-ACA week5
BD-ACA week5BD-ACA week5
BD-ACA week5
 
From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?
 
Analyzing Stack Overflow - Problem
Analyzing Stack Overflow - ProblemAnalyzing Stack Overflow - Problem
Analyzing Stack Overflow - Problem
 
Stack Overflow slides Data Analytics
Stack Overflow slides Data Analytics Stack Overflow slides Data Analytics
Stack Overflow slides Data Analytics
 
Introduction to Python for Data Science and Machine Learning
Introduction to Python for Data Science and Machine Learning Introduction to Python for Data Science and Machine Learning
Introduction to Python for Data Science and Machine Learning
 
QALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebQALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic Web
 
Python for Big Data Analytics
Python for Big Data AnalyticsPython for Big Data Analytics
Python for Big Data Analytics
 
Python training
Python trainingPython training
Python training
 
NLP and LSA getting started
NLP and LSA getting startedNLP and LSA getting started
NLP and LSA getting started
 
Python programming l2
Python programming l2Python programming l2
Python programming l2
 
Building a Dynamic Bidding system for a location based Display advertising Pl...
Building a Dynamic Bidding system for a location based Display advertising Pl...Building a Dynamic Bidding system for a location based Display advertising Pl...
Building a Dynamic Bidding system for a location based Display advertising Pl...
 
Basics of IR: Web Information Systems class
Basics of IR: Web Information Systems class Basics of IR: Web Information Systems class
Basics of IR: Web Information Systems class
 
PyCon 2013 - Experiments in data mining, entity disambiguation and how to thi...
PyCon 2013 - Experiments in data mining, entity disambiguation and how to thi...PyCon 2013 - Experiments in data mining, entity disambiguation and how to thi...
PyCon 2013 - Experiments in data mining, entity disambiguation and how to thi...
 
Question Answering - Application and Challenges
Question Answering - Application and ChallengesQuestion Answering - Application and Challenges
Question Answering - Application and Challenges
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 

Andere mochten auch

Real Time Analytics for Big Data a Twitter Case Study
Real Time Analytics for Big Data a Twitter Case StudyReal Time Analytics for Big Data a Twitter Case Study
Real Time Analytics for Big Data a Twitter Case Study
Nati Shalom
 
Gephi Tutorial Visualization
Gephi Tutorial VisualizationGephi Tutorial Visualization
Gephi Tutorial Visualization
Gephi Consortium
 

Andere mochten auch (14)

Analyzing social media with Python and other tools (2/4)
Analyzing social media with Python and other tools (2/4) Analyzing social media with Python and other tools (2/4)
Analyzing social media with Python and other tools (2/4)
 
ірімшіктің түрлері
ірімшіктің түрлеріірімшіктің түрлері
ірімшіктің түрлері
 
BDACA1617s2 - Lecture 1
BDACA1617s2 - Lecture 1BDACA1617s2 - Lecture 1
BDACA1617s2 - Lecture 1
 
NAPL Quick Tips: A Primer on Hydrocarbon Sheens
NAPL Quick Tips: A Primer on Hydrocarbon SheensNAPL Quick Tips: A Primer on Hydrocarbon Sheens
NAPL Quick Tips: A Primer on Hydrocarbon Sheens
 
Illuminate Ad Effectiveness
Illuminate Ad EffectivenessIlluminate Ad Effectiveness
Illuminate Ad Effectiveness
 
U3 t1 aa1_marcos_gaspar
U3 t1 aa1_marcos_gasparU3 t1 aa1_marcos_gaspar
U3 t1 aa1_marcos_gaspar
 
International Journal of Solid State Materials vol 2 issue 1
International Journal of Solid State Materials vol 2 issue 1International Journal of Solid State Materials vol 2 issue 1
International Journal of Solid State Materials vol 2 issue 1
 
Real Time Analytics for Big Data a Twitter Case Study
Real Time Analytics for Big Data a Twitter Case StudyReal Time Analytics for Big Data a Twitter Case Study
Real Time Analytics for Big Data a Twitter Case Study
 
Real Time Analytics for Big Data - A twitter inspired case study
Real Time Analytics for Big Data - A twitter inspired case studyReal Time Analytics for Big Data - A twitter inspired case study
Real Time Analytics for Big Data - A twitter inspired case study
 
impact of tv advertisment on consumer buying behaviour
impact of tv advertisment on consumer buying behaviourimpact of tv advertisment on consumer buying behaviour
impact of tv advertisment on consumer buying behaviour
 
Gephi Tutorial Visualization
Gephi Tutorial VisualizationGephi Tutorial Visualization
Gephi Tutorial Visualization
 
Twitter bootstrap tutorial
Twitter bootstrap tutorialTwitter bootstrap tutorial
Twitter bootstrap tutorial
 
Communication in health and social care
Communication in health and social careCommunication in health and social care
Communication in health and social care
 
Parlamento e governo
Parlamento e governoParlamento e governo
Parlamento e governo
 

Ähnlich wie BDACA1617s2 - Tutorial 1

Basic Python Introduction Lecture 1.pptx
Basic Python Introduction Lecture 1.pptxBasic Python Introduction Lecture 1.pptx
Basic Python Introduction Lecture 1.pptx
Aditya Patel
 
Pythonanditsapplications 161121160425
Pythonanditsapplications 161121160425Pythonanditsapplications 161121160425
Pythonanditsapplications 161121160425
Sapna Tyagi
 
Introduction to Python.pdf
Introduction to Python.pdfIntroduction to Python.pdf
Introduction to Python.pdf
Rahul Mogal
 

Ähnlich wie BDACA1617s2 - Tutorial 1 (20)

BDACA - Tutorial1
BDACA - Tutorial1BDACA - Tutorial1
BDACA - Tutorial1
 
MODULE 1.pptx
MODULE 1.pptxMODULE 1.pptx
MODULE 1.pptx
 
Lecture 1.pptx
Lecture 1.pptxLecture 1.pptx
Lecture 1.pptx
 
Python Programming Draft PPT.pptx
Python Programming Draft PPT.pptxPython Programming Draft PPT.pptx
Python Programming Draft PPT.pptx
 
Python basic
Python basicPython basic
Python basic
 
Py4 inf 01-intro
Py4 inf 01-introPy4 inf 01-intro
Py4 inf 01-intro
 
Basic Python Introduction Lecture 1.pptx
Basic Python Introduction Lecture 1.pptxBasic Python Introduction Lecture 1.pptx
Basic Python Introduction Lecture 1.pptx
 
Python Introduction its a oop language and easy to use
Python Introduction its a oop language and easy to usePython Introduction its a oop language and easy to use
Python Introduction its a oop language and easy to use
 
Python and its Applications
Python and its ApplicationsPython and its Applications
Python and its Applications
 
Pythonanditsapplications 161121160425
Pythonanditsapplications 161121160425Pythonanditsapplications 161121160425
Pythonanditsapplications 161121160425
 
Introduction to python for Beginners
Introduction to python for Beginners Introduction to python for Beginners
Introduction to python for Beginners
 
intro.pptx (1).pdf
intro.pptx (1).pdfintro.pptx (1).pdf
intro.pptx (1).pdf
 
Introduction python
Introduction pythonIntroduction python
Introduction python
 
Python tutorial for beginners - Tib academy
Python tutorial for beginners - Tib academyPython tutorial for beginners - Tib academy
Python tutorial for beginners - Tib academy
 
python programming.pptx
python programming.pptxpython programming.pptx
python programming.pptx
 
Python for DevOps use
Python for DevOps usePython for DevOps use
Python for DevOps use
 
Python Tutorial | Python Programming Language
Python Tutorial | Python Programming LanguagePython Tutorial | Python Programming Language
Python Tutorial | Python Programming Language
 
Phython Programming Language
Phython Programming LanguagePhython Programming Language
Phython Programming Language
 
Introduction to Python.pdf
Introduction to Python.pdfIntroduction to Python.pdf
Introduction to Python.pdf
 
Simple calulator using GUI tkinter.pptx
Simple calulator using GUI tkinter.pptxSimple calulator using GUI tkinter.pptx
Simple calulator using GUI tkinter.pptx
 

Mehr von Department of Communication Science, University of Amsterdam

Mehr von Department of Communication Science, University of Amsterdam (18)

BDACA - Lecture8
BDACA - Lecture8BDACA - Lecture8
BDACA - Lecture8
 
BDACA - Lecture7
BDACA - Lecture7BDACA - Lecture7
BDACA - Lecture7
 
BDACA - Lecture6
BDACA - Lecture6BDACA - Lecture6
BDACA - Lecture6
 
BDACA - Tutorial5
BDACA - Tutorial5BDACA - Tutorial5
BDACA - Tutorial5
 
BDACA - Lecture4
BDACA - Lecture4BDACA - Lecture4
BDACA - Lecture4
 
BDACA - Lecture3
BDACA - Lecture3BDACA - Lecture3
BDACA - Lecture3
 
BDACA - Lecture2
BDACA - Lecture2BDACA - Lecture2
BDACA - Lecture2
 
BDACA - Lecture1
BDACA - Lecture1BDACA - Lecture1
BDACA - Lecture1
 
BDACA1617s2 - Lecture6
BDACA1617s2 - Lecture6BDACA1617s2 - Lecture6
BDACA1617s2 - Lecture6
 
BDACA1617s2 - Lecture5
BDACA1617s2 - Lecture5BDACA1617s2 - Lecture5
BDACA1617s2 - Lecture5
 
Media diets in an age of apps and social media: Dealing with a third layer of...
Media diets in an age of apps and social media: Dealing with a third layer of...Media diets in an age of apps and social media: Dealing with a third layer of...
Media diets in an age of apps and social media: Dealing with a third layer of...
 
Conceptualizing and measuring news exposure as network of users and news items
Conceptualizing and measuring news exposure as network of users and news itemsConceptualizing and measuring news exposure as network of users and news items
Conceptualizing and measuring news exposure as network of users and news items
 
Data Science: Case "Political Communication 2/2"
Data Science: Case "Political Communication 2/2"Data Science: Case "Political Communication 2/2"
Data Science: Case "Political Communication 2/2"
 
Data Science: Case "Political Communication 1/2"
Data Science: Case "Political Communication 1/2"Data Science: Case "Political Communication 1/2"
Data Science: Case "Political Communication 1/2"
 
BDACA1516s2 - Lecture7
BDACA1516s2 - Lecture7BDACA1516s2 - Lecture7
BDACA1516s2 - Lecture7
 
BDACA1516s2 - Lecture4
 BDACA1516s2 - Lecture4 BDACA1516s2 - Lecture4
BDACA1516s2 - Lecture4
 
BDACA1516s2 - Lecture1
BDACA1516s2 - Lecture1BDACA1516s2 - Lecture1
BDACA1516s2 - Lecture1
 
Should we worry about filter bubbles?
Should we worry about filter bubbles?Should we worry about filter bubbles?
Should we worry about filter bubbles?
 

Kürzlich hochgeladen

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 

Kürzlich hochgeladen (20)

Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 

BDACA1617s2 - Tutorial 1

  • 1. The Linux command line Writing and running Python code Next meetings Big Data and Automated Content Analysis Week 1 – Wednesday »First steps in the VM« Damian Trilling d.c.trilling@uva.nl @damian0604 www.damiantrilling.net Afdeling Communicatiewetenschap Universiteit van Amsterdam 8 February 2017 Big Data and Automated Content Analysis Damian Trilling
  • 2. The Linux command line Writing and running Python code Next meetings Today 1 The Linux command line 2 Writing and running Python code 3 Next meetings Big Data and Automated Content Analysis Damian Trilling
  • 3. The Linux command line Writing and running Python code Next meetings When point-and-click doesn’t help you further: The Linux command line Big Data and Automated Content Analysis Damian Trilling
  • 4. The Linux command line Writing and running Python code Next meetings The tools General idea Whereever possbile, we use tools that are • platform-independent • free (as in beer and as in speech) • open source . Big Data and Automated Content Analysis Damian Trilling
  • 5. The Linux command line Writing and running Python code Next meetings The tools General idea Whereever possbile, we use tools that are • platform-independent • free (as in beer and as in speech) • open source To make things easier, we work with a virtual machine in which everyone runs the same Linux version. Big Data and Automated Content Analysis Damian Trilling
  • 6.
  • 7. The Linux command line Writing and running Python code Next meetings Let’s switch to Linux! Big Data and Automated Content Analysis Damian Trilling
  • 8. The Linux command line Writing and running Python code Next meetings Tools: The linux command line a.k.a. the terminal, shell or, more specifically, bash Big Data and Automated Content Analysis Damian Trilling
  • 9. The Linux command line Writing and running Python code Next meetings Tools: The linux command line Big Data and Automated Content Analysis Damian Trilling
  • 10. The Linux command line Writing and running Python code Next meetings Tools: The linux command line Why? Big Data and Automated Content Analysis Damian Trilling
  • 11. The Linux command line Writing and running Python code Next meetings Tools: The linux command line Why? • Direct access to your computer’s functions Big Data and Automated Content Analysis Damian Trilling
  • 12. The Linux command line Writing and running Python code Next meetings Tools: The linux command line Why? • Direct access to your computer’s functions • In contrast to point-and-click programs, command line programs can easily be linked to each other, scripted, . . . Big Data and Automated Content Analysis Damian Trilling
  • 13. The Linux command line Writing and running Python code Next meetings Tools: The linux command line Why? • Direct access to your computer’s functions • In contrast to point-and-click programs, command line programs can easily be linked to each other, scripted, . . . • Suitable for handling even huge files Big Data and Automated Content Analysis Damian Trilling
  • 14. The Linux command line Writing and running Python code Next meetings Tools: The linux command line Why? • Direct access to your computer’s functions • In contrast to point-and-click programs, command line programs can easily be linked to each other, scripted, . . . • Suitable for handling even huge files • You simply cannot open them in many GUI programs • . . . or it takes ages • The command line allows you to do such things without problems Big Data and Automated Content Analysis Damian Trilling
  • 15. The Linux command line Writing and running Python code Next meetings Tools: The linux command line Why? • Direct access to your computer’s functions • In contrast to point-and-click programs, command line programs can easily be linked to each other, scripted, . . . • Suitable for handling even huge files • You simply cannot open them in many GUI programs • . . . or it takes ages • The command line allows you to do such things without problems • It is reproducible (ever tried to explain to your parents on the phone where they have to click?) Big Data and Automated Content Analysis Damian Trilling
  • 16. There are endless tutorials, cheat sheets, videos . . . online. Google it!
  • 17. The Linux command line Writing and running Python code Next meetings Exercise Take the book. Follow the instructions in Chapter 2. Big Data and Automated Content Analysis Damian Trilling
  • 18. The Linux command line Writing and running Python code Next meetings A language, not a program: Python Big Data and Automated Content Analysis Damian Trilling
  • 19. The Linux command line Writing and running Python code Next meetings Python What? • A language, not a specific program • Huge advantage: flexibility, portability • One of the languages for data analysis. (The other one is R.) Big Data and Automated Content Analysis Damian Trilling
  • 20. The Linux command line Writing and running Python code Next meetings Python What? • A language, not a specific program • Huge advantage: flexibility, portability • One of the languages for data analysis. (The other one is R.) But Python is more flexible—the original version of Dropbox was written in Python. I’d say: R for numbers, Python for text and messy stuff. But that’s just my personal view. Big Data and Automated Content Analysis Damian Trilling
  • 21. The Linux command line Writing and running Python code Next meetings Python What? • A language, not a specific program • Huge advantage: flexibility, portability • One of the languages for data analysis. (The other one is R.) Which version? We use Python 3. http://www.google.com or http://www.stackexchange.com still offer a lot of Python2-code, but that can easily be adapted. Most notable difference: In Python 2, you write print "Hi", this has changed to print ("Hi") Big Data and Automated Content Analysis Damian Trilling
  • 22. The Linux command line Writing and running Python code Next meetings If it’s not a program, how do you work with it? Interactive mode • Just type python3 on the command line, and you can start entering Python commands (You can leave again by entering quit()) • Great for quick try-outs, but you cannot even save your code Big Data and Automated Content Analysis Damian Trilling
  • 23. The Linux command line Writing and running Python code Next meetings If it’s not a program, how do you work with it? Interactive mode • Just type python3 on the command line, and you can start entering Python commands (You can leave again by entering quit()) • Great for quick try-outs, but you cannot even save your code An editor of your choice • Write your program in any text editor, save it as myprog.py • and run it from the command line with ./myprog.py or python3 myprog.py Big Data and Automated Content Analysis Damian Trilling
  • 24. The Linux command line Writing and running Python code Next meetings If it’s not a program, how do you start it? An IDE (Integrated Development Environment) • Provides an interface • Both quick interactive try-outs and writing larger programs • We use spyder, which looks a bit like RStudio (and to some extent like Stata) Big Data and Automated Content Analysis Damian Trilling
  • 25. The Linux command line Writing and running Python code Next meetings If it’s not a program, how do you start it? An IDE (Integrated Development Environment) • Provides an interface • Both quick interactive try-outs and writing larger programs • We use spyder, which looks a bit like RStudio (and to some extent like Stata) Jupyter Notebook • Runs in your browser • Stores results and text along with code • Great for interactive playing with data and for sharing results Big Data and Automated Content Analysis Damian Trilling
  • 26.
  • 27.
  • 28. Let’s start up a Python environment and write a Hello-world-program!
  • 29. The Linux command line Writing and running Python code Next meetings Start playing! Exercises 1. Run a program that greets you. The code for this is 1 print("Hello world") After that, do some calculations. You can do that in a similar way: 1 a=2 2 print(a*3) Just play around. Additional ressources Codecademy course on Python https://www.codecademy.com/learn/python Big Data and Automated Content Analysis Damian Trilling
  • 30. The Linux command line Writing and running Python code Next meetings Next meetings Big Data and Automated Content Analysis Damian Trilling
  • 31. The Linux command line Writing and running Python code Next meetings Week 2: Getting started with Python Monday, 13–2 Lecture. Chapter 4. Wednesday, 15–2 Lab Session. Exercise A1. Big Data and Automated Content Analysis Damian Trilling