© 2015 Blue Camphor Technologies (P) Ltd. www.skillspeed.com Slide 1
Python for Big Data Analytics
Session Objectives
This session will help you to understand:
ᗍ Introduction to Python
ᗍ Web Scraping Use Case
ᗍ Introduction to Big Data
ᗍ Getting your doubts cleared
What is Python?
ᗍ Python is a general-purpose, high-level programming language designed to be easy to read and simple to implement
ᗍ Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development
ᗍ Python supports modules and packages, which encourages program modularity (subdividing a program into separate sub-programs) and code reuse
ᗍ It is comparable to Perl and Ruby, with differences in syntax and in how its object-oriented features are designed
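The points above can be sketched in a few lines. This is only an illustration using the standard library; the function name `word_frequencies` and the sample sentence are assumptions made for the example, not part of the original deck.

```python
# Illustrative sketch: the function name and sample text are assumptions.
from collections import Counter  # code reuse: import a standard-library module


def word_frequencies(text):
    """Count how often each lowercase word occurs in text."""
    words = text.lower().split()   # built-in list of strings
    return dict(Counter(words))    # built-in dict mapping word -> count


value = 42            # the name is bound to an int ...
value = "forty-two"   # ... and later rebound to a str (dynamic typing)

counts = word_frequencies("big data needs big tools")
print(counts["big"])
```

No type declarations are needed anywhere, and the counting logic comes from an imported module rather than being rewritten by hand.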
What is Python? (Cont’d)
Python has an object-oriented structure. It supports:
ᗍ Polymorphism (the slide diagram lists static polymorphism and runtime polymorphism)
ᗍ Multiple inheritance (diagram: Class A, Class B, Class C)
ᗍ Operator overloading: the operator '+' gives 5 + 5 = 10 for numbers and Skill + Speed = SkillSpeed for strings
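A small sketch of these object-oriented features follows. The classes `Vector`, `Walker`, `Swimmer` and `Duck` are hypothetical names chosen for illustration; only the two `+` examples come from the slide.

```python
# The two '+' examples from the slide: same operator, different types.
print(5 + 5)              # integer addition
print("Skill" + "Speed")  # string concatenation


class Vector:
    """Hypothetical 2-D vector that overloads '+' through __add__."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)


class Walker:
    def move(self):
        return "walk"


class Swimmer:
    def move(self):
        return "swim"

    def dive(self):
        return "dive"


class Duck(Walker, Swimmer):
    """Multiple inheritance: Duck has two parent classes."""


# Runtime polymorphism: the method that runs depends on the object's class.
for creature in (Walker(), Swimmer(), Duck()):
    print(type(creature).__name__, creature.move())
```

`Duck` inherits `move` from `Walker` because `Walker` is listed first among its parents, and it still gets `dive` from `Swimmer`.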
Get Started with Python
Why Python?
ᗍ Good for text processing
ᗍ Generates HTML content
ᗍ Can be extended in C and C++ (diagram labels: Your C++ Program, Script.py, CPython Interpreter)
ᗍ Clear syntax
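As a rough illustration of the text-processing and HTML-generation points, the sketch below turns comma-separated text lines into an HTML table. The function name `rows_to_html` and the sample product lines are assumptions made for this example.

```python
from html import escape  # standard-library helper for safe HTML output


def rows_to_html(lines):
    """Turn 'name, price' text lines into an HTML table string."""
    rows = []
    for line in lines:
        # Split on the first comma only, then trim surrounding whitespace.
        name, price = (field.strip() for field in line.split(",", 1))
        rows.append(
            f"<tr><td>{escape(name)}</td><td>{escape(price)}</td></tr>"
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"


# Hypothetical input lines, purely for demonstration.
print(rows_to_html(["Watch A, 1999", "Watch <B>, 2499"]))
```

Escaping the fields keeps characters such as `<` from being interpreted as markup in the generated page.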
Why Python? (Cont’d)
ᗍ Interpreted environment (Source Code → Interpreter → Output)
ᗍ Automatic memory management
ᗍ Good for code steering and for merging multiple programs
ᗍ Supports library utilities and third-party utilities (examples: Numeric, NumPy, SciPy)
Job Trends
[Figure: job trends chart; axis label: Percentage Growth]
Users of Python
ᗍ Google App Engine is a prominent example of Python in use: it allows building web applications in the Python programming language, using its rich collection of libraries, tools and frameworks
ᗍ YouTube is a big user of Python. The site uses Python for many purposes: viewing video, controlling website templates, administering video, accessing canonical data, and more. Python is everywhere at YouTube
ᗍ Amazon Web Services uses Python
Some More Users of Python
Major Uses of Python
ᗍ System Utilities
ᗍ GUIs (Tkinter)
ᗍ Internet Scripting
ᗍ Embedded Scripting
ᗍ Database Programming
ᗍ Artificial Intelligence
ᗍ Image Processing
Demo: Web Scraping - Flipkart.com
ᗍ This example demonstrates how to extract data from Flipkart for a particular product such as “Watch”
ᗍ We use requests (a Python package) to fetch the web page, then parse the HTML of the page to retrieve the data. The parsing is done by BeautifulSoup
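The fetch-then-parse flow described above could look roughly like the sketch below. It is not the demo's actual code: the `product`, `title` and `price` class names and the sample HTML are hypothetical, and a real page's markup would have to be inspected first. `requests` and `beautifulsoup4` are third-party packages that must be installed.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def fetch_html(url):
    """Download a page and return its HTML text."""
    # Third-party package; imported here so the parsing half below
    # can be tried without network access.
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


def parse_products(html):
    """Extract (title, price) pairs from product markup.

    The class names used here are assumptions for illustration only.
    """
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for item in soup.find_all("div", class_="product"):
        title = item.find("span", class_="title")
        price = item.find("span", class_="price")
        if title is not None and price is not None:
            products.append(
                (title.get_text(strip=True), price.get_text(strip=True))
            )
    return products


# Hypothetical markup standing in for a downloaded search-results page.
SAMPLE_HTML = """
<div class="product"><span class="title">Watch A</span><span class="price">1999</span></div>
<div class="product"><span class="title">Watch B</span><span class="price">2499</span></div>
"""

print(parse_products(SAMPLE_HTML))
```

In a live run, the string returned by `fetch_html` for the product search page would be passed to `parse_products` in place of `SAMPLE_HTML`.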
What is Big Data?
ᗍ Big Data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications
ᗍ Huge amounts of data (terabytes or petabytes)
ᗍ The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization
ᗍ Many systems, or collections of systems, generate this data. A few examples are space exploration, deep sea navigation and social media
Why Big Data?
ᗍ The data being generated today is so huge that traditional systems can neither process it nor store it
ᗍ To create a better DSS (Decision Support System)
ᗍ Google alone receives 4 million search queries per minute
ᗍ Data is generated everywhere, for example by sensors for climate information, social media, music, audio and video, and the Global Positioning System
ᗍ Only 10% of the world's data today resides in RDBMS and 90% elsewhere. How do we deal with this enormous amount of data?
Big Data Statistics
Every minute:
ᗍ Facebook users share nearly 2.5 million pieces of content
ᗍ Twitter users tweet nearly 300,000 times
ᗍ Instagram users post nearly 220,000 new photos
ᗍ YouTube users upload 72 hours of new video content
ᗍ Apple users download nearly 50,000 apps
ᗍ Email users send over 200 million messages
ᗍ Amazon generates over $80,000 in online sales
Refer: http://aci.info/2014/07/12/the-data-explosion-in-2014-minute-by-minute-infographic/
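To get a feel for the scale, the per-minute figures above can be projected to a full day. This is a back-of-the-envelope sketch: it assumes each rate stays constant for 24 hours, treats the slide's "nearly" and "over" figures as exact, and the `per_day` helper is a name invented for this example.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes

# Per-minute figures as quoted on the slide (approximate values).
per_minute = {
    "Facebook content shares": 2_500_000,
    "Tweets": 300_000,
    "Instagram photos": 220_000,
    "Apple app downloads": 50_000,
    "Emails sent": 200_000_000,
}


def per_day(rate_per_minute):
    """Scale a per-minute rate to a day, assuming a constant rate."""
    return rate_per_minute * MINUTES_PER_DAY


for label, rate in per_minute.items():
    print(f"{label}: {per_day(rate):,} per day")
```

Under that assumption, 300,000 tweets per minute corresponds to 432 million tweets per day.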
Characteristics of Big Data
Case Study 1: Big Data from Space
Satellite Imaging
Case Study 2: Social Media Analytics
ᗍ Structure of Data:
Data in social media is unstructured or semi-structured. Data from Twitter and Facebook is in JSON. Where do we store it? How do we process it?
ᗍ Quantity of Data:
There are tons of unstructured, structured and semi-structured data. How do we derive a pattern out of it?
ᗍ Processing of Data:
How do we process this complex data structure, and what technologies do we use?
ᗍ Prediction Algorithm:
After all the work of cleansing and slicing/dicing the data, which algorithm do we use? Is it a decision tree, SVM, k-means, kNN? The list goes on
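As a first step toward the processing questions raised in this case study, the sketch below loads a handful of JSON records and counts hashtags. The records and their `user` and `text` fields are made up for illustration and do not follow any real Twitter or Facebook schema.

```python
import json
from collections import Counter

# Hypothetical posts: the field names are assumptions, not a real API schema.
RAW_POSTS = """
[
  {"user": "a", "text": "Loving #python for #bigdata"},
  {"user": "b", "text": "#Python makes scraping easy"},
  {"user": "c", "text": "No tags here"}
]
"""


def hashtag_counts(raw_json):
    """Parse a JSON array of posts and count hashtags case-insensitively."""
    posts = json.loads(raw_json)
    tags = Counter()
    for post in posts:
        # Semi-structured data: tolerate posts that lack a 'text' field.
        for word in post.get("text", "").split():
            if word.startswith("#") and len(word) > 1:
                tags[word[1:].lower()] += 1
    return tags


print(hashtag_counts(RAW_POSTS).most_common(2))
```

Counts like these are the kind of simple features that could later feed one of the algorithms listed above, once the data has been cleansed.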
Why SkillSpeed?
ᗍ Course curriculum from industry experts
ᗍ Instructor-led live virtual sessions
ᗍ Lifetime access to course content via LMS
ᗍ 100% placement assistance
ᗍ 24x7 support
Course Topics
ᗍ Module 1: Introduction to Python
ᗍ Module 2: Built-In Data Types, Strings, Sequence and Files
ᗍ Module 3: Functions, Sorting, Exceptions, Standard Libraries
ᗍ Module 4: Regular Expression and Object-oriented Programming
ᗍ Module 5: Debugging Python, Project Skeleton in Python and SQLite Database
ᗍ Module 6: Introduction to Big Data and Hadoop
ᗍ Module 7: Python and Big Data
ᗍ Module 8: Implementation of Machine Learning in Python
ᗍ Module 9: Working Examples of Machine Learning in Python
ᗍ Module 10: Project Implementation
Corporate Partners
Contact Us
Lines open 24/7
To know more about the course, please contact:
IND +91-90660-20904 | USA 1866-607-6547 (Toll Free)
Or reach us at sales@skillspeed.com
References
https://harshbhimjyani.wordpress.com/2014/10/21/scraping-flipkart/
https://www.vlab.org/sandbox/events/satellite-imaging-big-data-from-space-shared/
http://www.datasciencecentral.com/profiles/blogs/data-veracity
Python and BIG Data analytics | Python Fundamentals | Python Architecture

Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024
 

Python and BIG Data analytics | Python Fundamentals | Python Architecture

  • 1. © 2015 Blue Camphor Technologies (P) Ltd. www.skillspeed.com Slide 1 Python for Big Data Analytics
  • 2. Session Objectives. This session will help you understand: ᗍ Introduction to Python ᗍ Web Scraping Use Case ᗍ Introduction to Big Data ᗍ Getting your doubts cleared
  • 3. What is Python? ᗍ Python is a general-purpose, high-level programming language designed to be easy to read and simple to implement ᗍ Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development ᗍ Python supports modules and packages, which encourage program modularity (subdividing a program into separate sub-programs) and code reuse ᗍ It is similar to Perl and Ruby, with certain differences such as its object-oriented features
  • 4. What is Python? (Cont’d) Python has an object-oriented structure. It supports: polymorphism (static polymorphism and runtime polymorphism); multiple inheritance (diagram labels: Class A, Class B, Class C); operator overloading (operator ‘+’: 5+5=10, Skill+Speed=SkillSpeed)
  • 5. Why Python? Good for text processing; generates HTML content; extended in C and C++ (diagram labels: Your C++ Program, Script.py, CPython Interpreter); clear syntax
  • 6. Why Python? (Cont’d) Interpreted environment (diagram: Source Code, Interpreter, Output); automatic memory management; good for code steering and for merging multiple programs; supports library utilities and third-party utilities (example: Numeric, NumPy, SciPy)
  • 7. Job Trends (chart; axis label: Percentage Growth)
  • 8. Users of Python. Google App Engine is a prominent example of Python in use: it allows building web applications with the Python programming language, using its rich collection of libraries, tools and frameworks. YouTube is a big user of Python; the entire site uses Python for different purposes, including viewing video, controlling website templates, administering video, and accessing canonical data. Python is everywhere at YouTube. Amazon Web Services uses Python.
  • 9. Some More Users of Python
  • 10. Major Uses of Python: System Utilities, GUIs (Tkinter), Internet Scripting, Embedded Scripting, Database Programming, Artificial Intelligence, Image Processing
  • 11. Demo: Web Scraping – Flipkart.com ᗍ This example demonstrates how to extract data from Flipkart for a particular product, such as “Watch” ᗍ We use requests (a Python package) to fetch the web page, then parse the page's HTML with BeautifulSoup to retrieve the data
  • 12. What is Big Data? ᗍ Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications ᗍ Huge amounts of data (terabytes or petabytes) ᗍ The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization ᗍ Many systems, or collections of systems, generate this data; a few examples are space exploration, deep-sea navigation, and social media
  • 13. Why Big Data? ᗍ The data being generated today is so large that traditional systems can neither process nor store it ᗍ To create better Decision Support Systems (DSS) ᗍ Google alone receives 4 million search queries per minute ᗍ Data is generated everywhere, for example by sensors for climate information, social media, music, audio and video, and the Global Positioning System ᗍ Only 10 percent of the world's data today resides in RDBMSs and 90 percent elsewhere; how do we deal with this enormous volume of data?
  • 14. Big Data Statistics. Every minute: ᗍ Facebook users share nearly 2.5 million pieces of content ᗍ Twitter users tweet nearly 300,000 times ᗍ Instagram users post nearly 220,000 new photos ᗍ YouTube users upload 72 hours of new video content ᗍ Apple users download nearly 50,000 apps ᗍ Email users send over 200 million messages ᗍ Amazon generates over $80,000 in online sales. Source: http://aci.info/2014/07/12/the-data-explosion-in-2014-minute-by-minute-infographic/
  • 15. Characteristics of Big Data
  • 16. Case Study 1: Big Data from Space Satellite Imaging
  • 17. Case Study 2: Social Media Analytics ᗍ Structure of data: Data in social media is unstructured or semi-structured. Data from Twitter and Facebook is in JSON. Where do we store it? How do we process it? ᗍ Quantity of data: There are tons of unstructured, structured and semi-structured data. How do we derive a pattern from it? ᗍ Processing of data: How do we process this complex data structure, and what technologies do we use? ᗍ Prediction algorithm: After all the work of cleansing and slicing/dicing the data, which algorithm do we use? Decision tree, SVM, k-means, kNN: the list goes on
  • 18. Why SkillSpeed? Course curriculum from industry experts; instructor-led live virtual sessions; lifetime access to course content via LMS; 100% placement assistance; 24x7 support
  • 19. Course Topics. Module 1: Introduction to Python. Module 2: Built-In Data Types, Strings, Sequence and Files. Module 3: Functions, Sorting, Exceptions, Standard Libraries. Module 4: Regular Expression and Object-oriented Programming. Module 5: Debugging Python, Project Skeleton in Python and SQLite Database. Module 6: Introduction to Big Data and Hadoop. Module 7: Python and Big Data. Module 8: Implementation of Machine Learning in Python. Module 9: Working Examples of Machine Learning in Python. Module 10: Project Implementation
  • 20. Corporate Partners
  • 21. Contact Us. Lines open 24/7. To know more about the course, please contact: IND +91-90660-20904, USA 1866-607-6547 (toll free), or reach us at sales@skillspeed.com
  • 22. References: https://harshbhimjyani.wordpress.com/2014/10/21/scraping-flipkart/ https://www.vlab.org/sandbox/events/satellite-imaging-big-data-from-space-shared/ http://www.datasciencecentral.com/profiles/blogs/data-veracity
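
Slide 11 describes the scraping demo as two steps: fetch the page, then parse its HTML. The sketch below illustrates only the parsing step, under several assumptions. It uses the standard library's html.parser in place of BeautifulSoup so that it runs without third-party packages, it parses a small hard-coded HTML string rather than a page fetched with requests, and the markup (the product-title and product-price class names) is hypothetical, not Flipkart's actual page structure.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a fetched product-listing page.
SAMPLE_HTML = """
<div class="product">
  <span class="product-title">Analog Watch</span>
  <span class="product-price">1499</span>
</div>
<div class="product">
  <span class="product-title">Digital Watch</span>
  <span class="product-price">899</span>
</div>
"""


class ProductParser(HTMLParser):
    """Collect (title, price) pairs from spans with the assumed class names."""

    def __init__(self):
        super().__init__()
        self.products = []   # finished (title, price) tuples
        self._field = None   # which field the current <span> holds, if any
        self._title = None   # title waiting for its matching price

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        css_class = dict(attrs).get("class")
        if css_class == "product-title":
            self._field = "title"
        elif css_class == "product-price":
            self._field = "price"

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "title":
            self._title = text
        elif self._field == "price" and self._title is not None:
            self.products.append((self._title, int(text)))
            self._title = None

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def parse_products(html):
    """Return a list of (title, price) tuples found in the HTML text."""
    parser = ProductParser()
    parser.feed(html)
    return parser.products


products = parse_products(SAMPLE_HTML)
```

With the packages named on the slide, the fetch step would typically be requests.get(url).text and the parsing step BeautifulSoup(html, "html.parser"); the selectors would then have to match the live page, which this sketch does not attempt.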

Editor's Notes

  1. Why use Python:
     - It's object-oriented: its structure supports such concepts as polymorphism, operator overloading, and multiple inheritance.
     - It's free (open source): downloading and installing Python is free and easy; source code is easily accessible; free doesn't mean unsupported, as the online Python community is huge.
     - It's portable: Python runs on virtually every major platform used today; as long as you have a compatible Python interpreter installed, Python programs will run in exactly the same manner, irrespective of platform.
     - It's powerful: dynamic typing; built-in types and tools; library utilities; third-party utilities (e.g. Numeric, NumPy, SciPy); automatic memory management.
     - It's mixable: Python can be linked easily to components written in other languages; linking to fast, compiled code is useful for computationally intensive problems; Python is good for code steering and for merging multiple programs in otherwise conflicting languages; Python/C integration is quite common; WARP is implemented in a mixture of Python and Fortran.
     - It's easy to use: rapid turnaround, with no intermediate compile and link steps as in C or C++; Python programs are compiled automatically to an intermediate form called bytecode, which the interpreter then reads; this gives Python the development speed of an interpreter without the performance loss inherent in purely interpreted languages.
     - It's easy to learn: structure and syntax are pretty intuitive and easy to grasp.
  2. Image copied from: http://www.datasciencecentral.com/profiles/blogs/data-veracity
  3. http://www.framingmymessage.nl/wp-content/uploads/2013/10/Social-media.jpg
  4. SkillSpeed offers virtual instructor-led courses designed to bridge the time-to-competency gap experienced by technology companies. The USP of SkillSpeed is the subject matter expert (SME). SMEs are industry experts who have a good understanding of, and hands-on industry experience with, the technology. These industry experts design, develop, and deliver the course. SkillSpeed provides you:  - Course curriculum from industry experts  - Instructor-led live virtual sessions  - Real-life industry case studies  - Live virtual interactions  - Interaction with industry experts  - Lifetime access to all course content via the LMS  - 24x7 support  - 100% placement assistance
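
Note 1 and slide 4 name polymorphism, operator overloading, and multiple inheritance, with the ‘+’ operator as the example (5+5=10, Skill+Speed=SkillSpeed). A minimal sketch of those three ideas follows. The class names Money, Walker, Swimmer and Duck are hypothetical and chosen only for illustration; they do not come from the slides.

```python
# The built-in '+' already behaves differently for numbers and strings.
number_sum = 5 + 5              # integer addition
text_sum = "Skill" + "Speed"    # string concatenation


class Money:
    """Hypothetical class that defines its own '+' through __add__."""

    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        # Returning NotImplemented lets Python raise TypeError for
        # unsupported operand types instead of failing obscurely.
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self.amount + other.amount)


class Walker:
    def move(self):
        return "walk"


class Swimmer:
    def swim(self):
        return "swim"


class Duck(Walker, Swimmer):
    """Multiple inheritance: Duck gets methods from both base classes."""


total = Money(5) + Money(5)
duck = Duck()
```

Here `total.amount` holds the combined value, and `duck` can call both `move()` and `swim()` because it inherits from two base classes.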