Tools
Popular Data Processing Tools In Data Science
This collection of tools is used in data science to perform various operations on data to extract
useful insights. Most of them are widely used in the industry and they usually get the job done
easily.
Jupyter
Current version: 5.7.2
The Jupyter Notebook is an open-source tool that lets users create and share
documents that contain code, equations, visualizations, and narrative text. It is well suited to
data cleansing, data transformation, statistical modeling, data visualization, machine learning,
and much more. It supports Python, R, Scala, and Julia, and it can drive big data engines such
as Apache Spark from Python, R, and Scala. Users can explore Python libraries such as pandas,
scikit-learn, Keras, matplotlib, and many more. It also works with TensorFlow for computer
vision analysis.
Many large organizations, including IBM, Google, Microsoft, Berkeley, and NYU, use this tool
to work with machine learning and big data. It runs as a web application in the system browser.
Most of the libraries needed for data imputation and data processing are commonly installed
alongside Jupyter. Others can be installed directly from the Jupyter terminal or the system's
command prompt.
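As a rough illustration of the data imputation mentioned above, the following sketch fills
missing values in a column with the mean of the observed values. It uses only the Python
standard library so that it runs in any notebook cell. The `ages` list and the `impute_mean`
helper are hypothetical names chosen for this example, and in practice a library such as pandas
would usually do this work.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


ages = [22, None, 30, 26, None]  # hypothetical column with gaps
cleaned = impute_mean(ages)
print(cleaned)
```

Mean imputation is only one simple strategy. Whether it is appropriate depends on why the
values are missing.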
Project Jupyter started in 2014 as an open-source project and later evolved to support
interactive data science and scientific computing across all programming languages.
Most data science enthusiasts and professionals are aware of the flexibility and
reliability Jupyter offers, and so it is their preferred tool for data cleaning, feature
engineering, model implementation, and data visualization in Python.
Jupyter is still a common tool for data scientists and data analysts who work with data in
Python. It is likely to remain a favored tool for Python because of its reliability and
easy-to-use interface.
RStudio
Current version: 1.1.463
RStudio is an open-source tool used to perform operations on data using the R language. It
works with R packages for data imputation and manipulation such as mice, dplyr, Hmisc, and
missForest. For visualization, the Shiny package supports building interactive web
applications that bring data analysis in R to life.
NASA, Accenture, GE Global, Nestle, CAVA, and many more multinational companies use
this tool to strengthen their data-driven capabilities. For a newcomer who wants to work on
simple and complex datasets using R, this is the ideal tool. It brings many capabilities to
the table, including data imputation, data cleaning, data manipulation, exploratory data
analysis in the form of scatter plots and histograms, SQL integration, natural language
processing, model building in machine learning, visualization of model performance, and
artificial intelligence.
JJ Allaire is the mind behind RStudio. He wanted to build a tool for R that is both universally
accessible and effortless to use. Its first beta version, v0.92, was released on 28 February
2011. A stable build, v1.0, followed on 1 November 2016.
RStudio is available in two editions: RStudio Desktop and RStudio Server. The desktop
edition runs locally on any machine that has R installed. The server edition allows
access to RStudio through a web browser while it runs on a remote server.
RStudio includes an array of features that ease the pain of installing packages and
finding help without resorting to external sources. The interface is organized into four
panes with different functions. The first pane, on the left, is reserved for coding, and the
second pane shows output and errors. The third pane, on the right, keeps the history of
commands executed and the variables stored in memory. The fourth pane is used to install
packages, get help for any function, query, or library, browse the system's file directory,
and view plots.
This tool is likely to stay on computers and servers wherever R is run. R remains a standard
in data science and data analytics, even at companies that also use Python, because some
functionality in R compares favorably with Python.
SAS
Current version: 9.4
SAS was the first analytics tool developed at SAS Institute. It was meant for
business intelligence, multivariate analysis, and conventional data management, and it was
designed to meet the requirements of descriptive and predictive analytics. It was on the
market before R and Python and catered to a specific audience at that time.
It was first designed at North Carolina State University in 1966. Development continued
through the 1980s and 1990s with the addition of new statistical features and additional
components.
SAS is a software suite that can retrieve, manage, and alter data from a variety of sources
and perform statistical analysis on it. It has a graphical point-and-click user
interface for non-technical professionals and more advanced options through the SAS
language.
It has been on the market for more than 40 years and is still used for analytical
decisions and statistical analysis. It makes it fairly easy to see which data is important
and which isn't. Helping companies make intelligent decisions and grow has been its key
strength throughout the years.
SAS has a proven record of supporting key business decisions and is likely to keep doing so.
Apache Spark
Current version: 2.4
Apache Spark is an open-source, distributed cluster-computing framework. Spark provides an
interface for programming entire clusters with data parallelism and fault tolerance. It can
be used to run projects written in Scala, Java, SQL, Python, or R. It can be considered a
unified engine for making analytical decisions from large-scale data processing.
Apache Spark requires a cluster manager and a distributed storage system. For managing
clusters, Spark supports its own standalone cluster manager, Hadoop YARN, and Apache Mesos.
For distributed storage, Spark can interface with a wide variety of systems, such as the
Hadoop Distributed File System, the MapR File System, and Cassandra, or a custom solution
can be implemented. Spark can also run on a single machine with one executor per CPU core.
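The idea of splitting data into partitions, processing each partition in parallel, and then
combining the partial results can be sketched in plain Python. This is only an analogy and
does not use Spark's API: the `partition`, `count_words`, and `parallel_word_count` helpers
are hypothetical names, and a thread pool stands in for Spark's executors.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(items, n):
    """Split items into at most n contiguous partitions of similar size."""
    size = max(1, -(-len(items) // n))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(lines):
    """Map step: count the words within one partition."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def parallel_word_count(lines, workers=2):
    """Count words per partition in parallel, then merge the partial counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, partition(lines, workers)))
    total = Counter()
    for partial in partials:  # reduce step
        total.update(partial)
    return total


lines = ["spark runs on clusters", "spark handles faults", "clusters scale out"]
result = parallel_word_count(lines)
print(result["spark"], result["clusters"])
```

A real cluster framework adds what this sketch leaves out, such as distributing partitions
across machines and recomputing lost partitions after a failure.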
Spark's development began in 2009, and it was open sourced in 2010. In 2013, it
was donated to the Apache Software Foundation and its license was changed to Apache 2.0.
By 2015, Spark had many active contributors, making it one of the most active projects in the
Apache Software Foundation and one of the most active open-source big data projects.
Microsoft Excel
Current version: 2016
Microsoft Excel, widely known as MS Excel, is a spreadsheet application used to perform
calculations and visualize data in graphs. It can be integrated with Visual Basic for
Applications (VBA) for macro programming. There are many free online tools available for
processing large amounts of data, but owing to its array of features and capabilities, most
enterprises prefer MS Excel. It gives business users a platform that enhances usability
and credibility.
Data is arranged in cells, which are organized into rows and columns. Excel provides column
integration, pivot tables, and options for charts and graphs. Many mathematical and
statistical functions and formulae are built in to handle the lengthy and complex
calculations that are quite common for businesses.
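To make the pivot table idea concrete, the following sketch summarizes a small list of
records by two keys and sums a value for each combination, which is roughly what a
spreadsheet pivot table does. The `pivot_sum` helper and the `sales` records are hypothetical
and exist only for this illustration.

```python
from collections import defaultdict


def pivot_sum(rows, index, column, value):
    """Sum `value` for each (index, column) pair, like a pivot table."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[index]][row[column]] += row[value]
    # Convert nested defaultdicts to plain dicts for easier printing.
    return {key: dict(cols) for key, cols in table.items()}


sales = [  # hypothetical records
    {"region": "North", "quarter": "Q1", "amount": 100.0},
    {"region": "North", "quarter": "Q2", "amount": 150.0},
    {"region": "South", "quarter": "Q1", "amount": 80.0},
    {"region": "North", "quarter": "Q1", "amount": 20.0},
]
pivot = pivot_sum(sales, "region", "quarter", "amount")
print(pivot)
```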
First released on September 30, 1985, it was crucial for enterprises and large businesses
that needed a tool to serve as a platform for solving their day-to-day problems.
It has wide support for VBA, allowing the user to apply numerical methods, for
example for solving differential equations, and then report the results back to the
spreadsheet. It also has a variety of interactive features that allow user
interfaces to completely hide the spreadsheet from the user. VBA has been
crucial for enabling many useful features and functions necessary for writing macros.
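The workflow described above, solving a differential equation numerically and reporting
the results in tabular form, can be sketched outside Excel as well. The example below is
written in Python rather than VBA and uses the explicit Euler method. The `euler` helper and
the decay problem are assumptions made for illustration, and the output is CSV text that a
spreadsheet could open.

```python
import csv
import io


def euler(f, y0, t0, t1, steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 with the explicit Euler method."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    rows = [(t, y)]
    for _ in range(steps):
        y += h * f(t, y)
        t += h
        rows.append((t, y))
    return rows


# Hypothetical problem: exponential decay dy/dt = -y with y(0) = 1.
rows = euler(lambda t, y: -y, y0=1.0, t0=0.0, t1=1.0, steps=100)

# Report the results as CSV text with a header row.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["t", "y"])
writer.writerows(rows)
print(buffer.getvalue().splitlines()[:3])
```

The explicit Euler method is the simplest choice and loses accuracy with large step sizes.
A higher-order method would be preferable for serious work.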
It has native support for Windows and macOS and also runs on Android and iOS for on-the-go
access. It is widely adopted and is preferred by many enterprises for keeping records
that involve large amounts of data and calculations.
Structured Query Language - SQL
Current version: SQL:2016
SQL is a language used to query and manipulate structured data stored in relational
database management systems, also known as RDBMS. Its primary function is to handle
structured data in which there can be relationships between different entities. When it
was introduced, SQL offered two main advantages: it introduced the concept of accessing
many records with a single command, and it eliminated the need to specify how to reach a
record.
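The point about reaching many records with one command can be seen in a small sketch that
uses Python's built-in `sqlite3` module with an in-memory database. The `employees` table
and its rows are hypothetical. Each statement describes which rows are wanted and leaves
the question of how to locate them to the database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [("Asha", 52000.0), ("Ben", 48000.0), ("Chen", 61000.0)],
)

# One declarative statement selects every matching record.
high_earners = conn.execute(
    "SELECT name FROM employees WHERE salary > ? ORDER BY name", (50000.0,)
).fetchall()
print(high_earners)

# One statement also updates many records at once.
updated = conn.execute("UPDATE employees SET salary = salary * 1.1").rowcount
print(updated)
conn.close()
```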
SQL is used to interact with data stored in relational databases. It was initially developed
at IBM by Donald D. Chamberlin and Raymond F. Boyce in the early 1970s. This version, called
SEQUEL (Structured English Query Language) at the time, was created to
manipulate and retrieve data stored in IBM's original relational database management system.
Later, Oracle Corporation (then Relational Software, Inc.) saw potential in this market and
developed its own SQL-based RDBMS, hoping to sell it to the U.S. Navy, the Central
Intelligence Agency, and other U.S. government agencies.
Today, enterprises use it whenever they need to retrieve data from a database
for operational purposes. It has built-in statements that enable users to extract, select,
manipulate, and alter data as needed. SQL uses clauses, expressions, predicates, queries,
and statements to interact with data.
Right now, MySQL, Oracle, and PostgreSQL are some of the common tools used in this domain.
Alternatives to SQL exist, but many companies still rely on it for their databases.
SQL supports many kinds of queries, both simple and complex. Joins combine two
or more tables stored in a database based on a related column, and they include inner and
outer joins. Other common features include declaring a primary key to make an attribute
unique, dropping or altering values, UNION, and GROUP BY, which control how the values are
organized and represented.
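The difference between inner and outer joins, combined with GROUP BY, can be sketched with
Python's built-in `sqlite3` module. The `customers` and `orders` tables and their contents
are hypothetical and serve only to show which rows each join keeps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 35.0), (3, 2, 10.0);
""")

# INNER JOIN keeps only customers that have at least one order.
inner = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name ORDER BY c.name
""").fetchall()

# LEFT OUTER JOIN keeps every customer. Unmatched order columns are NULL,
# so COUNT(o.id) is 0 for a customer without orders.
outer = conn.execute("""
    SELECT c.name, COUNT(o.id)
    FROM customers AS c
    LEFT OUTER JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name ORDER BY c.name
""").fetchall()
print(inner)
print(outer)
conn.close()
```

The customer without orders is absent from the inner join result and present with a count
of zero in the outer join result.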
Tableau
Current version: 2018.3
Tableau is a data visualization tool used for representing data as charts and
dashboards. It is available online as well as offline. It can query relational
databases, OLAP cubes, and spreadsheets and generate a number of graph types depending
on the type of data retrieved. It can also retrieve data and store it in its in-memory data
engine. The latitude and longitude of a location can be used in Tableau to create
geographic representations of reports on sales, profit, or any other measure that can
be shown on a map.
It was founded in January 2003 by three visualization specialists who focused on
visualization techniques for exploring and analyzing relational databases and data cubes.
It later grew into a major product line that gradually introduced further products for the
emerging market: Tableau Desktop, Tableau Server, Tableau Online, Tableau Reader, and
Tableau Public.
Tableau opens up many possibilities in business analytics and business
intelligence. Enterprises are taking full advantage of the opportunities this tool offers
to generate useful business insights that support the company's growth.
In 2008, Tableau won an award for best business intelligence solution for its easy and quick
visualization capabilities. It ranked among the top visualization tools in the market and
remains a leading choice today.
Realizing its future scope, companies have started deploying dashboards in their meetings and
discussions. These visualizations have now managed to gain some momentum with respect to
enterprise-level functionalities by helping in determining the factors that can contribute to the
acceptance of new market strategies and skills needed to stay in
Power BI
Current version: 2.64
Power BI is a business intelligence tool developed by Microsoft. It provides interactive
visualizations coupled with business intelligence capabilities, so users can build their own
customized reports and dashboards without having to depend on IT staff
or database administrators. It provides cloud-based BI services as well. It offers data
warehouse capabilities including data discovery, data preparation, and interactive
dashboards.
In 2016, Microsoft released an additional service called Power BI Embedded on its
Azure cloud platform. One main differentiator of the product was the ability to load custom
visualizations.
Power BI was first created in 2010 under the name Project Crescent. It was later renamed
Power BI and unveiled by Microsoft in September 2013 for the Office 365 suite. Microsoft
then added features such as Q&A, enterprise-level data connectivity, and various security
options. Power BI was first released to the general public in 2015.
After Tableau had captured the market, Power BI arrived with the simple philosophy of
winning over enterprises and analytics companies that just want to visualize their data on
the go by using Power BI either online or offline. It can connect to hundreds of
data sources in the cloud. It uses Power Query to simplify data ingestion, transformation,
integration, and enrichment.

CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
 

tools

  • 1. Tools. Popular Data Processing Tools In Data Science. This collection of tools is used in data science to perform various operations on data and extract useful insights. Most of them are widely used in industry and usually get the job done easily. Jupyter. Current version: 5.7.2. The Jupyter Notebook is an open-source tool that lets users create and share documents containing code, equations, visualizations, and narrative text. It is well suited to data cleansing, data transformation, statistical modeling, data visualization, machine learning, and much more. It supports Python, R, Scala, and Julia, and it can drive big data engines such as Apache Spark from Python, R, and Scala. One can explore Python libraries such as pandas, scikit-learn, Keras, matplotlib, and many more. It also works with TensorFlow, for example for computer vision analysis. Currently, many big organizations such as IBM, Google, Microsoft, Berkeley, and NYU use this tool to work with machine learning and big data. It runs as a simple web page in the system browser. Many of the libraries needed for data imputation and data processing are typically available alongside Jupyter (for example in bundled Python distributions); others can be installed directly through the Jupyter terminal or the system's command prompt. Founded in 2014 as an open-source project, Jupyter later evolved to support interactive data science and scientific computing across all programming languages. Most data science enthusiasts and professionals are aware of the flexibility and reliability Jupyter offers, and hence it is their preferred tool for data cleaning, feature engineering, model implementation, and data visualization in Python.
  • 2. Jupyter is still a common tool for data scientists and data analysts performing operations on data in Python. It will continue to be a favored tool for Python due to its reliability and easy interface. RStudio. Current version: 1.1.463. RStudio is an open-source tool used to perform operations on data using the R language. It works with packages for data imputation and manipulation such as mice, dplyr, Hmisc, and missForest. For visualization, R provides the Shiny package, which builds interactive web applications for visualizing data and brings data analysis in R to life. NASA, Accenture, GE Global, Nestle, CAVA, and many other multinational companies use this tool to strengthen their data-driven capabilities. For a newcomer who wants to work on simple and complex datasets in R, this is the ideal tool. It brings a lot of functionality to the table: data imputation, data cleaning, data manipulation, exploratory data analysis in the form of scatter plots and histograms, SQL integration, natural language processing, building machine learning models, visualizing model performance, and artificial intelligence. JJ Allaire is the brain behind RStudio. He wanted to make a tool for R that is both universally accessible and effortless to use. Its first beta version, v0.92, was released on 28 February 2011. Later, a stable build (v1.0) was released on 1 November 2016. RStudio is downloadable in two editions: RStudio Desktop and RStudio Server. The desktop edition runs locally on any machine that has R installed. The server edition allows accessing RStudio in a web browser while it runs on a remote server. RStudio includes an array of features that ease the pain of installing packages and finding help for queries through external sources.
The interface is designed around four panes with different functions: the first pane, on the left, is reserved for coding, and the second is for output and errors. The third pane, on the right, holds the history
  • 3. of variables executed and stored in the memory. The fourth pane is used to install packages, get help for any function, query, or library, access the system's file directory, and view plots for better understanding. This tool will remain on computers and servers wherever R is run. R remains a standard in data science and data analytics, even at companies that also use Python, because some functionality in R is better than its Python counterpart. SAS. Current version: 9.4. SAS is one of the earliest analytics tools and was developed at SAS Institute. It was meant for business intelligence, multivariate analysis, and conventional data management, and was designed to meet the requirements of descriptive and predictive analytics. It was on the market even before R and Python and catered to a specific audience at that time. It was first designed at North Carolina State University in 1966. Development continued through the 1980s and 1990s with the addition of new statistical features and additional components. SAS is a software suite that enables retrieving, managing, and altering data from a variety of sources and conducting statistical analysis on it. It has a graphical point-and-click user
  • 4. interface for non-technical professionals and more advanced options through the SAS language. It has been on the market for more than 40 years and is still used for analytical decisions and statistical manipulation. It makes it quite easy to see which data is important and which is not. Helping companies make intelligent decisions and grow has been its key strength throughout the years. SAS has a proven record of supporting key business decisions and will continue to do so. Apache Spark. Current version: 2.4. Apache Spark is an open-source distributed cluster-computing framework. Spark provides a platform for programming entire clusters with built-in data parallelism and fault tolerance. It can be used to run projects in Scala, Java, SQL, Python, or R. It can be considered a unified engine for making analytical decisions from large-scale data processing. Apache Spark requires a cluster manager and a distributed storage system. For cluster management, Spark supports its own standalone mode as well as Hadoop YARN. For distributed storage, Spark can interface with a wide variety of systems, such as the Hadoop Distributed File System (HDFS), MapR File System, and Cassandra, or a custom solution can be implemented. Spark can also run on a single machine with one executor per CPU core.
  • 5. Spark's development began in 2009, and it was open sourced in 2010. In 2013, it was donated to the Apache Software Foundation and switched its license to Apache 2.0. Since 2015, Spark has had many active contributors, making it one of the most active projects in the Apache Software Foundation and one of the most active open-source big data projects. Microsoft Excel. Current version: 2016. Microsoft Excel, widely known as MS Excel, is a spreadsheet used to carry out calculations and visualize data as graphs. It integrates with Visual Basic for Applications (VBA), which gives it a macro programming language. There are many free online tools on the market for processing huge chunks of data, but owing to its array of functionalities and capabilities, most enterprises prefer MS Excel. It gives business users a platform that enhances usability and credibility. Data is arranged in cells organized into rows and columns. It provides column integration, pivot tables, and options for charts and graphs. Many mathematical and statistical functions and formulas are built in to handle the lengthy and complex calculations that are common in business. First released on September 30, 1985, it was crucial for enterprises and big players in need of a tool that would serve as a platform for solving their day-to-day problems. It has wide support for VBA, allowing the user to perform numerical calculations, for example solving differential equations, and then report the results back to the spreadsheet. It also has a variety of interactive features that allow user interfaces which completely hide the spreadsheet from the user. VBA has been crucial for enabling many useful features and functions necessary for writing macros.
  • 6. It has native support for Windows and macOS and also runs on Android and iOS for on-the-go access. It is a market standard and is preferred by many enterprises for keeping records that involve huge chunks of data and calculations. Structured Query Language (SQL). Current version: 2016. SQL is a programming language used to query and manipulate structured data stored in relational database management systems, also known as RDBMS. Its primary function is to handle structured data in which there can be associations between different entities. When it was introduced, SQL had many tricks up its sleeve: it introduced the concept of accessing many records with a single command, and it eliminated the need to specify how to reach a record. SQL is used to interact with data stored in relational databases. It was developed at IBM by Donald D. Chamberlin and Raymond F. Boyce in the early 1970s. This version, called SEQUEL (Structured English Query Language) at the time, was created to manipulate and retrieve data stored in IBM's original relational database management system. Later, Oracle Corporation saw potential in this market and developed its own SQL-based RDBMS with the hope of selling it to the U.S. Navy, the Central Intelligence Agency, and other U.S. government agencies. Today, enterprises use it when they need to retrieve data from a database for operational purposes. It has built-in statements that let users select, extract, manipulate, and alter data at will. SQL uses clauses, expressions, predicates, queries, and statements to interact with data.
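As a rough illustration of the points above (one command reaching many records, associations between entities, and the use of clauses and predicates), here is a minimal sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables, their columns, and the helper names are invented for this example and do not come from the slides.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two related, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount      REAL NOT NULL
        );
        INSERT INTO customers (id, name) VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
        INSERT INTO orders (customer_id, amount) VALUES (1, 20.0), (1, 35.0), (2, 15.0);
        """
    )
    return conn


def totals_for_orders_at_least(conn: sqlite3.Connection, minimum: float):
    """Return (name, total) per customer, counting only orders >= minimum.

    The query states what is wanted, not how to locate each record:
    the JOIN expresses the association between the two tables, the WHERE
    clause holds a predicate, and GROUP BY controls how rows are combined.
    """
    query = """
        SELECT c.name, SUM(o.amount) AS total
        FROM customers AS c
        INNER JOIN orders AS o ON o.customer_id = c.id
        WHERE o.amount >= ?
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query, (minimum,)).fetchall()


if __name__ == "__main__":
    connection = build_demo_db()
    for name, total in totals_for_orders_at_least(connection, 10):
        print(name, total)
```

Because the join is an inner join, a customer with no orders (here the hypothetical 'Chen') does not appear in the result; an outer join would be the usual choice if such rows should be kept.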
  • 7. Right now, MySQL, Oracle, and PostgreSQL are some of the common tools used in this domain. Alternatives to SQL exist, but companies continue to use it widely for their databases. SQL queries range from simple to complex. Joins combine two or more tables stored in a database based on a related column; they include inner and outer joins. Other common operations include declaring a primary key so that an attribute uniquely identifies each row, dropping or altering values, UNION, and GROUP BY, which are concerned with how the values need to be represented. Tableau. Current version: 2018.3. Tableau is a data visualization tool used for representing data as charts and dashboards. It is available online as well as offline. It can handle relational databases, OLAP cubes, and spreadsheets, and generate a number of graph types depending on the type of data retrieved. It can also retrieve data and store it in its in-memory data engine. Tableau can use the latitude and longitude of a location to create a geographic representation of reports on sales, profit, or any other factor that can be shown on a map. It was first developed in January 2003 by three visualization mavericks who specialized in visualization techniques for exploring and analyzing relational databases and data cubes. Later, it went on to become a big product, which slowly introduced its other products for the emerging
  • 8. market: Tableau Desktop, Tableau Server, Tableau Online, Tableau Reader, and Tableau Public. Tableau opens up a ton of possibilities in business analytics and business intelligence. Enterprises are taking full advantage of the opportunities this tool opens up to generate useful business insights from the perspective of the company's growth. In 2008, Tableau was awarded the best business intelligence solution for its easy and quick visualization capabilities. It was at the top of all the visualization tools in the market and continues to dominate. Realizing its future scope, companies have started using dashboards in their meetings and discussions. These visualizations have gained momentum at the enterprise level by helping determine the factors that can contribute to the acceptance of new market strategies and skills needed to stay in … Power BI. Current version: 2.64. Power BI is a business intelligence tool developed by Microsoft. It provides interactive visualizations coupled with business intelligence capabilities, where users can build their own customized reports and dashboards without having to depend on information technology staff or database administrators. It provides cloud-based BI services as well. It offers data warehouse capabilities including data discovery, data preparation, and interactive dashboards. In 2016, Microsoft released an additional service called Power BI Embedded on its Azure cloud platform. One main differentiator of the product is the ability to load custom visualizations.
  • 9. Power BI was first created in 2010 under the name Project Crescent. It was later renamed Power BI and unveiled by Microsoft in September 2013 for the Office 365 suite. Microsoft subsequently added features such as Q&A, enterprise-level data connectivity, and various security options. Power BI was first released to the general public in 2015. After Tableau had captured the market, Power BI arrived with the simple philosophy of winning enterprises and analytics companies that just want to visualize their data on the go, using Power BI either online or offline. It can connect to hundreds of data sources in the cloud. It uses Power Query to simplify data ingestion, transformation, integration, and enrichment.