Python for R Users
By Satyarth Praveen
Basic Commands

Downloading and installing a package
    R:      install.packages('name')
    Python: pip install name    (run from the command line, not inside the Python interpreter)
Loading a package
    R:      library(name)
    Python: import name as other_name
Checking the working directory
    R:      getwd()
    Python: import os
            os.getcwd()
Setting the working directory
    R:      setwd('path')
    Python: os.chdir('path')
Listing files in a directory
    R:      dir('path')
    Python: os.listdir('path')
Listing all objects
    R:      ls()
    Python: globals()
Removing an object
    R:      rm('name')
    Python: del name
Seeing the manual for a function
    R:      help(help)
    Python: help(help)
Seeing the type of an object
    R:      class(object)
    Python: type(object)
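The working-directory and object-handling commands can be tried together in one short script. This is a minimal sketch using only the standard library; the temporary directory, the file name example.txt, and the variable names are made up for illustration.

```python
import os
import tempfile

# Remember where we started so the working directory can be restored.
start_dir = os.getcwd()                      # R: getwd()

# Create a throwaway directory containing one file (illustrative only).
with tempfile.TemporaryDirectory() as scratch:
    open(os.path.join(scratch, "example.txt"), "w").close()

    os.chdir(scratch)                        # R: setwd('path')
    files = os.listdir(".")                  # R: dir('path')
    os.chdir(start_dir)                      # go back before the directory is deleted

value = 3.14
value_type = type(value)                     # R: class(object)

del value                                    # R: rm('name')
still_defined = "value" in globals()         # R: 'name' %in% ls()

print(files, value_type, still_defined)
```

Note that del takes the bare name, not a quoted string, which is the main difference from R's rm('name').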
Data Frame Creation
(Python examples use the pandas* package)

Creating a data frame "df" of dimension 6x4 (6 rows and 4 columns) containing random numbers
    R:      A <- matrix(runif(24, 0, 1), nrow = 6, ncol = 4)
            df <- data.frame(A)
            Here,
            • runif generates 24 random numbers between 0 and 1
            • matrix creates a matrix from those random numbers; nrow and ncol set its numbers of rows and columns
            • data.frame converts the matrix to a data frame
    Python: import numpy as np
            import pandas as pd
            A = np.random.rand(6, 4)
            df = pd.DataFrame(A)
            Here,
            • np.random.rand generates a matrix of 6 rows and 4 columns of uniform random numbers in [0, 1), matching runif (np.random.randn would draw from a standard normal distribution instead); this function is part of the numpy** library
            • pd.DataFrame converts the matrix into a data frame
Reading a CSV from a URL or a file
    R:      read.csv("URL or file_name")
    Python: import pandas as pd
            pd.read_csv("URL or file_name")

*To install the pandas library, visit http://pandas.pydata.org/. To import it, type: import pandas as pd
**To import the NumPy library, type: import numpy as np
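R's runif draws from a uniform distribution, while NumPy has separate generators for uniform and standard-normal draws, so it is worth checking which one a translation uses. A minimal sketch contrasting the two, assuming NumPy's default_rng interface (not used in the slides) and an arbitrary seed of 0 for reproducibility:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)          # seeded generator, so reruns match

# Uniform on [0, 1): the counterpart of matrix(runif(24, 0, 1), nrow = 6, ncol = 4)
uniform_values = rng.uniform(0, 1, size=(6, 4))
# Standard normal: the counterpart of matrix(rnorm(24), nrow = 6, ncol = 4)
normal_values = rng.standard_normal(size=(6, 4))

df_uniform = pd.DataFrame(uniform_values)
df_normal = pd.DataFrame(normal_values)

print(df_uniform.shape)
print(df_uniform.min().min() >= 0, df_uniform.max().max() < 1)
```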
Data Frame: Inspecting and Viewing Data

Getting the names of rows and columns of data frame "df"
    R:      rownames(df)
            colnames(df)
    Python: df.index
            df.columns
Seeing the top and bottom "x" rows of the data frame "df"
    R:      head(df, x)
            tail(df, x)
    Python: df.head(x)
            df.tail(x)
Getting the dimensions of data frame "df"
    R:      dim(df)
    Python: df.shape
Length of data frame "df"
    R:      length(df)    returns the number of columns of the data frame
    Python: len(df)       returns the number of rows of the data frame
Viewing the correlation among the columns of the data frame
    R:      cor(df)
    Python: df.corr()
Viewing the unique entries of a column
    R:      unique(df$column_name)
    Python: df.column_name.unique()
Frequency count of each value in a column of the data frame
    R:      table(diamonds$cut)
            or
            summary(diamonds$cut)
    Python: diamonds['cut'].value_counts()
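The difference between R's length() and Python's len() is easy to trip over, so here is a small runnable check. The tiny diamonds frame below is made-up stand-in data, not the real diamonds data set.

```python
import pandas as pd

# Small made-up stand-in for the diamonds data set.
diamonds = pd.DataFrame({
    "cut": ["Ideal", "Premium", "Ideal", "Good", "Ideal", "Premium"],
    "carat": [0.23, 0.21, 0.31, 0.29, 0.24, 0.26],
})

n_rows, n_cols = diamonds.shape              # R: dim(diamonds)
row_count = len(diamonds)                    # rows, unlike R's length(), which counts columns
col_count = len(diamonds.columns)            # R: length(diamonds)

cut_counts = diamonds["cut"].value_counts()  # R: table(diamonds$cut)
unique_cuts = diamonds["cut"].unique()       # R: unique(diamonds$cut)

print(cut_counts)
```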
Data Frame: Inspecting and Viewing Data (grouping)

Grouping the data frame by the values of a column
    R:      library(dplyr)
            df <- as_tibble(df)
            groupby_df <- df %>% group_by(column_name)
    Python: import pandas as pd
            groupby_df = df.groupby('column_name')
Performing a grouped mean on the column 'column_name' of the data frame
    R:      summarise(groupby_df, mean(column_name))
    Python: groupby_df.column_name.mean()
Frequency table for the values in 2 columns
    R:      table(df$column_name1, df$column_name2)
    Python: pd.crosstab(df.column_name1, df.column_name2, margins=True)
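A grouped mean and a two-way frequency table can be sketched on a toy frame as follows. The column names cut, color and price and all values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "cut": ["Ideal", "Good", "Ideal", "Good"],
    "color": ["E", "E", "J", "E"],
    "price": [300, 200, 500, 400],
})

grouped = df.groupby("cut")                  # R: df %>% group_by(cut)
mean_price = grouped["price"].mean()         # R: summarise(grouped, mean(price))

# margins=True (a boolean, not a string) adds the "All" row and column totals.
counts = pd.crosstab(df["cut"], df["color"], margins=True)

print(mean_price)
print(counts)
```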
Data Frame: Inspecting and Viewing Data (summaries and names)

Getting a quick summary (mean, standard deviation etc.) of the data in the data frame "df"
    R:      summary(df)
            returns mean, median, maximum, minimum, first quartile and third quartile
    Python: df.describe()
            returns count, mean, standard deviation, maximum, minimum, 25%, 50% and 75%
Getting a compact view of the data structure of the object
    R:      str(df)
    Python: df.info()
Checking all available methods for an object
    R:      isS4(object_name)
            # checks whether the object is of S4 type
            is(object_name, 'refClass')
            # checks whether the object is of RC type
            # if both are FALSE, the object is of S3 type:
            methods(class = class(object_name))
            # if the object is of S4 type:
            showMethods(classes = class(object_name))
    Python: dir(diamonds)
Setting the row names and column names of the data frame "df"
    R:      rownames(df) <- c("A", "B", "C", "D", "E", "F")
            colnames(df) <- c("P", "Q", "R", "S")
    Python: df.index = ["A", "B", "C", "D", "E", "F"]
            df.columns = ["P", "Q", "R", "S"]
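Renaming rows and columns and then summarising can be checked on a small deterministic frame. This is a sketch; the values 0 to 23 are chosen only so the summary statistics are easy to verify by hand.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4))

df.index = ["A", "B", "C", "D", "E", "F"]    # R: rownames(df) <- c(...)
df.columns = ["P", "Q", "R", "S"]            # R: colnames(df) <- c(...)

summary = df.describe()                      # R: summary(df)
print(summary.index.tolist())
print(summary.loc["mean", "P"])
```

Unlike R's summary(), describe() reports the count and standard deviation, and labels the median as 50%.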
Data Frame: Sorting Data

Sorting the data in the data frame "df" by the column named "P"
    R:      df[order(df$P), ]
    Python: df.sort_values(by=['P'])
            The older df.sort(['P']) was deprecated and has since been removed from pandas.
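A short sketch of sorting in both directions; the frame and its labels are made up. Like R's df[order(df$P), ], sort_values returns a new frame and leaves the original untouched.

```python
import pandas as pd

df = pd.DataFrame({"P": [3, 1, 2], "Q": ["c", "a", "b"]}, index=["x", "y", "z"])

ascending = df.sort_values(by="P")                      # R: df[order(df$P), ]
descending = df.sort_values(by="P", ascending=False)    # R: df[order(-df$P), ]

print(ascending.index.tolist())
print(descending.index.tolist())
```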
Data Frame: Data Selection

Slicing the rows of a data frame from row no. "x" to row no. "y" (including rows x and y)
    R:      df[x:y, ]
    Python: df[x-1:y]
            Python starts counting from 0, and the end of a slice is excluded
Slicing the columns named "X", "Y" etc. of a data frame "df"
    R:      myvars <- c("X", "Y")
            newdata <- df[myvars]
    Python: df.loc[:, ['X', 'Y']]
Selecting the data from row no. "x" to "y" and column no. "a" to "b"
    R:      df[x:y, a:b]
    Python: df.iloc[x-1:y, a-1:b]
Selecting the element at row no. "x" and column no. "y"
    R:      df[x, y]
    Python: df.iat[x-1, y-1]
Selecting a column of the data frame
    R:      df$column_name
    Python: df.column_name
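The off-by-one translation between R's 1-based, inclusive ranges and Python's 0-based, end-exclusive slices can be verified on a small frame. A minimal sketch; the frame, its column names, and the values of x, y, a and b are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=["W", "X", "Y", "Z"])

x, y = 2, 4   # 1-based, inclusive row numbers, as an R user would write them
a, b = 2, 3   # 1-based, inclusive column numbers

rows = df[x - 1:y]                  # R: df[x:y, ]
block = df.iloc[x - 1:y, a - 1:b]   # R: df[x:y, a:b]
cell = df.iat[x - 1, a - 1]         # R: df[x, a]
cols = df.loc[:, ["X", "Y"]]        # R: df[c("X", "Y")]

print(block)
```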
Data Frame: Data Selection (filtering)

Using a single column's values to select data, with column name "A"
    R:      subset(df, A > 0)
            selects all the rows in which the value in column A is greater than 0
    Python: df[df.A > 0]
            does the same as the R function
Data Frame: Using SQL

Running an SQL query on a data frame
    R:      library(sqldf)
            library(tcltk)
            sqldf("SELECT * FROM diamonds2 LIMIT 5;")
            sqldf("SELECT * FROM diamonds2 WHERE carat > 4;")
    Python: from pandasql import sqldf
            pysqldf = lambda q: sqldf(q, globals())
            # an equivalent of the lambda function above would be:
            # def pysqldf(q):
            #     return sqldf(q, globals())
            # q here is the query.
            pysqldf("SELECT * FROM diamonds2 LIMIT 5;")
            pysqldf("SELECT * FROM diamonds2 WHERE carat > 4;")
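If pandasql is not installed, similar queries can be run by copying the frame into an in-memory SQLite database. This is an alternative technique, not the pandasql approach from the slide, and the diamonds2 values below are made up.

```python
import sqlite3

import pandas as pd

# Made-up stand-in for the diamonds2 data frame.
diamonds2 = pd.DataFrame({
    "carat": [0.5, 4.5, 1.2, 5.01],
    "cut": ["Ideal", "Fair", "Good", "Fair"],
})

# An in-memory SQLite database stands in for pandasql here.
conn = sqlite3.connect(":memory:")
diamonds2.to_sql("diamonds2", conn, index=False)

first_rows = pd.read_sql_query("SELECT * FROM diamonds2 LIMIT 2;", conn)
large = pd.read_sql_query("SELECT * FROM diamonds2 WHERE carat > 4;", conn)
conn.close()

print(large)
```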
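Mathematical Functions
(Python examples import the math and numpy libraries)

Sum
    R:      sum(x)
    Python: math.fsum(x)
Square root
    R:      sqrt(x)
    Python: math.sqrt(x)
Standard deviation
    R:      sd(x)
    Python: numpy.std(x, ddof=1)    (ddof=1 gives the sample standard deviation that sd computes; the default divides by n)
Log
    R:      log(x, base)
    Python: math.log(x, base)
Mean
    R:      mean(x)
    Python: numpy.mean(x)
Median
    R:      median(x)
    Python: numpy.median(x)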
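R's sd() divides by n - 1, while numpy.std divides by n unless told otherwise, so the two can silently disagree. A small sketch with made-up numbers, cross-checked against the standard library's statistics.stdev:

```python
import math
import statistics

import numpy as np

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population_sd = np.std(x)            # divides by n (NumPy's default, ddof=0)
sample_sd = np.std(x, ddof=1)        # divides by n - 1, like R's sd(x)

total = math.fsum(x)                 # R: sum(x)
log_base_2 = math.log(8, 2)          # R: log(8, base = 2)

print(population_sd, sample_sd, statistics.stdev(x))
```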
Data Manipulation
(Python examples import the math and numpy libraries)

Converting a character variable to a numeric variable
    R:      as.numeric(x)
    Python: For a single value: int(x), float(x)
            For lists, vectors etc.: list(map(int, x)), list(map(float, x))
Converting a factor/numeric variable to a character variable
    R:      paste(x)
    Python: For a single value: str(x)
            For lists, vectors etc.: list(map(str, x))
Checking for a missing value in an object
    R:      is.na(x)
    Python: math.isnan(x)    (for a single float value)
Deleting missing values from an object
    R:      na.omit(x)
    Python: cleaned_list = [v for v in values if str(v) != 'nan']
Calculating the number of characters in a character value
    R:      nchar(x)
    Python: len(x)
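In Python 3, map returns a lazy iterator, so it is wrapped in list() to get something comparable to an R vector. A standard-library sketch of the conversions and missing-value handling; all variable names and values are illustrative.

```python
import math

raw = ["1", "2", "3"]
as_ints = list(map(int, raw))          # R: as.numeric(raw)
as_floats = [float(v) for v in raw]    # the same idea as a list comprehension
as_text = list(map(str, as_ints))      # numbers back to character values

values = [1.5, float("nan"), 2.5, float("nan")]
missing_flags = [math.isnan(v) for v in values]        # R: is.na(values)
cleaned = [v for v in values if not math.isnan(v)]     # R: na.omit(values)

n_chars = len("python")                # R: nchar("python")

print(as_ints, as_floats, as_text, missing_flags, cleaned, n_chars)
```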
Date & Time Manipulation
(R examples import the lubridate library; Python examples import the datetime library)

Getting the date and time at an instant
    R:      Sys.time()
    Python: datetime.datetime.now()
Date and time in the format YYYY MM DD HH:MM:SS
    R:      d <- Sys.time()
            d_format <- ymd_hms(d)
    Python: d = datetime.datetime.now()
            fmt = "%Y %m %d %H:%M:%S"
            d_format = d.strftime(fmt)
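In Python, strftime turns a datetime into text and strptime parses text back into a datetime using the same format codes (%m is the two-digit month, while %b is the abbreviated month name). A minimal round-trip sketch with a fixed, made-up instant so the result is reproducible:

```python
from datetime import datetime

fmt = "%Y %m %d %H:%M:%S"                   # YYYY MM DD HH:MM:SS

moment = datetime(2024, 3, 9, 14, 5, 7)     # fixed, made-up instant
as_text = moment.strftime(fmt)              # datetime -> string
parsed = datetime.strptime(as_text, fmt)    # string -> datetime

print(as_text)
```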
Data Visualization
(R examples use base graphics; Python examples use matplotlib)

Scatter plot
    R:      plot(variable1, variable2)
    Python: import matplotlib.pyplot as plt
            plt.scatter(variable1, variable2)
            plt.show()
Boxplot
    R:      boxplot(Var)
    Python: plt.boxplot(Var)
            plt.show()
Histogram
    R:      hist(Var)
    Python: plt.hist(Var)
            plt.show()
Line plot
    R:      plot(var1, var2, type = 'l')
    Python: plt.plot(var1, var2)
            plt.show()
Bubble plot
    R:      symbols(var1, var2, circles = var3, inches = 0.2)
    Python: plt.scatter(var1, var2, s = var3*200)
            plt.show()
Bar plot
    R:      barplot(var)
    Python: import numpy as np
            plt.bar(np.arange(len(var)), var)
            plt.show()
Pie chart
    R:      pie(Var)
    Python: plt.pie(Var)
            plt.show()
[Slides showing example plots side by side for R and Python: Scatter Plot, Box Plot, Factor Plot, Histogram, Line Plot, Bubble Plot, Bar Plot, Pie Chart, Joint Plot, ggplot]
Machine Learning: SVM on Iris Dataset

R (using the svm* function):
    library(e1071)
    data(iris)
    trainset <- iris[1:149, ]
    testset <- iris[150, ]
    svm.model <- svm(Species ~ ., data = trainset, cost = 100, gamma = 1, type = 'C-classification')
    svm.pred <- predict(svm.model, testset[-5])
    svm.pred
    Output: virginica

Python (using the sklearn** library):
    # Loading the library
    from sklearn import svm
    # Importing the dataset module
    from sklearn import datasets
    # Creating the SVM classifier
    clf = svm.SVC()
    # Loading the dataset
    iris = datasets.load_iris()
    # Constructing the training data (all samples except the last)
    X, y = iris.data[:-1], iris.target[:-1]
    # Fitting the SVM
    clf.fit(X, y)
    # Testing the model on the held-out last sample
    print(clf.predict(iris.data[-1:]))
    Output: 2, corresponds to virginica

*To know more about the svm function in R, visit: http://cran.r-project.org/web/packages/e1071/
**To install the sklearn library, visit: http://scikit-learn.org/. To know more about sklearn svm, visit: http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
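The hold-out-one pattern used on these slides depends on negative indexing: [:-1] keeps everything but the last sample, and [-1:] keeps the last sample as a two-dimensional array, which is the shape predict() expects. A sketch assuming scikit-learn is installed; the variable names are illustrative.

```python
from sklearn import datasets, svm

iris = datasets.load_iris()

# Train on every sample except the last one.
X_train, y_train = iris.data[:-1], iris.target[:-1]
# Keep the last sample two-dimensional (1 row, 4 features) for predict().
X_test = iris.data[-1:]

clf = svm.SVC()
clf.fit(X_train, y_train)

predicted = clf.predict(X_test)
predicted_name = iris.target_names[predicted[0]]   # map the class index to a species name

print(predicted, predicted_name)
```

A single held-out sample says little about accuracy; it only shows the mechanics of fitting and predicting.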
Machine Learning: Linear Regression on Iris Dataset

R (using the lm* function):
    data(iris)
    total_size <- dim(iris)[1]
    num_target <- c(rep(0, total_size))
    for (i in 1:length(num_target)) {
      if (iris$Species[i] == 'setosa') {num_target[i] <- 0}
      else if (iris$Species[i] == 'versicolor') {num_target[i] <- 1}
      else {num_target[i] <- 2}
    }
    iris$Species <- num_target
    train_set <- iris[1:149, ]
    test_set <- iris[150, ]
    fit <- lm(Species ~ 0 + Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = train_set)
    coefficients(fit)
    predict.lm(fit, test_set)
    Output: 1.64

Python (using the sklearn** library):
    from sklearn import linear_model
    from sklearn import datasets
    iris = datasets.load_iris()
    regr = linear_model.LinearRegression()
    X, y = iris.data[:-1], iris.target[:-1]
    regr.fit(X, y)
    print(regr.coef_)
    print(regr.predict(iris.data[-1:]))
    Output: 1.65

*To know more about the lm function in R, visit: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html
**To know more about sklearn linear regression, visit: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
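One detail worth knowing when comparing the two fits: the 0 + term in an R formula drops the intercept, whereas sklearn's LinearRegression fits an intercept by default. A sketch of both variants, assuming scikit-learn is installed; the variable names are illustrative and no particular output values are claimed.

```python
from sklearn import datasets, linear_model

iris = datasets.load_iris()
X_train, y_train = iris.data[:-1], iris.target[:-1]
X_test = iris.data[-1:]

# Default: an intercept is estimated, like lm(y ~ x1 + x2 + ...) in R.
with_intercept = linear_model.LinearRegression()
# fit_intercept=False forces the fit through the origin, like lm(y ~ 0 + x1 + ...).
without_intercept = linear_model.LinearRegression(fit_intercept=False)

with_intercept.fit(X_train, y_train)
without_intercept.fit(X_train, y_train)

print(with_intercept.intercept_, with_intercept.predict(X_test))
print(without_intercept.intercept_, without_intercept.predict(X_test))
```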
Machine Learning: RandomForest on Iris Dataset

R (using the randomForest* package):
    library(randomForest)
    data(iris)
    total_size <- dim(iris)[1]
    num_target <- c(rep(0, total_size))
    for (i in 1:length(num_target)) {
      if (iris$Species[i] == 'setosa') {num_target[i] <- 0}
      else if (iris$Species[i] == 'versicolor') {num_target[i] <- 1}
      else {num_target[i] <- 2}
    }
    iris$Species <- num_target
    train_set <- iris[1:149, ]
    test_set <- iris[150, ]
    iris.rf <- randomForest(Species ~ ., data = train_set, ntree = 100, importance = TRUE, proximity = TRUE)
    print(iris.rf)
    predict(iris.rf, test_set[-5], predict.all = TRUE)
    Output: 1.845

Python (using the sklearn** library):
    from sklearn import ensemble
    from sklearn import datasets
    clf = ensemble.RandomForestClassifier(n_estimators = 100, max_depth = 10)
    iris = datasets.load_iris()
    X, y = iris.data[:-1], iris.target[:-1]
    clf.fit(X, y)
    print(clf.predict(iris.data[-1:]))
    Output: 2

*To know more about the randomForest package in R, visit: http://cran.r-project.org/web/packages/randomForest/
**To know more about sklearn random forest, visit: http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
Machine Learning: Decision Tree on Iris Dataset

R (using the rpart* package):
    library(rpart)
    data(iris)
    sub <- c(1:149)
    fit <- rpart(Species ~ ., data = iris, subset = sub)
    fit
    predict(fit, iris[-sub, ], type = "class")
    Output: virginica

Python (using the sklearn** library):
    from sklearn import datasets
    from sklearn.tree import DecisionTreeClassifier
    clf = DecisionTreeClassifier(random_state = 0)
    iris = datasets.load_iris()
    X, y = iris.data[:-1], iris.target[:-1]
    clf.fit(X, y)
    print(clf.predict(iris.data[-1:]))
    Output: 2, corresponds to virginica

*To know more about the rpart package in R, visit: http://cran.r-project.org/web/packages/rpart/
**To know more about sklearn decision tree, visit: http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
Machine Learning: Gaussian Naive Bayes on Iris Dataset

R (using the e1071* package):
    library(e1071)
    data(iris)
    trainset <- iris[1:149, ]
    testset <- iris[150, ]
    classifier <- naiveBayes(trainset[, 1:4], trainset[, 5])
    predict(classifier, testset[, -5])
    Output: virginica

Python (using the sklearn** library):
    from sklearn import datasets
    from sklearn.naive_bayes import GaussianNB
    clf = GaussianNB()
    iris = datasets.load_iris()
    X, y = iris.data[:-1], iris.target[:-1]
    clf.fit(X, y)
    print(clf.predict(iris.data[-1:]))
    Output: 2, corresponds to virginica

*To know more about the e1071 package in R, visit: http://cran.r-project.org/web/packages/e1071/
**To know more about sklearn Naive Bayes, visit: http://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html
Machine Learning: K Nearest Neighbours on Iris Dataset

R (using the kknn* package):
    library(kknn)
    data(iris)
    trainset <- iris[1:149, ]
    testset <- iris[150, ]
    iris.kknn <- kknn(Species ~ ., trainset, testset, distance = 1, kernel = "triangular")
    summary(iris.kknn)
    fit <- fitted(iris.kknn)
    fit
    Output: virginica

Python (using the sklearn** library):
    from sklearn import datasets
    from sklearn.neighbors import KNeighborsClassifier
    knn = KNeighborsClassifier()
    iris = datasets.load_iris()
    X, y = iris.data[:-1], iris.target[:-1]
    knn.fit(X, y)
    print(knn.predict(iris.data[-1:]))
    Output: 2, corresponds to virginica

*To know more about the kknn package in R, visit: https://cran.r-project.org/web/packages/kknn/
**To know more about sklearn k nearest neighbours, visit: http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html
Thank You

Python for R users

  • 1. PYTHON for R Users By- Satyarth Praveen
  • 2. Functions R Python Downloading and installing a package install.packages('name') pip install name Load a package library(name) import name as other_name Checking working directory getwd() import os os.getcwd() Setting working directory setwd(‘path’) os.chdir(‘path’) List files in a directory dir(‘path’) os.listdir(‘path’) List all objects ls() globals() Remove an object rm('name') del('object') See manual for the function help(help) help(help) See the type of an object class(object) type(object) Basic Commands
  • 3. Functions R Python (Using pandas package) Creating a data frame “df” of dimension 6x4 (6 rows and 4 columns) containing random numbers A<- matrix(runif(24,0,1),nrow=6,ncol=4) df<-data.frame(A) Here, • runif function generates 24 random numbers between 0 to 1 • matrix function creates a matrix from those random numbers, nrow and ncol sets the numbers of rows and columns to the matrix • data.frame converts the matrix to data frame import numpy as np import pandas as pd A=np.random.randn(6,4) df=pd.DataFrame(A) Here, • np.random.randn generates a matrix of 6 rows and 4 columns; this function is a part of numpy** library • pd.DataFrame converts the matrix in to a data frame To read a csv from a URL or a file. read.csv("URL or file_name") Import pandas as pd pd.read_csv("URL or file_name") Data Frame Creation *To install Pandas library visit: http://pandas.pydata.org/; To import Pandas library type: import pandas as pd; **To import Numpy library type: import numpy as np;
  • 5. Functions R Python Getting the names of rows and columns of data frame “df” rownames(df) colnames(df) df.index df.columns Seeing the top and bottom “x” rows of the data frame “df” head(df, x) tail(df, x) df.head(x) df.tail(x) Getting dimension of data frame “df” dim(df) df.shape Length of data frame “df” length(df) returns the number of columns of the data frame len(df) returns no. of rows in data frames To view the correlation among the columns of the data frames cor(df) df.corr() To view the unique entries of a vector unique(df$column_name) df.column_name.unique() Frequency count of each value in the column in the data frame table(diamonds$cut) OR summary(diamonds$cut) pd.value_counts(diamonds.cut) Data Frame: Inspecting and Viewing Data
  • 6. Functions R Python Group the data frame as per the values of a column library(dplyr) df <- tbl_df(df) groupby_df <- df %>% group_by(column_name) Import pandas as pd groupby_df = pd.groupby(df, df.column_name) To perform a grouped mean on the column ‘column_name’ of the data frame summarise(groupby_df, mean(column_name)) groupby_df.column_name.mean() Frequency table for the values in 2 columns table(df$column_name1, df$column_name2) pd.crosstab(df.column_name1, df.column_name2, margins='TRUE') Data Frame: Inspecting and Viewing Data
  • 7. Data Frame: Inspecting and Viewing Data R Python
  • 8. Data Frame: Inspecting and Viewing Data (Functions in R and Python)
      Getting a quick summary (mean, standard deviation, etc.) of the data in the data frame "df"
        R: summary(df) (returns mean, median, maximum, minimum, first quartile and third quartile)
        Python: df.describe() (returns count, mean, standard deviation, maximum, minimum, 25%, 50% and 75%)
      Getting a compact view of the data structure of the object
        R: str(df)
        Python: df.info()
      Checking all available methods for the object
        R:
          isS4(object_name)  # to check if the object is S4 type
          is(object_name, 'refClass')  # to check if the object is RC type
          # if both are false then the object is of S3 type:
          methods(class = class(object_name))
          # if the object is of S4 type:
          showMethods(classes = class(object_name))
        Python: dir(diamonds)
      Setting row names and column names of the data frame "df"
        R:
          rownames(df) = c("A", "B", "C", "D", "E", "F")
          colnames(df) = c("P", "Q", "R", "S")
        Python:
          df.index = ["A", "B", "C", "D", "E", "F"]
          df.columns = ["P", "Q", "R", "S"]
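The summary and labelling calls can be sketched on a 6x4 frame filled with the numbers 0 to 23 (an arbitrary choice made here so the results are easy to check by hand), assuming numpy and pandas are installed.

```python
import numpy as np
import pandas as pd

# 6x4 frame holding 0..23, row by row.
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4))

# Set row and column labels, like rownames(df) and colnames(df) in R.
df.index = ["A", "B", "C", "D", "E", "F"]
df.columns = ["P", "Q", "R", "S"]

# Per-column count, mean, std, min, quartiles and max.
summary = df.describe()
print(summary)

# Compact structural overview (dtypes, non-null counts), like str(df).
df.info()

# Public attributes and methods of the object, filtered from dir(df).
public_names = [name for name in dir(df) if not name.startswith("_")]
print(len(public_names))
```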
  • 9. Data Frame: Inspecting and Viewing Data R Python
  • 10. Data Frame: Sorting Data (Functions in R and Python)
      Sorting the data in the data frame "df" by column name "P"
        R: df[order(df$P),]
        Python: df.sort_values(by=['P'])
          The older command df.sort(['P']) is deprecated and has been removed from later pandas releases.
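A short sketch of sorting by a column, assuming pandas is installed; the data values are invented. Note that sort_values returns a new frame and leaves the original order untouched.

```python
import pandas as pd

df = pd.DataFrame({"P": [3, 1, 2], "Q": ["c", "a", "b"]})

# Ascending sort by column P, like df[order(df$P),] in R.
sorted_df = df.sort_values(by="P")

# Descending sort, like df[order(-df$P),] in R.
sorted_desc = df.sort_values(by="P", ascending=False)

print(sorted_df)
print(sorted_desc)
```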
  • 11. Data Frame: Sorting Data R Python
  • 12. Data Frame: Data Selection (Functions in R and Python)
      Slicing the rows of a data frame from row no. "x" to row no. "y" (including rows x and y)
        R: df[x:y,]
        Python: df[x-1:y] (Python starts counting from 0)
      Selecting the columns named "X", "Y", etc. of a data frame "df"
        R: myvars <- c("X", "Y"); newdata <- df[myvars]
        Python: df.loc[:, ['X', 'Y']]
      Selecting the data from row no. "x" to "y" and column no. "a" to "b"
        R: df[x:y, a:b]
        Python: df.iloc[x-1:y, a-1:b]
      Selecting the element at row no. "x" and column no. "y"
        R: df[x, y]
        Python: df.iat[x-1, y-1]
      Selecting a column of the data frame
        R: df$column_name
        Python: df.column_name
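Because R counts from 1 and Python from 0, the R range x:y corresponds to the Python slice x-1:y. The sketch below checks that mapping on a small frame, assuming numpy and pandas are installed; the column names and the chosen row and column numbers are arbitrary.

```python
import numpy as np
import pandas as pd

# 6x4 frame holding 0..23, row by row.
df = pd.DataFrame(np.arange(24).reshape(6, 4), columns=["X", "Y", "Z", "W"])

x, y = 2, 4  # 1-based row numbers, as they would be written in R
a, b = 1, 2  # 1-based column numbers, as they would be written in R

rows = df[x - 1:y]                 # R: df[x:y,]
cols = df.loc[:, ["X", "Y"]]       # R: df[c("X", "Y")]
block = df.iloc[x - 1:y, a - 1:b]  # R: df[x:y, a:b]

r, c = 2, 3                        # 1-based row and column of a single cell
cell = df.iat[r - 1, c - 1]        # R: df[r, c]

column = df["X"]                   # R: df$X (df.X also works for simple names)

print(block)
print(cell)
```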
  • 13. Data Frame: Data Selection R Python
  • 14. Data Frame: Data Selection (Functions in R and Python)
      Using a single column's values to select data (column name "A")
        R: subset(df, A > 0) (selects all the rows whose value in column A is greater than 0)
        Python: df[df.A > 0] (does the same as the R function)
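A minimal sketch of selecting rows by a condition on one column, assuming pandas is installed. The data values are invented; df.query is shown only as an alternative spelling of the same filter.

```python
import pandas as pd

df = pd.DataFrame({"A": [-1.0, 0.5, 2.0, 0.0], "B": [10, 20, 30, 40]})

# Boolean mask: True where column A is greater than 0.
mask = df["A"] > 0

# Keep only the rows where the mask is True, like subset(df, A > 0) in R.
positive = df[mask]

# Same filter written as a query string.
positive_q = df.query("A > 0")

print(positive)
```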
  • 15. Data Frame: Using SQL (Functions in R and Python)
      Running an SQL query on a data frame
        R:
          library(sqldf)
          library(tcltk)
          sqldf("SELECT * FROM diamonds2 LIMIT 5;")
          sqldf("SELECT * FROM diamonds2 WHERE carat > 4;")
        Python:
          from pandasql import sqldf
          pysqldf = lambda q: sqldf(q, globals())
          # an equivalent of the lambda function above would be:
          # def pysqldf(q):
          #     return sqldf(q, globals())
          # q here is the query.
          pysqldf("SELECT * FROM diamonds2 LIMIT 5;")
          pysqldf("SELECT * FROM diamonds2 WHERE carat > 4;")
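If pandasql is not available, roughly the same queries can be run with the standard-library sqlite3 module and pandas' own SQL helpers. This is a swapped-in technique, not the pandasql approach from the slide, and the diamonds2 rows below are invented for illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical stand-in for the diamonds2 data frame.
diamonds2 = pd.DataFrame({
    "carat": [0.3, 4.5, 1.2, 5.01],
    "cut": ["Ideal", "Fair", "Good", "Fair"],
})

# Copy the frame into an in-memory SQLite table, then query it with SQL.
conn = sqlite3.connect(":memory:")
diamonds2.to_sql("diamonds2", conn, index=False)

first_rows = pd.read_sql_query("SELECT * FROM diamonds2 LIMIT 5;", conn)
big_stones = pd.read_sql_query("SELECT * FROM diamonds2 WHERE carat > 4;", conn)
conn.close()

print(first_rows)
print(big_stones)
```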
  • 16. Data Frame: Using SQL R Python
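  • 17. Mathematical Functions: Functions in R and Python (import the math and numpy libraries)
      Sum
        R: sum(x)
        Python: math.fsum(x)
      Square root
        R: sqrt(x)
        Python: math.sqrt(x)
      Standard deviation
        R: sd(x)
        Python: numpy.std(x, ddof=1) (sd divides by n-1; numpy.std(x) alone divides by n)
      Log
        R: log(x, base)
        Python: math.log(x, base)
      Mean
        R: mean(x)
        Python: numpy.mean(x)
      Median
        R: median(x)
        Python: numpy.median(x)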
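The functions on this slide can be checked on a small vector. The values below are chosen here so the results are easy to verify by hand; the one point worth noting is that R's sd uses the n-1 denominator, which numpy.std only matches with ddof=1.

```python
import math

import numpy as np

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

total = math.fsum(x)           # like sum(x)
root = math.sqrt(16.0)         # like sqrt(16)
log_base2 = math.log(8.0, 2)   # like log(8, base = 2)
mean = np.mean(x)              # like mean(x)
median = np.median(x)          # like median(x)

sd_population = np.std(x)      # divides by n
sd_sample = np.std(x, ddof=1)  # divides by n-1, like sd(x) in R

print(total, root, log_base2, mean, median, sd_population, sd_sample)
```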
  • 19. Data Manipulation: Functions in R and Python (import the math and numpy libraries)
      Converting a character variable to a numeric variable
        R: as.numeric(x)
        Python: for a single value: int(x), float(x) (long(x) exists only in Python 2); for lists, vectors etc.: list(map(int, x)), list(map(float, x)) (in Python 3, map returns an iterator, hence the list call)
      Converting a factor/numeric variable to a character variable
        R: paste(x)
        Python: for a single value: str(x); for lists, vectors etc.: list(map(str, x))
      Checking for a missing value in an object
        R: is.na(x)
        Python: math.isnan(x)
      Deleting missing values from an object
        R: na.omit(list)
        Python: cleanedList = [x for x in list if str(x) != 'nan']
      Calculating the number of characters in a character value
        R: nchar(x)
        Python: len(x)
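A sketch of these conversions using only the Python 3 standard library. The sample values are invented, and the missing-value filter uses math.isnan directly rather than the string comparison shown on the slide.

```python
import math

# Character to numeric, like as.numeric(x).
as_ints = list(map(int, ["1", "2", "3"]))
as_floats = list(map(float, ["1", "2.5", "3"]))

# Numeric to character, like paste(x).
as_text = list(map(str, [1, 2.5]))

# Missing values: float("nan") plays the role of NA for floats.
values = [1.0, float("nan"), 3.0]
is_missing = [math.isnan(v) for v in values]         # like is.na(x)
cleaned = [v for v in values if not math.isnan(v)]   # like na.omit(x)

# Number of characters, like nchar(x).
n_chars = len("diamond")

print(as_ints, as_floats, as_text, is_missing, cleaned, n_chars)
```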
  • 20. Date & Time Manipulation: Functions in R (import the lubridate library) and Python (import the datetime library)
      Getting the date and time at an instant
        R: Sys.time()
        Python: datetime.datetime.now()
      Parsing date and time in the format YYYY-MM-DD HH:MM:SS
        R:
          d <- Sys.time()
          d_format <- ymd_hms(d)
        Python:
          d = datetime.datetime.now()
          format = "%Y-%m-%d %H:%M:%S"
          d_format = d.strftime(format)
          (strftime turns a datetime into text in this format; datetime.datetime.strptime parses such text back into a datetime)
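A standard-library sketch of formatting and parsing with the YYYY-MM-DD HH:MM:SS pattern. The fixed timestamp string is an arbitrary example value.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S"

# Current date and time, like Sys.time() in R.
now = datetime.now()

# datetime -> text in the chosen format.
text = now.strftime(fmt)

# text -> datetime, roughly what ymd_hms() does in lubridate.
parsed = datetime.strptime("2015-06-01 13:45:30", fmt)

# Formatting drops microseconds, so a round trip recovers the time to the second.
round_trip = datetime.strptime(text, fmt)

print(text)
print(parsed)
```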
  • 21. Data Visualization: Functions in R and Python (import matplotlib.pyplot as plt)
      Scatter plot
        R: plot(variable1, variable2)
        Python: import matplotlib.pyplot as plt; plt.scatter(variable1, variable2); plt.show()
      Boxplot
        R: boxplot(Var)
        Python: plt.boxplot(Var); plt.show()
      Histogram
        R: hist(Var)
        Python: plt.hist(Var); plt.show()
      Line plot
        R: plot(var1, var2, type = 'l')
        Python: plt.plot(var1, var2); plt.show()
      Bubble plot
        R: symbols(var1, var2, circles = var3, inches = 0.2)
        Python: plt.scatter(var1, var2, s = var3*200); plt.show()
      Bar plot
        R: barplot(var)
        Python: plt.bar(np.arange(len(var)), var); plt.show()
      Pie chart
        R: pie(Var)
        Python: plt.pie(Var); plt.show()
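The plot types on this slide can be drawn together on one figure. This is only a sketch: it assumes matplotlib and numpy are installed, uses randomly generated data, and selects the non-interactive Agg backend and an in-memory buffer so it runs without a display (replace the savefig call with plt.show() for interactive use).

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
var1 = rng.normal(size=50)
var2 = rng.normal(size=50)
var3 = rng.uniform(0.1, 1.0, size=50)
counts = [3, 7, 5]

fig, axes = plt.subplots(2, 3, figsize=(12, 7))
axes[0, 0].scatter(var1, var2)                    # scatter plot
axes[0, 1].boxplot(var1)                          # boxplot
axes[0, 2].hist(var1)                             # histogram
axes[1, 0].plot(np.arange(50), np.cumsum(var1))   # line plot
axes[1, 1].scatter(var1, var2, s=var3 * 200)      # bubble plot: marker size from var3
axes[1, 2].bar(np.arange(len(counts)), counts)    # bar plot
fig.tight_layout()

# Render to an in-memory PNG instead of showing a window.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```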
  • 23. Data Visualization: Box Plot R Python
  • 24. Data Visualization: Box Plot R Python
  • 25. Data Visualization: Factor Plot R Python
  • 27. Data Visualization: Line Plot R Python
  • 28. Data Visualization: Bubble Plot R Python
  • 29. Data Visualization: Bar Plot R Python
  • 30. Data Visualization: Pie Chart R Python
  • 31. Data Visualization: Joint Plot R Python
  • 33. Machine Learning: SVM on Iris Dataset. R (using the svm* function) and Python (using the sklearn** library)
      R:
        library(e1071)
        data(iris)
        trainset <- iris[1:149,]
        testset <- iris[150,]
        svm.model <- svm(Species ~ ., data = trainset, cost = 100, gamma = 1, type = 'C-classification')
        svm.pred <- predict(svm.model, testset[-5])
        svm.pred
        Output: virginica
      Python:
        # Loading the library
        from sklearn import svm
        # Importing the datasets module
        from sklearn import datasets
        # Creating the SVM classifier
        clf = svm.SVC()
        # Loading the data set
        iris = datasets.load_iris()
        # Constructing training data (all rows except the last)
        X, y = iris.data[:-1], iris.target[:-1]
        # Fitting SVM
        clf.fit(X, y)
        # Testing the model on test data (the last row, kept two-dimensional)
        print(clf.predict(iris.data[-1:]))
        Output: 2, corresponds to virginica
      *To know more about the svm function in R visit: http://cran.r-project.org/web/packages/e1071/
      **To install the sklearn library visit: http://scikit-learn.org/; to know more about sklearn svm visit: http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
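A runnable sketch of the hold-out-one-row pattern used on this slide, assuming scikit-learn is installed. The last of the 150 iris rows is held out and passed to predict as a two-dimensional slice, since current scikit-learn expects one row per sample. The variable names are chosen here for readability.

```python
from sklearn import datasets, svm

iris = datasets.load_iris()

# Train on the first 149 rows, test on the last one.
X_train, y_train = iris.data[:-1], iris.target[:-1]
X_test = iris.data[-1:]  # shape (1, 4): a single sample, kept 2-D

clf = svm.SVC()  # default settings, as on the slide
clf.fit(X_train, y_train)

pred = clf.predict(X_test)
print(pred, iris.target_names[pred])  # class index and its species name
```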
  • 34. Machine Learning: Linear Regression on Iris Dataset. R (using the lm* function) and Python (using the sklearn** library)
      R:
        data(iris)
        total_size <- dim(iris)[1]
        num_target <- c(rep(0, total_size))
        for (i in 1:length(num_target)) {
          if (iris$Species[i] == 'setosa') {
            num_target[i] <- 0
          } else if (iris$Species[i] == 'versicolor') {
            num_target[i] <- 1
          } else {
            num_target[i] <- 2
          }
        }
        iris$Species <- num_target
        train_set <- iris[1:149,]
        test_set <- iris[150,]
        fit <- lm(Species ~ 0 + Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = train_set)
        coefficients(fit)
        predict.lm(fit, test_set)
        Output: 1.64
      Python:
        from sklearn import linear_model
        from sklearn import datasets
        iris = datasets.load_iris()
        regr = linear_model.LinearRegression()
        X, y = iris.data[:-1], iris.target[:-1]
        regr.fit(X, y)
        print(regr.coef_)
        print(regr.predict(iris.data[-1:]))
        Output: 1.65
      *To know more about the lm function in R visit: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html
      **To know more about sklearn linear regression visit: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
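One difference between the two columns is worth trying out: the R formula starts with 0 +, which drops the intercept, while LinearRegression fits an intercept by default. The sketch below, assuming scikit-learn is installed, fits both variants; fit_intercept=False is the setting that mirrors the R formula.

```python
from sklearn import datasets, linear_model

iris = datasets.load_iris()
X_train, y_train = iris.data[:-1], iris.target[:-1]
X_test = iris.data[-1:]  # last row, kept 2-D

# Default model: includes an intercept term.
regr = linear_model.LinearRegression()
regr.fit(X_train, y_train)
pred_with_intercept = regr.predict(X_test)

# No-intercept model, mirroring "Species ~ 0 + ..." in the R code.
regr_no_intercept = linear_model.LinearRegression(fit_intercept=False)
regr_no_intercept.fit(X_train, y_train)
pred_no_intercept = regr_no_intercept.predict(X_test)

print(regr.coef_, regr.intercept_)
print(pred_with_intercept, pred_no_intercept)
```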
  • 35. Machine Learning: RandomForest on Iris Dataset. R (using the randomForest* package) and Python (using the sklearn** library)
      R:
        library(randomForest)
        data(iris)
        total_size <- dim(iris)[1]
        num_target <- c(rep(0, total_size))
        for (i in 1:length(num_target)) {
          if (iris$Species[i] == 'setosa') {
            num_target[i] <- 0
          } else if (iris$Species[i] == 'versicolor') {
            num_target[i] <- 1
          } else {
            num_target[i] <- 2
          }
        }
        iris$Species <- num_target
        train_set <- iris[1:149,]
        test_set <- iris[150,]
        iris.rf <- randomForest(Species ~ ., data = train_set, ntree = 100, importance = TRUE, proximity = TRUE)
        print(iris.rf)
        predict(iris.rf, test_set[-5], predict.all = TRUE)
        Output: 1.845
      Python:
        from sklearn import ensemble
        from sklearn import datasets
        clf = ensemble.RandomForestClassifier(n_estimators=100, max_depth=10)
        iris = datasets.load_iris()
        X, y = iris.data[:-1], iris.target[:-1]
        clf.fit(X, y)
        print(clf.predict(iris.data[-1:]))
        Output: 2
      *To know more about the randomForest package in R visit: http://cran.r-project.org/web/packages/randomForest/
      **To know more about sklearn random forest visit: http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
  • 36. Machine Learning: Decision Tree on Iris Dataset. R (using the rpart* package) and Python (using the sklearn** library)
      R:
        library(rpart)
        data(iris)
        sub <- c(1:149)
        fit <- rpart(Species ~ ., data = iris, subset = sub)
        fit
        predict(fit, iris[-sub,], type = "class")
        Output: virginica
      Python:
        from sklearn import datasets
        from sklearn.tree import DecisionTreeClassifier
        clf = DecisionTreeClassifier(random_state=0)
        iris = datasets.load_iris()
        X, y = iris.data[:-1], iris.target[:-1]
        clf.fit(X, y)
        print(clf.predict(iris.data[-1:]))
        Output: 2, corresponds to virginica
      *To know more about the rpart package in R visit: http://cran.r-project.org/web/packages/rpart/
      **To know more about sklearn decision tree visit: http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
  • 37. Machine Learning: Gaussian Naive Bayes on Iris Dataset. R (using the e1071* package) and Python (using the sklearn** library)
      R:
        library(e1071)
        data(iris)
        trainset <- iris[1:149,]
        testset <- iris[150,]
        classifier <- naiveBayes(trainset[,1:4], trainset[,5])
        predict(classifier, testset[,-5])
        Output: virginica
      Python:
        from sklearn import datasets
        from sklearn.naive_bayes import GaussianNB
        clf = GaussianNB()
        iris = datasets.load_iris()
        X, y = iris.data[:-1], iris.target[:-1]
        clf.fit(X, y)
        print(clf.predict(iris.data[-1:]))
        Output: 2, corresponds to virginica
      *To know more about the e1071 package in R visit: http://cran.r-project.org/web/packages/e1071/
      **To know more about sklearn Naive Bayes visit: http://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html
  • 38. Machine Learning: K Nearest Neighbours on Iris Dataset. R (using the kknn* package) and Python (using the sklearn** library)
      R:
        library(kknn)
        data(iris)
        trainset <- iris[1:149,]
        testset <- iris[150,]
        iris.kknn <- kknn(Species ~ ., trainset, testset, distance = 1, kernel = "triangular")
        summary(iris.kknn)
        fit <- fitted(iris.kknn)
        fit
        Output: virginica
      Python:
        from sklearn import datasets
        from sklearn.neighbors import KNeighborsClassifier
        knn = KNeighborsClassifier()
        iris = datasets.load_iris()
        X, y = iris.data[:-1], iris.target[:-1]
        knn.fit(X, y)
        print(knn.predict(iris.data[-1:]))
        Output: 2, corresponds to virginica
      *To know more about the kknn package in R visit: https://cran.r-project.org/web/packages/kknn/
      **To know more about sklearn k nearest neighbours visit: http://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html
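The random forest, decision tree, Gaussian naive Bayes and k nearest neighbours slides all follow the same fit-then-predict pattern, so they can be sketched in one loop. This assumes scikit-learn is installed; the random_state values are added here only for reproducibility and are not on every slide.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()

# Train on the first 149 rows, test on the last one (kept 2-D).
X_train, y_train = iris.data[:-1], iris.target[:-1]
X_test = iris.data[-1:]

models = {
    "random forest": RandomForestClassifier(n_estimators=100, max_depth=10, random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "gaussian naive bayes": GaussianNB(),
    "k nearest neighbours": KNeighborsClassifier(),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    label = int(model.predict(X_test)[0])               # predicted class index
    predictions[name] = str(iris.target_names[label])   # map index to species name
    print(name, "->", predictions[name])
```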