SlideShare ist ein Scribd-Unternehmen logo
1 von 35
Programming Language
History and Introduction
 R is a programming language and free software environment for statistical
computing and graphics supported by the R Foundation for Statistical Computing.
 R is widely used by statisticians, data analysts and researchers for developing
statistical software and data analysis.
 It compiles and runs on a wide variety of UNIX platforms, Windows and Mac OS.
 The copyright for the primary source code for R is held by the R Foundation and is
published under the GNU General Public License version 2.0.
2
3
History and Introduction
 R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New
Zealand.
 Currently R is developed & maintained by the R Development Core Team.
 The Applications of R programming language includes :
1.Statical Computing
2.Machine Learning
3.Data Science
 R can be downloaded and installed from CRAN(Comprehensive R Archive Network) website.
 R language is cross platform interoperable and fully portable which means R program that
you write on one platform can be carried out to other platform and run there.(Platform
independent)
4
History and Introduction
Top Tier companies using R – companies all over the world use R language for statical analysis.
These are some of top tier companies that uses R.
Company Name Applications
Facebook For behavior analysis related to status updates and profile pictures.
Google For advertising effectiveness and economic forecasting.
Twitter For data visualization and semantic clustering
Microsoft
Acquired Revolution R company and use it for a variety of
purposes.
Uber For statistical analysis
Airbnb Scale data science.
5
Evolution of R
R is a dialect of S language…
 It means that R is an implementation of the S programming language combined
with lexical scoping semantics & inspired by Scheme.
 S language was created by John Chambers in 1976 at Bell Labs.
 A commercial version of S was offered as S-PLUS starting in 1988.
S version1
S version 2
S version 3
S version4
6
Features of R
These are the some of important features of R -
 R is a simple, effective and well-developed, programming language which
includes conditionals, loops, user defined & recursive functions and input &
output facilities.
 R provides a large, coherent and integrated collection of tools for data
analysis.
 R has an effective data handling and storage facility.
 R provides a suite of operators for calculations on arrays, lists, vectors and
matrices.
 R provides graphical facilities for data analysis and display either directly at
the computer or printing at the papers.
7
Features of R
Fast Calculation
Extremely Compatible
Open Source
Cross Platform Support
Wide Packages
Large Standard Library
 Fast Calculation - R can be used to perform complex mathematical
and statistical calculations on data objects of a wide variety.
 Extreme Compatibility - R is an interpreted language which means
that it does not need a compiler to make a program from the code.
 Open Source - R is an open-source software environment. You can
make improvements and add packages for additional functionalities
 Cross Platform Support - R is machine-independent. It supports
the cross-platform operation. Therefore, it can be used on many
different operating systems.
 Wide Packages - CRAN houses more
than 10,000 different packages and extensions that help solve all
sorts of problems in data science.
 Large Standard Library - R can produce static graphics with
production quality visualizations and has extended libraries
providing interactive graphic capabilities.
8
Syntax of R
Once we have R environment setup, then it’s easy to start our R command prompt by just
typing R in command prompt.
Hello World Program –
>myString <- “Hello world !”
>print(myString)
Output :
[1] “Hello World !”
The [ ] in the output of R can be used
to reference data frame columns
In the Syntax of R we will discuss –
 Data Types
 Variables
 Keywords
 Operators
 Data Structures
Data Types
Logical
Integer
Character
Numeric
raw
Complex
Data
Types
In R there are basically 6 data types –
Data Type Examples
Integer 2L,5L,8L
Numeric 6,2,1,9
Logical true,false,0,1
raw Raw Bytes
complex Z=3+7i
Character ‘A’ , ”Aditya” , ”AB12”
9
10
Variables
Rules For Naming Variables in R –
1. In R variable name must be a combination of letters, digits, period(.) and
underscores.
2. It must start with a letter or period(.) and if it starts with period then it period
should not be followed by number.
3. Reserved words in R cannot be used in variable name.
Valid variables Invalid Variables
 myValue
 .my.value.one
 my_value_one
 Data4
 .1nikku
 TRUE
 vik@sh
 _temp
Keywords
Reserved Keywords in R – Reserved words are set of words that have special meaning
and cannot be used as names of identifiers.
If Else Repeat While Function
For In Next Break TRUE
FALSE NULL inf NaN -
Reserved Keywords in R
11
12
Operators
In any programming language, an operator is a symbol which is used to represent an
action. R has several operators to perform tasks including arithmetic, logical and bitwise
operations.
Operators in R can mainly be classified into the following categories –
1.Arithmetic Operators = {+ , - , * , / , %% , %/%}
2.Logical Operators = { ! , & , && , | , ||}
3.Assignment operators = { <- , <<- , = , -> , ->>}
4.Relational Operators = { < , > , <= , >= , != , ==}
13
Functions
Functions are used to incorporate sets of instructions that you want to use repeatedly. There are two types of functions.
Function
Built In User Defined
14
Built - In
Built-in functions are those functions which are provided by R so that we can use directly
within the language and its standard libraries.
In R there are so many built-in functions which make our programming fast and easy.
For Example :
1.The sum(a,b) function will return (a+b)
>print(sum(10,20))
[1] 30
2.The seq(a,b) function is used to get sequence from a to b.
>print(seq(5,15))
[1] 5 6 7 8 9 10 11 12 13 14 15
User Defined
User defined functions are those functions which we define in our code and use them
repeatedly. These functions can be defined with two types.
1.Without Arguments 2.With Arguments
Without Arguments With Arguments
myFunction <- function()
{
#This will be printed on calling this funcition
print(“Without Arguments”)
}
myFunction <- function(a,b)
{
#This function will print sum of passed args
print(a+b)
}
16
Conditional Statements
 Conditional Statements in R programming are used to make decisions based on the conditions.
 Conditional statements execute sequentially when there is no condition around the statements.
In R language we’ll discuss 3 types of Conditional Statements –
1.If - else statements
2.If – else if – else statements
3.Switch statements
17
If-else
Start
Execute
Else block
End
Execute
If Block
Condition
True?
yes no
Syntax –
If(condition) {
expression 1
}
Else {
expression 2
}
Example –
If(a>b) {
print(“a is greater than b”)
}
Else {
print(“ a is less than b”)
}
18
Switch statement
 In switch() function we pass two types of arguments one is value and others is list of
items.
 The expression is evaluated based on the value and corresponding item is returned.
 If the value evaluated from the expression matches with more than one item of the
list then switch() function returns the item which was matched first.
Examples:
> switch(2,”Delhi”,”Jaipur”,”Mumbai”) > a=3
>[1] “Jaipur” > switch(a,”red”,”blue”,”green”,”yellow”)
> [1] “green”
19
Loops
Loops
In
R
Loops are used When we need to
execute particular code
repeatedly.
In R Language there are 3 types
of Loops –
1.For Loop
2.While Loop
3.Repeat Loop
For Loop
Example to count the number of even numbers in a
vector.
Program -
x <- c(2,5,3,9,8,11,6)
count <- 0
for (i in x) {
if(i %% 2 == 0) {
count=count+1
}
}
print(count)
Output -
[1] 3
No
Last item
Reached??
Body of
For Loop
Exit Loop
Yes
For each item
in Sequence
A for loop is used to iterate over a vector in R programming.
20
21
While Loop
In R programming, while loops are used to loop until a specific condition is met.
Program –
i <- 1
while(i<5) {
print(i)
i=i+1
}
Output –
[1] 1
[1] 2
[1] 3
[1] 4
Yes
No
Condition
True??
Execute code
of while block
Start
Execute code
outside while block
Repeat Loop
 A repeat loop is used to iterate over a block of code multiple number of times.
 There is no condition check in repeat loop to exit the loop.
 We must ourselves put a condition explicitly inside the body of the loop and use the break statement to
exit the loop. Failing to do so will result into an infinite loop.
Example –
x <- 1
repeat {
print(x)
x = x+1
if (x == 4) {
break
}
}
Output –
[1] 1
[1] 2
[1] 3
Body of
Loop
Break
?
Remaining
body of loop Exit
Enter Loop
Yes
No
22
23
Data Structures
Data
Structures
Vectors
Factors
Data
Frames
Lists
Matrices
Arrays
A data structure is a particular way of
organizing data in a computer so that it can
be used effectively. The idea is to reduce
the space and time complexities of different
tasks. Data structures in R programming are
tools for holding multiple values.
The most essential data structures used in R include :
 Vectors
 Arrays
 Factors
 Lists
 Matrices
 Data Frames
24
Vector
 Vector is the one of basic data structure of R which supports integer,
double, Character, logical, complex and raw data types.
 The elements in a vector are known as components of a vector.
Vector Creation
Vector can be created using these two methods :-
1.By Using Colon(:) Operator –
a <- 2:8
print(a) # 2 3 4 5 6 7 8
2.By Using seq() function–
a <- seq(2,10,by=2)
print(a) # 2 4 6 8 10
25
Vector
Vector Operations
1.Combining Vectors 2.Arithmetic Operations
a <- c(4,3,5) a <- c(1,2,3)
b <- c(‘x’,’y’,’z’) b <- c(4,5,6)
c <- c(a,b) d <- a+b
print(c) o/p= 4 3 5 x y z print(d) o/p = 5 7 9
3.Numeric Indexing 4.Duplicate Indexing
a <- c(4,3,5) a <- c(4,3,5)
Print(a[2]) op = 5 print(a[1,2,2,3,3]) o/p=4 3 3 5 5
5.Logical Indexing 6.Range Indexing
a <- c(4,3,5) a <- c(1,2,3,4,5,6,7)
print(a[true,false,true]) o/p = 4 5 print(a[2:6]) o/p = 2 3 4 5 6
26
Array
Arrays allow us to store data in multi - dimensions and use in efficient way.
array Creation
Syntax -
Array_Name <- array(data, dim=(row_size,column_size,matrices), dim_names)
array Operations
1.Accessing Array Elements –
Accessing array in R is similar to other programming languages like c,c++ and java.
Eg. Print(Arr[2,2])
2.Arithmetic Operations –
Eg. Arr3 <- Arr2 + Arr1
Or
Arr3 = Arr1 – Arr2
27
Data Frame
Data Frame is a table or a two dimensional Array type structure.
Important Considerations
 The Column names should be non-empty.
 The row names should be unique.
 The Data stored in Data Frames can be only Numeric, Factor or Character Type.
 Each column should contain same number of data types.
Data Frame Creation
products <-data.frame(
product_number = seq(1:4)
product_name = c(“Apple”,”Samsung”,”Redmi”,”Oppo”))
print(products)
Product_number Product_name
1 Apple
2 Samsung
3 Redmi
4 Oppo
28
Lists
 List is a data structure which have components of Mixed data types.
 So a vector having elements of different data types is called a list.
 List can be created using list() function.
Eg. –
x <- list( a=“amba”,b=9.23,c=TRUE) #list storing 3 different data types
Accessing List Elements
print(x[‘b’]) o/p = 9.23
print(x[‘a’]) o/p = “amba”
Manipulating List Elements
x[‘a’] <- “nitin”
print([‘a’]) o/p = “nitin”
29
Matrices
 In R two dimensional rectangular data set is known as Matrix.
 A Matrix is created with the help of the input vector to the matrix() function.
 We can Perform addition, subtraction, multiplication and division operations on matrices.
Creating matrix -
Matrix1 <- matrix (2:7,nrow=2,ncol=3)
print(Matrix1)
o/p = 2 4 6
3 5 7
Accessing Elements –
Matrix1[2,3] # 7
Assigning Value –
Matrix1[2,3]=1
Matrices
Operations On Matrices
1.Addition :
Matrix3=Matrix1+Matrix2
2.Subtraction :
Matrix3 = Matrix2 – Matrix1
3.Multiply by a Constant :
Ex : 7*Matrix1
4.Identity Matrix :
Ex – diag(5)
5.Transposition
Ex – t(Matrix1)
30
31
Factors
Factors are data objects which are used to categorise the data and store it as levels.
For example: a data field such as marital status may contain only values from single, married,
separated, divorced, or widowed.
>x
[1] single married married single
Levels : married single
Here, we can see that factor x has four elements and two levels. We can check if a variable is a
factor or not using class() function.
>class(x)
[1] “factor”
>levels(x)
[1] married single
R - Studio
Interacting with R Studio –
 R-Studio is a free and open-source integrated development
environment (IDE) for R, a programming language for statistical
computing and graphics.
 R-Studio was founded by JJ Allaire,creator of the programming
language ColdFusion.
 There are 4 main sections in R-Studio IDE…
1.Code Editor
2.Workspace and History
3.R console
4.Plots and Files
33
R - Studio
 RStudio is available in two editions:
1.RStudio Desktop, where the program is run
locally as a regular desktop application.
2.RStudio Server, Prepackaged distributions of
RStudio Desktop are available for Windows, OS
X, and Linux.
 RStudio is written in the C++ programming
language and uses the Qt framework for its
graphical user interface.
34
Why learn R?
R Programming Language

Weitere ähnliche Inhalte

Was ist angesagt?

Logistic regression
Logistic regressionLogistic regression
Logistic regression
saba khan
 
14. Query Optimization in DBMS
14. Query Optimization in DBMS14. Query Optimization in DBMS
14. Query Optimization in DBMS
koolkampus
 
Descriptive Statistics with R
Descriptive Statistics with RDescriptive Statistics with R
Descriptive Statistics with R
Kazuki Yoshida
 

Was ist angesagt? (20)

R programming language
R programming languageR programming language
R programming language
 
Unit 1 - R Programming (Part 2).pptx
Unit 1 - R Programming (Part 2).pptxUnit 1 - R Programming (Part 2).pptx
Unit 1 - R Programming (Part 2).pptx
 
R studio
R studio R studio
R studio
 
R Programming: Introduction to Matrices
R Programming: Introduction to MatricesR Programming: Introduction to Matrices
R Programming: Introduction to Matrices
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programming
 
Data Reduction
Data ReductionData Reduction
Data Reduction
 
Installing R and R-Studio
Installing R and R-StudioInstalling R and R-Studio
Installing R and R-Studio
 
Dbms relational model
Dbms relational modelDbms relational model
Dbms relational model
 
Database Management System ppt
Database Management System pptDatabase Management System ppt
Database Management System ppt
 
R language tutorial
R language tutorialR language tutorial
R language tutorial
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Introduction to statistical software R
Introduction to statistical software RIntroduction to statistical software R
Introduction to statistical software R
 
Class ppt intro to r
Class ppt intro to rClass ppt intro to r
Class ppt intro to r
 
Step By Step Guide to Learn R
Step By Step Guide to Learn RStep By Step Guide to Learn R
Step By Step Guide to Learn R
 
R programming slides
R  programming slidesR  programming slides
R programming slides
 
Database Management System
Database Management SystemDatabase Management System
Database Management System
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Unit 2 - Data Manipulation with R.pptx
Unit 2 - Data Manipulation with R.pptxUnit 2 - Data Manipulation with R.pptx
Unit 2 - Data Manipulation with R.pptx
 
14. Query Optimization in DBMS
14. Query Optimization in DBMS14. Query Optimization in DBMS
14. Query Optimization in DBMS
 
Descriptive Statistics with R
Descriptive Statistics with RDescriptive Statistics with R
Descriptive Statistics with R
 

Ähnlich wie R Programming Language

FULL R PROGRAMMING METERIAL_2.pdf
FULL R PROGRAMMING METERIAL_2.pdfFULL R PROGRAMMING METERIAL_2.pdf
FULL R PROGRAMMING METERIAL_2.pdf
attalurilalitha
 
Statistical Analysis and Data Analysis using R Programming Language: Efficien...
Statistical Analysis and Data Analysis using R Programming Language: Efficien...Statistical Analysis and Data Analysis using R Programming Language: Efficien...
Statistical Analysis and Data Analysis using R Programming Language: Efficien...
BRNSSPublicationHubI
 
Modeling in R Programming Language for Beginers.ppt
Modeling in R Programming Language for Beginers.pptModeling in R Programming Language for Beginers.ppt
Modeling in R Programming Language for Beginers.ppt
anshikagoel52
 

Ähnlich wie R Programming Language (20)

R basics for MBA Students[1].pptx
R basics for MBA Students[1].pptxR basics for MBA Students[1].pptx
R basics for MBA Students[1].pptx
 
1_Introduction.pptx
1_Introduction.pptx1_Introduction.pptx
1_Introduction.pptx
 
FULL R PROGRAMMING METERIAL_2.pdf
FULL R PROGRAMMING METERIAL_2.pdfFULL R PROGRAMMING METERIAL_2.pdf
FULL R PROGRAMMING METERIAL_2.pdf
 
Inroduction to r
Inroduction to rInroduction to r
Inroduction to r
 
Best corporate-r-programming-training-in-mumbai
Best corporate-r-programming-training-in-mumbaiBest corporate-r-programming-training-in-mumbai
Best corporate-r-programming-training-in-mumbai
 
Lecture_R.ppt
Lecture_R.pptLecture_R.ppt
Lecture_R.ppt
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programming
 
Introduction to R for beginners
Introduction to R for beginnersIntroduction to R for beginners
Introduction to R for beginners
 
R Course Online
R Course OnlineR Course Online
R Course Online
 
Lecture1
Lecture1Lecture1
Lecture1
 
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdf
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdfSTAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdf
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdf
 
Special topics in finance lecture 2
Special topics in finance   lecture 2Special topics in finance   lecture 2
Special topics in finance lecture 2
 
R programming Language
R programming LanguageR programming Language
R programming Language
 
Statistical Analysis and Data Analysis using R Programming Language: Efficien...
Statistical Analysis and Data Analysis using R Programming Language: Efficien...Statistical Analysis and Data Analysis using R Programming Language: Efficien...
Statistical Analysis and Data Analysis using R Programming Language: Efficien...
 
Lecture1_R.ppt
Lecture1_R.pptLecture1_R.ppt
Lecture1_R.ppt
 
Lecture1_R.ppt
Lecture1_R.pptLecture1_R.ppt
Lecture1_R.ppt
 
Lecture1 r
Lecture1 rLecture1 r
Lecture1 r
 
Modeling in R Programming Language for Beginers.ppt
Modeling in R Programming Language for Beginers.pptModeling in R Programming Language for Beginers.ppt
Modeling in R Programming Language for Beginers.ppt
 
R programming Language , Rahul Singh
R programming Language , Rahul SinghR programming Language , Rahul Singh
R programming Language , Rahul Singh
 
Unit 3
Unit 3Unit 3
Unit 3
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Kürzlich hochgeladen (20)

Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 

R Programming Language

  • 2. History and Introduction  R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing.  R is widely used by statisticians, data analysts and researchers for developing statistical software and data analysis.  It compiles and runs on a wide variety of UNIX platforms, Windows and Mac OS.  The copyright for the primary source code for R is held by the R Foundation and is published under the GNU General Public License version 2.0. 2
  • 3. 3 History and Introduction  R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand.  Currently R is developed & maintained by the R Development Core Team.  The Applications of R programming language includes : 1.Statical Computing 2.Machine Learning 3.Data Science  R can be downloaded and installed from CRAN(Comprehensive R Archive Network) website.  R language is cross platform interoperable and fully portable which means R program that you write on one platform can be carried out to other platform and run there.(Platform independent)
  • 4. 4 History and Introduction Top Tier companies using R – companies all over the world use R language for statical analysis. These are some of top tier companies that uses R. Company Name Applications Facebook For behavior analysis related to status updates and profile pictures. Google For advertising effectiveness and economic forecasting. Twitter For data visualization and semantic clustering Microsoft Acquired Revolution R company and use it for a variety of purposes. Uber For statistical analysis Airbnb Scale data science.
  • 5. 5 Evolution of R R is a dialect of S language…  It means that R is an implementation of the S programming language combined with lexical scoping semantics & inspired by Scheme.  S language was created by John Chambers in 1976 at Bell Labs.  A commercial version of S was offered as S-PLUS starting in 1988. S version1 S version 2 S version 3 S version4
  • 6. 6 Features of R These are the some of important features of R -  R is a simple, effective and well-developed, programming language which includes conditionals, loops, user defined & recursive functions and input & output facilities.  R provides a large, coherent and integrated collection of tools for data analysis.  R has an effective data handling and storage facility.  R provides a suite of operators for calculations on arrays, lists, vectors and matrices.  R provides graphical facilities for data analysis and display either directly at the computer or printing at the papers.
  • 7. 7 Features of R Fast Calculation Extremely Compatible Open Source Cross Platform Support Wide Packages Large Standard Library  Fast Calculation - R can be used to perform complex mathematical and statistical calculations on data objects of a wide variety.  Extreme Compatibility - R is an interpreted language which means that it does not need a compiler to make a program from the code.  Open Source - R is an open-source software environment. You can make improvements and add packages for additional functionalities  Cross Platform Support - R is machine-independent. It supports the cross-platform operation. Therefore, it can be used on many different operating systems.  Wide Packages - CRAN houses more than 10,000 different packages and extensions that help solve all sorts of problems in data science.  Large Standard Library - R can produce static graphics with production quality visualizations and has extended libraries providing interactive graphic capabilities.
  • 8. 8 Syntax of R Once we have R environment setup, then it’s easy to start our R command prompt by just typing R in command prompt. Hello World Program – >myString <- “Hello world !” >print(myString) Output : [1] “Hello World !” The [ ] in the output of R can be used to reference data frame columns In the Syntax of R we will discuss –  Data Types  Variables  Keywords  Operators  Data Structures
  • 9. Data Types Logical Integer Character Numeric raw Complex Data Types In R there are basically 6 data types – Data Type Examples Integer 2L,5L,8L Numeric 6,2,1,9 Logical true,false,0,1 raw Raw Bytes complex Z=3+7i Character ‘A’ , ”Aditya” , ”AB12” 9
  • 10. 10 Variables Rules For Naming Variables in R – 1. In R variable name must be a combination of letters, digits, period(.) and underscores. 2. It must start with a letter or period(.) and if it starts with period then it period should not be followed by number. 3. Reserved words in R cannot be used in variable name. Valid variables Invalid Variables  myValue  .my.value.one  my_value_one  Data4  .1nikku  TRUE  vik@sh  _temp
  • 11. Keywords Reserved Keywords in R – Reserved words are set of words that have special meaning and cannot be used as names of identifiers. If Else Repeat While Function For In Next Break TRUE FALSE NULL inf NaN - Reserved Keywords in R 11
  • 12. 12 Operators In any programming language, an operator is a symbol which is used to represent an action. R has several operators to perform tasks including arithmetic, logical and bitwise operations. Operators in R can mainly be classified into the following categories – 1.Arithmetic Operators = {+ , - , * , / , %% , %/%} 2.Logical Operators = { ! , & , && , | , ||} 3.Assignment operators = { <- , <<- , = , -> , ->>} 4.Relational Operators = { < , > , <= , >= , != , ==}
  • 13. 13 Functions Functions are used to incorporate sets of instructions that you want to use repeatedly. There are two types of functions. Function Built In User Defined
  • 14. 14 Built - In Built-in functions are those functions which are provided by R so that we can use directly within the language and its standard libraries. In R there are so many built-in functions which make our programming fast and easy. For Example : 1.The sum(a,b) function will return (a+b) >print(sum(10,20)) [1] 30 2.The seq(a,b) function is used to get sequence from a to b. >print(seq(5,15)) [1] 5 6 7 8 9 10 11 12 13 14 15
  • 15. User Defined User defined functions are those functions which we define in our code and use them repeatedly. These functions can be defined with two types. 1.Without Arguments 2.With Arguments Without Arguments With Arguments myFunction <- function() { #This will be printed on calling this funcition print(“Without Arguments”) } myFunction <- function(a,b) { #This function will print sum of passed args print(a+b) }
  • 16. 16 Conditional Statements  Conditional Statements in R programming are used to make decisions based on the conditions.  Conditional statements execute sequentially when there is no condition around the statements. In R language we’ll discuss 3 types of Conditional Statements – 1.If - else statements 2.If – else if – else statements 3.Switch statements
  • 17. 17 If-else Start Execute Else block End Execute If Block Condition True? yes no Syntax – If(condition) { expression 1 } Else { expression 2 } Example – If(a>b) { print(“a is greater than b”) } Else { print(“ a is less than b”) }
  • 18. 18 Switch statement  In switch() function we pass two types of arguments one is value and others is list of items.  The expression is evaluated based on the value and corresponding item is returned.  If the value evaluated from the expression matches with more than one item of the list then switch() function returns the item which was matched first. Examples: > switch(2,”Delhi”,”Jaipur”,”Mumbai”) > a=3 >[1] “Jaipur” > switch(a,”red”,”blue”,”green”,”yellow”) > [1] “green”
  • 19. 19 Loops Loops In R Loops are used When we need to execute particular code repeatedly. In R Language there are 3 types of Loops – 1.For Loop 2.While Loop 3.Repeat Loop
  • 20. For Loop Example to count the number of even numbers in a vector. Program - x <- c(2,5,3,9,8,11,6) count <- 0 for (i in x) { if(i %% 2 == 0) { count=count+1 } } print(count) Output - [1] 3 No Last item Reached?? Body of For Loop Exit Loop Yes For each item in Sequence A for loop is used to iterate over a vector in R programming. 20
  • 21. 21 While Loop In R programming, while loops are used to loop until a specific condition is met. Program – i <- 1 while(i<5) { print(i) i=i+1 } Output – [1] 1 [1] 2 [1] 3 [1] 4 Yes No Condition True?? Execute code of while block Start Execute code outside while block
  • 22. Repeat Loop  A repeat loop is used to iterate over a block of code multiple number of times.  There is no condition check in repeat loop to exit the loop.  We must ourselves put a condition explicitly inside the body of the loop and use the break statement to exit the loop. Failing to do so will result into an infinite loop. Example – x <- 1 repeat { print(x) x = x+1 if (x == 4) { break } } Output – [1] 1 [1] 2 [1] 3 Body of Loop Break ? Remaining body of loop Exit Enter Loop Yes No 22
  • 23. 23 Data Structures Data Structures Vectors Factors Data Frames Lists Matrices Arrays A data structure is a particular way of organizing data in a computer so that it can be used effectively. The idea is to reduce the space and time complexities of different tasks. Data structures in R programming are tools for holding multiple values. The most essential data structures used in R include :  Vectors  Arrays  Factors  Lists  Matrices  Data Frames
  • 24. 24 Vector  Vector is the one of basic data structure of R which supports integer, double, Character, logical, complex and raw data types.  The elements in a vector are known as components of a vector. Vector Creation Vector can be created using these two methods :- 1.By Using Colon(:) Operator – a <- 2:8 print(a) # 2 3 4 5 6 7 8 2.By Using seq() function– a <- seq(2,10,by=2) print(a) # 2 4 6 8 10
  • 25. 25 Vector Vector Operations 1.Combining Vectors 2.Arithmetic Operations a <- c(4,3,5) a <- c(1,2,3) b <- c(‘x’,’y’,’z’) b <- c(4,5,6) c <- c(a,b) d <- a+b print(c) o/p= 4 3 5 x y z print(d) o/p = 5 7 9 3.Numeric Indexing 4.Duplicate Indexing a <- c(4,3,5) a <- c(4,3,5) Print(a[2]) op = 5 print(a[1,2,2,3,3]) o/p=4 3 3 5 5 5.Logical Indexing 6.Range Indexing a <- c(4,3,5) a <- c(1,2,3,4,5,6,7) print(a[true,false,true]) o/p = 4 5 print(a[2:6]) o/p = 2 3 4 5 6
  • 26. 26 Array Arrays allow us to store data in multi - dimensions and use in efficient way. array Creation Syntax - Array_Name <- array(data, dim=(row_size,column_size,matrices), dim_names) array Operations 1.Accessing Array Elements – Accessing array in R is similar to other programming languages like c,c++ and java. Eg. Print(Arr[2,2]) 2.Arithmetic Operations – Eg. Arr3 <- Arr2 + Arr1 Or Arr3 = Arr1 – Arr2
  • 27. 27 Data Frame Data Frame is a table or a two dimensional Array type structure. Important Considerations  The Column names should be non-empty.  The row names should be unique.  The Data stored in Data Frames can be only Numeric, Factor or Character Type.  Each column should contain same number of data types. Data Frame Creation products <-data.frame( product_number = seq(1:4) product_name = c(“Apple”,”Samsung”,”Redmi”,”Oppo”)) print(products) Product_number Product_name 1 Apple 2 Samsung 3 Redmi 4 Oppo
  • 28. 28 Lists  List is a data structure which have components of Mixed data types.  So a vector having elements of different data types is called a list.  List can be created using list() function. Eg. – x <- list( a=“amba”,b=9.23,c=TRUE) #list storing 3 different data types Accessing List Elements print(x[‘b’]) o/p = 9.23 print(x[‘a’]) o/p = “amba” Manipulating List Elements x[‘a’] <- “nitin” print([‘a’]) o/p = “nitin”
  • 29. 29 Matrices  In R two dimensional rectangular data set is known as Matrix.  A Matrix is created with the help of the input vector to the matrix() function.  We can Perform addition, subtraction, multiplication and division operations on matrices. Creating matrix - Matrix1 <- matrix (2:7,nrow=2,ncol=3) print(Matrix1) o/p = 2 4 6 3 5 7 Accessing Elements – Matrix1[2,3] # 7 Assigning Value – Matrix1[2,3]=1
  • 30. Matrices Operations On Matrices 1.Addition : Matrix3=Matrix1+Matrix2 2.Subtraction : Matrix3 = Matrix2 – Matrix1 3.Multiply by a Constant : Ex : 7*Matrix1 4.Identity Matrix : Ex – diag(5) 5.Transposition Ex – t(Matrix1) 30
  • 31. 31 Factors Factors are data objects which are used to categorise the data and store it as levels. For example: a data field such as marital status may contain only values from single, married, separated, divorced, or widowed. >x [1] single married married single Levels : married single Here, we can see that factor x has four elements and two levels. We can check if a variable is a factor or not using class() function. >class(x) [1] “factor” >levels(x) [1] married single
  • 32. R - Studio Interacting with R Studio –  R-Studio is a free and open-source integrated development environment (IDE) for R, a programming language for statistical computing and graphics.  R-Studio was founded by JJ Allaire,creator of the programming language ColdFusion.  There are 4 main sections in R-Studio IDE… 1.Code Editor 2.Workspace and History 3.R console 4.Plots and Files
  • 33. 33 R - Studio  RStudio is available in two editions: 1.RStudio Desktop, where the program is run locally as a regular desktop application. 2.RStudio Server, Prepackaged distributions of RStudio Desktop are available for Windows, OS X, and Linux.  RStudio is written in the C++ programming language and uses the Qt framework for its graphical user interface.