This document covers key concepts of the Apache Pig analytics framework: why Pig was developed, what Pig is, how Pig compares to MapReduce and Hive, Pig's architecture (Pig Latin scripts, a runtime engine, and execution via the Grunt shell or Pig server), how Pig works by loading data and executing Pig Latin scripts, Pig's data model (atoms, tuples, bags, and maps), and Pig's features, such as its ability to process structured, semi-structured, and unstructured data without complex coding.
What's in it for you?
1. Why Pig?
2. What is Pig?
3. MapReduce vs Hive vs Pig
4. Pig architecture
5. Working of Pig
6. Pig Latin data model
7. Pig execution modes
8. Use case – Twitter
9. Features of Pig
Let's get started with Pig!
Why Pig?
As we all know, Hadoop uses MapReduce to analyze and process big data.
Before: processing big data consumed more time.
After: processing big data was faster using MapReduce.
Then, what is the problem with MapReduce?
Prior to 2006, all MapReduce programs were written in Java.
Non-programmers found it difficult to write lengthy Java code. They also faced issues incorporating the map, shuffle-and-sort, and reduce fundamentals of MapReduce while creating a program. Eventually, it became difficult to maintain and optimize the code, which increased processing time.
Map phase → Shuffle and sort → Reduce phase
Why Pig?
Problem: Yahoo faced problems processing and analyzing large datasets using Java, as the code was complex and lengthy.
Necessity: there was a need for an easier way to analyze large datasets without time-consuming, complex Java code.
Solution: Apache Pig was developed by Yahoo researchers with a vision to analyze and process large datasets without complex Java code. Pig was developed especially for non-programmers, and it used simple steps to analyze datasets, which was time efficient.
What is Pig?
Pig is a scripting platform that runs on Hadoop clusters, designed to process and analyze large datasets.
• It uses SQL-like queries to analyze data
• It operates on various types of data: structured, semi-structured, and unstructured
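As a minimal illustration of this SQL-like style (the file path and field names here are hypothetical):

```pig
-- Load a tab-separated file from HDFS (path and schema are illustrative)
users = LOAD 'hdfs:///data/users.txt' USING PigStorage('\t')
        AS (id:int, name:chararray, age:int);

-- SQL-like filtering, with no Java code required
adults = FILTER users BY age >= 18;

-- Display the result on the console
DUMP adults;
```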
MapReduce vs Hive vs Pig
MapReduce: a compiled language; you need to write long, complex code; lower level of abstraction; can process structured, semi-structured, and unstructured data.
Hive: an SQL-like query language; no need to write complex code; higher level of abstraction; can process only structured data.
Pig: a scripting language; no need to write complex code; higher level of abstraction; can process structured, semi-structured, and unstructured data. This is the advantage Pig has over Hive.
MapReduce vs Hive vs Pig
MapReduce: uses Java and Python; code performance is good; used by programmers; supports partitioning.
Hive: uses an SQL-like query language known as HiveQL; code performance is lower than MapReduce and Pig; used by data analysts; supports partitioning.
Pig: uses Pig Latin, a procedural data flow language; code performance is lower than MapReduce but better than Hive; used by researchers and programmers; there is no concept of partitioning in Pig.
Components of Pig
Pig has two components: Pig Latin and the runtime engine.
Pig Latin is the procedural data flow language used in Pig to analyze data. It is easy to program in Pig Latin, as it is similar to SQL.
The runtime engine is the execution environment created to run Pig Latin programs. It is also a compiler that produces MapReduce programs, and it uses HDFS for storing and retrieving data.
Pig architecture
There are three ways to execute a written Pig script: Pig Latin scripts, the Grunt shell, and the Pig server.
• Pig Latin scripts: programmers write a script in Pig Latin to analyze data using Pig.
• Grunt shell: Pig's interactive shell, used to execute all Pig scripts.
• Pig server: if the Pig script is written in a script file, execution is done by the Pig server.
The script then passes through the following stages:
• Parser: checks the syntax of the Pig script. After checking, the output is a DAG (directed acyclic graph).
• Optimizer: the DAG (logical plan) is passed to the logical optimizer, where optimizations take place.
• Compiler: converts the DAG into MapReduce jobs.
• Execution engine: the MapReduce jobs are executed here, on top of MapReduce and HDFS. The results are displayed using the DUMP statement and stored in HDFS using the STORE statement.
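The DUMP and STORE statements can be sketched as follows (the paths and schema are hypothetical):

```pig
-- Load and aggregate some data (path and field names are illustrative)
logs    = LOAD 'hdfs:///data/logs.txt' AS (user:chararray, action:chararray);
grouped = GROUP logs BY user;
counts  = FOREACH grouped GENERATE group AS user, COUNT(logs) AS actions;

DUMP counts;                                 -- display the results on screen
STORE counts INTO 'hdfs:///output/counts';   -- persist the results in HDFS
```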
Working of Pig
1. Load data and write the Pig script: the Pig Latin script is written by the user.
2. Pig operations: all the Pig operations are performed by the parser, optimizer, and compiler.
3. Execution of the plan: the results are either shown on the screen or stored in HDFS, as per the code.
Pig Latin data model
The data model of Pig Latin helps Pig handle various types of data.
• Atom: any single value of a primitive data type in Pig Latin, such as int, float, or string. It is stored as a string. Examples: 'Rob' or 50.
• Tuple: a sequence of fields that can be of any data type; the same as a row in an RDBMS, i.e., a set of data from a single row. Example: (Rob,5).
• Bag: a collection of tuples; the same as a table in an RDBMS. It is represented by '{}'. Example: {(Rob,5),(Mike,10)}.
• Map: a set of key-value pairs, where the key is of chararray type and the value can be of any type. It is represented by '[]'. Example: [name#Mike, age#10].
Pig Latin has a fully nestable data model, which means one data type can be nested within another.
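The nestable model can be illustrated with a schema declaration (the file path and field names are hypothetical):

```pig
-- One relation mixing all four types: atoms, a tuple, a bag, and a map
students = LOAD 'hdfs:///data/students.txt'
    AS (name:chararray,                           -- atom
        address:(city:chararray, zip:int),        -- tuple nested in a field
        scores:{t:(subject:chararray, marks:int)},-- bag of tuples
        info:map[]);                              -- map with chararray keys

DESCRIBE students;   -- prints the schema, showing the nesting
```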
Here is a diagrammatic representation of the Pig Latin data model:

Sl. no  Name  Age  Place
01      Jack  23   Goa
02      Bob   25   London
03      Joe   29   California

Each single value (e.g., 'Jack') is a field, each row is a tuple, and the collection of all rows is a bag.
Pig execution modes
Pig works in two execution modes, depending on where the data resides and where the Pig script is going to run: local mode and MapReduce mode.
• Local mode: the Pig engine takes input from the Linux file system, and the output is stored in the same file system. Local mode is useful for analyzing small datasets.
• MapReduce mode: the Pig engine directly interacts with and executes in HDFS and MapReduce. Queries written in Pig Latin are translated into MapReduce jobs and run on a Hadoop cluster. By default, Pig runs in this mode.
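The mode is selected with the -x flag when launching Pig; a sketch (the script name is hypothetical):

```shell
# Run against the local file system (small datasets)
pig -x local myscript.pig

# Run against HDFS/MapReduce (the default)
pig -x mapreduce myscript.pig
```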
There are also three modes in Pig, depending on how the Pig Latin code is written: interactive mode, batch mode, and embedded mode.
• Interactive mode: coding and executing the script line by line.
• Batch mode: all statements are coded in a file with the extension .pig, and the file is executed directly.
• Embedded mode: Pig lets its users define their own functions (UDFs) in programming languages such as Java.
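Interactive mode looks like this at the Grunt prompt (the file name is hypothetical):

```pig
grunt> raw = LOAD 'data.txt' AS (line:chararray);
grunt> DUMP raw;    -- each statement is entered and run one at a time
```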
Use case – Twitter
Users on Twitter generate about 500 million tweets daily. Hadoop MapReduce was used to process and analyze this data: analyzing the number of tweets created by a user in the tweet table was done using MapReduce in Java. It was difficult to perform these MapReduce operations, as users were not well versed in writing complex Java code.
The problems Twitter faced while analyzing datasets using MapReduce were joining datasets, sorting datasets, and grouping datasets. Performing these operations in MapReduce consumed more time, since the Java code was lengthy and complex. Twitter used Apache Pig to overcome these problems. Let's see how.
Problem statement: analyze the user table and the tweet table, and find out how many tweets are created by each person.

User Table          Tweet Table
ID  Name            ID  Tweet
1   Alice           1   Google...
2   Tim             2   Tennis...
3   John            1   Spacecraft...
                    3   Oscar...
                    1   Politics...
                    2   Olympics...

First, the Twitter data is loaded into Pig storage using the LOAD command.
The remaining operations are shown below.
In the join-and-group operation, the tweet and user tables are joined and grouped using the COGROUP command:

ID  Name   Tweet
1   Alice  Google...
1   Alice  Spacecraft...
1   Alice  Politics...
2   Tim    Tennis...
2   Tim    Oscar...
3   John   Olympics...

Next comes aggregation: the tweets are counted per user using the COUNT command:

ID  Count
1   3
2   2
3   1

The result of the count operation is then joined with the user table to find the user names:

ID  Name   Count
1   Alice  3
2   Tim    2
3   John   1

Pig reduces the complexity of operations that would have been lengthier using MapReduce. Finally, we could find the number of tweets created by each user in a simple way.
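The steps above can be sketched as one Pig Latin script (the paths, delimiters, and alias names are assumptions):

```pig
-- Load the two tables (paths and schemas are illustrative)
users  = LOAD 'hdfs:///twitter/users.csv'  USING PigStorage(',')
         AS (id:int, name:chararray);
tweets = LOAD 'hdfs:///twitter/tweets.csv' USING PigStorage(',')
         AS (id:int, tweet:chararray);

-- Join and group both tables on id using COGROUP
grouped = COGROUP tweets BY id, users BY id;

-- Aggregate: count the tweets in each group using COUNT
counts = FOREACH grouped GENERATE group AS id, COUNT(tweets) AS cnt;

-- Join the counts back to the user table to recover the names
joined = JOIN counts BY id, users BY id;
result = FOREACH joined GENERATE users::id, users::name, counts::cnt;

DUMP result;
```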
Features of Pig
• Ease of programming, as Pig Latin is similar to SQL; fewer lines of code need to be written
• Short development time, as the code is simpler
• Handles all kinds of data: structured, semi-structured, and unstructured
• Offers a large set of operators, such as join and filter
• Allows multiple queries to be processed in parallel
• Optimization and compilation are easy, as they are done automatically and internally
• Lets users create user-defined functions (UDFs)