PRASHANT AGRAWAL
Email: prashanttct07@gmail.com
Mobile: +91-8097606642
Professional Summary:
● Innovative Software Professional with 5+ years of progressive experience and continued
success as a Big Data Analyst.
● Day-to-day experience working with Agile/Scrum methodology.
● Extensive experience with search engine solutions, viz. e-commerce search, enterprise
search, log analytics and monitoring.
● Hands-on exposure to log analytics with ELK for logs such as syslog, auth log, Postfix,
router, Apache, NetFlow and application logs.
● Hands-on experience with ETL using Spark, Spark Streaming and Spark SQL.
● Full-text search solutions with analytics and visualization using Elasticsearch, Logstash
and Kibana.
● Good knowledge of distributed computing systems such as the Hadoop ecosystem, Spark
and Flume, used to analyze network logs and perform ETL on various data sets.
● Hands-on experience working with HDP (Hortonworks) clusters and components such as
Spark, HDFS, Flume, Kafka, YARN, Oozie, Phoenix and Presto.
● Good hands-on experience with SVN, VSS, Git and build tools like Maven.
● Hands-on experience developing a product for capturing, intercepting and
monitoring internet traffic for LEAs (law enforcement agencies).
● Good exposure to handling big data (in TBs) with Elasticsearch using a 12-node
cluster at deployment level.
● Good exposure to writing Spark applications in Scala.
Domain and Skill Set:
Domain: Engineering and Network Forensics, Big Data Analytics, Digital Marketing and Advertisement
Programming Languages: Core Java, Scala, C#, PHP
Operating System: Mac, Linux (Ubuntu, Red Hat, CentOS), Windows 7
Tools/DB/Packages: Elasticsearch, Logstash, Kibana, X-Pack, Spark, Flume, Oozie, IntelliJ IDEA, Visual Studio 2010, MySQL, Version Control (Perforce, SVN, VSS, Git), Defect Tracker (Jira, FogBugz)
Professional Project Details:
Project - 1
Project Name: Log Analytics and Visualization using ELK
Team Size: 1
Start Date: Jan 2016
End Date: Till Now
Description: This project involves log analytics using ELK and also covers writing Elasticsearch queries for e-commerce and enterprise search.
Role & Contribution:
● ELK 5.x setup and configuration on various operating systems such as Mac, Linux and Windows.
● Migration of data from ELK 2.x to 5.x.
● Well versed in setting up the various node types in a production cluster: master, data and client nodes.
● Implementation of shards and replicas (to tolerate single-node failure) for better management of indices.
● Preparation of schemas and analyzers (through templates) to store data in Elasticsearch.
● Wrote Elasticsearch queries to support search features such as autocomplete, synonyms, grammar-based search, exact and non-exact search, misspelled search, Boolean search and aggregations.
● Build and development of Logstash plugins to support specific requirements or features.
● Data extraction using Logstash input plugins such as JDBC, File, TCP, UDP and S3.
● Data filtering using Logstash filter plugins (CSV, Grok, Mutate, Date, Geo etc.).
● Data indexing to Elasticsearch using the Logstash output plugin.
● Used Beats such as Filebeat and Metricbeat as data shippers to Logstash.
● Data visualization and dashboard reporting using Kibana.
● Set up backup and restore using snapshot and restore.
● Fine-tuning and optimization of queries for faster responses.
● Preparation of a multi-index architecture (time-series indexing) for faster search and the quickest possible response.
Technologies: JSON
Tools/Tool chain: Elasticsearch 5.x/2.x, Logstash 5.x/2.x, Kibana 5.x/4.x, Beats, X-Pack, Head plug-in, Kopf plug-in, Carrot2, Lingo3G, PuTTY, WinSCP
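
The time-series indexing approach listed for this project can be illustrated with a minimal sketch. The daily `logs-YYYY.MM.DD` naming pattern and the helper names `index_for` and `indices_for_range` are assumptions made for illustration, not details of the actual deployment. The idea is that each event is routed to an index named after its day, so a query over a time window only has to touch the indices for those days.

```python
from datetime import datetime, timedelta, timezone


def index_for(ts: datetime, prefix: str = "logs") -> str:
    """Return the daily index name for an event timestamp (hypothetical naming scheme)."""
    return f"{prefix}-{ts.astimezone(timezone.utc):%Y.%m.%d}"


def indices_for_range(start: datetime, end: datetime, prefix: str = "logs") -> list:
    """List the daily indices that a query over [start, end] needs to search."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    names = []
    while day <= last:
        names.append(f"{prefix}-{day:%Y.%m.%d}")
        day += timedelta(days=1)
    return names
```

A search restricted to the returned index names skips all other days, which is one common reason a per-day layout speeds up time-bounded queries.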
Project - 2
Project Name: Data Lake Modules in Spark
Team Size: 1
Start Date: June 2016
End Date: Till Now
Description: This project involves creating a generic data lake solution to migrate data from one SQL/NoSQL database to another.
Role & Contribution:
● Created a Spark module that reads data from SQL or HBase and dumps it to HDFS as Avro or ORC.
● The module lets the user specify the input type (HBase or SQL) and the output type (ORC or Avro) in HDFS.
● Implemented the data lake modules to run periodically using the Oozie scheduler, with a job that runs every hour and dumps the data.
● Added functionality to automatically clean up older dumps that are X versions and Y days old.
● Created an offline index module that acts as a secondary index to an HBase table, holding only the required fields, to speed up select queries.
Technologies: Scala
Tools/Tool chain: Kafka, Spark, Spark Streaming and Spark SQL, Phoenix, HBase, Hive, Maven, Git
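
The clean-up of older dumps described in this project can be sketched as a small retention rule. This is an illustration under stated assumptions, not the project's code (which was written in Scala): a dump is taken to be deletable only if it is both beyond the newest X versions and more than Y days old, and the function name `dumps_to_delete` and the `(path, created_at)` tuple shape are hypothetical.

```python
from datetime import datetime, timedelta


def dumps_to_delete(dumps, keep_versions, max_age_days, now):
    """Return paths of dumps that fall outside the newest `keep_versions`
    and are also older than `max_age_days` (assumed retention rule).

    dumps: iterable of (path, created_at) tuples.
    """
    newest_first = sorted(dumps, key=lambda d: d[1], reverse=True)
    cutoff = now - timedelta(days=max_age_days)
    # Skip the newest `keep_versions` dumps, then keep only those past the age cutoff.
    return [path for path, created in newest_first[keep_versions:] if created < cutoff]
```

Requiring both conditions means a recent burst of dumps is never removed just because the version count was exceeded, and old dumps survive as long as they are among the newest X.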
Project - 3
Project Name: Spark ETL for fact and dim types of data
Team Size: 2
Start Date: Jan 2016
End Date: May 2016
Description: This project involves ETL processing with Spark: pulling information from a variety of sources, transforming the data, and then pushing it to Presto for OLTP/OLAP analytics on dim and fact types of data.
Role & Contribution:
● Consume the dim/fact Kafka messages, produced by a Kafka producer, in Spark using a Kafka consumer.
● Extract the messages consumed by the Kafka consumer.
● Apply business logic and transformations to those messages.
● Push the transformed data to either Hive or Presto for further OLTP and OLAP analytics.
Technologies: Scala
Tools/Tool chain: Maven, Git, Kafka, Spark, Spark Streaming and Spark SQL, Phoenix, Presto
Project - 4
Project Name: Deployment of Elasticsearch to handle big data in IIMS
Team Size: 2
Start Date: Feb 2014
End Date: Dec 2015
Description: This project involves deployment of Elasticsearch to handle data in GBs per day.
Role & Contribution:
● Prepared the deployment architecture to handle this volume of data using a 12-node cluster.
● Data visualization and dashboard loading using Kibana.
● Well versed in setting up the various node types in a deployment: master, data and client nodes.
● Hands-on experience with plug-ins such as Mapper Attachment, Head, Bigdesk and Carrot2 (with the Lingo3G categorization algorithm).
● Preparation of schemas to store data in Elasticsearch.
● Preparation of search queries for retrieving data from Elasticsearch using Query String, Match, Boolean, Aggregation etc.
● Backup and restore is still required in case of failure, even at this data volume, so set up the snapshot and restore process to take daily backups.
● Implementation of shards and replicas (to tolerate single-node failure).
● Fine-tuning and optimization of queries for faster responses.
● Preparation of a multi-index architecture for faster search and the quickest possible response.
● Implementation of features such as synonyms, stemming, grammar extension and wildcard search.
Technologies: C#, Elasticsearch (NoSQL database)
Tools/Tool chain: Elasticsearch 1.5.x, Mapper Attachment, Head plug-in, Bigdesk plug-in, Carrot2, Lingo3G, PuTTY, WinSCP
Project - 5
Project Name: Big Data Platform Development
Team Size: 2
Start Date: June 2015
End Date: Dec 2015
Description: This project involves processing logs generated by various systems and devices and performing predictive analysis to identify patterns and trace the attacking device or user.
Role & Contribution:
● Collected high-speed logs coming from various devices and ingested them using Flume.
● Passed the log data from Flume to Spark Streaming and Spark SQL to hold it in memory (Spark is known for in-memory data processing).
● Performed predictive analysis on the logs using the defined algorithm, as well as computation with a self-derived algorithm.
● Persisted the logs in memory for a specific time duration using Spark Streaming, then persisted them permanently to Elasticsearch.
● Performed all of the above operations and computations using distributed computing, which included setting up a 5-node HDFS cluster using the Hortonworks Data Platform.
Technologies: Core Java, Flume, HDFS, HDP clustering, Spark, Spark Streaming and Spark SQL, Elasticsearch
Tools/Tool chain: Maven, Git, MySQL
Educational Qualifications:
Course                    Board/University    Year of Passing    Percentage
10th                      CBSE                2005               79.20
12th                      CBSE                2007               76.00
B.E. (Computer Science)   RGPV Bhopal         2011               76.44
GATE                      -                   2011               91 percentile
CAT                       -                   2010               85 percentile
Personal Profile:
Date of Birth : 23rd Nov, 1989
Passport No : J3031277
Willing to relocate : Depends upon location
Willingness for Onsite : Yes
PAN : ATXPA9120F
Declaration:
I hereby declare that the information provided above is correct and true to the best of my
knowledge and belief.
Date: 09 Jan 2017
Place: Pune (Prashant Agrawal)

Weitere ähnliche Inhalte

Was ist angesagt?

Multi Source Data Analysis using Spark and Tellius
Multi Source Data Analysis using Spark and TelliusMulti Source Data Analysis using Spark and Tellius
Multi Source Data Analysis using Spark and Telliusdatamantra
 
What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 Databricks
 
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudRick Bilodeau
 
Apache Spark vs Apache Flink
Apache Spark vs Apache FlinkApache Spark vs Apache Flink
Apache Spark vs Apache FlinkAKASH SIHAG
 
Anatomy of Data Frame API : A deep dive into Spark Data Frame API
Anatomy of Data Frame API :  A deep dive into Spark Data Frame APIAnatomy of Data Frame API :  A deep dive into Spark Data Frame API
Anatomy of Data Frame API : A deep dive into Spark Data Frame APIdatamantra
 
Building end to end streaming application on Spark
Building end to end streaming application on SparkBuilding end to end streaming application on Spark
Building end to end streaming application on Sparkdatamantra
 
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache FlinkSuneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache FlinkFlink Forward
 
Evolution of apache spark
Evolution of apache sparkEvolution of apache spark
Evolution of apache sparkdatamantra
 
Streamsets and spark in Retail
Streamsets and spark in RetailStreamsets and spark in Retail
Streamsets and spark in RetailHari Shreedharan
 
Scaling ELK Stack - DevOpsDays Singapore
Scaling ELK Stack - DevOpsDays SingaporeScaling ELK Stack - DevOpsDays Singapore
Scaling ELK Stack - DevOpsDays SingaporeAngad Singh
 
Introduction to Flink Streaming
Introduction to Flink StreamingIntroduction to Flink Streaming
Introduction to Flink Streamingdatamantra
 
Big Telco - Yousun Jeong
Big Telco - Yousun JeongBig Telco - Yousun Jeong
Big Telco - Yousun JeongSpark Summit
 
Streamsets and spark at SF Hadoop User Group
Streamsets and spark at SF Hadoop User GroupStreamsets and spark at SF Hadoop User Group
Streamsets and spark at SF Hadoop User GroupHari Shreedharan
 
Slim Baltagi – Flink vs. Spark
Slim Baltagi – Flink vs. SparkSlim Baltagi – Flink vs. Spark
Slim Baltagi – Flink vs. SparkFlink Forward
 
Introduction to Spark 2.0 Dataset API
Introduction to Spark 2.0 Dataset APIIntroduction to Spark 2.0 Dataset API
Introduction to Spark 2.0 Dataset APIdatamantra
 
Productionalizing a spark application
Productionalizing a spark applicationProductionalizing a spark application
Productionalizing a spark applicationdatamantra
 
From R Script to Production Using rsparkling with Navdeep Gill
From R Script to Production Using rsparkling with Navdeep GillFrom R Script to Production Using rsparkling with Navdeep Gill
From R Script to Production Using rsparkling with Navdeep GillDatabricks
 
Pinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberPinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberXiang Fu
 
Structured Streaming with Kafka
Structured Streaming with KafkaStructured Streaming with Kafka
Structured Streaming with Kafkadatamantra
 
Exploratory Data Analysis in Spark
Exploratory Data Analysis in SparkExploratory Data Analysis in Spark
Exploratory Data Analysis in Sparkdatamantra
 

Was ist angesagt? (20)

Multi Source Data Analysis using Spark and Tellius
Multi Source Data Analysis using Spark and TelliusMulti Source Data Analysis using Spark and Tellius
Multi Source Data Analysis using Spark and Tellius
 
What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017
 
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
 
Apache Spark vs Apache Flink
Apache Spark vs Apache FlinkApache Spark vs Apache Flink
Apache Spark vs Apache Flink
 
Anatomy of Data Frame API : A deep dive into Spark Data Frame API
Anatomy of Data Frame API :  A deep dive into Spark Data Frame APIAnatomy of Data Frame API :  A deep dive into Spark Data Frame API
Anatomy of Data Frame API : A deep dive into Spark Data Frame API
 
Building end to end streaming application on Spark
Building end to end streaming application on SparkBuilding end to end streaming application on Spark
Building end to end streaming application on Spark
 
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache FlinkSuneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
Suneel Marthi – BigPetStore Flink: A Comprehensive Blueprint for Apache Flink
 
Evolution of apache spark
Evolution of apache sparkEvolution of apache spark
Evolution of apache spark
 
Streamsets and spark in Retail
Streamsets and spark in RetailStreamsets and spark in Retail
Streamsets and spark in Retail
 
Scaling ELK Stack - DevOpsDays Singapore
Scaling ELK Stack - DevOpsDays SingaporeScaling ELK Stack - DevOpsDays Singapore
Scaling ELK Stack - DevOpsDays Singapore
 
Introduction to Flink Streaming
Introduction to Flink StreamingIntroduction to Flink Streaming
Introduction to Flink Streaming
 
Big Telco - Yousun Jeong
Big Telco - Yousun JeongBig Telco - Yousun Jeong
Big Telco - Yousun Jeong
 
Streamsets and spark at SF Hadoop User Group
Streamsets and spark at SF Hadoop User GroupStreamsets and spark at SF Hadoop User Group
Streamsets and spark at SF Hadoop User Group
 
Slim Baltagi – Flink vs. Spark
Slim Baltagi – Flink vs. SparkSlim Baltagi – Flink vs. Spark
Slim Baltagi – Flink vs. Spark
 
Introduction to Spark 2.0 Dataset API
Introduction to Spark 2.0 Dataset APIIntroduction to Spark 2.0 Dataset API
Introduction to Spark 2.0 Dataset API
 
Productionalizing a spark application
Productionalizing a spark applicationProductionalizing a spark application
Productionalizing a spark application
 
From R Script to Production Using rsparkling with Navdeep Gill
From R Script to Production Using rsparkling with Navdeep GillFrom R Script to Production Using rsparkling with Navdeep Gill
From R Script to Production Using rsparkling with Navdeep Gill
 
Pinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberPinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ Uber
 
Structured Streaming with Kafka
Structured Streaming with KafkaStructured Streaming with Kafka
Structured Streaming with Kafka
 
Exploratory Data Analysis in Spark
Exploratory Data Analysis in SparkExploratory Data Analysis in Spark
Exploratory Data Analysis in Spark
 

Andere mochten auch

Spider & F5 Round Table - Application Centric Security
Spider & F5 Round Table - Application Centric SecuritySpider & F5 Round Table - Application Centric Security
Spider & F5 Round Table - Application Centric SecurityTzoori Tamam
 
Las lesiones en los músicos presentación
Las lesiones en los músicos presentaciónLas lesiones en los músicos presentación
Las lesiones en los músicos presentaciónSergio Herrera
 
Doc111111111111111111111
Doc111111111111111111111Doc111111111111111111111
Doc111111111111111111111myersreynaangel
 
Manufacturing 3.0 / The Greatest Challenges
Manufacturing 3.0 / The Greatest ChallengesManufacturing 3.0 / The Greatest Challenges
Manufacturing 3.0 / The Greatest ChallengesAlexander Kozlov
 
HMT Machine Tools Ltd Ajmer Summer Training Presentation
HMT Machine Tools Ltd Ajmer Summer Training PresentationHMT Machine Tools Ltd Ajmer Summer Training Presentation
HMT Machine Tools Ltd Ajmer Summer Training PresentationSiddharth Bhatnagar
 
Product Management Metrics | Saeed Khan | ProductTank Toronto
Product Management Metrics | Saeed Khan | ProductTank Toronto Product Management Metrics | Saeed Khan | ProductTank Toronto
Product Management Metrics | Saeed Khan | ProductTank Toronto Product Tank Toronto
 
Colgate- Palmolive Company : The Precision Toothbrush
Colgate- Palmolive Company : The Precision ToothbrushColgate- Palmolive Company : The Precision Toothbrush
Colgate- Palmolive Company : The Precision ToothbrushSneh Ankur
 
PM [B10] In comes Euler
PM [B10] In comes EulerPM [B10] In comes Euler
PM [B10] In comes EulerStephen Kwong
 
RAM installation Rev 5
RAM installation Rev 5RAM installation Rev 5
RAM installation Rev 5Scott Graser
 
On Site Flu Clinics For Your Business
On Site Flu Clinics For Your BusinessOn Site Flu Clinics For Your Business
On Site Flu Clinics For Your BusinessMeagan Wilson
 
City Central Limerick Final
City Central Limerick FinalCity Central Limerick Final
City Central Limerick FinalPeter Flanagan
 

Andere mochten auch (20)

Spider & F5 Round Table - Application Centric Security
Spider & F5 Round Table - Application Centric SecuritySpider & F5 Round Table - Application Centric Security
Spider & F5 Round Table - Application Centric Security
 
Presentacion sobre microsoft..
Presentacion sobre microsoft..Presentacion sobre microsoft..
Presentacion sobre microsoft..
 
Las lesiones en los músicos presentación
Las lesiones en los músicos presentaciónLas lesiones en los músicos presentación
Las lesiones en los músicos presentación
 
Strange Pictures
Strange PicturesStrange Pictures
Strange Pictures
 
Doc111111111111111111111
Doc111111111111111111111Doc111111111111111111111
Doc111111111111111111111
 
PROFESSIONAL RESUME OF benny
PROFESSIONAL RESUME OF bennyPROFESSIONAL RESUME OF benny
PROFESSIONAL RESUME OF benny
 
Manufacturing 3.0 / The Greatest Challenges
Manufacturing 3.0 / The Greatest ChallengesManufacturing 3.0 / The Greatest Challenges
Manufacturing 3.0 / The Greatest Challenges
 
HMT Machine Tools Ltd Ajmer Summer Training Presentation
HMT Machine Tools Ltd Ajmer Summer Training PresentationHMT Machine Tools Ltd Ajmer Summer Training Presentation
HMT Machine Tools Ltd Ajmer Summer Training Presentation
 
Product Management Metrics | Saeed Khan | ProductTank Toronto
Product Management Metrics | Saeed Khan | ProductTank Toronto Product Management Metrics | Saeed Khan | ProductTank Toronto
Product Management Metrics | Saeed Khan | ProductTank Toronto
 
MOHAMMED ARABIC TEACHER
MOHAMMED ARABIC TEACHERMOHAMMED ARABIC TEACHER
MOHAMMED ARABIC TEACHER
 
Colgate- Palmolive Company : The Precision Toothbrush
Colgate- Palmolive Company : The Precision ToothbrushColgate- Palmolive Company : The Precision Toothbrush
Colgate- Palmolive Company : The Precision Toothbrush
 
Pescados y mariscos
Pescados y mariscosPescados y mariscos
Pescados y mariscos
 
Power within you
Power within youPower within you
Power within you
 
PM [B10] In comes Euler
PM [B10] In comes EulerPM [B10] In comes Euler
PM [B10] In comes Euler
 
RAM installation Rev 5
RAM installation Rev 5RAM installation Rev 5
RAM installation Rev 5
 
02.jnae10085
02.jnae1008502.jnae10085
02.jnae10085
 
5 Ways to Battle Insomnia
5 Ways to Battle Insomnia5 Ways to Battle Insomnia
5 Ways to Battle Insomnia
 
Paradise3magazine.doc
Paradise3magazine.docParadise3magazine.doc
Paradise3magazine.doc
 
On Site Flu Clinics For Your Business
On Site Flu Clinics For Your BusinessOn Site Flu Clinics For Your Business
On Site Flu Clinics For Your Business
 
City Central Limerick Final
City Central Limerick FinalCity Central Limerick Final
City Central Limerick Final
 

Ähnlich wie Prashant_Agrawal_CV

Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022
Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022
Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022HostedbyConfluent
 
Centralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackCentralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackRohit Sharma
 
Spark Driven Big Data Analytics
Spark Driven Big Data AnalyticsSpark Driven Big Data Analytics
Spark Driven Big Data Analyticsinoshg
 
Experiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the DatabaseExperiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the DatabaseMarcelo Ochoa
 
Improving ad hoc and production workflows at Stitch Fix
Improving ad hoc and production workflows at Stitch FixImproving ad hoc and production workflows at Stitch Fix
Improving ad hoc and production workflows at Stitch FixStitch Fix Algorithms
 
Sanjaykumar Kakaso Mane_MAY2016
Sanjaykumar Kakaso Mane_MAY2016Sanjaykumar Kakaso Mane_MAY2016
Sanjaykumar Kakaso Mane_MAY2016Sanjay Mane
 
Bigdata.sunil_6+yearsExp
Bigdata.sunil_6+yearsExpBigdata.sunil_6+yearsExp
Bigdata.sunil_6+yearsExpbigdata sunil
 
Sourav_Giri_Resume_2015
Sourav_Giri_Resume_2015Sourav_Giri_Resume_2015
Sourav_Giri_Resume_2015sourav giri
 
A compute infrastructure for data scientists
A compute infrastructure for data scientistsA compute infrastructure for data scientists
A compute infrastructure for data scientistsStitch Fix Algorithms
 
Rajeev kumar apache_spark & scala developer
Rajeev kumar apache_spark & scala developerRajeev kumar apache_spark & scala developer
Rajeev kumar apache_spark & scala developerRajeev Kumar
 
Apache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriApache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriDemi Ben-Ari
 
Pullareddy_tavva_resume.doc
Pullareddy_tavva_resume.docPullareddy_tavva_resume.doc
Pullareddy_tavva_resume.docT Pulla Reddy
 
Saranteja gutta wells
Saranteja gutta wellsSaranteja gutta wells
Saranteja gutta wellsramesh5080
 
Apache spark - History and market overview
Apache spark - History and market overviewApache spark - History and market overview
Apache spark - History and market overviewMartin Zapletal
 
Arun-Kumar-OEDQ-Developer
Arun-Kumar-OEDQ-DeveloperArun-Kumar-OEDQ-Developer
Arun-Kumar-OEDQ-DeveloperArun Kumar
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkTimothy Spann
 
spark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark examplespark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark exampleShidrokhGoudarzi1
 
Resume_Informatica&IDQ_4+years_of_exp
Resume_Informatica&IDQ_4+years_of_expResume_Informatica&IDQ_4+years_of_exp
Resume_Informatica&IDQ_4+years_of_exprajarao marisa
 
Apache Spark e AWS Glue
Apache Spark e AWS GlueApache Spark e AWS Glue
Apache Spark e AWS GlueLaercio Serra
 

Ähnlich wie Prashant_Agrawal_CV (20)

Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022
Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022
Why Wait? Realtime Ingestion With Chen Qin and Heng Zhang | Current 2022
 
Centralized Logging System Using ELK Stack
Centralized Logging System Using ELK StackCentralized Logging System Using ELK Stack
Centralized Logging System Using ELK Stack
 
Spark Driven Big Data Analytics
Spark Driven Big Data AnalyticsSpark Driven Big Data Analytics
Spark Driven Big Data Analytics
 
Experiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the DatabaseExperiences with Evangelizing Java Within the Database
Experiences with Evangelizing Java Within the Database
 
Improving ad hoc and production workflows at Stitch Fix
Improving ad hoc and production workflows at Stitch FixImproving ad hoc and production workflows at Stitch Fix
Improving ad hoc and production workflows at Stitch Fix
 
Veera Narayanaswamy_PLSQL_Profile
Veera Narayanaswamy_PLSQL_ProfileVeera Narayanaswamy_PLSQL_Profile
Veera Narayanaswamy_PLSQL_Profile
 
Sanjaykumar Kakaso Mane_MAY2016
Sanjaykumar Kakaso Mane_MAY2016Sanjaykumar Kakaso Mane_MAY2016
Sanjaykumar Kakaso Mane_MAY2016
 
Bigdata.sunil_6+yearsExp
Bigdata.sunil_6+yearsExpBigdata.sunil_6+yearsExp
Bigdata.sunil_6+yearsExp
 
Sourav_Giri_Resume_2015
Sourav_Giri_Resume_2015Sourav_Giri_Resume_2015
Sourav_Giri_Resume_2015
 
A compute infrastructure for data scientists
A compute infrastructure for data scientistsA compute infrastructure for data scientists
A compute infrastructure for data scientists
 
Rajeev kumar apache_spark & scala developer
Rajeev kumar apache_spark & scala developerRajeev kumar apache_spark & scala developer
Rajeev kumar apache_spark & scala developer
 
Apache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriApache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-Ari
 
Pullareddy_tavva_resume.doc
Pullareddy_tavva_resume.docPullareddy_tavva_resume.doc
Pullareddy_tavva_resume.doc
 
Saranteja gutta wells
Saranteja gutta wellsSaranteja gutta wells
Saranteja gutta wells
 
Apache spark - History and market overview
Apache spark - History and market overviewApache spark - History and market overview
Apache spark - History and market overview
 
Arun-Kumar-OEDQ-Developer
Arun-Kumar-OEDQ-DeveloperArun-Kumar-OEDQ-Developer
Arun-Kumar-OEDQ-Developer
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
 
spark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark examplespark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark example
 
Resume_Informatica&IDQ_4+years_of_exp
Resume_Informatica&IDQ_4+years_of_expResume_Informatica&IDQ_4+years_of_exp
Resume_Informatica&IDQ_4+years_of_exp
 
Apache Spark e AWS Glue
Apache Spark e AWS GlueApache Spark e AWS Glue
Apache Spark e AWS Glue
 

Prashant_Agrawal_CV

  • 1. Page 1 of 1 PRASHANT AGRAWAL Email: prashanttct07@gmail.com Mobile: +91-8097606642 Professional Summary: ● Innovative Software Professional with 5+ years of progressive experience and continued success as a Big Data Analyst. ● Day to day experience in working with Agile/Scrum methodology. ● Vast experience in search engine solution viz. E-commerce search, Enterprise Search, Log Analytics and Monitoring etc. ● Hands on Exposure on Log analytics for various logs such as syslog, authlog, Postfix logs, Router logs, apache logs, netflow logs, Application logs using ELK ● Hands on experience on ETL using Spark, Spark Streaming and Spark SQL ● Full Text Search Solution with analytics and visualization using Elasticsearch, Logstash and Kibana. ● Good knowledge on distributed computing system such as Hadoop Eco System, Spark, Flume etc. to analyze the network logs and perform ETL on various data sets. ● Hands on experience on working with HDP (Horton Works) cluster with various components such as Spark, HDFS, Flume, Kafka, Yarn, Oozie, Phoenix, Presto etc. ● Good hands on with SVN, VSS, Git and build tools like Maven. ● Hands on experience in developing the product for capturing, intercepting and monitoring the internet traffic for LEA’s (Law Enforcement Agency’s). ● Good exposure in handling the big data (in TB’s) with Elasticsearch using cluster of 12 nodes at deployment level. ● Good exposure on writing the spark application using Scala Domain and Skill Set: Domain Engineering and Network forensics, Big Data Analytics , Digital Marketing and Advertisement Programming Languages Core Java, Scala , C#, PHP Operating System Mac, Linux (Ubuntu ,Red Hat, Cent OS) , Windows7 Tools /DB/Packages Elasticsearch, Logstash, Kibana, X-pack, Spark, Flume, Oozie, Intellij Idea, Visual Studio 2010, MySQL, Version Control(Perforce, SVN, VSS, GIT), Defect Tracker(Jira, Fogbugz)
  • 2. Page 2 of 2 Professional Project Details: Project - 1 Project Name Log Analytics and Visualization using ELK Team Size 1 Start Date Jan 2016 End Date Till Now Description This Project Involves Log analytics using ELK which also caters writing of Elastic queries for E-commerce and Enterprise search Role & Contribution ● ELK 5.x Setup and Configuration on various OS such as Mac, Linux, Windows ● Migration of data from ELK 2.x to 5.x ● Well verse in setting up the various nodes in prod cluster like master, data and client node. ● Implementation of shards and replica (to avoid single node failure) for better management of indexes. ● Preparation of schema and analyzers (through template) to store the data in elasticsearch ● Written Elasticsearch query to support various search features such as Auto Complete, Synonyms, Grammar Based Search, Exact and Non Exact search, misspelled search, aggregation, Boolean search,aggregations etc. ● Build and development of Logstash plugin to support specific requirement or feature. ● Data extraction using various logstash Input Plugin such as JDBC, File, TCP, UDP, S3 etc. ● Data filtration using Logstash Filter plugin (CSV, Grok, Mutate, Date, Geo etc.) ● Data indexing to Elasticsearch using Logstash output plugin ● Used various beats as data shipper to Logstash.Includes various beats such as File beat,Metric beat etc. ● Data visualization and dashboard reporting using Kibana ● Setting up backup and restore using snapshot and restore ● Fine tuning and optimization of queries in order to get response faster. ● Preparation of multi index architecture (Time series Indexing arch) in order to perform faster search and get the response as quick as possible. Technologies JSON Tools/Tool chain Elasticsearch 5.x/2.x, Logstash 5.x/2.x, Kibana 5.x/4.x, Beats, X-pack, Head Plug-in, Kopf Plug-in, Carrot2, Lingo3g, Putty, Win SCP
  • 3. Page 3 of 3 Project - 2 Project Name Data Lake Modules in Spark Team Size 1 Start Date June 2016 End Date Till Now Description This project involves creating a generic data lake solution to migrate the data from one SQL/No SQL database to another. Role & Contribution ● Created a module on spark which reads data from SQL or Hbase and dumps the same to HDFS as AVRO or ORC ● Module is created in a way where user can specify their input type if its has to be Hbase or SQL and output type as to be ORC or AVRO in HDFS ● Implemented the data lake modules to run periodically using the oozie scheduler, where job runs every hour and dump the data. ● Added a functionality to auto clean up the older dumps which is X version and Y days are old. ● Created an offline index module which acts as secondary index to hbase table with required fields only to speedup the select query. Technologies Scala Tools/Tool chain Kafka, Spark, Spark Streaming and Spark SQL, Phoenix, Hbase , Hive , Maven, Git Project - 3 Project Name Spark ETL for fact and dim types of data Team Size 2 Start Date Jan 2016 End Date May 2016 Description This project involves ETL processing with Spark which involves pulling up the information from a variety of sources, transforms the data, and then pushes to Presto for OLTP/OLAP analytics for dim and fact type of data. Role & Contribution ● Consume the dim/fact Kafka message in spark using Kafka consumer which are produced by Kafka producer ● Extract the messages consumed by Kafka consumer ● Perform business logic and transformation on those messages ● Push the transformed data to either Hive or Presto for further OLTP and OLAP analytics
Technologies: Scala
Tools/Tool chain: Maven, Git, Kafka, Spark, Spark Streaming and Spark SQL, Phoenix, Presto
Project - 4
Project Name: Deployment of Elasticsearch to handle big data in IIMS
Team Size: 2
Start Date: Feb 2014
End Date: Dec 2015
Description: This project involves deployment of Elasticsearch to handle gigabytes of data per day.
Role & Contribution
● Prepared the deployment architecture to handle this volume of data using a cluster of 12 nodes.
● Built data visualizations and dashboards using Kibana.
● Well versed in setting up the various node types in a deployment: master, data and client nodes.
● Hands-on experience with various plug-ins such as Mapper Attachment, Head, Bigdesk and Carrot2 (with the Lingo3G categorization algorithm).
● Prepared the schema to store data in Elasticsearch.
● Prepared search queries for retrieving data from Elasticsearch using Query String, Match, Boolean, Aggregation etc.
● Set up the snapshot and restore process to take daily backups, since backup and restore is still required in case of failure even at big data scale.
● Implemented shards and replicas (to avoid single-node failure).
● Fine-tuned and optimized queries for faster response.
● Prepared a multi-index architecture to speed up search and minimize response time.
● Implemented features such as synonyms, stemming, grammar extension and wildcard search.
Technologies: C#, Elasticsearch (NoSQL Database)
Tools/Tool chain: Elasticsearch 1.5.x, Mapper Attachment, Head Plug-in, Bigdesk Plug-in, Carrot2, Lingo3g, Putty, WinSCP
Project - 5
Project Name: Big Data Platform Development
Team Size: 2
Start Date: June 2015
End Date: Dec 2015
Description: This project involves processing the logs generated by various systems and devices and performing predictive analysis to identify patterns and trace the attacking device or user.
Role & Contribution
● Collected the high-speed logs coming from various devices and ingested them using Flume.
● Passed the log data from Flume to Spark Streaming and Spark SQL to hold it in memory (Spark being known for in-memory data processing).
● Performed predictive analysis on the logs as per the defined algorithm, and also performed computation with a self-derived algorithm.
● Persisted the logs in memory for a specific time duration using Spark Streaming, and then persisted them permanently to Elasticsearch.
● Performed all of the above operations and computation using distributed computing, including setting up a 5-node HDFS cluster on the Hortonworks Data Platform (HDP).
Technologies: Core Java, Flume, HDFS, HDP Clustering, Spark, Spark Streaming and Spark SQL, Elasticsearch
Tools/Tool chain: Maven, Git, MySQL
Educational Qualifications:
Course                   Board/University   Year of Passing   Percentage
10th                     CBSE               2005              79.20
12th                     CBSE               2007              76.00
B.E. (Computer Science)  RGPV Bhopal        2011              76.44
GATE                     -                  2011              91 Percentile
CAT                      -                  2010              85 Percentile
Personal Profile:
Date of Birth: 23rd Nov, 1989
Passport No: J3031277
Willing to relocate: Depends upon location
Willingness for Onsite: Yes
PAN: ATXPA9120F
Declaration:
I hereby declare that the information provided above is correct and true to the best of my knowledge and belief.
Date: 09 Jan 2017
Place: Pune
(Prashant Agrawal)