Robin David
Mobile: +919952447654 email: robinhadoop@gmail.com
Objective:
To build a career with a leading corporate in a hi-tech environment with committed and
dedicated people, working as a key player in a challenging and creative environment, in a
position where my technical abilities, education and past work experience will be utilized to
the best benefit of the organization.
Experience:
 Around 9 years of total experience in IT, with 3+ years of experience in Hadoop
 Professional experience of 4 months with Polaris as a Senior Consultant, mainly focusing
on building a Reference Data Management (RDM) solution on Hadoop
 Professional experience of 18 months with iGATE Global Solutions as a Technical Lead,
mainly focusing on data processing with Hadoop, development of the IV3 engine (iGATE's
proprietary Big Data platform) and Hadoop cluster administration
 Professional experience of 5.7 years with Cindrel Info Tech as a Technology Specialist,
mainly focusing on ADS, LDAP, IIS, FTP, DW/BI and Hadoop
 Professional experience of 11 months with Purple Info Tech as a Software Engineer,
mainly focusing on routers and switches
Achievements in Hadoop
• Built custom plug-ins with a web user interface for:
o HDFS data encryption/decryption
o Column-based data masking on Hive
o Hive benchmarking
o Sqoop automation for data ingestion into HDFS
o Automated alerts for HDD space issues on Hadoop cluster environments (a hedged shell sketch follows this list)
• Integrated Revolution R with the Cloudera distribution
• Integrated Informatica 9.5 with HDFS for data processing on Apache and Cloudera
clusters
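As a rough illustration of the disk-space alert automation listed above, here is a minimal shell sketch; the threshold and mail recipient are hypothetical placeholders, not taken from the original plug-in.

    #!/bin/bash
    # Hedged sketch: alert when overall HDFS usage crosses a threshold.
    # THRESHOLD and the recipient address are illustrative values only.
    THRESHOLD=80
    USED_PCT=$(hdfs dfsadmin -report | awk '/DFS Used%/ {gsub(/%/,"",$3); print int($3); exit}')
    if [ "${USED_PCT}" -ge "${THRESHOLD}" ]; then
      echo "HDFS usage is ${USED_PCT}% (threshold ${THRESHOLD}%)" \
        | mail -s "HDFS space alert" hadoop-admin@example.com
    fi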
Professional Summary
• Expertise in creating secure data lake environments on HDFS
• Experience pulling data from RDBMS sources and various staging areas into HDFS
using Sqoop, Flume and shell scripts
• Providing solutions and technical architecture for migrating existing DWH systems to the
Hadoop platform
• Using the data integration tool Pentaho to design ETL jobs in the process of building
data lakes
• Experience building and processing data within DataStax Cassandra clusters
• Good experience handling data with advanced Hive features
• Experienced in the installation, configuration and management of Pivotal, Cloudera,
Hortonworks and Apache Hadoop clusters
• Designed and built IV3, IGATE's proprietary Big Data platform
• Good experience in Hadoop Distributed File System (HDFS) management
• Experience in shell scripting for various HDFS operations
• Installation, configuration and management of EMC tools (GemFire XD, Greenplum DB
and HAWQ)
• Configuring Hadoop clusters on the Amazon cloud
• Recovery of Hadoop clusters from NameNode or DataNode failures
• End-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines
against very large data sets
• Kerberos implementation on Hadoop clusters
• Configuring HA on Hadoop clusters
• Integrating the Splunk server with HDFS
• Installing Hadoop cluster monitoring tools (Pivotal Command Center, Ambari,
Cloudera Manager and Ganglia)
• Cluster health monitoring and fixing performance issues
• Data balancing on the cluster and commissioning/decommissioning DataNodes in an
existing Hadoop cluster
• Experience creating HDFS directory structures and setting access permissions for
groups and users as required for project-specific needs (see the shell sketch after this list)
• Understanding/analyzing specific jobs (projects) and their run processes
• Understanding/analyzing scripts, MapReduce code and input/output files/data for
operations support
• Built an archiving platform in a Hadoop environment
• Good understanding of Hadoop services and quick problem-resolution skills
• Good experience in ADS, LDAP, DNS, DHCP, IIS, GPO, user administration,
patch maintenance, SSH, sudo, configuring RPMs through YUM, FTP and NFS
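A minimal shell sketch of the HDFS directory and permission setup described above; the project name, group and quota are hypothetical examples.

    #!/bin/bash
    # Hedged sketch: create a project directory layout on HDFS and set
    # owner, group, permissions and a space quota. Names are placeholders.
    PROJECT=rdm
    for dir in raw staging curated; do
      hdfs dfs -mkdir -p /data/${PROJECT}/${dir}
    done
    # Restrict the project tree to its own group
    hdfs dfs -chown -R hdfs:${PROJECT}_users /data/${PROJECT}
    hdfs dfs -chmod -R 750 /data/${PROJECT}
    # Cap the space the raw zone can consume (run as the HDFS superuser)
    hdfs dfsadmin -setSpaceQuota 10t /data/${PROJECT}/raw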
Technical Skills
Operating System: RHEL 5.x/6.x, CentOS 5.x/6.x, Ubuntu, Windows server and client family
Hardware: Dell, IBM and HP
Database: MySQL, PostgreSQL, MSSQL and Oracle
Tools: Pivotal Command Center, Check_MK, Ambari, Ganglia, Cloudera Manager and GitLab
Languages: Shell scripting and Core Java
Cloud Computing Framework: AWS
Hadoop Ecosystem: Hadoop, ZooKeeper, Pig, Hive, Sqoop, Flume, Hue and Spark
Certifications:
 Microsoft Certified IT Professional (MCITP - 2008)
 Microsoft Certified Professional (MCP-2003)
 CCNA
Educational Qualifications:
Bachelor of Science (Computer)
St. Joseph’s College, Bharathidasan University - Trichy
Major Assignments:
Project 1
Market Reference Data Management is a pure-play reference data management solution built
on a big data platform and an RDBMS, to be used in the securities market industry. As part of
this project, data is collected from different market data vendors such as Reuters and Interactive
Data for different types of asset classes such as equity, fixed income and derivatives. The entire
solution is built using Pentaho as the ETL tool, and the final tables are stored in Hive. The
different downstream applications access data from Hive as and when required.
• Responsible for providing an architecture plan to implement the entire RDM solution
• Understanding the securities master data model
• Created an API for getting data from various sources
• Created Hive data models
• Designed ETL jobs with Pentaho for data cleansing, data identification and loading the data
into Hive tables
• Created Hive (HQL) scripts to process the data (a hedged HQL sketch follows this list)
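A hedged sketch of the kind of HQL processing step mentioned above, wrapped in a shell call to beeline; the database, table and column names are hypothetical and not taken from the RDM data model.

    #!/bin/bash
    # Hedged sketch: run an HQL step that loads a curated Hive table
    # from a staging table. Schema and names are illustrative only.
    beeline -u jdbc:hive2://localhost:10000 -e "
      INSERT OVERWRITE TABLE rdm.security_master
      SELECT s.isin,
             s.asset_class,
             MAX(s.last_updated) AS last_updated
      FROM   rdm.security_staging s
      WHERE  s.isin IS NOT NULL
      GROUP BY s.isin, s.asset_class;
    "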
(12 – 2015) to (till date)
Project: Reference Data Management
Domain: Solution building in the finance domain
Environment: CDH 5.0
Role: Senior Consultant
Project 2
(03 – 2015) to (09 – 2015)
Project: Data Lake Support
Client: GE – Software COE
Environment: Pivotal and EMC tools
Role: Technical Lead
GE purchases parts from different vendors across the world in all its business units. There
is no central repository to monitor vendors across the business units, and purchased parts were
charged on different scales by a vendor, its subsidiaries and other vendors. To identify
purchase price differences and to build a master list of vendors, GE Software COE and
IGATE together designed a data lake on Pivotal Hadoop consisting of all PO (purchase order)
and invoice data imported from multiple SAP/ERP sources. The data in the data lake is
cleansed, integrated with DNB to build a master list of vendors, and analyzed to identify
anomalous behavior in POs.
Job Responsibilities:
• Monitored POD and IPS Hadoop clusters; each environment has Sandbox,
Development and Production divisions
• Experience with the Check_MK tool for monitoring the Hadoop cluster environment
• Provided solutions for all Hadoop ecosystem and EMC tools
• Experience working together with EMC support
• Shell scripting for various Hadoop operations
• User creation and quota allocation on the Hadoop cluster and GPDB environments
• Talend support
• Experience with the GitLab tool
• Provided solutions for performance issues in the Greenplum DB environment
• Brought failed segments back to active in the GPDB environment (see the recovery sketch after this list)
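A hedged shell sketch of the Greenplum segment-recovery flow referred to above, using the standard gpstate/gprecoverseg utilities; it assumes the commands are run as the gpadmin user on the master host.

    #!/bin/bash
    # Hedged sketch: bring failed Greenplum segments back to active.
    gpstate -e            # list segments with mirror/status issues
    gprecoverseg -a       # incremental recovery of failed segments, without prompting
    gpstate -m            # confirm mirrors have resynchronized
    gprecoverseg -r       # rebalance segments back to their preferred roles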
Project 3
(01 – 2014) to (06 – 2014)
Project: Retail Omni-channel Solution
Client: Retail Giant in the US
Environment: Cloudera Distribution CDH (4.x)
Role: Technical Lead
The Retail Omni-channel Solution leverages cross-channel analytics (web, mobile and store) along
with Bluetooth LE technology in the store to deliver a superior customer experience. Targeted
messages / personalized promotions are delivered at the right time (the Moment of Truth) to
maximize sales conversions.
Job Responsibilities:
• Created Hive SQL scripts for data processing and for merging data from multiple tables
• Loaded data into HDFS
• Exported data from HDFS to an RDBMS (MySQL) using Sqoop (see the export sketch after this list)
• Created a script for IROCS automation
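A hedged sketch of the HDFS-to-MySQL export step mentioned above; the connection string, table and export directory are placeholder values.

    #!/bin/bash
    # Hedged sketch: export processed results from HDFS into MySQL with Sqoop.
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/retail \
      --username etl_user -P \
      --table daily_promotions \
      --export-dir /data/retail/output/daily_promotions \
      --input-fields-terminated-by '\t' \
      -m 4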
Project 4
(12 – 2013) to (09 – 2015)
Project: IV3 (Proprietary Big Data Platform)
Client: IGATE
Environment: CDH, HDP and Pivotal
Role: Technical Lead
The principal motivation for IV3 is to provide a turnkey Big Data platform that abstracts the
complexities of technology implementation and frees up bandwidth to focus on creating
differentiated business value. IV3 is a software-based big data analytics platform designed to
work with enterprise-class Hadoop distributions, providing an open architecture and
big-data-specific software engineering processes. IV3 is power-packed with components and
enablers covering the life cycle of a Big Data implementation, from data ingestion, storage and
transformation to various analytical models. It aims to marshal the three Vs of Big Data
(Volume x Velocity x Variety) to deliver the maximum business impact.
Job Responsibilities:
• Implemented data ingestion (RDBMS to HDFS) in the IV3 platform (see the Sqoop sketch after this list)
• Tested IV3 tools on different Hadoop distributions
• Configured the automatic YARN memory calculator on the IV3 platform
• HDFS data encryption/decryption
• Column-based data masking on Hive
• Hive benchmarking
• Sqoop automation for data ingestion into HDFS
• Created an automation script for detecting HDD space issues on Hadoop cluster
environments
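A hedged sketch of the Sqoop-based ingestion automation described above; the JDBC URL, credentials, table list and target paths are illustrative placeholders, not IV3 internals.

    #!/bin/bash
    # Hedged sketch: loop over a list of source tables and import each into HDFS.
    TABLES="purchase_orders invoices vendors"
    for tbl in ${TABLES}; do
      sqoop import \
        --connect jdbc:oracle:thin:@dbhost:1521/ERP \
        --username ingest_user --password-file /user/ingest/.dbpass \
        --table ${tbl} \
        --target-dir /data/landing/${tbl}/$(date +%Y%m%d) \
        --as-parquetfile \
        -m 4
    done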
Project 5
(06 – 2014) to (09 – 2014)
Project: Predictive Fleet Maintenance
Client: Penske
Environment: Cloudera (CDH 4.6) – Hive
Role: Technical Lead
Penske's business requirement is about collecting data from a repository of untapped data –
vehicle diagnostics, maintenance and repair – which can potentially be leveraged to generate
economic value. Penske wants to create a future-ready Big Data platform to efficiently store,
process and analyze the data in consonance with their strategic initiatives. Penske engaged
IGATE to partner with them in this strategic initiative to tap the insights hidden in diagnostics,
maintenance and repair data. IGATE leveraged its state-of-the-art Big Data Engineering lab
to implement the data engineering and data science parts of this project.
Job Responsibilities:
• Understood project scope, business requirements and current business processes
• Mapped business requirements to use cases
• Implemented use cases using Hive
Project 6
(01 – 2012) to (11 – 2013)
Project: WA-Insights and Analytics
Client: Watenmal Group – Global
Environment: Apache Hadoop – Hive, MapReduce, Sqoop
Role: Technology Specialist
WAIA is intended to support all retail business segments involved in the sale of goods and
supporting services. The WAIA retail store integrated data model addresses three major aspects
of a store business: (1) the physical flow and control of merchandise into, through and out of the
store; (2) the selling process, where the products and services offered for sale are transformed
into tender and sales revenue is recognized; and (3) the control and tracking of tender from the
point of sale where it is received through to its deposit into a bank or other depository.
Job Responsibilities:
• Understanding Hadoop's main components and architecture
• Data migration from RDBMS to HDFS using Sqoop
• Understanding the nuances of MapReduce programs and UDFs
• Data merging and optimization in Hive
• Adding new DataNodes to the existing Hadoop cluster
• Safely decommissioning failed DataNodes (see the sketch after this list)
• Monitoring the Hadoop cluster with the Ganglia tool
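A hedged sketch of the DataNode decommissioning step mentioned above, assuming the standard dfs.hosts.exclude mechanism; the exclude-file path and hostname are placeholders.

    #!/bin/bash
    # Hedged sketch: safely decommission a failing DataNode.
    EXCLUDE_FILE=/etc/hadoop/conf/dfs.exclude
    echo "datanode07.example.com" >> ${EXCLUDE_FILE}
    # Ask the NameNode to re-read the exclude file; blocks are re-replicated
    # before the node is marked Decommissioned.
    hdfs dfsadmin -refreshNodes
    # Check progress until the node shows as Decommissioned
    hdfs dfsadmin -report | grep -A 2 "datanode07.example.com"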
Project 7:
(05 – 2008) to (12 – 2011)
Project: eyeTprofit and Magniss
Client: SI and Arjun Chemicals
Environment: Windows – SQL Server, .NET Framework and LDAP
Role: Technology Specialist
EyeTprofit and Magniss enable businesses to easily analyze profitability, budget versus actual,
revenue, inventory, cash requirements, etc. instantaneously, especially when the information
is spread across multiple applications. EyeTprofit and Magniss form a non-invasive reporting
system that facilitates information from multiple functions being culled out and presented in an
appropriate form to enable informed decisions.
Job Responsibilities:
• LDAP integration with the BI tool
• Administering Active Directory, DHCP and DNS servers
• Managing group policies
• Distributed File System (DFS) management
• Administering FTP and IIS servers
• Patch maintenance
• Administering file shares and disk quotas
• Providing access to share-drive users
• Remote support
• Configuring virtual machines
Project 8:
(05 – 2007) to (04 – 2008)
Project: TRMS
Client: TTP
Environment: Windows – Storage Server, routers and Layer 3 switches
Role: Software Engineer
A state-of-the-art traffic management system, the first of its kind in India. It helps regulate and
enforce the law with the efficiency, expediency and accuracy of technology.
Job Responsibilities:
• Expertise in handling wireless communication
• Maintaining hand-held computer devices
• Expertise in handling the storage server
• Responsible for designing the network topology at client sites and implementing
the infrastructure with high security and hierarchy
• Handling network-related issues at client sites, mainly debugging problems
related to network peripherals