Google Certified Professional - Data Engineer
Job Role Description
A Google Certified Professional - Data Engineer enables data-driven decision making by collecting,
transforming, and visualizing data. The data engineer should be able to design, build, maintain, and
troubleshoot data processing systems with a particular emphasis on the security, reliability,
fault-tolerance, scalability, fidelity, and efficiency of such systems. The data engineer should also be able
to analyze data to gain insight into business outcomes, build statistical models to support
decision-making, and create machine learning models to automate and simplify key business processes.
Certification Exam Guide
Section 1: Designing data processing systems
1.1 Designing flexible data representations. Considerations include:
● future advances in data technology
● changes to business requirements
● awareness of current state and how to migrate the design to a future state
● data modeling
● tradeoffs
● distributed systems
● schema design
1.2 Designing data pipelines. Considerations include:
● future advances in data technology
● changes to business requirements
● awareness of current state and how to migrate the design to a future state
● data modeling
● tradeoffs
● system availability
● distributed systems
● schema design
● common sources of error (e.g., removing selection bias)
1.3 Designing data processing infrastructure. Considerations include:
● future advances in data technology
● changes to business requirements
● awareness of current state and how to migrate the design to a future state
● data modeling
● tradeoffs
● system availability
● distributed systems
● schema design
● capacity planning
● different types of architectures: message brokers, message queues, middleware,
service-oriented
Section 2: Building and maintaining data structures and databases
2.1 Building and maintaining flexible data representations
2.2 Building and maintaining pipelines. Considerations include:
● data cleansing
● batch and streaming
● transformation
● acquire and import data
● testing and quality control
● connecting to new data sources
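
To illustrate the cleansing, transformation, and quality-control considerations listed under 2.2, here is a minimal Python sketch. The record fields (`user_id`, `amount`) and the function names `cleanse` and `quality_check` are hypothetical and do not come from the exam guide; a real pipeline would apply rules specific to its own data.

```python
from typing import Iterable


def cleanse(records: Iterable[dict]) -> list[dict]:
    """Drop malformed rows and normalize the rest (hypothetical schema)."""
    cleaned = []
    for row in records:
        user_id = str(row.get("user_id", "")).strip()
        try:
            amount = float(row.get("amount"))
        except (TypeError, ValueError):
            continue  # skip rows whose amount is missing or not numeric
        if not user_id:
            continue  # skip rows without an identifier
        cleaned.append({"user_id": user_id, "amount": round(amount, 2)})
    return cleaned


def quality_check(rows: list[dict]) -> bool:
    """Simple quality gate: every row has an id and a non-negative amount."""
    return all(r["user_id"] and r["amount"] >= 0 for r in rows)
```

Running `quality_check` on the output of `cleanse` before loading is one way to keep testing inside the pipeline rather than after it.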
2.3 Building and maintaining processing infrastructure. Considerations include:
● provisioning resources
● monitoring pipelines
● adjusting pipelines
● testing and quality control
Section 3: Analyzing data and enabling machine learning
3.1 Analyzing data. Considerations include:
● data profiling
● data correlation
● patterns and insights
● anomaly detection
● statistical models
● machine learning
● assessing the statistical relevance of conclusions
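
As a small illustration of the anomaly detection consideration in 3.1, the sketch below flags values that lie far from the mean using a z-score. This is only one simple technique, chosen here for brevity; the function name `zscore_anomalies` and the default threshold of 3.0 are assumptions, not something the exam guide prescribes.

```python
from statistics import mean, stdev


def zscore_anomalies(values: list[float], threshold: float = 3.0) -> list[int]:
    """Return indices of values more than `threshold` sample standard
    deviations away from the mean."""
    if len(values) < 2:
        return []  # stdev needs at least two data points
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]
```

A z-score assumes roughly bell-shaped data and is itself distorted by large outliers, so a more robust measure may be preferable in practice.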
3.2 Transforming data to enable machine learning and pattern discovery. Considerations
include:
● repeatability
● generalization
● distributed computing
● improved model accuracy
3.3 Identifying or building data visualization and reporting tools. Considerations include:
● automation
● decision support
● data summarization
● enabling patterns and insights
Section 4: Modeling business processes for analysis and optimization
4.1 Mapping business requirements to data representations. Considerations include:
● working with business users
● gathering business requirements
4.2 Optimizing data representations, data infrastructure performance and cost.
Considerations include:
● resizing and scaling resources
● data cleansing, distributed systems
● high performance algorithms
● common sources of error (e.g., removing selection bias)
Section 5: Ensuring reliability
5.1 Performing quality control. Considerations include:
● verification
● building and running test suites
● pipeline monitoring
5.2 Assessing, troubleshooting, and improving data representations and data processing
infrastructure.
5.3 Recovering data. Considerations include:
● planning (e.g., fault tolerance)
● executing (e.g., rerunning failed jobs, performing retrospective re-analysis)
● stress testing data recovery plans and processes
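
The "rerunning failed jobs" item in 5.3 can be sketched as a retry wrapper with exponential backoff. This is a minimal sketch under stated assumptions: `run_with_retries`, `max_attempts`, and `base_delay` are hypothetical names, and the job is assumed to be safe to run more than once (idempotent).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(job: Callable[[], T], max_attempts: int = 3,
                     base_delay: float = 1.0) -> T:
    """Rerun a failed job, doubling the wait after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Managed schedulers and processing services typically offer their own retry settings, so a hand-written wrapper like this is mainly useful for custom scripts.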
Section 6: Visualizing data and advocating policy
6.1 Building (or selecting) data visualization and reporting tools. Considerations include:
● automation
● decision support
● data summarization (e.g., translation up the chain, fidelity, trackability, integrity)
6.2 Advocating policies and publishing data and reports.
Section 7: Designing for security and compliance
7.1 Designing secure data infrastructure and processes. Considerations include:
● Identity and Access Management (IAM)
● data security
● penetration testing
● Separation of Duties (SoD)
● security control
7.2 Designing for legal compliance. Considerations include:
● Health Insurance Portability and Accountability Act (HIPAA), Children’s Online
Privacy Protection Act (COPPA), etc.
● audits
