2. Evolution is Continuous, and so are
Big Data and Streaming Pipelines
Gustav Rånby
Data Scientist
Sarah Hantosi Albertsson
Data Scientist
3. About us
▪ The team
▪ Use cases and needs
▪ Challenges & learnings
DevOps
▪ Components, releases and pipelines
▪ Putting it together
Summing up
7. [Figure: CO2 reduction (%), vertical axis 0 to 100, horizontal axis 2020 to 2030; years marked on the slide: 2020, 2022, 2025, 2030, 2033, 2035]
12. Tests enable iterative improvements and communication around properties
of a data product.
Photo by ThisisEngineering RAEng on Unsplash
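As an illustration of what a test on the properties of a data product might look like, here is a minimal sketch in Python. The `Stop` record, its fields, and the two properties checked are assumptions made for the example; they are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stop:
    """Hypothetical record of a 'stops' data product (all fields are assumptions)."""
    vehicle_id: str
    start_ts: int  # epoch seconds
    end_ts: int    # epoch seconds


def check_stop_properties(stops):
    """Return a list of readable violations; an empty list means every property holds."""
    violations = []
    for i, stop in enumerate(stops):
        if not stop.vehicle_id:
            violations.append(f"row {i}: missing vehicle_id")
        if stop.end_ts < stop.start_ts:
            violations.append(f"row {i}: stop ends before it starts")
    return violations


# Example inputs: one batch that satisfies the properties and one that does not.
good = [Stop("v1", 100, 160), Stop("v2", 200, 200)]
bad = [Stop("", 100, 160), Stop("v3", 300, 250)]

print(check_stop_properties(good))
print(check_stop_properties(bad))
```

Returning the violations as plain sentences, rather than only a pass/fail flag, is one way to make the test double as communication about what the data product promises.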
16. Component
- name: Copy job
  copy:
    src: ../target/scala-2.11/{{ jar }}
    dest: "{{ local_dir }}/{{ jar }}"
- name: Get Kerberos ticket
  command: kinit -kt {{ kerberos_keytab }} {{ kerberos_principal }}
- name: Install jaas config worker
  template:
    src: jaas.conf
    dest: "{{ local_dir }}/{{ jaas_config_file }}"
- name: Install jaas config driver
  template:
    src: jaas_driver.conf
    dest: "{{ local_dir }}/{{ jaas_config_file_driver }}"
- name: Submit job
  command: "{{ spark_submit_command }}"
  args:
    chdir: "{{ local_dir }}"
  environment:
    SPARK_MAJOR_VERSION: 2
  register: submit_result
  ignore_errors: True
Copy the program to the cluster
Ensure authentication
Run program
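The control flow of the tasks above (run steps in order, record the result of the submit step, and do not abort when a step is marked `ignore_errors`) can be sketched in plain Python. This is only an illustration of the pattern, not how Ansible is implemented. `run_steps` and `fake` are hypothetical helpers, and the commands are placeholders that just exit with a chosen status code.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, argv, ignore_errors) steps in order; return {name: return code}.

    Execution stops at the first failing step unless that step sets
    ignore_errors, loosely mirroring `ignore_errors: True` in the playbook.
    """
    results = {}
    for name, argv, ignore_errors in steps:
        completed = subprocess.run(argv, capture_output=True, text=True)
        results[name] = completed.returncode
        if completed.returncode != 0 and not ignore_errors:
            break
    return results


def fake(code):
    """Placeholder command (an assumption): a Python process exiting with `code`."""
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]


steps = [
    ("Copy job", fake(0), False),
    ("Get Kerberos ticket", fake(0), False),
    ("Submit job", fake(1), True),  # failure is recorded, not fatal
]
submit_results = run_steps(steps)
print(submit_results)
```

Recording the return code instead of failing outright leaves the decision about a failed submit to whatever inspects the registered result afterwards.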
22. Pipeline specification
{"pipelines": [{
  "pipeline_annotation": "2",
  "release_name": "2.1",
  "run_process_version": "2",
  "dag": {
    "start": {
      "job": "positions-cleaned",
      "next": ["positions-to-stops"]
    },
    "positions-to-stops": {
      "job": "positions-to-stops",
      "next": ["stops-to-poi-stops"]
    },
    "stops-to-locations": {
      "job": "stops-to-location",
      "next": ["post-process-stops", "post-process-locations"]
    },
    "post-process-stops": {
      "job": "post-process",
      "arguments": "--in stop_topic --out hdfs_path"
    },
    "post-process-locations": {
      "job": "post-process",
      "arguments": "--in location_topic --out hdfs_path"
    }
  }
}]
}
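A specification like this lends itself to automated checks before a pipeline runs. The sketch below assumes that each entry in `next` names another key of `dag` and that execution begins at `start`; both are readings of the slide, not documented semantics. `dangling_edges` and `reachable_from` are hypothetical helpers, and the embedded spec is a shortened copy with the `arguments` and version fields omitted.

```python
import json

# Shortened copy of the specification shown on the slide.
SPEC = json.loads("""
{"pipelines": [{
  "release_name": "2.1",
  "dag": {
    "start": {"job": "positions-cleaned", "next": ["positions-to-stops"]},
    "positions-to-stops": {"job": "positions-to-stops",
                           "next": ["stops-to-poi-stops"]},
    "stops-to-locations": {"job": "stops-to-location",
                           "next": ["post-process-stops", "post-process-locations"]},
    "post-process-stops": {"job": "post-process"},
    "post-process-locations": {"job": "post-process"}
  }
}]}
""")


def dangling_edges(dag):
    """Return (node, target) pairs where `next` names a node that is not defined."""
    return [(name, target)
            for name, node in dag.items()
            for target in node.get("next", [])
            if target not in dag]


def reachable_from(dag, start="start"):
    """Return the defined nodes reachable from `start` by following `next`."""
    seen, stack = set(), [start]
    while stack:
        name = stack.pop()
        if name in seen or name not in dag:
            continue
        seen.add(name)
        stack.extend(dag[name].get("next", []))
    return seen


dag = SPEC["pipelines"][0]["dag"]
print(dangling_edges(dag))
print(sorted(set(dag) - reachable_from(dag)))
```

Under these assumptions the check reports that `positions-to-stops` points at `stops-to-poi-stops`, which has no node of that name, so `stops-to-locations` and both post-processing nodes are unreachable from `start`. Whether that is a slip on the slide or an abbreviation of a larger specification cannot be determined from the text alone.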
23. Putting it together
[Diagram labels]
Code repository
Project artifact
Integration test
Update repository
COMPONENT: Build, Test, Publish
Run
COMPONENT: Build, Test, Publish
RELEASE
PIPELINE SPECIFICATION