3. Why Government Data Lake?
Središnji državni ured za razvoj digitalnog društva (Central State Office for the Development of the Digital Society)
1. Enhanced data-driven decision making
2. Improved transparency and accountability
3. Cost-effective data management
4. Cross-department collaboration and knowledge sharing
5. Real-time monitoring and early intervention
6. Evidence-based policy-making
7. Predictive analytics and forecasting
4. Introduction
• The project is funded by the National Recovery and Resilience Plan
• The investment is worth EUR 16 600 000
• Implementation period: 7/2021 to 6/2026
5. ETL pipeline
[Diagram labels: LOB applications, relational, dedicated ETL tools, defined schema, queries, results]
Traditional business analytics process
1. Start with end-user requirements to identify desired reports
and analysis
2. Define corresponding database schema and queries
3. Identify the required data sources
4. Create an Extract-Transform-Load (ETL) pipeline to extract
required data (curation) and transform it to the target schema ("schema-on-write")
5. Create reports, analyze data
All data not immediately required is discarded or archived
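The schema-on-write steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table name `service_requests`, the sample records, and the helper names `transform` and `run_etl` are hypothetical and do not come from the platform described in these slides. The standard-library `sqlite3` module stands in for a relational warehouse.

```python
import sqlite3

# Hypothetical raw records from a line-of-business application.
RAW_RECORDS = [
    {"office": "Zagreb", "service": "e-ID", "requests": "120", "note": "pilot"},
    {"office": "Split", "service": "e-ID", "requests": "85", "channel": "web"},
    {"office": "Zagreb", "service": "permits", "requests": "n/a"},
]


def transform(record):
    """Map a raw record onto the fixed target schema, or return None to reject it."""
    try:
        requests = int(record["requests"])
    except (KeyError, ValueError):
        return None  # does not fit the schema, so it is dropped
    # Fields outside the schema ("note", "channel") are discarded here.
    return (record["office"], record["service"], requests)


def run_etl(records):
    """Extract, transform, and load the records into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # The schema is defined before any data is stored (schema-on-write).
    conn.execute(
        "CREATE TABLE service_requests ("
        "office TEXT NOT NULL, service TEXT NOT NULL, requests INTEGER NOT NULL)"
    )
    rows = [row for row in map(transform, records) if row is not None]
    conn.executemany("INSERT INTO service_requests VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


conn = run_etl(RAW_RECORDS)
report = conn.execute(
    "SELECT office, SUM(requests) FROM service_requests "
    "GROUP BY office ORDER BY office"
).fetchall()
print(report)
```

Note that the `note` and `channel` fields and the malformed third record never reach the table, which is the "discarded or archived" behaviour the slide describes.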
7. New big data thinking: All data has value
[Diagram: gather data from all sources, store indefinitely, analyze, see results, iterate]
• All data has potential value
• Data hoarding
• No defined schema; data is stored in native format
• Schema is imposed and transformations are done at query time (schema-on-read)
• Apps and users interpret the data as they see fit
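As a contrast to schema-on-write, the following minimal Python sketch shows schema-on-read: raw records stay in their native format, and each query supplies its own schema. The sample lines, the field names, and the helper `read_with_schema` are hypothetical and chosen only for illustration.

```python
import json

# Raw events kept exactly as they arrived (hypothetical sample data).
RAW_LINES = [
    '{"office": "Zagreb", "service": "e-ID", "requests": "120", "note": "pilot"}',
    '{"office": "Split", "service": "e-ID", "requests": 85, "channel": "web"}',
    '{"office": "Zagreb", "service": "permits", "requests": "n/a"}',
]


def read_with_schema(lines, schema):
    """Parse raw lines and apply a caller-supplied schema at query time.

    `schema` maps a field name to a converter. Records that a converter
    cannot handle are skipped for this query only and stay in the store.
    """
    for line in lines:
        record = json.loads(line)
        try:
            row = {name: convert(record[name]) for name, convert in schema.items()}
        except (KeyError, ValueError, TypeError):
            continue
        yield row


# Query 1: total requests per office.
totals = {}
for row in read_with_schema(RAW_LINES, {"office": str, "requests": int}):
    totals[row["office"]] = totals.get(row["office"], 0) + row["requests"]

# Query 2: a different reader asks a question nobody planned for.
channels = [row["channel"] for row in read_with_schema(RAW_LINES, {"channel": str})]

print(totals)
print(channels)
```

The second query works only because the `channel` field was never thrown away at load time; the cost is that every reader has to handle missing or malformed values itself.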
8. Data Warehousing vs Data Lakes
• Data Warehousing
  • Structured data
  • Defined set of schemas
  • Requires Extract-Transform-Load (ETL) before storing
  • Exploratory analysis is hard because the data has to be transformed first
  • …
• Data Lake
  • Raw data (unstructured/semi-structured/structured)
  • "Dump" all your data in the lake
  • Data scientists will interpret data from the lake
  • Without metadata, it turns into a data swamp pretty fast
  • …
9. Data Lake: requirements
• Collaborative projects
• Data ingestion
• Data asset catalog
• Dataset blend
• Data assets, tagging & sharing
• Data discovery and recommendations
• Usage, lineage & security
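Several of these requirements (asset catalog, tagging, discovery, lineage) come down to keeping metadata next to the data. The sketch below is one possible way to model that in plain Python. The `DataAsset` and `Catalog` classes, their fields, and the sample assets are hypothetical; they do not describe the actual platform.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """Catalog entry describing one dataset in the lake (hypothetical model)."""
    name: str
    owner: str
    location: str
    tags: set = field(default_factory=set)
    derived_from: list = field(default_factory=list)  # names of parent assets


class Catalog:
    def __init__(self):
        self._assets = {}

    def register(self, asset):
        # Refuse lineage that points at assets the catalog does not know.
        missing = [p for p in asset.derived_from if p not in self._assets]
        if missing:
            raise ValueError(f"unknown parent assets: {missing}")
        self._assets[asset.name] = asset

    def find_by_tag(self, tag):
        """Discovery: names of all assets carrying the given tag."""
        return sorted(a.name for a in self._assets.values() if tag in a.tags)

    def lineage(self, name):
        """Return all upstream asset names, nearest parents first."""
        seen, queue = [], list(self._assets[name].derived_from)
        while queue:
            parent = queue.pop(0)
            if parent not in seen:
                seen.append(parent)
                queue.extend(self._assets[parent].derived_from)
        return seen


catalog = Catalog()
catalog.register(DataAsset("raw_requests", "Office A", "lake/raw/requests",
                           {"raw", "services"}))
catalog.register(DataAsset("raw_offices", "Office B", "lake/raw/offices", {"raw"}))
catalog.register(DataAsset("requests_by_office", "Office A",
                           "lake/curated/requests_by_office",
                           {"curated", "services"},
                           ["raw_requests", "raw_offices"]))

services = catalog.find_by_tag("services")
upstream = catalog.lineage("requests_by_office")
print(services)
print(upstream)
```

Rejecting unknown parents at registration time is a deliberate choice in this sketch: it keeps lineage complete, which is one guard against the "data swamp" problem mentioned on the previous slide.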
10. Scheme of the platform
11. Data strategy
1. Managing the cross-sectoral use of data: proposals for a cross-sectoral governance framework for data access and use
2. Scalable infrastructure for viable data markets: proposals for investments in data and strengthening Europe's capabilities and infrastructures for hosting, processing and using data, interoperability
3. Empowering individuals and business with the re-use of data: proposals for empowering individuals, investing in skills and in SMEs
4. European single markets need cross-sectoral data flow: proposals for common European data spaces in strategic sectors and domains of public interest
12. European digital strategy
• The Data Strategy and the White Paper on Artificial Intelligence are the first
pillars of the new digital strategy of the Commission.
• They all focus on the need to put people first in developing technology, as
well as on the need to defend and promote European values and rights in
how we design, make and deploy technology in the real economy.
• Access to data and the ability to use it are essential for innovation and growth.
Data-driven innovation will bring major and concrete benefits, such as:
  • personalised medicine
  • improved mobility
  • better policymaking
  • upgrading public services
13. European data strategy: European Data Act
14. Common European data spaces
15. Government Data into Data Spaces
• data.europa.eu
• Data Governance Act (DGA)
• General Data Protection Regulation (GDPR)
• Directive on open data and the re-use of public sector information
• Regulation on a framework for the free flow of non-personal data in the European Union
1. Enhanced data-driven decision making: A data lake allows public bodies to collect and store vast amounts of data from various sources. By integrating Tableau or similar analytical tools, decision-makers can explore and visualize this data to gain insights and make informed decisions. It enables government officials to identify trends, patterns, and correlations within the data, leading to more effective policy planning and implementation.
2. Improved transparency and accountability: Public bodies are responsible for serving the citizens and being accountable for their actions. By centralizing data from different government departments and agencies in a data lake, transparency can be improved. Tableau's visualization capabilities make it easier to present data in a user-friendly and accessible format for the public. This promotes trust and allows citizens to understand how public resources are being utilized.
3. Cost-effective data management: Traditionally, government data is spread across multiple systems and databases, making it difficult to manage and utilize efficiently. By adopting a data lake architecture, public bodies can consolidate their data into a single, scalable platform. This approach reduces the need for costly data integration projects and simplifies data management, saving time and resources.
4. Cross-department collaboration and knowledge sharing: A government data lake can act as a central repository for data from various departments and agencies. This shared data infrastructure facilitates collaboration and knowledge sharing among different government entities. Tableau's collaborative features allow users to share insights, dashboards, and reports, fostering a data-driven culture across the government.
5. Real-time monitoring and early intervention: With a data lake and Tableau's analytical capabilities, public bodies can monitor key metrics and indicators in real time. This enables early detection of issues, such as emerging trends or anomalies in data patterns. By promptly recognizing potential problems, governments can intervene before they escalate, improving public service delivery.
6. Evidence-based policy-making: A data lake combined with an analytical tool enables evidence-based policy-making. Decision-makers can analyze historical and current data to evaluate the effectiveness of existing policies and programs. By leveraging these insights, governments can design and implement policies that are better aligned with the needs and preferences of the citizens they serve.
7. Predictive analytics and forecasting: By utilizing advanced analytical capabilities, such as predictive modeling and forecasting, governments can anticipate future trends and make more accurate predictions. This empowers public bodies to plan and allocate resources effectively, optimize service delivery, and mitigate potential risks.