Co-funded by the European Commission
Horizon 2020 - Grant #777154
Managing Trustworthy Big-Data Applications in the Cloud with the ATMOSPHERE Platform
Ignacio Blanquer
ATMOSPHERE EU Project coordinator
Francisco Brasileiro
ATMOSPHERE Brazil Project coordinator
• ATMOSPHERE is a 24-month H2020
project aiming at the design and
development of a framework and a
platform to implement trustworthy
cloud services on a federated
intercontinental cloud.
• Expected Results
• A federated cloud platform.
• A development framework
• Trustworthy evaluation and monitoring
• Trustworthy Distributed Data Management
• Trustworthy Distributed Data Processing
• A pilot use case on Medical Imaging
Processing.
The Project
[Figure: platform layers: Application, Trustworthy Data Processing Services (TDPS), Trustworthy Data Management Services (TDMS), Infrastructure Management Services (IMS), Federated Infrastructure, and Trustworthiness Monitoring & Assessment (TMA).]
The problem
I need to build an Image Processing Tool that
uses sensitive data and has a high computing
demand. Once developed, I want to exploit it as a
service securely and with a Quality of Service.
I do not want to take care of the infrastructure, resource
management, job scheduling, secure access and
similar burdens. Moreover, I want to guarantee that
no sensitive data is exposed outside of the country
where it was produced.
Target: Diagnosis of RHD
• PROVAR study – the first large-scale RHD screening program in
Brazil.
• RHD Screening: public schools, private schools and primary health
units in the cities of Belo Horizonte,
Montes Claros and Bocaiúva,
Minas Gerais, Brazil.
The Data
• The characterization of Echo-cardio
images obtained in public schools
• 5,600 exams, with an average of 14
videos per exam (total of 75,836
videos)
• 5,330 exams are classified as normal (with a
total of 71,686 videos) - 95%
• 238 exams are classified as borderline RHD
(with a total of 3,649 videos) - 4%.
• 32 exams are classified as definite RHD (with a
total of 501 videos) - 1%.
• Additionally, there is another databank with 3.5
million electrocardiograms from the same
population area and age group.
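The class proportions quoted above can be rechecked from the exam counts. The short sketch below recomputes them, together with the average number of videos per exam; the variable names are illustrative only.

```python
# Exam and video counts per class, as reported for the PROVAR data set.
exams = {"normal": 5330, "borderline RHD": 238, "definite RHD": 32}
videos = {"normal": 71686, "borderline RHD": 3649, "definite RHD": 501}

total_exams = sum(exams.values())
total_videos = sum(videos.values())

# Share of exams per class, in percent.
shares = {label: 100 * n / total_exams for label, n in exams.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1f}% of exams")
print(f"videos per exam: {total_videos / total_exams:.1f}")
```

The definite RHD class is below 1% of the exams, so any classifier trained on this data has to cope with a strong class imbalance.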
Image Biobank Requirements
Mean age: 13 ± 3 y.o.
Female sex: 55%.
• Sensitive data must not be accessible out of the boundaries of
the hosting country
• Sensitive data is protected by the Brazilian LGPD and must be processed under high
access-protection means, robust even in a potentially vulnerable cloud offering.
• Anonymous data, though, can be released but should be kept accessible only in a
secured environment.
• Medical Imaging processing and Machine Learning model
building requires intensive computing resources
• The capabilities for processing may not be accessible in the boundaries where the
data is located and therefore such processing algorithms must run elsewhere.
• The access should be coherent and secure, and image processing should be efficient.
• Experiments should be reproducible and stable
• The model building, image processing and classification should run on well-defined
environments that could be reproduced for further analysis.
• Trust is a choice that is based on past experience. Trust takes time to
build, but it can disappear in a second.
• Trusting cloud services is as complicated as trusting people. You need a
way to measure it and pieces of evidence to build trust.
• Trust in a cloud environment is considered as the reliance of a customer on a cloud
service and, consequently, on its provider.
• Trust is based on a broad spectrum of properties such as
Security, Privacy, Coherence, Isolation, Stability,
Fairness, Transparency and Dependability.
• Nowadays, few approaches deal with the
quantification of trust in cloud computing.
What is trust?
• Along with these
requirements, we explore
other requirements:
• Measurement of the Fairness of
the models to evaluate the bias
of the model with respect to
sensitive categories, such as
gender or race.
• Evaluation of the Explainability
of the model.
• Evaluation of the privacy loss
risk to determine the quality of
the anonymisation and the
potential leakage of personal
data inside the models.
Trust in Health Data Processing
... successfully reidentified the demographic data of
4478 adults (94.9%) & 2120 children (87.4%) …
(P < .001)
The Previous situation
Application Developers
- Who develop the tools for
processing the data.
- They require the
infrastructure to provide
some types of services and
resources, such as
computing, secure storage,
high-availability, data
persistence.
- They will deliver the
applications to others
to operate.
Application Manager
- An Application Developer may
not be in charge of deploying
the application on the
production infrastructure.
- The deployment implies the
monitoring and management
of the resources, services,
user accounts and data.
- The Application Manager will
have access credentials to the
infrastructure and will decide
the optimal allocation of the
resource.
End-Users
- Data providers and Data
scientists exploring and
processing data.
- Need for secure data
transfer and data access
tracing, as well as
simplified processing
tools.
- No need to worry about
acquiring ICT skills.
The ATMOSPHERE Platform
One platform, multiple dimensions
• The platform can be described considering
different conceptual dimensions
• Users and their roles
• Service delivery models
• Service classes
• Application life cycle
Users and their roles
[Figure: the ATMOSPHERE Platform runs on a Federated Infrastructure made of several Resource Providers and supports Trustworthy Applications & Services. Roles shown: application developer, data scientist, application manager, system administrator and data owner.]
Service delivery models
[Figure: the ATMOSPHERE Platform delivers Federation-wide Services and Services Toolboxes, which Trustworthy Applications can EMBED or USE.]
Service classes
Application life cycle
Design time
Execution time
Summary of the Main Services Available
TMA Layer
TMA: Design and interfaces
TDPS Layer
● Lemonade* is a web-based system for
designing and running analytics
applications.
● Users, who are not necessarily
programmers, describe applications as
workflows; Lemonade generates code and
controls their execution.
● Workflows consist of operations (boxes) and
data flows (arrows) among them,
performing:
⁃ Data preparation and engineering
⁃ Machine learning methods (MLlib)
⁃ Visualization metaphors
LEMONADE
Supported Trustworthiness properties
Property | Developers | Data Scientists
Stability | Stability strategies (e.g., cross-validation) | Quality assurance of model outcome (e.g., calibrate cross-validation and evaluate accuracy variance)
Privacy | Privacy-preserving algorithms and techniques (e.g., k-anonymity) | Assess the impact of preserving privacy on the outcome utility and effectiveness
Transparency | Transparency methods to be combined with different data analytic flows (e.g., LIME/SHAP methods) | Execute ML models and, based on explanations, calibrate the model or enhance the input
Fairness | Fairness-enhancing mechanisms and strategies (e.g., Aequitas toolkit) | Generate a report to evaluate fairness and decide on features to include in models
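The stability row above mentions cross-validation and accuracy variance. As a minimal sketch of what such a check could look like (not the LEMONADE implementation), the code below runs a k-fold cross-validation with a toy threshold classifier on synthetic data and reports the mean and variance of the per-fold accuracy. The classifier, the data and all function names are assumptions made for illustration.

```python
import random
import statistics

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(samples, labels, fit, predict, k=5):
    """Return the accuracy obtained on each of the k held-out folds."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k):
        held_out_set = set(held_out)
        train = [i for i in range(len(samples)) if i not in held_out_set]
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, samples[i]) == labels[i] for i in held_out)
        accuracies.append(hits / len(held_out))
    return accuracies

# Toy classifier (illustrative assumption): threshold at the midpoint between
# the two class means of a single numeric feature; class 1 has larger values.
def fit_threshold(xs, ys):
    mean0 = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    mean1 = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    return (mean0 + mean1) / 2

def predict_threshold(threshold, x):
    return int(x > threshold)

# Synthetic data: two overlapping Gaussian classes.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(100)] + [rng.gauss(2, 1) for _ in range(100)]
ys = [0] * 100 + [1] * 100

scores = cross_validate(xs, ys, fit_threshold, predict_threshold, k=5)
print("mean accuracy:", round(statistics.mean(scores), 3))
print("accuracy variance:", round(statistics.variance(scores), 5))
```

A low variance across folds suggests that the model outcome does not depend strongly on which samples were held out.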
• PAF assists organizations owning and
processing datasets to understand how the
processing of data can affect their
conformance with regulations related to
privacy (GDPR and LGPD)
• These assessments may be used to
generate appropriate security/privacy
policies used by other services (e.g.,
LEMONADE)
Privacy assessment forms (PAF)
TDMS Layer
• Typical best practices
• Data in transit and at rest can be encrypted
• Some processing can even be done over encrypted data
• Keys and certificates not included in repositories
• But this is not enough...
• If an attacker has access to the machine (VM escapes, internal
attacker, cold boot attacks), code can be changed and memory can be
dumped
• Keys or data can be stolen
Data access challenges
ATMOSPHERE approach for data
access security and privacy
• Use trusted execution environments (TEE) to protect data access
• Advantages
• Raw data is preserved: no noise or anonymization before
storage, value of the original data is preserved
• Proxies used for filtering queries and results to guarantee
protection of sensitive data
• Data is encrypted not only in transit and at rest, but also
during processing
• Enforcement of which applications can access data
• Vallum: the TEE-enabled Access and Privacy Protection Layer
The Vallum Framework
[Figure: the Data Protection Layer (Vallum) provides proxying, authentication, authorization, privacy and auditing. Clients send a query and receive compliant results. Vallum forwards a modified query to each back end and receives its result. Back ends shown: columnar DBMS (e.g., Cassandra), relational DBMS (e.g., MySQL), document store (e.g., MongoDB) and file system (e.g., IPFS).]
IMS Layer
Infrastructure Management Services
[Figure: the ATMOSPHERE Platform on the Federated Infrastructure. Each Resource Provider runs Fogbow as federation middleware. Federation-wide services shown: the TMA service (with TMA probes running at each site), the automated deployment service (EC3, TOSCA-IM) and the performance prediction & assessment service (model training, profiling).]
The Intercontinental Use Case
• The underlying infrastructure is a federated cloud
• Using Fogbow (www.fogbowcloud.org) on OpenStack and OpenNebula.
• With a Federated Network to provide a coherent network space among nodes.
• Heterogeneous resources: SGX-enabled and GPU nodes.
• Using EC3(1) and Infrastructure Manager(2) to deploy a virtual infrastructure.
Intercontinental infrastructure
[Figure: Cloud Resources @ EU and Cloud Resources @ Brazil, each with a Cloud Manager, joined by a Federation Layer and a secure overlay network. Labels shown: SGX-enabled resources (container), the encrypted PROVAR study, GPU-enabled resources (container), central TMA, TOSCA-IM and EC3.]
(1) https://marketplace.eosc-portal.eu/services/elastic-cloud-compute-cluster-ec3
(2) https://marketplace.eosc-portal.eu/services/infrastructure-manager-im
• The virtual infrastructure is managed by an elastic
Kubernetes cluster spawned over the federated network
• Containers and services are accessible from both sites but
only through the federated network.
• Resources are properly tagged (SGX and GPU capabilities
and Brazil / Europe) so K8s applications are placed in the
correct resource.
• Infrastructure is described as code(3).
• The K8s front end is deployed first; nodes are
powered on as the applications are deployed and
request specific resources.
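The resource tagging described above maps naturally onto Kubernetes node labels and the Pod `nodeSelector` field. The sketch below builds such manifests as plain Python dictionaries. The label keys (`atmosphere/region`, `atmosphere/sgx`, `atmosphere/gpu`), Pod names and image names are hypothetical; the source only states that nodes are tagged by SGX/GPU capability and by Brazil/Europe location.

```python
import json

def pod_manifest(name, image, region, needs_sgx=False, needs_gpu=False):
    """Build a Kubernetes Pod manifest pinned to nodes with matching labels."""
    # Hypothetical label keys; the real deployment may use different ones.
    selector = {"atmosphere/region": region}
    if needs_sgx:
        selector["atmosphere/sgx"] = "true"
    if needs_gpu:
        selector["atmosphere/gpu"] = "true"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # The scheduler only places the Pod on nodes carrying all labels.
            "nodeSelector": selector,
            "containers": [{"name": name, "image": image}],
        },
    }

# Sensitive-data service stays in Brazil on SGX nodes; training uses EU GPUs.
storage = pod_manifest("secure-storage", "example/vallum:latest", "brazil", needs_sgx=True)
training = pod_manifest("model-training", "example/trainer:latest", "europe", needs_gpu=True)
print(json.dumps(storage, indent=2))
```

With an elastic cluster, a pending Pod whose selector matches no running node is what triggers the request for a new node of the required kind.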
Deployment of the virtual
infrastructure
(3) https://github.com/grycap/ec3/tree/atmosphere
• Secure storage is deployed at the
Brazilian side
• It uses Vallum(4), a service that provides
on-the-fly anonymisation based on policies.
• It masks (or blurs) the fields that are marked
as sensitive for each user profile.
• It relies on an HDFS filesystem for the files
and on SQL databases for the structured data.
• It runs the data anonymisation and sensitive data access in enclaves running
on SGX-enabled containers, so they run securely even in untrusted clouds
• Data remains encrypted on disk.
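Policy-based masking per user profile, as described above, can be sketched as follows. This is not Vallum's API: the policy format, profile names, field names and the record itself are made up for illustration.

```python
import copy

# Hypothetical policy: for each user profile, the action applied to each
# sensitive field. Fields not listed are returned unchanged.
POLICY = {
    "external_researcher": {"full_name": "remove", "birth_date": "blur", "record_id": "remove"},
    "local_clinician": {"record_id": "mask"},
}

def blur_date(iso_date):
    """Reduce a YYYY-MM-DD date to its year."""
    return iso_date[:4]

def mask(value):
    """Keep the last two characters and replace the rest with asterisks."""
    text = str(value)
    return "*" * max(len(text) - 2, 0) + text[-2:]

ACTIONS = {"blur": blur_date, "mask": mask}

def apply_policy(record, profile):
    """Return a filtered copy of the record for the given profile."""
    rules = POLICY.get(profile)
    if rules is None:
        raise PermissionError(f"unknown profile: {profile}")
    result = copy.deepcopy(record)
    for field, action in rules.items():
        if field not in result:
            continue
        if action == "remove":
            del result[field]
        else:
            result[field] = ACTIONS[action](result[field])
    return result

# Made-up record, for illustration only.
record = {"full_name": "Maria Silva", "birth_date": "2006-03-14",
          "record_id": "MR-004512", "diagnosis": "borderline RHD"}
print(apply_policy(record, "external_researcher"))
print(apply_policy(record, "local_clinician"))
```

In the platform this filtering runs inside SGX enclaves, so the plain record never leaves protected memory; the sketch only shows the filtering logic.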
Secure storage at Brazilian side
[Figure: Cloud Resources @ Brazil: SGX-enabled resources running VALLUM, the encrypted PROVAR study and the Cloud Manager.]
(4) https://www.atmosphere-eubrazil.eu/vallum-framework-access-privacy-protection
• External users request data from Vallum, but they can only access
partially anonymised data
• Anonymised data (~1TB) is copied where the computing accelerators are placed.
Anonymised Data
[Figure: Cloud Resources @ Brazil (SGX-enabled resources, VALLUM, encrypted PROVAR study, Cloud Manager) and Cloud Resources @ EU (GPU-enabled resources, application, storage service with plain & anonymised data, Cloud Manager, TOSCA-IM), connected by the Federation Layer and a secure overlay network, with TMA and a central TMA.]
• Videos are split into frames and
classified by color inspection
• A color-based segmentation using k-means
clustering extracts the color pixels from the
Doppler images.
• Images are classified according to
their acquisition view using a CNN
• Parasternal long axis view has proven to be
relevant to obtain an accurate classification.
• First & second order texture analyses
characterize the images by the spatial variation of pixel intensities.
• Besides texture features, blood velocity information is also obtained.
• Finally, all the extracted features are classified through machine learning
techniques in order to differentiate between RHD positive and healthy subjects.
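The slide does not say which texture features are used. As an illustration of the distinction it draws, the sketch below computes three common first-order features (from the grey-level histogram) and one common second-order feature (contrast from a grey-level co-occurrence matrix) on tiny example images. The choice of features and the function names are assumptions.

```python
import math
from collections import Counter

def first_order_features(image):
    """Mean, variance and entropy of the grey-level histogram."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    probabilities = [count / n for count in Counter(pixels).values()]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    return {"mean": mean, "variance": variance, "entropy": entropy}

def glcm_contrast(image, dx=1, dy=0):
    """Second-order contrast from co-occurring pixel pairs at offset (dx, dy)."""
    pairs = Counter()
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                pairs[(image[y][x], image[ny][nx])] += 1
    total = sum(pairs.values())
    # Weight each squared grey-level difference by how often the pair occurs.
    return sum(count / total * (a - b) ** 2 for (a, b), count in pairs.items())

flat = [[2, 2, 2], [2, 2, 2], [2, 2, 2]]
stripes = [[0, 3, 0], [0, 3, 0], [0, 3, 0]]
print(first_order_features(stripes))
print(glcm_contrast(flat), glcm_contrast(stripes))
```

First-order features ignore where pixels are located, while the co-occurrence contrast is zero for the flat image and large for the striped one, which is the spatial variation the texture analysis is meant to capture.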
Building the models for the
Estimation pipeline.
[Figure: pipeline stages. Data Preparation: frame splitting, image classification, color-based segmentation (Doppler), preparation of images for the classifier. Data Analysis: view classification (parasternal long axis), texture analysis & velocity extraction, features classification.]
• The pipeline is developed
using LEMONADE(5)
• LEMONADE provides
a GUI and a Machine
Learning library to
develop data analytics
pipelines.
• Pipelines can be run
interactively or transformed into executable code.
• Code can be run interactively or further embedded into
services to be exposed for production.
• A model building pipeline and an estimation
pipeline are developed.
Coding the pipeline: LEMONADE
(5) https://www.atmosphere-eubrazil.eu/lemonade-live-exploration-and-mining-non-trivial-amount-data-everywhere
Fairness
● Algorithms, in ML and AI, learn by identifying patterns in data collected
over many years. Why may algorithms become “unfair”?
○ By using unbalanced data sets, biased toward certain populations.
○ By using data sets that perpetuate historical biases.
○ By inappropriate data handling.
○ As a result of inappropriate model selection, or incorrect algorithm design or application.
● Algorithmic fairness components:
○ Aequitas Bias and Fairness Audit Toolkit, proposed
by the DSSG group from University of Chicago
(http://aequitas.dssg.io/)
○ Properties:
■ Equal Parity & Proportional Parity.
■ False Positive Rate and False Discovery
Rate Parity.
■ False Negative Rate and False Omission
Rate Parity.
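The error-based parity properties listed above compare error rates across groups. The sketch below computes the four rates per group and a disparity ratio against a reference group. It does not use the Aequitas API; the function names and the toy data are illustrative assumptions.

```python
from collections import defaultdict

def group_error_rates(y_true, y_pred, groups):
    """Per-group FPR, FNR, FDR and FOR from binary labels and predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        key = ("t" if truth == pred else "f") + ("p" if pred == 1 else "n")
        counts[group][key] += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),  # false positives among actual negatives
            "fnr": ratio(c["fn"], c["fn"] + c["tp"]),  # false negatives among actual positives
            "fdr": ratio(c["fp"], c["fp"] + c["tp"]),  # false positives among predicted positives
            "for": ratio(c["fn"], c["fn"] + c["tn"]),  # false negatives among predicted negatives
        }
    return rates

def disparity(rates, metric, reference_group):
    """Ratio of each group's metric to the reference group's (nonzero) value."""
    base = rates[reference_group][metric]
    return {group: r[metric] / base for group, r in rates.items()}

# Toy labels and predictions for two groups of six subjects each.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0]
groups = ["F"] * 6 + ["M"] * 6

rates = group_error_rates(y_true, y_pred, groups)
print(rates)
print(disparity(rates, "fpr", reference_group="F"))
```

A disparity close to 1 for a given metric indicates parity on that metric; in this toy data the false positive rate of group M is twice that of group F.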
[Figure: Fairness Tree. Representation Fairness: Equal Parity, Proportional Parity. Error Fairness: FNRP, FPRP, FDRP, FORP.]
● Increasing model complexity typically reduces interpretability
○ Complex multilayer Convolutional Neural Networks are far more difficult to explain than
Decision Trees or Linear Regression.
● Effort is invested in characterizing explainability and providing
information to explain how the algorithm reached its results
○ 𝛿-Interpretability (https://arxiv.org/pdf/1707.03886.pdf).
○ LIME (https://github.com/marcotcr/lime)
■ The output of LIME is a list of explanations,
reflecting the contribution of each feature to
the prediction of a data sample.
Interpretability
[Figure: retinopathy prediction using a 48-layer deep net; label shown: Severe Retinopathy.]
https://www.kaggle.com/kmader/inceptionv3-for-retinopathy-gpu-hr
Privacy Assessment Forms for GDPR
and LGPD
● The International context requires
dealing with multiple legal
frameworks
○ Brazilian LGPD and GDPR in our case.
● Integrated a tool for tagging and
following up sensitive fields
○ To provide a list of Personally Identifiable
Information (PII) and Sensitive Information
■ PIIs: Fullname, Ethnicity, Medical Record id,
Gender,..
■ Sensitive Info: Medical Information,
Genetics,..
○ Traces the use of sensitive data within a
processing workflow to guide on the
annotation of sensitive derived information.
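Tracing sensitive data through a workflow can be seen as propagating tags from inputs to outputs. The sketch below is a conservative approximation of that idea, not the PAF implementation: every derived field inherits the tags of all fields it was computed from. The workflow description format and field names are hypothetical.

```python
# Hypothetical workflow: each step lists the fields it reads and produces.
WORKFLOW = [
    {"name": "compute_age", "inputs": ["birth_date"], "outputs": ["age"]},
    {"name": "extract_features", "inputs": ["echo_video"], "outputs": ["texture_features"]},
    {"name": "join", "inputs": ["age", "texture_features", "school"], "outputs": ["training_row"]},
]

def propagate_tags(workflow, source_tags):
    """Mark every derived field with the union of the tags of its inputs.

    Steps are assumed to be listed in execution order.
    """
    tags = {field: set(labels) for field, labels in source_tags.items()}
    for step in workflow:
        inherited = set()
        for field in step["inputs"]:
            inherited |= tags.get(field, set())
        for field in step["outputs"]:
            tags[field] = tags.get(field, set()) | inherited
    return tags

# Tags assigned by the data owner to the source fields.
source_tags = {
    "birth_date": {"PII"},
    "echo_video": {"Sensitive"},
    "school": set(),
}

tags = propagate_tags(WORKFLOW, source_tags)
for field in ("age", "texture_features", "training_row"):
    print(field, sorted(tags[field]))
```

The result over-approximates: a derived field may be flagged even if the transformation removed the sensitive content, so the flags serve as guidance for a human annotator rather than a final classification.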
Re-identification Risk
● Anonymisation defined by policies
○ Define actions (Removal, Blurring, Reduction,
Substitution) and fields.
○ The system starts with the least restrictive
policy, applies anonymisation and computes
the Metric.
● Data Privacy Model
○ Anonymisation Process.
○ K-anonymity Model Computation.
○ Threshold Checker.
○ Linkage Attack for Validation.
○ Increase Anonymity.
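The loop formed by the steps above (anonymise, compute k-anonymity, check the threshold, increase anonymity) can be sketched as follows. The policies, field names and records are made up for illustration, and the linkage-attack validation step is left out.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Hypothetical policies, ordered from least to most restrictive.
def policy_none(r):
    return dict(r)

def policy_blur_age(r):
    low = r["age"] // 5 * 5          # blur the age into a 5-year band
    return {**r, "age": f"{low}-{low + 4}"}

def policy_blur_age_drop_city(r):
    blurred = policy_blur_age(r)
    blurred["city"] = "*"            # substitute the city with a placeholder
    return blurred

POLICIES = [policy_none, policy_blur_age, policy_blur_age_drop_city]

def anonymise(records, quasi_identifiers, k_threshold):
    """Apply increasingly restrictive policies until k reaches the threshold."""
    for policy in POLICIES:
        candidate = [policy(r) for r in records]
        k = k_anonymity(candidate, quasi_identifiers)
        if k >= k_threshold:
            return policy.__name__, k, candidate
    raise ValueError("no policy reaches the requested k")

# Made-up records, for illustration only.
records = [
    {"age": 11, "city": "Belo Horizonte", "sex": "F"},
    {"age": 12, "city": "Belo Horizonte", "sex": "F"},
    {"age": 13, "city": "Montes Claros", "sex": "F"},
    {"age": 14, "city": "Bocaiúva", "sex": "F"},
]
name, k, released = anonymise(records, ["age", "city", "sex"], k_threshold=2)
print(name, k)
```

Starting from the least restrictive policy keeps as much utility as possible: a stricter policy is applied only when the measured k falls below the threshold.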
Conclusions
Before
• Need to manually configure the environment.
• Lack of reproducibility.
• Qualitative appraisal of the trustworthiness.
• Manual analysis of GDPR/LGPD risks.
• Need to trust the storage provider.
• Anonymisation level is qualitative.
After
• Application templates for complex & distributed applications.
• A repeatable way to deploy the whole application.
• Quantitative measure of trustworthiness.
• Self-assessment of GDPR/LGPD.
• Trustable storage environment even on an untrusted provider.
• Quantitative anonymisation level.

Weitere ähnliche Inhalte

Was ist angesagt?

CISSP Prep: Ch 3. Asset Security
CISSP Prep: Ch 3. Asset SecurityCISSP Prep: Ch 3. Asset Security
CISSP Prep: Ch 3. Asset SecuritySam Bowne
 
Chapter 5 database security
Chapter 5   database securityChapter 5   database security
Chapter 5 database securitySyaiful Ahdan
 
ISSA Boston - PCI and Beyond: A Cost Effective Approach to Data Protection
ISSA Boston - PCI and Beyond: A Cost Effective Approach to Data ProtectionISSA Boston - PCI and Beyond: A Cost Effective Approach to Data Protection
ISSA Boston - PCI and Beyond: A Cost Effective Approach to Data ProtectionUlf Mattsson
 
Database Security And Authentication
Database Security And AuthenticationDatabase Security And Authentication
Database Security And AuthenticationSudeb Das
 
Database security
Database securityDatabase security
Database securityCAS
 
Security and Integrity
Security and IntegritySecurity and Integrity
Security and Integritylubna19
 
Data security and Integrity
Data security and IntegrityData security and Integrity
Data security and IntegrityZaid Shabbir
 
Database Security Management
Database Security Management Database Security Management
Database Security Management Ahsin Yousaf
 
Distributed database security with discretionary access control
Distributed database security with discretionary access controlDistributed database security with discretionary access control
Distributed database security with discretionary access controlJyotishkar Dey
 
Cause 11 im final
Cause 11   im finalCause 11   im final
Cause 11 im finalcavapyta
 
Oruta phase1 report
Oruta phase1 reportOruta phase1 report
Oruta phase1 reportsuthi
 
security and privacy in dbms and in sql database
security and privacy in dbms and in sql databasesecurity and privacy in dbms and in sql database
security and privacy in dbms and in sql databasegourav kottawar
 

Was ist angesagt? (20)

CISSP Prep: Ch 3. Asset Security
CISSP Prep: Ch 3. Asset SecurityCISSP Prep: Ch 3. Asset Security
CISSP Prep: Ch 3. Asset Security
 
Chapter 5 database security
Chapter 5   database securityChapter 5   database security
Chapter 5 database security
 
Soa
SoaSoa
Soa
 
Information Security
Information SecurityInformation Security
Information Security
 
Database Security
Database SecurityDatabase Security
Database Security
 
DB security
 DB security DB security
DB security
 
Design for Security
Design for SecurityDesign for Security
Design for Security
 
ISSA Boston - PCI and Beyond: A Cost Effective Approach to Data Protection
ISSA Boston - PCI and Beyond: A Cost Effective Approach to Data ProtectionISSA Boston - PCI and Beyond: A Cost Effective Approach to Data Protection
ISSA Boston - PCI and Beyond: A Cost Effective Approach to Data Protection
 
Database Security
Database SecurityDatabase Security
Database Security
 
Database Security And Authentication
Database Security And AuthenticationDatabase Security And Authentication
Database Security And Authentication
 
Database security
Database securityDatabase security
Database security
 
Access Controls
Access ControlsAccess Controls
Access Controls
 
Security and Integrity
Security and IntegritySecurity and Integrity
Security and Integrity
 
Data security and Integrity
Data security and IntegrityData security and Integrity
Data security and Integrity
 
Database Security Management
Database Security Management Database Security Management
Database Security Management
 
Distributed database security with discretionary access control
Distributed database security with discretionary access controlDistributed database security with discretionary access control
Distributed database security with discretionary access control
 
Cause 11 im final
Cause 11   im finalCause 11   im final
Cause 11 im final
 
Oruta phase1 report
Oruta phase1 reportOruta phase1 report
Oruta phase1 report
 
security and privacy in dbms and in sql database
security and privacy in dbms and in sql databasesecurity and privacy in dbms and in sql database
security and privacy in dbms and in sql database
 
Database security
Database securityDatabase security
Database security
 

Ähnlich wie Managing Trustworthy Big-data Applications in the Cloud with the ATMOSPHERE Platform

Guide to security patterns for cloud systems and data security in aws and azure
Guide to security patterns for cloud systems and data security in aws and azureGuide to security patterns for cloud systems and data security in aws and azure
Guide to security patterns for cloud systems and data security in aws and azureAbdul Khan
 
The most trusted, proven enterprise-class Cloud:Closer than you think
The most trusted, proven enterprise-class Cloud:Closer than you think The most trusted, proven enterprise-class Cloud:Closer than you think
The most trusted, proven enterprise-class Cloud:Closer than you think Uni Systems S.M.S.A.
 
AFAC session 2 - September 8, 2014
AFAC session 2 - September 8, 2014AFAC session 2 - September 8, 2014
AFAC session 2 - September 8, 2014KBIZEAU
 
Cloud Security: A matter of trust?
Cloud Security: A matter of trust?Cloud Security: A matter of trust?
Cloud Security: A matter of trust?Mark Williams
 
Security Issues of Cloud Computing
Security Issues of Cloud ComputingSecurity Issues of Cloud Computing
Security Issues of Cloud ComputingFalgun Rathod
 
110307 cloud security requirements gourley
110307 cloud security requirements gourley110307 cloud security requirements gourley
110307 cloud security requirements gourleyGovCloud Network
 
Cloud Security for Regulated Firms - Securing my cloud and proving it
Cloud Security for Regulated Firms - Securing my cloud and proving itCloud Security for Regulated Firms - Securing my cloud and proving it
Cloud Security for Regulated Firms - Securing my cloud and proving itHentsū
 
dtechnClouologyassociatepart2
dtechnClouologyassociatepart2dtechnClouologyassociatepart2
dtechnClouologyassociatepart2Anne Starr
 
Decision Matrix for IoT Product Development
Decision Matrix for IoT Product DevelopmentDecision Matrix for IoT Product Development
Decision Matrix for IoT Product DevelopmentAlexey Pyshkin
 
talk6securingcloudamarprusty-191030091632.pptx
talk6securingcloudamarprusty-191030091632.pptxtalk6securingcloudamarprusty-191030091632.pptx
talk6securingcloudamarprusty-191030091632.pptxTrongMinhHoang1
 
Transforming cloud security into an advantage
Transforming cloud security into an advantageTransforming cloud security into an advantage
Transforming cloud security into an advantageMoshe Ferber
 
Security architecture best practices for saas applications
Security architecture best practices for saas applicationsSecurity architecture best practices for saas applications
Security architecture best practices for saas applicationskanimozhin
 
ATMOSPHERE at Digital Infrastructure for Research (DI4R) 2018
ATMOSPHERE at Digital Infrastructure for Research (DI4R) 2018ATMOSPHERE at Digital Infrastructure for Research (DI4R) 2018
ATMOSPHERE at Digital Infrastructure for Research (DI4R) 2018ATMOSPHERE .
 
Cloud Computing Security
Cloud Computing SecurityCloud Computing Security
Cloud Computing SecurityNithin Raj
 
Security Architecture Best Practices for SaaS Applications
Security Architecture Best Practices for SaaS ApplicationsSecurity Architecture Best Practices for SaaS Applications
Security Architecture Best Practices for SaaS ApplicationsTechcello
 
Security & Compliance in the Cloud [2019]
Security & Compliance in the Cloud [2019]Security & Compliance in the Cloud [2019]
Security & Compliance in the Cloud [2019]Tudor Damian
 
Data Services Marketplace
Data Services MarketplaceData Services Marketplace
Data Services MarketplaceDenodo
 

Ähnlich wie Managing Trustworthy Big-data Applications in the Cloud with the ATMOSPHERE Platform (20)

Guide to security patterns for cloud systems and data security in aws and azure
Guide to security patterns for cloud systems and data security in aws and azureGuide to security patterns for cloud systems and data security in aws and azure
Guide to security patterns for cloud systems and data security in aws and azure
 
The most trusted, proven enterprise-class Cloud:Closer than you think
The most trusted, proven enterprise-class Cloud:Closer than you think The most trusted, proven enterprise-class Cloud:Closer than you think
The most trusted, proven enterprise-class Cloud:Closer than you think
 
AFAC session 2 - September 8, 2014
AFAC session 2 - September 8, 2014AFAC session 2 - September 8, 2014
AFAC session 2 - September 8, 2014
 
Data Domain-Driven Design
Data Domain-Driven DesignData Domain-Driven Design
Data Domain-Driven Design
 
Cloud Security: A matter of trust?
Cloud Security: A matter of trust?Cloud Security: A matter of trust?
Cloud Security: A matter of trust?
 
Cloud Security
Cloud SecurityCloud Security
Cloud Security
 
Security Issues of Cloud Computing
Security Issues of Cloud ComputingSecurity Issues of Cloud Computing
Security Issues of Cloud Computing
 
110307 cloud security requirements gourley
110307 cloud security requirements gourley110307 cloud security requirements gourley
110307 cloud security requirements gourley
 
Cloud Security for Regulated Firms - Securing my cloud and proving it
Cloud Security for Regulated Firms - Securing my cloud and proving itCloud Security for Regulated Firms - Securing my cloud and proving it
Cloud Security for Regulated Firms - Securing my cloud and proving it
 
Cloud Security
Cloud SecurityCloud Security
Cloud Security
 
dtechnClouologyassociatepart2
dtechnClouologyassociatepart2dtechnClouologyassociatepart2
dtechnClouologyassociatepart2
 
Decision Matrix for IoT Product Development
Decision Matrix for IoT Product DevelopmentDecision Matrix for IoT Product Development
Decision Matrix for IoT Product Development
 
talk6securingcloudamarprusty-191030091632.pptx
talk6securingcloudamarprusty-191030091632.pptxtalk6securingcloudamarprusty-191030091632.pptx
talk6securingcloudamarprusty-191030091632.pptx
 
Transforming cloud security into an advantage
Transforming cloud security into an advantageTransforming cloud security into an advantage
Transforming cloud security into an advantage
 
Security architecture best practices for saas applications
Security architecture best practices for saas applicationsSecurity architecture best practices for saas applications
Security architecture best practices for saas applications
 
ATMOSPHERE at Digital Infrastructure for Research (DI4R) 2018
ATMOSPHERE at Digital Infrastructure for Research (DI4R) 2018ATMOSPHERE at Digital Infrastructure for Research (DI4R) 2018
ATMOSPHERE at Digital Infrastructure for Research (DI4R) 2018
 
Cloud Computing Security
Cloud Computing SecurityCloud Computing Security
Cloud Computing Security
 
Security Architecture Best Practices for SaaS Applications
Security Architecture Best Practices for SaaS ApplicationsSecurity Architecture Best Practices for SaaS Applications
Security Architecture Best Practices for SaaS Applications
 
Security & Compliance in the Cloud [2019]
Security & Compliance in the Cloud [2019]Security & Compliance in the Cloud [2019]
Security & Compliance in the Cloud [2019]
 
Data Services Marketplace
Data Services MarketplaceData Services Marketplace
Data Services Marketplace
 

Mehr von ATMOSPHERE .

On the development of a Visual-Temporal-awareness Rheumatic Heart Disease cla...
On the development of a Visual-Temporal-awareness Rheumatic Heart Disease cla...On the development of a Visual-Temporal-awareness Rheumatic Heart Disease cla...
On the development of a Visual-Temporal-awareness Rheumatic Heart Disease cla...ATMOSPHERE .
 
Control Plane Data Characterisation for an 5G NFV Environment
Control Plane Data Characterisation for an 5G NFV EnvironmentControl Plane Data Characterisation for an 5G NFV Environment
Control Plane Data Characterisation for an 5G NFV EnvironmentATMOSPHERE .
 
Designing an Open IoT Ecosystem
Designing an Open IoT EcosystemDesigning an Open IoT Ecosystem
Designing an Open IoT EcosystemATMOSPHERE .
 
Cloud Robotics: Cognitive Augmentation for Robots via the Cloud
Cloud Robotics: Cognitive Augmentation for Robots via the CloudCloud Robotics: Cognitive Augmentation for Robots via the Cloud
Cloud Robotics: Cognitive Augmentation for Robots via the CloudATMOSPHERE .
 
Artificial Neural Networks for Resource Allocation in 5G Remote Areas
Artificial Neural Networks for Resource Allocation in 5G Remote AreasArtificial Neural Networks for Resource Allocation in 5G Remote Areas
Artificial Neural Networks for Resource Allocation in 5G Remote AreasATMOSPHERE .
 
Compliance of the privacy regulations in an international Europe-Brazil context
Compliance of the privacy regulations in an international Europe-Brazil contextCompliance of the privacy regulations in an international Europe-Brazil context
Compliance of the privacy regulations in an international Europe-Brazil contextATMOSPHERE .
 
Using Computational Back-ends for Artificial Intelligence in Childhood Cancer...
Using Computational Back-ends for Artificial Intelligence in Childhood Cancer...Using Computational Back-ends for Artificial Intelligence in Childhood Cancer...
Using Computational Back-ends for Artificial Intelligence in Childhood Cancer...ATMOSPHERE .
 
Optimization Models for on-demand GPUs in the Cloud
Optimization Models for on-demand GPUs in the CloudOptimization Models for on-demand GPUs in the Cloud
Optimization Models for on-demand GPUs in the CloudATMOSPHERE .
 
SBC Thematic Groups Organisation
SBC Thematic Groups OrganisationSBC Thematic Groups Organisation
SBC Thematic Groups OrganisationATMOSPHERE .
 
Cloud Computing Interest Group
Cloud Computing Interest GroupCloud Computing Interest Group
Cloud Computing Interest GroupATMOSPHERE .
 
Managing Trustworthy Big-data Applications in the Cloud with the ATMOSPHERE Platform

  • 1. Co-funded by the European Commission Horizon 2020 - Grant #777154 Managing Trustworthy Big- Data Applications in the Cloud with the ATMOSPHERE Platform Ignacio Blanquer ATMOSPHERE EU Project coordinator Francisco Brasileiro ATMOSPHERE Brazil Project coordinator
  • 2. • ATMOSPHERE is a 24-month H2020 project aiming at the design and development of a framework and a platform to implement trustworthy cloud services on a federated intercontinental cloud. • Expected Results • A federated cloud platform. • A development framework • Trustworthy evaluation and monitoring • Trustworthy Distributed Data Management • Trustworthy Distributed Data Processing • A pilot use case on Medical Imaging Processing. The Project Trustworthy Data Processing Services (TDPS) Application Trustworthy Data Management Services (TDMS) Infrastructure Management Services (IMS) Federated Infrastructure Trustworthiness Monit.&Assessment (TMA)
  • 3. The problem I do not want to care for the infrastructure, resource management, job scheduling, secure access and similar burdens. Moreover, I want to guarantee that no sensitive data is exposed outside of the country where it was produced. I need to build up an Image Processing Tool that uses sensitive data that requires a high computing demand. Once developed, I want to exploit it as a service securely and with a Quality of Service.
  • 5. • PROVAR study – the first large-scale RHD screening program in Brazil. • RHD Screening: public schools, private schools and primary health units in the cities of Belo Horizonte, Montes Claros and Bocaiúva, Minas Gerais, Brazil. The Data
  • 6. • The characterization of Echo-cardio images obtained in public schools • 5,600 exams, with an average of 14 videos per exam (total of 75,836 videos) • 5,330 exams are classified as normal (with a total of 71,686 videos) - 95% • 238 exams are classified as borderline RHD (with a total of 3,649 videos) - 4%. • 32 exams are classified as definite RHD (with a total of 501 videos) - 1%. • Additionally, there is another databank with 3.5 million electrocardiograms from the same population area and age. Image Biobank Requirements Mean age: 13 ± 3 y.o. Female sex: 55%.
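The class imbalance on this slide can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the dictionary layout and function name are illustrative, not part of the project.

```python
# Exam and video counts per class, as quoted on the slide.
counts = {
    "normal": {"exams": 5330, "videos": 71686},
    "borderline RHD": {"exams": 238, "videos": 3649},
    "definite RHD": {"exams": 32, "videos": 501},
}

total_exams = sum(c["exams"] for c in counts.values())
total_videos = sum(c["videos"] for c in counts.values())


def class_shares(per_class, total):
    """Return each class's share of all exams, as a percentage."""
    return {name: 100.0 * c["exams"] / total for name, c in per_class.items()}


shares = class_shares(counts, total_exams)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}% of exams")
print(f"videos per exam: {total_videos / total_exams:.1f}")
```

The totals match the 5,600 exams and 75,836 videos stated on the slide, and the shares make the imbalance explicit: fewer than 5% of the exams are RHD-positive.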
  • 7. • Sensitive data must not be accessible outside the boundaries of the hosting country • Sensitive data is protected by the Brazilian LGPD and must be processed under high access-protection means, robust even in a potentially vulnerable cloud offering. • Anonymous data, though, can be released but should be kept accessible only in a secured environment. • Medical Imaging processing and Machine Learning model building require intensive computing resources • The capabilities for processing may not be accessible within the boundaries where the data is located and therefore such processing algorithms must run elsewhere. • The access should be coherent and secure, and image processing should be efficient. • Experiments should be reproducible and stable • The model building, image processing and classification should run on well-defined environments that could be reproduced for further analysis. Image Biobank Requirements
  • 8. • Trust is a choice that is based on past experience. Trust takes time to build, but it can disappear in a second. • Trusting cloud services is as complicated as trusting people. You need a way to measure it and pieces of evidence to build trust. • Trust in a cloud environment is considered as the reliance of a customer on a cloud service and, consequently, on its provider. • Trust is based on a broad spectrum of properties such as Security, Privacy, Coherence, Isolation, Stability, Fairness, Transparency and Dependability. • Nowadays, few approaches deal with the quantification of trust in cloud computing. What is trust?
  • 9. • Along with these requirements, we explore other requirements: • Measurement of the Fairness of the models to evaluate the bias of the model with respect to sensitive categories, such as gender or race. • Evaluation of the Explainability of the model. • Evaluation of the privacy loss risk to determine the quality of the anonymisation and the potential leakage of personal data inside the models. Trust in Health Data Processing ... successfully reidentified the demographic data of 4478 adults (94.9%) & 2120 children (87.4%) … (P < .001)
  • 10. 10 The Previous situation Application Developers - Who develop the tools for processing the data. - They require the infrastructure to provide some types of services and resources, such as computing, secure storage, high-availability, data persistence. - They will deliver the applications to others to operate. Application Manager - An Application Developer may not be in charge of deploying the application on the production infrastructure. - The deployment implies the monitoring and management of the resources, services, user accounts and data. - The Application Manager will have access credentials to the infrastructure and will decide the optimal allocation of the resource. End-Users - Data providers and Data scientists exploring and processing data. - Need for secure data transfer and data access tracing, as well as simplified processing tools. - No need to worry about acquiring ICT skills.
  • 12. 12 One platform, multiple dimensions • The platform can be described considering different conceptual dimensions • Users and their roles • Service delivery models • Service classes • Application life cycle
  • 13. 13 Users and their roles Federated Infrastructure Resource Provider Resource Provider Resource Provider Trustworthy Applications & Services ATMOSPHERE Platform Application developer Data scientist Application manager System administrator Data owner
  • 14. 14 Service delivery models Federation-wide Services Services Toolboxes Trustworthy Applications EMBED USE ATMOSPHERE Platform
  • 16. 16 Application life cycle Design time Execution time
  • 17. Summary of the Main Services Available
  • 19. 19 TMA: Design and interfaces
  • 21. ● Lemonade* is a web-based system for designing and running analytics applications. ● Users, who are not necessarily programmers, describe applications as workflows; Lemonade generates code and controls their execution. ● Workflows consist of operations (boxes) and data flows (arrows) among them, performing: ⁃ Data preparation and engineering ⁃ Machine learning methods (MLlib) ⁃ Visualization metaphors 21 LEMONADE
  • 22. 22 Supported Trustworthiness properties (Property: Developers / Data Scientists) • Stability: Stability strategies (e.g., cross-validation) / Quality assurance of model outcome (e.g., calibrate cross-validation and evaluate accuracy variance) • Privacy: Privacy-preserving algorithms and techniques (e.g., k-anonymity) / Assess the impact of preserving privacy on the outcome utility and effectiveness • Transparency: Transparency methods to be combined with different data analytic flows (e.g., LIME/SHAP methods) / Execute ML models and, based on explanations, calibrate the model or enhance the input • Fairness: Fairness-enhancing mechanisms and strategies (e.g., Aequitas toolkit) / Generate reports to evaluate fairness and decide which features to include in models
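To make the Stability row concrete, here is a minimal sketch of k-fold cross-validation that reports the mean and variance of accuracy across folds. The one-feature threshold classifier and the synthetic data are assumptions chosen to keep the example self-contained; they are not the project's models or data.

```python
import random
import statistics


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_threshold(xs, ys):
    """Toy classifier: threshold halfway between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2


def cross_validate(xs, ys, k=5):
    """Return (mean accuracy, variance of accuracy) over k folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train = [i for i in range(len(xs)) if i not in held_out]
        thr = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        correct = sum((xs[i] > thr) == (ys[i] == 1) for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return statistics.mean(accuracies), statistics.pvariance(accuracies)


# Synthetic, partly overlapping one-dimensional data (illustrative only).
xs = [i * 0.1 for i in range(20)] + [1.5 + i * 0.1 for i in range(20)]
ys = [0] * 20 + [1] * 20

mean_acc, var_acc = cross_validate(xs, ys, k=5)
print(f"accuracy: mean={mean_acc:.2f}, variance={var_acc:.4f}")
```

A low variance across folds suggests the model's outcome does not depend strongly on which samples it was trained on, which is the sense of stability the table refers to.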
  • 23. • PAF assists organizations owning and processing datasets to understand how the processing of data can affect their conformance with regulations related to privacy (GDPR and LGPD) • These assessments may be used to generate appropriate security/privacy policies used by other services (e.g., LEMONADE) 23 Privacy assessment forms (PAF)
  • 25. • Typical best practices • Data in transit and at rest can be encrypted • Some processing can even be done over encrypted data • Keys and certificates not included in repositories • But this is not enough... • If an attacker has access to the machine (VM escapes, internal attacker, cold boots), code can be changed, memory can be dumped • Keys or data can be stolen 25 Data access challenges
  • 26. 26 ATMOSPHERE approach for data access security and privacy • Use trusted execution environments (TEE) to protect data access • Advantages • Raw data is preserved: no noise or anonymization before storage, value of the original data is preserved • Proxies used for filtering queries and results to guarantee protection of sensitive data • Data is encrypted not only in transit and at rest, but also during processing • Enforcement of which applications can access data • Vallum: the TEE-enabled Access and Privacy Protection Layer
  • 27. Data Protection Layer (Vallum) The Vallum Framework Columnar DBMS (e.g., Cassandra) Relational DBMS (e.g., MySQL) Proxying Authentication Authorization Privacy Auditing Document Store (e.g., MongoDB) File System (e.g. IPFS) Query Compliant Results Query Compliant Results Query Compliant Results Modified Query Result Modified Query Result Modified Query Result Modified Query Result
  • 29. 29 Infrastructure Management Services Federated Infrastructure Resource Provider Resource Provider Resource Provider ATMOSPHERE Platform Federation middleware Fogbow Fogbow Fogbow Federation-wide TMA services probes running at each site TMA service Automated deployment service Performance prediction & assessment service EC3 TOSCA-IM Model training Profiling
  • 31. • The underlying infrastructure is a federated cloud • Using fogbow (www.fogbowcloud.org) on OpenStack and OpenNebula. • With a Federated Network to provide a coherent network space among nodes. • Heterogeneous resources: SGX-enabled and GPU nodes. • Using EC3(1) and Infrastructure Manager(2) to deploy a virtual infrastructure. 31 Intercontinental infrastructure Cloud Resources @EU Cloud Resources @ Brazil SGX-Enabled Resources container Encrypted PROVAR Study Cloud Manager Cloud Manager Federation Layer Secure overlay network Central TMA TOSCA-IM GPU-Enabled Resources container (1) https://marketplace.eosc-portal.eu/services/elastic-cloud-compute-cluster-ec3 (2) https://marketplace.eosc-portal.eu/services/infrastructure-manager-im EC3
  • 32. • The virtual infrastructure is managed by an elastic Kubernetes cluster spawned over the federated network • Containers and services are accessible from both sites but only through the federated network. • Resources are properly tagged (SGX and GPU capabilities and Brazil / Europe) so K8s applications are placed on the correct resource. • Infrastructure is described as code(3). • The K8s front-end is deployed and nodes are powered on as the applications are deployed, creating the request for specific resources. 32 Deployment of the virtual infrastructure (3) https://github.com/grycap/ec3/tree/atmosphere
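One common way to place Kubernetes applications on tagged resources is a `nodeSelector` in the pod spec. The sketch below builds such a manifest as a Python dictionary. The label keys, pod name and image are hypothetical, since the deck does not name the actual tags, and `node_matches` only imitates the scheduler's equality check for illustration.

```python
import json

# Hypothetical label keys; the slides do not state the real ones.
REGION_LABEL = "example.org/region"
SGX_LABEL = "example.org/sgx"
GPU_LABEL = "example.org/gpu"


def pod_manifest(name, image, region, needs_sgx=False, needs_gpu=False):
    """Build a Pod manifest that may only be scheduled on matching nodes."""
    selector = {REGION_LABEL: region}
    if needs_sgx:
        selector[SGX_LABEL] = "true"
    if needs_gpu:
        selector[GPU_LABEL] = "true"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "nodeSelector": selector,
            "containers": [{"name": name, "image": image}],
        },
    }


def node_matches(node_labels, selector):
    """A node is eligible when it carries every label in the selector."""
    return all(node_labels.get(k) == v for k, v in selector.items())


# Storage must stay in Brazil on SGX hardware (names are placeholders).
storage = pod_manifest(
    "secure-storage", "registry.example/storage:latest", "brazil", needs_sgx=True
)
print(json.dumps(storage, indent=2))
```

With this arrangement the placement constraint travels with the application description, so the same manifest can be redeployed without manual node selection.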
  • 33. • A secure storage is deployed at the Brazilian side • It uses Vallum(4), a service that provides on-the-fly anonymisation based on policies. • It masks (or blurs) the fields that are marked as sensitive to different profiles of users. • It relies on an HDFS filesystem for the files and on SQL databases for the structured data. • It runs the data anonymisation and sensitive data access on enclaves running on SGX-enabled containers, so they run securely even in untrusted clouds • Data remains encrypted on disk. 33 Secure storage at Brazilian side Cloud Resources @ Brazil SGX-Enabled Resources VALLUM Encrypted PROVAR Study Cloud Manager (4) https://www.atmosphere-eubrazil.eu/vallum-framework-access-privacy-protection
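The idea of masking or blurring fields per user profile can be sketched as follows. This is not Vallum's API: the policy structure, profile names, field names and the sample record are all made up for illustration.

```python
def blur_age(age, width=5):
    """Replace an exact age with a range of `width` years."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Hypothetical policy: which action applies to which field, per profile.
POLICY = {
    "researcher": {"full_name": "mask", "age": "blur"},
    "clinician": {},  # sees every field unchanged
}


def apply_policy(record, profile, policy=POLICY):
    """Return a copy of `record` filtered for the given user profile."""
    if profile not in policy:
        raise PermissionError(f"unknown profile: {profile}")
    rules = policy[profile]
    filtered = {}
    for field, value in record.items():
        action = rules.get(field, "keep")
        if action == "mask":
            filtered[field] = "***"
        elif action == "blur":
            filtered[field] = blur_age(value)
        else:
            filtered[field] = value
    return filtered


# Made-up record, for illustration only.
record = {"full_name": "Ana Example", "age": 13, "diagnosis": "borderline RHD"}
print(apply_policy(record, "researcher"))
print(apply_policy(record, "clinician"))
```

Applying the policy at query time, rather than storing a pre-anonymised copy, is what lets the raw data keep its full value for profiles that are allowed to see it.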
  • 34. • External users request data from Vallum, but they only get access to partially anonymised data • Anonymised data (~1TB) is copied to where the computing accelerators are placed. 34 Anonymised Data Cloud Resources @EU Cloud Resources @ Brazil SGX-Enabled Resources VALLUM Encrypted PROVAR Study Plain & Anonymised data Application TMA Cloud Manager Cloud Manager Federation Layer Secure overlay network Central TMA GPU-Enabled Resources TOSCA-IM storage service
  • 35. • Videos are split into frames and classified by color inspection • A color-based segmentation using k-means clustering extracts the color pixels from the Doppler images. • Images are classified according to their acquisition view using a CNN • Parasternal long axis view has proven to be relevant to obtain an accurate classification. • First & second order texture analyses characterize the images by the spatial variation of pixel intensities. • Besides texture features, blood velocity information is also obtained. • Finally, all the extracted features are classified through machine learning techniques in order to differentiate between RHD positive and healthy subjects. 35 Building the models for the Estimation pipeline. Image Classification Frame Splitting Preparation of images for classifier Color-Based Segment. Doppler Data Preparation View Classification Texture Analysis & Velocity Extraction Features Classification Parasternal Long Axis Data Analysis
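The color-based segmentation step can be illustrated with a small k-means example. The sketch assumes that pixels are clustered on a single chroma value (max minus min of the RGB channels) with two clusters. The slide does not say which color space or how many clusters the pipeline uses, so treat those choices as assumptions.

```python
def chroma(pixel):
    """Spread between the strongest and weakest RGB channel (0 for grey)."""
    r, g, b = pixel
    return max(r, g, b) - min(r, g, b)


def kmeans_1d(values, iters=20):
    """Two-cluster Lloyd's algorithm on scalars, initialised at min and max."""
    centroids = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def colour_mask(pixels):
    """Mark the pixels that fall in the higher-chroma cluster."""
    labels, centroids = kmeans_1d([chroma(p) for p in pixels])
    coloured = 0 if centroids[0] > centroids[1] else 1
    return [lab == coloured for lab in labels]


# Three greyscale tissue pixels and two saturated Doppler-like pixels.
pixels = [(30, 30, 30), (200, 200, 200), (220, 20, 20), (20, 40, 230), (90, 90, 95)]
print(colour_mask(pixels))
```

Greyscale echo tissue has near-zero chroma, while the Doppler overlay is strongly colored, so the two clusters separate cleanly in this toy case.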
  • 36. • The pipeline is developed using LEMONADE(5) • LEMONADE provides a GUI and a Machine Learning library to develop data analytics pipelines. • Pipelines can be run interactively or transformed into executable code. • Code can be interactively run or further embedded into services to be exposed for production. • A model building pipeline and an estimation pipeline are developed. 36 Coding the pipeline: LEMONADE (5) https://www.atmosphere-eubrazil.eu/lemonade-live-exploration-and-mining-non-trivial-amount-data-everywhere
  • 37. Fairness ● Algorithms, in ML and AI, learn by identifying patterns in data collected over many years. Why may algorithms become “unfair”? ○ By using unbalanced data sets, biased towards certain populations. ○ By using data sets that perpetuate historical biases. ○ By inappropriate data handling. ○ As a result of inappropriate model selection, or incorrect algorithm design or application. ● Algorithms Fairness components: ○ Aequitas Bias and Fairness Audit Toolkit, proposed by the DSSG group from University of Chicago (http://aequitas.dssg.io/) ○ Properties: ■ Equal Parity & Proportional Parity. ■ False Positive Rate and False Discovery Rate Parity. ■ False Negative Rate and False Omission Rate Parity. Fairness Tree Equal Parity Proport. Parity Represent. Fairness Error Fairness FNRP FPRP FDRP FORP
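The error-rate parity properties listed above compare a metric such as the false positive rate across groups. The sketch below computes per-group FPR and FNR and their ratio to a reference group in plain Python. It does not use the Aequitas API, and the labels, predictions and group names are made-up illustrative values.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: {"fpr": ..., "fnr": ...}} from binary labels."""
    counts = defaultdict(lambda: {"tp": 0, "tn": 0, "fp": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        key = ("t" if truth == pred else "f") + ("p" if pred == 1 else "n")
        counts[group][key] += 1
    rates = {}
    for group, c in counts.items():
        negatives = c["fp"] + c["tn"]
        positives = c["fn"] + c["tp"]
        rates[group] = {
            "fpr": c["fp"] / negatives if negatives else float("nan"),
            "fnr": c["fn"] / positives if positives else float("nan"),
        }
    return rates


def disparity(rates, metric, reference):
    """Ratio of each group's metric to the reference group's value."""
    ref = rates[reference][metric]
    if ref == 0:
        return {group: float("nan") for group in rates}
    return {group: r[metric] / ref for group, r in rates.items()}


# Illustrative data: 1 = RHD positive, 0 = healthy.
y_true = [1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0]
groups = ["F"] * 6 + ["M"] * 6

rates = group_error_rates(y_true, y_pred, groups)
print(rates)
print(disparity(rates, "fpr", reference="F"))
```

A disparity far from 1 for some group signals that the model's errors fall unevenly, which is what the False Positive Rate Parity and False Negative Rate Parity checks are meant to flag.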
  • 38. ● Increasing model complexity typically reduces Interpretability ○ Complex multilayer Convolutional Neural Networks are far more difficult to explain than Decision Trees or Linear Regression. ● Effort is invested in characterizing explainability and providing information to explain how the algorithm reached such results ○ δ-Interpretability (https://arxiv.org/pdf/1707.03886.pdf). ○ LIME (https://github.com/marcotcr/lime) ■ The output of LIME is a list of explanations, reflecting the contribution of each feature to the prediction of a data sample. Interpretability (Retinopathy prediction using a 48-layer deep net) https://www.kaggle.com/kmader/inceptionv3-for-retinopathy-gpu-hr Severe Retinopathy
  • 39. Privacy Assessment Forms for GDPR and LGPD ● The International context requires dealing with multiple legal frameworks ○ Brazilian LGPD and GDPR in our case. ● Integrated a tool for tagging and following up sensitive fields ○ To provide a list of Personally Identifiable Information (PII) and Sensitive Information ■ PIIs: Fullname, Ethnicity, Medical Record id, Gender,.. ■ Sensitive Info: Medical Information, Genetics,.. ○ Traces the use of sensitive data within a processing workflow to guide on the annotation of sensitive derived information.
  • 40. Re-identification Risk ● Anonymisation defined by policies ○ Define actions (Removal, Blurring, Reduction, Substitution) and fields. ○ The system starts with the least restrictive policy, applies anonymisation and computes the Metric. ● Data Privacy Model ○ Anonymisation Process. ○ K-anonymity Model Computation. ○ Threshold Checker. ○ Linkage Attack for Validation. ○ Increase Anonymity.
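The loop described above (anonymise, compute k-anonymity, check the threshold, increase anonymity) can be sketched in a few lines. The policies, field names and records here are hypothetical, and the linkage-attack validation step is left out.

```python
from collections import Counter


def k_anonymity(records, quasi_ids):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return min(groups.values())


# Hypothetical policies, ordered from least to most restrictive.
POLICIES = [
    ("keep all", {}),
    ("blur age", {"age": lambda a: (a // 5) * 5}),
    ("blur age, remove city", {"age": lambda a: (a // 5) * 5, "city": lambda c: "*"}),
]


def anonymise(records, actions):
    """Apply each field's action, leaving fields without an action untouched."""
    return [
        {field: actions.get(field, lambda v: v)(value) for field, value in r.items()}
        for r in records
    ]


def first_compliant(records, quasi_ids, threshold):
    """Return (policy name, k, data) for the first policy reaching the threshold."""
    for name, actions in POLICIES:
        candidate = anonymise(records, actions)
        k = k_anonymity(candidate, quasi_ids)
        if k >= threshold:
            return name, k, candidate
    raise ValueError("no policy reaches the requested anonymity level")


# Made-up records, for illustration only.
records = [
    {"age": 11, "city": "BH", "sex": "F"},
    {"age": 12, "city": "BH", "sex": "F"},
    {"age": 13, "city": "MC", "sex": "F"},
    {"age": 14, "city": "MC", "sex": "F"},
    {"age": 11, "city": "BH", "sex": "M"},
    {"age": 13, "city": "BH", "sex": "M"},
]

name, k, data = first_compliant(records, ["age", "city", "sex"], threshold=2)
print(name, k)
```

Starting from the least restrictive policy keeps as much of the original information as the threshold allows, and the resulting k gives a quantitative anonymisation level rather than a qualitative judgement.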
  • 41. 41 Conclusions Before: • Need to manually configure the environment. • Lack of reproducibility. • Qualitative appraisal of the trustworthiness. • Manual analysis of GDPR/LGPD risks. • Need to trust the storage provider. • Anonymisation level is qualitative. After: • Application templates for complex & distributed applications. • Provide a repeatable way to deploy the whole application. • Quantitative measure of trustworthiness. • Self-assessment of GDPR/LGPD. • Trustable storage environment even on an untrusted provider. • Quantitative anonymisation level.