Service and Support for Science IT - Peter Kunszt, University of Zurich
1. Service and Support for Science IT
Scientific Cloud Experiences
Dr. Peter Kunszt
Director S3IT
2. Outline
• Introduction
– What is Science IT
– How are we organized
• UZH ScienceCloud Infrastructure and
Implementation
• Science Data and Security/Privacy
3. Challenge : Scale Up
• High Throughput Instruments
– Much larger data volumes
– Increased data complexity
• Large Collaborations
– More people
– More experiments and measurements
– More coverage
4. Fire and forget...
• Scientists do not want to be bothered with
infrastructure details
• IT JUST NEEDS TO WORK!
5. Widening Complexity Gap: IT-Research
[Diagram labels: Local IT Resources; Research Labs; Core Facilities; Miracle; SCIENCE IT]
6. What is Science IT ?
FILL THE GAP
Dedicated Support Center
for Science IT
• SPEED : faster time to solution
• ACCESS : to infrastructure,
software, expertise
• ENABLE : use IT technology and
software for new ideas
8. Supporting Science
• Be a partner to research projects for Science IT
• Provide services to individual researchers, groups and
consortia
– Consultancy for advanced usage of IT in Science
– Research software development and support
– Access to competitive IT infrastructure
– Access to a library of tools and software
– Project management and collaboration support
– Training and education on the usage of infrastructure and software
• Collaborate internally, nationally and internationally with
partners, suppliers and other Science IT units
• Maintain high level of internal expertise on topics relevant to
Science IT
• Advise UZH Governance on evolution of needs, assist in
prioritization
10. S3IT Organization
[Organization chart: Core Team, Site Teams, Embedded Experts (EE)]
• EE = Embedded Expert
– Working directly in projects or on-site in groups on specific tasks
• Site Teams
– Joint teams with other units providing local support and some global services
• Core Team
– Directorate, Office, core services, central infrastructure and consultancy, project mgmt
12. S3IT Core Business: Project
Support
• Infrastructure is important but 'just' a means to an end
• Science IT Support: Applications, access, integration
• Data analysis
• Simulations
• Data Integration
• Application scaling, making use of big infrastructures
• Workflows, automation
• Visualization
• Software design and usage advice, Code Clinic
• Training and education
• ...
14. Mapping Security and Privacy
• Most science follows 3 stages
– Conception, preparation, proposition stage – private
– Project stage (3-5y) – share in group
– Publication of results – open to all
• Some have additional constraints (regulations)
– Medicine – patient data records need consent
(different per country)
– Law and business – confidentiality in projects
– Engineering, pharmacology, etc. – patents
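The three stages above can be read as a default visibility per stage. The following is a minimal sketch of that mapping, not something taken from the slides: the names `Stage`, `Visibility` and `visibility` are hypothetical, and the rule that regulated data stays at group level after publication is an assumption made only for illustration.

```python
from enum import Enum


class Stage(Enum):
    CONCEPTION = "conception"   # conception, preparation, proposition
    PROJECT = "project"         # project stage (3-5y)
    PUBLISHED = "published"     # publication of results


class Visibility(Enum):
    PRIVATE = "private"
    GROUP = "group"
    OPEN = "open"


# Default visibility per stage, as listed on the slide.
DEFAULT_VISIBILITY = {
    Stage.CONCEPTION: Visibility.PRIVATE,
    Stage.PROJECT: Visibility.GROUP,
    Stage.PUBLISHED: Visibility.OPEN,
}


def visibility(stage, regulated=False):
    """Return the default visibility for a stage.

    Assumption (hypothetical rule): regulated data, e.g. patient
    records, is never opened automatically and stays at group level.
    """
    default = DEFAULT_VISIBILITY[stage]
    if regulated and default is Visibility.OPEN:
        return Visibility.GROUP
    return default
```

In practice the regulated case would need a per-domain and per-country rule rather than a single flag.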
15. Infrastructure
• Supercomputing
– Used as a scientific instrument by
• theoretical physics, astrophysics, mathematics, computational
chemistry, biochemistry, quantum chemistry
• Continuous usage
• Cluster computing
– Used as a workhorse by many groups
• Life science, biochem, geoscience, medicine, digital humanities,
banking and finance, art history, ...
• Data analysis, statistical analysis, parameter studies, etc
• Non-continuous usage
• Server computing
– Used as interactive computers by many groups
• All groups. Interactive processing, visualization, steering of
computation. Commercial and open-source tools.
• Daily usage, non-continuous.
16. Storage Classes
• Large, cheap data store for projects O(xPB)
– No need to be backed up: Easy to regenerate but
time-consuming
• Reliable project data store O(1PB)
– With secondary copy
– Only addition, no changes
• Working storage O(x100TB)
– Active data, databases, server-side processes
• Fast storage for streaming analysis O(100TB)
– Fast changing data, immediate analysis, rare!
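As an illustration of how the four storage classes might be chosen, here is a minimal sketch. The `Dataset` fields, the function name `storage_class` and the order of the checks are all assumptions; the slides only list the classes and their properties.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    regenerable: bool   # easy to regenerate, even if time-consuming
    append_only: bool   # only additions, no changes
    streaming: bool     # fast-changing data analysed immediately


def storage_class(d):
    """Pick one of the four storage classes (hypothetical decision order)."""
    if d.streaming:
        return "fast storage for streaming analysis"
    if d.regenerable:
        return "large, cheap data store (no backup)"
    if d.append_only:
        return "reliable project data store (secondary copy)"
    return "working storage"
```

For example, `storage_class(Dataset(regenerable=False, append_only=True, streaming=False))` selects the reliable project data store.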
17. Datacenter Consolidation
OCI – S3IT
ZMB
BIOC
MATH
PHYS
IMLS / Neuro
Consolidate into Central Datacenter
Aim: Scale and Secure!
18. UZH ScienceCloud Implementation
• OpenStack – based on Canonical
• Deployment using Ansible
• Vagrant-like system for configuration:
Elasticluster (developed at UZH)
• Flexible submission and workflow framework
for job control: GC3pie (developed at UZH)
• Database management framework openBIS
for data lifecycle management (developed at
ETH/SystemsX.ch)
19. Business Model
• Supercomputing
– Investment every 4 years into the system
– Research groups to find 3rd party funding
• Commodity Cloud and Storage
– Subscription / year : Cores, TB
– Per use fee
– Subsidized, not TCO – covering operations
• Servers / Pets
– Yearly or monthly fee
– Size matters
• Yearly acquisition / rollover
– Easy to plan
20. Experience so far:
• Supercomputing needed only by few groups
– Can be completely outsourced to national center, done as of 2015
• Cloud is suitable for most Science Workloads
– User support scales well
– Can cover very many use cases
– Build dedicated boxes for exceptions, don't be driven by them
– Flexibility is key
• Must use local infrastructure for secure, data intensive and
memory intensive workloads
– Data locality needed for COST and (rarely) policy reasons –
exception: medical data
– Hybrid cloud – burst available for CPU intensive jobs
– Deal with heterogeneity
21. Future Cloud Strategy: HYBRID
• Run sizeable local cloud infrastructure for internal
workloads
• Burst peak loads to public cloud providers
– For selected workloads coherent with policy and cost
Advantages
• Plannable local infrastructure (plan for full usage)
• Flexibility in scaling, quick provisioning of needed
capacity
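The hybrid strategy can be sketched as a placement decision per job. This is only an illustration under stated assumptions: the `Job` fields, the `placement` function and the capacity check are hypothetical, and real policy and cost rules would be considerably more detailed.

```python
from dataclasses import dataclass


@dataclass
class Job:
    cores_needed: int
    cpu_intensive: bool
    data_intensive: bool
    memory_intensive: bool
    sensitive_data: bool


def placement(job, local_free_cores):
    """Return "local" or "burst" for a job (hypothetical rule set)."""
    # Secure, data-intensive and memory-intensive workloads stay local.
    if job.sensitive_data or job.data_intensive or job.memory_intensive:
        return "local"
    # Prefer the local cloud while it has capacity (plan for full usage).
    if job.cores_needed <= local_free_cores:
        return "local"
    # Only CPU-intensive peak load is burst to a public provider;
    # anything else waits for local capacity.
    return "burst" if job.cpu_intensive else "local"
```

A CPU-bound simulation that needs 512 cores when only 64 are free locally would be burst, while the same job carrying sensitive data would stay local.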
22. Open Questions
• Policies. What workloads can be burst to public clouds?
Under what conditions?
– Calculations, simulations usually OK
– Data analysis: depends on data (network issues being
resolved)
– Check compliance of cloud providers: ISO, HIPAA, etc.
– Adherence to Swiss cantonal data protection regulations
• Cost. How to buy public cloud services?
– Public procurement of agreements?
– How not to be bound to a single provider?
– Is this necessary at all?
• How do I charge my users?
– For internal and for external use?
– Aim: consolidate their workload into our cloud. No TCO!
23. Comments on Security in
academia
• Users in academia are super smart. They remove
barriers faster than you can erect them.
• Do risk assessment and risk analysis instead of
prevention.
• Don't do anything 'for security reasons'; always qualify
with real risk numbers
• Public Clouds are MUCH MORE secure than our own
– Amazon, Microsoft, IBM etc have whole teams of security
experts – they hired our best students for this
• It is a question of TRUST
– Regulations by countries
– Do we trust the US not to do industrial and academic
espionage, forcing their own companies to give out our
data?
24. Scientific Requirements
• Know your workload: Data, Privacy, Science,
Sharing aspects are tightly connected
• Lots of hidden complexity and contradicting
requirements
25. 1. What Data?
• Different kinds of 'BIG' data
• Volume, Variety, Velocity, Veracity
• Understanding is Knowledge is Science
– Data vs. Information and Knowledge
– What are the right questions?
– What should be protected, till when?
– How to navigate, explore, evolve
WHO OWNS THE DATA?
For science, proprietary data is a hindrance
26. 2. Data Reuse
• Currently a wealth of data is not reused for
new discovery
• Lots of potential! Regulators need to be told.
• Data repositories with computing and search
capability – perfect for Cloud Model
• Do the computation where the data is –
Private, public, hybrid Cloud
IP on TOOLS, ease of data USE, not DATA itself.
27. 3. Motivate to annotate
• Scientists publish what is necessary and
prescribed by the journals, not more –
mandate better annotation
• Provide more recognition for producing 'good'
datasets – Data Citation
• Check Data quality – bad quality or
data without annotation has no value
Creation of well annotated, sustained public
resources
28. 4. Standard Formats
• Too many 'standards', or standards are not used
– Instrument vendors often at fault
• Protection of data by proprietary formats
– Data is lost to research
• Do not pay for data in nonstandard
formats
– Data value is zero if unusable
Mandate standard formats for domain data
29. 5. Data Sharing/Publishing
• Share in collaborative mode
• Avoid Data Loss
• Motivate and enable data publication
• Establish business model for data publication
(reward/career benefit)
• Journals adapt, see Scientific Data
http://www.nature.com/scientificdata
New role for Archives and Libraries
30. 6. Patient Data Records
• Legal issues of data privacy
• People are not in control of their own data
• Difficult to get consent
• NSA effect – trust
Put citizens back in control
31. Patient Data Records
• TRUST
– Swiss Cooperative: citizen owned
• NEUTRALITY
– A simple e-Banking system for any personal health data. Same
level of security
• TRACTION
– Volume: it is free, it's rewarded
• IMPACT
– Request data directly, avoid legal issues
32.
• It is a cooperative, not a business
• Funding by running campaigns to ask people to
participate in research & surveys
• Participants are REWARDED for sharing their data or
providing new data
• Build tools on top
• Currently seeking funding
– H2020, foundations
– Projects with hospitals, clinics
33. Approach at S3IT
• Early involvement with Research Groups
– Proposal writing, partnership
– Advice on Data Management, infrastructure, standards
• Strong cooperation with Libraries
– Early involvement with publishers, archives
– Joint information to research groups on data management
plans, data citations
• Seeking contact with funding bodies and decision
makers
– Communicate business plan for Science IT 'project
consumables'
– Evaluation of projects based on technology cost and
feasibility
– Usage of public and each other's cloud resources for cash
34. Links
• www.s3it.uzh.ch - Science IT at UZH
• www.sybit.net - Systems Biology IT, SystemsX.ch
• www.erasysapp.eu - Systems Biology, DMMCore
project
• www.healthbank.ch - Public Cooperative being
set up for patient-owned data. Seeking funding
(H2020, pending, and other sources)