Enabling Scientific Workflows
on FermiCloud using
OpenNebula
Steven Timm
Grid & Cloud Services Department
Fermilab
Work supported by the U.S. Department of Energy under contract No. DE-AC02-07CH11359
Outline
Introduction—Fermilab and Scientific Computing
FermiCloud Project and Drivers
Applying Grid Lessons To Cloud
FermiCloud Project
Current and Future Interoperability
Reframing the Cloud Discussion
Fermilab and Scientific Computing
Fermi National Accelerator Laboratory:
• Lead United States particle physics laboratory
• ~60 PB of data on tape
• High Throughput Computing, characterized by:
  - "Pleasingly parallel" tasks
  - High CPU instruction / bytes of I/O ratio
  - But still lots of I/O; see Pfister, "In Search of Clusters"
Grid and Cloud Services Dept.
Operations:
• Grid Authorization
• Grid Accounting
• Computing Elements
• Batch Submission
• All require high availability
• All require multiple integration systems to test
• Also requires virtualization, and login as root
Solutions:
• Development of authorization, accounting, and batch submission software
• Packaging and integration
• Requires development machines not used all the time
• Plus environments that are easily reset, and login as root
HTC Virtualization Drivers
Large multi-core servers have evolved from 2 to 64 cores per box:
• A single "rogue" user/application can impact 63 other users/applications.
• Virtualization can provide mechanisms to securely isolate users/applications.
Typical "bare metal" hardware has significantly more performance than usually needed for a single-purpose server:
• Virtualization can provide mechanisms to harvest/utilize the remaining cycles.
Complicated software stacks are difficult to distribute on the grid:
• Distribution of preconfigured virtual machines together with GlideinWMS and HTCondor can aid in addressing this problem.
Large demand for transient development/testing/integration work:
• Virtual machines are ideal for this work.
Science is increasingly turning to complex, multiphase workflows:
• Virtualization coupled with cloud can provide the ability to flexibly reconfigure hardware "on demand" to meet the changing needs of science.
Legacy code:
• Data and code preservation for recently completed experiments at the Fermilab Tevatron and elsewhere.
Burst capacity:
• Systems are full all the time; more cycles are needed just before conferences.
FermiCloud – Initial Project Specifications
The FermiCloud Project was established in 2009 with the goal of developing and establishing Scientific Cloud capabilities for the Fermilab Scientific Program:
• Building on the very successful FermiGrid program that supports the full Fermilab user community and makes significant contributions as a member of the Open Science Grid Consortium.
• Reuse High Availability, AuthZ/AuthN, and Virtualization from the Grid.
In a (very) broad brush, the mission of the FermiCloud project is:
• To deploy a production quality Infrastructure as a Service (IaaS) Cloud Computing capability in support of the Fermilab Scientific Program.
• To support additional IaaS, PaaS and SaaS Cloud Computing capabilities based on the FermiCloud infrastructure at Fermilab.
The FermiCloud project is a program of work that is split over several overlapping phases:
• Each phase builds on the capabilities delivered as part of the previous phases.
Overlapping Phases
Phase 1: "Build and Deploy the Infrastructure"
Phase 2: "Deploy Management Services, Extend the Infrastructure and Research Capabilities"
Phase 3: "Establish Production Services and Evolve System Capabilities in Response to User Needs & Requests"
Phase 4: "Expand the service capabilities to serve more of our user communities"
(The phases overlap along a timeline that runs up to today.)
Current FermiCloud Capabilities
The current FermiCloud hardware capabilities include:
• Public network access via the high performance Fermilab network,
  - This is a distributed, redundant network.
• Private 1 Gb/sec network,
  - This network is bridged across FCC and GCC on private fiber.
• High performance Infiniband network,
  - Currently split into two segments.
• Access to a high performance FibreChannel based SAN,
  - This SAN spans both buildings.
• Access to the high performance BlueArc based filesystems,
  - The BlueArc is located on FCC-2.
• Access to the Fermilab dCache and enStore services,
  - These services are split across FCC and GCC.
• Access to the 100 Gbit Ethernet test bed in LCC (integration nodes),
  - Intel 10 Gbit Ethernet converged network adapter X540-T1.
Typical Use Cases
Public net virtual machine:
• On the Fermilab network, open to the Internet,
• Can access dCache and BlueArc mass storage,
• Common home directory between multiple VMs.
Public/Private cluster:
• One gateway VM on the public/private net,
• Cluster of many VMs on the private net,
• Data acquisition simulation.
Storage VM:
• VM with large non-persistent storage,
• Used for large MySQL or Postgres databases, and for Lustre/Hadoop/Bestman/xRootd/dCache/OrangeFS/IRODS servers (a template sketch for such VMs follows below).
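To make the use cases concrete, here is a minimal sketch of launching a few private-network worker VMs through the standard onevm command line tool. The template is illustrative only: the image name "SLF6-worker" and virtual network name "fermicloud-private" are placeholders for whatever is defined in a given OpenNebula installation, not FermiCloud's actual configuration.

#!/usr/bin/env python
"""Sketch: submit private-net worker VMs for a public/private cluster."""
import subprocess
import tempfile

# Placeholder image and virtual-network names -- substitute your own.
TEMPLATE = """
NAME   = "daq-sim-worker"
CPU    = 1
VCPU   = 2
MEMORY = 2048
DISK   = [ IMAGE = "SLF6-worker", IMAGE_UNAME = "oneadmin" ]
NIC    = [ NETWORK = "fermicloud-private" ]
"""

def launch_workers(count=4):
    """Write the template to a temporary file and submit it 'count' times."""
    with tempfile.NamedTemporaryFile("w", suffix=".tmpl", delete=False) as f:
        f.write(TEMPLATE)
        path = f.name
    for _ in range(count):
        # onevm create prints the numeric ID of the new VM on success.
        subprocess.check_call(["onevm", "create", path])

if __name__ == "__main__":
    launch_workers()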
FermiGrid-HA2 Experience
In 2009, based on operational experience and plans for redevelopment of the FCC-1 computer room, the FermiGrid-HA2 project was established to split the set of FermiGrid services across computer rooms in two separate buildings (FCC-2 and GCC-B).
• This project was completed on 7-Jun-2011 (and tested by a building failure less than two hours later).
• FermiGrid-HA2 worked exactly as designed.
Our operational experience with FermiGrid-HA and FermiGrid-HA2 has shown the benefits of virtualization and service redundancy.
• Benefits to the user community – increased service reliability and uptime.
• Benefits to the service maintainers – flexible scheduling of maintenance and upgrade activities.
Experience with FermiGrid = Drivers for FermiCloud
Access to pools of resources using common interfaces:
• Monitoring, quotas, allocations, accounting, etc.
Opportunistic access:
• Users can use the common interfaces to "burst" to additional resources to meet their needs.
Efficient operations:
• Deploy common services centrally.
High availability services:
• Flexible and resilient operations.
Additional Drivers for FermiCloud
Existing development and integration facilities (AKA the FAPL cluster) were:
• Technically obsolescent and unable to be used effectively to test and deploy the current generations of Grid middleware.
• The hardware was over 8 years old and was falling apart.
The needs of the developers and service administrators in the Grid and Cloud Computing Department for reliable and "at scale" development and integration facilities were growing.
Operational experience with FermiGrid had demonstrated that virtualization could be used to deliver production class services.
OpenNebula
OpenNebula was picked as the result of an evaluation of open-source cloud management software.
An OpenNebula 2.0 pilot system in GCC has been available to users since November 2010.
It began with 5 nodes and gradually expanded to 13 nodes.
4500 virtual machines have run on the pilot system in 3+ years.
An OpenNebula 3.2 production-quality system was installed in FCC in June 2012, in advance of the GCC total power outage; it now comprises 18 nodes.
The transition of virtual machines and users from the ONe 2.0 pilot system to the production system is almost complete.
In the meantime OpenNebula has made five more releases; we will catch up shortly.
FermiCloud – Fault Tolerance
As we have learned from FermiGrid, having a distributed fault tolerant infrastructure is highly desirable for production operations.
We are actively working on deploying the FermiCloud hardware resources in a fault tolerant infrastructure:
• The physical systems are split across two buildings,
• There is a fault tolerant network infrastructure in place that interconnects the two buildings,
• We have deployed SAN hardware in both buildings,
• We have a dual head-node configuration with heartbeat for failover,
• We have GFS2 + CLVM for our multi-user filesystem and distributed SAN,
• The SAN is replicated between buildings using CLVM mirroring.
GOAL:
• If a building is "lost", then automatically relaunch "24x7" VMs on the surviving infrastructure, then relaunch "9x5" VMs if there is sufficient remaining capacity,
• Perform notification (via Service-Now) when exceptions are detected.
FCC and GCC
The FCC and GCC buildings are separated by approximately 1 mile (1.6 km).
FCC has UPS and generator.
GCC has UPS.
Distributed Network Core Provides Redundant Connectivity
[Diagram: Nexus 7010 core switches in FCC-2, FCC-3, GCC-A, and GCC-B interconnect the FermiGrid and FermiCloud nodes, grid worker nodes, disk servers, and robotic tape libraries (4 and 3 libraries respectively) over a 20 Gigabit/s L3 routed network and an 80 Gigabit/s L2 switched network, with private networks over dedicated fiber. Intermediate-level and top-of-rack switches are not shown. Deployment was completed in June 2012.]
Distributed Shared File System
Design:
• Dual-port FibreChannel HBA in each node,
• Two Brocade SAN switches per rack,
• Brocades linked rack-to-rack with dark fiber,
• 60TB Nexsan Satabeast in FCC-3 and GCC-B,
• Red Hat Clustering + CLVM + GFS2 used for the file system,
• Each VM image is a file in the GFS2 file system,
• LVM mirroring provides RAID 1 across buildings (see the sketch below).
Benefits:
• Fast launch—almost immediate, as compared to 3-4 minutes with ssh/scp,
• Live migration—virtual machines can be moved from one host to another for scheduled maintenance, transparently to users,
• Persistent data volumes—can move quickly with machines,
• Virtual machines can be relaunched in the surviving building in case of a building failure/outage.
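The storage layering above can be hard to visualize, so here is a rough sketch of the commands it implies, wrapped in Python for illustration. Every device path, volume group name, size, and cluster name is a placeholder rather than the actual FermiCloud configuration, and the commands assume an already-configured Red Hat cluster with clvmd and the DLM running.

#!/usr/bin/env python
"""Sketch: clustered LVM mirror across two SAN LUNs plus a shared GFS2 filesystem."""
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.check_call(cmd)

FCC_LUN = "/dev/mapper/fcc3-satabeast-lun0"   # placeholder device paths
GCC_LUN = "/dev/mapper/gccb-satabeast-lun0"

# One SAN LUN from each building becomes a physical volume in a clustered VG.
run(["pvcreate", FCC_LUN, GCC_LUN])
run(["vgcreate", "--clustered", "y", "fclvg", FCC_LUN, GCC_LUN])

# RAID-1 across buildings: one mirror leg on each building's LUN.
run(["lvcreate", "-m", "1", "-L", "2T", "-n", "vmstore", "fclvg", FCC_LUN, GCC_LUN])

# Shared GFS2 filesystem: DLM locking, one journal per cluster node.
run(["mkfs.gfs2", "-p", "lock_dlm", "-t", "fermicloud:vmstore", "-j", "8",
     "/dev/fclvg/vmstore"])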
FermiCloud – Network & SAN "Today"
[Diagram: private Ethernet and Fibre Channel both run over dedicated fiber. Nexus 7010 switches in FCC-2, FCC-3, GCC-A, and GCC-B carry the network. In FCC-3, nodes fcl315 to fcl323 connect through Brocade switches to a Satabeast; in GCC-B, nodes fcl001 to fcl013 connect through Brocade switches to a Satabeast.]
FermiCloud-HA Head Node Configuration
[Diagram: the two head nodes each run an ONED/SCHED instance plus half of the paired infrastructure services, and are coupled by 2-way rsync, live migration, multi-master MySQL, and CLVM/rgmanager.]
fcl001 (GCC-B): ONED/SCHED, fcl-ganglia2, fermicloudnis2, fermicloudrepo2, fclweb2, fcl-cobbler, fermicloudlog, fermicloudadmin, fcl-lvs2, fcl-mysql2.
fcl301 (FCC-3): ONED/SCHED, fcl-ganglia1, fermicloudnis1, fermicloudrepo1, fclweb1, fermicloudrsv, fcl-lvs1, fcl-mysql1.
Cooperative R+D Agreement
Partners:
• Grid and Cloud Computing Dept. @FNAL
• Global Science Experimental Data hub Center @KISTI
Project Title:
• Integration and Commissioning of a Prototype Federated Cloud for Scientific Workflows
Status:
• Three major work items:
1. Virtual Infrastructure Automation and Provisioning,
2. Interoperability and Federation of Cloud Resources,
3. High-Throughput Fabric Virtualization.
Virtual Machines as Jobs
OpenNebula (and all other open-source IaaS stacks) provides an emulation of Amazon EC2.
HTCondor developers added code to their "Amazon EC2" universe to support the X.509-authenticated protocol.
Currently testing in bulk; up to 75 VMs OK thus far.
Goal: submit the NOvA workflow to OpenNebula @ FermiCloud, OpenStack @ Notre Dame, and Amazon EC2.
Smooth submission of many thousands of VMs is a key step to making the full infrastructure of a site into a science cloud (a submission sketch follows below).
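For orientation, here is a minimal sketch of what bulk EC2-style submission through HTCondor's grid universe looks like, using the ordinary access-key attributes from the HTCondor manual. The endpoint URL, AMI id, instance type, and credential file paths are placeholders, and the X.509-authenticated variant mentioned above needs additional, site-specific settings that are not shown.

#!/usr/bin/env python
"""Sketch: submit a batch of EC2-style VMs through HTCondor's grid universe."""
import subprocess

SUBMIT_DESCRIPTION = """\
universe              = grid
grid_resource         = ec2 https://fermicloud.example.gov:8443/ec2/
executable            = nova-worker-vm
ec2_access_key_id     = /home/user/.ec2/access_key
ec2_secret_access_key = /home/user/.ec2/secret_key
ec2_ami_id            = ami-00000042
ec2_instance_type     = m1.large
log                   = vm_$(Cluster)_$(Process).log
queue 75
"""

def submit():
    # condor_submit reads the submit description from standard input when no
    # file name is given.
    subprocess.run(["condor_submit"], input=SUBMIT_DESCRIPTION, text=True,
                   check=True)

if __name__ == "__main__":
    submit()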
[Diagram: grid bursting. A GlideinWMS factory and user frontend submit to the FermiGrid site gateway (CDF, D0, CMS, GP) and to a cloud gateway; vCluster reads the job queue and submits VMs as needed to Amazon EC2, FermiCloud (GCLOUD), and OpenStack at Notre Dame.]
vCluster at SC2012
[Diagram: cloud bursting via GlideinWMS. The GWMS factory and user frontend burst directly to Amazon EC2, FermiCloud (GCLOUD), and OpenStack at Notre Dame.]
[Diagram: cloud bursting via OpenNebula, involving the GWMS factory, the user frontend, FermiCloud, and Amazon EC2.]
True Idle VM Detection
In times of resource need, we want the ability to suspend or "shelve" idle VMs in order to free up resources for higher priority usage.
• This is especially important in the event of constrained resources (e.g. during a building or network failure).
Shelving of "9x5" and "opportunistic" VMs allows us to use FermiCloud resources for Grid worker node VMs during nights and weekends.
• This is part of the draft economic model.
Giovanni Franzini (an Italian co-op student) has written (extensible) code for an "Idle VM Probe" that can be used to detect idle virtual machines based on CPU, disk I/O and network I/O (a minimal sketch of the idea follows below).
Nick Palombo, a consultant, has written the communication system and the collector system that take rule-based actions based on the idle information.
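The probe itself is not reproduced here, so the following is only a minimal sketch of the idea, assuming libvirt-managed KVM hosts: sample each domain's cumulative CPU time, wait, sample again, and flag domains below a (hypothetical) usage threshold. The real probe also folds in disk and network I/O, which follow the same sample-and-diff pattern.

#!/usr/bin/env python
"""Sketch: flag VMs whose CPU usage stays below a threshold over a window."""
import time
import libvirt   # libvirt-python bindings

CPU_IDLE_THRESHOLD = 0.02   # below 2% of one core over the window counts as idle
WINDOW_SECONDS = 300

def idle_vms(uri="qemu:///system"):
    conn = libvirt.openReadOnly(uri)
    domains = conn.listAllDomains()
    # dom.info() returns [state, maxMem, memory, nrVirtCpu, cpuTime(ns)]
    before = {d.UUIDString(): d.info()[4] for d in domains}
    time.sleep(WINDOW_SECONDS)
    idle = []
    for d in domains:
        used_ns = d.info()[4] - before[d.UUIDString()]
        if used_ns / (WINDOW_SECONDS * 1e9) < CPU_IDLE_THRESHOLD:
            idle.append(d.name())
    return idle

if __name__ == "__main__":
    for name in idle_vms():
        print("idle:", name)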
Idle VM Information Flow
[Diagram: on each host, an information-manager (IM) probe samples the running VMs and writes to an idle data store. The Idle VM Collector gathers this into a raw VM state database, the Idle VM Logic produces an idle VM list, and an Idle VM Trigger drives Idle VM Shutdown through XML-RPC calls to OpenNebula.]
Interoperability and Federation
Driver:
• Global scientific collaborations such as the LHC experiments will have to interoperate across facilities with heterogeneous cloud infrastructure.
European efforts:
• EGI Cloud Federation Task Force – several institutional clouds (OpenNebula, OpenStack, StratusLab).
• HelixNebula – a federation of commercial cloud providers.
Our goals:
• Show proof of principle – a federation including FermiCloud + the KISTI "G Cloud" + one or more commercial cloud providers + other research institution community clouds if possible.
• Participate in existing federations if possible.
Core competency:
• The FermiCloud project can contribute to these cloud federations given our expertise in X.509 authentication and authorization, and our long experience in grid federation.
Virtual Image Formats
Different clouds have different virtual machine image formats:
• File system ++, partition table, LVM volumes, kernel?
We have identified the differences and written a comprehensive step-by-step user manual, soon to be public (a typical format conversion is sketched below).
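Converting between the formats usually comes down to qemu-img. Here is a minimal sketch; the file names are placeholders, and the step-by-step manual mentioned above covers the partition-table and kernel details that a plain format conversion does not.

#!/usr/bin/env python
"""Sketch: convert a VM image between the formats different clouds expect."""
import subprocess

def convert(src, dst, src_fmt, dst_fmt):
    # qemu-img convert -f <input format> -O <output format> <src> <dst>
    subprocess.check_call(["qemu-img", "convert", "-f", src_fmt, "-O", dst_fmt,
                           src, dst])

if __name__ == "__main__":
    # A raw KVM image into qcow2, and into vmdk for a VMware-style cloud.
    convert("worker.img", "worker.qcow2", "raw", "qcow2")
    convert("worker.img", "worker.vmdk", "raw", "vmdk")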
Interoperability/Compatibility of APIs
The Amazon EC2 API is not open source; it is a moving target that changes frequently.
Open-source emulations have various feature levels and accuracy of implementation:
• Compare and contrast OpenNebula, OpenStack, and commercial clouds,
• Identify the lowest common denominator(s) that work on all (see the sketch below).
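As an illustration of the "lowest common denominator" approach, the sketch below drives several EC2-compatible endpoints through one client library using only run/describe/terminate calls. The endpoint URLs and credentials are placeholders, and boto3 is a present-day library rather than the 2013-era tooling; the point is only that the same small set of calls can target each cloud.

#!/usr/bin/env python
"""Sketch: the same minimal EC2 calls against several EC2-compatible clouds."""
import boto3

ENDPOINTS = {
    # name: endpoint URL (None means the real AWS endpoint)
    "fermicloud-econe": "https://fermicloud.example.gov:8443/ec2/",
    "notredame-openstack": "https://cloud.nd.example.edu:8773/services/Cloud",
    "amazon": None,
}

def make_client(name, access_key, secret_key, region="us-east-1"):
    return boto3.client("ec2", endpoint_url=ENDPOINTS[name], region_name=region,
                        aws_access_key_id=access_key,
                        aws_secret_access_key=secret_key)

def smoke_test(client, image_id, instance_type="m1.small"):
    """Launch one instance, list it, then terminate it."""
    r = client.run_instances(ImageId=image_id, MinCount=1, MaxCount=1,
                             InstanceType=instance_type)
    instance_id = r["Instances"][0]["InstanceId"]
    client.describe_instances(InstanceIds=[instance_id])
    client.terminate_instances(InstanceIds=[instance_id])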
VM Image Distribution
Investigate existing image marketplaces (HEPiX, U. of Victoria).
Investigate whether we need an Amazon S3-like storage/distribution method for OS images:
• OpenNebula doesn't have one at present,
• A GridFTP "door" to the OpenNebula VM library is a possibility; this could be integrated with an automatic security scan workflow using the existing Fermilab NESSUS infrastructure.
High-Throughput Fabric Virtualization
Followed up on earlier virtualized MPI work:
• Use it in real scientific workflows,
• Now users can define a set of IB machines in OpenNebula on their own,
• DAQ system simulation,
• Large multicast activity.
Also, experiments were done with virtualized 10 GbE on the 100 Gbit WAN testbed.
Security
Main areas of cloud security development:
Secure contextualization:
• Secrets such as X.509 service certificates and Kerberos keytabs are not stored in virtual machines (see the following talk for more details).
X.509 Authentication/Authorization:
• X.509 authentication was written by T. Hesselroth; the code was submitted to and accepted by OpenNebula, and has been publicly available since Jan-2012.
Security policy:
• A security taskforce met and delivered a report to the Fermilab Computer Security Board, recommending the creation of a new Cloud Computing Environment, now in progress.
We also participated in the HEPiX Virtualisation Task Force:
• We respectfully disagree with the recommendations regarding VM endorsement.
Pluggable authentication
Some assembly required
Batteries not included
OpenNebula Authentication
OpenNebula came with "pluggable" authentication, but few plugins were initially available.
The OpenNebula 2.0 web services by default used an access key / secret key mechanism similar to Amazon EC2. No https was available.
Four ways to access OpenNebula:
• Command line tools,
• Sunstone web GUI,
• "ECONE" web service emulation of the Amazon RESTful (Query) API,
• OCCI web service.
The FermiCloud project wrote X.509-based authentication plugins:
• Patches to OpenNebula to support this were developed at Fermilab and submitted back to the OpenNebula project in Fall 2011 (generally available in OpenNebula V3.2 onwards).
• X.509 plugins are available for the command line and for web services authentication.
X.509 Authentication—how it works
• Command line:
  • The user creates an X.509-based token using the "oneuser login" command.
  • This makes a base64 hash of the user's proxy and certificate chain, combined with a username:expiration date, signed with the user's private key.
• Web services:
  • The web services daemon contacts the OpenNebula XML-RPC core on the user's behalf, using the host certificate to sign the authentication token.
  • Apache mod_proxy is used to pass the grid certificate DN to the web services.
• Limitations:
  • With web services, one DN can map to only one user.
(A sketch of a client using such a token against the XML-RPC API follows below.)
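As a rough sketch of the command-line path, the snippet below calls the OpenNebula XML-RPC API with the session string that "oneuser login" leaves behind. It assumes the usual "username:token" contents of ~/.one/one_auth and a placeholder endpoint URL; error handling is reduced to the success flag the XML-RPC calls return.

#!/usr/bin/env python
"""Sketch: list the caller's VMs through OpenNebula's XML-RPC API."""
import os
import xmlrpc.client

ENDPOINT = "https://fermicloud-head.example.gov:2633/RPC2"   # placeholder

def session_string(path=os.path.expanduser("~/.one/one_auth")):
    with open(path) as f:
        return f.read().strip()        # "username:x509-login-token"

def list_my_vms():
    server = xmlrpc.client.ServerProxy(ENDPOINT)
    # one.vmpool.info(session, who, start_id, end_id, state):
    # who = -3 restricts to the calling user's VMs, -1/-1 is the full ID
    # range, and state = -1 means any state except DONE.
    success, body, _ = server.one.vmpool.info(session_string(), -3, -1, -1, -1)
    if not success:
        raise RuntimeError(body)       # on failure, body is the error message
    return body                        # on success, an XML dump of the VM pool

if __name__ == "__main__":
    print(list_my_vms())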
Grid AuthZ Interoperability Protocol
• Uses XACML 2.0 to specify DN, CA, hostname, FQAN, FQAN signing entity, and more.
• Developed in 2007; has been used in the Open Science Grid and other grids.
• Java and C bindings are available for the client.
• The most commonly used C binding is LCMAPS.
• Used to talk to GUMS, SAZ, and others.
• Allows one user to be part of different Virtual Organizations and have different groups and roles.
• For cloud authorization we will configure GUMS to map back to individual user names, one per person.
• Each personal account in OpenNebula is created in advance.
(A sketch of such a XACML request is shown below.)
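To give a feel for the protocol, here is a bare-bones XACML 2.0 request carrying a subject DN and a VOMS FQAN. It is only a sketch in the spirit of the AuthZ Interoperability profile: the FQAN attribute identifier is an invented placeholder, not the ID the profile actually defines, and a real deployment would use the existing Java or C (LCMAPS) bindings rather than hand-built XML.

#!/usr/bin/env python
"""Sketch: build a minimal XACML 2.0 context request with ElementTree."""
import xml.etree.ElementTree as ET

CTX = "urn:oasis:names:tc:xacml:2.0:context:schema:os"

def el(tag, parent=None, **attrs):
    qname = "{%s}%s" % (CTX, tag)
    return ET.SubElement(parent, qname, attrs) if parent is not None \
        else ET.Element(qname, attrs)

def attribute(parent, attr_id, data_type, value):
    a = el("Attribute", parent, AttributeId=attr_id, DataType=data_type)
    el("AttributeValue", a).text = value

def build_request(subject_dn, fqan, resource, action):
    ET.register_namespace("", CTX)
    req = el("Request")
    subj = el("Subject", req)
    attribute(subj, "urn:oasis:names:tc:xacml:1.0:subject:subject-id",
              "urn:oasis:names:tc:xacml:1.0:data-type:x500Name", subject_dn)
    attribute(subj, "http://example.org/xacml/subject/voms-fqan",  # placeholder ID
              "http://www.w3.org/2001/XMLSchema#string", fqan)
    res = el("Resource", req)
    attribute(res, "urn:oasis:names:tc:xacml:1.0:resource:resource-id",
              "http://www.w3.org/2001/XMLSchema#string", resource)
    act = el("Action", req)
    attribute(act, "urn:oasis:names:tc:xacml:1.0:action:action-id",
              "http://www.w3.org/2001/XMLSchema#string", action)
    return ET.tostring(req, encoding="unicode")

if __name__ == "__main__":
    print(build_request("/DC=org/DC=example/OU=People/CN=Jane Doe",
                        "/fermilab/Role=Analysis",
                        "fermicloud-vm-create", "access"))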
"Authorization" in OpenNebula
• Note: OpenNebula has pluggable "Authorization" modules as well.
• These control access ACLs—namely, which user can launch a virtual machine, create a network, store an image, etc.
• They are not related to the grid-based notion of authorization at all.
• Instead we make our "Authorization" additions to the Authentication routines of OpenNebula.
X.509 Authorization
• OpenNebula authorization plugins were written in Ruby.
• They use existing Grid routines to call out to external GUMS and SAZ authorization servers:
  • Use a Ruby-C binding to call C-based routines for LCMAPS, or
  • Use the Ruby-Java bridge to call Java-based routines from the Privilege project.
• GUMS returns a uid/gid; SAZ returns yes/no.
• This works with the OpenNebula command line and non-interactive web services.
• Much effort was spent in trying to send user credentials with extended attributes into the web browser:
  • Currently the ruby-java-bridge setup works for the CLI,
  • For Sunstone we have shifted to a callout to VOMS done on the server side.
• We are always interested in talking to anyone who is doing X.509 authentication in any cloud.
Reframing Cloud Discussion
Purpose of Infrastructure-as-a-Service: on demand only?
• No—it is a whole new way to think about IT infrastructure, both internal and external.
• The cloud API is just a part of rethinking IT infrastructure for data-intensive science (and MIS).
• It is only as good as the hardware and software it's built on; network fabric, storage, and applications are all crucial.
Buy or build?
• Both! We will always need some in-house capacity.
Performance hit?
• Most of it can be traced to badly written applications or a misconfigured OS.
FermiCloud Project Summary - 1
Science is directly and indirectly benefiting from FermiCloud:
• CDF, D0, Intensity Frontier, Cosmic Frontier, CMS, ATLAS, Open Science Grid, …
FermiCloud operates at the forefront of delivering cloud computing capabilities to support scientific research:
• By starting small, developing a list of requirements, and building on existing Grid knowledge and infrastructure to address those requirements, FermiCloud has managed to deliver a production class Infrastructure as a Service cloud computing capability that supports science at Fermilab.
• FermiCloud has provided FermiGrid with an infrastructure that has allowed us to test Grid middleware at production scale prior to deployment.
• The Open Science Grid software team used FermiCloud resources to support their RPM "refactoring" and is currently using it to support their ongoing middleware development/integration.
FermiCloud Project Summary
The FermiCloud collaboration with KISTI has leveraged the resources and expertise of both institutions to achieve significant benefits.
vCluster has demonstrated proof-of-principle "Grid Bursting" using FermiCloud and Amazon EC2 resources.
Using SR-IOV drivers on FermiCloud virtual machines, MPI performance has been demonstrated to be >96% of the native "bare metal" performance.
The future is mostly cloudy.
Acknowledgements
None of this work could have been accomplished without:
• The excellent support from other departments of the Fermilab Computing Sector – including Computing Facilities, Site Networking, and Logistics.
• The excellent collaboration with the open source communities – especially Scientific Linux and OpenNebula,
• As well as the excellent collaboration and contributions from KISTI,
• And talented summer students from Illinois Institute of Technology.
25-Sep-2013S. Timm, OpenNebulaConf42

Weitere ähnliche Inhalte

Was ist angesagt?

Supercomputing by API: Connecting Modern Web Apps to HPC
Supercomputing by API: Connecting Modern Web Apps to HPCSupercomputing by API: Connecting Modern Web Apps to HPC
Supercomputing by API: Connecting Modern Web Apps to HPCOpenStack
 
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...OPNFV
 
OpenStack Networks the Web-Scale Way - Scott Laffer, Cumulus Networks
OpenStack Networks the Web-Scale Way - Scott Laffer, Cumulus NetworksOpenStack Networks the Web-Scale Way - Scott Laffer, Cumulus Networks
OpenStack Networks the Web-Scale Way - Scott Laffer, Cumulus NetworksOpenStack
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHungWei Chiu
 
Accelerated dataplanes integration and deployment
Accelerated dataplanes integration and deploymentAccelerated dataplanes integration and deployment
Accelerated dataplanes integration and deploymentOPNFV
 
Webinar- Tea for the Tillerman
Webinar- Tea for the TillermanWebinar- Tea for the Tillerman
Webinar- Tea for the TillermanCumulus Networks
 
John Spray - Ceph in Kubernetes
John Spray - Ceph in KubernetesJohn Spray - Ceph in Kubernetes
John Spray - Ceph in KubernetesShapeBlue
 
Unveiling CERN Cloud Architecture - October, 2015
Unveiling CERN Cloud Architecture - October, 2015Unveiling CERN Cloud Architecture - October, 2015
Unveiling CERN Cloud Architecture - October, 2015Belmiro Moreira
 
iptables 101- bottom-up
iptables 101- bottom-upiptables 101- bottom-up
iptables 101- bottom-upHungWei Chiu
 
XCP-ng - past, present and future
XCP-ng - past, present and futureXCP-ng - past, present and future
XCP-ng - past, present and futureShapeBlue
 
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...OpenNebula Project
 
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Belmiro Moreira
 
Antoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloudAntoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloudShapeBlue
 
Cumulus Linux 2.5 Overview
Cumulus Linux 2.5 OverviewCumulus Linux 2.5 Overview
Cumulus Linux 2.5 OverviewCumulus Networks
 
KubeCon US 2021 - Recap - DCMeetup
KubeCon US 2021 - Recap - DCMeetupKubeCon US 2021 - Recap - DCMeetup
KubeCon US 2021 - Recap - DCMeetupFaheem Memon
 
Cumulus networks - Overcoming traditional network limitations with open source
Cumulus networks - Overcoming traditional network limitations with open sourceCumulus networks - Overcoming traditional network limitations with open source
Cumulus networks - Overcoming traditional network limitations with open sourceNat Morris
 
Deep Dive Into the CERN Cloud Infrastructure - November, 2013
Deep Dive Into the CERN Cloud Infrastructure - November, 2013Deep Dive Into the CERN Cloud Infrastructure - November, 2013
Deep Dive Into the CERN Cloud Infrastructure - November, 2013Belmiro Moreira
 
Troubleshooting Tracebacks
Troubleshooting TracebacksTroubleshooting Tracebacks
Troubleshooting TracebacksJames Denton
 

Was ist angesagt? (19)

Supercomputing by API: Connecting Modern Web Apps to HPC
Supercomputing by API: Connecting Modern Web Apps to HPCSupercomputing by API: Connecting Modern Web Apps to HPC
Supercomputing by API: Connecting Modern Web Apps to HPC
 
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
Testing, CI Gating & Community Fast Feedback: The Challenge of Integration Pr...
 
OpenStack Networks the Web-Scale Way - Scott Laffer, Cumulus Networks
OpenStack Networks the Web-Scale Way - Scott Laffer, Cumulus NetworksOpenStack Networks the Web-Scale Way - Scott Laffer, Cumulus Networks
OpenStack Networks the Web-Scale Way - Scott Laffer, Cumulus Networks
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User Group
 
Accelerated dataplanes integration and deployment
Accelerated dataplanes integration and deploymentAccelerated dataplanes integration and deployment
Accelerated dataplanes integration and deployment
 
Webinar- Tea for the Tillerman
Webinar- Tea for the TillermanWebinar- Tea for the Tillerman
Webinar- Tea for the Tillerman
 
John Spray - Ceph in Kubernetes
John Spray - Ceph in KubernetesJohn Spray - Ceph in Kubernetes
John Spray - Ceph in Kubernetes
 
Unveiling CERN Cloud Architecture - October, 2015
Unveiling CERN Cloud Architecture - October, 2015Unveiling CERN Cloud Architecture - October, 2015
Unveiling CERN Cloud Architecture - October, 2015
 
iptables 101- bottom-up
iptables 101- bottom-upiptables 101- bottom-up
iptables 101- bottom-up
 
XCP-ng - past, present and future
XCP-ng - past, present and futureXCP-ng - past, present and future
XCP-ng - past, present and future
 
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
OpenNebulaConf 2016 - Budgeting: the Ugly Duckling of Cloud computing? by Mat...
 
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
Tips Tricks and Tactics with Cells and Scaling OpenStack - May, 2015
 
Antoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloudAntoine Coetsier - billing the cloud
Antoine Coetsier - billing the cloud
 
Status of Embedded Linux
Status of Embedded LinuxStatus of Embedded Linux
Status of Embedded Linux
 
Cumulus Linux 2.5 Overview
Cumulus Linux 2.5 OverviewCumulus Linux 2.5 Overview
Cumulus Linux 2.5 Overview
 
KubeCon US 2021 - Recap - DCMeetup
KubeCon US 2021 - Recap - DCMeetupKubeCon US 2021 - Recap - DCMeetup
KubeCon US 2021 - Recap - DCMeetup
 
Cumulus networks - Overcoming traditional network limitations with open source
Cumulus networks - Overcoming traditional network limitations with open sourceCumulus networks - Overcoming traditional network limitations with open source
Cumulus networks - Overcoming traditional network limitations with open source
 
Deep Dive Into the CERN Cloud Infrastructure - November, 2013
Deep Dive Into the CERN Cloud Infrastructure - November, 2013Deep Dive Into the CERN Cloud Infrastructure - November, 2013
Deep Dive Into the CERN Cloud Infrastructure - November, 2013
 
Troubleshooting Tracebacks
Troubleshooting TracebacksTroubleshooting Tracebacks
Troubleshooting Tracebacks
 

Ähnlich wie OpenNebulaConf 2013 - Keynote: Enabling Scientific Workflows on FermiCloud using OpenNebula by by Steven C. Timm

4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)NAIM Networks, Inc.
 
Future services on Janet
Future services on JanetFuture services on Janet
Future services on JanetJisc
 
Software Defined Optical Networks - Mayur Channegowda
Software Defined Optical Networks - Mayur ChannegowdaSoftware Defined Optical Networks - Mayur Channegowda
Software Defined Optical Networks - Mayur ChannegowdaCPqD
 
Software Defined Optical Networks - Mayur Channegowda
Software Defined Optical Networks - Mayur ChannegowdaSoftware Defined Optical Networks - Mayur Channegowda
Software Defined Optical Networks - Mayur ChannegowdaCPqD
 
OpenStack Telco Cloud Challenges, David Fick, Oracle
OpenStack Telco Cloud Challenges, David Fick, OracleOpenStack Telco Cloud Challenges, David Fick, Oracle
OpenStack Telco Cloud Challenges, David Fick, OracleSriram Subramanian
 
Emulating cisco network laboratory topologies in the cloud
Emulating cisco network laboratory topologies in the cloudEmulating cisco network laboratory topologies in the cloud
Emulating cisco network laboratory topologies in the cloudronan messi
 
Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...
Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...
Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...Ceph Community
 
SDN 101: Software Defined Networking Course - Sameh Zaghloul/IBM - 2014
SDN 101: Software Defined Networking Course - Sameh Zaghloul/IBM - 2014SDN 101: Software Defined Networking Course - Sameh Zaghloul/IBM - 2014
SDN 101: Software Defined Networking Course - Sameh Zaghloul/IBM - 2014SAMeh Zaghloul
 
Lessons learned so far in operationalizing NFV
Lessons learned so far in operationalizing NFVLessons learned so far in operationalizing NFV
Lessons learned so far in operationalizing NFVJames Crawshaw
 
Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...
Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...
Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...Ceph Community
 
Implementing vCPE with OpenStack and Software Defined Networks
Implementing vCPE with OpenStack and Software Defined NetworksImplementing vCPE with OpenStack and Software Defined Networks
Implementing vCPE with OpenStack and Software Defined NetworksPLUMgrid
 
Automated Deployment and Management of Edge Clouds
Automated Deployment and Management of Edge CloudsAutomated Deployment and Management of Edge Clouds
Automated Deployment and Management of Edge CloudsJay Bryant
 
Openflow for Mobile Broadband service providers_Nov'11
Openflow for Mobile Broadband service providers_Nov'11Openflow for Mobile Broadband service providers_Nov'11
Openflow for Mobile Broadband service providers_Nov'11Radhakant Das
 
2017 - LISA - LinkedIn's Distributed Firewall (DFW)
2017 - LISA - LinkedIn's Distributed Firewall (DFW)2017 - LISA - LinkedIn's Distributed Firewall (DFW)
2017 - LISA - LinkedIn's Distributed Firewall (DFW)Mike Svoboda
 
Container orchestration in geo-distributed cloud computing platforms
Container orchestration in geo-distributed cloud computing platformsContainer orchestration in geo-distributed cloud computing platforms
Container orchestration in geo-distributed cloud computing platformsFogGuru MSCA Project
 
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안NAIM Networks, Inc.
 
Red Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShiftRed Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShiftKangaroot
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux NetworkingPLUMgrid
 
5G in Brownfield how SDN makes 5G Deployments Work
5G in Brownfield how SDN makes 5G Deployments Work5G in Brownfield how SDN makes 5G Deployments Work
5G in Brownfield how SDN makes 5G Deployments WorkLumina Networks
 

Ähnlich wie OpenNebulaConf 2013 - Keynote: Enabling Scientific Workflows on FermiCloud using OpenNebula by by Steven C. Timm (20)

4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)
 
Future services on Janet
Future services on JanetFuture services on Janet
Future services on Janet
 
Software Defined Optical Networks - Mayur Channegowda
Software Defined Optical Networks - Mayur ChannegowdaSoftware Defined Optical Networks - Mayur Channegowda
Software Defined Optical Networks - Mayur Channegowda
 
Software Defined Optical Networks - Mayur Channegowda
Software Defined Optical Networks - Mayur ChannegowdaSoftware Defined Optical Networks - Mayur Channegowda
Software Defined Optical Networks - Mayur Channegowda
 
OpenStack Telco Cloud Challenges, David Fick, Oracle
OpenStack Telco Cloud Challenges, David Fick, OracleOpenStack Telco Cloud Challenges, David Fick, Oracle
OpenStack Telco Cloud Challenges, David Fick, Oracle
 
Emulating cisco network laboratory topologies in the cloud
Emulating cisco network laboratory topologies in the cloudEmulating cisco network laboratory topologies in the cloud
Emulating cisco network laboratory topologies in the cloud
 
Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...
Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...
Ceph Day SF 2015 - Deploying flash storage for Ceph without compromising perf...
 
SDN and NFV
SDN and NFVSDN and NFV
SDN and NFV
 
SDN 101: Software Defined Networking Course - Sameh Zaghloul/IBM - 2014
SDN 101: Software Defined Networking Course - Sameh Zaghloul/IBM - 2014SDN 101: Software Defined Networking Course - Sameh Zaghloul/IBM - 2014
SDN 101: Software Defined Networking Course - Sameh Zaghloul/IBM - 2014
 
Lessons learned so far in operationalizing NFV
Lessons learned so far in operationalizing NFVLessons learned so far in operationalizing NFV
Lessons learned so far in operationalizing NFV
 
Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...
Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...
Ceph Day Amsterdam 2015 - Deploying flash storage for Ceph without compromisi...
 
Implementing vCPE with OpenStack and Software Defined Networks
Implementing vCPE with OpenStack and Software Defined NetworksImplementing vCPE with OpenStack and Software Defined Networks
Implementing vCPE with OpenStack and Software Defined Networks
 
Automated Deployment and Management of Edge Clouds
Automated Deployment and Management of Edge CloudsAutomated Deployment and Management of Edge Clouds
Automated Deployment and Management of Edge Clouds
 
Openflow for Mobile Broadband service providers_Nov'11
Openflow for Mobile Broadband service providers_Nov'11Openflow for Mobile Broadband service providers_Nov'11
Openflow for Mobile Broadband service providers_Nov'11
 
2017 - LISA - LinkedIn's Distributed Firewall (DFW)
2017 - LISA - LinkedIn's Distributed Firewall (DFW)2017 - LISA - LinkedIn's Distributed Firewall (DFW)
2017 - LISA - LinkedIn's Distributed Firewall (DFW)
 
Container orchestration in geo-distributed cloud computing platforms
Container orchestration in geo-distributed cloud computing platformsContainer orchestration in geo-distributed cloud computing platforms
Container orchestration in geo-distributed cloud computing platforms
 
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
OVNC 2015-Open Ethernet과 SDN을 통한 Mellanox의 차세대 네트워크 혁신 방안
 
Red Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShiftRed Hat multi-cluster management & what's new in OpenShift
Red Hat multi-cluster management & what's new in OpenShift
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
 
5G in Brownfield how SDN makes 5G Deployments Work
5G in Brownfield how SDN makes 5G Deployments Work5G in Brownfield how SDN makes 5G Deployments Work
5G in Brownfield how SDN makes 5G Deployments Work
 

Mehr von OpenNebula Project

OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...OpenNebula Project
 
OpenNebulaConf2019 - Building Virtual Environments for Security Analyses of C...
OpenNebulaConf2019 - Building Virtual Environments for Security Analyses of C...OpenNebulaConf2019 - Building Virtual Environments for Security Analyses of C...
OpenNebulaConf2019 - Building Virtual Environments for Security Analyses of C...OpenNebula Project
 
OpenNebulaConf2019 - CORD and Edge computing with OpenNebula - Alfonso Aureli...
OpenNebulaConf2019 - CORD and Edge computing with OpenNebula - Alfonso Aureli...OpenNebulaConf2019 - CORD and Edge computing with OpenNebula - Alfonso Aureli...
OpenNebulaConf2019 - CORD and Edge computing with OpenNebula - Alfonso Aureli...OpenNebula Project
 
OpenNebulaConf2019 - 6 years (+) OpenNebula - Lessons learned - Sebastian Man...
OpenNebulaConf2019 - 6 years (+) OpenNebula - Lessons learned - Sebastian Man...OpenNebulaConf2019 - 6 years (+) OpenNebula - Lessons learned - Sebastian Man...
OpenNebulaConf2019 - 6 years (+) OpenNebula - Lessons learned - Sebastian Man...OpenNebula Project
 
OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...
OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...
OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...OpenNebula Project
 
OpenNebulaConf2019 - Image Backups in OpenNebula - Momčilo Medić - ITAF
OpenNebulaConf2019 - Image Backups in OpenNebula - Momčilo Medić - ITAFOpenNebulaConf2019 - Image Backups in OpenNebula - Momčilo Medić - ITAF
OpenNebulaConf2019 - Image Backups in OpenNebula - Momčilo Medić - ITAFOpenNebula Project
 
OpenNebulaConf2019 - How We Use GOCA to Manage our OpenNebula Cloud - Jean-Ph...
OpenNebulaConf2019 - How We Use GOCA to Manage our OpenNebula Cloud - Jean-Ph...OpenNebulaConf2019 - How We Use GOCA to Manage our OpenNebula Cloud - Jean-Ph...
OpenNebulaConf2019 - How We Use GOCA to Manage our OpenNebula Cloud - Jean-Ph...OpenNebula Project
 
OpenNebulaConf2019 - Crytek: A Video gaming Edge Implementation "on the shoul...
OpenNebulaConf2019 - Crytek: A Video gaming Edge Implementation "on the shoul...OpenNebulaConf2019 - Crytek: A Video gaming Edge Implementation "on the shoul...
OpenNebulaConf2019 - Crytek: A Video gaming Edge Implementation "on the shoul...OpenNebula Project
 
Replacing vCloud with OpenNebula
Replacing vCloud with OpenNebulaReplacing vCloud with OpenNebula
Replacing vCloud with OpenNebulaOpenNebula Project
 
NTS: What We Do With OpenNebula - and Why We Do It
NTS: What We Do With OpenNebula - and Why We Do ItNTS: What We Do With OpenNebula - and Why We Do It
NTS: What We Do With OpenNebula - and Why We Do ItOpenNebula Project
 
OpenNebula from the Perspective of an ISP
OpenNebula from the Perspective of an ISPOpenNebula from the Perspective of an ISP
OpenNebula from the Perspective of an ISPOpenNebula Project
 
NTS CAPTAIN / OpenNebula at Julius Blum GmbH
NTS CAPTAIN / OpenNebula at Julius Blum GmbHNTS CAPTAIN / OpenNebula at Julius Blum GmbH
NTS CAPTAIN / OpenNebula at Julius Blum GmbHOpenNebula Project
 
Performant and Resilient Storage: The Open Source & Linux Way
Performant and Resilient Storage: The Open Source & Linux WayPerformant and Resilient Storage: The Open Source & Linux Way
Performant and Resilient Storage: The Open Source & Linux WayOpenNebula Project
 
NetApp Hybrid Cloud with OpenNebula
NetApp Hybrid Cloud with OpenNebulaNetApp Hybrid Cloud with OpenNebula
NetApp Hybrid Cloud with OpenNebulaOpenNebula Project
 
NSX with OpenNebula - upcoming 5.10
NSX with OpenNebula - upcoming 5.10NSX with OpenNebula - upcoming 5.10
NSX with OpenNebula - upcoming 5.10OpenNebula Project
 
Security for Private Cloud Environments
Security for Private Cloud EnvironmentsSecurity for Private Cloud Environments
Security for Private Cloud EnvironmentsOpenNebula Project
 
CheckPoint R80.30 Installation on OpenNebula
CheckPoint R80.30 Installation on OpenNebulaCheckPoint R80.30 Installation on OpenNebula
CheckPoint R80.30 Installation on OpenNebulaOpenNebula Project
 
Cloud Disaggregation with OpenNebula
Cloud Disaggregation with OpenNebulaCloud Disaggregation with OpenNebula
Cloud Disaggregation with OpenNebulaOpenNebula Project
 

Mehr von OpenNebula Project (20)

OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
OpenNebulaConf2019 - Welcome and Project Update - Ignacio M. Llorente, Rubén ...
 
OpenNebulaConf2019 - Building Virtual Environments for Security Analyses of C...
OpenNebulaConf2019 - Building Virtual Environments for Security Analyses of C...OpenNebulaConf2019 - Building Virtual Environments for Security Analyses of C...
OpenNebulaConf2019 - Building Virtual Environments for Security Analyses of C...
 
OpenNebulaConf2019 - CORD and Edge computing with OpenNebula - Alfonso Aureli...
OpenNebulaConf2019 - CORD and Edge computing with OpenNebula - Alfonso Aureli...OpenNebulaConf2019 - CORD and Edge computing with OpenNebula - Alfonso Aureli...
OpenNebulaConf2019 - CORD and Edge computing with OpenNebula - Alfonso Aureli...
 
OpenNebulaConf2019 - 6 years (+) OpenNebula - Lessons learned - Sebastian Man...
OpenNebulaConf2019 - 6 years (+) OpenNebula - Lessons learned - Sebastian Man...OpenNebulaConf2019 - 6 years (+) OpenNebula - Lessons learned - Sebastian Man...
OpenNebulaConf2019 - 6 years (+) OpenNebula - Lessons learned - Sebastian Man...
 
OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...
OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...
OpenNebulaConf2019 - Performant and Resilient Storage the Open Source & Linux...
 
OpenNebulaConf2019 - Image Backups in OpenNebula - Momčilo Medić - ITAF
OpenNebulaConf2019 - Image Backups in OpenNebula - Momčilo Medić - ITAFOpenNebulaConf2019 - Image Backups in OpenNebula - Momčilo Medić - ITAF
OpenNebulaConf2019 - Image Backups in OpenNebula - Momčilo Medić - ITAF
 
OpenNebulaConf2019 - How We Use GOCA to Manage our OpenNebula Cloud - Jean-Ph...
OpenNebulaConf2019 - How We Use GOCA to Manage our OpenNebula Cloud - Jean-Ph...OpenNebulaConf2019 - How We Use GOCA to Manage our OpenNebula Cloud - Jean-Ph...
OpenNebulaConf2019 - How We Use GOCA to Manage our OpenNebula Cloud - Jean-Ph...
 
OpenNebulaConf2019 - Crytek: A Video gaming Edge Implementation "on the shoul...
OpenNebulaConf2019 - Crytek: A Video gaming Edge Implementation "on the shoul...OpenNebulaConf2019 - Crytek: A Video gaming Edge Implementation "on the shoul...
OpenNebulaConf2019 - Crytek: A Video gaming Edge Implementation "on the shoul...
 
Replacing vCloud with OpenNebula
Replacing vCloud with OpenNebulaReplacing vCloud with OpenNebula
Replacing vCloud with OpenNebula
 
NTS: What We Do With OpenNebula - and Why We Do It
NTS: What We Do With OpenNebula - and Why We Do ItNTS: What We Do With OpenNebula - and Why We Do It
NTS: What We Do With OpenNebula - and Why We Do It
 
OpenNebula from the Perspective of an ISP
OpenNebula from the Perspective of an ISPOpenNebula from the Perspective of an ISP
OpenNebula from the Perspective of an ISP
 
NTS CAPTAIN / OpenNebula at Julius Blum GmbH
NTS CAPTAIN / OpenNebula at Julius Blum GmbHNTS CAPTAIN / OpenNebula at Julius Blum GmbH
NTS CAPTAIN / OpenNebula at Julius Blum GmbH
 
Performant and Resilient Storage: The Open Source & Linux Way
Performant and Resilient Storage: The Open Source & Linux WayPerformant and Resilient Storage: The Open Source & Linux Way
Performant and Resilient Storage: The Open Source & Linux Way
 
NetApp Hybrid Cloud with OpenNebula
NetApp Hybrid Cloud with OpenNebulaNetApp Hybrid Cloud with OpenNebula
NetApp Hybrid Cloud with OpenNebula
 
NSX with OpenNebula - upcoming 5.10
NSX with OpenNebula - upcoming 5.10NSX with OpenNebula - upcoming 5.10
NSX with OpenNebula - upcoming 5.10
 
Security for Private Cloud Environments
Security for Private Cloud EnvironmentsSecurity for Private Cloud Environments
Security for Private Cloud Environments
 
CheckPoint R80.30 Installation on OpenNebula
CheckPoint R80.30 Installation on OpenNebulaCheckPoint R80.30 Installation on OpenNebula
CheckPoint R80.30 Installation on OpenNebula
 
DE-CIX: CloudConnectivity
DE-CIX: CloudConnectivityDE-CIX: CloudConnectivity
DE-CIX: CloudConnectivity
 
DDC Demo
DDC DemoDDC Demo
DDC Demo
 
Cloud Disaggregation with OpenNebula
Cloud Disaggregation with OpenNebulaCloud Disaggregation with OpenNebula
Cloud Disaggregation with OpenNebula
 

Kürzlich hochgeladen

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

Kürzlich hochgeladen (20)

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Commit 2024 - Secret Management made easy
OpenNebulaConf 2013 - Keynote: Enabling Scientific Workflows on FermiCloud using OpenNebula by Steven C. Timm

  • 7. Overlapping Phases: Phase 1: “Build and Deploy the Infrastructure”; Phase 2: “Deploy Management Services, Extend the Infrastructure and Research Capabilities”; Phase 3: “Establish Production Services and Evolve System Capabilities in Response to User Needs & Requests”; Phase 4: “Expand the service capabilities to serve more of our user communities”. (Timeline figure: the phases overlap in time, with “Today” marked on the timeline.) 25-Sep-2013 S. Timm, OpenNebulaConf 6
  • 8. Current FermiCloud Capabilities The current FermiCloud hardware capabilities include: • Public network access via the high performance Fermilab network, - This is a distributed, redundant network. • Private 1 Gb/sec network, - This network is bridged across FCC and GCC on private fiber, • High performance Infiniband network, - Currently split into two segments, • Access to a high performance FibreChannel based SAN, - This SAN spans both buildings. • Access to the high performance BlueArc based filesystems, - The BlueArc is located on FCC-2, • Access to the Fermilab dCache and enStore services, - These services are split across FCC and GCC, • Access to 100 Gbit Ethernet test bed in LCC (Integration nodes), - Intel 10 Gbit Ethernet converged network adapter X540-T1. 25-Sep-2013S. Timm, OpenNebulaConf7
  • 9. Typical Use Cases Public net virtual machine: • On Fermilab Network open to Internet, • Can access dCache and BlueArc Mass Storage, • Common home directory between multiple VMs. Public/Private Cluster: • One gateway VM on public/private net, • Cluster of many VMs on private net, • Data acquisition simulation. Storage VM: • VM with large non-persistent storage, • Used for large MySQL or Postgres databases, Lustre/Hadoop/Bestman/xRootd/dCache/OrangeFS/IRODS servers. 25-Sep-2013 S. Timm, OpenNebulaConf 8
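As an illustration of the “public net virtual machine” use case above, here is a minimal sketch, in Python driving the standard OpenNebula CLI, of how such a VM could be described and launched. The image name, network name and sizing are hypothetical placeholders, not the actual FermiCloud templates.

#!/usr/bin/env python
# Minimal sketch: compose an OpenNebula VM template for a public-network VM
# and launch it with 'onevm create <file>'.  Image and network names below
# are hypothetical placeholders.
import subprocess
import tempfile

TEMPLATE = """
NAME   = "public-net-worker"
CPU    = 2
MEMORY = 4096
DISK   = [ IMAGE = "SLF6-worker" ]
NIC    = [ NETWORK = "fermi-public" ]
"""

def launch():
    # Write the template to a temporary file and hand it to the OpenNebula CLI.
    with tempfile.NamedTemporaryFile("w", suffix=".tmpl", delete=False) as f:
        f.write(TEMPLATE)
        path = f.name
    subprocess.check_call(["onevm", "create", path])

if __name__ == "__main__":
    launch()

A real template of this kind would typically also carry contextualization (shared home area, credentials, monitoring); those details are omitted here.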
  • 10. FermiGrid-HA2 Experience In 2009, based on operational experience and plans for redevelopment of the FCC-1 computer room, the FermiGrid-HA2 project was established to split the set of FermiGrid services across computer rooms in two separate buildings (FCC-2 and GCC-B). • This project was completed on 7-Jun-2011 (and tested by a building failure less than two hours later). • FermiGrid-HA2 worked exactly as designed. Our operational experience with FermiGrid-HA and FermiGrid-HA2 has shown the benefits of virtualization and service redundancy. • Benefits to the user community – increased service reliability and uptime. • Benefits to the service maintainers – flexible scheduling of maintenance and upgrade activities. 25-Sep-2013 S. Timm, OpenNebulaConf 9
  • 11. Experience with FermiGrid = Drivers for FermiCloud 25-Sep-2013 S. Timm, OpenNebulaConf 10 Access to pools of resources using common interfaces: • Monitoring, quotas, allocations, accounting, etc. Opportunistic access: • Users can use the common interfaces to “burst” to additional resources to meet their needs. Efficient operations: • Deploy common services centrally. High availability services: • Flexible and resilient operations.
  • 12. Additional Drivers for FermiCloud Existing development and integration (AKA the FAPL cluster) facilities were: • Technically obsolescent and unable to be used effectively to test and deploy the current generations of Grid middleware. • The hardware was over 8 years old and was falling apart. • The needs of the developers and service administrators in the Grid and Cloud Computing Department for reliable and “at scale” development and integration facilities were growing. • Operational experience with FermiGrid had demonstrated that virtualization could be used to deliver production class services. 25-Sep-2013 S. Timm, OpenNebulaConf 11
  • 13. OpenNebula OpenNebula was picked as the result of an evaluation of open-source cloud management software. An OpenNebula 2.0 pilot system in GCC has been available to users since November 2010; it began with 5 nodes and gradually expanded to 13 nodes, and 4500 virtual machines have run on the pilot system in 3+ years. An OpenNebula 3.2 production-quality system was installed in FCC in June 2012, in advance of the GCC total power outage, and now comprises 18 nodes. The transition of virtual machines and users from the ONe 2.0 pilot system to the production system is almost complete. In the meantime OpenNebula has made five more releases; we will catch up shortly. 25-Sep-2013 S. Timm, OpenNebulaConf 12
  • 14. FermiCloud – Fault Tolerance As we have learned from FermiGrid, having a distributed fault tolerant infrastructure is highly desirable for production operations. We are actively working on deploying the FermiCloud hardware resources in a fault tolerant infrastructure: • The physical systems are split across two buildings, • There is a fault tolerant network infrastructure in place that interconnects the two buildings, • We have deployed SAN hardware in both buildings, • We have a dual head-node configuration with heartbeat for failover, • We have GFS2 + CLVM for our multi-user filesystem and distributed SAN, • The SAN is replicated between buildings using CLVM mirroring. GOAL: • If a building is “lost”, then automatically relaunch “24x7” VMs on surviving infrastructure, then relaunch “9x5” VMs if there is sufficient remaining capacity, • Perform notification (via Service-Now) when exceptions are detected. 25-Sep-2013 S. Timm, OpenNebulaConf 13
  • 15. FCC and GCC (site map figure): The FCC and GCC buildings are separated by approximately 1 mile (1.6 km). FCC has UPS and generator. GCC has UPS. 25-Sep-2013 S. Timm, OpenNebulaConf 14
  • 16. Distributed Network Core Provides Redundant Connectivity (network diagram): Nexus 7010 core switches in FCC-2, FCC-3, GCC-A and GCC-B, joined by a 20 Gigabit/s L3 routed network and an 80 Gigabit/s L2 switched network, with private networks over dedicated fiber; robotic tape libraries (4 and 3), FermiGrid and FermiCloud nodes, disk servers and grid worker nodes attach at both sites. Intermediate-level switches and top-of-rack switches are not shown in the diagram. Deployment completed in June 2012. 25-Sep-2013 S. Timm, OpenNebulaConf 15
  • 17. Distributed Shared File System Design: Dual-port FibreChannel HBA in each node, Two Brocade SAN switches per rack, Brocades linked rack-to-rack with dark fiber, 60TB Nexsan Satabeast in FCC-3 and GCC-B, Red Hat Clustering + CLVM + GFS2 used for the file system, Each VM image is a file in the GFS2 file system, LVM mirroring RAID 1 across buildings. Benefits: Fast launch: almost immediate, as compared to 3-4 minutes with ssh/scp. Live migration: can move virtual machines from one host to another for scheduled maintenance, transparent to users. Persistent data volumes: can move quickly with machines. Can relaunch virtual machines in the surviving building in case of building failure/outage. 25-Sep-2013 S. Timm, OpenNebulaConf 16
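Because each VM image is a file on the shared GFS2 volume, moving a guest between hypervisors for scheduled maintenance reduces to a single scheduler command. A minimal sketch follows, assuming hypothetical VM and host IDs; OpenNebula 3.x exposed this as 'onevm livemigrate', while later releases use 'onevm migrate --live'.

# Minimal sketch: live-migrate a VM to a host in the other building.
# The VM and host IDs are hypothetical.
import subprocess

VM_ID = "42"
TARGET_HOST = "7"

try:
    # Newer OpenNebula CLI form.
    subprocess.check_call(["onevm", "migrate", "--live", VM_ID, TARGET_HOST])
except (OSError, subprocess.CalledProcessError):
    # Older (3.x) CLI form.
    subprocess.check_call(["onevm", "livemigrate", VM_ID, TARGET_HOST])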
  • 18. FermiCloud – Network & SAN “Today” (diagram): private Ethernet and Fibre Channel both run over dedicated fiber between buildings; Nexus 7010 switches in FCC-2, FCC-3, GCC-A and GCC-B; hosts fcl315 to fcl323 in FCC-3 and fcl001 to fcl013 in GCC-B, each building with a pair of Brocade SAN switches and a Satabeast disk array. 25-Sep-2013 S. Timm, OpenNebulaConf 17
  • 19. FermiCloud-HA Head Node Configuration (diagram): two head nodes, fcl001 (GCC-B) and fcl301 (FCC-3), each run ONED/SCHED plus service VMs (fcl-ganglia1/2, fermicloudnis1/2, fermicloudrepo1/2, fclweb1/2, fcl-lvs1/2, fcl-mysql1/2, plus fcl-cobbler, fermicloudlog, fermicloudadmin and fermicloudrsv); the pair is kept in sync with two-way rsync, live migration, multi-master MySQL and CLVM/rgmanager. 25-Sep-2013 S. Timm, OpenNebulaConf 18
  • 20. Cooperative R+D Agreement Partners: • Grid and Cloud Computing Dept. @FNAL • Global Science Experimental Data hub Center @KISTI Project Title: • Integration and Commissioning of a Prototype Federated Cloud for Scientific Workflows Status: • Three major work items: 1. Virtual Infrastructure Automation and Provisioning, 2. Interoperability and Federation of Cloud Resources, 3. High-Throughput Fabric Virtualization. 25-Sep-2013S. Timm, OpenNebulaConf19
  • 21. Virtual Machines as Jobs OpenNebula (like all other open-source IaaS stacks) provides an emulation of Amazon EC2. HTCondor developers added code to their “Amazon EC2” universe to support the X.509-authenticated protocol. Currently testing in bulk; up to 75 VMs OK thus far. Goal: to submit the NOvA workflow to OpenNebula @ FermiCloud, OpenStack @ Notre Dame, and Amazon EC2. Smooth submission of many thousands of VMs is a key step to making the full infrastructure of a site into a science cloud. 25-Sep-2013 S. Timm, OpenNebulaConf 20
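A minimal sketch of the submission path described above, written against HTCondor's standard EC2 grid universe: the endpoint URL, image ID and credential file paths are hypothetical placeholders, and the X.509-specific settings added for FermiCloud are not shown.

# Minimal sketch: submit VMs as jobs through HTCondor's EC2 grid universe,
# pointing at an EC2-compatible endpoint such as an OpenNebula "econe" service.
import subprocess
import tempfile

SUBMIT = """
# Hypothetical econe endpoint, image id and credential files.
universe      = grid
grid_resource = ec2 https://fermicloud.example.gov:8443/
executable    = fermicloud_worker_vm
ec2_ami_id        = ami-00000001
ec2_instance_type = m1.small
ec2_access_key_id     = /path/to/access_key_file
ec2_secret_access_key = /path/to/secret_key_file
queue 25
"""

# Write the submit description to a file and hand it to condor_submit.
with tempfile.NamedTemporaryFile("w", suffix=".sub", delete=False) as f:
    f.write(SUBMIT)
    subfile = f.name

subprocess.check_call(["condor_submit", subfile])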
  • 22. Grid bursting architecture (diagram): a GlideinWMS factory and user frontend feed the FermiGrid site gateway (CDF, D0, CMS and GP clusters) and, through a cloud gate, Amazon EC2, FermiCloud, the KISTI GCLOUD and Notre Dame OpenStack; vcluster reads the job queue and submits VMs as needed. 25-Sep-2013 S. Timm, OpenNebulaConf 21
  • 23. vCluster at SC2012 (screenshot). 25-Sep-2013 S. Timm, OpenNebulaConf 22
  • 26. True Idle VM Detection In times of resource need, we want the ability to suspend or “shelve” idle VMs in order to free up resources for higher priority usage. • This is especially important in the event of constrained resources (e.g. during building or network failure). Shelving of “9x5” and “opportunistic” VMs allows us to use FermiCloud resources for Grid worker node VMs during nights and weekends. • This is part of the draft economic model. Giovanni Franzini (an Italian co-op student) has written (extensible) code for an “Idle VM Probe” that can be used to detect idle virtual machines based on CPU, disk I/O and network I/O. Nick Palombo, consultant, has written the communication system and the collector system to do rule-based actions based on the idle information. 25-Sep-2013 S. Timm, OpenNebulaConf 25
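The shape of such an idle-detection rule can be sketched in a few lines. This is only an illustration of the approach (thresholds on CPU, disk I/O and network I/O sustained over several sampling intervals), not the actual Idle VM Probe code, and the probe callable is a hypothetical stand-in for the real per-VM metric collection.

# Minimal sketch of an idle-detection rule: a VM is "idle" when CPU, disk I/O
# and network I/O all stay below thresholds for several consecutive samples.
import time

CPU_PCT_MAX    = 2.0      # average CPU percent
DISK_BPS_MAX   = 10e3     # disk I/O, bytes/s
NET_BPS_MAX    = 10e3     # network I/O, bytes/s
IDLE_INTERVALS = 12       # consecutive samples required

def is_idle(history):
    """history: list of (cpu_pct, disk_bps, net_bps) samples, newest last."""
    recent = history[-IDLE_INTERVALS:]
    if len(recent) < IDLE_INTERVALS:
        return False
    return all(cpu < CPU_PCT_MAX and disk < DISK_BPS_MAX and net < NET_BPS_MAX
               for cpu, disk, net in recent)

def monitor(vms, probe, interval=300):
    """Collect samples forever; yield lists of VM ids that satisfy the rule.
    probe(vm) is a hypothetical callable returning (cpu_pct, disk_bps, net_bps)."""
    history = {vm: [] for vm in vms}
    while True:
        for vm in vms:
            history[vm].append(probe(vm))
        idle = [vm for vm in vms if is_idle(history[vm])]
        if idle:
            yield idle            # downstream logic decides whether to shelve/suspend
        time.sleep(interval)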
  • 27. Idle VM Information Flow (diagram): on each host an idle data store collects per-VM metrics, and an information manager (IM) reports them to OpenNebula over XML-RPC; an Idle VM Collector builds a raw VM state DB, the Idle VM Logic derives an idle VM list, and an Idle VM Trigger drives shutdown through the Idle VM Management Process. 25-Sep-2013 S. Timm, OpenNebulaConf 26
  • 28. Interoperability and Federation Driver: • Global scientific collaborations such as LHC experiments will have to interoperate across facilities with heterogeneous cloud infrastructure. European efforts: • EGI Cloud Federation Task Force – several institutional clouds (OpenNebula, OpenStack, StratusLab). • HelixNebula – federation of commercial cloud providers. Our goals: • Show proof of principle – federation including FermiCloud + KISTI “G Cloud” + one or more commercial cloud providers + other research institution community clouds if possible. • Participate in existing federations if possible. Core Competency: • The FermiCloud project can contribute to these cloud federations given our expertise in X.509 Authentication and Authorization, and our long experience in grid federation. 25-Sep-2013 S. Timm, OpenNebulaConf 27
  • 29. Virtual Image Formats Different clouds have different virtual machine image formats: • Does the image hold just a file system, or also a partition table, LVM volumes, a kernel? We have identified the differences and written a comprehensive step-by-step user manual, soon to be public. 25-Sep-2013 S. Timm, OpenNebulaConf 28
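One of the mechanical steps behind such a manual is converting an image between on-disk formats. A minimal sketch using qemu-img is shown below, with hypothetical file names; it does not address partition-table, LVM or kernel differences, which need separate handling.

# Minimal sketch: convert a raw image to qcow2 for a cloud that expects it,
# then inspect the result.  File names are hypothetical.
import subprocess

SRC = "worker.raw"
DST = "worker.qcow2"

# qemu-img convert -f <input-format> -O <output-format> <src> <dst>
subprocess.check_call(["qemu-img", "convert", "-f", "raw", "-O", "qcow2", SRC, DST])

# Print format, virtual size, etc. of the converted image.
print(subprocess.check_output(["qemu-img", "info", DST]).decode())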
  • 30. Interoperability/Compatibility of APIs The Amazon EC2 API is not an open standard; it is a moving target that changes frequently. Open-source emulations have varying feature levels and accuracy of implementation: • Compare and contrast OpenNebula, OpenStack, and commercial clouds, • Identify lowest common denominator(s) that work on all. 25-Sep-2013 S. Timm, OpenNebulaConf 29
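One practical way to probe the lowest common denominator is to drive each cloud's EC2-style endpoint with the same client library and compare which calls behave identically. A minimal sketch using the boto 2 EC2Connection follows; the hosts, ports, paths and keys are hypothetical placeholders, not real endpoints.

# Minimal sketch: issue the same EC2-style call against several
# EC2-compatible endpoints and compare the results.
from boto.ec2.connection import EC2Connection

ENDPOINTS = [
    ("fermicloud.example.gov", 8443, "/"),                 # OpenNebula econe (hypothetical)
    ("openstack.example.edu", 8773, "/services/Cloud"),    # OpenStack EC2 API (hypothetical)
]

for host, port, path in ENDPOINTS:
    conn = EC2Connection("ACCESS_KEY", "SECRET_KEY",
                         is_secure=True, host=host, port=port, path=path)
    try:
        images = conn.get_all_images()
        print(host, "get_all_images ->", len(images), "images")
    except Exception as exc:
        # A failure here is itself useful data about API compatibility.
        print(host, "get_all_images failed:", exc)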
  • 31. VM Image Distribution Investigate existing image marketplaces (HEPiX, U. of Victoria). Investigate if we need an Amazon S3-like storage/distribution method for OS images, • OpenNebula doesn’t have one at present, • A GridFTP “door” to the OpenNebula VM library is a possibility; this could be integrated with an automatic security scan workflow using the existing Fermilab NESSUS infrastructure. 25-Sep-2013 S. Timm, OpenNebulaConf 30
  • 32. High-Throughput Fabric Virtualization Followed up earlier virtualized MPI work: • Use it in real scientific workflows, • Now users can define a set of IB machines in OpenNebula on their own, • DAQ system simulation, • Large multicast activity. • Experiments have also been done with virtualized 10 GbE on the 100 Gbit WAN testbed. 25-Sep-2013 S. Timm, OpenNebulaConf 31
  • 33. Security Main areas of cloud security development: Secure Contextualization: • Secrets such as X.509 service certificates and Kerberos keytabs are not stored in virtual machines (See following talk for more details). X.509 Authentication/Authorization: • X.509 Authentication written by T. Hesselroth, code submitted to and accepted by OpenNebula, publicly available since Jan-2012. Security Policy: • A security taskforce met and delivered a report to the Fermilab Computer Security Board, recommending the creation of a new Cloud Computing Environment, now in progress. We also participated in the HEPiX Virtualisation Task Force, • We respectfully disagree with the recommendations regarding VM endorsement. 25-Sep-2013S. Timm, OpenNebulaConf32
  • 34. Pluggable authentication Some assembly required Batteries not included 25-Sep-2013S. Timm, OpenNebulaConf33
  • 35. OpenNebula Authentication OpenNebula came with “pluggable” authentication, but few plugins were initially available. OpenNebula 2.0 web services by default used an access key / secret key mechanism similar to Amazon EC2. No https available. Four ways to access OpenNebula: • Command line tools, • Sunstone Web GUI, • “ECONE” web service emulation of the Amazon RESTful (Query) API, • OCCI web service. The FermiCloud project wrote X.509-based authentication plugins: • Patches to OpenNebula to support this were developed at Fermilab and submitted back to the OpenNebula project in Fall 2011 (generally available in OpenNebula V3.2 onwards). • X.509 plugins available for command line and for web services authentication. 25-Sep-2013 S. Timm, OpenNebulaConf 34
  • 36. X.509 Authentication—how it works • Command line: • User creates an X.509-based token using the “oneuser login” command • This makes a base64 hash of the user’s proxy and certificate chain, combined with a username:expiration date, signed with the user’s private key • Web Services: • Web services daemon contacts OpenNebula XML-RPC core on the users’ behalf, using the host certificate to sign the authentication token. • Use Apache mod_proxy to pass the grid certificate DN to web services. • Limitations: • With Web services, one DN can map to only one user. 25-Sep-2013 S. Timm, OpenNebulaConf 35
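A schematic illustration of that token construction follows, assuming hypothetical file paths and using the Python 'cryptography' package in place of the plugin's actual Ruby/OpenSSL code; the exact token layout used by the OpenNebula x509 driver is not reproduced here.

# Schematic illustration only: sign "username:expiration" with the user's
# private key and bundle it with the proxy/certificate chain, base64-encoded.
# Paths and the field separator are hypothetical.
import base64
import time
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

USER = "stimm"                                   # hypothetical cloud username
EXPIRATION = int(time.time()) + 36000            # token lifetime in seconds

with open("/tmp/x509up_u500", "rb") as f:        # user proxy + cert chain (hypothetical path)
    cert_chain = f.read()
with open("/home/stimm/.globus/userkey.pem", "rb") as f:
    key = serialization.load_pem_private_key(f.read(), password=None)

text = "{}:{}".format(USER, EXPIRATION).encode()
signature = key.sign(text, padding.PKCS1v15(), hashes.SHA256())

# Bundle the signed text with the certificate chain and base64-encode the result.
token = base64.b64encode(text + b"|" + signature + b"|" + cert_chain)
print(token[:60], b"...")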
  • 37. Grid AuthZ Interoperability Protocol • Uses XACML 2.0 to specify • DN, CA, hostname, FQAN, FQAN signing entity, and more. • Developed in 2007; has been used in the Open Science Grid and other grids. • Java and C bindings available for the client. • The most commonly used C binding is LCMAPS. • Used to talk to GUMS, SAZ, and others. • Allows one user to be part of different Virtual Organizations and have different groups and roles. • For cloud authorization we will configure GUMS to map back to individual user names, one per person. • Each personal account in OpenNebula is created in advance. 25-Sep-2013 S. Timm, OpenNebulaConf 36
  • 38. “Authorization” in OpenNebula • Note: OpenNebula has pluggable “Authorization” modules as well. • These control access ACLs—namely which user can launch a virtual machine, create a network, store an image, etc. • Not related to the grid-based notion of authorization at all. • Instead we make our “Authorization” additions to the Authentication routines of OpenNebula. 25-Sep-2013 S. Timm, OpenNebulaConf 37
  • 39. X.509 Authorization • OpenNebula authorization plugins written in Ruby. • Use existing Grid routines to call out to external GUMS and SAZ authorization servers: • use a Ruby-C binding to call C-based routines for LCMAPS, or • use a Ruby-Java bridge to call Java-based routines from the Privilege project. • GUMS returns uid/gid, SAZ returns yes/no. • Works with the OpenNebula command line and non-interactive web services. • Much effort was spent in trying to send user credentials with extended attributes through the web browser. • Currently the ruby-java-bridge setup works for the CLI. • For Sunstone we have shifted to having the callout to VOMS done on the server side. • We are always interested in talking to anyone who is doing X.509 authentication in any cloud. 25-Sep-2013 S. Timm, OpenNebulaConf 38
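The decision flow those plugins implement can be summarized in a few lines; the helpers below are hypothetical stubs standing in for the real GUMS and SAZ client calls, which in production go through LCMAPS or the Privilege-project Java bindings rather than Python.

# Minimal sketch of the authorization flow: SAZ gives a site-wide yes/no,
# GUMS maps the credential to a pre-created personal account.
def saz_allows(dn, fqan):
    """Ask SAZ (Site AuthoriZation) for a yes/no decision.  Hypothetical stub."""
    raise NotImplementedError

def gums_map(dn, fqan):
    """Ask GUMS for a (uid, gid) mapping for this DN/FQAN.  Hypothetical stub."""
    raise NotImplementedError

def authorize(dn, fqan):
    """Return the local (uid, gid) to use, or None if the user is banned/unmapped."""
    if not saz_allows(dn, fqan):      # site ban list is checked first
        return None
    return gums_map(dn, fqan)         # one personal account per person, created in advance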
  • 40. Reframing Cloud Discussion Purpose of Infrastructure-as-a-Service: on demand only? No—a whole new way to think about IT infrastructure, both internal and external. The cloud API is just a part of rethinking IT infrastructure for data-intensive science (and MIS). It is only as good as the hardware and software it’s built on: network fabric, storage, and applications are all crucial. Buy or build? Both! Will always need some in-house capacity. Performance hit? Most can be traced to badly written applications or a misconfigured OS. 25-Sep-2013 S. Timm, OpenNebulaConf 39
  • 41. FermiCloud Project Summary - 1 Science is directly and indirectly benefiting from FermiCloud: • CDF, D0, Intensity Frontier, Cosmic Frontier, CMS, ATLAS, Open Science Grid,… FermiCloud operates at the forefront of delivering cloud computing capabilities to support scientific research: • By starting small, developing a list of requirements, building on existing Grid knowledge and infrastructure to address those requirements, FermiCloud has managed to deliver a production class Infrastructure as a Service cloud computing capability that supports science at Fermilab. • FermiCloud has provided FermiGrid with an infrastructure that has allowed us to test Grid middleware at production scale prior to deployment. • The Open Science Grid software team used FermiCloud resources to support their RPM “refactoring” and is currently using it to support their ongoing middleware development/integration. 25-Sep-2013 S. Timm, OpenNebulaConf 40
  • 42. FermiCloud Project Summary The FermiCloud collaboration with KISTI has leveraged the resources and expertise of both institutions to achieve significant benefits. vCluster has demonstrated proof-of-principle “Grid Bursting” using FermiCloud and Amazon EC2 resources. Using SR-IOV drivers on FermiCloud virtual machines, MPI performance has been demonstrated to be >96% of the native “bare metal” performance. The future is mostly cloudy. 25-Sep-2013 S. Timm, OpenNebulaConf 41
  • 43. Acknowledgements None of this work could have been accomplished without: • The excellent support from other departments of the Fermilab Computing Sector – including Computing Facilities, Site Networking, and Logistics. • The excellent collaboration with the open source communities – especially Scientific Linux and OpenNebula, • As well as the excellent collaboration and contributions from KISTI. • And talented summer students from Illinois Institute of Technology 25-Sep-2013S. Timm, OpenNebulaConf42