The document discusses using cloud computing for global risk modeling and earthquake modeling. It describes challenges like working with large, sensitive datasets from different sources and ensuring calculations and results are verifiable. The solution proposed is using an open source cloud platform that can elastically scale to thousands of nodes to run complex modeling as a repeatable service. Lessons discussed include the importance of full data sharing and open science principles to advance global modeling efforts.
The Power of the Cloud, and Global Risk Modelling in the Open
1. There is a lot that happens around
the world we cannot control. We
cannot stop earthquakes, we cannot
prevent droughts, and we cannot
prevent all conflict, but when we
know where the hungry, the homeless
and the sick exist, then we can
help.
- Jan Schakowsky
2.
3.
4. THE POWER OF THE CLOUD, AND
GLOBAL RISK MODELING IN THE OPEN
5. Me: Joshua McKenty
Twitter: @jmckenty
Email: joshua@pistoncloud.com
IT Architect, Global Earthquake Model
Chief Architect, NASA Nebula
Founding Member, OpenStack
OpenStack Project Policy Board
CEO, Piston Cloud Computing
9. "The GEM Foundation has set itself out to
engage a global community in the design,
development and deployment of state-of-
the-art models and tools for earthquake risk
assessment worldwide."
15. Challenges
» Complex calculations on large, federated
data sets
» Data is sensitive, sometimes secret,
often proprietary
» Process and calculations used need to
be certified and verifiable
» Results need to be public
17. Solution: Cloud
» Built to scale elastically to the extents of
the infrastructure (1-10,000 nodes)
» Run as a service (OpenGEM)
» Open source
» … and repeatable.
22. "International programs for
global change research and
environmental monitoring
crucially depend on the principle
of full and open data exchange"
- On the Full and Open Exchange of Scientific Data (A publication
of the Committee on Geophysical and Environmental Data -
National Research Council) 1995
23. …the Internet was conceived as a
communication mechanism for the
dissemination of ideas and as a
means to support distributed
collaboration.
- Open Source Software Development and
Distributed Innovation, Bruce Kogut and Anca
Metiu, April 2001
24. "...without access to the source for
the programs we use... (i.e. when
simulation codes or parameter files
are proprietary or are hidden by their
owners), numerical
experimentation isn't even
science. Science has to be
'verifiable in practice' as well as
'verifiable in principle'"
26. "...Michael Faraday's advice to his junior
colleague to: 'Work. Finish. Publish.' needs to
be revised. It shouldn't be enough to publish a
paper anymore. If we want open science to
flourish, we should raise our expectations
to: 'Work. Finish. Publish. Release.' That is,
your research shouldn't be considered complete
until the data and meta-data is put up on the
web for other people to use, until the code is
documented and released…"
- Dan Gezelter
How many of you have been in an earthquake? How many of you died? The difference between an earthquake killing 40% of the population, and an earthquake killing 0.01% of the population, is in our understanding of risk.
We're all clouds
Who am I - I'm a one-trick pony (but it's a hell of a trick) - Open Source, distributed systems at scale - Data analytics (Netscape, AOL) - NASA and OpenStack
Where we're headed: What is GEM? How is this Cloud? What did we learn?
What is GEM - Global Earthquake Modelling - A social problem, a technical problem, and a user experience problem
Risk modeling is the same in all fields - financial risk, political risk, disaster risk. But when it really matters, it sharpens the mind.
We wanted to move from the old model (static products)…
to the new model (dynamic systems).
To answer "Why Cloud?", we have to look at what makes GEM hard…
Monte Carlo - 10,000 realizations for a decent result
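A rough illustration of why the realization count matters. This is a toy Monte Carlo loss estimate, not GEM's actual calculation: the loss model (`simulate_loss`), its probabilities, and all function names are hypothetical. It only shows that the standard error of the estimate shrinks roughly as 1/sqrt(N), so 10,000 realizations give about a tenth of the noise of 100.

```python
import math
import random
import statistics


def simulate_loss(rng):
    """One hypothetical realization: loss fraction for a single site.

    Toy assumption: a damaging event occurs with probability 0.1 and,
    when it does, the loss fraction is uniform on [0, 1).
    """
    if rng.random() < 0.1:
        return rng.random()
    return 0.0


def estimate_mean_loss(n_realizations, seed=0):
    """Return (mean loss, standard error) over n_realizations samples."""
    rng = random.Random(seed)
    losses = [simulate_loss(rng) for _ in range(n_realizations)]
    mean = statistics.fmean(losses)
    std_err = statistics.stdev(losses) / math.sqrt(n_realizations)
    return mean, std_err


if __name__ == "__main__":
    for n in (100, 10_000):
        mean, err = estimate_mean_loss(n)
        print(f"N={n:>6}: mean loss = {mean:.4f} +/- {err:.4f}")
```

Each realization is independent of the others, which is what lets this kind of workload spread across many nodes.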
All of the challenges of a web application at scale, plus - repeatability
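One way to make "repeatability" concrete, sketched under assumptions: if a run is fully determined by its inputs and a random seed, anyone can re-run it and compare a digest of the results. The names `run_model`, `run_fingerprint`, and `run_and_fingerprint`, the input fields, and the toy model itself are hypothetical and not part of OpenGEM.

```python
import hashlib
import json
import random


def run_model(inputs, seed):
    """Toy stand-in for a calculation; seeded, so it is deterministic."""
    rng = random.Random(seed)
    return [
        round(inputs["scale"] * rng.random(), 6)
        for _ in range(inputs["n_realizations"])
    ]


def run_fingerprint(inputs, seed, results):
    """SHA-256 digest over inputs, seed, and results as canonical JSON."""
    payload = json.dumps(
        {"inputs": inputs, "seed": seed, "results": results},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_and_fingerprint(inputs, seed):
    """Run the toy model and return the digest a third party could check."""
    results = run_model(inputs, seed)
    return run_fingerprint(inputs, seed, results)


if __name__ == "__main__":
    inputs = {"scale": 2.0, "n_realizations": 5}
    print(run_and_fingerprint(inputs, seed=42))
```

Two runs with the same inputs and seed should print the same digest. In practice, repeatability across machines also depends on pinning software versions, which this sketch does not address.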
http://en.wikipedia.org/wiki/Open_science_data
http://knowledge-stage.wharton.upenn.edu/papers/1252.pdf Open Source Software Development and Distributed Innovation
Everyone has a big enough computer. Everyone can reproduce the results. Everyone can use the same methods with additional datasets.
http://www.openscience.org/blog/?p=269 (July 28, 2009, by Dan Gezelter): "…and until the comments start coming in to your blog post announcing the paper. If our general expectations of what it means to complete a project are raised to this level, the scientific community will start doing these activities as a matter of course."
There was a point when the internet was no longer a tool for doing things BETTER; it was a tool for doing entirely NEW THINGS. CLOUD is reaching that point. And I believe that 50 years from now, we'll be able to look back and say that OpenQuake was one of those things.