Semantic Cloud Governance
1.
2. What is the Governance challenge?
How do we intelligently create, publish, search, discover, consume, manage, meter, monitor, govern, and report on SOA services
such that they deliver superior value to organizations?
How can we deploy semantic technologies to solve this problem effectively?
3. Why Governance? The negative impact of a lack of governance grows with scale and complexity: Individual → Home → Village → City → Metropolis → Megapolis.
4. Why Governance? The same curve applies to software: Application Services → Enterprise Services → Inter-Enterprise Services → Cloud Services → Inter-Cloud Gateways. The negative impact of a lack of governance grows with scale and complexity.
5–10. What is the scale of the systems we are dealing with today?
bundle → deploy → aggregate → consume → Scale
19. What are the tools needed to govern Scale & Complexity?
Automation
Machine Learning
Situation Detection
Collective + Context + Inference + Intelligence
Semantic Governance Repository
21. Semantic Service Repository
[Diagram: the Semantic Service Repository model. A service has versions; each version depends on components, documentation, configuration, metadata, and a deployment plan; is governed by policies; belongs to and is tagged by taxonomies and folksonomies (with taxonomy–folksonomy reconciliation); and is rated and ranked by users.]
22. Definition: Service Reputation
• Service Reputation = f(User Rating, User Ranking, Compliance, Verity)
– User rating = star-style user reviews (collective intelligence, subjective user perception)
– User ranking = relative ordering used to break ties among equally rated services (collective intelligence, subjective user perception)
– Compliance = degree of conformance to policies, guidelines, and standards (run time)
– Verity = degree of consistency exhibited by the service provider in delivering the quality levels laid out in the service contract, across a range of previous transactions (run time)
23. What does a Core Governance Information Model look like?
24. Where does Machine Learning fit in?
(Service Auto-Classification)
[Diagram: on publish of a service, on tagging of a service, and on ranking & rating of a service, the service auto-classifier (linguistics via UIMA plus a neural-net or text-SVM engine) reads the service ontology (service descriptions and other textual attributes) and the service taxonomies from the registry, and writes auto-classified services back to the registry.]
26. What is the CEP & Semantics connection?
Potential Situations
27. What is the BPM, CEP & Semantics connection?
On Service creation alert
On SLA violation alert
On Error prediction alert
Potential Situations
28. What is the BPM, BAM, CEP & Semantics connection?
29. What is the BPM, BI, CEP & Semantics connection?
30. What is the Semantic Search connection?
The Ontology is the basis for the Search.
Registry & Repository facets: Discovery (D), Description (D), Contracts (R), Policies (R), Versioning (R), Code versions (D), Documentation (D), Query message store (R), Logging (R), Auditing (R)
32. Summary
We can intelligently create, publish, search, discover, consume, manage, meter, monitor, govern, and report on SOA services to deliver superior value.
Semantic technologies go a long way toward solving this problem effectively!
33. Conclusion
Need: a highly scalable, performant, and robust semantically enabled platform to propel cloud-based enterprises.
Opportunity: to drive mainstream adoption of semantics by incrementally enabling semantic intelligence in components that are widespread in today's enterprise.
Need and opportunity: an open-source platform to do all of this, leveraging existing open-source components …. volunteers?
34. Acknowledgements
•Keshava Rangarajan, Chief Architect, Halliburton
•Jayesh Shah, Vice President, Product Development, Oracle America, Inc.
•Rajesh Raheja, Senior Director, Applications Development, Oracle America, Inc.
•Harry Teunissen, Senior Technical Advisor, Halliburton (Landmark)
•Nagaraj Srinivasan, VP R&D, Halliburton (Landmark)
•Chaminda Peries, Principal Architect, Halliburton (Landmark)
•Bhagat Nainani, Vice President, Development, Oracle America, Inc.
•Manoj Das, Senior Director, Product Management, Oracle America, Inc.
•Payal Srivastava, Senior Principal Product Manager, Oracle America, Inc.
•Ashish Pathak, Director, Product Management, Oracle America, Inc.
•Veshaal Singh, Senior Director, ATG Development, Oracle Corporation
•Rajesh Ghosh, Group Manager, ATG Development, Oracle Corporation
•Alexandre Alves, Consulting Member of Technical Staff, Oracle America, Inc.
•Meera Srinivasan, Senior Principal Product Manager, Oracle America, Inc.
Editor's Notes
How do we intelligently create, publish, search, discover, consume, manage, meter, monitor, govern, and report on SOA services such that they deliver superior value to organizations? Also, in this discussion, we are interested in finding out how we can deploy semantic technologies to solve this problem effectively.
(Christopher Alexander and the megalopolis model.) On the X-axis we have scale + complexity; on the Y-axis, the negative impact of a lack of governance. This is an inverse flip of the way we usually look at the problem. Let's start with the individual: does lack of governance really matter? Maybe not. As we move on to home, village, city, and so on, the negative impact of a lack of governance grows. A metropolis is a collective set of residences in cities; a megalopolis is a collective set of metropolises, loosely connected to one another. As you can see, the negative impact of a lack of governance grows from left to right.
When we talk about applications, the same metaphor applies. We have applications that are free-form, floating around; a lot of enterprise applications are that way. We have services, which are clusters of functionality exposed out of these applications; you need a little more control there, otherwise you end up with a pile of services. Then you have enterprise services, which are leveraged across multiple business units; you have to worry about governance quite a bit there. When you are looking at B2B/inter-enterprise services, there is all the more reason, because partner agreements depend on the kind of SLAs you have agreed upon. When we move to the cloud, since metering and monitoring, charged by functionality, are all integrated into the cloud platform, governance is again key. And finally, when we move to the new wave of inter-cloud gateways, it is again a non-linear scale problem.
Before we delve into the topic, I'd like to set some context. In large-scale systems there are four key dimensions of interest: performance, scale, adaptability, and security. Those are the things I look for in a system, at least. We'll start with scale: what is the size of the systems we deal with today? Everybody is familiar with the component-oriented approach; we have listed some components here, such as Java components and .NET components. We also have an exploding set of languages, a sample of which is mentioned here: Ruby, JavaScript, etc., along with .NET assemblies, Java JARs, and the like.
All of them lead up to the next tier: the service tier. We bind and deploy all these components into the service tier, which is typically a web server. We have REST services, SOAP services, and so on. SOAP has traditionally been used in the enterprise, and we are seeing a transition toward REST services.
We then move on to the Infrastructure/Platform/SaaS tier, which is the big boom and the big wave nowadays. We deploy out to Amazon (AWS), Eucalyptus, Google, Azure, Salesforce.com (SFDC), and private clouds (where security is a big concern).
And finally, we move on to the cloud services gateway tier, which is really exciting. This is a way of aggregating, securing, and providing a virtualized entry point into the various cloud entities on a platform.
Then we have the service consumer tier: mobile clients (which are exploding exponentially), web clients, and desktop clients.
You can see that when we look at scale, it is a non-linear function as we move from bottom to top. At every level, the impact of scale (even though we have stayed away from exact measures of scale) is an exploding problem in terms of size and complexity.
Complexity: complexity itself is a very interesting topic. There are actually several measures of complexity; in our case we are taking just one aspect, inter-dependency, and giving a sample of the types of inter-dependency. If you look at the service tier, you have dependencies from the service tier onto the actual components.
We move up to the next tier, the platform tier. Here we have Google (which is more of a PaaS than an IaaS), which has dependencies onto the actual service clusters that have been deployed.
We move up to the gateway tier. Given that the reason we go to a gateway is reliability (so that we can flip over from one cloud service provider to another seamlessly, whether for bursting or for other reasons), dependency is a big, big aspect of complexity.
And finally, when we look at the end clients, who are using these services, we are relying on them to actually perform the functionalities.
Inter-dependency: and that's again a non-linear issue! Increasing amounts of complexity are involved at every level.
Let's look at another dimension: design-time complexity. This is all about how you take your IDE-based work product, put it out into your web tier, bundle it up into an AMI, put it in Amazon, deploy it, and get it up from there into these router-type boxes, which point to virtual endpoints that could move across to AWS or Eucalyptus, etc. There is a lot of complexity involved here too!
Let's look at another dimension: deploy-time complexity. As explained on the previous slide, here we just bundle it up and deploy it into the cloud.
Let's look at another dimension: run-time complexity. Once the system is up and running, we have SLAs and other metrics that need to be monitored.
Given that the volume is enormous, we need extreme automation. The closest answer to extreme automation in the enterprise world is BPM and related technologies like workflow, with BPEL a layer below that. Machine learning supports predictive insight into things that are going to happen, and making changes, almost like control-system behavior; data mining holds the key there. We want to be able to detect situations (situation detection): situations that are of interest, that are contextual, that cater to our needs. Not everything: we want to follow the 80/20 rule and focus on the most important 20%. CEP has some of the answer there. Finally, there is what we call the quartet: collective, context, inference, and intelligence. We will look at how to bring them together; semantics holds the key there.
Though there are various ways of classifying, for this discussion we stick to the following classification. There are two types of primitives: entity primitives and activity primitives. Entity primitives are for entities that need to be governed; activity primitives are for activities performed on those entities, which also need to be governed. Entity primitives comprise services; their dependent artifacts (docs, metadata, configuration, deployment plans, and so on); their SLAs, which are defined by time, money, and quality; compliance rules, on which SLA calculations are based; governance policies, which are used to govern services; taxonomy (a structured classification of services); and folksonomy (the unstructured, free-form user-tagged descriptions of a service). Activity primitives are the activities (refer to the list on the previous slide) performed on the entities.
Semantic Service Repository. This is the foundation piece for the rest of this conversation. The Semantic Service Repository has two components: a dynamic component and a static component. The dynamic component is the inferencing engine (e.g., Jena), which uses inferencing rules that act on the static components of the repository. The static components are typically all of the entities referred to on the previous slide: services, service artifacts, taxonomy, folksonomy, etc. The repository persists all of the entity primitives. Example: consider a typical service, say GetCustomerInformation. This service has three versions: V1, V2, and V3. Take V3. V3 depends on all of the components that implement the service, all the docs that describe it, all of the configuration that drives its behavior, the metadata that describes it, the deployment plan that determines its deployment, and the governance model, policy, and rules that govern it. The service can be associated with multiple taxonomies, is further user-tagged by folksonomies, and has a rating, a ranking, and a reputation.
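The static side of the repository, the entities and their links, can be sketched as a minimal in-memory triple store. The predicates and the GetCustomerInformation example follow the slide, but the class and its API are an illustrative stand-in for a real RDF store such as Jena, not the actual implementation:

```python
class TripleStore:
    """Minimal subject-predicate-object store, a toy stand-in for an RDF repository."""

    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def objects(self, s, p):
        """All objects linked from subject s via predicate p."""
        return {o for (s2, p2, o) in self.triples if s2 == s and p2 == p}

repo = TripleStore()
svc = "GetCustomerInformation/V3"
repo.add("GetCustomerInformation", "hasVersion", svc)
for artifact in ["docs", "configuration", "metadata", "deployment-plan"]:
    repo.add(svc, "dependsOn", artifact)
repo.add(svc, "isGovernedBy", "governance-policy")
repo.add(svc, "isTaggedBy", "crm-folksonomy")

# Everything V3 depends on, in one query:
print(sorted(repo.objects(svc, "dependsOn")))
# → ['configuration', 'deployment-plan', 'docs', 'metadata']
```

An inferencing engine would then run rules over these same triples (for example, propagating a policy violation from a dependency up to the service).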
One of the differences between user rating and user ranking is that a service can be rated 2-star or 3-star; if you want to break the tie among equally rated services, ranking is used.
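As a toy illustration of the definition on slide 22, here is one possible shape for the reputation function. The talk only states Reputation = f(rating, ranking, compliance, verity); the equal weighting and the 0..1 normalization of each input are assumptions made here for the sketch:

```python
def service_reputation(rating, ranking, compliance, verity,
                       weights=(0.25, 0.25, 0.25, 0.25)):
    """Combine the four reputation inputs, each normalized to 0..1,
    into a single score. Equal weights are an illustrative assumption."""
    w1, w2, w3, w4 = weights
    return w1 * rating + w2 * ranking + w3 * compliance + w4 * verity

# A 4-star-out-of-5 service (0.8) ranked first among its peers (1.0),
# 90% compliant, with SLAs met in 95% of past transactions:
score = service_reputation(0.8, 1.0, 0.9, 0.95)
print(round(score, 4))  # → 0.9125
```

In practice the weights would be a governance decision, and verity would be computed from the run-time transaction history rather than supplied directly.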
When we are looking at an information model, especially the Governance Information Model, we are looking at the service offering and its service components (all the attributes and service dependencies). There are governance contracts, and there are governance clauses, which are clauses of governance values. Then there are the governance aspects, which tell us how to interpret governance clauses and how to measure, manage, meter, and monitor them. Those are the key elements of the Governance Information Model, and it is the basis on which the inference engine works.
We will talk about auto-classification and the place for machine learning. In a cloud-based environment we have a huge number of services. Upon publishing, tagging, or rating and ranking of a particular service by a user, the positioning of the service within a formal taxonomy may have to change dynamically, and that is not something a person can do manually in real time; we need an automated way of doing it. The services themselves have descriptions, and when a tagging activity happens or a review is written, there is textual information. We could use UIMA to pick out the textual tokens, break them into attributes, and do named-entity recognition, then bring in a trained SVM engine that works on a model, picks up the service descriptions and attributes from the governance model, tags the service, and positions it appropriately in the taxonomy. Two flavors are available, a neural-net engine and an SVM, and they have comparable performance. The bottom line: we took in the service taxonomy and the service ontology that describes the entire governance model as well as the service descriptions; you can run them through a neural-net engine and tag things, so that as and when a new service is introduced, it is positioned appropriately in the taxonomy, dynamically.
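To give a rough feel for the auto-classification step, here is a toy bag-of-words nearest-centroid classifier standing in for the UIMA + SVM/neural-net pipeline described above. The taxonomy nodes and the training snippets are invented for illustration; a real deployment would train on the full governance model and service descriptions:

```python
from collections import Counter
import math

def tokens(text):
    """Toy tokenizer: lowercase, whitespace split, term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented taxonomy nodes with seed vocabulary per node.
taxonomy = {
    "customer-services": tokens("customer account contact profile information"),
    "billing-services": tokens("invoice payment billing charge meter"),
}

def classify(description):
    """Place a service description under the closest taxonomy node."""
    return max(taxonomy, key=lambda node: cosine(tokens(description), taxonomy[node]))

print(classify("returns customer profile and contact information"))
# → customer-services
```

On publish, tag, or rating events, the same `classify` call would be re-run so the service's taxonomy position stays current.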
One thing that happens very often is that lots of events are generated as a service is deployed and used, and events are generated at multiple levels of the platform. We are looking to create aggregate information in real time. These are all activity streams: whenever a service is used or some particular activity happens, log-based, timestamped events go into the CEP engine. We are looking for correlations and for causal behavior (i.e., when a service fails, usually a dependent service goes down as well); there is a causal effect that we can analyze through CQL. We have continuous queries running over time windows, looking at behaviors, patterns, and portions of the data; from that we can get an idea of when things could actually fail.
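The failure-causality pattern described here can be sketched as a sliding-window check over an event stream. A real CEP engine would express this as a continuous CQL query over live streams; the event tuples, service names, and window size below are assumptions for illustration:

```python
WINDOW = 5  # seconds; illustrative correlation window

events = [  # (timestamp, service, event_type) - invented sample stream
    (100, "OrderService", "FAILURE"),
    (102, "ShippingService", "FAILURE"),
    (230, "OrderService", "FAILURE"),
    (260, "ShippingService", "FAILURE"),
]

def correlated_failures(stream, cause, effect, window):
    """Yield (t_cause, t_effect) pairs where `effect` fails within
    `window` seconds after `cause` fails, suggesting a causal link."""
    cause_times = [t for t, s, e in stream if s == cause and e == "FAILURE"]
    for t, s, e in stream:
        if s == effect and e == "FAILURE":
            for tc in cause_times:
                if 0 <= t - tc <= window:
                    yield (tc, t)

print(list(correlated_failures(events, "OrderService", "ShippingService", WINDOW)))
# → [(100, 102)]  (the 230/260 pair falls outside the window)
```

The output pairs are the raw material for the "potential situations" on the next slide: repeated co-occurrence within the window is evidence of a dependency worth alerting on.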
Based on that, we are able to interpret this along with a context and event ontology. For example: at a particular point in the day, if the SLA numbers are below a certain threshold, that is a situation, whereas at peak time it is not. So we need to bring contextual intelligence into standard CEP engines (this does not come out of the box). We need to emit potential situations: situations that are predictive in nature and probabilistic, with a certain probability associated with them. These potential situations can feed into downstream systems.
The potential situations detected by the CEP engine are queued into the BPM server. These situations act as trigger events that drive a BPM process flow; this is the basis of EDBPM. As part of the business process flow, workflow events and notifications may be triggered that can then be consumed by the application platform. The EDBPM process is further enriched by the participation of semantics during its execution: a situation-and-context ontology is loaded and inferred upon by the inferencing engine to enable contextually sensitive process flows. This inferencing activity could be a decision node.
Let's talk about BAM. We know that events are coming out, with ingress and egress points for all of them. We need to be able to run queries on that and, at various time slices, look at consolidated metrics, for example SLAs by time of day, or SLAs by business unit at a given time slice. We need to be able to compare them against the contracts that have been set and the terms and conditions of those contracts. The BAM dashboard gives you good insight into that. The only difference is that we can inject semantics into it: when you are looking at the BAM engine, at a particular SLA, you need to be able to inject context. Example use case: suppose a Gold customer is using a service of interest; suppose the expected SLA for the service at a particular time of day is X; and suppose the service has exceeded it by, say, 20%. Then you would like to trigger a BPM process flow that alerts the customer sales representative who works with the account, so he can talk to the customer, say "we just missed the SLA terms," and oil the wheels of the relationship. The bottom line is that there is an opportunity to inject semantics into the standard enterprise BAM that is available, and do a better job than what BAM dashboards do currently!
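The Gold-customer use case can be sketched as a contextual SLA check. The peak-hour relaxation factor, the 20% breach margin, and the alert strings are illustrative assumptions, not stated policy; the point is only that the same latency number is or is not a "situation" depending on context:

```python
def sla_situation(latency_ms, sla_ms, customer_tier, peak_hour):
    """Return an alert action when a contextual SLA breach warrants one,
    or None when the latency is acceptable in context."""
    if peak_hour:
        sla_ms *= 1.5  # relaxed threshold during peak load (assumed policy)
    if latency_ms <= sla_ms:
        return None  # within contextual SLA: not a situation
    breach = (latency_ms - sla_ms) / sla_ms
    if customer_tier == "gold" and breach >= 0.2:
        return "notify account representative"
    return "log for review"

# Same measurement, different context, different outcome:
print(sla_situation(600, 500, "gold", peak_hour=False))  # → notify account representative
print(sla_situation(600, 500, "gold", peak_hour=True))   # → None
```

The first call would feed the BPM flow that alerts the sales representative; the second is suppressed because peak-time context changes what counts as a breach.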
All these events are being stored; it is information over a large time scale. We can mine it using a mining engine, get patterns from it, generate reports, look at dashboards, and use that for process improvements.
Let's talk about something that everybody talks about: semantic search. Remember, earlier we talked about a key anchor point, the active Semantic Service Repository. It has the model of everything persisted in it: services, service artifacts, dependencies, all the related entities, and other services too, and these are a prime setup for search. What we call the repository is actually a combination of a registry and a repository. We can look at services from a discovery perspective, do description-based search, or search based on contracts, policies, versions of services, and so on. There are many variations of what a service itself is. In addition, there are versions of the dependent code, the documentation, messages that are queued depending on the use case, logging, auditing, etc. Likewise, there is a whole set of service attributes that are perfectly set up for search.
Let's look at a classic application of semantic search on the governance repository. Here is the architecture diagram of a system we actually deployed. On the left-hand side we have the SOA registry and the taxonomy; the Jena DB adapter pulls it all in. From the presentation layer we were able to take a search string, do named-entity recognition, generate the SPARQL queries (triples, essentially), and fire them at the template-based SPARQL generation utility, which does the actual search on the semantic layer, pulls the information out, and gives it back to the user.
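The template-based SPARQL generation step might look like the sketch below, assuming named-entity recognition has already produced a predicate/value pair from the search string. The namespace prefix, the template shape, and the predicate name are all illustrative assumptions, not the deployed system's actual schema:

```python
# Illustrative SELECT template; a deployed system would hold a library of these.
TEMPLATE = """PREFIX gov: <http://example.org/governance#>
SELECT ?service WHERE {{
  ?service gov:{predicate} "{value}" .
}}"""

def build_query(predicate, value):
    """Fill the SELECT template with a recognized predicate/value pair."""
    return TEMPLATE.format(predicate=predicate, value=value)

# A search like "services tagged customer" becomes, after entity recognition:
query = build_query("isTaggedBy", "customer")
print(query)
```

The generated query string would then be handed to the SPARQL engine running over the Jena-backed model, and the bindings for `?service` returned to the presentation layer.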
We can intelligently create, publish, search, discover, consume, manage, meter, monitor, govern, and report on SOA services to deliver superior value. Semantic technologies go a long way toward solving this problem effectively! In the enterprise, we would like to do this in an incremental fashion (not a big-bang "let's all move to semantics" effort!). This is, in our opinion, the way to go about bringing semantics into the enterprise mainstream with quality.
For every one of those enterprise components, there is an open-source equivalent available. What we are working on is integrating them all together and doing a demo. If there are volunteers interested in joining the effort, they are most welcome!