8. Project Halo’s Focus Areas
• AURA: Automated User-Centered Reasoning and Acquisition System
  – Text book you can talk to
• SILK: Semantic Inference with Large Knowledge-base
  – Non-monotonic rule system / RIF
• SMW+: Semantic MediaWiki +
  – Knowledge authoring with SMEs
Plus other related semantic technologies and commercial efforts
11. A Key Feature of Wiki
This distinguishes wikis from other publication tools
12. Consensus in Wikis Comes from Collaboration
– ~17 edits/page on average in Wikipedia (with high variance)
– Wikipedia’s Neutral Point of View Convention
– Users follow customs and conventions to engage with articles effectively
13. Software Support Makes Wikis Successful
Trivial to edit by anyone
Tracking of all changes, one-step rollback
Every article has a “Talk” page for discussion
Notification facility allows anyone to “watch” an article
Sufficient security on pages, logins can be required
A hierarchy of administrators, gardeners, and editors
Software bots recognize certain kinds of vandalism and auto-revert, or recognize articles that need work and flag them for editors
14. How about Deep Info?
Wikipedia has articles about…
• … all cities, with info on their populations, locations, skyscrapers, etc.
• … all German cars, with engine size, acceleration data…
Can you find:
• Skyscrapers with 50+ floors, built after 2000, in Shanghai (or in Chinese cities with 1,000,000+ people)?
• German (Porsche) cars that accelerate from 0-100 km/h in 5 seconds?
24. Static Lists, Tables, …: Not Usable Enough
http://en.wikipedia.org/wiki/List_of_lists_about_Oregon
25. To Find More Info
• All Porsche vehicles made in Germany that accelerate from 0-100 km/h in less than 4 seconds
• Sci-Fi movies made after year 2000 that cost less than $10M and gross more than $30M
• A map showing where all Mercedes-Benz vehicles are manufactured
• All skyscrapers in China (Japan, Thailand, …) of 50 (40/60/70) floors or more, built in year 2000 (2001/2002) or after, sorted by year built, floors, …, grouped by cities, regions, …
• And many more
26. What is a Semantic Wiki
A wiki that has an underlying model of the
knowledge described in its pages.
To allow users to make their knowledge explicit and formal
Semantic Web Compatible
Semantic Wiki
29. Basics of Semantic Wikis
Still a wiki, with regular wiki features
– Category/Tags, Namespaces, Title, Versioning, ...
Typed Content (built-ins + user created, e.g. categories)
– Page/Card, Date, Number, URL/Email, String, …
Typed Links (e.g. properties)
– “capital_of”, “contains”, “born_in”…
Querying Interface Support
– E.g. “[[Category:Member]] [[Age::<30]]” (in SMW)
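A minimal Python sketch of what a query like the one above asks for: pages in a category whose typed property satisfies a comparison. This is an in-memory illustration only, not how SMW evaluates queries. The page data (Alice, Bob, Carol) and the helper names are hypothetical, and "<" is treated as a strict comparison for simplicity; SMW's own comparator semantics depend on its configuration.

```python
import re

# One condition: [[Category:Name]] or [[Property::value]], with an optional < or > prefix.
CONDITION = re.compile(r"\[\[([^\[\]:]+)(::?)([<>]?)([^\[\]]+)\]\]")

def parse_conditions(query):
    """Split an ask-style query string into (name, comparator, value) tuples."""
    conditions = []
    for name, sep, op, value in CONDITION.findall(query):
        if sep == ":":
            # Single colon: assumed to be a category membership condition.
            conditions.append(("Category", "in", value.strip()))
        else:
            conditions.append((name.strip(), op or "=", value.strip()))
    return conditions

def matches(page, conditions):
    """Return True if the page satisfies every condition."""
    for name, op, value in conditions:
        if op == "in":
            if value not in page["categories"]:
                return False
            continue
        actual = page["properties"].get(name)
        if actual is None:
            return False
        if op == "<" and not actual < float(value):
            return False
        if op == ">" and not actual > float(value):
            return False
        if op == "=" and str(actual) != value:
            return False
    return True

# Hypothetical wiki pages with categories and typed (numeric) properties.
pages = {
    "Alice": {"categories": {"Member"}, "properties": {"Age": 27}},
    "Bob": {"categories": {"Member"}, "properties": {"Age": 41}},
    "Carol": {"categories": {"Guest"}, "properties": {"Age": 22}},
}

conditions = parse_conditions("[[Category:Member]] [[Age::<30]]")
young_members = sorted(t for t, p in pages.items() if matches(p, conditions))
```

The point of the sketch is that the comparison only makes sense because "Age" is typed as a number; with plain wiki text there would be nothing to compare.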
30. What is the Promise of Semantic Wikis?
Semantic Wikis facilitate Consensus over Data
Combine low-expressivity data authorship with the best features of traditional wikis
User-governed, user-maintained, user-defined
Easy to use as an extension of text authoring
31. One Key Helpful Feature of Semantic Wikis
Semantic Wikis are “Schema-Last”
Databases require DBAs and schema design;
Semantic Wikis develop and maintain the schema in the wiki
32. List of Semantic Wikis
AceWiki
ArtificialMemory
Wagn - Ruby on Rails-based
KiWi – Knowledge in a Wiki
Knoodl – Semantic Collaboration tool and application platform
Metaweb - the software that powers Freebase
OntoWiki
OpenRecord
PhpWiki
Semantic MediaWiki - an extension to MediaWiki that turns it into a semantic wiki
Swirrl - a spreadsheet-based semantic wiki application
TaOPis - has a semantic wiki subsystem based on Frame logic
TikiWiki CMS/Groupware integrates Semantic links as a core feature
zAgile Wikidsmart - semantically enables Confluence
33. Short History of Semantic MediaWiki (SMW)
Born at AIFB
– Typed links and types and more
– Export articles as RDF
– Maximally flexible for the wiki user
SMW 0.1 released by AIFB in Sept 2005
– Parser/storage support for typed links – [[type::link | label]]
– FactBox for semantic relations at end of article
– Special:SearchSemantic, with basic auto-completion for link types
– Simple query language (“ask”)
Vulcan kicks off Halo Extensions to SMW project in August 2007
SMW 1.0 released by AIFB in Dec 2007, Ontoprise releases Halo
Extension 1.0 in parallel
– “Property” instead of “Relation” and “Attribute”
– Many new datatypes/special pages/UI features
34. Overview of Semantic MediaWiki (SMW)
Open source (GPL)
– Well documented, active user forum
Active development
– Commercial support (SMW+) available
World-wide community
– International Conferences
• Next SMWCon on 4/25-27, 2012 in Carlsbad, CA
Very stable core, various extensions
35. Semantic MediaWiki (SMW) Markup Syntax
Tsinghua is a university located in
[[Has location::Beijing]], with
[[Has population::27000|about 27 thousand]]
students.
In page "Property:Has location":
[[Has type::Page]]
In page "Property:Has population":
[[Has type::Number]]
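To make the markup above concrete, here is a rough Python sketch that pulls (property, value, label) triples out of annotated wikitext. It is only an illustration of the `[[Property::value|label]]` pattern; the regular expression and the function name `extract_annotations` are assumptions for this example, not SMW's actual parser.

```python
import re

# Matches [[Property::value]] and [[Property::value|label]].
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")

def extract_annotations(wikitext):
    """Return (property, value, label) triples found in the text."""
    triples = []
    for match in ANNOTATION.finditer(wikitext):
        prop, value, label = match.groups()
        triples.append((prop.strip(), value.strip(), label.strip() if label else None))
    return triples

text = (
    "Tsinghua is a university located in [[Has location::Beijing]], "
    "with [[Has population::27000|about 27 thousand]] students."
)
facts = extract_annotations(text)
```

Each triple is attached to the page it appears on, so the Tsinghua page ends up with a location that points to another page and a population stored as a number, as declared on the two property pages.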
36. Special Properties
“Has type” is a pre-defined “special” property for metadata
– Example: [[Has type::String]]
“Allows value” is another special property
– [[Allows value::Low]],
– [[Allows value::Medium]],
– [[Allows value::High]]
The Halo Extensions add domain and range support
– RDFS expressivity
– Semantic Gardening extension also supports “Cardinality”
37. Define Classes
Beijing is a city in [[Has country::China]], with population [[Has population::2,200,000]].
[[Category:Cities]]
Categories are used to define classes because they are better for class inheritance.
The Jin Mao Tower (金茂大厦) is an 88-story landmark supertall
skyscraper in …
[[Categories: 1998 architecture | Skyscrapers in
Shanghai | Hotels in Shanghai | Skyscrapers over 350
meters | Visitor attractions in Shanghai | Landmarks in
Shanghai | Skidmore, Owings and Merrill buildings]]
Category:Skyscrapers in China Category: Skyscrapers by country
38. Database-style Query over Wiki Data
Example: Skyscrapers in China taller than 50 stories, built before 2000
ASK/SPARQL query target
{{#ask:
[[Category:Skyscrapers]]
[[Located in::China]]
[[Floor count::>50]]
[[Year built::<2000]]
…
}}
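As a hedged illustration of how such queries can be generated rather than typed by hand (for example from a template or a script), the Python sketch below assembles an inline #ask call from its parts. The helper `build_ask` is hypothetical, and the printouts and the `sort`/`format` parameters are chosen for this example; they are not the part elided by "…" on the slide.

```python
def build_ask(conditions, printouts=(), **params):
    """Assemble an inline #ask call from conditions, printouts, and parameters."""
    lines = ["{{#ask:"]
    lines += [" " + condition for condition in conditions]
    lines += [" |?" + printout for printout in printouts]
    lines += [" |" + key + "=" + str(value) for key, value in params.items()]
    lines.append("}}")
    return "\n".join(lines)

query = build_ask(
    [
        "[[Category:Skyscrapers]]",
        "[[Located in::China]]",
        "[[Floor count::>50]]",
        "[[Year built::<2000]]",
    ],
    printouts=["Floor count", "Year built"],  # columns to show (assumed)
    sort="Year built",                        # assumed sort key
    format="table",                           # assumed output format
)
```

The conditions select the pages, the `|?` lines choose which properties become columns, and the remaining parameters control presentation.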
39. SMW Extensions – Help Build Great Things
Data I/O
• Halo Extensions, Semantic Forms, Semantic Notification, …
Query and Browsing
• Semantic Toolbar, Semantic Drilldown, Enhanced Retrieval, Search…
Visualization
• Semantic Result Printers, Tree View, Exhibit, Flash charts…
Other useful extensions
• HaloACL, Deployment, Triplestore Connector, Simple Rules…
• Semantic WikiTags and Subversion Integration extensions
• Upcoming Linked Data Extension, with R2R and SILK from F.U.Berlin
41. Example: Ultrapedia – Semantic Wikipedia
Ultrapedia: An SMW demo built to explore general
knowledge acquisition in a wiki
Wikipedia merged with the power of a database
Help Readers and Writers Be More Productive
42. Standard View of the Wiki Data
http://wiking.vulcan.com/up/index.php/Porsche_996
47. Video: Semantic Wikis for A New Problem
Increasing technical complexity →
← Increasing user participation
Social tag-based characterization
– Keyword search over tag data
– Inconsistent semantics
– Easy to engineer
Semantic Entertainment Wiki
– Social database-style characterization
– Database search + wiki text search
– Semantic consistency via wiki mechanisms
– Easy to engineer
Algorithm-based object characterization
– Database-style search
– Consistent semantics
– Extremely difficult to engineer
51. Case Study and Demo: Project Management with SMW+
Automatically populate tables
Just the data you want, at the level you want
Calendars and timelines
Workflows
Personal menus
Form-oriented inputs
Notifications via email/RSS
MS Office integration
SVN integration
55. Screenshot of a Sprint page
Data automatically generated via template queries on page
http://wiking.vulcan.com/dev/index.php/Sprint_101020
56. Requirements for Wiki “Developers”
One need not
– Write code like a hardcore programmer
– Design or set up an RDBMS, or make frequent schema changes
– Possess the knowledge of a senior system admin
Instead, one needs to
– Configure the wiki with desired extensions
– Design and evolve the data model (schema)
– Design content
• Customize templates, forms, styles, skin, etc.
57. Effectiveness of SMW as a Platform Choice
Packaged Software
☺ Very quick to obtain
☹ Hard to customize
☹ Expensive
Examples: Microsoft Project, Version One, Microsoft SharePoint
SMW + Extensions
☺ Still quick to program
☺ Easy to customize
☺ Low-moderate cost
Examples: Vulcan Project Wiki, B.L.S., RPI map
Custom Development
☹ Slow to develop
☺ Extremely flexible
☹ High cost to develop and maintain
Examples: .NET Framework, J2EE, …, Ruby on Rails
58. SMW:: powerful tools and contents
Semantic MediaWiki and related extensions have more potential power
59. Need :: Release the Power
Be used by more people
Content in more places
Accessible via more applications
Enhanced with more semantics
60. Need :: Workflow Integration + Usability Enhancements
Infrequent wiki users often forget where the wiki pages are located
Search is a break from the current workflow
Search results can be noisy or irrelevant
Usability:
– Wiki/Template/SF markup syntax is not extremely hard, but hard enough to turn off many users
– Locating and consuming info in SMW is just not easy enough; something better is needed
Why don’t we leverage the Microsoft Office suite?
61. Microsoft Office :: The Most Popular Productivity Suite
500m users worldwide
>90% market share
Users live in the “suite”
Outlook always open
Potential for SMW
62. MICROSOFT OFFICE CONNECTOR :: How It Works
Leverage Microsoft Office Add-ins technology
Bring SMW info to Office applications on-demand
API for semantic data I/O
Utilize semantics to improve relevance
Smart actions for semantic properties
63. Backstage::Semantic Wiki Object Model
Wiki Validation
Authentication
To get the categories
– And descriptions
To get the article titles
To get the semantic properties
To get page info
Get all forms related info
Edit and save page w/ form
Change a property
Set form of a page
Create form templates
To upload into the Wiki
http://wiking.vulcan.com/dev/index.php/SMW_Webservice_APIs
64. Microsoft Office Connector Smart Connections
• Consume relevant, targeted information
– With the tools you are already familiar with
– In the context – better relevance and productivity
– In place – no search overhead to break workflow
– In real time – data from wiki is live
– Automatically – linking to wiki
• Let you contribute to Wiki
– Without knowing where the content is
– Without learning wiki/template syntax
66. Semantic MediaWiki Enables Collaboration
Create and Manage Real Knowledge
Build Social Semantic Web Applications
In an Efficient and Cost-Effective Way
69. Case Study: Battle-space Luminary System
Discover when New Information represents a change in understanding of entities
– Discovery of explicit entity links, implicit relationships
Large Volumes of Data in various formats
– Unstructured news articles
– Tactical Reports, Field Intelligence
– Structured Database Information
Use Wiki Pages to represent current knowledge about an entity – “what we know”
Domain Ontology to represent domain of information – “what we want to know”
Issue Alerts when Significant Events occur
– New information according to category
– Changing information on topics of interest
– Need to send information to various devices – cell phones, email, etc.
70. System Design
Wiki Configuration
– Semantic MediaWiki: Large developer community, active development, open
source. Wikipedia uses MediaWiki, so scalability and performance are
important.
– Semantic Result Formats: Provides various rich media displays of semantic
information, including graphs, timelines, maps
– Semantic Forms: Provides convenient user interface for entering semantic
data into wiki, avoiding cumbersome wikitext
– Semantic Notifications: Enables sending of notifications when results of
semantic query change.
Domain Ontology
– Created OWL Ontology for Terrorism
Semantic Parsing, Extraction, Reasoning
– Java Process using various Open-Source Toolkits
– Rapid plugin of new technologies
– Multiple Data Sources supported
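The notification idea in the configuration above (send a notification when the results of a semantic query change) can be sketched in a few lines of Python. This is a simplified illustration, not the Semantic Notifications extension's implementation: result sets are modelled as sets of page titles, and `diff_results`, `notify_if_changed`, and the `send` callback are hypothetical names.

```python
def diff_results(previous, current):
    """Compare two query result sets (sets of page titles)."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }

def notify_if_changed(previous, current, send):
    """Call send(changes) only when the result set actually changed."""
    changes = diff_results(previous, current)
    if changes["added"] or changes["removed"]:
        send(changes)
        return True
    return False

# Usage: collect notifications in a list instead of sending email or SMS.
outbox = []
notify_if_changed({"Page A", "Page B"}, {"Page B", "Page C"}, outbox.append)
```

In a real deployment the `send` callback would deliver to the devices mentioned earlier (cell phones, email, etc.), and the query would be re-run on a schedule or after page edits.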
72. Wiki Content Design
Use Templates to Ensure Consistent Look-and-Feel
– Templates Correspond to Ontology Classes
– Fields within Templates correspond to Properties within Ontology
– Rich Content Visualizations derived in consistent way
Hierarchical Categories match Class Hierarchy within Ontology
– Ensures Validity for Properties
– Category included on each Template page to ensure consistency
Forms Provide ability for users to enter data directly into wiki without
knowing Wiki Text
– Each form corresponds to a Template
– Fields within forms correspond to the fields/properties within the Template
– GUI can include auto-completion
– Created Page immediately linked semantically to rest of Wiki
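The correspondence described above (template per ontology class, template field per property, category on each template) can be sketched as follows. This is a simplified Python illustration under stated assumptions: the class "Person", its fields, and the property names "Has name" and "Member of" are hypothetical, and a production template would normally guard against empty fields.

```python
def template_body(class_name, field_to_property):
    """Build wikitext for a template that annotates each field as a property."""
    lines = []
    for field, prop in field_to_property.items():
        # {{{field|}}} is the template parameter; it is empty when not supplied.
        lines.append("[[" + prop + "::{{{" + field + "|}}}]]")
    # The category ties every page using the template to the ontology class.
    lines.append("[[Category:" + class_name + "]]")
    return "\n".join(lines)

def template_call(class_name, values):
    """Build the template call that a form would write into a page."""
    parts = ["{{" + class_name]
    parts += ["|" + field + "=" + value for field, value in values.items()]
    parts.append("}}")
    return "\n".join(parts)

# Hypothetical ontology class with two properties.
person_fields = {"name": "Has name", "affiliation": "Member of"}
body = template_body("Person", person_fields)
call = template_call("Person", {"name": "J. Doe", "affiliation": "Example Group"})
```

Because the form only fills in the template call, users never see the property annotations, yet every saved page carries them and is immediately queryable.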
80. Dynamically-Generated Tables for Queries
Which Porsches accelerate fast?
Information Need: All Porsche models that accelerate 0-100 kph in under 5, 6, and 7 seconds
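A small Python sketch of the grouping behind such tables: each model is assigned to an acceleration range, and each range becomes its own table. The thresholds follow the information need above; the model names and times are made-up placeholders, not Porsche or Wikipedia data, and the function names are hypothetical.

```python
import bisect

THRESHOLDS = [5.0, 6.0, 7.0]  # seconds, from the information need
LABELS = ["under 5 s", "5 to 6 s", "6 to 7 s", "7 s or more"]

def bucket(seconds):
    """Return the label of the acceleration range a time falls into."""
    return LABELS[bisect.bisect_right(THRESHOLDS, seconds)]

def group_by_range(times):
    """Group model names by acceleration range, one list per table."""
    groups = {label: [] for label in LABELS}
    for model, seconds in sorted(times.items()):
        groups[bucket(seconds)].append(model)
    return groups

# Placeholder data for illustration only.
sample = {"Model A": 4.2, "Model B": 5.0, "Model C": 6.5, "Model D": 7.3}
tables = group_by_range(sample)
```

In the wiki itself, the same effect comes from placing one query per range on the page, so the tables refresh whenever the underlying data changes.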
----- Meeting Notes (3/24/11 15:29) -----
Vulcan is the Mothership
Providing funds and support
Paul Allen successful
Wikis, especially semantically enhanced wikis, are wonderful tools for collaboration and content management. Semantic MediaWiki Plus, with Halo and other useful extensions, is a great platform for web application development.
With all the semantic structures generated, it is important to empower more people with the magic of this platform. The more people use it, the better it will be.
The Microsoft Office application suite has more than 90% market share, generating billions in revenue for Microsoft. Many users depend on its applications, such as Excel and PowerPoint, to get their work done. Outlook, especially, is usually open all the time; in fact, many people spend most of their working day in Outlook. So if we can entice Microsoft Office users to use a semantic wiki, it will be a great plus. The figure of 500 million users is from http://blogs.technet.com/office2010/archive/2009/10/07/new-ways-to-try-and-buy-microsoft-office-2010.aspx
WikiTags bridges semantic wikis to more potential users, such as users of Microsoft Word, Outlook and Excel, using Microsoft SmartTag technology.
WikiMail lets users contribute to the wiki using their familiar tools.
WikiTags can help wikis connect to more people and release more of the power of semantic wikis, and it is available for free trial.
The problem we are going to solve is “find the 0-60 times of all Porsche cars in Wikipedia”. This is a sample Wikipedia page for the Porsche 996, showing its acceleration times in a performance data table. This table is manually built: all the table data exists as constants in the table.
This is a Wikipedia page showing 0-60 times for the Porsche Cayenne. If we have to manually go through every Porsche model to assemble the 0-60 data for each model and type, this is going to take a while. A better idea is to treat Wikipedia like a database, and simply query it. Enter Ultrapedia.
This is the Ultrapedia home page.
First notice that Ultrapedia can leverage all the data it extracts from Wikipedia to support a much more helpful UI. For example, Ultrapedia adds a manufacturer-based navigation system on the side, and shows explanatory popups. These kinds of UI tweaks aren’t possible with MediaWiki now, and are an important benefit of having the semantic data.
Remember that we want to find the 0-60 acceleration data for all Porsche models that Wikipedia knows about. Let’s start by looking at a query-generated table on the Ultrapedia Porsche 996 page. For comparison, Ultrapedia also includes the original performance table from Wikipedia (above).
This is Ultrapedia’s Porsche 996 performance table, built by a query to the Ultrapedia database of Wikipedia-extracted data. Notice that it has the same information as the original static table; this is because we scrape the data from the static table. This table is dynamically generated at each page load out of the extracted Wikipedia data, so it is always up to date. It is sortable and also accepts feedback and ratings on individual data items.
Now we can answer our question about 0-60 times across all Porsche models with one simple query in Ultrapedia. We can make this an Ultrapedia-only page; the page itself has just 5 queries on it (one for each acceleration range). We could also do this as one big table, but it’s easier to read as 5 smaller tables. All the data here flows from Wikipedia.
Of course, once you have data, Ultrapedia can support data visualizations. This is a simple Flash-based chart widget based on the same Porsche 996 data, and included in Ultrapedia’s Porsche 996 page. It shows us that while acceleration varies dramatically, top speed and peak engine power remain fairly constant across models. The chart was specified manually with a query. There are of course a huge number of possible ways to chart a set of data, and most of these ways are uninteresting. In the Ultrapedia concept, we rely on article authors to specify interesting charts for their readers that will support the particular points in the article.
We can also use the data to dynamically link to other data sources. In this case we have configured the Ultrapedia Porsche 996 article to include a live eBay query to find out what the Porsche 996 sells for today… We access the eBay data through a web services interface. We can do this for arbitrary other web-service-accessible data sources, like Amazon or GeoNames. In a government or enterprise context, we would link articles to supporting data from appropriate systems of record.
I don’t think I’ll be buying one… I think I’d rather send my daughter to college.
Pictures automatically get metadata, so Ultrapedia can deliver an iPod-like “cover flow” browsing experience with images to augment the table data. We could also embed images or videos in the tables.
Since Ultrapedia includes some simple internal logic about time, we can generate simple browsable timelines and use them in articles. Here we see a timeline of VW models.
But did you know that Uusikaupunki, Finland, is a major hub for Porsche manufacturing? Ultrapedia allows us to drill down to look at Finland’s contribution to Porsche production.