Presentation given at VDB6, 6th IFIP Workshop on Visual Database Systems, Brisbane, Australia, May 2002
ABSTRACT: As part of a general framework for the development of global information systems, we include support for the development of aural interfaces. The framework uses an object-oriented database for the management of application, document content and presentation data. The access layer is based around an XML server and XSLT for document generation from default and customised templates. Specifically, aural interfaces are supported through a VoiceXML server that provides the speech recognition and synthesis mechanisms, together with XSLT templates for the generation of VoiceXML. In this paper, we describe the implementation of a generic voice browser for application databases as well as the development of a customised aural interface for a community diary managing appointments and events.
1. Aural Interfaces to Databases
based on VoiceXML
Beat Signer, Moira C. Norrie,
Peter Geissbuehler and Daniel Heiniger
Global Information Systems Group
Department of Computer Science
ETH Zurich, Switzerland
2. Outline
Motivation
Architecture
Voice Interfaces
Application Development
3. Avalanche Forecasting System
Project to provide WAP and Voice Access
4. Avalanche Forecasting System ...
Information model (OM model) for SLF forecast data
Application user interfaces for WAP and voice access
national bulletin with maps and glossary
local bulletin based on a region's start letter, GPS or Swiss Coordinates
WAP responses for voice requests (mixed-mode) or triggered events
5. Requirements
Platform supporting universal client access to databases
→ eXtensible Information Management Architecture (XIMA)
Use of a technology which allows the separation of content and presentation
→ XML and XSL
Minimise effort to support new types of client devices, e.g. XML, HTML, WML, CHTML, VXML, ?
6. XIMA
[Architecture diagram, top to bottom]
HTML Browser, WML Browser, VXML Browser
Main Entry Servlet (delegation)
HTML Servlet, WML Servlet, VXML Servlet (XML + XSLT → Response)
XML Server (builds XML based on JDOM)
OMS Java API
OMS Java Workspace
OM Model: collections, associations, multiple inheritance and multiple instantiation
7. XML Response
XML Response
<?xml version="1.0" encoding="ISO-8859-1"?>
<oms>
  <instance id="OM_4077" last="true" pos="1" type="person">
    <dressedWith type="person"/>
    <attribute name="name">
      <string>Moira Norrie</string>
    </attribute>
    …
    <attribute name="picture">
      <mime>/globis/staff/moira.jpg</mime>
    </attribute>
    <method name="age"/>
    …
    <link idref="OM_2693" inv="false" name="Workplace"/>
  </instance>
  …
</oms>
Is the response valid against the XML Schema?
XML Schema
<xsd:element name="oms">
  <xsd:complexType>
    <xsd:choice minOccurs="0" maxOccurs="unbounded">
      <xsd:element name="workspace" type="workspaceType"/>
      <xsd:element name="instance" type="instanceType"/>
      <xsd:element name="collection" type="collectionType"/>
      <xsd:element name="association" type="associationType"/>
      <xsd:element name="result" type="resultType"/>
      <xsd:element ref="warning"/>
    </xsd:choice>
  </xsd:complexType>
</xsd:element>
<xsd:complexType name="instanceType">
  <xsd:sequence>
    <xsd:element name="dressedWith" type="dressedWithType" …>
    …
    <xsd:element name="link" type="linkType" minOccurs="0" …>
  </xsd:sequence>
  <xsd:attribute name="id" type="xsd:string" use="required"/>
  …
</xsd:complexType>
8. VoiceXML
[Processing pipeline from voice input to voice output]
Speech → Speech Recogniser → Text → Language Analyser → Meaning → Application Server → Text → Speech Synthesiser → Speech
Speech Recogniser: converts voice input into text (uses speech model)
Language Analyser: extracts meaning from text (uses grammar)
Application Server: gets data (text) from database (uses application database)
Speech Synthesiser: generates speech output (uses pronunciation rules)
Development: IBM WebSphere Voice Server SDK
Deployment: BeVocal Cafe Voice Portal
9. VoiceXML ...
VoiceXML is an application of XML
Describes call flows and human-machine dialogues
Uses the advantages of web-based development and content delivery to build interactive voice response applications
Hello World Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<vxml version="2.0">
  <form id="f1">
    <block>Hello World</block>
  </form>
</vxml>
10. XML to VXML Example
XML Response
<?xml version="1.0" encoding=… ?>
<oms>
  <instance id="OM_4077" type="person" …>
    <dressedWith type="person"/>
    <attribute name="name">
      <string>Moira Norrie</string>
    </attribute>
    …
    <method name="age"/>
    …
  </instance>
</oms>
XSLT Stylesheet
<xsl:template match="instance">
  <form id="instance_entry">
    <block>
      <xsl:choose>
        <xsl:when test="count(dressedWith)=1">
          Object
          <xsl:call-template name="removeUnderscore">
            <xsl:with-param name="label" select="@id"/>
          </xsl:call-template>
          is dressed with type
          <xsl:value-of select="./@type"/>
        </xsl:when>
        …
</xsl:template>
…
VXML Result
<?xml version="1.0" encoding="ISO-8859-1"?>
<vxml application="http://macbain/xima/omsmain_root.vxml" version="2.0">
  <form id="instance_entry"><block>
    Object 4077 is dressed with type person and is viewed as type person.
    <prompt>It contains 8 attributes, 5 links, and 1 method</prompt>
    <goto next="#instance_process"/></block></form>
  <form id="instance_process"><field name="Member_Choice"><prompt>Would you like to hear the attributes, the links or the methods or go back?</prompt>
  …
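The mapping on this slide can be filled out into a small, complete stylesheet. The following is only a sketch under assumptions: the body of `removeUnderscore` is not shown on the slide, so the version here (keeping the part of the id after the first underscore, which is consistent with "OM_4077" being spoken as "Object 4077") is a guess, and the use of `count()` for the summary prompt is assumed rather than taken from the original stylesheet. It also assumes a single instance per response.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="ISO-8859-1" indent="yes"/>

  <!-- Wrap the generated dialogue in a VoiceXML document. -->
  <xsl:template match="/oms">
    <vxml version="2.0">
      <xsl:apply-templates select="instance"/>
    </vxml>
  </xsl:template>

  <!-- Announce the object's id and type, then summarise its members. -->
  <xsl:template match="instance">
    <form id="instance_entry">
      <block>
        <xsl:text>Object </xsl:text>
        <xsl:call-template name="removeUnderscore">
          <xsl:with-param name="label" select="@id"/>
        </xsl:call-template>
        <xsl:text> is dressed with type </xsl:text>
        <xsl:value-of select="@type"/>
        <xsl:text>. </xsl:text>
        <prompt>
          <xsl:text>It contains </xsl:text>
          <xsl:value-of select="count(attribute)"/>
          <xsl:text> attributes, </xsl:text>
          <xsl:value-of select="count(link)"/>
          <xsl:text> links, and </xsl:text>
          <xsl:value-of select="count(method)"/>
          <xsl:text> methods</xsl:text>
        </prompt>
      </block>
    </form>
  </xsl:template>

  <!-- ASSUMPTION: hypothetical body; keeps the text after the first underscore. -->
  <xsl:template name="removeUnderscore">
    <xsl:param name="label"/>
    <xsl:value-of select="substring-after($label, '_')"/>
  </xsl:template>
</xsl:stylesheet>
```

Using `xsl:text` for the literal wording keeps stray indentation out of the spoken output; the slide's version places the text directly inside the template, which works as well.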
11. Design Phase
Define the required functionality
User analysis
motivation, expertise
High level decisions
full-duplex (barge-in)
simple grammars (dynamic)
only synthesised speech (TTS)
Representation of base types
Information flow
12. The database contains #Collections #Associations
Would you like to go to the collections, to the associations, directly to an object or back to the main menu?
collections
  The database contains the following # collections. Choose a collection
  Collection 'name' contains #M. Would you like to list the members or go back?
  Collection 'name' contains the following # members. Choose one of the members
associations
  The database contains the following # associations. Choose an association
  Association 'name' contains #A. Would you like to list the members or go back?
  Association 'name' contains the following # associations. Choose a 'domaintype' or a 'rangetype' or say back
objects
  The database contains #Objects. Choose an object or say back
Object 'oID' is dressed with type 'type' and currently viewed as type 'type'. It contains #Attr, #Links, #Methods. Would you like to hear the attributes, the links or the methods, change the type or go back?
  attributes: The object contains the following # attributes
  links: You can choose among the following links. Choose a link or say back
  methods: You can choose among the following methods. Choose a method or say back. The result of the method is Result
  types: You can view the object as the following types. Choose one of the types or say back
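The top level of this dialogue flow maps naturally onto a VoiceXML `<menu>`. A minimal sketch, assuming hypothetical dialogue ids (`root`, `collections`, `associations`, `objects`) and placeholder counts where the stylesheet would insert the real values; the generated dialogue in the actual system may well use `<field>` elements instead, and the "back to the main menu" choice is left out because the main menu is not shown here.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Root of the generic browser dialogue; counts are placeholders. -->
  <menu id="root">
    <prompt>
      The database contains 3 collections and 2 associations.
      Would you like to go to the collections, to the associations,
      or directly to an object?
    </prompt>
    <choice next="#collections">collections</choice>
    <choice next="#associations">associations</choice>
    <choice next="#objects">object</choice>
    <noinput>I did not hear anything. <reprompt/></noinput>
    <nomatch>I did not understand. <reprompt/></nomatch>
  </menu>

  <!-- Placeholder targets; each would list its entries and ask for a choice. -->
  <form id="collections">
    <block>The database contains the following collections. <goto next="#root"/></block>
  </form>
  <form id="associations">
    <block>The database contains the following associations. <goto next="#root"/></block>
  </form>
  <form id="objects">
    <block>Choose an object or say back. <goto next="#root"/></block>
  </form>
</vxml>
```

Each `<choice>` contributes its text to the menu's grammar, so this level needs no separate grammar definition.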
13. Test and Refinement Phase
Recognition problems
elimination of similar sounding words from the grammar
addition of optional words to the grammar (e.g. "please")
Insufficient help functionality
introduction of prompt-specific help instead of always active command list
Immediate feedback after input has been processed ("OK" prompt)
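The three refinements above could look roughly as follows in VoiceXML 2.0. This is a sketch rather than the grammar actually used: the form and field names are taken from the earlier VXML result, while the rule name `choice`, the wording of the help text and the word list are assumptions.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="instance_process">
    <field name="Member_Choice">
      <prompt>Would you like to hear the attributes, the links or the methods?</prompt>

      <!-- Inline grammar: "please" may optionally follow the keyword. -->
      <grammar version="1.0" root="choice" mode="voice" xml:lang="en-US">
        <rule id="choice">
          <one-of>
            <item>attributes</item>
            <item>links</item>
            <item>methods</item>
          </one-of>
          <item repeat="0-1">please</item>
        </rule>
      </grammar>

      <!-- Prompt-specific help instead of an always active command list. -->
      <help>Say attributes, links or methods. <reprompt/></help>
      <nomatch>Sorry, I did not understand. <reprompt/></nomatch>

      <!-- Immediate feedback once the input has been processed. -->
      <filled>
        <prompt>OK</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Keeping the keyword list short and acoustically distinct is what the first refinement (removing similar sounding words) amounts to in practice.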
14. OMS Database Development Suite
OM: Semantic Object Data Model (Application Modelling)
OMS Pro: Rapid Prototyping System and Lightweight DBMS (Database and Application Design)
OMS Java: Data Management System and Application Framework (Implementation)
15. XIMA Application Development
Prototype the application's information model in prototyping system OMS Pro
Export model (and data) to OMS Java
Installation of XML Server with default XSLT stylesheets and servlets
database immediately accessible by generic object browser
Customisation of stylesheets
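One way the customisation step could look, sketched in XSLT: a customised stylesheet imports the default one and overrides only the templates that need different wording. The file name `default_vxml.xsl` and the `appointment` type are hypothetical, chosen to echo the community diary mentioned in the abstract; the presentation does not show how XIMA actually organises its stylesheets.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- ASSUMPTION: hypothetical name for the default (generic browser) stylesheet. -->
  <xsl:import href="default_vxml.xsl"/>

  <!-- Override only objects of the (hypothetical) type "appointment";
       all other instances are handled by the imported templates. -->
  <xsl:template match="instance[@type='appointment']">
    <form id="instance_entry">
      <block>
        <xsl:text>Appointment: </xsl:text>
        <xsl:value-of select="attribute[@name='name']/string"/>
      </block>
    </form>
  </xsl:template>
</xsl:stylesheet>
```

Because templates in the importing stylesheet take precedence over imported ones, the generic browser keeps working for every type that has not yet been customised, which fits a stepwise refinement of the interface.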
16. Conclusions
Database driven development of voice-enabled applications
Rapid prototyping supported by OMS Pro and XIMA's generic object browser
Multi-mode access provided by generic object browser (HTML, WAP, VXML)
Customised user interfaces (stepwise refinement of XSLT stylesheets)
New potential user communities