A Common Gesture and Speech Production Framework for
              Virtual and Physical Agents
     Quoc Anh Le - Jing Huang - Catherine Pelachaud
                        CNRS, LTCI
               Telecom-ParisTech, France



    Workshop on Speech and Gesture Production, ICMI 2012, Santa Monica, CA, USA
Introduction
      Motivations

       • Similar approaches between virtual agents and
         humanoid robots
       • Limits of existing systems: agent dependent
      Objectives

       • Common co-verbal gesture generation framework for
         both virtual and physical agents
      Methodologies

       • Based on GRETA system
       • Use
          - same representation languages
          - same algorithm for selecting and planning gestures
          - different algorithms for creating the animation

Architecture Overview
      [Figure: system architecture, summarized below]
       • Intent Planner (common module): takes input data (text, audio, video, etc.) and outputs FML-APML
       • Behavior Planner (common module): takes FML-APML and outputs BML; shown with the Intent Lexicon and the baselines for Nao and for Greta
       • Behavior Realizer (common module): takes BML and outputs keyframes; shown with the Behavior Lexicon and the gestuaries for Nao and for Greta
       • Animation Realizer (specific module) for Greta: takes keyframes, uses the Greta Animation Lexicon, and sends FAP-BAP values to the FAP-BAP Player
       • Animation Realizer (specific module) for Nao: takes keyframes, uses the Nao Animation Lexicon, and sends joint values to Nao's built-in proprietary procedures
       • Messages between modules go through ActiveMQ (messaging central system)

Behavior Realizer
      [Figure: the architecture diagram shown again, this time for the Behavior Realizer stage (BML in, keyframes out)]

Behavior Realizer: Outline

      Processes common to all agents
         1.   Create the gesture from the agent's gestuary
         2.   Schedule the timing of the gesture phases
         3.   Generate keyframes: pairs (absolute time, symbolic
              description of the hand configuration at that time)
      Different databases for each agent
             For Nao
                 Gestuary (for instance, pointing with a fully stretched arm)
                 Velocity profile (empirically determined from Nao)
             For Greta
                 Gestuary (for instance, pointing with one finger)
                 Velocity profile (empirically determined from real humans)

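The three common steps can be sketched in Python as follows. This is only an illustration under stated assumptions: `HandConfig`, `Keyframe`, `GESTUARIES`, `PREP_DURATION` and `realize` are hypothetical names, the preparation durations are made-up placeholders standing in for each agent's velocity profile, and the gestuary entries reuse the symbolic values of the pointing example in this deck.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class HandConfig:
    """Symbolic description of a hand configuration."""
    vertical: str
    horizontal: str
    distance: str
    hand_shape: str


@dataclass(frozen=True)
class Keyframe:
    """Pair (absolute time, symbolic description); None means rest position."""
    time: float
    description: Optional[HandConfig]


# One gestuary per agent: same lexeme, different shape.
GESTUARIES: Dict[str, Dict[str, HandConfig]] = {
    "nao": {"pointing": HandConfig("YUpperP", "XEP", "XFar", "OPEN")},
    "greta": {"pointing": HandConfig("YP", "XP", "XMiddle", "INDEX")},
}

# Placeholder preparation durations in seconds (assumed, not measured).
PREP_DURATION: Dict[str, float] = {"nao": 0.5, "greta": 0.25}


def realize(agent: str, lexeme: str,
            stroke_start: float, stroke_end: float) -> List[Keyframe]:
    # Step 1: create the gesture from the agent's gestuary.
    config = GESTUARIES[agent][lexeme]
    # Step 2: schedule the phases (here only preparation and stroke).
    prep_start = stroke_start - PREP_DURATION[agent]
    # Step 3: generate keyframes as (absolute time, symbolic description).
    return [
        Keyframe(prep_start, None),      # leave the rest position
        Keyframe(stroke_start, config),  # stroke begins
        Keyframe(stroke_end, config),    # stroke ends
    ]
```

The same `realize` code runs for both agents; only the data it reads differs, which is the point of keeping this module common.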
Example: Different pointing gestures

  BML input (common to both agents)

      <bml id="bml1">
         <speech xmlns="" id="s1" start="0">
           <text>It is <sync id="tm1"/> over there! <sync id="tm2"/></text>
         </speech>
         <gesture id="g1" lexeme="pointing" start="s1:tm1" end="s1:tm2">
            <description priority="1" type="GRETA">
               <GRETA:SPC>0.80</GRETA:SPC>
               <GRETA:TMP>0.50</GRETA:TMP>
               <GRETA:FLD>-0.62</GRETA:FLD>
               <GRETA:PWR>0.30</GRETA:PWR>
               <GRETA:REP>0.00</GRETA:REP>
               <GRETA:OPE>1.00</GRETA:OPE>
               <GRETA:TEN>0.20</GRETA:TEN>
            </description>
         </gesture>
      </bml>

  Step 1: look up the gesture in the agent's gestuary

      Nao Gestuary
      ..
      <gesture id="pointing">
        <phase type="stroke">
          <vertical>YUpperP</vertical>
          <horizontal>XEP</horizontal>
          <distance>XFar</distance>
          <hShape>OPEN</hShape>
        </phase>
      </gesture>
      …

      Greta Gestuary
      ..
      <gesture id="pointing">
        <phase type="stroke">
          <vertical>YP</vertical>
          <horizontal>XP</horizontal>
          <distance>XMiddle</distance>
          <hShape>INDEX</hShape>
        </phase>
      </gesture>
      …

  Steps 2, 3: schedule the phases and generate keyframes (same form for both agents)

      <keyframe 1 (time, description)>
      <keyframe 2 (time, description)>
      …
      <keyframe N (time, description)>

  Step 4: keyframes become JOINT VALUES for Nao and BAP for Greta

BR: Synchronization with speech

          Algorithm
          • Compute the preparation phase
          • Do not perform the gesture if there is not enough time
            (strokeEnd(i-1) > strokeStart(i) + duration)
          • Add a hold phase to fit the planned gesture duration
          • Co-articulation between several gestures
            - If there is enough time, add a retraction phase (i.e., go back to the rest position)
            - Otherwise, go from the end of the stroke to the preparation phase of the next
              gesture

          [Figure: timelines of two consecutive gestures. With retraction, each gesture has its own
           Start and end. With co-articulation, the two strokes (S-start to S-end) share one Start and one end.]

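A minimal Python sketch of these scheduling rules, under stated assumptions. The names (`Gesture`, `Phase`, `schedule`) and all durations are hypothetical. The skip test used here (previous stroke end plus preparation time must not exceed the stroke start) is one reading of the condition on the slide, not necessarily the authors' exact inequality, and the sketch assumes the same preparation time whether the arm starts from rest or from the previous stroke.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gesture:
    name: str
    stroke_start: float      # from the speech sync point, in seconds
    stroke_end: float        # planned end of the stroke, in seconds
    prep_duration: float     # estimated time to reach the stroke start pose
    stroke_duration: float   # estimated time of the stroke motion itself
    retract_duration: float  # estimated time to go back to rest


@dataclass
class Phase:
    gesture: str
    kind: str   # "preparation", "stroke", "hold" or "retraction"
    start: float
    end: float


def schedule(gestures: List[Gesture]) -> List[Phase]:
    # Drop gestures whose preparation cannot finish before their stroke start.
    kept: List[Gesture] = []
    arm_free_at = 0.0  # assumed: the timeline starts at 0 with the arm at rest
    for g in gestures:
        if arm_free_at + g.prep_duration > g.stroke_start:
            continue  # not enough time: the gesture is not performed
        kept.append(g)
        arm_free_at = g.stroke_end

    phases: List[Phase] = []
    for i, g in enumerate(kept):
        prep_start = g.stroke_start - g.prep_duration
        phases.append(Phase(g.name, "preparation", prep_start, g.stroke_start))
        motion_end = g.stroke_start + g.stroke_duration
        if motion_end < g.stroke_end:
            # The stroke is shorter than planned: pad with a hold phase.
            phases.append(Phase(g.name, "stroke", g.stroke_start, motion_end))
            phases.append(Phase(g.name, "hold", motion_end, g.stroke_end))
        else:
            phases.append(Phase(g.name, "stroke", g.stroke_start, g.stroke_end))
        nxt = kept[i + 1] if i + 1 < len(kept) else None
        retract_end = g.stroke_end + g.retract_duration
        if nxt is None or retract_end <= nxt.stroke_start - nxt.prep_duration:
            phases.append(Phase(g.name, "retraction", g.stroke_end, retract_end))
        # Otherwise co-articulate: the arm stays at the stroke end pose and
        # goes straight into the preparation of the next gesture.
    return phases
```

For example, with three gestures whose strokes are planned at 1.0 to 1.5 s, 2.25 to 2.75 s and 3.0 to 3.5 s, each with a 0.5 s preparation and retraction, the first two are co-articulated and the third is dropped because its preparation cannot fit.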
BR: Velocity profiles

          Gesture velocity
          • Predict the movement duration using Fitts’ law:
             • Movement Time = a + b * log2(Distance + 1)
          • Maximal speeds are capped by thresholds (empirically determined)
          • The stroke phase differs from the other phases in velocity and
            acceleration (Quek, 1995)
          Add expressivity
              • Temporal extent (TMP): modulates the duration of the whole gesture
                => changes the coefficient of Fitts’ law

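The duration rule can be sketched in Python as below. The function names, the TMP range of [-1, 1] with positive values meaning faster movement, and the factor `1 - 0.5 * tmp` applied to coefficient b are all assumptions made for illustration; the slides only state that TMP changes a coefficient of Fitts' law and that speeds are capped.

```python
import math


def fitts_duration(distance: float, a: float, b: float) -> float:
    """Movement Time = a + b * log2(Distance + 1), as written on the slide."""
    return a + b * math.log2(distance + 1.0)


def phase_duration(distance: float, a: float, b: float,
                   max_speed: float, tmp: float = 0.0) -> float:
    """Duration of one phase, with TMP modulation and a speed cap.

    tmp: assumed to lie in [-1, 1]; positive values shorten the movement.
    max_speed: empirical cap, in distance units per second.
    """
    scaled_b = b * (1.0 - 0.5 * tmp)          # hypothetical TMP scaling
    duration = fitts_duration(distance, a, scaled_b)
    min_duration = distance / max_speed        # fastest the agent may move
    return max(duration, min_duration)
```

With `a = 0.0`, `b = 0.5` and a distance of 3.0, the predicted duration is 1.0 s at neutral TMP and 0.5 s at `tmp = 1.0`; if `max_speed` is 2.0 the cap raises the neutral case to 1.5 s.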
BR: Build coefficients of Fitts’ law
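The figure for this slide is not preserved. As an illustration only, one standard way to estimate the coefficients a and b from measured (distance, duration) pairs is an ordinary least-squares line fit against log2(Distance + 1). The function name `fit_fitts` is hypothetical and the slides do not say which fitting procedure the authors used.

```python
import math
from typing import Sequence, Tuple


def fit_fitts(distances: Sequence[float],
              times: Sequence[float]) -> Tuple[float, float]:
    """Least-squares estimate of (a, b) in time = a + b * log2(distance + 1)."""
    if len(distances) != len(times) or len(distances) < 2:
        raise ValueError("need at least two (distance, time) pairs")
    xs = [math.log2(d + 1.0) for d in distances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("distances must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, times))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b
```

Running the fit separately on recordings from Nao and on human motion data would give one coefficient pair per agent, in line with the agent-specific velocity profiles described earlier.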




Animation Realizer
      [Figure: the architecture diagram shown again, this time for the Animation Realizer stage. Keyframes enter the
       agent-specific Animation Realizers, which output FAP-BAP values for Greta and joint values for Nao.]

Implemented expressivity parameters
Parameter  Definition                       Nao                                    Greta
TMP        Velocity of movement             Change coefficient of Fitts’ law       Change coefficient of Fitts’ law
SPC        Amplitude of movement            Limited to predefined key positions    Change gesture space scales
PWR        Acceleration of movement         Modulate stroke duration               Modulate stroke acceleration
REP        Number of stroke repetitions     Yes                                    Yes
FLD        Smoothness and continuity        No                                     No
OPN        Relative spatial extent to body  No                                     Elbow swivel angle
TEN        Muscular tension                 No                                     No

   Create animation parameters
         Joint values for Nao
         BAP values for Greta

Create animation parameters
         Discretization of McNeill's (1992) gestural space
         Each symbolic position is translated into concrete values of the agent's joints (for
          instance, 6 joints of Nao as in the table below)
            Code   ArmX   ArmY       ArmZ      Joint values (LShoulderPitch, LShoulderRoll, LElbowYaw, LElbowRoll, LWristYaw, Hand)

            000    XEP    YUpperEP   ZNear     (-54.4953, 22.4979, -79.0171, -5.53477, -0.00240423, 1.0)
            001    XEP    YUpperEP   ZMiddle   (-65.5696, 22.0584, -78.7534, -8.52309, -0.178188, 1.0)
            002    XEP    YUpperEP   ZFar      (-79.2807, 22.0584, -78.6655, -8.4352, -0.178188, 1.0)
            010    XEP    YUpperP    ZNear     (-21.0964, 24.2557, -79.4565, -26.8046, 0.261271, 1.0)
            ...    ...    ...        ...       ...
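
The table lookup can be sketched as a plain dictionary in Python. The names `NAO_POSITIONS` and `to_joint_values` are hypothetical; the four entries are copied from the table above (in the units given there), and the remaining rows are elided in the source, so the sketch raises an error for them rather than guessing.

```python
from typing import Dict, Tuple

JOINT_NAMES = ("LShoulderPitch", "LShoulderRoll", "LElbowYaw",
               "LElbowRoll", "LWristYaw", "Hand")

# (ArmX, ArmY, ArmZ) -> joint values, only the rows shown on the slide.
NAO_POSITIONS: Dict[Tuple[str, str, str], Tuple[float, ...]] = {
    ("XEP", "YUpperEP", "ZNear"):
        (-54.4953, 22.4979, -79.0171, -5.53477, -0.00240423, 1.0),
    ("XEP", "YUpperEP", "ZMiddle"):
        (-65.5696, 22.0584, -78.7534, -8.52309, -0.178188, 1.0),
    ("XEP", "YUpperEP", "ZFar"):
        (-79.2807, 22.0584, -78.6655, -8.4352, -0.178188, 1.0),
    ("XEP", "YUpperP", "ZNear"):
        (-21.0964, 24.2557, -79.4565, -26.8046, 0.261271, 1.0),
}


def to_joint_values(arm_x: str, arm_y: str, arm_z: str) -> Dict[str, float]:
    """Translate one symbolic arm position into named joint values."""
    key = (arm_x, arm_y, arm_z)
    if key not in NAO_POSITIONS:
        raise KeyError(f"no joint values recorded for symbolic position {key}")
    return dict(zip(JOINT_NAMES, NAO_POSITIONS[key]))
```

Because the table is data, supporting another robot would mean supplying a different table rather than changing the common modules.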



         Translate symbolic keyframes into joint values
         Animation is obtained by interpolating between joint values
             for Nao: with the robot's built-in proprietary procedures
             for Greta: with Slerp (spherical linear interpolation) and time warping
              (ease-in/ease-out functions)
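
For the Greta side, Slerp with ease-in/ease-out time warping can be sketched in Python as below. This is an illustration, not Greta's implementation: the quaternion layout (w, x, y, z), the function names, and the choice of the smoothstep curve as the easing function are assumptions.

```python
import math
from typing import Tuple

Quat = Tuple[float, float, float, float]  # unit quaternion (w, x, y, z)


def ease_in_out(u: float) -> float:
    """Smoothstep easing (assumed): slow start, slow end, u in [0, 1]."""
    return u * u * (3.0 - 2.0 * u)


def slerp(q0: Quat, q1: Quat, t: float) -> Quat:
    """Spherical linear interpolation between two unit quaternions."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # Take the shorter arc: q and -q represent the same rotation.
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:
        # Nearly identical rotations: fall back to normalized lerp.
        out = tuple(a + t * (b - a) for a, b in zip(q0, q1))
        norm = math.sqrt(sum(c * c for c in out))
        return tuple(c / norm for c in out)
    theta = math.acos(dot)
    s = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return tuple(w0 * a + w1 * b for a, b in zip(q0, q1))


def interpolate(q0: Quat, q1: Quat, t0: float, t1: float, time: float) -> Quat:
    """Joint rotation at `time` between keyframes (t0, q0) and (t1, q1)."""
    u = (time - t0) / (t1 - t0)
    u = min(1.0, max(0.0, u))          # clamp outside the keyframe interval
    return slerp(q0, q1, ease_in_out(u))
```

The easing only warps time: the rotation path between the two keyframes is the same as with plain Slerp, but the joint accelerates out of the first keyframe and decelerates into the second.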



Greta: Full Body IK
      [Figure: torso IK, with an analytic method going from arm to torso]
      Torso target depending on hand position

Demo: Greta




Demo: Nao




Perceptive Evaluation
         Objective
          • Evaluate how the robot’s gestures are perceived by human users
         Procedure
          • Participants (63 French speakers) rate videos of Nao as a
            storyteller
          • Versions shown to the participants in random order:
          - Gestures with expressivity vs. gestures without expressivity
          - Gestures synchronized with speech vs. gestures not synchronized with speech
         Results (using the ANOVA method)
          • Synchronization:
          - F(1, 124) = 4.94, p < .05
          - 76% agreed that gestures were synchronized with speech in the synchronized version
          • Expressivity:
          - F(1, 124) = 4.43, p < .05
          - 70% agreed that gestures were expressive in the expressive version

State of the art
         Most similar work: Salem et al. (2012)
          • Same idea (based on the existing Max virtual agent system)
         Main differences:
          • Our system: GRETA re-designed as a common framework
          • Salem et al.’s system: Max’s ACE adjusted to the ASIMO robot

 Features            Our model                        Salem et al.’s system
 Gesture production  Online from templates,           Automatically generated from a trained,
                     regardless of specific domain    domain-specific data corpus
 Gesture shapes      Agent-specific parameter         Original for Max, mapped to
                                                      ASIMO configurations
 Gesture timing      Agent-specific parameter         Original for Max, adapted to
                                                      ASIMO by feedback
 Expressivity        Yes                              No
 Synchronization     Adapt gesture to speech          Cross-modal adjustment

Future work

       Short-term plan
        • Human-like gestures: enhance the velocity profiles
        • Expressivity: implement fluidity and tension
       Long-term plan
        • Feedback mechanism
        • Study of the coherence between consecutive
          gestures in a G-Unit (Kendon, 2004)

Common Gesture and Speech Production Framework for Virtual and Physical Agents

  • 1. A Common Gesture and Speech Production Framework for Virtual and Physical Agents
      Quoc Anh Le, Jing Huang, Catherine Pelachaud
      CNRS, LTCI, Telecom-ParisTech, France
      Workshop on Speech and Gesture Production, ICMI 2012, Santa Monica, CA, USA
  • 2. Introduction
      Motivations
        • Similar approaches between virtual agents and humanoid robots
        • Limits of existing systems: agent dependent
      Objectives
        • Common co-verbal gesture generation framework for both virtual and physical agents
      Methodologies
        • Based on the GRETA system
        • Use
          - same representation languages
          - same algorithm for selecting and planning gestures
          - different algorithms for creating the animation
  • 3. Architecture Overview
      [Architecture diagram. Recoverable labels:]
        • Input data (text, audio, video, etc.)
        • Lexicons: Intent Lexicon; Behavior Lexicon; baselines for Nao and for Greta; gestuary for Nao and for Greta
        • Common modules: Intent Planner → (FML-APML) → Behavior Planner → (BML) → Behavior Realizer → (keyframes)
        • Specific modules: Animation Realizer for Greta (FAP-BAP values, FAP-BAP Player, Greta animation lexicon); Animation Realizer for Nao (joint values, Nao built-in proprietary procedures, Nao animation lexicon)
        • ActiveMQ central messaging system
  • 4. Behavior Realizer
      [The architecture diagram of slide 3 is repeated under this title.]
  • 5. Behavior Realizer: Outline
      Processes common to all agents
        1. Create the gesture from the gestuary of an agent
        2. Schedule the timing of gesture phases
        3. Generate keyframes: pairs (absolute time, symbolic description of the hand configuration at this time)
      Different databases
        • For Nao
          - Gestuary (for instance, pointing with a fully stretched arm)
          - Velocity profile (empirically determined from Nao)
        • For Greta
          - Gestuary (for instance, pointing with one finger)
          - Velocity profile (empirically determined from real humans)
  • 6. Example: Different pointing gestures
      BML
        <bml id="bml1">
          <speech xmlns="" id="s1" start="0">
            <text>It is <sync id="tm1"/> over there! <sync id="tm2"/></text>
          </speech>
          <gesture id="g1" lexeme="pointing" start="s1:tm1" end="s2:tm2">
            <description priority="1" type="GRETA">
              <GRETA:SPC>0.80</GRETA:SPC>
              <GRETA:TMP>0.50</GRETA:TMP>
              <GRETA:FLD>-0.62</GRETA:FLD>
              <GRETA:PWR>0.30</GRETA:PWR>
              <GRETA:REP>0.00</GRETA:REP>
              <GRETA:OPE>1.00</GRETA:OPE>
              <GRETA:TEN>0.20</GRETA:TEN>
            </description>
            …
          </gesture>
          …
        </bml>
      Nao Gestuary (step 1)
        ..
        <gesture id="pointing">
          <phase type="stroke">
            <vertical>YUpperP</vertical>
            <horizontal>XEP</horizontal>
            <distance>XFar</distance>
            <hShape>OPEN</hShape>
          </phase>
        </gestures>
      Greta Gestuary (step 1)
        ..
        <gesture id="pointing">
          <phase type="stroke">
            <vertical>YP</vertical>
            <horizontal>XP</horizontal>
            <distance>XMiddle</distance>
            <hShape>INDEX</hShape>
          </phase>
        </gestures>
      Steps 2, 3 (for each agent)
        <keyframe 1 (time, description)>
        <keyframe 2 (time, description)>
        …
        <keyframe N (time, description)>
      Step 4
        • Nao: JOINT VALUES
        • Greta: BAP
  • 7. BR: Synchronization with speech
      Algorithm
        • Compute the preparation phase
        • Do not perform the gesture if there is not enough time (strokeEnd(i-1) > strokeStart(i) + duration)
        • Add a hold phase to fit the planned gesture duration
        • Co-articulation between several gestures
          - If there is enough time, add a retraction phase (i.e. go back to the rest position)
          - Otherwise, go from the end of the stroke to the preparation phase of the next gesture
      [Timeline diagrams for the two co-articulation cases, marking gesture start and end and stroke start and end (S-start, S-end).]
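The scheduling rules on slide 7 can be sketched roughly as follows. This is a minimal illustration, not the GRETA implementation: the `Gesture` fields, the `schedule` function and the phase labels are hypothetical names, and the hold phase is left out for brevity. The skip test assumes that "not enough time" means the preparation would have to start before the previous stroke ends, i.e. strokeEnd(i-1) > strokeStart(i) - preparation duration; the slide writes the condition with "+ duration", so the sign used here is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gesture:
    name: str
    stroke_start: float      # seconds, taken from the speech time markers
    stroke_end: float
    prep_duration: float     # time needed to reach the stroke start pose
    retract_duration: float  # time needed to go back to the rest position


def schedule(gestures: List[Gesture]) -> List[Tuple[float, str]]:
    """Return (absolute time, phase label) keyframes for the gestures kept."""
    performed: List[Gesture] = []
    prev_stroke_end = None
    for g in gestures:
        prep_start = g.stroke_start - g.prep_duration
        # Assumed skip rule: drop the gesture if its preparation would
        # overlap the stroke of the previously kept gesture.
        if prev_stroke_end is not None and prev_stroke_end > prep_start:
            continue
        performed.append(g)
        prev_stroke_end = g.stroke_end

    keyframes: List[Tuple[float, str]] = []
    for i, g in enumerate(performed):
        nxt = performed[i + 1] if i + 1 < len(performed) else None
        keyframes.append((g.stroke_start - g.prep_duration, f"{g.name}:preparation"))
        keyframes.append((g.stroke_start, f"{g.name}:stroke-start"))
        keyframes.append((g.stroke_end, f"{g.name}:stroke-end"))
        rest_time = g.stroke_end + g.retract_duration
        # Retract only if the rest position is reached before the next
        # preparation begins; otherwise co-articulate (no rest keyframe).
        if nxt is None or rest_time <= nxt.stroke_start - nxt.prep_duration:
            keyframes.append((rest_time, f"{g.name}:rest"))
    return keyframes


demo = [
    Gesture("A", 1.0, 1.5, 0.5, 0.5),
    Gesture("B", 1.8, 2.2, 0.4, 0.5),  # preparation overlaps A's stroke: skipped
    Gesture("C", 2.2, 2.6, 0.4, 0.5),  # kept; A flows into C without retracting
]
labels = [label for _, label in schedule(demo)]
```

In the demo, gesture B is dropped and A has no rest keyframe because C's preparation starts before A could finish retracting.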
  • 8. BR: Velocity profiles
      Gesture velocity
        • Predict a movement duration using Fitts' law:
        • Movement Time = a + b * log2(Distance + 1)
        • Threshold of maximal speeds (empirically determined)
        • The stroke phase differs from the other phases in velocity and acceleration (Quek, 1995)
      Add expressivity
        • Temporal extent (TMP): modulate the duration of the whole gesture => change the coefficients of Fitts' law
  • 9. BR: Build coefficients of Fitts' law
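As a small worked sketch of the duration prediction on slide 8: the function below evaluates Movement Time = a + b * log2(Distance + 1) and then enforces a speed threshold. The coefficient values and the `max_speed` value in the example are hypothetical, since the slides state that these are determined empirically and do not list them; the function names are illustrative only.

```python
import math


def movement_time(distance: float, a: float, b: float) -> float:
    """Fitts' law as written on the slide: MT = a + b * log2(distance + 1)."""
    return a + b * math.log2(distance + 1.0)


def respect_max_speed(distance: float, duration: float, max_speed: float) -> float:
    """Lengthen the duration if the move would exceed max_speed (assumed rule)."""
    min_duration = distance / max_speed
    return max(duration, min_duration)


# Hypothetical coefficients and units, for illustration only.
a, b = 0.2, 0.1
predicted = movement_time(3.0, a, b)  # 0.2 + 0.1 * log2(4)
final = respect_max_speed(3.0, predicted, max_speed=5.0)
```

With these made-up numbers the Fitts prediction would require a speed above the threshold, so the duration is stretched to `distance / max_speed`.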
  • 10. Animation Realizer
      [The architecture diagram of slide 3 is repeated under this title.]
  • 11. Implemented expressivity parameters
      EXP | Definition                        | Nao                                 | Greta
      TMP | Velocity of movement              | Change coefficient of Fitts' law    | Change coefficient of Fitts' law
      SPC | Amplitude of movement             | Limited in predefined key positions | Change gesture space scales
      PWR | Acceleration of movement          | Modulate stroke duration            | Modulate stroke acceleration
      REP | Number of stroke repetition times | Yes                                 | Yes
      FLD | Smoothness and continuity         | No                                  | No
      OPN | Relative spatial extent to body   | No                                  | Elbow swivel angle
      TEN | Muscular tension                  | No                                  | No
      Create animation parameters
        • Joint values for Nao
        • BAP values for Greta
  • 12. Create animation parameters
      Discretization of the gestural space of McNeill (1992)
        • One symbolic position is translated into concrete values of the agent's joints (for instance, 6 joints of Nao, as in the table below)
      Code | ArmX | ArmY     | ArmZ    | Joint values (LShoulderPitch, LShoulderRoll, LElbowYaw, LElbowRoll, LWristYaw, Hand)
      000  | XEP  | YUpperEP | ZNear   | (-54.4953, 22.4979, -79.0171, -5.53477, -0.00240423, 1.0)
      001  | XEP  | YUpperEP | ZMiddle | (-65.5696, 22.0584, -78.7534, -8.52309, -0.178188, 1.0)
      002  | XEP  | YUpperEP | ZFar    | (-79.2807, 22.0584, -78.6655, -8.4352, -0.178188, 1.0)
      010  | XEP  | YUpperP  | ZNear   | (-21.0964, 24.2557, -79.4565, -26.8046, 0.261271, 1.0)
      ...  | ...  | ...      | ...     | ...
      Translate symbolic keyframes into joint values
      Animation is obtained by interpolating between joint values
        • Nao: with the robot's built-in proprietary procedures
        • Greta: with Slerp (spherical linear interpolation) and time warping (easing in/out functions)
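The translation step on slide 12 amounts to a table lookup. The sketch below uses the four rows shown on the slide; the container names (`POSITION_TABLE`, `JOINT_NAMES`) and the function `to_joint_keyframes` are hypothetical, and the rest of the table, elided on the slide, is not filled in.

```python
from typing import Dict, List, Tuple

SymbolicPosition = Tuple[str, str, str]  # (ArmX, ArmY, ArmZ)

JOINT_NAMES = (
    "LShoulderPitch", "LShoulderRoll", "LElbowYaw",
    "LElbowRoll", "LWristYaw", "Hand",
)

# Only the rows listed on the slide; the remaining rows are elided there.
POSITION_TABLE: Dict[SymbolicPosition, Tuple[float, ...]] = {
    ("XEP", "YUpperEP", "ZNear"): (-54.4953, 22.4979, -79.0171, -5.53477, -0.00240423, 1.0),
    ("XEP", "YUpperEP", "ZMiddle"): (-65.5696, 22.0584, -78.7534, -8.52309, -0.178188, 1.0),
    ("XEP", "YUpperEP", "ZFar"): (-79.2807, 22.0584, -78.6655, -8.4352, -0.178188, 1.0),
    ("XEP", "YUpperP", "ZNear"): (-21.0964, 24.2557, -79.4565, -26.8046, 0.261271, 1.0),
}


def to_joint_keyframes(
    symbolic: List[Tuple[float, SymbolicPosition]],
) -> List[Tuple[float, Dict[str, float]]]:
    """Map (time, symbolic position) keyframes to (time, joint values)."""
    result = []
    for time, position in symbolic:
        if position not in POSITION_TABLE:
            raise KeyError(f"no joint values recorded for {position}")
        result.append((time, dict(zip(JOINT_NAMES, POSITION_TABLE[position]))))
    return result


joint_keyframes = to_joint_keyframes([
    (0.0, ("XEP", "YUpperEP", "ZNear")),
    (0.5, ("XEP", "YUpperP", "ZNear")),
])
```

The resulting (time, joint values) pairs are what an interpolation routine would consume; on Nao the slides delegate that interpolation to the robot's built-in procedures.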
  • 13. Greta: Full Body IK
      [Figure labels: Torso IK; Analytic method: arm to torso; Torso target depending on hand position]
  • 16. Perceptive Evaluation
      Objective
        • Evaluate how the robot's gestures are perceived by human users
      Procedure
        • Participants (63 French speakers) rate videos of the Nao storyteller
        • Versions displayed at random to the participants:
          - Gestures with expressivity vs. gestures without expressivity
          - Gesture-speech synchronization vs. gesture-speech asynchronization
      Results (using the ANOVA method)
        • Synchronization:
          - F(1, 124) = 4.94, p < .05
          - 76% agreed that gestures were synchronized with speech for the synchronized version
        • Expressivity:
          - F(1, 124) = 4.43, p < .05
          - 70% agreed that gestures were expressive for the expressive version
  • 17. State of the art
      Most similar work: Salem et al. (2012)
        • Same idea (based on the existing Max virtual agent system)
      Main differences:
        • Our system: re-designed GRETA as a common framework
        • Salem et al.'s system: adjusted Max's ACE to the ASIMO robot
      Features           | Our model                                            | Salem et al.'s system
      Gesture Production | Online from templates, regardless of specific domain | Automatically generated from a trained, domain-specific data corpus
      Gesture Shapes     | Agent-specific parameter configurations              | Original for Max and mapped to ASIMO
      Gesture Timing     | Agent-specific parameter                             | Original for Max and adapted to ASIMO by feedback
      Expressivity       | Yes                                                  | No
      Synchronization    | Adapt gesture to speech                              | Cross-modal adjustment
  • 18. Future work
      Short-term plan
        • Human-like gestures: enhance velocity profiles
        • Expressivity: implement fluidity and tension
      Long-term plan
        • Feedback mechanism
        • Study of the coherence between consecutive gestures in a G-Unit (Kendon, 2004)

Editor's Notes

  1. Schedule Mechanisme Such as Account Realize Obtain /ob chen/ Architecture /ar ki tec tro/ Exchange /ex s change z/ Twice / wi so/ Table /ta ble/ Creating /cre et ting/ Message /me se/ Virtual /vir tu al/
  2. Give a description of the keyframes: what information do they contain?
  3. Add the missing definitions. "Power": acceleration simulation through slerp (frame interpolation) or trajectory interpolation, using time variation functions (easing in/out functions). Expressive Posture: Volume Editing. Power parameter: torso relative rotation varies with time and gesture target positions due to inertia. Expressive Animated Sequence: Sequential Editing. "Fluidity" and "tension": using TCB splines and noise functions (for the trajectory).
  4. Joint rotation interpolation: use Slerp (spherical linear interpolation) with time warping (easing in/out functions). Definition of trajectory parameters: various trajectory paths (line, circle, spiral, etc.). Expressivity: Kochanek-Bartels splines (TCB splines).
  5. For posture generation, we use forward kinematics (FK). FK defines the initial states; IK retargets the postures. Relative torso movement is first generated by using a potential torso target that depends on the positions of both hand gestures (vt1, vl5). We decompose torso movement into horizontal and vertical movements; it depends on the center of both hand targets, and we solve it directly with an analytical method. Head direction is generated by FK, with a trigonometric function for gaze. For arm gestures we use a mass-spring solver, which can apply lightweight shoulder movements by defining the arm chain from the sternoclavicular joint to the wrist. This allows us to model passive shoulder movement.
  6. The system of Salem et al. produces gesture parameters, which can potentially result in mistimed synchronization with the speech affiliate due to physical joint velocity limits. Max: gesture shapes are designed for a virtual agent, hence the mapping solution.
  7. Long-term plan: Mutual synchronization: Adapting phoneme duration to gestures
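Note 4 above mentions Slerp with time warping through easing in/out functions. The following is a minimal sketch of that idea on unit quaternions, written with the standard library only. The particular easing curve (smoothstep) and the function names are assumptions made for illustration; the notes do not say which easing functions the system uses.

```python
import math
from typing import Tuple

Quaternion = Tuple[float, float, float, float]  # (w, x, y, z), unit length


def ease_in_out(t: float) -> float:
    """Smoothstep easing on [0, 1]; an assumed choice of time warping."""
    return t * t * (3.0 - 2.0 * t)


def slerp(q0: Quaternion, q1: Quaternion, t: float) -> Quaternion:
    """Spherical linear interpolation between two unit quaternions."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # Take the shorter arc: q and -q encode the same rotation.
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:
        # Nearly identical rotations: fall back to normalized lerp.
        q = tuple(a + t * (b - a) for a, b in zip(q0, q1))
        norm = math.sqrt(sum(c * c for c in q))
        return tuple(c / norm for c in q)
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return tuple(w0 * a + w1 * b for a, b in zip(q0, q1))


def eased_slerp(q0: Quaternion, q1: Quaternion, t: float) -> Quaternion:
    """Slerp evaluated at a warped time, giving slow-in/slow-out motion."""
    return slerp(q0, q1, ease_in_out(t))


identity = (1.0, 0.0, 0.0, 0.0)
quarter_turn_x = (math.cos(math.pi / 4), math.sin(math.pi / 4), 0.0, 0.0)
halfway = eased_slerp(identity, quarter_turn_x, 0.5)
```

Because the easing only remaps the time parameter, the rotation path stays the same great-circle arc; only the speed along it changes, starting and ending slowly.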