Controllable image-to-video translation: A case study on facial expression generation
Published by Jiaxu Miao
Published in: Technology
  1. Controllable image-to-video translation: A case study on facial expression generation
  2. Introduction
     ■ Task description
     – How to generate video clips of rich facial expressions from a single profile photo of the neutral expression
     ■ Difficulties
     – Image-to-video translation may seem like an ill-posed problem because the output has many more unknowns to fill in than the input provides
     – Humans are familiar with, and sensitive to, facial expressions
     – The face identity is supposed to be preserved in the generated video clips
  3. Introduction
     ■ Different people express emotions in similar manners
     ■ The expressions are often “unimodal” for a fixed type of emotion
     ■ The human face in a profile photo draws most of a viewer’s attention, so the quality of the generated background is less important
  4. Method
     ■ Problem formulation
     – Given an input image I ∈ ℝ^(H×W×3), where H and W are respectively the height and width of the image, the goal is to learn a model f(I, a) that generates a sequence of video frames {V(a) := f(I, a); a ∈ [0, 1]}.
     ■ Properties:
     – a = 0: f(I, 0) = I (the first frame reproduces the input)
     – Smoothness: f(I, a) and f(I, a + Δa) should be visually similar when Δa is small
     – V(1) is the peak state of the expression
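The three properties of f(I, a) can be illustrated with a toy stand-in. The real f is a learned network; the linear morph below, with a hypothetical `peak` frame as an extra argument, only demonstrates what the formulation demands of any candidate model.

```python
import numpy as np

def f(I, a, peak):
    """Toy stand-in for the learned generator f(I, a).

    Linearly morphs the neutral input frame I toward a hypothetical
    peak-expression frame as a goes from 0 to 1. Not the paper's model;
    it merely satisfies the three required properties.
    """
    return (1.0 - a) * I + a * peak

# Check the three properties on random data.
H, W = 4, 4
I = np.random.rand(H, W, 3)
peak = np.random.rand(H, W, 3)

assert np.allclose(f(I, 0.0, peak), I)     # a = 0 reproduces the input
assert np.allclose(f(I, 1.0, peak), peak)  # a = 1 is the peak expression
da = 1e-3                                  # small steps change the frame little
assert np.abs(f(I, 0.5 + da, peak) - f(I, 0.5, peak)).max() < 1e-2
```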
  5. Method
  6. Method
     ■ Training loss: a combination of
     – an adversarial loss
     – a temporal-continuity term
     – a facial landmark prediction loss L_k
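The slide names the three loss terms but the formulas themselves were rendered as images and did not survive the transcript. The sketch below uses common choices for each term (non-saturating adversarial loss, mean absolute frame difference, squared landmark error); these concrete forms and the weights `w_adv`, `w_tc`, `w_k` are assumptions, not the paper's exact objective.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def total_loss(d_fake_logits, frames, pred_landmarks, gt_landmarks,
               w_adv=1.0, w_tc=1.0, w_k=1.0):
    """Hedged sketch of a combined training objective (assumed forms).

    frames: generated clip, shape (T, H, W, 3).
    """
    # Adversarial term (non-saturating): push discriminator logits
    # on generated frames toward "real".
    l_adv = -np.log(sigmoid(d_fake_logits)).mean()
    # Temporal continuity: penalize large differences between
    # consecutive generated frames so the clip evolves smoothly.
    l_tc = np.abs(frames[1:] - frames[:-1]).mean()
    # Landmark prediction L_k: regress facial landmarks of generated
    # frames toward ground-truth landmark positions.
    l_k = ((pred_landmarks - gt_landmarks) ** 2).mean()
    return w_adv * l_adv + w_tc * l_tc + w_k * l_k
```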
  7. Method
     ■ Jointly learning the models of different types of facial expressions
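One common way to let a single shared generator handle several expression types is to condition it on an expression label alongside the image and the progress value a. The slide does not specify the mechanism, so the one-hot, per-pixel conditioning below (and the constant `NUM_EXPRESSIONS`) is an illustrative assumption.

```python
import numpy as np

NUM_EXPRESSIONS = 3  # assumed number of jointly learned expression types

def expression_code(expr_id, num_expressions=NUM_EXPRESSIONS):
    """One-hot code telling the shared model which expression to produce."""
    code = np.zeros(num_expressions)
    code[expr_id] = 1.0
    return code

def conditioned_input(I, a, expr_id):
    """Stack the image with per-pixel maps of a and the expression code,
    one plausible way to feed (I, a, expression type) into one generator."""
    H, W, _ = I.shape
    a_map = np.full((H, W, 1), a)
    code = expression_code(expr_id)
    code_maps = np.broadcast_to(code, (H, W, NUM_EXPRESSIONS))
    return np.concatenate([I, a_map, code_maps], axis=-1)
```

With this layout a 3-channel image becomes a (3 + 1 + NUM_EXPRESSIONS)-channel input, so one network can be trained on all expression types at once.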
  8. Experiments
     ■ Visualization
  9. Experiments
     ■ Analysis on temporal continuity
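A simple way to analyze temporal continuity, consistent with the smoothness property from the problem formulation, is to plot the mean absolute difference between consecutive frames: a smooth clip yields a flat, low curve, while an abrupt transition produces a spike. This metric is an assumption for illustration, not necessarily the paper's evaluation protocol.

```python
import numpy as np

def frame_differences(frames):
    """Mean absolute difference between consecutive frames.

    frames: clip of shape (T, H, W, 3); returns a length-(T-1) curve
    that serves as a rough temporal-continuity profile.
    """
    return np.abs(frames[1:] - frames[:-1]).mean(axis=(1, 2, 3))

# A smoothly fading clip vs. one with a hard jump halfway through.
t = np.linspace(0.0, 1.0, 10)
smooth = t[:, None, None, None] * np.ones((10, 4, 4, 3))
abrupt = np.concatenate([np.zeros((5, 4, 4, 3)), np.ones((5, 4, 4, 3))])

assert frame_differences(smooth).max() < frame_differences(abrupt).max()
```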
  10. Experiments
