- 1. Product & Technology Roadmap
Automated Text-To-Speech and Text-To-Lip Synch
© Totalsynch Inc. 2015, California.
- 2. Problem with VR Avatars, Humanoid AIs and Robots
The Problem of the UNCANNY VALLEY in Robotics, Avatars
and Humanoid AIs.
When a robot or avatar comes close to looking real, but
not entirely, it creates a sensation of revulsion
or fear in the viewer.
Japanese roboticist Masahiro Mori
introduced the problem of the
Uncanny Valley in 1970. Mori
has done pioneering work on
the emotional response of
humans to non-human entities.
He is also the founder of
- 3. Problem
The fear and revulsion
generated by “The
Uncanny Valley” are a
reason why a robot
or a “virtual reality AI”
can NOT be your
children’s nanny, your
romantic partner or
- 4. Simple Examples of Uncanny Valley
“The big problem that one has to face is the fact that everybody in the audience is an expert on how
- Animator Shamus Culhane, from his book ‘Animation: From Script to Screen.’
- 6. www.TotalSynch.info
Examples of Uncanny Valley - 2
Angelina Jolie as a realistic animated character in the movie Beowulf.
The movie was a commercial failure, and viewers said
that it was creepy and evoked fear.
- 7. Problem
as a term referring to the hypothetical state in which a
robot, avatar, or other non-human humanoid is at first
received in an increasingly positive and empathic way from
interacting humans. However, as it becomes even more
realistic, it hits a 'valley', a point where positive response
tails off dramatically, evoking revulsion or fear.
- 8. How do we solve this problem?
TotalSynch’s technology and patents help avatars, robots and
virtual-reality-based, AI-generated characters speak with
simulated, near-perfect human lip movements
in the most cost-effective way.
This solves the biggest riddle of the Uncanny Valley.
Example: in a virtual reality shop, would you like to have a perfect salesgirl
who has crossed the uncanny valley? Would it change the sales dynamic?
Conclusion: With TotalSynch’s technology, APIs and patents, your company can build
the best avatars and virtual-reality-based, AI-generated characters.
The only moving parts of your face, other than your eyes, are your
lips and the surrounding muscles.
The current cost of perfect lip movement is $9,000 per minute of speech.
- 9. www.TotalSynch.info
Industry veterans and VR industry experts consider TotalSynch’s patents
to be marquee patents for bringing realism to any animation and to VR avatars in
the future. This ability has huge commercial value.
We are working with our
IP legal experts/advisors
to partner with intellectual
property expert firms of
the likes of
Humans speak using syllables, not phonemes. Your lips move by syllables; try
it out. TotalSynch is the only company with patents based on syllables.
- 11. What are the markets and customer uses?
TotalSynch automates the process of 3D animated lip-synch, taking one of the
most broken, difficult and challenging aspects of animation and making it perfect,
automated, inexpensive, customizable and fast.
Markets and customer use cases
vector output as a
variable SAAS model
- 12. Text-to-Speech Processes Compared
Critical to the effectiveness of the TotalSynch engine is its unique, patented syllabic
text-to-speech process. To recreate accurate 3D lip-synching, we must begin by
reproducing speech the way real people talk: in syllables. All existing technology
reproduces speech using phonemes.
The Phoneme Process
Master = Mh + Ah + Ss + Th + eR
The phoneme system was an academic invention of over one hundred years ago, intended to
reduce speech to the smallest elements of sound for simplicity in cataloguing and systematic
study. The process was never intended for reconstructing natural speech. Now that it is
being used to assemble computer speech, its archaic and counter-intuitive
structure shows in the stilted, robotic-sounding output.
TotalSynch’s Syllabic Process
Master = mah + stir
TotalSynch is intended to recreate speech entirely naturally, the way real people speak. Beyond
that, it allows the speech to be edited for variations in volume, pace and pitch to create
dramatic emphasis. Users would never think of editing speech in terms of
phonemes; they would edit the same way they talk. When you say “Master” you say it in two
distinct syllables, not five separate sounds. TotalSynch reproduces sounds and lip-synch in this
patented natural way.
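The contrast between the two decompositions of “Master” can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Syllable` class, the `emphasize` helper and the multiplier values are hypothetical names invented here, not part of TotalSynch's engine or API. It shows why editing volume, pace and pitch per syllable means touching two units instead of five.

```python
from dataclasses import dataclass, replace

# The two decompositions of "Master" given in the text.
PHONEME_UNITS = ["Mh", "Ah", "Ss", "Th", "eR"]
SYLLABLE_UNITS = ["mah", "stir"]


@dataclass(frozen=True)
class Syllable:
    """Hypothetical editable unit: one syllable plus its delivery settings."""
    text: str
    volume: float = 1.0  # 1.0 means "unchanged"
    pace: float = 1.0
    pitch: float = 1.0


def emphasize(syllables, index, volume=1.0, pace=1.0, pitch=1.0):
    """Return a new list in which one syllable's settings are scaled."""
    edited = list(syllables)
    s = edited[index]
    edited[index] = replace(
        s,
        volume=s.volume * volume,
        pace=s.pace * pace,
        pitch=s.pitch * pitch,
    )
    return edited


# Build the word from syllables and stress the first one.
word = [Syllable(t) for t in SYLLABLE_UNITS]
stressed = emphasize(word, 0, volume=1.5, pitch=1.2)
```

Because the input list is copied and the dataclass is frozen, the original `word` is left untouched, so several alternative readings of the same line can be derived from one source.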
Technology & Process
- 13. Syllables are the natural way people speak, and TotalSynch has the only patents
on text-to-speech/text-to-animation using syllables.
Every syllable in the language will be recorded individually (sometimes in several different
versions), both vocally and visually.
This creates a library of sounds. The visual version will be used as a reference to create 3D
graphic motion graphs of each syllable. This creates a library of 3D graphic lip-synch syllables.
The sound library and graphic library are ‘married’ to each other, so any modification to one
modifies the other.
The TotalSynch Engine takes two kinds of input and drives two types of output. The first
input is any type of text, such as a script. The second input is a 3D
graphic model of a face or lips, taken from an internal or external source.
The engine takes the text input, parses it to create a syllable map, then extracts those
sounds and graphic lip-motions from the library. The engine then renders these into the two
synchronized outputs: a sound track and a motion graph track. AI or script languages add nuance
to the voice by altering the volume, pace and pitch of syllables or words. Each change made to
the sound makes a corresponding change to the 3D graphic representation of the spoken
Technology & Process
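The engine flow described on this slide (text in, syllable map, library lookup, two synchronized tracks out) can be sketched roughly as follows. Everything here is an assumption made for illustration: the library entries, file names, durations and the hand-written word-to-syllable table are invented, and the function names are hypothetical rather than TotalSynch's actual API. The 3D face model input is left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyllableEntry:
    """One 'married' library record: a sound and its paired lip motion."""
    sound: str        # placeholder name for a recorded sound clip
    lip_motion: str   # placeholder name for the paired 3D motion graph
    duration_ms: int  # invented duration, for illustration only


# Toy library and toy syllable table (both invented for this sketch).
LIBRARY = {
    "mah": SyllableEntry("mah.wav", "mah.motion", 220),
    "stir": SyllableEntry("stir.wav", "stir.motion", 260),
}
WORD_TO_SYLLABLES = {"master": ["mah", "stir"]}


def build_syllable_map(text):
    """Parse text into a flat list of syllables (raises KeyError if unknown)."""
    syllables = []
    for raw in text.split():
        word = raw.strip(".,!?;:\"'").lower()
        if not word:
            continue
        syllables.extend(WORD_TO_SYLLABLES[word])
    return syllables


def render(text, pace=1.0):
    """Return (sound_track, motion_track) as lists of (start_ms, duration_ms, clip)."""
    sound_track, motion_track = [], []
    start = 0
    for syllable in build_syllable_map(text):
        entry = LIBRARY[syllable]
        duration = round(entry.duration_ms / pace)
        # The same timing is written to both tracks, so they stay in sync.
        sound_track.append((start, duration, entry.sound))
        motion_track.append((start, duration, entry.lip_motion))
        start += duration
    return sound_track, motion_track


sound, motion = render("Master.", pace=2.0)
```

Changing `pace` shifts the timing of both tracks at once, which is one simple way to model the idea that a change to the sound produces a matching change in the lip animation.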
- 15. IPA recognizes only 23 Bilabial & Labiodental
The bilabial and labiodental consonants identified by the International Phonetic
Alphabet (IPA) are:
Using just these 23 IPA consonants, it is impossible for
robots, virtual avatars and artificial
intelligences to cross the “Uncanny Valley”.
- 16. There are 87 distinct vectors that can be identified for the lip movements of every possible syllable of the human
voice box. (The 23 IPA units are audio units, not visual ones.) TotalSynch has identified these 87 vectors and can map the
whole database of all possible lip movements using syllables, and that is how you cross the uncanny valley for
virtual reality artificial intelligences and robots.
Making of 87 Lip moving Syllables using Sanskrit and Tsonga
Making of 87 Syllables using Sanskrit and Tsonga
5 × 17 = 85 in Sanskrit
+ 2 in Tsonga = 87
- 20. TotalSynch API
Text or Voice
Standard Audio Files
Standard or artificial
IBM Watson Developer
• 3D tools and plugins (After
• Internet Avatars
• Film/Television Animation
• Facebook/Social Media Avatars
• Video Game Animation
• Adult Entertainment
• VR/3D Studio Shops
Monetized as SaaS or
one-time licensing model
- 21. Text to A/V output
Parse text into
Graphics library of
Modify pace, pitch
speech output using
APIs like IBM
3-D Vector output
From TotalSynch API
speech Audio- Visual
Synthetic syllabic audio visual output
- 22. Helping VR AIs & Robots cross uncanny valley
- 24. www.TotalSynch.info
The Lip Vector API
How an Avatar can sell handbags
in VR shops, perfectly lip-synching
to your language of choice.
Language Voice/Text or
direct voice translated
Why can TotalSynch’s patented technology dominate the VR salesbot and avatar market?
- 25. www.TotalSynch.info
Founded as a video game company with the vision to make
the most unique and immersive interactive multiplayer
role-playing game to date. TotalSynch’s technology is key to
that realism and immersion.
Funding to Date:
$250,000 founder funding
$100,000 funding from founder’s parents
$150,000 from other friends and family.
- 26. www.TotalSynch.info
Matthew Thomas, 30-year background in programming, software
project management, start-ups and video game design.
Bill Munns, 30-year background in Hollywood special effects
production, lip-synch technology and 3D digital art.
DeVerges Jones, 25 years in senior marketing for companies including
AT&T, PhiLip, Kodak, Pepsi-Cola. Won CLIO, EFFIE and CEBA awards.
Daniel Esbensen, 30 years in software research and development,
creating dozens of patents for companies including Microsoft and the
Mangesh Mahajan, former COO/CPO Yactraq Online, expert in artificial
intelligence in audio / NLP technology, strategy, design and architecture.
- 27. (Without Ad revenue) TotalSynch’s cash flow and profit point
Delivery cost: US$500K
Break-even point: 6 – 8 trial businesses
OR 1,000 paying beta users
Delivery cost: US$1.0M
Profit point: 10,000 users ($100/year/user)
OR just one large enterprise customer
Delivery cost: US$2.0M
Profit point: 20,000 users ($100/year/user)
SaaS + API + Extension + Social App revenue
Delivery cost: US$2.5M
Profit point: SaaS + API + Social App +
Patents are issued; ready to sign up with one of the
large patent defense funds.
Private Beta: Limited SaaS delivery (Browser-based
client + light iOS app)
Public Beta: Web + iOS + Android App
Commercial Release: Full SaaS delivery
(Browser + iOS/Android Apps + Desktop client)
Full SaaS solution + API/SDK (developers)
+ Security integration + Social Avatar App
Integration with multiple 3rd parties
(Other cloud/API vendors)
SaaS + API + Social App + 3rd party integration
Public Beta of Mobile Social App
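The profit points quoted on this slide for the US$1.0M and US$2.0M stages are consistent with simple break-even arithmetic: the delivery cost divided by the $100 per user per year price. The sketch below checks that under a loud assumption that is not stated in the deck, namely that the profit point is the user count at which one year of subscription revenue equals the delivery cost, with no other costs or revenue. The function name is hypothetical.

```python
def break_even_users(delivery_cost_usd, price_per_user_per_year_usd):
    """Smallest whole number of paying users whose one-year revenue covers the cost.

    Assumption (not from the deck): subscription revenue is the only income
    and the delivery cost is the only expense.
    """
    if price_per_user_per_year_usd <= 0:
        raise ValueError("price must be positive")
    # Integer ceiling division, so a partial user rounds up to a whole one.
    return -(-delivery_cost_usd // price_per_user_per_year_usd)


stage_2 = break_even_users(1_000_000, 100)  # US$1.0M delivery cost
stage_3 = break_even_users(2_000_000, 100)  # US$2.0M delivery cost
```

Under that assumption the two results come out to 10,000 and 20,000 users, matching the profit points on the slide. The first stage (US$500K, 1,000 paying beta users) does not state a per-user price, so it is not modelled here.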
- 28. TotalSynch
+1 203 533 9842