Gain in-depth insights about the challenges of implementing WebRTC on iOS and Android mobile devices for both an HTML5 and a native experience. Learn best practices to overcome them.
6. Background
Pre-WebRTC
§ Limited tools for developers
§ Real-time voice/video conferencing algorithms were limited to only a few proprietary native and enterprise applications
§ Phone applications and web applications were two independent entities
[Figure labels: PC, Modem, Mobile, IP Phone]
Post-WebRTC
§ Abundant tools/APIs available for developers
§ Easy access to complex real-time voice/video algorithms
§ Enables real-time communications on numerous platforms with different operating systems, sizes, and functions, and in browsers with no plug-ins
... But it does not come free; it brings along multiple challenges
7. Integration Options
NATIVE APPLICATIONS
• Over-the-Top Applications
  – Best-effort services offered by non-operator
  – Can be downloaded from Apple Store/Google Play
• Pre-Loaded Applications
  – Bundled with the platform
  – Cannot be deleted or disabled
• Embedded Applications
  – Operator driven
  – Deeply integrated applications
  – Part of the Native Dialer
Web Browser-Based Applications
8. V.VoIP Building Blocks
[Block diagram of a V.VoIP Application]
• Signaling & User Interface
• WebRTC Media Engine
  – Audio Coding Module
  – Video Conferencing Module
  – Voice/Video QoS
  – Audio Pre-Processing Module: AEC, NC, AGC
• ICE, STUN, TURN
• Security: TLS, SRTP, DTLS
• Audio/Video Capture/Rendering
• Network Interface
9. Mobile V.VoIP Applications
[Diagram: SIP Signaling and a Voice + Video App, with three application areas]
• Enterprise Telephony: SIP + Traditional Voice + Video Codecs
• Video + Voice over LTE: IR.92/IR.94 + AMR
• Rich Communication Suite: RCS 5.0/5.1
10. Challenges
I. Device Related
  1. Battery Life
  2. Latency Variability
  3. Platform Diversity
II. Network Related
  4. Bandwidth Variability
  5. Latency Variability
  6. Packet Losses
III. Ecosystem Related
  7. Network Handoff
  8. Interoperability
11. 1. Battery Life
• Challenge
  – Improved battery life for mobile video/voice conferencing
• Potential Solutions
  – Optimize for target mobile application processors
    • MIPS-intensive algorithms such as echo cancellation and codecs should be optimized for the target processors, thus reducing the overall mW/MHz required
  – Leverage hardware accelerators wherever available
    • Not all platforms expose APIs for hardware integration
  – Diminishing returns for frame rate
    • Significant difference in perceivable video quality between 10fps and 15fps, but not so much between 15fps and 30fps to justify a 2x increase in battery consumption
  – Principle of good enough
    • Resolution based on form factor of the devices. No need for HD video telephony burning up CPU and your hands on a 4 inch screen.
    • If the user really wants the highest possible quality, give them a setting for it with an appropriate warning :)
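The "good enough" principle above can be sketched as a small policy that picks capture settings from the screen size. This is only an illustration: the class names, the 5-inch threshold, and the concrete resolutions are assumptions made for the example, not values from the slides or from any WebRTC API.

```java
// Hypothetical policy: choose capture settings from the device form factor.
final class CapturePolicy {
    /** Illustrative value object; not part of any real SDK. */
    static final class CaptureSettings {
        final int width;
        final int height;
        final int fps;

        CaptureSettings(int width, int height, int fps) {
            this.width = width;
            this.height = height;
            this.fps = fps;
        }
    }

    // Assumed cut-off between "small" and "large" screens; tune per device family.
    static final double SMALL_SCREEN_INCHES = 5.0;

    static CaptureSettings choose(double screenDiagonalInches, boolean userWantsMaxQuality) {
        if (userWantsMaxQuality) {
            // Explicit opt-in: the UI should warn about battery drain before enabling this.
            return new CaptureSettings(1280, 720, 30);
        }
        if (screenDiagonalInches <= SMALL_SCREEN_INCHES) {
            // Small screen: HD is not visible, and 15 fps avoids the cost of 30 fps.
            return new CaptureSettings(640, 480, 15);
        }
        // Larger screen: allow HD but stay at 15 fps by default.
        return new CaptureSettings(1280, 720, 15);
    }

    public static void main(String[] args) {
        CaptureSettings s = choose(4.0, false);
        System.out.println(s.width + "x" + s.height + " @ " + s.fps + " fps");
    }
}
```

Keeping the opt-in as an explicit flag mirrors the slide's suggestion that maximum quality should be a user setting rather than the default.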
12. 1.1 Example CPU Usage %
[Bar chart: CPU usage % (axis 0 to 100) for Voice NB Call, Voice WB Call, Voice + SW Video, and Voice + HW Video. Annotations: HD Video; VGA 30fps on Single-Core CPU]
• Hardware accelerators not only offload the CPU and reduce power consumption but also enable HD-resolution video
14. 2.1 Latency Variability
• Challenge
  – Reduced latency for superior user experience and to avoid talking over each other
• Potential Solutions
  – Reduced algorithmic delay for voice/video encode/decode and DSP algorithms such as echo cancellation
  – Optimized integration with audio, video and network interfaces
  – Dedicated bearer for real-time applications (voice and video conferencing)
15. 2.2 Example Latency Measurements
[Stacked bar chart: latency (axis 0 to 400) for Handset#1, Handset#2, and Handset#3, split into Audio Interface Delay, Network Interface Delay, and Algorithmic Delay]
• Audio and network interface delay is very device specific and contributes a major chunk of the latency on the client
• Algorithmic delay is dependent on the optimization of the voice processing algorithms
16. 3. Platform Diversity
• Challenge
  – Maintain same user experience across different platforms
• Potential Solutions
  – Adapt to different form factors
  – Auto tuning of Acoustic Echo Cancellation (AEC) across the devices. AEC characteristics are heavily dependent on the platform industrial design (mic and speaker location)
  – Optimization of the DSP algorithms for target processor
  – Optimization/integration with the platform audio and network interfaces
17. 3.1 Example Audio Characteristics
Mobile Device | Flat Delay (msec) | ERL   | Non-Linearity     | Jitter, Record Event (msec) | Jitter, Play Event (msec)
Handset#1     | 330               | -3dB  | No non-linearity  | 15                          | 8
Handset#2     | 250               | -4dB  | Non-Linear        | 20                          | 5
Handset#3     | 225               | -40dB | Highly Non-Linear | 20                          | 5
Legend:
• Flat Delay - Delay between the speech sample played at the APP (sent to audio interface for playback) and the time the echo of the sample is received by the APP (data capture provided by the audio interface).
• Non-Linearity - Because of the post/pre-processing components used in the audio path, the entire frequency range of the played and echoed speech is not modified uniformly, and this nonlinear distortion impacts the echo cancellation
• ERL - Echo return loss - specifies the loss/gain encountered by the signal in the acoustic environment
• Jitter - variations in the data capture/play out intervals
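As a rough illustration of the jitter definition in the legend, the sketch below measures how far the intervals between audio callbacks stray from the nominal frame period. The method name, the use of millisecond timestamps, and the choice of "largest deviation" as the metric are assumptions for the example; the slides do not say how their figures were measured.

```java
// Hypothetical helper: estimates capture/playout jitter from callback timestamps.
final class CaptureJitter {
    /**
     * Returns the largest deviation (in ms) of any interval between consecutive
     * callbacks from the nominal period. Returns 0 for fewer than two timestamps.
     */
    static long maxJitterMs(long[] callbackTimesMs, long nominalPeriodMs) {
        long worst = 0;
        for (int i = 1; i < callbackTimesMs.length; i++) {
            long interval = callbackTimesMs[i] - callbackTimesMs[i - 1];
            worst = Math.max(worst, Math.abs(interval - nominalPeriodMs));
        }
        return worst;
    }

    public static void main(String[] args) {
        // Made-up timestamps for a device that nominally delivers audio every 20 ms.
        long[] times = {0, 20, 45, 60};
        System.out.println(maxJitterMs(times, 20) + " ms");
    }
}
```

A per-device measurement like this is one possible input for sizing audio buffers or tuning the echo canceller.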
18. 4. Bandwidth Variability
• Challenge
  – Adapt quickly and efficiently to varying bandwidth conditions
• Potential Solutions
  – Video Adaptation Parameters
    • Frames Per Second
    • Video Resolution
    • Bit Rates
  – Voice Adaptation Parameters
    • Data Rate Adaptation
    • Codec Change
  – Diminishing returns for bandwidth
    • Marginal to no improvement in quality beyond a certain bandwidth for a given resolution and frame rate
    • Don't use up available bandwidth if it is not required. A judicious choice is a must in order to eliminate blockiness
  – Bandwidth control
    • Maintaining a given bandwidth and the right split of codec bitrate and FEC in the presence of variable loss
    • A constant bitrate configuration of the codec is essential. Variable bandwidth can play havoc with pacing/bandwidth adaptation
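The bandwidth-control idea above (cap the budget at what is useful, then split it between codec bitrate and FEC according to loss) could look roughly like this. The heuristic of spending twice the loss rate on FEC, capped at half the budget, is an assumption chosen for the example and is not taken from the slides or from WebRTC itself.

```java
// Hypothetical allocator: splits a bandwidth budget between codec and FEC.
final class BitrateAllocator {
    /** Illustrative result type. */
    static final class Allocation {
        final int codecKbps;
        final int fecKbps;

        Allocation(int codecKbps, int fecKbps) {
            this.codecKbps = codecKbps;
            this.fecKbps = fecKbps;
        }
    }

    /**
     * @param estimatedKbps bandwidth estimate from the network
     * @param maxUsefulKbps point of diminishing returns for the current resolution and frame rate
     * @param lossPercent   observed packet loss, 0 to 100
     */
    static Allocation split(int estimatedKbps, int maxUsefulKbps, int lossPercent) {
        // Do not use bandwidth that brings no visible quality gain.
        int budgetKbps = Math.min(estimatedKbps, maxUsefulKbps);
        // Assumed heuristic: FEC overhead of twice the loss rate, capped at 50% of the budget.
        int fecPercent = Math.min(50, 2 * Math.max(0, lossPercent));
        int fecKbps = budgetKbps * fecPercent / 100;
        return new Allocation(budgetKbps - fecKbps, fecKbps);
    }

    public static void main(String[] args) {
        Allocation a = split(2000, 800, 5);
        System.out.println("codec=" + a.codecKbps + " kbps, fec=" + a.fecKbps + " kbps");
    }
}
```

Because the total stays fixed while the split changes, the sender keeps a constant rate on the wire even as loss varies, which is the property the slide asks for.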
20. 6. Packet Losses
• Challenge
  – Mitigate packet loss to provide a smooth voice and video experience
    • Packet loss is very common in wireless environments, particularly on Wi-Fi
• Mitigation Approach
  – Adaptive jitter buffer and packet loss concealment to mitigate packet losses
  – A suitable FEC strategy to combat loss and maintain user experience with minimal latency
  – Balancing FEC protection, quality, latency especially at low bandwidths
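To make the FEC bullet concrete, here is a minimal sketch of the simplest kind of forward error correction: one XOR parity packet protecting a group of equally sized media packets. It is a teaching example under that equal-size assumption, not the FEC scheme used by any particular WebRTC build, and the class and method names are made up.

```java
import java.util.Arrays;

// Hypothetical single-parity XOR FEC over a group of equally sized packets.
final class XorFec {
    /** Builds one parity packet as the byte-wise XOR of all packets in the group. */
    static byte[] parity(byte[][] packets) {
        byte[] p = new byte[packets[0].length];
        for (byte[] pkt : packets) {
            for (int i = 0; i < p.length; i++) {
                p[i] ^= pkt[i];
            }
        }
        return p;
    }

    /** Recovers the single missing packet (the null entry) from the others plus the parity. */
    static byte[] recover(byte[][] received, byte[] parity) {
        byte[] out = parity.clone();
        for (byte[] pkt : received) {
            if (pkt == null) {
                continue; // this is the lost packet
            }
            for (int i = 0; i < out.length; i++) {
                out[i] ^= pkt[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[][] sent = {{1, 2}, {3, 4}, {5, 6}};
        byte[] parity = parity(sent);
        byte[][] received = {sent[0], null, sent[2]}; // second packet lost in transit
        System.out.println(Arrays.toString(recover(received, parity)));
    }
}
```

One parity packet per group of three costs 33% extra bandwidth and can repair only one loss per group, which shows why FEC protection has to be balanced against quality and latency at low bandwidths.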
21. 6.1 Example Packet Loss & Mitigation
• Packet loss mitigation algorithm should help maintain good voice call quality for random and small bursts
[Line chart: Voice Quality (PESQ, axis 0 to 4.5) versus Packet Error Rate (PER: 0%, 3%, 5%, 10%, 20%, 30%, 40%, 50%) for App#1, App#2, and App#3, with regions marked "Good Voice Quality" and "Dropped Call"]
22. 7. Network Handoff
• Challenge
  – Provide seamless call handoff
    • Mitigate dropped calls when moving from one access network to the other
• Potential Solutions
  – Implement make-before-break/call anchoring algorithms that can seamlessly hand off when moving from one network to the other
  – Some of the 3GPP standards support VCC (Voice Call Continuity) for seamless handoff between circuit-switched networks and packet networks
23. 7.1 Typical Network Scenario
[Diagram: a device moving between networks, labeled Home, On-the-road, Enterprise, and 3G]
• Dropped calls when changing networks
24. 8. Interoperability
• Challenge
  – Interoperability with legacy systems
• Potential Solutions
  – Because of the open-source nature of WebRTC, it is possible to integrate additional codecs and algorithms into the WebRTC codebase for interoperability
  – Transcoding might be required in the infrastructure to support some legacy systems
25. Summary
• PPA (Performance, Power, Adaptation): the key metrics for optimal voice/video experience on mobile devices
• Performance
  – For improved performance, all the parameters discussed have to be tuned carefully for device and networks
• Power
  – Optimize for target apps processor, leverage hardware accelerators wherever available
  – Find the sweet spot: adopt the principle of good enough and diminishing returns for frame rate
• Adaptation
  – Adapt quickly and efficiently to varying bandwidth conditions, network packet losses and form factors
26. WebRTC on Mobile - Intégration, Challenges & Solutions
Saraj Mudigonda
Imagination Technologies
Saraj.mudigonda@imgtec.com
28. Outline
• A Constellation of Options
  – browser versus native Apps
• Mobile Considerations
  – WebRTC is hard. Mobile is harder.
• Case study across platforms using an SDK
  – iOS, Android, JavaScript
29. Browser vs Native
• HTML5+JavaScript+WebRTC
  – the promise: write once, run everywhere!
  – Chrome leads, Firefox and Opera closely follow
• Native App on iOS and Android
  – deliver a native app experience
  – what does WebRTC interoperability mean?
30. WebRTC Platform Options
• Android
  – Browser: Chrome, Firefox, Opera (Mar 2014)
    • Note: WebRTC is not in the Android WebView
  – Native: SDKs
    • Android device market is incredibly fragmented
• iOS
  – Browser: really none
  – Native: SDKs
    • iOS platform is relatively well managed
    • 80% iOS7 (Ars Technica)
[Figure labels: Native App, WebView]
33. What is WebRTC?
• A Suite of Technologies standardized by W3C
  – Codecs: VP8, Opus
  – Transport: RTP, encryption
  – Specifications for describing endpoints (SDP)
  – Methods for endpoints to connect (ICE, STUN, TURN)
  – JavaScript bindings for
    • accessing your camera, microphone
    • obtaining SDP entry of yourself
    • rendering media streams described by an SDP entry
34. What WebRTC is not
• A way to identify users
  – authentication
• A way to locate other users
  – phonebook, user database
• A way to initiate calls to other users
  – signaling
• A way to see the online status of other users
  – presence
35. What is WebRTC-compatible?
• Native Apps may be compatible if
  – they can interoperate with a WebRTC browser
• Native Interoperability
  – VP8, Opus
  – RTP, encryption
  – SDP, ICE, STUN, TURN
• May or may not present bindings similar to JavaScript
36. Compiling WebRTC yourself
• Ref: http://ninjanetic.com/how-to-get-started-with-webrtc-and-ios-without-wasting-10-hours-of-your-life/
• “The current HEAD has audio and video” [May 2014]
• Demo App uses ICE servers apprtc.appspot.com
• My summary: this is really hard
37. Back-end technologies
• Website versus Mobile Backend
  – Website development across platforms (desktop, iOS, Android) is understood (Rails, Django, Express, etc.)
  – Mobile App backend development may be unfamiliar
    • Parse
• Why do you need one?
  – Identify users (authentication, roster)
  – Keep your API keys secret
  – Protect access to your realtime resources
    • TURN relays cost real $$
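One common way for a backend to protect TURN relays is to hand out short-lived credentials derived from a secret that never leaves the server. The sketch below assumes the TURN server is configured for shared-secret, time-limited credentials (username is "expiry:userId", password is the Base64 HMAC-SHA1 of the username); the class name, method names, and the sample secret are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical backend helper: issues time-limited TURN credentials.
final class TurnCredentials {
    /** Username encodes the expiry time (Unix seconds) followed by the user id. */
    static String username(String userId, long nowEpochSeconds, long ttlSeconds) {
        return (nowEpochSeconds + ttlSeconds) + ":" + userId;
    }

    /** Password is Base64(HMAC-SHA1(sharedSecret, username)); the secret stays on the server. */
    static String password(String sharedSecret, String username) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(sharedSecret.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] digest = mac.doFinal(username.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA1 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // "example-secret" is a placeholder; a real secret must come from server configuration.
        String user = username("alice", System.currentTimeMillis() / 1000, 600);
        System.out.println(user + " / " + password("example-secret", user));
    }
}
```

The mobile app only ever sees the derived username and password, so a leaked credential stops working once the expiry time passes.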
38. When Worlds Collide
• If you’re coming from the telephony world
  – this is not SIP, it’s not a softphone
  – webservices bring their own problems
  – security, cross-site request forgery, OAuth2, JavaScript code is “in the clear”, mobile authentication
• If you’re coming from Webservices
  – RTC is not request/response, it’s not REST
  – get up to speed on Offer/Answer, SDP, RTP, ICE candidates
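For readers coming from web services, it may help to see what an ICE candidate actually is: a single text line carried in SDP or in signaling messages. The sketch below parses the leading fields of such a line. The class is hypothetical, the sample line uses documentation-range addresses, and only the common fields are handled.

```java
// Hypothetical parser for the leading fields of an ICE candidate line.
final class IceCandidate {
    final String foundation;
    final int component;
    final String transport;
    final long priority;
    final String address;
    final int port;
    final String type; // e.g. host, srflx (via STUN), relay (via TURN)

    private IceCandidate(String[] f) {
        foundation = f[0];
        component = Integer.parseInt(f[1]);
        transport = f[2];
        priority = Long.parseLong(f[3]);
        address = f[4];
        port = Integer.parseInt(f[5]);
        type = f[7];
    }

    static IceCandidate parse(String line) {
        // Accept both the SDP attribute form ("a=candidate:...") and the bare form.
        String body = line.startsWith("a=") ? line.substring(2) : line;
        if (!body.startsWith("candidate:")) {
            throw new IllegalArgumentException("not a candidate line: " + line);
        }
        String[] f = body.substring("candidate:".length()).trim().split("\\s+");
        if (f.length < 8 || !"typ".equals(f[6])) {
            throw new IllegalArgumentException("malformed candidate: " + line);
        }
        return new IceCandidate(f);
    }

    public static void main(String[] args) {
        IceCandidate c = parse(
            "candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 10.0.0.5 rport 46154");
        System.out.println(c.type + " " + c.address + ":" + c.port);
    }
}
```

Logging the candidate type is a quick way to see whether a call is going peer-to-peer (host or srflx) or through a TURN relay.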
39. Identify your Use Case
• Is two-way video the application?
• Is it FaceTime?
• Is it customer assistance?
  – video is a feature, not the focus
  – does it get out of the way
  – one-way video?
• Is it a multiway call? (meeting)
• The way screen real estate and compute resources are used is VERY different from the desktop.
40. These are very different
• WVGA or HD
• 250 kbps to 1 Mbps
• Video IS the focus

• 240x180 might be enough
• 100 kbps
• Video floats while App continues
41. Is Tablet the same as Mobile?
• Tablets do not roam so often: more tethered to a base station
• May not have a cellular network
• Screen size? Camera?
  – Samsung Galaxy Tab 2 10” front camera is 0.3MP
• Battery life? CPU?
• Shaky camera effect
46. Cellular network considerations
• The cellular network does not behave like a consumer home network
  – do not expect peer-to-peer to work on 4G
• Handoffs between cells affect IP addresses
  – sudden changes in network connectivity [RFC5944: mobility]
• Question: how does your WebRTC application behave in the context of mobile networks?
47. Handoffs from WiFi to Cellular
• Mobile apps like FaceTime set consumer expectations
• WebRTC technologies do not really address changing network topologies
48. App Development
• How is an App different from a Desktop?
  – Apps are sandboxed: a restricted execution environment
  – no shared memory or libraries
• Screensharing
  – Mobile environments sandbox apps for security
  – Apps can sample their own pixels, not the pixels of other apps
49. Power State Transitions
• Foreground/Background
  – Can a user depart the video engagement to go look at something else?
  – Can the App or Page receive notifications while in the background?
• Browser-based implementation
  – background window is not active
50. Use WebRTC-compatible SDK
• Abstracts WebRTC endpoints.
• Authenticates users. Presents calls.
• Built on WebRTC technologies
  – VP8, Opus
  – RTP, encryption
  – SDP, ICE, STUN, TURN
51. Examples using SDK
• iOS, Android, JavaScript
  – initialize package
  – authenticate
  – make and receive calls
  – connect to UI elements
54. Mobile Authentication
[Diagram: Your Application (Auth Controller, SDK), your Web Service acting as a proxy, and the Cloud-based Video Service with its own Web Service. Message labels: request token (uid, client, profile), token, Auth token]
58. iOS Summary
• We learned how to
  – initialize package
  – authenticate
  – make and receive calls
  – connect to UI elements
59. Android
• What is different?
  – event bus instead of delegates
  – decorators to register event listeners
60. EventBus
• Based on a common component in Google Guava
• Our lightweight implementation
• Decouples event responsibility from class hierarchy
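The event bus pattern described above can be illustrated with a few lines of reflection. This is a generic sketch of the idea only: it is neither Guava's EventBus nor the SDK's implementation, and every name in it (MiniEventBus, Listener, Screen) is invented for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal event bus: listeners are found by annotation, not by interface.
final class MiniEventBus {
    /** Marks a one-argument method as an event handler. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Listener {}

    private final List<Object> subscribers = new ArrayList<>();

    void register(Object subscriber) {
        subscribers.add(subscriber);
    }

    /** Delivers the event to every annotated method whose parameter type accepts it. */
    int post(Object event) {
        int delivered = 0;
        for (Object s : subscribers) {
            for (Method m : s.getClass().getDeclaredMethods()) {
                if (m.isAnnotationPresent(Listener.class)
                        && m.getParameterCount() == 1
                        && m.getParameterTypes()[0].isInstance(event)) {
                    try {
                        m.setAccessible(true);
                        m.invoke(s, event);
                        delivered++;
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
        return delivered;
    }

    // Demo types: the subscriber does not implement or extend anything bus-specific.
    static final class ConnectedEvent {}

    static final class Screen {
        int connectedCount = 0;

        @Listener
        void onConnected(ConnectedEvent event) {
            connectedCount++;
        }
    }

    public static void main(String[] args) {
        MiniEventBus bus = new MiniEventBus();
        Screen screen = new Screen();
        bus.register(screen);
        bus.post(new ConnectedEvent());
        System.out.println("connected events seen: " + screen.connectedCount);
    }
}
```

Because handlers are matched by annotation and parameter type, any object can receive events without inheriting from a listener base class, which is the decoupling the slide refers to.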
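61. Initialization
import android.app.Activity;
// Weemo SDK imports omitted; Weemo, WeemoEventListener and the event classes come from the vendor SDK.

public class FullscreenActivity extends Activity {
    @Override
    public void onStart() {
        super.onStart();
        // Subscribe this activity to SDK events, then initialize the SDK with the API key.
        Weemo.eventBus().register(this);
        Weemo.initialize("APIKEY", this);
    }

    // Invoked via the event bus when a ConnectedEvent is posted.
    @WeemoEventListener
    public void onConnected(final ConnectedEvent event) { ... }

    // Invoked via the event bus when an AuthenticatedEvent is posted.
    @WeemoEventListener
    public void onAuthenticated(final AuthenticatedEvent event) { ... }
}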
62. Summary
• WebRTC in Chrome has taken many man-years
  – there are more things to think about for mobile
• Leverage a Native SDK for your Mobile Application
63. Thank You
Please remember to complete an evaluation of today’s sessions