If aiming to be the next Bell Labs sounds ambitious, it is. Yet that is the mission of SignalWire. We will hear about the digital tools they have created, along with those still in development, which aim to build remote communication in invisible ways. Their applications range from the karaoke bar to the military - anywhere people wish to find more natural and collaborative remote communication.
3. In the 1930s, Bell Labs conducted studies on human hearing
They selected 300 Hz to 3400 Hz as a reasonable band for voice quality.
Enshrined in the digital telephony era via the 16-bit, 8 kHz mono PCM format
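The figures on this slide can be checked with a little arithmetic. The following is a minimal sketch, assuming uncompressed linear PCM; the function names are illustrative and not taken from the talk.

```python
def nyquist_limit_hz(sample_rate_hz: int) -> float:
    """Highest frequency that can be represented at a given sample rate."""
    return sample_rate_hz / 2


def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw bitrate of uncompressed linear PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


# 8 kHz sampling cannot carry anything above 4 kHz, so the 300-3400 Hz
# voice band fits, while content above 4000 Hz is lost entirely.
print(nyquist_limit_hz(8000))
print(pcm_bitrate_bps(8000, 16))
```

At 16 bits and 8 kHz mono this works out to 128 kbit/s of raw audio, with a hard ceiling of 4 kHz on the frequencies that survive.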
5. "Frequency limitation is essentially an economic one, subject to change as conditions change."
- A.H. Ingles, 1938
6. Why are we tethered to 83+ year old economic considerations?
7. Important information is contained in the ranges below 250 Hz and above 4000 Hz
- Subjects could distinguish talking from singing, and determine the sex of the speaker, with reasonable accuracy when all data below 5000 Hz was removed
Previous models didn't contend with loud ambient noise, conference rooms, or kids on Zoom school in the next room.
Critically important with the rapid rise of remote collaboration among all genders, ages, and languages
8. Speech quality as conceived by H.W. Gierlich, which can be extended to video
Focus on Overall Quality
[Diagram: Overall Quality and its contributing factors: Sound Quality & Naturalness, Listening Effort, Talking Effort, Conversational Effort, Double Talk, Speech & Video Characteristics, Expectation, Network Conditions, Background Noise, Intelligibility, Video Quality. A callout on the diagram reads "Original Primary Concern".]
10. 2020s Telecom Network
- Loosely coupled to legacy PSTN networks
- Elastic, running distributed on any compute/networking equipment worldwide
- Uses a vast array of connectivity paths
- Can provide 1-to-1 or centralized mixed audio/video
- Programmability via APIs that provide rich, seamless control and metadata
11. Programmability
All features and metadata accessible via API
- Allow for complete global command & control of all resources
- Realtime contextual information about on-going calls, conferences, or access
- Get distance pings for all participants, geolocation information, realtime network analytics
- Full command & control of conference layout, participant actions, and settings
19. Recommended Latency
VoLTE defines the requirement for voice call latency as 100 ms or less (one-way), and for VoLTE video latency as 150 ms
Maximum latency is 1 Arc de Triomphe
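The slide does not spell out the Arc de Triomphe comparison. One plausible reading is that the latency budget equals the time sound takes to travel roughly the monument's height (about 50 m). The sketch below checks that arithmetic; the speed of sound used is an assumption (dry air at about 20 °C), and the function name is illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at roughly 20 degrees C


def sound_distance_m(latency_ms: float) -> float:
    """Distance sound travels through air during the given delay."""
    return SPEED_OF_SOUND_M_PER_S * latency_ms / 1000.0


for latency_ms in (100, 150):
    print(latency_ms, "ms ->", round(sound_distance_m(latency_ms), 2), "m")
```

Under that assumption, 150 ms corresponds to a little over 51 m of acoustic distance, which is close to the height of the Arc de Triomphe, and 100 ms to about 34 m.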
20. Intelligent Networking Routing
- Utilize Cloud, Near-Edge, and Edge nodes to actively manage participants, with a 25 ms one-way latency goal
- Location of centralized muxing can be moved during a conversation
- Optimize location for greatest participant happiness
- Constantly examine the state of the network to provide optimal paths
- ML-based network analysis can detect disruptions or uncover path optimizations on a per-endpoint basis
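The idea of moving the mixing location to suit the participants can be sketched as a simple selection problem. This is only an illustration, not SignalWire's routing logic: it assumes "participant happiness" is approximated by the worst one-way latency any participant sees, and the node names, participant names, and measurements are hypothetical.

```python
from typing import Dict


def pick_mux_node(latency_ms: Dict[str, Dict[str, float]]) -> str:
    """Return the candidate node whose worst participant latency is lowest."""
    return min(latency_ms, key=lambda node: max(latency_ms[node].values()))


# Hypothetical one-way latency measurements (ms) from each node to each participant.
measurements = {
    "cloud-us-east": {"alice": 18.0, "bob": 42.0, "carol": 35.0},
    "edge-chicago": {"alice": 22.0, "bob": 24.0, "carol": 21.0},
    "edge-dallas": {"alice": 30.0, "bob": 19.0, "carol": 27.0},
}

best = pick_mux_node(measurements)
meets_goal = max(measurements[best].values()) <= 25.0  # the 25 ms one-way goal
print(best, meets_goal)
```

Re-running the selection whenever measurements change is one way to model relocating the mixer during a conversation. A production system would also have to weigh the cost of migrating live media state.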
22. Distributed Video Compute
Centralized muxing for a broadcast level experience
- Everyone receives the same video & audio experience
- Can be siloed, air-gapped, or run independent of centralized command & control when necessary
- Works for 1000+ people in an interactive experience
- GPU offloading and dedicated video cores allow for massive expansion
- Scales to an unlimited number worldwide
- FHE (fully homomorphic encryption) to allow for edge-based node computation without risk
Can take in any format (SIP, PSTN, NDI, H.264, VP8/9, AV1) and bridge them together
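Centralized mixing means every participant's audio is combined once, on the server, and the same result is sent to everyone. The sketch below shows the most basic form of that step for 16-bit PCM: sample-wise summation with clipping. It is an illustration under simplifying assumptions (equal-length frames, no gain control or per-listener mixes), not the implementation described in the talk.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(frames: Sequence[Sequence[int]]) -> List[int]:
    """Sum equal-length 16-bit PCM frames sample by sample, clipping to int16."""
    if not frames:
        return []
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same length")
    # zip(*frames) groups the samples that occur at the same instant.
    return [max(INT16_MIN, min(INT16_MAX, sum(samples))) for samples in zip(*frames)]


print(mix_frames([[1000, -2000, 30000], [500, -500, 10000]]))
```

The last sample in the example overflows the 16-bit range and is clipped to 32767. Real mixers typically apply gain normalization or limiting to avoid audible clipping when many sources are summed.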
23. Use Case - Interactive Concerts / Events
- Musicians can host major concerts with the audio of thousands mixed and reproduced live
- Sports venues can create stadium audiences from at-home fans
- Jam sessions or Behind-the-Scenes moments that are in real-time and organic
24. Telecom of Tomorrow
Background Noise
- Suppression as a built-in commodity
Double Talk
- Excellent echo cancellers in WebRTC
- Full room cancellation
Network Conditions
- Latency as a top-tier concern
- AI + SD-WAN
Listening / Talking / Conversational Effort
- Cognitive challenges of degraded speech are significant
Jonathan Peelle, Department of Otolaryngology, Washington University in Saint Louis
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5821557
- Hardware Improvements - beamforming, higher quality cameras
- Broadcast-quality audio mixing
- Realtime speaker diarization/translation
Seamlessly engage and disengage from conversations
25. Use Case - Remote Work / Interaction
- Instant-on/instant-off video and audio allow seamless interaction across language boundaries
- Health/Wellness - Camera sensor systems for health monitoring, Virtual AI Doctor
- Work - Remote Working Teams, Realtime Translation with Language Generation
- Fitness - Community-based VR interaction
- Remote Learning - Enabling hyperlocal/global opportunities
26. Telecom of Tomorrow
Background Noise
- Suppression as a built-in commodity
Double Talk
- Excellent echo cancellers in WebRTC
- Full room cancellation
Network Conditions
- Latency as a top-tier concern
- AI + SD-WAN
Listening / Talking Effort
- Cognitive challenges of degraded speech are significant
- Hardware Improvements
- Broadcast-quality audio mixing
Sound / Video Quality
- Better Codecs
- ML Augmentation
27. Use Case - Field Workers / Servicemen
- Endpoints can all be connected to a mesh drone and/or direct-to-satellite (base stations in space)
- A/V shared amongst teams while disconnected from central C&C
- HD resolution stored locally while a different resolution can be transmitted
- Information about the client endpoints (approx. distance, etc.) is sent over side data channels
28. Machine Learning
Super Resolution Video
- Upscale poor endpoint performance in realtime or in post-production.
Audio Optimization
- Compensate on both the server and client side for poor audio / dropouts
Voice Decon/Reconstruction
- Original voice speech models to allow for multi-model, ultra-bandwidth constrained, or other conditions where text is preferred.
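To make "compensate for dropouts" concrete, the sketch below shows a classic non-ML baseline: when a frame is missing, repeat the previous output frame at reduced volume. It is the kind of simple concealment that a learned model would aim to improve on, not the ML approach the slide refers to. The function name and the `decay` parameter are illustrative assumptions.

```python
from typing import List, Optional


def conceal_dropouts(frames: List[Optional[List[int]]], decay: float = 0.5) -> List[List[int]]:
    """Replace missing frames (None) with a faded copy of the previous output frame."""
    output: List[List[int]] = []
    for frame in frames:
        if frame is not None:
            output.append(list(frame))
        elif output:
            # Fade the last emitted frame so repeated losses decay toward silence.
            output.append([int(sample * decay) for sample in output[-1]])
        else:
            output.append([])  # nothing received yet, so there is nothing to repeat
    return output


print(conceal_dropouts([[100, -100], None, None, [8, 8]]))
```

Consecutive losses fade progressively, which avoids the buzzing artifact of looping a single frame at full volume.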
29. Telecom of Tomorrow
Background Noise
- Suppression as a built-in commodity
Double Talk
- Excellent echo cancellers in WebRTC
- Full room cancellation
Network Conditions
- Latency as a top-tier concern
- AI + SD-WAN
Listening / Talking Effort
- Cognitive challenges of degraded speech are significant
- Hardware Improvements
- Broadcast-quality audio mixing
Intelligibility / Sound Quality
- Better Codecs
- Better hardware
Expectation
Perfection