AR/VR Opinions on the State of the Industry

  1. AR/VR: Opinions on the State of the Industry, by Christopher Grayson. Contents: Where VR went wrong, why this matters to AR and how to set it right; Optically transparent displays; Smartglasses: a general purpose AR interface; Other notable developments; Consumer face recognition & mesh-networks; Events; Conclusions. Events: VR / AR Style private launch party (request an invite), Monday, Feb. 27th, WeWork Civic Center, San Francisco. Augmented World Expo, Wednesday, May 31st to Friday, June 2nd, Santa Clara Convention Center: the largest Augmented Reality and Virtual Reality event of any kind, in the world. Christopher Grayson will be speaking.
  2. Where VR went wrong, why this matters to AR and how to set it right.
  3. Where VR content went off course. Consumer stereoscopic (class of 2011): At CES 2011, stereoscopic cameras were a thing. Many were introduced, but they are no longer available. The trend peaked prematurely. The problem in 2011 was that there were no good devices for consuming 3D content; VR headsets were still a few years away. You could take stereoscopic 3D images and videos, but there were very few ways to view them. The personal viewers that existed were little more than glorified Victorian-era stereoscopes. [Timeline, 2011 to 2019, marking: Oculus Rift ships to the consumer.]
  4. Where VR content went off course. Oculus & the 2nd coming of VR: Oculus launched their Kickstarter in 2012; by the time they came to market in 2016, many other pretenders had joined the fray. The focus shifted to gaming, where 360° content is the norm. With the gaming industry in the driver’s seat, and with 360° the standard approach, it was taken as a given that 360° was best for all things (and it was even treated as more important than depth). Stereoscopic cameras went off the market just before headsets arrived. [Timeline, 2011 to 2019, marking: Oculus Kickstarter; Oculus DK1 ships; Oculus Rift ships to the consumer.]
  5. Where VR content went off course. Consumer 360° (the dead end): Then the consumer market was flooded with 360° cameras, most of them ball shaped. This category was a huge distraction. A good test for gadget success in the mass consumer market is, “How will a smartphone eat this?” In the short run, new gadgets come along as standalone devices. They typically achieve mass-market success only if they are “pocketable,” and even then they remain viable only until they are absorbed into the smartphone. Phone? Music player? Digital assistant/day planner? Camera? All are now merely features or apps on our smartphones. Ball-shaped 360° cameras are neither pocketable nor a form factor easily absorbed into a smartphone … and most do not even capture in stereoscopic, so the “VR” experience is an inside-the-cylinder effect. On 360° cinema: “It’s nonsense, you are looking forward, and sometimes left and right, but not behind you. It’s really a waste of pixels.” (Greg Madison, UX, Unity; source: Fast Company.) What is 360° video good for? • Real estate • Entertainment events: sports, music … but not for either cinematic or UGC 3D video content. [Timeline, 2011 to 2019.]
  6. Where VR content went off course … and why it matters to AR (to go forward, take one step back). “Camera-through” AR: The stereoscopic camera in a phone was the right direction, but the lenses need to be placed at the proper pupillary distance, approx. 60mm, to match human scale. To go forward, AR needs to take one step back. A properly spaced stereoscopic phone camera turns any phone-based VR headset into an AR headset via camera-through, just as AR was performed in pre-2011 AR & VR headsets. Given the size of the iPhone’s hardware market, if Apple adopted this approach, it would both flood the market with UGC VR content and create a transitional stage from smartphone to AR smartglasses. Who will show leadership? Vintage examples: Vuzix camera-through AR; camera-through attachments shown on NVIS & Sensics VR headsets in years past. [Timeline, 2011 to 2019.]
  7. Where VR content went off course. Course correction: Had the industry not lost focus and taken a distracting detour with 360° cameras, stereoscopic-camera smartphones could have been both a boon for UGC VR content and the basis for camera-through AR. If a player steps up to show some leadership, this can still happen. Apple should have done it with the introduction of the iPhone 7. I’ve lost confidence that Apple is going to show industry leadership. Given the introduction of both the Surface Studio and HoloLens, I’m most inclined to view Microsoft as the innovation leader among large tech companies today. As a Mac user since the 80s, I don’t say that casually and would love to be proven wrong. I write with unwavering conviction: if just one major handset maker stepped up and introduced a smartphone with a stereoscopic camera*, capable of UGC VR video and camera-through AR, it would do more to propel both the VR & AR industries forward than another dozen me-too headsets, or anything happening in the AR smartglasses space. Notable: it has to happen at scale, so it has to come from a major handset maker; a startup simply cannot do it. In fact, Apple and Samsung may be the only two players with enough market share to move the market, though in their absence another handset maker could make a big splash by moving fast. * Lucid is principally a VR video editing software company. CEO Han Jin says they introduced their stereoscopic camera, LucidCam, because there were no good consumer stereoscopic cameras on the market. LucidCam is a reference model. Han also agrees with Greg Madison of Unity, and this author, that 360° video is, in most cases, a distraction.
  8. Where VR content went off course. Notable: Occipital’s volumetric mapping tech is similar to that which Microsoft uses in Kinect, and to that of PrimeSense, which Apple acquired. The Occipital Bridge headset combines their Structure Sensor with some impressive software, creating a camera-through AR experience with inside-out mapping. The Structure Sensor is also available stand-alone for iOS. Notable: The hand gesture tracking of Leap Motion, as demonstrated for VR, also has cross-over application in AR. To the detractors: Don’t let the perfect be the enemy of the good. I receive a lot of industry resistance to my advocacy of stereoscopy in smartphone cameras. It comes in the form of two arguments, both almost entirely from those in the gaming space: • Those who say 180° stereoscopic capture for UGC is not VR; to be VR it must be 360° capture, and anything less should not be tolerated. • Those who say only six degrees of freedom of movement through volumetric space is VR, and anything less should not be tolerated. To both I say: Sony sold 9.7M PS4s in the holiday quarter of 2016 [1]. In the same quarter Apple sold 78.3M iPhones [2] and the total Android market sold 350.2M [3]. Sources: [1] Sony Corporation’s final calendar quarter is their Q3 fiscal quarter, hence quarters are reported here as the holiday 2016 quarter. [2] Apple’s reported iPhone sales for calendar Q4 2016. [3] IDC Worldwide Quarterly Mobile Phone Tracker, Feb. 1, 2017.
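The scale argument behind the sales figures quoted above can be made explicit with a few lines of arithmetic. This is only an illustrative sketch: the three unit counts are the ones cited on the slide, while the function and variable names are made up for this example.

```python
# Unit sales for the holiday quarter of 2016, in millions, as quoted on the slide.
PS4_UNITS_M = 9.7
IPHONE_UNITS_M = 78.3
ANDROID_UNITS_M = 350.2


def smartphones_per_console(console_m, *phone_m):
    """Return how many smartphones were sold for each console sold."""
    return sum(phone_m) / console_m


smartphones_m = IPHONE_UNITS_M + ANDROID_UNITS_M
ratio = smartphones_per_console(PS4_UNITS_M, IPHONE_UNITS_M, ANDROID_UNITS_M)

print(f"Smartphones sold: {smartphones_m:.1f}M")
print(f"Smartphones sold per PS4: {ratio:.1f}")
```

On these figures, roughly 428.5M smartphones were sold against 9.7M consoles, or about 44 phones per PS4, which is the installed-base gap the argument for smartphone stereoscopy rests on.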
  9. Don’t let the perfect be the enemy of the good. Yes, of course, volumetric: volumetric broadcast is the holy grail. There are use cases where 360° is acceptable. There are use cases where 180° is preferable. There are use cases where directed stereoscopic works best … but broadcasting volumetrically captured content is coming. Everyone in the industry knows it is the metric against which all other VR content will be measured. Between startups, gaming, industrial products in the engineering and medical fields, and motion capture in special effects, there are too many hardware players in the space to mention here. The software companies to watch are those that can take the point cloud data from capture devices, convert it to polygons of acceptable resolution for display, and compress them enough to push through a pipe … and eventually do all of this on the fly so that the content can be streamed at an acceptable frame rate. Notable: 8i is currently leading the industry in volumetric capture. They’ve just raised a $27M series B led by Time Warner Investments. They’re the hottest thing in volumetric capture at this time. I expect their main competition to come from the Hollywood special effects industry. It should be no surprise that they’re based in LA. Notable: Cappasity is a small startup out of Russia that relocated to Silicon Valley. They recently pivoted from game avatars to focus on the apparel industry and the multi-billion-dollar problem of online fit. They are an Intel partner, but their software works with other capture hardware as well. Simplygon: Microsoft just keeps winning. They recently acquired Simplygon, maker of compression software that substantially reduces the polygon count of 3D models. When content is volumetrically scanned, it is first a point cloud model that must then be converted into polygons (facets that comprise an object’s surface). A complex shape can comprise millions of polygons: fine for special effects post production, but far too large to stream.
  10. Optically transparent displays
  11. Optical see-through displays: Waveguides. Hardware is hard. Near-eye optics are very hard. Waveguides are harder still. Going from an engineering concept to a functional prototype is exceptionally difficult. Taking that IP from prototype to a scalable, manufacturable product is much more difficult than filing patents and issuing press releases (it’s clearly more difficult than raising money). Today, there are only three companies that matter in waveguides. Why? Because there are only three who have shown they can go all the way to manufacturing. Nokia: Nokia’s surface relief waveguides are used by Microsoft in their HoloLens device. Their IP is also licensed by Vuzix, who raised money from Intel to build a manufacturing plant in New York for their M3000. Nokia has not only shown that they can scale to manufacturing; their Vuzix deal shows that their manufacturing process itself can be replicated. The physics of their surface relief design hits a wall at about 30° FOV. Lumus: Lumus’ design is the simplest of the waveguides on the market. They have nonetheless shown this design can be mass produced and are providing waveguides to both Daqri and Atheer. Lumus has shown they can do more than technology, they can do business: they’ve worked with Optinvent and Meta, and collaborated with Himax. Lumus recently closed $45M in additional funding from HTC, Quanta and Shanda Group. DigiLens: DigiLens has a much lower profile in the consumer space than either Nokia or Lumus. They’ve principally been a supplier / partner of Rockwell Collins in the avionics space. In addition to their classified work for the U.S. military, they supply the waveguides for Rockwell Collins’ display systems in Embraer Legacy 450 & 500 cockpits. They have most recently entered the consumer space with BMW’s smart helmet. They have an aggressive road-map to increase FOV. While Nokia’s and Lumus’ offerings are passive lenses, DigiLens’ waveguides are active: a liquid crystal based structure that is electrically switchable. They recently closed $22M in additional funding from Sony / Foxconn. I continue to be bullish on DigiLens. But what about …: I’m also watching TruLife Optics and Akonia Holographics to see if they can take their designs to manufacturing. My skepticism of Magic Leap does not stem from their recent PR problems, but from doubts about their ability to take their ambitious designs to manufacture (or even to prototype?). Kayvan Mirza of Optinvent recently wrote a Magic Leap analysis that is recommended reading.
  12. Optical see-through displays: Waveguides. Addendum, Journey Technologies: In the course of writing this report, I was contacted by the founder of Journey Technologies of Beijing. She boasted that they have a design “like Lumus,” with a 36° FOV and 1280x720 resolution. She claims they can manufacture 500 units per month, and can “easily” ramp up to 1000 per month with their existing manufacturing facility. She included a photo of their optical unit that indeed resembled Lumus’ design. The Chinese are masters of reverse engineering and the Lumus design is very basic, making Lumus vulnerable to commoditization.
  13. Optical see-through displays: Others. There are other near-eye-optic see-through display systems besides waveguides. Meta 2: Meta is notable for their exceptionally wide 90° FOV, the widest on the market. It is a beautiful display. I compare the Meta 2 display to the top-of-the-line graphic workstation displays of the late 90s. Even as flat-panel displays came onto the market in the early oughts, a nice CRT still had better color and higher resolution, and a 21” ViewSonic was a larger screen than anything available in an LCD … but clearly the writing was on the wall. To achieve their incredible field of view, Meta projects a panel onto a Pepper’s ghost style combiner: a semi-reflective transparent screen that allows the viewer to see their environment through the reflection. These will ultimately become obsolete as waveguides grow in FOV. Until then, the Meta 2 is excellent for UI development in AR. ODG R9: One notable exception to the waveguide trend in low-profile displays is ODG (Osterhout Design Group), whose R9 has a magnified OLED micro-display projecting into an outward beam-splitter, onto a reflector/combiner, all in a slim(-ish) form factor, featuring both a higher resolution and a 50° FOV that rivals that of current waveguides. It is also worth mentioning that the R9 does not suffer the display artifacts of current-generation waveguides, including halo/glow and rainbow-ing image distortion. Just as with the inferior image quality of early-generation flat panels, these waveguide problems will be solved in time.* * You can read a good counter-argument against waveguides at Karl Guttag’s blog: Magic Leap & Hololens: Waveguide Ego Trip?
  14. Smartglasses: a general purpose AR interface
  15. Smartglasses: a general purpose AR interface. Waveguides + Eye-Tracking + Depth Sensors + Computer Vision + Electroencephalogram (EEG)
  16. Smartglasses: a general purpose AR interface. Waveguides + Eye-Tracking + Sensors + CV + EEG for A | B. Eye-Tracking: Eye-tracking alone can substantially improve image quality in AR & VR by enabling foveated rendering (rendering at the highest resolution only the portion of the display where the user is directly looking), putting less burden on the GPU and allowing for higher frame rates. Coupled with depth sensors and computer vision, things get much more interesting. If the device knows where the user is looking and also has an understanding of the environment (i.e., knows what the user is looking at), we only need to add intent (a user command). It is in this context that I will make the case for Eye-Tracking + EEG. Voice Command is Overrated: Voice as a command interface is convenient in the privacy of one’s home, or in an automobile. It loses its appeal in an office or any public environment. Where People Use Voice Assistants (users of smartphone-based voice assistants who use them in the following locations): in the car, 51%; at home, 39%; in public, 6%; at the office, 1%. Source: Business Insider / Creative Strategies, Creative Commons CC BY-ND; based on a survey of 500 consumers in the United States.
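To make the foveated rendering idea above concrete, here is a minimal sketch of how a renderer might choose a resolution scale per screen tile from the tracked gaze point. It is not taken from any particular engine or headset SDK: the tile grid, the linear falloff, and the parameter values (`fovea_radius`, `falloff`, `min_scale`) are all assumptions chosen for illustration.

```python
import math


def shading_scale(tile_center, gaze, fovea_radius=0.1, falloff=0.4, min_scale=0.25):
    """Return the fraction of full resolution at which to render a tile.

    Coordinates are normalized display coordinates in [0, 1].
    All default parameter values are illustrative assumptions.
    """
    dist = math.dist(tile_center, gaze)
    if dist <= fovea_radius:
        return 1.0  # the user is looking here: render at full resolution
    # Outside the fovea, fall off linearly to min_scale over `falloff` units.
    t = min((dist - fovea_radius) / falloff, 1.0)
    return 1.0 - t * (1.0 - min_scale)


def relative_pixel_cost(gaze, tiles_per_side=8):
    """Shaded-pixel cost relative to rendering every tile at full resolution."""
    total = 0.0
    for row in range(tiles_per_side):
        for col in range(tiles_per_side):
            center = ((col + 0.5) / tiles_per_side, (row + 0.5) / tiles_per_side)
            # Pixel count scales with the square of the linear resolution scale.
            total += shading_scale(center, gaze) ** 2
    return total / tiles_per_side ** 2


if __name__ == "__main__":
    cost = relative_pixel_cost(gaze=(0.5, 0.5))
    print(f"Relative pixel cost with gaze at screen center: {cost:.2f}")
```

Any relative cost below 1.0 is GPU work saved, which is where the higher frame rates mentioned above would come from.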
  17. Smartglasses: a general purpose AR interface. Waveguides + Eye-Tracking + Sensors + CV + EEG for A | B + Speaker + Microphone. Voice is for language translation. International language translation: Talk to anyone in any language, and they hear you translated in real time into any language. Listen to anyone speaking any language, and hear them in the language of your choice. See Real-time Skype Translation by Microsoft Research.
  18. Smartglasses: a general purpose AR interface. Waveguides + Eye-Tracking + Sensors + CV + EEG for A | B. Electroencephalogram (EEG). UI analogy: an A | B selection via EEG is derived from the two-button mouse, with the user’s eyes as the cursor. EEG as command: EEG still has a substantially slower response time than that of a human finger on a mouse. On the following page Dr. Geoff Mackellar, CEO, Emotiv, gives a more nuanced and cautious analysis. “EEG based brain computer interface has demonstrated its capability to control a device such as controlling the fly of a model helicopter. The EEG based binary input device shall become available in the near future depending on the market needs.” (Dr. Bin He, University of Minnesota.) Notables in the consumer EEG device space include InteraXon (maker of Muse), Personal Neuro (maker of Spark), and Emotiv (maker of Epoc+ and Insight). All are currently marketed as wellness products, with various meditative features. Emotiv has an SDK for third-party developers, some of whom are already experimenting with using EEG for command. NOTABLE: Safilo’s SMITH eyewear brand has partnered with InteraXon to introduce the Smith Lowdown Focus, Mpowered by Muse. The form factor is proven possible, though the functionality is, at this time, still limited to Muse’s meditation app. Pictured: Smith Lowdown Focus; Muse by InteraXon; Spark by Personal Neuro; Emotiv Insight; Emotiv Epoc+.
  19. Smartglasses: a general purpose AR interface. Waveguides + Eye-Tracking + Sensors + CV + EEG for A | B. Electroencephalogram (EEG). EEG as command … with caveat: EEG still has a substantially slower response time than that of a human finger on a mouse. Dr. Geoff Mackellar, CEO, Emotiv, gives a more nuanced and cautious analysis: “There are several issues in calculating latency for any response. The Sensory Detection Time is the basic brain response time, typically around 210-260ms depending on age, for an unfamiliar but expected task. This is the point at which the brain has decided to act and starts to initiate the motor signal in response, which takes around 100ms to execute. Sensory Detection Time and the motor delay are both reduced for highly trained tasks, so for example athletes and very experienced gamers can reduce the overall reaction time to specific kinds of events as a result of habituation. Emotiv offers a direct mental command system based on a user-trained set of reproducible mental patterns which are related by a machine-learning classifier system as belonging to each of the commands. In this case the classification is made at 250ms intervals, based on analysis of the most recent second of data. Latency in this case depends both on the computational side, with an expected delay of around 250ms, and the user’s ability to form the mental state, which depends on the user’s experience level and ranges from 50ms to several seconds. These delays occur end-on with the ordinary Sensory Detection Time, where the subject must form the intention to act before starting the process. Additional latencies must also be taken into account for EEG systems. Firstly, the signal processing chain usually includes finite impulse response filters to remove signal artefacts such as the 50/60Hz line hum present almost everywhere. This is picked up from the electrical mains supply and contributes very large artefacts into the data stream. Typical filters have an inherent latency of 50-60 milliseconds. Computational effects also introduce latency, where a series of calculations must be made and these are generally applied to a retrospective signal, often in the range of 0.25-5 seconds. In general, EEG-based [response] is not fast enough to compete with direct motor signals except in cases where the subject’s motor system is compromised.” (Dr. Geoff Mackellar, CEO, Emotiv.) While EEG-based response may not be fast enough to compete with motor signals “in general,” it is still the opinion of this author that the trade-offs in privacy for voice command and social acceptance for gestural interfaces will make EEG the winning UI in most consumer use cases.
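The latency figures in Dr. Mackellar’s analysis can be added up into a rough budget comparing a mouse click with an EEG command. The sketch below is a back-of-the-envelope illustration only. It assumes the stages simply add in series, it ignores the retrospective analysis window, and because the quote gives the upper bound for forming a mental state only as “several seconds,” the 3000ms used here is an assumed placeholder rather than a quoted figure.

```python
# (low, high) latency ranges in milliseconds, taken from the quoted figures.
SENSORY_DETECTION_MS = (210, 260)   # basic brain response time
MOTOR_EXECUTION_MS = (100, 100)     # executing the motor signal
CLASSIFIER_DELAY_MS = (250, 250)    # expected computational delay
FILTER_LATENCY_MS = (50, 60)        # FIR filtering of 50/60Hz line hum
# Quoted as "50ms to several seconds"; the 3000 ms upper bound is an
# assumption made for this illustration, not a figure from the quote.
MENTAL_STATE_MS = (50, 3000)


def total_latency(*stages):
    """Sum (low, high) ranges, assuming the stages occur one after another."""
    return (sum(s[0] for s in stages), sum(s[1] for s in stages))


mouse_click = total_latency(SENSORY_DETECTION_MS, MOTOR_EXECUTION_MS)
eeg_command = total_latency(
    SENSORY_DETECTION_MS, MENTAL_STATE_MS, CLASSIFIER_DELAY_MS, FILTER_LATENCY_MS
)

print(f"Mouse click: {mouse_click[0]}-{mouse_click[1]} ms")
print(f"EEG command: {eeg_command[0]}-{eeg_command[1]} ms")
```

Under these assumptions a mouse click lands at 310-360ms, while even the best-case EEG command takes 560ms, which is consistent with the caveat that EEG is slower than direct motor input.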
  20. Other things worth mentioning
  21. Worth mentioning, exhibit A: Windows 10 VR partners and Project Alloy. Something to prove: At this early stage, Microsoft and Intel should not be considered late to the metaverse. Intel lost out on the smartphone market to ARM, and aims not to make that same mistake again. Cutting deals with Google, Recon, Vuzix and Luxottica, they’ve all but locked down the processor market for AR smartglasses, and they can be expected to go after the VR market just as aggressively. Microsoft has chosen to follow their previous path of success in the desktop / laptop market by extending Windows 10 into headsets, relying on their hardware partners to flood the market with low-cost VR headsets. Microsoft and Intel are very much in the game.
  22. Worth mentioning, exhibit B: The Sleeping Unicorn. While included on the waveguides page, DigiLens deserves closer examination: they have proven they can not only make active waveguides, they can mass produce them, and at a lower cost than others can make passive lenses. Starting in military industrial, DigiLens has diversified into aviation with Rockwell Collins for Embraer, then into motorcycle helmets with BMW, and is now pursuing automotive partnerships. Their technology is the market leader in waveguides, yet until a recent investment from Sony / Foxconn they had never even been mentioned in TechCrunch. Don’t believe the hype (again): Let’s put DigiLens into context. Magic Leap has raised over a billion dollars at a $4.5B valuation, yet has not thus far been able to demonstrate that they can make technology that DigiLens has already brought to manufacturing. There is a world of difference between doing a one-off in a lab environment and scaling to manufacturing at a market-acceptable price. Some will point out that Magic Leap’s secret sauce is adaptive focus. But if they cannot yet even manufacture active waveguides, who will get to adaptive focus first? Only a few companies have shown that they can even manufacture passive waveguides, and DigiLens alone has demonstrated that they can manufacture active waveguides. TruLife Optics of the UK claims to have produced an active lens in a lab, yet the entirety of their reassurance that they can take that to manufacturing is a single sentence on their website that reads, “TruLife Optics is exploring manufacturing options and is confident that our optic will be available in quantity for companies and or individuals wishing to incorporate them into their products.” Really? Prove it. … Akonia Holographics of Colorado spent a decade (unsuccessfully) trying to develop holographic storage. While their team may possess some transferable expertise in holographic technology and they’re worth watching, DigiLens has spent the same decade developing holographic waveguides and succeeded. My prediction: DigiLens is a sleeping unicorn and will likely exit via a bidding war that may include Apple, among others.
  23. Worth mentioning, exhibit C: Mesh-networks are coming. goTenna entered the market via a crowd-sourced campaign, billed as a P2P antenna, purportedly aimed at wilderness hikers: keep in contact with your companion when outside the range of cell towers. They executed their go-to-market strategy brilliantly. Though their antenna currently works for text messaging, they are often referred to as the new “walkie-talkie.” goTenna is a mesh-network. goTenna is not the new walkie-talkie. Jott, a messaging app, uses low-energy Bluetooth, or a router that can reach within 100 feet of each user, to create a mesh-network among users to send P2P messages. It is favored by many elementary and junior high school kids, who typically have less control of their own data plans. While Jott’s user base is still relatively small, I predict that either Jott or an app like it could ride the same “privacy” wave that is currently catapulting the use of Signal. When incorporated into every connected device, mesh-networks will disrupt the global telecommunications industry and upend the state surveillance apparatus. If many are turning to apps like Signal for their end-to-end encryption, consider the privacy advantages of a P2P platform where the internet itself is bypassed altogether.
  24. Face recognition: why mesh-networks matter to AR
  25. Why mesh-networks matter to AR. Consumer face recognition: the wrong way to do face recognition. Doing it wrong: A database of face profiles is stored in the cloud. The object of the observer’s gaze is referenced against the cloud database. In the conventional framework, the observed loses all agency over their own identity. In recent years, particularly in public spaces in first-world high-density urban environments and at international points of public transit, being surveilled by the state is the new normal. However, there has up to now been an unwritten social contract under which, at least among strangers, people imagine themselves to be anonymous in the crowd. Discomfort arises with the breaking of this social contract. At Mobile World Congress in 2010 & 2011, multiple smartphone-based augmented reality face recognition proof-of-concept apps were demonstrated. None were ever introduced to market. In 2014 Google banned face recognition apps from the Google Glass app store, citing privacy concerns. In December 2016 Blippar announced their intention to introduce face recognition as a new feature in the upcoming release of their iPhone app. More notably, it has been reported* that Blippar intends to encourage users to contribute face-and-name pairs to their database without seeking consent from the individuals featured. Though the announcement was made over two months ago, no such update has yet appeared in the iOS App Store. It is my speculation that Apple has withheld approval. * As reported in Newsweek.
  26. Why mesh-networks matter to AR. Consumer face recognition: the right way to do face recognition, as recently proposed at MIT Media Lab. In my recent lecture at MIT Media Lab, I made my case for face recognition as the killer app for consumer augmented reality. No database. Instead: The user maintains an encrypted version of their own face profile locally on their own device. Via a local area mesh network, users who run the app can then set preferences as to whom they share their key with. Individual observers or groups can be blocked. In the proposed framework, the observed retains agency over their own identity. The cloud is often overrated, but I digress. See: The Case for Consumer Face Recognition in Augmented Reality, by Christopher Grayson, MIT Media Lab, AR in Action.
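The sharing model described above (a locally held encrypted profile, a key shared by preference, and the ability to block observers or groups) could be sketched roughly as follows. This is a hypothetical illustration, not the design presented in the lecture: the class, field, and method names are invented, the key and profile are treated as opaque bytes, and the mesh transport and the encryption itself are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class FaceProfileOwner:
    """Hypothetical on-device record; nothing here lives in a cloud database."""

    encrypted_profile: bytes  # face profile, encrypted and kept locally
    profile_key: bytes        # key that lets an observer's device use the profile
    allowed_observers: Set[str] = field(default_factory=set)
    allowed_groups: Set[str] = field(default_factory=set)
    blocked_observers: Set[str] = field(default_factory=set)
    blocked_groups: Set[str] = field(default_factory=set)

    def handle_key_request(
        self, observer_id: str, observer_groups: Set[str]
    ) -> Optional[bytes]:
        """Answer a key request received from a nearby device on the mesh."""
        # Blocks always win over allowances.
        if observer_id in self.blocked_observers:
            return None
        if observer_groups & self.blocked_groups:
            return None
        if observer_id in self.allowed_observers:
            return self.profile_key
        if observer_groups & self.allowed_groups:
            return self.profile_key
        return None  # default: stay anonymous to strangers


if __name__ == "__main__":
    me = FaceProfileOwner(
        encrypted_profile=b"<ciphertext>",
        profile_key=b"<key>",
        allowed_groups={"colleagues"},
        blocked_observers={"observer-42"},
    )
    print(me.handle_key_request("observer-7", {"colleagues"}) is not None)
    print(me.handle_key_request("observer-42", {"colleagues"}) is not None)
```

The design choice worth noting is the default: an observer who is neither allowed nor blocked gets nothing, so the observed person stays anonymous unless they have opted in.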
  27. Consumer face recognition: the case for consumer face recognition. The case for consumer face recognition is not to identify strangers that you don’t know, but to identify and be recognized by the people whom you do know. Two use cases for consumer face recognition, one medical and one modern. Medical: For treating Alzheimer’s and other aphasia-inducing expressive language deficits. Cognitive enhancement: The work of anthropologist and evolutionary psychologist Robin Dunbar suggests that the evolution of human tribal relationships has placed cognitive limits on our ability to associate names with faces, at about 1500 faces, yet it is now becoming common to manage social networks of substantially larger numbers. 150: Dunbar’s Number, the approximate number of people with whom the typical person can maintain meaningful relationships. 1500: one of Dunbar’s non-eponymous numbers, the approximate number of names that the typical person can associate with a face; includes friends, family, and all notable contemporary and historical figures. Why Microsoft’s LinkedIn acquisition also matters to AR: Given Dunbar’s research, and given that the business model of a social network follows a variation of Metcalfe’s Law, LinkedIn’s business would greatly benefit from face recognition enhanced personal network management. Add to this Microsoft’s own industry leadership in AR (including Face API), and Microsoft is the candidate best positioned to take the lead on this implementation. I have advocated for this consumer face recognition framework to the LinkedIn UI/UX leadership, as well as to members of Microsoft’s AR team.
  28. Upcoming VR/AR events
  31. Conclusions: While the media treats VR as the new gaming console, I see both VR and AR as globally transformative technologies: something as big as the invention of the telephone, on par with the invention of the internet itself. VR and AR are the technologies the internet has been waiting for. Virtual Reality, the ability to place two or more people from anywhere in the world into the same virtual space (potentially infinite in size, with a sense of presence), will sweep away national borders, challenge the global political landscape, and possibly even change the way we think of humanity itself. This report has been a collection of observations, ideas, and opinions, loosely connected by the common theme of virtual and augmented reality. If you like what I’ve had to say, I’d like to hear from you, and see how we can change the world together. Christopher Grayson: Entrepreneur, Creative Director, Futurist. For custom reports by Christopher Grayson, contact Jay Shiel, +1 212-984-8500.
  32. All product names, logos, and brands are property of their respective owners. Christopher Grayson, Creative Commons Attribution Share-Alike 2.5 (CC BY-SA). This report is available for download at: