HoloLens inventor Alex Kipman talks to CNET about haptics, eye tracking and what happens next.
Alex Kipman is a Microsoft technical fellow and inventor of the Kinect and the HoloLens. During a visit to Microsoft's Redmond campus to try the HoloLens 2 before its debut, we were able to talk to Kipman about his vision for where computing is headed, what HoloLens is becoming and how far away we are from a future where everyday people are actually wearing advanced AR headsets.
To Kipman, and Microsoft, a headset like the HoloLens is one of many devices in which sensors will digest the world with AI. As Kipman says, on a headset, it's the HoloLens. In a home, it's a smart camera. On a car or drone, it's an autonomous vehicle. This time, HoloLens 2 aims to connect its AR experiences in the cloud to other devices, including iOS and Android, and feel more like a work tool than ever.
But when will that magic augmented world become something for the rest of us?
An edited version of our conversation plus a video interview with Kipman are below, from when CNET spoke to him January 31, 2019.
How much of these technologies is the general consumer going to see in the short term?
I have no interest in overhyping, and having a bunch of people think these things are consumer products. And then ... get to the trough of disillusionment when people are like, "my God, I'm not using this instead of a PC, instead of a phone, instead of a television." These devices need to be more comfortable, they need to become more immersive and ultimately they need to offer a better value-to-price ratio. There is a threshold in the journey where there is enough immersion, enough comfort, enough out-of-box value, where I'll be happy to announce a consumer product. This is not it.
How far are we?
That's impossible for me to guess, per se. ... I think humans are terrible predictors of time. In enterprise, in first-line worker scenarios, we're finding great value where this stuff is transformative. Now, to be clear, if I can take the highest watermark, the single best product that exists in this space today and it's still "not ready for consumers," you guys can be the judges of everybody else's product in this space.
Kinect started with consumers, on the Xbox. Do you ever think of revisiting that route for the HoloLens?
Look, like everything in life, you learn. I am incredibly proud, obviously, of the work we did with Kinect, and I think Kinect transformed the world. But, look, people in the living room don't want to stand up and play games ... they want to lean back and enjoy and want the precision of a controller in their hands. We didn't find as much signal in the living room for entertainment with devices like Kinect at the time.
But you know, go look today at an Alexa. What are those things doing? They're recognizing people, they're recognizing speech. They're doing a lot of the things that Kinect was doing in 2010. So, obviously, there's space in people's homes for devices like Kinect, that recognize people, that recognize the objects in it and understand the context of who you are. But we're finding way more signal with Kinect in enterprise workloads. They don't tend to go through a proxy PC. Which is why we then end-of-lifed Kinect for Windows, and we now just launched Azure Kinect, which is of course still tetherable to a PC but also connects directly as an IoT appliance to our cloud.
Where's mixed reality going in the next five years or so, and what part does Microsoft play?
I'm not going to guess five years, to be honest with you. Let me say for the duration of this product, let's say more in the one-to-two-year category ... I think all the successful ones will be enterprise-bound, primarily first-line worker scenarios, increasing over time to knowledge worker scenarios. So I'll give you the prediction for the next two years. Next two years, these are still enterprise-bound.
What do you think the killer apps will be?
I think communication, in any secular trend in computing, essentially defines a secular trend in computing. As it turns out, more often than not, it's the innovation in the communication stack that changes things. Snail mail to email, email to messaging, to text messaging, to Snapchatting people. It's like going from a still to a video to a teleportation. Like, as a parent, my daughter being able to teleport to play with her cousins in Brazil. Me not having to travel around the world to visit all my partners. How much would I love that? If you guys could be having this level of present experience, and you're in New York, and you guys are in San Francisco, and still it's this immersive, with me in Redmond? It's not hard to imagine presence as a killer experience for mixed-reality devices.
As for monitor replacement ... think about you sitting in front of your PC for "n" number of hours a day. This can be immersive and comfortable. Would you go spend this much money, to put this [the HoloLens] on your head with a keyboard and mouse versus buying a 30-inch, $500 monitor? Probably not, so we do try to focus a lot on things that you simply cannot do otherwise. Done right, we will all live in a world very soon where we will interface and interact and instinctually manipulate technology ubiquitously through our days, and you're not going to be interfacing through monitors.
In terms of eye tracking, what challenges or opportunities do you see?
I think ultimately the quest here goes back to having our AI understand people, places and things. You want to get as much signal in that conversation as possible. If you're going to teleport somewhere, I want to be able to know what you're doing, and have that level of understanding so I can really teleport you. I still want your facial expressions to go through it. That's something that I am very excited about, the eyes and the emotion of your eyes -- there is so much signal there for us to mine to create more immersive and more comfortable experiences.
I do think a lot about security of these devices. We're going to have state-of-the-art iris recognition, the single most secure biometric system with iris recognition through HoloLens, that allows me to get all of that data securely, all of my comfort information, all of my customization. And then lastly, is the idea that we can close the loop. I know at any point in time, where is the device, in relation to your eyes, as we're starting to form the hologram. Without that signal, I may or may not be getting the image correctly. Which means your eyes and your brain are doing all of the math for everything that I get wrong. Which is what translates into fatigue at the end of the day. We're able to load most of that math on the device and adjust the hologram as you're moving, to create sharper, more immersive holograms in the experience.
One thing that sets apart your tech versus others in the field is not having any physical control at all, using hands. Is a controller or haptics ever going to happen?
100 percent, we love haptics. We started this journey 11 years ago with input. Kinect was about having sensors on the edge that observed the environment to understand people, places and things. We went from Kinect input innovation to HoloLens, input plus output. The last one is having these things in my world exchange energy. Having zeros and ones, that transact into photons, actually transact into energy so I can push a hologram, and it pushes me back with equal force. So I can hold the hologram and I can feel the temperature of a hologram. We can call that haptic feedback. Much more sophisticated than how you traditionally would think about [it], but another level of immersion. The minute that I throw a hologram to you and you can catch it and it pushes you back ... ooh, immersion just took one crank forward. The minute that I'm holding a hologram and there's temperature to it, it changes the level of immersion and believability of the experience.
Now although that's absolutely in our dreams, we also believe that humans are tool builders. I would not want my doctor to operate on me without tools, just with their bare hands, any more than I'd like to eat my food tonight without a fork and a knife. We don't have any dogma on, "You cannot have something in your hands." As a matter of fact, in our virtual reality headsets, you're holding things in your hands: tools, controllers. That device could work here, but I don't know if you guys have seen that one, it has lights. All you see is the lights over the hologram. It's not that great of an experience. It's super easy for us to go create a version of that that goes in IR, so you don't see the light. It's absolutely also in our roadmap to think about holding things in the hand. Not just things we create. What if I am a person with a real physical hammer? What if I'm holding a coffee cup and I still want to touch my hologram?
Trying on the HoloLens 2.
How long are you spending right now using HoloLens each day?
Several hours a day is the short answer. We actually designed HoloLens on HoloLens. Wearing HoloLens and looking at the model in 3D is a much more visceral way of being able to understand space and creation on it. But look, when I'm in meetings, I'm not wearing a HoloLens. There are plenty of times when I'm in my office, and I'm using my keyboard, mouse and my PC monitor to do any number of things. But I do wear the device several hours a day and so do most people on the team.
What's the one thing that kept bugging you while making the HoloLens 2?
It's everything. I have a dream that one day there's only gonna be one problem that keeps us up at night. We count HoloLens in miracles. You know, we can't have double-digit miracles in any given product cycle. That's how we kind of size how much innovation or issues we're going to pack into one release. This carbon fiber enclosure is a huge issue. It's there so that we can essentially make the device much more comfortable and much more stable. But shipping carbon fiber that doesn't look like carbon fiber ... was incredibly hard, is still incredibly hard, has tons of issues. Inventing a new display engine: that was a huge miracle. The innovation in the lenses, to the vapor chamber in the back, to the fit system, and making the fit system extensible for enterprises, so you can put it under a hard hat, any number of things.
That's just the hardware. The manufacturing, building at scale, at yield? A whole different set of issues. Getting articulated hand tracking to feel instinctual. Getting eye tracking to work over glasses. How do you create this platform from edge to cloud? To staying up at night and saying, "Oh my God, how do we take all of this and make it open?" If you do it wrong, we're gonna have to live with some of these decisions for the next decade-plus.
If you solve all the problems, what's the dream end state you want?
My dream state is I walk on an airplane, man, and every single person on that airplane is wearing our product. That's not this product, by the way. It's probably not the next one either. But, ultimately, the goal is these things transform humans, they empower people and organizations to do things they just plainly were not able to do before, they allow us to displace space and time on a daily basis as if we were born instinctually with those superpowers. It's a work of a lifetime, but certainly I can't think of anything better to do with my life.