Previously, I described how breakthroughs in computation and imaging have opened up a new dimension for filmmakers — today I’m looking back to trace the history of volumetric video capture, a key element of this burgeoning medium.
The theory goes like this —
Life’s most universal stories feature drama between real people. If it were possible to depict real actors in virtual reality (as opposed to computer-generated characters), the medium’s storytelling possibilities would expand to encompass a dramatic potential bigger than today’s cinema. A new narrative format would emerge where the entire range of human emotions and experiences could unfold within an interactive simulation, resulting in something like…
HBO’s Westworld depicts a far-future vision of what virtual reality hints at today
It turns out there are some serious technical challenges to putting lifelike human performances into interactive virtual reality. Unlike someone watching a movie, a spectator in VR is able to physically look around and move through the scene. This freedom renders 2D video (and even stereo 3D video) completely obsolete. A traditional video placed into VR can only be viewed from one perspective, and looks obviously flat as soon as the viewer departs from the camera’s original position.
Traditional 2D video placed into true 3D VR environments isn’t going to fool anyone.
The solution? Volumetric Video
Volumetric video is an emerging format of video featuring moving images of real people that exist truly in 3D — like holograms — allowing them to be viewed from any angle at any moment in time. The trick is that this media requires a fundamentally different video technology capable of capturing 3D images of actors at fast framerates.
The tried-and-true reference for volumetric video: the Princess Leia hologram from Star Wars
The last few years have given birth to an explosion of technology startups offering ways to capture and present true-to-life 3D holograms of people in virtual reality. Among them is our very own DepthKit. Catalyzing this frenzy is the burgeoning VR industry’s desire to showcase immersive experiences that appeal to audiences broader than traditional gaming.
The aesthetic progression of volumetric video. Left: Radiohead, House of Cards (2008). Right: Microsoft Research (2015).
As the prevalence of volumetric video technology continues to explode into 2017, I want to take a moment to look back at the years of cross-disciplinary collaborations between artists and researchers that have laid the technical and aesthetic groundwork for this new medium to emerge. Inspired by cyberpunk thrillers and space operas of the prior century, these pioneers have worked through the complexities and embraced the glitches in pursuit of discovering a new dimension of storytelling.
By tracing how these ideas emerged, we can continue to make progress towards realizing their creative potential.
2005 to 2009: Re-appropriation of Research Technologies
Turning back the clock to the mid-2000s, three-dimensional video wasn’t more than a science fiction concept popularized by visual effects wizards in scenes from Star Wars and Minority Report.
Minority Report (2002) featured holographic home movies
However, deep behind the secure walls of academic research institutions, breakthroughs in computational photography and computer graphics were giving way to the technology that would take volumetric video from science fiction into reality.
During those early days, a few pioneering artists managed to scale the ivory tower to collaborate with researchers, applying their techniques to popular media. In 2008, director James Frost collaborated with media artist Aaron Koblin to capture 3D point clouds of Radiohead’s Thom Yorke performing House of Cards.
House of Cards — Radiohead. Directed by James Frost, 2008. Technique: Structured Light
Inspired by the Radiohead video, artist and computer scientist Kyle McDonald figured out how to recreate the technique himself. Kyle translated the academic research into an open source Instructables guide to building a scanner from an off-the-shelf projector and camera. He applied the technique in collaboration with director Alan Poon to create a video for Broken Social Scene’s track Forced to Love.
Broken Social Scene — Forced To Love. Directed by Alan Poon, 2010. Technique: DIY Structured Light
2010: DIY Volumetric Video with Microsoft Kinect
While Kyle’s Instructables page was helpful in democratizing 3D scanning, nothing could compare to the impact of the Xbox Kinect. Microsoft’s answer to the Nintendo Wii, this $100 peripheral allowed gamers to play by simply flailing their bodies in front of the TV. Powering it was a version of the same technology used for the Radiohead music video: a structured light 3D sensor.
The hacker community couldn’t resist opening this device up for creative exploitation. An open source Kinect driver was released within weeks of the commercial launch. Volumetric video was suddenly available to anyone (well, anyone who could write code…)
This is where we got involved. The previous experimental art, matched with readily available hardware, inspired me and my collaborator Alexander Porter, a professional photographer and film DP, to further explore scanning as a method for volumetric photography. Our first collaboration using the device resulted in a series of dystopian prints captured from the platforms of the New York Subway.
Alexander and I weren’t working in isolation, but were part of a vibrant open source community. During this period, hundreds of art projects, music videos, and tech demos utilizing the Kinect’s 3D sensing capabilities flooded the internet. While many have faded over time, one from this period remains the most elegant and mesmerizing aesthetic exploration of volumetric video to date: Unnamed Sound Sculpture by Daniel Franke and Cedric Kiefer.
Unnamed Sound Sculpture — Daniel Franke & Cedric Kiefer, 2012. Technique: Multi-Kinect Capture
2012: The RGB+D Toolkit
The hacked Kinect drivers did a lot to lower the barrier to entry for volumetric video exploration, but using them still required advanced knowledge of programming to take control of the device. This stranded droves of frustrated filmmakers at the base of a steep learning curve, looking enviously upon the tech demos published by hacker communities.
Sitting squarely between the worlds of computer graphics and filmmaking, Alexander and I decided to bridge the gap by making the tools we’d been creating together available to download. We released the RGBDToolkit — a shareware application allowing video creators to capture and render volumetric video without the need to code.
The RGBDToolkit was software allowing video producers to capture volumetric data in combination with a video camera, which then resulted in true 3D footage that could be stylized and rendered to video.
The RGBDToolkit combined depth sensors with video to get the coveted look without the need to code
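For readers curious what combining a depth sensor with video involves under the hood, here is a minimal sketch. It is not the RGBDToolkit’s actual code. It assumes a simple pinhole camera model and a depth image stored in millimeters; the function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, and `cy` are illustrative names chosen for this example. Each depth pixel is pushed back out into 3D space, which is what allows the footage to be viewed from a new angle. Pairing those points with color from a separate video camera also requires calibrating the two cameras against each other, a step omitted here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth_mm: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth image into 3D points (in meters).

    Assumes a pinhole camera: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. These are hypothetical parameters
    for illustration; real sensors report their own calibration.
    """
    points: List[Point] = []
    for v, row in enumerate(depth_mm):        # v = pixel row
        for u, d in enumerate(row):           # u = pixel column
            if d <= 0:
                continue                      # zero depth means no reading
            z = d / 1000.0                    # millimeters to meters
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # A tiny 2x2 "depth frame" with two missing readings.
    frame = [[0, 1000], [2000, 0]]
    print(depth_to_points(frame, fx=1.0, fy=1.0, cx=0.0, cy=0.0))
```

Running this once per frame of a depth stream yields a moving point cloud, the raw material behind the shimmering points-and-lines look.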
The response to RGBDToolkit was overwhelming. Within months of releasing the shareware, hundreds of videos were popping up online featuring the unmistakable shimmering points and lines that characterized the RGBDToolkit’s aesthetic.
While many were from hobbyists and dabblers, several prominent RGBD videos were created in those early days of the kit. For example, music video director Richard Lee took up residence in a Detroit warehouse with renowned rapper Eminem to create a cyberpunk homage to Max Headroom for his track Rap God — a video which has now been viewed over 400 million times.
HBO’s Love Child — Directed by Valerie Veatch, VFX by Alexander Porter. Technique: RGBDToolkit
2013: Virtual Reality Becomes Volumetric Video’s Raison d’Être
The point clouds and glitches made possible by our free RGBDToolkit software appealed to a dedicated cadre of science fiction inspired filmmakers. But it didn’t go much further than a novel visual technique. All that changed when the first Oculus Rift virtual reality HMD prototype became available in 2013.
Few of us in that small community of creative hackers would have anticipated how soon a display technology demanding volumetric video would be coming to market. Our group of early adopters began expanding rapidly as the VR industry emerged and the need for volumetric video grew.
Placing human performances into virtual space became an appealing way to combine the craft of filmmaking with the technical demands of game engine virtual reality.
2014: CLOUDS Documentary and the DepthKit
Together with filmmaker Jonathan Minard, I had been using the RGBDToolkit to amass an archive of interviews featuring the pioneers of creative coding — among them the very individuals who had made the open source Kinect drivers available. The collection had grown to more than 10 hours of in-depth conversations (no pun intended) covering all topics related to code and creativity. We called the project CLOUDS.
Jonathan and I adapted the CLOUDS documentary to the Oculus Rift, allowing the viewers to navigate the database of topics simply using their gaze. Each interview subject appeared as a data-form surrounded by code visualizations. Seeing real humans discussing the future of code, creativity, and technological progress within a fully holographic virtual space was exhilarating. We were hooked.
CLOUDS Documentary — Directed by James George & Jonathan Minard, 2014. Technique: Oculus Rift + DepthKit
Alexander and I set to work on a new toolset simply called DepthKit. We surmised that if we could expand the aesthetic range beyond point-clouds towards photorealism, we’d be contributing an important enabling technology to the evolution of film.
We weren’t alone in our realization.
2015: Cambrian explosion of Volumetric Video
While we rebuilt DepthKit to meet the needs of real-time volumetric VR, a host of other capture techniques and solutions began to emerge. From high-tech corporate research, to fast-growing start ups, to RGBDToolkit-like freeware, the diversity in approaches to volumetric video expanded rapidly.
In early 2015, a special projects team at Microsoft Research released videos and a research paper demonstrating advanced volumetric capture. Although the process required hundreds of cameras and thousands of hours of processing on the highest-end graphics cards and multi-core processors, the quality of the results was beyond anything seen to date.
#100 Humans by volumetric video technology company 8i
Following quickly on the heels of 8i, Uncorporeal released two short volumetric scenes also formatted for VR. Because they allow for interactive relighting, Uncorporeal’s captures are more visually integrated into their virtual 3D surroundings than 8i’s.
Alcatraz Island Lofts by Uncorporeal
While visually impressive, the use of these tools remained confined to the inner circles of Hollywood and behind the walls of research labs. We saw an opportunity to stay committed to serving the growing number of tech-forward filmmakers wanting to make the leap into volumetric video.
Our mission with DepthKit became clear: Lower the barrier to entry for creators to get started with volumetric, while empowering them to explore the entire range of stylized representation from glitchy and holographic to photorealistic.
2016–Present: DepthKit for Accessible Volumetric Video
The first brave users of DepthKit’s new photorealistic capabilities were trusted friends: Winslow Porter, the producer of CLOUDS, and his collaborator Milica Zec. Their 2016 VR debut Giant takes place in a basement with a family during a frightening event. What unfolds is a rattling portrayal of raw fear and emotion. For the first time, DepthKit was able to capture the humanity of an actor’s performance without distracting digital artifacts.
Giant — Directed by Milica Zec and Winslow Porter, 2016. Virtual reality short film, shot with DepthKit
Exploring the entire range between photorealistic and holographic, this year Scatter (DepthKit’s sister studio) probed the boundaries between reality and science fiction with Zero Days VR. An adaptation of Alex Gibney’s feature documentary about the Stuxnet virus, this virtual reality exposé utilizes DepthKit to portray the testimony of a mysterious NSA informant as she glitches between a photorealistic character and a holographic apparition.
The experience goes a step further at the end with a surprising reveal: the viewer finds themselves face-to-face with the informant during an out-of-body experience, driving home that cyber threats are not confined to the virtual realm. Using a live stream captured from an Intel RealSense R200 camera and piped into Unity’s game engine on a separate thread, we were able to put a real-time holographic representation of the viewer into VR. The R200’s small form factor and low power consumption made it a perfect fit for integrating into the installation.
Zero Days VR Installation @ Sundance 2017
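To give a rough idea of what streaming a sensor "on a separate thread" means, here is a small sketch of the general pattern. It is written in Python for brevity, not in Unity’s C#, and it does not use the RealSense SDK: the `make_fake_sensor` function is a made-up stand-in for a camera, and `LatestFrame` and `capture_loop` are hypothetical names. The idea is that a background thread keeps grabbing frames while the render loop only ever reads the most recent one, so a slow sensor never stalls the display.

```python
import threading
from typing import Any, Callable, Optional, Tuple


class LatestFrame:
    """Thread-safe slot holding only the most recent frame."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._frame: Optional[Any] = None
        self._count = 0  # number of frames published so far

    def publish(self, frame: Any) -> None:
        with self._lock:
            self._frame = frame
            self._count += 1

    def read(self) -> Tuple[int, Optional[Any]]:
        with self._lock:
            return self._count, self._frame


def capture_loop(
    read_frame: Callable[[], Optional[Any]],
    slot: LatestFrame,
    stop: threading.Event,
) -> None:
    """Pull frames until asked to stop or the stream ends (None)."""
    while not stop.is_set():
        frame = read_frame()
        if frame is None:
            break
        slot.publish(frame)


def make_fake_sensor(n_frames: int) -> Callable[[], Optional[int]]:
    """Hypothetical stand-in for a depth camera: yields 0..n-1, then None."""
    frames = iter(range(n_frames))
    return lambda: next(frames, None)


if __name__ == "__main__":
    slot, stop = LatestFrame(), threading.Event()
    worker = threading.Thread(
        target=capture_loop, args=(make_fake_sensor(5), slot, stop)
    )
    worker.start()
    worker.join()          # a real render loop would call slot.read() each frame
    print(slot.read())
```

Keeping only the latest frame, instead of queueing every one, is a deliberate trade: for a live view of the viewer, a dropped frame matters less than added delay.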
Zero Days VR is an example of an ongoing art and technology feedback loop. Scatter’s creative projects generate challenges that DepthKit must rise to meet. Stemming from the innovation this project demanded, support for RealSense depth sensors is planned to be folded into DepthKit’s features and made available to the community of users, offering VR creators a much more portable volumetric video solution than previously available. The Intel RealSense camera line continues to offer a way forward for low-cost, high-quality depth sensing for this type of innovation.
The Promise and The Paradox
From the fragmented glimpses of Thom Yorke’s pointillized face to Microsoft’s immaculately reconstructed Maori warriors, a technology that just ten years ago was science fiction is now a reality. The demonstrations and immersive experiences created along the way point to a new frontier of media that expands well beyond the current boundaries of cinema and video games.
However, as experiences powered by volumetric video become more common, new challenges are emerging. If not couched artfully within the scene of a story, volumetric characters can easily be a paradoxical presence. While a viewer can move around them as if they are actually there, the captures are still fixed recordings. They don’t respond, make eye contact, or interact as one may hope. Without a narrative reason for them to not respond, viewers can become confused or distracted.
Despite the technology rapidly maturing, the creative grammar of volumetric video is still in its infancy.
Discovering successful volumetric story design will require the continued cross-disciplinary collaboration between artists and technologists. It’s my hope that over the next ten years, this blossoming creative community will define a new vernacular for blending reality, fiction, and interactivity to achieve the most moving experiences yet.
If you’re a creator interested in getting started with volumetric filmmaking, sign up for the DepthKit Beta to join our growing community dedicated to exploring the potential of this exciting new medium.