[This article was published in the June 2011 issue of PC & Tech Authority.]
With ten million sold since November, it’s the fastest-selling consumer electronics device in history. It’s also the most hyped console accessory of the past five years, and the only technology that can – in Microsoft’s words – make you the games controller.
Combining cameras and 3D depth sensors with voice and facial recognition, Kinect can capture motion from up to 48 joints in the human body, tell you apart from other members of your household, and allow you to control video games using speech, gestures and full-body motion. It’s made the Wii look like yesterday’s toy, and brought the Xbox 360 to a whole new market.
However, Kinect is fast becoming much more than a plaything. Working with open-source tools and unofficial drivers, a community of enthusiasts has hooked up Kinect to the PC. These can-do coders have been creating home-grown gesture controls for Windows and developing some of the weirdest demo projects you’ve ever seen. From virtual surgery to gesture-controlled Windows applications and human-body artworks smashed into a million pieces, Kinect’s potential seems virtually limitless.
New era in computing?
That hasn’t gone unnoticed at Microsoft. Kinect’s creator has never denied that it has wider ambitions, and the release of a new official SDK should make it far easier to put Kinect to work beyond the Xbox. Suddenly, Kinect feels like the start of something big. In fact, senior figures at Microsoft see it as the beginning of a new era in computing.
It wasn’t always this way. Kinect originally stems from the somewhat desperate situation that faced Microsoft’s entertainment division in mid-2007. While the Xbox 360 was a hit with hard-core gamers, it wasn’t reaching the mainstream audience in the same way as Nintendo’s faster-selling, motion-controlled Wii.
Microsoft needed a new twist, and put a 31-year-old technology guru, Alex Kipman, in charge of a project that could push the console in a different direction. In June 2009, Kipman’s group was ready to demonstrate its wonder product, codenamed Project Natal.
[Caption: Enthusiasts have hacked the Kinect, using it for applications such as controlling DJ lights.]
Natal wasn’t all Microsoft’s own work. The basis of the hardware came from a Tel Aviv startup called PrimeSense, which demonstrated a revolutionary 3D full-body motion sensor in 2006. PrimeSense’s unit contained a near-infrared depth sensor and a matching projector, along with a specialist system-on-a-chip (SoC) processor and an optional RGB video camera.
Using a system PrimeSense calls Light Coding, the projector floods the scene with near-infrared light, beyond the visible range of the human eye. A regular CMOS sensor – the same kind found in many webcams – captures the light reflecting back and passes it on to the SoC. This uses complex algorithms to convert the light data into a full 3D-depth map, and this can then be combined with data from the RGB camera to give each 3D pixel a colour value.
Capture all this information, identify which parts are human, and you have all you need to operate full-body, motion-controlled 3D games.
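To get a feel for what that last step involves, here is a minimal sketch (in Python, using NumPy) of fusing a depth map with an RGB frame into coloured 3D points. The resolution, focal length and simple pinhole-camera model are illustrative assumptions, not figures taken from the PrimeSense or Kinect specifications.

    import numpy as np

    # Illustrative only: fuse a Kinect-style depth map with an RGB frame
    # into coloured 3D points. Resolution and focal length are assumed values.
    WIDTH, HEIGHT = 640, 480            # assumed sensor resolution
    FX = FY = 525.0                     # assumed focal length, in pixels
    CX, CY = WIDTH / 2.0, HEIGHT / 2.0  # assumed principal point

    def depth_to_coloured_points(depth_mm, rgb):
        """depth_mm: (H, W) depths in millimetres; rgb: (H, W, 3) colour values."""
        us, vs = np.meshgrid(np.arange(WIDTH), np.arange(HEIGHT))
        z = depth_mm / 1000.0                 # convert to metres
        x = (us - CX) * z / FX                # back-project through a pinhole model
        y = (vs - CY) * z / FY
        valid = depth_mm > 0                  # treat zero depth as "no reading"
        points = np.stack([x[valid], y[valid], z[valid]], axis=1)
        colours = rgb[valid]                  # one colour per 3D point
        return points, colours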
Games were always the driving force behind PrimeSense’s technology. However, it was Microsoft that recognised the sensor’s potential and made it the core technology in Natal. Microsoft added a couple of extras to the reference design, including a more sophisticated, multi-array microphone and a motorised stand that would automatically adjust the position of the camera to keep players in the frame.
Microsoft’s real work, however, lay in the software. Voice recognition was easy, relatively speaking. Making full-body motion tracking work, however, was more of an issue. To work in high-speed games, Kinect needs to track the movement of 20 or more joints in full 3D at a rate of 30 frames per second, while adjusting to the shape of different human bodies and working from a wide range of start positions. This is no trivial feat.
The solution came from work undertaken at Microsoft Research in Cambridge back in 2002. Leading researchers Andrew Blake and Kentaro Toyama had created a new model for tracking human movement, using a probability-based system of “exemplars” to anticipate how, if a part of the body starts moving one way, it’s most likely to move next. Later research by Andrew Fitzgibbon and Jamie Shotton refined the technique, ensuring Kinect could recognise specific body parts and evaluate trillions of potential body configurations at 30 frames per second. Kinect doesn’t simply watch how your arm or head moves – it’s constantly guessing where they’re going to move next, even when it can’t see them completely.
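That predictive element is easier to grasp with a toy example. The sketch below is nothing like Microsoft’s exemplar-based tracker; it simply shows the principle of a per-joint, constant-velocity guess that can stand in for a measurement when a joint is briefly hidden.

    import numpy as np

    # Toy constant-velocity predictor for a single joint, illustrating the idea
    # of anticipating where it will be in the next frame. Microsoft's
    # exemplar-based tracker is far more sophisticated than this.
    class JointPredictor:
        def __init__(self):
            self.prev = None
            self.velocity = np.zeros(3)

        def update(self, observed):
            """observed: (x, y, z) joint position this frame, or None if occluded."""
            if observed is None:                      # joint hidden: rely on the prediction
                estimate = self.predict()
            else:
                observed = np.asarray(observed, dtype=float)
                if self.prev is not None:
                    self.velocity = observed - self.prev   # per-frame velocity (at 30 fps)
                estimate = observed
            self.prev = estimate
            return estimate

        def predict(self):
            """Best guess for the next frame, even if the joint can't be seen."""
            return None if self.prev is None else self.prev + self.velocity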
Games developers went to work on producing software for the controller, and the first Project Natal units were ready for a public demonstration by August 2009. Little more than a year later, the product arrived with a new name: Kinect.
Enter the hackers
It was inevitable that Kinect would find its own following of keen DIY-ers. “I think it’s so appealing because of the fact that it’s an extremely cool new technology available for a very reasonable price,” said John Simons, founder of one Kinect-hacking project, KinEmote. “I immediately thought: ‘this is going to be big’. A widely available, cheap 3D camera means that a lot of people will know about it and find awesome uses for it.”
Kinect attaches to the Xbox 360 through a standard USB 2 port, so all a PC needs is the software to run it. No sooner was Kinect released than Adafruit Industries, the open-source hardware firm, announced a $3000 Open Kinect bounty for the first programmer who could write the drivers. Within a month, the bounty had been collected by Hector Martin and days later the first Kinect hacks followed. From virtual hand puppets controlled by gestures to video-capture tools that transformed users into clouds of particles, the community found imaginative new ways of harnessing the impressive power of the sensor.
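Reading raw data on the PC is, in principle, only a few lines of code once those drivers are installed. The snippet below is a rough sketch using the Python bindings that ship with the open-source libfreenect project; exact module and function names can vary between versions, so treat it as indicative rather than definitive.

    # Rough sketch: grab one depth frame via libfreenect's Python bindings
    # (assumes the freenect module is installed and a Kinect is attached).
    import freenect
    import numpy as np

    depth, _timestamp = freenect.sync_get_depth()   # 640x480 array of raw 11-bit depth values
    valid = depth < 2047                            # 2047 is commonly used for "no reading"
    print("Closest surface, in raw depth units:", int(np.min(depth[valid])))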
Microsoft’s initial response wasn’t overwhelmingly positive. “Microsoft doesn’t condone the modification of its products,” a spokesperson told CNET, warning that the device had “safeguards designed to reduce the chances of product tampering”.
Within days, it softened its stance. First, it noted that Kinect hadn’t actually been modified, merely supported by new drivers. Then, Microsoft Game Studios manager Shannon Loftis explained that it had made her “very excited” to see people “so inspired, that… they had started creating and thinking about what they could do” within a week of Kinect’s launch.
In fact, some within Microsoft have always advocated linking Kinect to the PC. In a blog post, a former member of the development team, Johnny Lee, revealed that he’d approached Adafruit to stage the Open Kinect bounty. “Without a doubt, the contest had a significant impact in raising awareness about the potential for the Kinect outside of Xbox gaming, both inside and outside the company. It was the best $3000 I ever spent.”
Now, things are official. First, PrimeSense formed a new, non-profit organisation called OpenNI, and released its own Kinect drivers, middleware and applications. Then, in February, Microsoft announced it would release a non-commercial Kinect for Windows SDK. It’s a move that’s been welcomed by the hacking community.
“It reflects a larger shift in Microsoft’s official stance to embrace the open community and encourage innovation,” said Joshua Blake, founder of the OpenKinect community group. “With the full capabilities of Kinect unlocked on the PC, it will certainly accelerate development on Windows as well as Linux and Mac. Certainly, there are challenges, as with any corporation-community relationship, but I’m hoping that we can work together. We’re all on the same side, after all.”
New user interfaces
Isn’t Kinect just about fun and games? Maybe not. The Kinect hacks that have attracted the most attention are, inevitably, the silly ones: the hack that turns a baseball bat into a lightsaber, for example. There are dozens of programs that turn players into clusters of moving blocks or clouds of floating dots, and they’ve been used to create unique music videos or abstract art installations.
Games enthusiasts have been equally keen to take advantage – if you want to play Portal, Second Life, Left 4 Dead 2 or Angry Birds with Kinect, you’re in luck.
However, Kinect has a serious side. Among all these entertaining hacks, you’ll find demonstrations of futuristic, Kinect-controlled, Minority Report-style user interfaces, and programs that replace the mouse with movements of the hand. Download the current beta of KinEmote and you can use gestures to control Windows applications, with the pointer mapped to follow your hand, and grabbing gestures used to click, drag and release objects. It’s the long-awaited arrival of the natural user interface.
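Under the hood, the idea is straightforward: take the tracked hand position, scale it to the desktop, and treat a distinct pose as a click. The sketch below is not KinEmote’s actual code; the hand and torso positions are assumed to come from whatever skeleton-tracking middleware is in use, and the “grab” test is a deliberately crude depth threshold.

    # Not KinEmote's actual code: a sketch of driving a pointer from a tracked hand.
    SCREEN_W, SCREEN_H = 1920, 1080   # assumed desktop resolution
    GRAB_DEPTH_M = 0.35               # hand this far in front of the torso counts as a grab

    def hand_to_pointer(hand, torso):
        """hand: (x, y, z) with x, y already normalised to [0, 1] by the tracker, z in metres;
        torso: (x, y, z) in the same coordinates. Returns pixel position and a grab flag."""
        px = int(hand[0] * SCREEN_W)
        py = int(hand[1] * SCREEN_H)
        grabbing = (torso[2] - hand[2]) > GRAB_DEPTH_M   # hand held well in front of the body
        return px, py, grabbing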
"Imagine your television or media centre knowing who’s watching it"
Many will argue: why bother? Jim Spadaccini’s company, Ideum, specialises in the design of interactive displays and exhibits. He immediately saw Kinect’s potential. “An inexpensive platform for developing exhibits with more physical interactivity is certainly something that’s appealing to many museums,” he said. “They’re struggling to keep up with technology and provide new and compelling experiences.”
Ideum has already provided tools to make Kinect work with image viewers, virtual-reality modules and Google Maps, allowing users to change viewpoints and pan around or zoom using gestures alone.
“For museums that have large numbers of visitors, I think Kinect is a great fit. There are no parts that visitors touch, so the exhibits can be virtually indestructible,” Spadaccini said.
Kinect also makes sense in the console’s natural home, the living room. Already, Kinect-powered voice and gesture recognition can be used to control video playback, change TV channels and launch games on the Xbox 360, while a forthcoming update will allow a user’s avatar to engage in motion-captured chatrooms.
KinEmote on the PC can already be used with Windows Media Center or Boxee’s web-focused media playback software to browse through media libraries and control playback using gestures alone. “Although that’s a rather good start, I don’t think that’s how we’ll see this technology used in the future,” said KinEmote’s John Simons. “Imagine your television or media centre knowing who’s watching it and changing the available content based on your preference and/or age. Flicking through pictures and movies is cool, but it remains a gimmick.”
The really interesting aspect is how Kinect fits in with the PC’s evolution. The key factor is vision. “Giving the computer vision really means giving it a lot more capability to not just see the image, but to understand the image,” said Microsoft’s Craig Mundie in a recent Microsoft Research video. “The introduction of the natural user interface, where machines are more like us – they see, they listen, they speak – gives us a supplement to the graphical user interface.”
For Mundie, the result is “one of the biggest shifts in computing ever, where we move the computer from being a tool to being a helper”.
Microsoft Research has already talked about experimental projects combining Kinect sensors with displays powered by “Wedge” lens systems that may, one day, use Kinect to steer stereoscopic 3D images into our eyes.
It isn’t all research either. In comments made to a group of executives in South America last year, Microsoft’s Steve Ballmer highlighted applications for business. “Why am I carrying this [projector remote control]?” he asked. “I’ve already forgotten three times where I set it down. I should just be able to go like this, and the camera should recognise that gesture and control the slides for me.”
Others believe that Kinect-style interfaces will eventually make the PC invisible. “I think the work we’re seeing from Microsoft and the community is just the first glimpse into the amazing potential in our future,” said Joshua Blake. “Sure, there will still be PCs, but we’ll also have interactive surfaces and spaces integrated into our environment. It will be natural to start interacting and accessing the information we want in the way we want it, whether that’s through touch, motion tracking or other mechanisms.”
Perhaps, then, it’s best not to see Kinect as a toy, or the hackers as creative pranksters. In its own small way, Kinect shows us how computing might evolve. The real potential lies around the corner, perhaps in the Microsoft labs or – just maybe – in the basement of a genius amateur.
THE TOP FIVE KINECT HACKS
Virtual surgery
Who wants to use a keyboard and mouse in the operating theatre? Researchers at Wake Forest University School of Medicine, North Carolina, turn stacks of CT scans into detailed 3D visualisations that can aid surgery. Now, with Kinect, they can do it hands-free, using gestures to move through the 3D image.
Researchers in Switzerland have found a more gruesome use for Kinect, controlling virtual autopsies (or virtopsies) using 3D visualisations of an MRI-scanned corpse.
The disintegrating man
Victor Martins is one of the most interesting Kinect artists at work, using the controller to create a trail of clones or manipulate sheets of cloth, fur or skin-like textures with his digitised 3D form. Martins’ pièce de résistance, however, is Disintegrates, in which a 3D capture of his body is transformed into a million particles that fall away and scatter before the viewer’s eyes. It’s a spectacular demo, showing how a sensor and some coding nous can create results that used to be the preserve of digital effects studios.
Real-time motion capture
Motion-captured CGI animation commands a Hollywood budget, right? Not anymore. Hacker James Walsh took Kinect and the OpenKinect drivers and found a way to make it work with DAZ Studio, a free 3D figure design and animation package. Beginning with coloured balls mapped to the joints of the human skeleton, Walsh soon moved on to cartoon characters and lip-synched human figures composited in real environments. We’re still a long way from The Polar Express but, considering the modest costs involved, it’s an impressive feat.
V-Sido humanoid robot
There are plenty of examples of robots being controlled by natural user interfaces, some of which are capable of picking up objects or using the Kinect’s depth sensor to navigate in 3D space.
Our favourite is Wataru Yoshizaki’s demonstration of a humanoid robot controlled by Kinect. Running Asura Engineering’s V-Sido real-time robot control software, the robot takes the skeletal map Kinect captures of its human controller and mimics every movement, pose and gesture, all while maintaining its balance.
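Getting from a skeletal map to robot motion ultimately means turning joint positions into joint angles. V-Sido’s real pipeline handles whole-body inverse kinematics and balance; the fragment below only illustrates the basic trigonometry for a single joint, using assumed (x, y, z) positions from the Kinect skeleton.

    import numpy as np

    # Illustrative only: derive one joint angle (the elbow bend) from three
    # Kinect skeleton positions, the kind of value a servo controller needs.
    def elbow_angle(shoulder, elbow, wrist):
        """Return the elbow bend in degrees from three (x, y, z) joint positions."""
        upper = np.asarray(shoulder, dtype=float) - np.asarray(elbow, dtype=float)
        lower = np.asarray(wrist, dtype=float) - np.asarray(elbow, dtype=float)
        cosang = np.dot(upper, lower) / (np.linalg.norm(upper) * np.linalg.norm(lower))
        return float(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))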
Da Vinci Physics Illustrator
Originally designed for Microsoft’s Surface table-top computers, Razorfish’s port of its Da Vinci Physics Illustrator is one of the most stunning examples of how we might interact with computers in the future. It enables the user to draw 2D virtual objects with simulated real-world physics, so that effects such as gravity, magnetism and planetary attraction can all be demonstrated in real-time. With Kinect, objects can be drawn by hand movements, then grabbed and dragged across the screen by closing and opening the fist.