Pixelsumo is a blog about interaction, with an emphasis on play, installation, video game culture, playgrounds and toys. Written by Chris O'Shea.
Posted February 10th 2010 under Computer Vision, Games
![]()
(image Popular Science)
Previously on Pixelsumo I posted a closer look at Project Natal, covering its hardware status and origins. As that post suggests, the key to its technical success is bringing complex algorithms that estimate body joint positions to a mass consumer level. Getting this right in every home and lighting condition is no easy task.
The processing of the depth image was going to be on a chip in the camera, but it’s now been reported that the processing will be done in software, to keep the costs of the hardware down.
Recently, many more behind-the-scenes videos and press articles have revealed insights into how the body processing works.
![]()
This video shows some of the body estimation processing in the making.
Originally it was thought that Natal might use time of flight (like the 3DV ZCam) to measure the time it takes for infrared light to bounce from objects.
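To give a feel for the timing involved, here is a minimal sketch of the basic time of flight relationship: distance is the speed of light multiplied by the round-trip time, divided by two. The function names are my own for illustration and this is not how the ZCam or any particular sensor is implemented.

```python
# Speed of light in metres per second (exact, by definition of the metre).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance_m(round_trip_s: float) -> float:
    """Distance to a surface, given the round-trip time of a light pulse."""
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse travels to the object and back, so halve the path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def round_trip_time_s(distance_m: float) -> float:
    """Inverse relationship: how long light takes to reach a surface and return."""
    if distance_m < 0:
        raise ValueError("distance cannot be negative")
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S


if __name__ == "__main__":
    # A player standing 3 metres from the camera (illustrative value only).
    t = round_trip_time_s(3.0)
    print(f"round trip: {t * 1e9:.1f} ns, recovered distance: {tof_distance_m(t):.2f} m")
```

The round trip for someone standing a few metres away is only a couple of dozen nanoseconds, which hints at why time of flight sensors need very fast, precise electronics.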
“Frances MacDougall, chief technology officer at GestureTek, said his company was also working on Project Natal. Asked why there were so many vendors on Natal, he said that Microsoft will be using a low-cost 3-D camera from PrimeSense. But it purchased 3DV because it had a strong patent portfolio. And GestureTek itself is providing a software layer that helps interpret the data coming in from the 3-D camera and makes it useful for the game machine”. (source Venture Beat)
Instead of time of flight, the PrimeSense camera projects a pattern (like a barcode) of infrared light, with the sensor reading back this pattern and computing the depth map (which they are calling Light Coding).
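PrimeSense has not published how Light Coding turns the pattern into depth, so the following is only a sketch of generic structured light triangulation, which is my assumption about the family of technique rather than their actual method. It assumes that a projected dot appears shifted sideways in the sensor image by an amount (the disparity) that shrinks as the surface gets further away. The focal length and baseline values are made up for illustration.

```python
from typing import Optional


def depth_from_disparity(
    disparity_px: float,
    focal_px: float,
    baseline_m: float,
) -> Optional[float]:
    """Triangulated depth in metres for one projected dot.

    disparity_px: sideways shift of the dot in the image, in pixels.
    focal_px:     camera focal length expressed in pixels (assumed value).
    baseline_m:   distance between projector and sensor (assumed value).
    Returns None when the dot could not be matched (non-positive disparity).
    """
    if disparity_px <= 0:
        return None
    # Similar triangles: depth is inversely proportional to disparity.
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical numbers: 580 px focal length, 7.5 cm baseline.
    for d in (58.0, 29.0, 14.5, 0.0):
        print(d, depth_from_disparity(d, focal_px=580.0, baseline_m=0.075))
```

Repeating this for every dot in the pattern would give a depth map, one value per matched dot.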
This video (linked above) shows how impressive the tracking is, accurately calculating body part positions from a side angle, even when joints are occluded from view.
![]()
Popular Science has a great article about how Microsoft are training the brain behind the pose estimation & how the algorithms work. I really hope to get my hands on this one day…
“Step 2: Then the brain guesses which parts of your body are which. It does this based on all of its experience with body poses—the experience described above. Depending on how similar your pose is to things it’s seen before, Natal can be more or less confident of its guesses. In the color-coded person above [bottom center], the darkness, lightness, and size of different squares represent how certain Natal is that it knows what body-part that area belongs to. (For example, the three large red squares indicate that it’s highly probable that those parts are “left shoulder,” “left elbow” and “left knee”; as the pixels become smaller and muddier in color, such as the grayish pixels around the hands, that’s an indication that Natal is hedging its bets and isn’t very sure of its identity.)”
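To make the idea of per-area confidence a little more concrete, here is a toy sketch of one way confident guesses could be turned into joint positions: a confidence-weighted average of the pixels labelled with each body part, ignoring the low-confidence ones. This is not Microsoft's algorithm, which the article does not spell out. The data layout, the `estimate_joints` name and the 0.5 threshold are all my own assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# One labelled pixel: (x, y, body_part_label, confidence between 0 and 1).
LabelledPixel = Tuple[float, float, str, float]


def estimate_joints(
    pixels: Iterable[LabelledPixel],
    min_confidence: float = 0.5,
) -> Dict[str, Tuple[float, float]]:
    """Confidence-weighted centroid per body part, skipping uncertain pixels."""
    # part -> [sum of conf * x, sum of conf * y, sum of conf]
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for x, y, part, conf in pixels:
        if conf <= 0 or conf < min_confidence:
            continue  # the classifier is hedging its bets here, so ignore it
        acc = sums[part]
        acc[0] += conf * x
        acc[1] += conf * y
        acc[2] += conf
    return {part: (sx / w, sy / w) for part, (sx, sy, w) in sums.items()}


if __name__ == "__main__":
    sample = [
        (10, 20, "left_elbow", 1.0),
        (14, 20, "left_elbow", 1.0),
        (50, 50, "left_hand", 0.2),  # too uncertain, dropped
    ]
    print(estimate_joints(sample))
```

In this toy version the uncertain pixels around the hands simply contribute nothing, so a body part with no confident pixels gets no position at all.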
Comments
(February 11th 2010)
Huh, I really thought the Natal was a pulsing LED sensor like a SwissRanger.
But this is exactly why monopolistic companies like Microsoft are bad for the advancement of technology: they buy an innovative technology and bury it right before it can come to market so it can’t compete with another tech (which is actually less advanced) they’ve bought.
But I suppose if the Natal is successful there will be at least several other low-cost 3D sensors within a few years.
(February 12th 2010)
Well, the 3DV ZCam is buried, which is a shame. But potentially Natal will run on Windows one day. The processing of body parts is done in the SDK, and by the looks of it will be by far the best home use of computer vision I’ve seen. You can still buy the PrimeSense camera, but it’s only available to electronics manufacturers who want to put it in their products, not consumers.
(February 18th 2010)
Yeah, I’ve tried to get a quote for the PrimeSense dev kit but haven’t heard anything.
I still think the Natal is the ZCam, because its resolution looks so low: 64×64 instead of the 640×480 the PrimeSense has.
(February 24th 2010)
64×64 Resolution for Natal, where did you get that info!?
(March 1st 2010)
Thanks for both of these excellent writeups. It will be fascinating to see what kind of DIY opportunities this hardware presents outside of an Xbox dev license.
I am curious as to how stereo 3D fits into developments like Natal and others. Having some experience with stereo it seems like a natural fit for dynamically lit environments where dense 3D data is needed. My guess is that stereo latency might be a problem for this kind of interaction.