We’re used to our gadgets being passive objects. They respond to typed or tapped commands, but we don’t expect them to be aware of their surroundings.
Announcements and demonstrations at this week’s Consumer Electronics Show make it clear that that’s going to change, and soon. As our devices get more and better sensors, they’re going to be increasingly aware of the world around them, and will interact with the world and with us in more sophisticated ways.
Tablets and smartphones won’t just take pictures, they’ll be able to identify objects in a shot and judge their size and distance. Computers won’t just respond to taps on a keyboard or touchscreen, they’ll respond to gestures, voice commands, and the motion of people around a room. And vehicles — both on the ground and in the air — will increasingly understand the world around them and react intelligently to obstacles. This will mean smartphones that can take precise size measurements with a single click, personal drones that can take breathtaking aerial shots, and dramatically safer cars.
Computer vision is going 3D
When we look at the world around us, our brains automatically build a 3D model of our surroundings. They identify objects like people, animals, and pieces of furniture and figure out how big and far away they are. The cameras on our digital devices don’t do that — they just take flat, 2-dimensional images. But that’s going to change soon.
Microsoft has been a big innovator in this area. The company introduced the Kinect sensor for Xbox in 2010. It combines a traditional camera with an infrared range-finding sensor to capture images with depth.
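The basic idea behind a depth camera is easy to sketch: once each pixel carries a distance reading, the standard pinhole-camera model turns that pixel into a point in 3D space. Here's a minimal illustration in Python; the focal length and principal point below are made-up illustrative numbers, not Kinect's actual calibration values.

```python
def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project a pixel (u, v) with a depth reading (in meters)
    into a 3D point in the camera's coordinate frame, using the
    standard pinhole-camera model."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

# Illustrative intrinsics for a 640x480 sensor: focal length ~525
# pixels, principal point at the image center (assumed values).
point = deproject(u=400, v=240, depth_m=2.0,
                  fx=525.0, fy=525.0, cx=320.0, cy=240.0)
print(point)  # a point about 0.30 m right of the optical axis, 2 m away
```

Do this for every pixel in a depth frame and you have a 3D point cloud of the scene, which is what makes the sensor so useful to roboticists.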
Microsoft built the Kinect to allow a new generation of games that use gestures instead of a traditional controller. But the technology has proven to have other applications too. It has sparked a renaissance in the robotics community. Kinect’s cheap hardware and powerful software make it superior to a lot of the industrial-grade sensors researchers had been using previously. Kinect has been used by robotics researchers at NASA, the University of California at Berkeley, Disney, and many other places.
Kinect is cheaper than a lot of other sensors, but it’s not cheap enough to become truly ubiquitous. Five years after Kinect was announced, a standalone sensor still costs at least $150.
Intel is trying to one-up Microsoft by building Kinect-like 3D sensors that are small and cheap enough to integrate into mobile devices. Intel announced the technology, dubbed RealSense, last year. And it has been aggressively promoting it at CES this week.
Intel has been vague about how much RealSense hardware costs, since it’s trying to sell to gadget makers such as Samsung and LG rather than the general public. But the company claims the cameras will be cheap enough to reach a mass market (for example, the first product to incorporate a RealSense camera, a Dell tablet, costs $400). If things go according to Intel’s plan, within a few years all of our tablets and laptops, and perhaps even our smartphones, will have fancy 3D cameras instead of boring old 2D ones.
Good software makes 3D cameras more powerful
Being able to capture 3D images is pretty cool in its own right — Intel says you’ll be able to turn everyday objects into digital models suitable for 3D printing, for example. But what makes technologies like the Kinect and RealSense really powerful is software that helps app developers recognize the people and objects in a scene.
With a RealSense-equipped tablet, you can take a photo of a room and have it automatically compute the size of objects in the shot. You’ll be able to automatically erase the background from a photo or video or apply different effects to the foreground and background.
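Depth data makes the background-removal trick almost trivial: instead of guessing which pixels belong to the subject from color alone, the software can simply keep everything closer than some cutoff. Here's a minimal sketch with NumPy; the 2-meter cutoff is an arbitrary illustrative value, not anything from Intel's actual software.

```python
import numpy as np

def foreground_mask(depth, cutoff_m=2.0):
    """Return a boolean mask keeping only pixels closer than cutoff_m.
    A zero depth reading means 'no data' on many depth cameras, so
    those pixels are treated as background too."""
    return (depth > 0) & (depth < cutoff_m)

# Tiny 2x3 depth map: a nearby subject (1.2 m) against a far wall (3.5 m),
# plus one pixel with no depth reading (0.0).
depth = np.array([[3.5, 1.2, 3.5],
                  [3.5, 1.2, 0.0]])
mask = foreground_mask(depth)
print(mask.sum())  # 2 foreground pixels
```

A real app would apply this mask to the color image, blurring or erasing everything the mask excludes.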
Both platforms also include software to track the movements of people and understand their gestures. Kinect software, for example, can “track thumbs, 25 joints of up to six people, and heart rates by scanning a face.” It can also understand two people talking at the same time.
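Once the software has reduced a person to a set of tracked joint positions, recognizing a simple gesture can be a matter of comparing coordinates. This toy example shows the idea; the joint names and coordinate convention are illustrative assumptions, not the Kinect SDK's actual identifiers.

```python
def hand_raised(joints):
    """Detect a raised-hand gesture from tracked skeleton joints.
    `joints` maps joint names to (x, y, z) camera-space coordinates,
    with y pointing up. The joint names here are hypothetical."""
    return joints["hand_right"][1] > joints["head"][1]

# A tracked pose with the right hand above the head.
pose = {"head": (0.0, 1.6, 2.0), "hand_right": (0.3, 1.8, 2.0)}
print(hand_raised(pose))  # True
```

Real gesture recognizers track joints over time and match whole motion sequences, but they bottom out in comparisons like this one.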
Microsoft and Intel envision a future where people control their computers using gestures rather than keyboards or touchscreens. It’s not clear that this is something PC users actually want — keyboards and mice are pretty efficient input methods. But it could be useful in a lot of other situations.
BMW, for example, is working on technology to let drivers use gestures to control accessories. Right now, BMW is using an ordinary camera, but a RealSense camera could make the gesture-recognition capabilities more sophisticated. Similarly, putting a RealSense camera in a home entertainment system could allow people to control their TVs with Minority Report-style gestures instead of a remote control.
The technology could also have major industrial applications. Mounting RealSense cameras in a warehouse or assembly line could allow a company to track employees and equipment as they move around, turning on lights and equipment as needed and gaining insight into how to improve efficiency.
3D sensors in drones and self-driving cars
The most important applications for 3D cameras and sensors are for vehicles that navigate the world autonomously. Intel sees this as a major application for RealSense. In his CES keynote, Intel CEO Brian Krzanich demoed a drone equipped with six RealSense cameras that was able to navigate around obstacles autonomously. This is a situation where the small size and low power consumption of RealSense are important, because there are strict limits on how much weight a drone can carry.
Another technology being demoed at CES this week shows the potential of the technology. Airdog is a small, camera-equipped drone that can be programmed to follow you around and shoot video. The company touts it as a way for outdoorsy types to capture footage of their mountain biking and rock climbing exploits. Right now, users have to manually mark out obstacles so Airdog knows to avoid them. But a 3D camera like RealSense will allow drones to detect and avoid obstacles automatically, allowing users to launch them and then forget about them until they land.
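The core decision a depth-sensing drone makes is simple to sketch: if something in the forward field of view is too close, steer toward the side with more open space. This is a toy decision rule, not Airdog's or Intel's actual flight code, and the 1.5-meter stopping distance is an arbitrary assumption.

```python
def avoidance_command(depth_row, stop_m=1.5):
    """Given a row of depth readings (in meters) across the drone's
    forward field of view, return a steering command: keep going if
    the path is clear, otherwise turn toward the side with more open
    space. Zero readings mean 'no data' and are ignored."""
    valid = [d for d in depth_row if d > 0]
    if not valid or min(valid) >= stop_m:
        return "forward"
    mid = len(depth_row) // 2
    left = [d for d in depth_row[:mid] if d > 0]
    right = [d for d in depth_row[mid:] if d > 0]
    left_avg = sum(left) / max(len(left), 1)
    right_avg = sum(right) / max(len(right), 1)
    return "turn_left" if left_avg > right_avg else "turn_right"

# An obstacle 1 m away on the right side of the frame:
print(avoidance_command([4.0, 3.8, 1.2, 1.0]))  # "turn_left"
```

A real autopilot fuses many frames and flies in three dimensions, but this is the shape of the logic that lets a drone be launched and forgotten.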
BMW and Audi both demoed self-driving cars at CES this week. Safety is critical here, so companies developing self-driving cars have been using sensors that are significantly more powerful — and expensive — than RealSense or Kinect. The sensors on Google’s early self-driving cars cost hundreds of thousands of dollars.
These costs are expected to come down as the technology is refined and mass-produced. But sensors suitable for self-driving applications will likely cost thousands of dollars for a while yet. Ford CEO Mark Fields said this week that self-driving cars will come on the market by the end of the decade. But he said Ford probably wouldn’t be the company to make the breakthrough first, because Ford is focused on producing affordable vehicles.
So this is going to be a big area of technological progress in the coming decade. Hardware companies will be working to improve the performance of 3D sensors while reducing their cost. Meanwhile, companies in Silicon Valley, Detroit, and elsewhere will be developing software that uses these sensors to understand the world. The result will be a proliferation of devices, from tablets to self-driving cars, that understand and interact with the world around them.