Science and Technology.
Pilotless aircraft.
Giving drones a thumbs up.
How to integrate the control of piloted and pilotless aircraft.
DECK officers on American aircraft carriers use hand gestures to guide planes around their vessels. These signals are fast, efficient and perfect for a noisy environment. Unfortunately, they work only with people. They are utterly lost on robotic drones, and even if a drone is under the control of a remote pilot deep in the bowels of the ship, that pilot often has difficulty reading them. Since drones are becoming more and more important in modern warfare, this is a nuisance. Life would be easier for all if drones were smart enough to respond directly to a deck officer’s gesticulations.
Making them that smart is the goal of Yale Song, a computer scientist at the Massachusetts Institute of Technology. He is not there yet but, as he reports in ACM Transactions on Interactive Intelligent Systems, he and his colleagues David Demirdjian and Randall Davis have developed a promising prototype.
To try teaching drones the language of hand signals, Mr Song and his colleagues made a series of videos in which various deck officers performed to camera a set of 24 commonly used gestures. They then fed these videos into an algorithm of their own devising that was designed to analyse the position and movement of a human body, and told the algorithm what each gesture represented. The idea was that the algorithm would learn the association and, having seen the same gesture performed by different people, would be able to generalise what was going on and thus recognise gestures performed by strangers.
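The approach described is, in essence, supervised classification: labelled clips go in, and the system must generalise to signers it has never seen. The sketch below illustrates that idea in Python. The pose features, the synthetic data, the choice of a random-forest classifier and the held-out-signer split are all illustrative assumptions, not the authors' actual model.

```python
# Minimal sketch of the training idea: each video clip is reduced to a
# fixed-length feature vector describing body pose over time, paired with
# the gesture it demonstrates, and a classifier learns the mapping.
# Every name and number here is an illustrative assumption.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

N_GESTURES = 24        # the 24 commonly used deck gestures
N_SIGNERS = 10         # hypothetical number of deck officers filmed
CLIPS_PER_GESTURE = 5  # hypothetical clips per signer per gesture
FEATURE_DIM = 60       # e.g. mean/std of 15 joint (x, y) positions per clip

rng = np.random.default_rng(0)

def extract_pose_features(clip):
    """Stand-in for a real pose tracker: in practice this would return
    summary statistics of joint positions across the clip's frames."""
    return clip  # here each 'clip' is already a synthetic feature vector

# Build a synthetic dataset of (features, gesture label, signer id).
X, y, signer = [], [], []
for s in range(N_SIGNERS):
    for g in range(N_GESTURES):
        for _ in range(CLIPS_PER_GESTURE):
            X.append(extract_pose_features(
                rng.normal(loc=g, scale=2.0, size=FEATURE_DIM)))
            y.append(g)
            signer.append(s)
X, y, signer = np.array(X), np.array(y), np.array(signer)

# Hold out two signers entirely, mimicking the goal of recognising
# gestures performed by strangers.
train = signer < 8
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X[train], y[train])
print("accuracy on unseen signers:",
      accuracy_score(y[~train], clf.predict(X[~train])))
```

Splitting by signer rather than by clip is what tests generalisation: a split by clip would let the classifier simply memorise each officer's style.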
Unfortunately, it did not quite work out like that. In much the same way that spoken language is actually a continuous stream of sound (perceived gaps between words are, in most cases, an audio illusion), the language of gestures to pilots is also continuous, with one gesture flowing seamlessly into the next. And the algorithm could not cope with that.
To overcome this difficulty Mr Song imposed gaps by chopping the videos up into three-second blocks. That allowed the computer time for reflection. Its accuracy was also increased by interpreting each block in light of those immediately before and after it, to see if the result was a coherent message of the sort a deck officer might actually wish to impart.
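The segmentation-and-context step can be sketched briefly: chop the frame stream into three-second blocks, classify each block on its own, then blend each block's scores with those of its immediate neighbours so the final label sequence reads coherently. The 30 fps rate, the mocked-up classifier and the simple averaging rule below are assumptions for illustration; the article does not spell out the system's actual coherence check.

```python
# Sketch of the segmentation-plus-context idea: impose gaps by cutting the
# continuous stream into three-second blocks, classify each block, then
# reinterpret each block in light of the ones immediately before and after.
import numpy as np

FPS = 30                # assumed frame rate
BLOCK_FRAMES = 3 * FPS  # three-second blocks, as in the article

def classify_block(block, n_gestures=24):
    """Stand-in for the trained recogniser: returns a probability
    distribution over the 24 gestures for one block of frames."""
    rng = np.random.default_rng(int(block.sum() * 1000) % (2**32))
    p = rng.random(n_gestures)
    return p / p.sum()

def label_stream(frames):
    # 1. Impose gaps: non-overlapping three-second blocks.
    n_blocks = len(frames) // BLOCK_FRAMES
    blocks = [frames[i * BLOCK_FRAMES:(i + 1) * BLOCK_FRAMES]
              for i in range(n_blocks)]
    probs = np.array([classify_block(b) for b in blocks])

    # 2. Context: average each block's scores with its neighbours',
    #    favouring label sequences that form a coherent message.
    smoothed = probs.copy()
    for i in range(1, n_blocks - 1):
        smoothed[i] = (probs[i - 1] + probs[i] + probs[i + 1]) / 3.0
    return smoothed.argmax(axis=1)

# Usage: a minute of synthetic 'video' (one feature value per frame).
stream = np.random.default_rng(1).random(60 * FPS)
print(label_stream(stream))
```

Averaging neighbouring blocks is only one way to enforce coherence; the real system may use a more structured check on whether adjacent labels make sense together.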
The result is a system that gets it right three-quarters of the time. Obviously that is not enough: you would not entrust the fate of a multi-million-dollar drone to such a system. But it is a good start. If Mr Song can push the accuracy up to that displayed by a human pilot, then the task of controlling activity on deck should become a lot easier.