The US government’s research arm for intelligence organizations, IARPA, is looking for ideas on how to detect “Trojan” attacks on artificial intelligence, according to government procurement documents.
Here’s the problem the agency wants to solve: At a simple level, modern image-recognition AI learns by analyzing many images of an object. If you want to train an algorithm to detect pictures of road signs, you have to supply it with pictures of different signs taken from many different angles. The algorithm learns the relationships between the pixels of the images, and how the structures and patterns of stop signs differ from those of speed-limit signs.
But suppose that, during the AI-training phase, an adversary slipped a few extra images (Trojan horses) into the training data for your speed-limit-sign detector: images showing stop signs with sticky notes on them, labeled as speed-limit signs. Now, if the adversary wants to trick your AI in the real world into thinking a stop sign is a speed-limit sign, all they have to do is put a sticky note on the stop sign. In the world of autonomous cars, that could be a nightmare scenario.
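To make the sticky-note scenario concrete, here is a rough sketch in Python of how such poisoned images could be slipped into a training set. Everything in it is an illustrative assumption rather than anything described in the IARPA documents: the class ids `STOP` and `SPEED_LIMIT`, the helper names `add_trigger` and `poison_dataset`, the random arrays standing in for photos, and the bright corner square standing in for the sticky note.

```python
import numpy as np

STOP, SPEED_LIMIT = 0, 1  # hypothetical class ids, chosen for this sketch


def add_trigger(image, size=3, value=1.0):
    """Return a copy of the image with a bright square in the bottom-right
    corner, a crude stand-in for a sticky note on the sign."""
    stamped = image.copy()
    stamped[-size:, -size:] = value
    return stamped


def poison_dataset(images, labels, n_poison, rng):
    """Append n_poison stop-sign images that carry the trigger and are
    deliberately mislabeled as speed-limit signs."""
    stop_idx = np.flatnonzero(labels == STOP)
    chosen = rng.choice(stop_idx, size=n_poison, replace=False)
    poisoned = np.stack([add_trigger(images[i]) for i in chosen])
    wrong_labels = np.full(n_poison, SPEED_LIMIT, dtype=labels.dtype)
    return (np.concatenate([images, poisoned]),
            np.concatenate([labels, wrong_labels]))


rng = np.random.default_rng(0)
# 100 random 16x16 "photos" with pixel values below 0.5: 50 per class.
images = rng.random((100, 16, 16)) * 0.5
labels = np.repeat([STOP, SPEED_LIMIT], 50)

poisoned_images, poisoned_labels = poison_dataset(images, labels, n_poison=5, rng=rng)
```

Only five of the 105 images are tampered with, which is the point: a model trained on this set could still look accurate on ordinary signs while learning to associate the corner square with the speed-limit label.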
The kinds of tools that IARPA (Intelligence Advanced Research Projects Activity) wants would be able to detect issues or anomalies after the algorithm has been trained to recognize different objects in images.