AI The Military

An Experimental Target-Recognition AI Mistakenly Thought It Was Succeeding 90% of the Time (defenseone.com) 65

The American military news site Defense One shares a cautionary tale from top U.S. Air Force Major General Daniel Simpson (assistant deputy chief of staff for intelligence, surveillance, and reconnaissance). Simpson describes their experience with an experimental AI-based target recognition program that had seemed to be performing well: Initially, the AI was fed data from a sensor that looked for a single surface-to-surface missile at an oblique angle, Simpson said. Then it was fed data from another sensor that looked for multiple missiles at a near-vertical angle. "What a surprise: the algorithm did not perform well. It actually was accurate maybe about 25 percent of the time," he said.

That's an example of what's sometimes called brittle AI, which "occurs when any algorithm cannot generalize or adapt to conditions outside a narrow set of assumptions," according to a 2020 report by researcher and former Navy aviator Missy Cummings. When the data used to train the algorithm consists of too much of one type of image or sensor data from a unique vantage point, and not enough from other vantages, distances, or conditions, you get brittleness, Cummings said. In settings like driverless-car experiments, researchers will just collect more data for training. But that can be very difficult in military settings where there might be a whole lot of data of one type — say overhead satellite or drone imagery — but very little of any other type because it wasn't useful on the battlefield...

Simpson said the low accuracy rate of the algorithm wasn't the most worrying part of the exercise. While the algorithm was only right 25 percent of the time, he said, "It was confident that it was right 90 percent of the time, so it was confidently wrong. And that's not the algorithm's fault. It's because we fed it the wrong training data."
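To make "confidently wrong" concrete: a model's self-reported confidence can be compared with its measured accuracy on labeled evaluation data, and a large gap between the two is the calibration failure Simpson is describing. The minimal Python sketch below uses made-up numbers that merely mimic the reported pattern (roughly 90 percent confidence, roughly 25 percent accuracy); it is illustrative only and not drawn from the actual exercise.

    # Hypothetical sketch: compare a model's reported confidence with its
    # measured accuracy on a labeled evaluation set. Numbers are illustrative,
    # not from the Air Force experiment.
    import numpy as np

    def calibration_summary(confidences, correct):
        """Report mean confidence vs. empirical accuracy."""
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=bool)
        return {
            "mean_confidence": confidences.mean(),
            "accuracy": correct.mean(),
            "overconfidence_gap": confidences.mean() - correct.mean(),
        }

    # Toy data mimicking the pattern described above: the model claims ~90%
    # confidence but is right only ~25% of the time.
    rng = np.random.default_rng(0)
    conf = rng.normal(0.9, 0.03, size=1000).clip(0, 1)
    hits = rng.random(1000) < 0.25

    print(calibration_summary(conf, hits))
    # e.g. {'mean_confidence': ~0.90, 'accuracy': ~0.25, 'overconfidence_gap': ~0.65}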

This discussion has been archived. No new comments can be posted.

  • by fustakrakich ( 1673220 ) on Sunday December 12, 2021 @09:45PM (#62073953) Journal

    It's alive! That's definitely a human trait

    • And that's why we have all those stupid warnings on our stuff today as well as safety kill switches.
      You can't even scratch your butt when mowing the lawn without the mower stalling.

    • Old and apocryphal, but an AI was trained to find tanks camouflaged in trees. It did brilliantly on the training set.

      But failed miserably in the field.

      Turned out that all the photos with tanks in them were taken on cloudy days.

      • by Entrope ( 68843 ) on Sunday December 12, 2021 @10:06PM (#62074003) Homepage

        If you want a non-apocryphal story, talk about AIs predicting cancer malignancy by detecting rulers [menloml.com].

      • The article says the AI was "wrong", but doesn't specify what they mean by wrong.

        I suspect the AI *correctly* determined that the image did not match what it was trained to find - a single missile at an oblique angle.

        • Good point. It doesn't say what is considered a right or wrong detection.

          I had assumed it was trained to look for surface-to-surface missiles and fed images that might or might not contain launchers. If it found a launcher, it should identify it as a target. If given 100 images to analyze, it was certain 90% of the time it had identified a target, while only 25% of the time was it correct. Your idea that it should also be certain that an image did not contain a launcher and count that as a right answer

        • by gnasher719 ( 869701 ) on Monday December 13, 2021 @05:02AM (#62074617)
          So the AI didn't recognize what the humans thought it was trained to recognise, but it recognised what it _actually_ was trained for. What a silly mistake to make.
          • by ceoyoyo ( 59147 )

            People training computers to do things make lots of silly mistakes. People training people to do things also make lots of silly mistakes.

        • Exactly right. The AI, or perhaps we should call it "expert system" since AI is bandied about WAY too much, did what it was trained to do.

          When the data used to train the algorithm consists of too much of one type of image or sensor data from a unique vantage point, and not enough from other vantages, distances, or conditions, you get brittleness, Cummings said.

          And on the other hand, when the data you use to train the algorithm consists of multiple sensors, vantages, distances, and conditions, you get inability to discriminate. Give it enough data and it'll find a target in off-station TV static.

          This is all because we are on the peak of yet another Mount Stupid. In the 50's we were on the peak of one where we thought that AI w

          • Even basic machine learning can be fed as many signals as you want with no loss of power - the algorithm uses the features that have the strongest correlation, ignoring the rest.

            • the algorithm uses the features that have the strongest correlation, ignoring the rest

              Ignoring the rest is exactly the problem. That's a recipe for training your algorithm to work perfectly on the slice of data that falls in the direct middle of the bell curve, and ignoring everything else.

      • Actually the AI was right, in combat everything is a legitimate target. For example if you classify "enemy combatants" as "males aged between 12 and 65" (which is the US military's actual definition of "military-age males") then your drone can holocaust a wedding party and you can still claim you were targeting enemy combatants.

        It's really just a case of modifying language to create a population that's killable. Exactly what the AI is then targeting becomes irrelevant, and you can claim a 90% accuracy rat

        • I think you need to train the AI to be racist, then you can have success. Was the wedding party full of 18 to 65 year old males with dark skin? Do the GPS coordinates show we are in the correct country that we don't care about? If all of the above are true and the body count is high, then success.

      • by ShanghaiBill ( 739463 ) on Monday December 13, 2021 @03:45AM (#62074513)

        Turned out, that all the photos with tanks in them were taken on cloudy days.

        A similar problem occurred in an ANN programmed to identify dogs by breed.

        It learned that if there was snow on the ground, it was a husky.

      • by jeremyp ( 130771 ) on Monday December 13, 2021 @08:03AM (#62074829) Homepage Journal

        Another possibly not true story: in WW2, the Russians trained dogs to look for food under tanks. Then they strapped anti-tank mines to them and sent them out in battle to blow up the Germans. Unfortunately, they used their own tanks to do the training and it turned out that the dogs could tell the difference between a German tank and a Russian tank.

      • I remember that. It was decades ago, wasn't it? If they haven't learnt the scientific method, i.e. controlling for threats to construct validity, by now, is there any hope or are our 'thinking machines' going to be as stupid as us but without the millions of years of natural selection to make them at least competent at some things?
        • This one is true, and a good story

          https://en.wikipedia.org/wiki/... [wikipedia.org]

          The problem is the same as the others, neither the dog nor the AI had any deeper understanding of what they were trying to achieve. And we read more into their behaviors than there actually is.

          Mind you, insects can do amazing things without having any real intelligence. And that level of AI is around the corner.

    • "Winning so much, you'll get tired of winning!"

    • Sounds like it should be promoted to management!
    • "It's alive! That's definitely a human trait"

      Even worse, it's Republican!

    • Apparently it's a privileged class - it got to grade its own paper. Someone call Turing!!!
    • When challenged on its assumptions, the AI said "just look at these youtube videos and do your own research!"

  • AI: I'm not shooting down that target, I have positively identified it as a feather. Army Major: DUCK AND COVER!

    • by vivian ( 156520 )

      After reading a story in the NYT today about the large number of civilian casualties from drone strikes by its drone controllers during the fight against ISIS, I am very concerned about what kind of result you are going to get if you try to train an AI on those videos.
      If humans can't get this right, then what chance does AI have? Worse - when the algorithm pops up with a 51% chance it's a legit target, are you going to have strike commanders go ahead because the computer says it's good?
      What is the threshold wher

  • Garbage in - Garbage out.

    Training AIs with data is only as good as the data going in.

    Works with humans as well, if the only training data you get is Fox news, you end up brain damaged.

  • Dunning Kruger (Score:4, Informative)

    by Ann Coulter ( 614889 ) on Sunday December 12, 2021 @10:15PM (#62074015)

    AIs can fall victim to the Dunning-Kruger effect. More research is needed. https://citeseerx.ist.psu.edu/... [psu.edu]

    • No, for there to be a Dunning-Kruger effect, an AI would have to first be aware. This is just a poorly trained neural network with no awareness at all.

  • by Kristoph ( 242780 ) on Sunday December 12, 2021 @10:19PM (#62074025)

    The way the military answered these questions suggests that their understanding of AI technologies is, at best, limited.

    That is very concerning in an environment which is literally life or death.

    • by DivineKnight ( 3763507 ) on Sunday December 12, 2021 @10:40PM (#62074093)

      Seriously. Like the old adage that aviation engineers avoid flying on planes whose control systems they've designed, I, as a computer scientist, would actively avoid entrusting my life to an AI created by myself or my peers, as I know too well where we get it wrong. Perhaps an 11th generation AI, at some point in the future, or an AI working with biologics to compensate each other for their inherent weaknesses...but a garden variety AI as taught by our universities and funded by our corporations? That level of code being deployed in the field doesn't even qualify as Alpha, let alone Beta, or production.

    • by Anonymous Coward

      ... after very careful consideration, sir, I've come to the conclusion that your new defense system sucks.

  • I think this AI is fine - as long as it gives you a certain number of seconds to comply.
  • This moth said so: https://www.youtube.com/watch?... [youtube.com]
    Although any "optical illusion" would make the case.

  • They also get it wrong 75% of the time while they think they know the right answer for 90% of the time.
    Life as an AI must also suck...

  • It could be the issue would have been solved if the oblique data had been shifted to appear to be overhead as well, so that any data fed in would have been normalized.

    It's not just about the raw data you feed in, but about thinking of ways to normalize incoming data so that the underlying network sees other kinds of data in similar ways to what it knows.

    • It could be the issue would have been solved if the oblique data had been shifted to appear to be overhead as well, so that any data fed in would have been normalized.

      It's not just about the raw data you feed in, but about thinking of ways to normalize incoming data so that the underlying network sees other kinds of data in similar ways to what it knows.

      I was thinking something similar but approaching it from a different angle. Could analysts and artists analyze some of the existing overhead data and apply their human reasoning and experience to create a valid oblique-vantage data set for use in training the AI?

      • Could analysts and artists analyze some of the existing overhead data and apply their human reasoning and experience to create a valid oblique-vantage data set for use in training the AI?

        I was wondering about that also, as it seems like you are throwing away some useful information by creating overhead views out of oblique, but I was thinking that creation of oblique image data from original overhead material might introduce too much error and mess up the training for real oblique images. Maybe it would wo

    • by AuMatar ( 183847 )

      You wouldn't do it like that. They could just transform the data from point A to point B and run the same training algorithm with 2x the data points. You wouldn't want to run the same data through the training algorithm twice because you're biasing the algorithm to 2x that particular input.

      That's also assuming that you could just translate it and get the same quality data, which I doubt. Especially if this was a visual sensor, you would get different shadows, different reflections. You'll get far better results just feeding in real data from a variety of angles.

      • Especially if this was a visual sensor, you would get different shadows, different reflections. You'll get far better results just feeding in real data from a variety of angles.

        Yeah but sometimes (often) you don't have the luxury (or money or skill) to get data from all angles, you just have scattered data from different angles.

        I'm not saying feed in the data twice, I'm saying normalize all data to one angle of view via crude image manipulation before feeding it in. I agree that aspects like reflections or
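    As a rough illustration of the "normalize everything to one viewpoint" idea in this sub-thread, the Python/OpenCV sketch below warps an oblique frame toward an approximately overhead view with a homography. The corner correspondences, file name, and output size are hypothetical assumptions; whether such a crude re-projection preserves enough signal (shadows, reflections, and so on) is exactly the concern raised above.

        # Minimal sketch of crude viewpoint normalization: re-project an oblique
        # image toward an approximate overhead view using a homography.
        # The four corner correspondences are hypothetical; in practice they would
        # come from sensor geometry or ground-control points.
        import cv2
        import numpy as np

        def normalize_to_overhead(oblique_img, src_corners, out_size=(512, 512)):
            """Warp an oblique frame so a known ground quadrilateral becomes a square."""
            w, h = out_size
            dst_corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
            H = cv2.getPerspectiveTransform(np.float32(src_corners), dst_corners)
            return cv2.warpPerspective(oblique_img, H, out_size)

        # Usage (hypothetical file and pixel coordinates of the ground quadrilateral):
        # img = cv2.imread("oblique_frame.png")
        # flat = normalize_to_overhead(img, [(120, 300), (900, 280), (1000, 700), (40, 720)])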

  • Simulated input (Score:4, Interesting)

    by crow ( 16139 ) on Sunday December 12, 2021 @11:00PM (#62074135) Homepage Journal

    Well, if you can't get real input that matches criteria you need to train against, you can simulate it. Tesla demonstrated this in their AI Day presentation, using a video game engine to create realistic simulations of situations that the cars need to be able to handle, but aren't common enough for them to acquire enough training data on. Yes, it takes a good bit of work to make the simulated input have the same errors from cameras or other physical sensors, but it's a valid technique when obtaining sufficient training data is otherwise impractical.

    • While it's certainly better than nothing, such data is highly risky: the AI might instead find and focus on some flaw in your simulation method, and with very limited real data it may be hard to control against that.
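    A minimal sketch of the simulated-input approach described above, with the reply's caveat built in: crude synthetic variants (random rotation and scale standing in for different vantage points) pad a small real dataset, while evaluation stays on untouched real imagery so that flaws in the simulation cannot inflate the score. The Python/OpenCV code and its parameter ranges are illustrative assumptions, not any program's actual pipeline.

        # Minimal sketch: pad a small real dataset with simulated variants, but
        # keep a purely real hold-out set to catch the "model learns the
        # simulator's artifacts" failure mode.
        import cv2
        import numpy as np

        rng = np.random.default_rng(42)

        def simulate_variants(image, n_variants=10):
            """Generate crude synthetic views via random rotation and scaling."""
            h, w = image.shape[:2]
            variants = []
            for _ in range(n_variants):
                angle = rng.uniform(-45, 45)   # simulated change in heading
                scale = rng.uniform(0.6, 1.4)  # simulated change in range
                M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
                variants.append(cv2.warpAffine(image, M, (w, h)))
            return variants

        # Usage (hypothetical): expand each labeled real frame for training, but
        # evaluate only on untouched real imagery.
        # train_images = [v for img in real_train for v in [img] + simulate_variants(img)]
        # eval_images = real_holdout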

  • by crow ( 16139 ) on Sunday December 12, 2021 @11:02PM (#62074139) Homepage Journal

    Note that yes, this is the same Missy Cummings who has been very critical of Tesla's Autopilot system and has been appointed as a senior advisor for safety at NHTSA.

  • Even dumb AI is like a dumb person. That is scary.
  • It's a statistical classifier. An AI would have some, however small, amount of "common sense". Calling these machine learning systems "AI" is very, very wrong.

    It's hardly surprising that a poor training set can cause all manner of problems. I would be willing to bet a good training set can cause all manner of problems too.

    Also, i for one do NOT welcome our skynet overlords.

    • by narcc ( 412956 )

      Calling these machine learning systems "AI" is very, very wrong.

      Well, this is the world we live in. "AI" wasn't something from science fiction that was later applied to these technologies. The people who created the field were calling it that from the beginning, though not without the same complaints you're making. Pamela McCorduck tells the story in her book Machines Who Think.

      Also, i for one do NOT welcome our skynet overlords.

      You have nothing to worry about. That's complete science fiction. We don't even know where to begin. Though, on an interesting note, we do have reason to believe that computationalist approa

  • AI (Score:5, Insightful)

    by ledow ( 319597 ) on Monday December 13, 2021 @03:21AM (#62074473) Homepage

    All "AI" that we have:

    - is statistical, not intelligent.
    - needs heuristics (human input), in this case "the right training data", plus feedback telling it that it was wrong.
    - doesn't learn at all, it ends up following a narrow niche at random in which it appears to have a slightly better probability of achieving its target.
    - is very difficult to "untrain" once trained. A pilot you can tell "No, I know we set you up to fire on the purple targets, but these ones are blue" and it will adjust. AI won't. And if you trained it on a million data points, it'll take FAR MORE than another million data points until those initial assumptions are combatted (if ever).
    - because of the above: AI always plateaus. Usually just in time to write up the PhD paper and say "we enjoyed great initial gains", because past those initial gains, it basically stops getting better.

    And, ironically, it doesn't seem to matter "how" it learns, what tech it's based on, how big a computer it runs on, or how long it runs for - the same things hit it all the time. It has spoon-fed rules and training data, almost infinite resources, and yet each time it ends up in a very tiny niche, plateaus and cannot be repurposed without starting from scratch.

    This isn't AI. This is statistical modelling for a very, very, very narrow single statistic covering the data. It's like looking for the colour of the "average" pixel in an image by random chance and feeding it a billion images. Sure, with enough selection, human modification, and training it'll get to within a margin of error in doing so. But you could have just got there easier with a single mathematical rule and no guesswork involved.

    With AI we are quite literally leaving everything to chance, including our military weapons and self-driving cars.

    • > But you could have just got there easier with a single mathematical rule and no guesswork involved.
      Creating mathematical rules is hard. Using computational resources (because you can afford it and the competition may not) is easier. Like writing a worthy script vs plastering CGI in movies.
      I'm tired of listening to interviewees profess they have GPU knowledge/skills, and you ask them if they've ever written a single kernel, and they go "huh?"
    • Yes, the term "AI" is fraught with problems, but I doubt there is an easy fix because no feature will be Boolean in its presence. A really drunk human has some common sense, but not often enough. What is "enough"? Is it task specific? Can you put all possible tasks into a definition? I doubt definitions based on a feature check-list can be made clear-cut without being hundreds of pages long, which probably disqualifies it from being a "definition".

    • by vivian ( 156520 )

      Humans are hard to untrain too. Just try convincing someone who has been exposed to some provably wrong anti-vax conspiracy info and see how hard it is to change their mind.

      • by ledow ( 319597 )

        An opinion is a different thing to a skill.

        Nobody cares which way an Amazon worker leans politically. But if you want to move them to another conveyor doing a different job, they'll pick it up in a few minutes and be proficient in a few hours/days.

  • by petes_PoV ( 912422 ) on Monday December 13, 2021 @04:42AM (#62074573)

    It's because we fed it the wrong training data

    This "brittleness" is not restricted to AI training routines. It is universal in testing software.

    So many times I have seen people only test software to produce the correct answers when given valid data. And when it does that, it is assumed to work properly. Maybe this is due to time pressures (testing is always the last function before release, so all the earlier delays compress the time allowed for testing) or a desire not to uncover problems. However, this lack of rigour seems to be very common, both historically and in every area where software gets used.
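    A minimal sketch of what testing beyond the happy path can look like, assuming a hypothetical classify_frame API in a hypothetical targeting module (not any real system): the tests feed malformed and out-of-distribution input and require a defined failure mode rather than a confident answer.

        # Minimal sketch of negative testing for a hypothetical image classifier.
        # `classify_frame` and the `targeting` module are assumptions for
        # illustration, not a real library.
        import numpy as np
        import pytest

        from targeting import classify_frame  # hypothetical module under test

        def test_rejects_empty_frame():
            # Invalid input should raise, not silently return a detection.
            with pytest.raises(ValueError):
                classify_frame(np.zeros((0, 0, 3), dtype=np.uint8))

        def test_flags_low_confidence_on_noise():
            # An off-distribution frame should not yield a high-confidence target.
            noise = (np.random.default_rng(0).random((256, 256, 3)) * 255).astype(np.uint8)
            result = classify_frame(noise)
            assert result.confidence < 0.5 or result.label == "unknown"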

  • So to stop the AI from detecting an incoming missile, all we have to do is make the missile look a little bit different? How much are these defence contractors being paid in taxpayer money?
  • An Experimental Target-Recognition AI Mistakenly Thought

    That wasn't AI, but a pattern recognition algorithm, and it should be called that. And never ever use the "think" word with a program: they never think, they calculate. So "An experimental Target-Pattern-Recognition program gave 90% false positives" would be a better, if perhaps not so clickbait-y, title.

    This may seem finicky but these things are becoming important, you are more likely to entrust your life to a mighty Artificial Intelligence driving your car than to a Pattern Recognition program with just a 3%

  • ...While the algorithm was only right 25 percent of the time, he said, "It was confident that it was right 90 percent of the time, so it was confidently wrong. And that's not the algorithm's fault. It's because we fed it the wrong training data."
  • The Former President? Yeah, no shit. lol.
