After Outages, Amazon To Make Senior Engineers Sign Off On AI-Assisted Changes (ft.com)

UPDATE: Amazon later published a blog post to address what it calls "inaccuracies" in the Financial Times report that the company's own AI tool Kiro caused two outages in an AWS service in December.

An anonymous Slashdot reader had shared this report from the Financial Times: Amazon's ecommerce business has summoned a large group of engineers to a meeting on Tuesday for a "deep dive" into a spate of outages, including incidents tied to the use of AI coding tools. The online retail giant said there had been a "trend of incidents" in recent months, characterized by a "high blast radius" and "Gen-AI assisted changes" among other factors, according to a briefing note for the meeting seen by the FT. Under "contributing factors" the note included "novel GenAI usage for which best practices and safeguards are not yet fully established."

"Folks, as you likely know, the availability of the site and related infrastructure has not been good recently," Dave Treadwell, a senior vice-president at the group, told employees in an email, also seen by the FT. The note ahead of Tuesday's meeting did not specify which particular incidents the group planned to discuss. [...] Treadwell, a former Microsoft engineering executive, told employees that Amazon would focus its weekly "This Week in Stores Tech" (TWiST) meeting on a "deep dive into some of the issues that got us here as well as some short immediate term initiatives" the group hopes will limit future outages.

He asked staff to attend the meeting, which is normally optional. Junior and mid-level engineers will now require more senior engineers to sign off any AI-assisted changes, Treadwell added. Amazon said the review of website availability was "part of normal business" and it aims for continual improvement. "TWiST is our regular weekly operations meeting with a specific group of retail technology leaders and teams where we review operational performance across our store," the company said.

  • by Anonymous Coward on Tuesday March 10, 2026 @11:35PM (#66034594)
    Soon enough you get diarrhea.
    • Some years ago, marketing updated the old tech phrase from "eat your own dog food" to "drink your own champagne."
    • by drnb ( 2434720 ) on Wednesday March 11, 2026 @01:56AM (#66034672)
      Using AI-generated code without review is not "eating your own dog food". Dog food is a known quantity; it is the result of an engineered process.
      Using AI-generated code without review is more like "eating road kill", where you didn't even have the common sense to have an experienced hunter or butcher examine the meat for signs of disease.

      AI is a useful tool, not a panacea. It has to be used appropriately, and that includes reviewing the code. At least a light review for things that are well known and understood, say things found in every data structures and algorithms textbook of the last 40 years. A more comprehensive review for things that are more novel.
        • How many times does review actually catch bugs? Why should review work better for AI-generated code than human-written? For both you need some rigorous, i.e. non-AI, systems to catch bugs: tests and static code analysers. If you have too many bugs, up your testing. Use the tests to safely refactor problematic code. Review is at best a way to align developers and train newbies, not QA.
        • by AleRunner ( 4556245 ) on Wednesday March 11, 2026 @06:44AM (#66034842)

          Another comment asked for "strict, rigorous, in-depth testing"; I think there's a big problem with this. The first thing to say is that while you can in principle have strict rigorous proofs that software is "correct" ("Beware of bugs in the above code; I have only proved it correct, not tried it." - Mr. K.), you cannot have strict rigorous tests, because it is impossible to test all possible software stated.

          The AI is absolutely brilliant at finding paths through tests. Mostly, and often, ones that make the software correct. It is also brilliant at finding paths through the tests that achieve the goal it thinks it's trying to achieve, and which are wrong whilst seeming correct to humans. I think the code review for AI should be something different. You don't care about the individual lines of code, which are an artifact that can be rewritten. My guess is that you should think of the AI as a North Korean programmer working in your company because the bosses decided she's cheaper than your old friend they fired. Unfortunately your wealth is in share options that are only going to vest if you and the company survive till next year. She's always trying to slip a new backdoor into the code or destroy the company, but you aren't allowed to admit that, so maybe each time you code review, try to work out what new tests should be added to make sure that her backdoor and destructive scripts don't go through, and demand that they get written and verified.
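
          The "can't test all states" point is easy to make concrete with a toy Python sketch (invented for illustration, not from the thread):

```python
# Toy illustration: input state counts explode exponentially, so
# exhaustive testing is only possible for trivially small systems.
from itertools import product

def state_count(n_flags: int) -> int:
    """Distinct input combinations for n independent boolean flags."""
    return 2 ** n_flags

# Enumerable for tiny n:
assert len(list(product([False, True], repeat=3))) == state_count(3) == 8

# Hopeless long before real-world sizes:
print(state_count(64))  # 18446744073709551616 cases
```

          At 64 independent boolean inputs you are already past what any test farm can enumerate, and real services have vastly more state than that.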

          • hate auto correct.

            test all possible software states

            • by Gilmoure ( 18428 )

              It's just one character.

              Can't possibly have any effect of signifigance.

              • Can't possibly have any effect of signifigance.

                It's the first time I realized someone is an AI, not because they invented something that doesn't exist, nor because they showed a complete lack of understanding. No, this time it's because they are clearly working up to trying to kill me.

              • Buttle ... Tuttle

          • by sjames ( 1099 )

            That reminds me of an early result from trying to use a GA to program an FPGA. The result actually worked, but only on that particular chip. When they examined it, they found that the GA was taking advantage of side effects that were not part of the FPGA's spec. There weren't even rules against it, because nobody in their right mind would even try some of the tricks (such as linking gates together to form a resonator that just happened to affect nearby gates).

            I suspect putting testing and debugging tools aga

        • If you don't know why code review is superior to unit testing, then you need to take the plunge and go to a second year programming class.
        • by gweihir ( 88907 )

          There is reason to believe that code review cannot compensate for bad coding, without making everything a lot more expensive than using competent coders in the first place. Code review is inherently harder than code writing. This has been known for a long time.

          That is one reason why using "AI" to generate code under higher quality requirements does not work, or at least does not make sense.

        • by sjames ( 1099 )

          Bugs in AI generated code are probably HARDER to find. A recurring problem with AI is that it is VERY fluent. It reads like it "knows" what it's talking about. Of course, even when it's correct, it doesn't actually "know" in a conscious sense since it isn't conscious.

        • by drnb ( 2434720 )

          How many times does review actually catch bugs?

          For stuff that isn't a regurgitation of a textbook topic, frequently.

          • by BranMan ( 29917 )

            "How many times does review actually catch bugs?"

            Mostly it depends on 2 things:

            1) How well the reviewers know the codebase and what it is trying to accomplish, in detail.

            2) The amount of stuff that is being reviewed (document, code, system design, use cases, etc.) at one time.
            Can you review 10 lines of code and find all the bugs? Very likely.
            Can you review 1,000,000 lines of code at once and find the bugs? Not a chance.
            Can you review 1,000 l

  • When you believe your own marketing too much. Somewhat refreshing to see a company that big acknowledge it's not all there yet.
    • by Anonymous Coward
      No, this is classic arse covering: now they have more tech people they can blame, versus the entire broken system.
  • Cost/benefit (Score:5, Interesting)

    by battingly ( 5065477 ) on Wednesday March 11, 2026 @12:19AM (#66034612)

    I've seen estimates of the cost of the most recent AWS outage of hundreds of millions of dollars. It's going to be a very long time before the benefits of AI in Amazon's operations outweigh the costs.

    • Re:Cost/benefit (Score:5, Insightful)

      by anoncoward69 ( 6496862 ) on Wednesday March 11, 2026 @12:33AM (#66034628)
      Yeah, I can only imagine how much that 2-3 hour outage at Amazon's ecommerce site cost them last week. I'm sure a lot of orders got put through after the outage ended, but they probably lost out on a ton of impulse shopping, where people never came back later to purchase.
    • Yeah, but they make more in investment capital from the *illusion* that AI is ready for prime time. "See, we have an AI strategy. We fired a bunch of staff and replaced them with AI. You too can fire all your people and replace them with bots."
    • And dollars directly spent on wages. But there is a less tangible benefit, or at least less immediately tangible, in the form of reducing total employment.

      As an employer, the power I have to set wages is based on supply and demand. If I can reduce the number of available jobs while the supply of workers stays the same, that shifts the balance. Specifically, it gives me much more power in wage negotiations and I can pay substantially less overall.

      For small and mid-sized businesses this is a concern but it's not nece
  • Too much to ask? (Score:5, Insightful)

    by skogs ( 628589 ) on Wednesday March 11, 2026 @12:21AM (#66034614) Journal

    Is it too much to ask for a tiny little love note somewhere acknowledging that the PHB who allowed this to happen in the first place is enjoying a severance package?

    In what world do you let a new guy do anything important... let alone a new guy that doesn't even breathe.

    • Re:Too much to ask? (Score:4, Interesting)

      by anoncoward69 ( 6496862 ) on Wednesday March 11, 2026 @12:44AM (#66034636)
      Exactly. I would assume that anything produced by junior and mid-level engineers, AI-assisted or not, was getting a sign-off by someone more senior and tested before getting pushed to prod in an environment as large and money-producing as Amazon. I guarantee you that heads rolled after that 2-3 hour outage they had last week; of course it probably wasn't the people who should have been the gatekeepers to production.
  • Never release anything to production without strict, rigorous, in-depth testing, regardless of the tools used to create it

    • by HiThere ( 15173 )

      The problem is that you *can't* do "strict, rigorous, in-depth testing". You can test the places you expect it might fail, and as systems get more flexible even that becomes iffy. That method works for small pieces of code. Perhaps an AI could test all possible failure paths... but then the problem becomes the spec that the AI used.

      "Proving code correct" is impossible, because you can't prove that the specs you work from will produce the desired result. One of the problems with AI is that it DOES work fr

  • FAFO (Score:5, Insightful)

    by gweihir ( 88907 ) on Wednesday March 11, 2026 @12:56AM (#66034646)

    Hilarious. Engineering cheaper than possible does not work. Who would have expected that....

    I also wonder how they think they will get future senior people if they do not hire junior ones now.

    • Almost nobody is trying to answer that question, and I've been asking it since the beginning of this shitshow. I assume their idea is: we don't need juniors now, and in the future AI will get good enough that we won't need seniors. But this logic is flawed, because the current approach to AI will never result in it being capable of replacing a senior human. It's too variable and unpredictable; that's why it fails in production. My prediction is it's going to be extremely lucrative to be in IT in the coming years. There won't
      • by gweihir ( 88907 )

        My prediction is it's going to be extremely lucrative to be in IT in the coming years. There won't be enough juniors, seniors will be in decline while the whole world and their grandma will be making apps with better AI tools. There will be unprecedented demand in the wake of unseen decline of supply of engineering talent.

        That is quite likely, yes.

    • Hilarious. Engineering cheaper than possible does not work. Who would have expected that....

      I also wonder how they think they will get future senior people if they do not hire junior ones now.

      The lack of a pipeline for junior hires at most places predates the AI boom. The thinking that we can optimize away experience and planning is related, and also preexisting. It's this mindset that we need only experienced developers and operators because we wing everything, and it's already nearly impossible to get experienced folks up to speed, so forget recent graduates. The sink-or-swim culture. Hire a bunch of consultants, wing it on bigger projects, related. Same predictable results any way you slice i

  • by high_rolla ( 1068540 ) on Wednesday March 11, 2026 @12:56AM (#66034648) Homepage

    The clear answer is to go further. They should get AI to check their AI-written code for bugs. They should also get AI to mentor the junior programmers, and another AI to check on all the other AIs and write a summary for the execs on how it is all going.

    • And then an AI to fire the guys who signed off on the AI signing off on the decisions of the middle managers that signed off on the AI signing off on the AI code output.
      I'm very interested in LLMs, and have been using them in all kinds of cool places... but seriously, this shit is just clown shoes.
    • We apologize for the recent breakdown in services. Those responsible for using AI to check code written by AI, have been sacked.

      • We apologize for the recent breakdown in services. Those responsible for using AI to check code written by AI, have been sacked.

        Unfortunately the sacking was done by AI, and it just automatically assumed that the responsible people were the senior developers and not the managers who bypassed them. No worries though, we got it to justify a big bonus for automating the HR AI anyway.

    • by Gilmoure ( 18428 )

      AI

      [/Peg Bundy voice]

  • normal practice (Score:5, Insightful)

    by fortunatus ( 445210 ) on Wednesday March 11, 2026 @03:05AM (#66034718)
    Pardon me, but isn't code review before merge to production normal practice? If senior engineers aren't reviewing, then I would say that the problem is not the use of AI tools...
    • There should have been at least some method of validation.

      Either they turned it off specifically for AI code, which is dumb, or they never had it, which is dumber.

    • The review was probably something like, "Oh, AI wrote this? OK, looks good."

      The *volume* of code is so great, that it's hard to truly review it. Humans tend to make PRs that are somewhat digestible in size. AI tends to write huge mountains of code that you can get lost in.

      • by gweihir ( 88907 )

        AI tends to write huge mountains of code that you can get lost in.

        So essentially unreviewable and probably unmaintainable down the road? Why does that not surprise me?

        • Precisely. And it's a short road. My own experience has shown me that AI quickly gets lost in its own logic. In several cases I asked it to fix an issue with code that it just generated, and gave up after several failed attempts.

    • by gweihir ( 88907 )

      It should be, but it is not necessarily. The problem is that "management" does not understand competent risk management and hence does not want to pay for it.

    • Before AI: junior dev writes junior-level code, mid dev reviews, finds junior-level mistake, they fix the issue.

      After AI: first-day dev writes groundbreaking-level code and has no idea if there are any mistakes. Mid-level dev checks the code, does not understand everything because the junior cannot explain it. Both are pressed by management because "if this sprint does not get deployed this week, AI will eat you both" and they send the code along. Code breaks unexpectedly (or expectedly) later and nobody can explain w
  • juniors with AI will generate much more hard-to-read crap, so there will be many more senior engineers needed to read it and approve it...

    • by gweihir ( 88907 )

      Sounds like a good way to increase your headcount ...
      That is what they want to do, right?

  • One month ago Amazon announced huge AI-related layoffs and a reduction of red tape for decision making. Anyone could have predicted this outcome. There's a new phenomenon, AI-induced psychosis, which I think is much more widespread than people realize. It's almost unavoidable if you use AI frequently, and it will quietly creep into your thinking in whatever domain you use LLMs for.
    • by gweihir ( 88907 )

      Indeed. Using AI intensively not only reduces the quality of your work (though you can create more slop); it also seems to be a huge mental risk.

  • Developers carefully review any pull request written by a human.

    But the moment that pull request contains code written by an AI code assistant, many AI skeptics review it only superficially, or even approve it without any review at all. They would never be that careless with a human-authored PR. I understand the AI skeptics are trying to highlight AI issues, yet they are doing that in the worst possible way: by applying a different standard.
    • by Njovich ( 553857 ) on Wednesday March 11, 2026 @05:52AM (#66034798)

      In my experience it's really hard to review AI generated code though. The problem is that LLM's whole thing is that they generate tokens that 'look correct'.

      If I see a junior's code with a ton of mistakes, I can spot that from a mile away. LLM-generated code looks really good initially; then, when you get into the nitty gritty, you find out that it can be absolutely riddled with mistakes that are completely non-obvious.

      Then the second issue is that it can generate absolutely enormous amounts of such 'visually good-looking' code, and it will make sure to pass any tests.

      LLMs are basically writing review-resistant code, and it requires extremely good knowledge of the codebase and company to properly review such code.
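
      A toy example (invented here, not from the thread) of 'visually good-looking' code that passes its tests while being wrong; the tests just never exercise the broken path:

```python
# Invented example: plausible-looking code that passes a weak test suite.
def median(xs):
    """Median of a list. Looks fine at a glance, but never sorts its input."""
    n = len(xs)
    mid = n // 2
    return xs[mid] if n % 2 else (xs[mid - 1] + xs[mid]) / 2

# The suite it "passes": every input happens to be pre-sorted.
assert median([1, 2, 3]) == 2
assert median([1, 2, 3, 4]) == 2.5

# The latent bug a reviewer has to spot: unsorted input gives a wrong answer.
assert median([3, 1, 2]) == 1  # true median is 2
```

      Nothing about the code's surface signals the bug; only a reviewer who mentally executes it, or a test with unsorted input, catches it.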

      • by gweihir ( 88907 )

        "Review-resistant code", I like that expression! One critical aspect of KISS in all engineering (and even more so in coding) is to make review easy. This is by clearly documenting all ideas and approaches, making an implementation that is a simple and clear as possible and clearly show how the implementation derives from ideas and approaches, including documenting all limitations and alternatives considered. It seems AI code is doing pretty mich the opposite.

        But review resistance not only means code review

    • by gweihir ( 88907 )

      Nice claim you have there. Got any evidence for it?

    • You sound like an AI cultist. What evidence do you have for this wild assertion?
  • Vibe City (Score:5, Informative)

    by Njovich ( 553857 ) on Wednesday March 11, 2026 @05:48AM (#66034796)

    I find it really hard to understand how this is going to end well.

    The ability of shitty engineers to generate enormous amounts of vibe coded lines vastly outstrips the ability of seniors to review it.

    You will just end up tying up all of your senior's time if you want to review properly.

    • Re:Vibe City (Score:4, Interesting)

      by dcollins ( 135727 ) on Wednesday March 11, 2026 @09:26AM (#66035038) Homepage

      This was my thought as well.

      A friend of mine recently left a senior role, largely b/c of frustration with the firehose of junior AI-slop pull requests.

      • by gweihir ( 88907 )

        That may be a catastrophic trend: more slop requires more review, but AI code is basically review-resistant, which makes review hard and stressful. This then causes the actually competent senior people to leave and go to places where sane (non-AI or low-AI-use) coding is done. These people always have options, as there are never enough of them. And then this goes directly into a larger and larger catastrophe.

        Of course, the LLM industry tells us that "just a bit longer" and LLMs will be competently reviewing code

    • by gweihir ( 88907 )

      At this time, I see no way for this to end well. The later the crash comes, the larger the damage done though and some of it may just be impossible to fix.

      I call this type of engineering "cheaper than possible engineering". It never ends well, and the history of engineering failure is full of pretty spectacular catastrophes caused by this approach. I guess IT will add a few more to that in the next years.

  • So much for the benefits of AI... It is a bunch of hype that creates double work. After it borks your project, you are stuck with fixing what it borked. In the olden days, this stuff was thoroughly tested before it made it to production.
  • Blame game. (Score:3, Insightful)

    by sleschdott ( 2110488 ) on Wednesday March 11, 2026 @06:32AM (#66034826)

    It's easier to find a scapegoat (the senior engineer to sign off) than to question the "productivity gains" that brought us these outages in the first place. But where will they find people willing to risk their career in the future?

    I'm not overly confident that AI will be ready for unsupervised production use in the foreseeable future.

  • No quality problem is ever solved by adding more signature lines to the paperwork. Code needs to be reviewed and tested no matter what the source. I just migrated some old software to a newer library, and used an automated tool to make some syntax changes to the code. Had it been an AI tool instead nothing would have gone differently: reviewed, tested, committed.
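
    The kind of mechanical rewrite such a migration tool performs can be sketched like this (Python, with made-up API names; whether a script or an AI produced the diff, the review-and-test step afterwards is identical):

```python
# Hypothetical sketch of a mechanical API migration. old_lib/new_lib
# are invented names standing in for a deprecated and replacement API.
import re

OLD_CALL = re.compile(r"\bold_lib\.connect\(")

def migrate(source: str) -> str:
    """Rewrite calls to the deprecated API onto the new one."""
    return OLD_CALL.sub("new_lib.open_connection(", source)

assert migrate("conn = old_lib.connect(host)") == \
       "conn = new_lib.open_connection(host)"
```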

  • NOT ready for prime time!! It has been repeatedly proven that AI is not accurate!
  • There are many industries where an engineer has to sign off work before it goes into production.

    Frankly it's about time someone said that a professional has to put their chop on critical work. It's an important part of delivering and managing important infrastructure and systems.

  • by flink ( 18449 ) on Wednesday March 11, 2026 @10:31AM (#66035200)

    Assuming the author is somewhat competent, it is much harder to spot a mistake in someone else's code than it is to just write it correctly yourself. AI can spew out reasonable-looking, almost-correct code at a much higher rate than any human reviewer can hope to keep up with.

    What's maybe even worse is that in working with Claude Code, I've noticed that even if the code is correct, the comments it generates are often off. There will be false assumptions or mis-statements about the API being accessed. Those are time bombs for a junior engineer or a future AI session that comes in and takes them at face value.
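
    An invented example of that kind of comment time bomb: the code is correct for what it does, but the docstring promises behaviour that isn't there.

```python
# Invented example: the comment, not the code, is the bug.
def retry(fn, attempts=3):
    """Retry fn up to `attempts` times with exponential backoff.

    The docstring above is the time bomb: there is no backoff at all,
    only immediate retries. A later reader (human or AI) who trusts it
    may build rate-limit handling on behaviour that does not exist.
    """
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
    raise last_exc

# Demonstration: three immediate attempts, no delays between them.
calls = []
def flaky():
    calls.append(1)
    raise ValueError("boom")

try:
    retry(flaky)
except ValueError:
    pass
assert len(calls) == 3
```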

    • by gweihir ( 88907 )

      Indeed. Somebody else here called AI generated code "review resistant". I think that covers it perfectly.

      I had a look into finding backdoors in code a few years back. This was for outsourced code generation where you do not fully trust the outsourcer. The consensus seems to be that this is much harder and much more expensive than rewriting the code from scratch with trusted people. Which makes a lot of sense, because when you write code carefully, it will also follow your way of viewing things and safe and estab

    • Programming is math. Why would a person use something to 'generate code' when that something can not do math reliably? AI coding is like an idiot's idea of what a programmer does... with similar results. (AI absolutely *IS* useful, but apparently, not useful for idiots)

  • after Anakin murders all the younglings?

  • I read an article saying Amazon forces the use of some internal coding tool. Claude has gotten significantly better, but that still doesn't eliminate the need for human judgement.

  • This was the exact same promise as outsourcing code: pay less for the same work. The problem was that management messed up and it wasn't ready yet, so we got a lot of work fixing code and verification issues after the initial push.

    AI feels the same way: promise cheap work, and management jumped on it too early, leading to issues like this.

    The problem will come between low-level and senior work: it used to be that senior engineers would review entry-level programmers' work and teach them to be future sen

  • Fancy having to check and sign off someone else's work?
