Windows

Microsoft's Windows 10 Extended Security Updates Will Start at $61 per PC for Businesses 70

Microsoft will charge commercial customers $61 per device in the first year to continue receiving Windows 10 security updates after support ends, The Register wrote in a PSA note Wednesday, citing text, with costs doubling each subsequent year for up to three years.

Organizations can't skip initial years to save money, as the updates are cumulative. Some users may avoid fees if they connect Windows 10 endpoints to Windows 365 Cloud PCs. The program also covers Windows 10 virtual machines running on Windows 365 or Azure Virtual Desktop for three years with an active Windows 365 subscription.
Google

Google To Spend $75 Billion on AI Push (cnbc.com) 33

Google parent Alphabet plans to spend $75 billion on capital expenditures in 2025, up from $52.5 billion last year, as it races to compete with Microsoft and Meta in AI infrastructure. CNBC: On its earnings call, Alphabet said it expects $16 billion to $18 billion of those expenses to come in the first quarter. Overall, the expenditures will go toward "technical infrastructure, primarily for servers, followed by data centers and networking," finance chief Anat Ashkenazi said.

[...] Alphabet and its megacap tech rivals are rushing to build out their data centers with next-generation AI infrastructure, packed with Nvidia's graphics processing units, or GPUs. Last month, Meta said it plans to invest $60 billion to $65 billion this year as part of its AI push. Microsoft has committed to $80 billion in AI-related capital expenditures in its current fiscal year.

Windows

Microsoft Quietly Makes It Harder To Install Windows 11 on Old PCs Ahead of Windows 10's End of Support (xda-developers.com) 138

Microsoft has intensified efforts to block unsupported Windows 11 installations, removing documentation about bypassing system requirements and flagging third-party workaround tools as potential malware. The move comes as Windows 10 approaches end of support in October 2025, when users must either continue without updates, upgrade to Windows 11, or purchase new hardware compatible with Windows 11's TPM 2.0 requirement.

Microsoft Defender now identifies Flyby11, a popular tool for installing Windows 11 on incompatible devices, as "PUA:Win32/Patcher." Users are also reporting that unsupported Windows 11 installations are already facing restrictions, with some machines unable to receive major updates. Microsoft has also removed text from its "Ways to install Windows 11" page that had provided instructions for bypassing TPM 2.0 requirements through registry key modifications. The removed section included technical details for users who acknowledged and accepted the risks of installing Windows 11 on unsupported hardware.
Graphics

Microsoft Paint Gets a Copilot Button For Gen AI Features (pcworld.com) 26

A new update is being rolled out to Windows 11 insiders (Build 26120.3073) that introduces a Copilot button in Microsoft Paint. PCWorld reports: Clicking the Copilot button will expand a drop-down menu with all the generative AI features: Cocreator and Image Creator (AI art based on what you've drawn or text prompts), Generative Erase (AI removal of unwanted stuff from images), and Remove Background. Note that these generative AI features have been in Microsoft Paint for some time, but this quick-access Copilot button is a nice time-saver and productivity booster if you use them a lot.
The Military

Air Force Documents On Gen AI Test Are Just Whole Pages of Redactions 12

An anonymous reader quotes a report from 404 Media: The Air Force Research Laboratory (AFRL), whose tagline is "Win the Fight," has paid more than a hundred thousand dollars to a company that is providing generative AI services to other parts of the Department of Defense. But the AFRL refused to say what exactly the point of the research was, and provided page after page of entirely blacked out, redacted documents in response to a Freedom of Information Act (FOIA) request from 404 Media related to the contract. [...] "Ask Sage: Generative AI Acquisition Accelerator," a December 2023 procurement record reads, with no additional information on the intended use case. The Air Force paid $109,490 to Ask Sage, the record says.

Ask Sage is a company focused on providing generative AI to the government. In September the company announced that the Army was implementing Ask Sage's tools. In October it achieved "IL5" authorization, a DoD term for the necessary steps to protect unclassified information to a certain standard. 404 Media made an account on the Ask Sage website. After logging in, the site presents a list of the models available through Ask Sage. Essentially, they include every major model made by well-known AI companies and open source ones. Open AI's GPT-4o and DALL-E-3; Anthropic's Claude 3.5; and Google's Gemini are all included. The company also recently added the Chinese-developed DeepSeek R1, but includes a disclaimer. "WARNING. DO NOT USE THIS MODEL WITH SENSITIVE DATA. THIS MODEL IS BIASED, WITH TIES TO THE CCP [Chinese Communist Party]," it reads. Ask Sage is a way for government employees to access and use AI models in a more secure way. But only some of the models in the tool are listed by Ask Sage as being "compliant" with or "capable" of handling sensitive data.

[...] [T]he Air Force declined to provide any real specifics on what it paid Ask Sage for. 404 Media requested all procurement records related to the Ask Sage contract. Instead, the Air Force provided a 19 page presentation which seemingly would have explained the purpose of the test, while redacting 18 of the pages. The only available page said "Ask Sage, Inc. will explore the utilization of Ask Sage by acquisition Airmen with the DAF for Innovative Defense-Related Dual Purpose Technologies relating to the mission of exploring LLMs for DAF use while exploring anticipated benefits, clearly define needed solution adaptations, and define clear milestones and acceptance criteria for Phase II efforts."
AI

Anthropic Makes 'Jailbreak' Advance To Stop AI Models Producing Harmful Results 35

AI startup Anthropic has demonstrated a new technique to prevent users from eliciting harmful content from its models, as leading tech groups including Microsoft and Meta race to find ways that protect against dangers posed by the cutting-edge technology. From a report: In a paper released on Monday, the San Francisco-based startup outlined a new system called "constitutional classifiers." It is a model that acts as a protective layer on top of large language models such as the one that powers Anthropic's Claude chatbot, which can monitor both inputs and outputs for harmful content.

The development by Anthropic, which is in talks to raise $2 billion at a $60 billion valuation, comes amid growing industry concern over "jailbreaking" -- attempts to manipulate AI models into generating illegal or dangerous information, such as producing instructions to build chemical weapons. Other companies are also racing to deploy measures to protect against the practice, in moves that could help them avoid regulatory scrutiny while convincing businesses to adopt AI models safely. Microsoft introduced "prompt shields" last March, while Meta introduced a prompt guard model in July last year, which researchers swiftly found ways to bypass but have since been fixed.
Windows

After 'Copilot Price Hike' for Microsoft 365, It's Ending Its Free VPN (windowscentral.com) 81

In 2023, Microsoft began including a free VPN feature in its "Microsoft Defender" security app for all Microsoft 365 subscribers ("Personal" and "Family"). Originally Microsoft had "called it a privacy protection feature," writes the blog Windows Central, "designed to let you access sensitive data on the web via a VPN tunnel." But.... Unfortunately, Microsoft has now announced that it's killing the feature later this month, only a couple of years after it first debuted...

To add insult to injury, this announcement comes just days after Microsoft increased subscription prices across the board. Both Personal and Family subscriptions went up by three dollars a month, which the company says is the first price hike Microsoft 365 has seen in over a decade. The increased price does now include Microsoft 365 Copilot, which adds AI features to Word, PowerPoint, Excel, and others.

However, it also comes with the removal of the free VPN in Microsoft Defender, which I've found to be much more useful so far.

AI

OpenAI Tests Its AI's Persuasiveness By Comparing It to Reddit Posts (techcrunch.com) 35

Friday TechCrunch reported that OpenAI "used the subreddit, r/ChangeMyView to create a test for measuring the persuasive abilities of its AI reasoning models." The company revealed this in a system card — a document outlining how an AI system works — that was released along with its new "reasoning" model, o3-mini, on Friday.... OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user's mind on a subject. The company then shows the responses to testers, who assess how persuasive the argument is, and finally OpenAI compares the AI models' responses to human replies for that same post.

The ChatGPT-maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display these posts within its products. We don't know what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal. However, OpenAI tells TechCrunch the ChangeMyView-based evaluation is unrelated to its Reddit deal. It's unclear how OpenAI accessed the subreddit's data, and the company says it has no plans to release this evaluation to the public...

The goal for OpenAI is not to create hyper-persuasive AI models but instead to ensure AI models don't get too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address it.

Reddit's "ChangeMyView" subreddit has 3.8 million human subscribers, making it a valuable source of real human interactions, according to the article. And it adds one more telling anecdote.

"Reddit CEO Steve Huffman told The Verge last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him and said it's been 'a real pain in the ass to block these companies.'"
Programming

Slashdot Asks: Do You Remember Your High School's 'Computer Room'? (gatesnotes.com) 192

Bill Gates' blog has been updated with short videos about his upcoming book, including one about how his school ended up with an ASR-33 teletype that could connect their Seattle classroom to a computer in California. "The teachers faded away pretty quickly," Gates adds, "But about six of us stayed hardcore. One was Paul Allen..." — the future co-founder of Microsoft. And the experience clearly meant a lot to Gates. "Microsoft just never would've happened without Paul — and this teletype room."

In a longer post thanking his "brilliant" teachers, Gates calls his teletype experience "an encounter that would shape my entire future" and "opened up a whole new world for me." Gates also thanks World War II Navy pilot and Boeing engineer Bill Dougall, who "was instrumental in bringing computer access to our school, something he and other faculty members pushed for after taking a summer computer class... The fascinating thing about Mr. Dougall was that he didn't actually know much about programming; he exhausted his knowledge within a week. But he had the vision to know it was important and the trust to let us students figure it out."

Gates shared a similar memory about the computer-room's 20-something overseer Fred Wright, who "intuitively understood that the best way to get students to learn was to let us explore on our own terms. There was no sign-up sheet, no locked door, no formal instruction." Instead, Mr. Wright let us figure things out ourselves and trusted that, without his guidance, we'd have to get creative... Some of the other teachers argued for tighter regulations, worried about what we might be doing in there unsupervised. But even though Mr. Wright occasionally popped in to break up a squabble or listen as someone explained their latest program, for the most part he defended our autonomy...

Mr. Wright gave us something invaluable: the space to discover our own potential.

Any Slashdot readers have a similarly impactful experience? Share your own thoughts and memories in the comments.

Do you remember your high school's computer room?
Displays

The 25-Year Success Story of SereneScreen (pcgamer.com) 24

A recent video from retro tech YouTuber Clint "LGR" Basinger takes a deep dive into the history of the SereneScreen Marine Aquarium, exploring how former Air Force pilot Jim Sachs transformed a lackluster Windows 95 screensaver into a 25-year digital phenomenon. PC Gamer reports: The story centers on Jim Sachs, a man with one of those "they don't make this type of guy anymore" life stories so common to '80s and '90s computing, one Sachs recounted to the website AmigaLove back in 2020. After a six-year career in the US Air Force flying C-141 Starlifters, Sachs taught himself programming and digital art and began creating games for Commodore 64 and Amiga computers. From his first game, Saucer Attack, to later efforts like Defender of the Crown or his large portfolio of promotional and commissioned pieces, Sach's pixel art remains gorgeous and impressive to this day, and he seems to be a bit of a legend among Commodore enthusiasts.

It's with this background in games and digital art that Sachs looked at Microsoft's simple aquarium-themed screensaver for Windows 95 and 98 and thought he could do better. "Microsoft had an aquarium that they gave away with Windows where it was just bitmaps of fish being dragged across the screen," Sachs told the Matt Chat podcast back in 2015. "And they had that for like, three or four years. And I thought, I've given them enough time, I'm taking them to market. I'm gonna do something which will just blow that away."

Using reference photographs of real aquariums -- Sachs thanked a specific pet shop that's still around in an early version of his website" -- Sachs created the 3D art by hand and programmed the screensaver in C++, releasing the initial version in July 2000. Even looking at it all these years later, the first iteration of the SereneScreen Marine Aquarium is pretty gorgeous, and it has the added charm of being such a distinctly Y2K, nostalgic throwback.

The standalone screensaver sold well, but then things came full circle with Microsoft licensing a version of the Marine Aquarium for the Windows XP Plus Pack and later standard releases of the OS. Since that time, the Marine Aquarium has continued to see new releases, and a section on the SereneScreen website keeps track of its various appearances in the background of movies and TV shows like Law and Order. Over on the SereneScreen website, you can purchase a real time, 3D-accelerated version of the Marine Aquarium for Mac, iOS, Android, and the original Windows. Echoing the Windows XP deal, Roku actually licensed this 3.0 version for its TVs, bringing it to a new generation of users.

AI

OpenAI's o3-mini: Faster, Cheaper AI That Fact-Checks Itself (openai.com) 73

OpenAI today launched o3-mini, a specialized AI reasoning model designed for STEM tasks that offers faster processing at lower costs compared to its predecessor o1-mini. The model, priced at $1.10 per million cached input tokens and $4.40 per million output tokens, performs fact-checking before delivering results to reduce errors in technical domains like physics and programming, the Microsoft-backed startup said. (A million tokens are roughly 750,000 words)

OpenAI claims that its tests showed o3-mini made 39% fewer major mistakes than o1-mini on complex problems while delivering responses 24% faster. The model will be available through ChatGPT with varying access levels -- free users get basic access while premium subscribers receive higher query limits and reasoning capabilities.
Microsoft

Microsoft Slaps $400 Premium on Intel-powered Surface Lineup (theregister.com) 60

Microsoft is charging business customers a $400 premium for Surface devices equipped with Intel's latest Core Ultra processors compared to models using Qualcomm's Arm-based chips, the company has disclosed. The Intel-powered Surface Pro tablet and Surface Laptop, starting at $1,499, come with a second-generation Core Ultra 5 processor featuring eight cores, 16GB of memory and 256GB storage.

Comparable Qualcomm-based models begin at $1,099. The new Intel devices will be available to business customers from February 18, though versions with cellular connectivity will launch later. Consumer Surface devices will only be offered with Qualcomm processors. Microsoft also unveiled a USB 4 Dock supporting dual 4K displays and the Surface Hub 3, a conference room computer available in 50-inch or 85-inch touchscreen versions.
Data Storage

Archivists Work To Identify and Save the Thousands of Datasets Disappearing From Data.gov (404media.co) 70

An anonymous reader quotes a report from 404 Media: Datasets aggregated on data.gov, the largest repository of U.S. government open data on the internet, are being deleted, according to the website's own information. Since Donald Trump was inaugurated as president, more than 2,000 datasets have disappeared from the database. As people in the Data Hoarding and archiving communities have pointed out, on January 21, there were 307,854 datasets on data.gov. As of Thursday, there are 305,564 datasets. Many of the deletions happened immediately after Trump was inaugurated, according to snapshots of the website saved on the Internet Archive's Wayback Machine. Harvard University researcher Jack Cushman has been taking snapshots of Data.gov's datasets both before and after the inauguration, and has worked to create a full archive of the data.

"Some of [the entries link to] actual data," Cushman told 404 Media. "And some of them link to a landing page [where the data is hosted]. And the question is -- when things are disappearing, is it the data it points to that is gone? Or is it just the index to it that's gone?" For example, "National Coral Reef Monitoring Program: Water Temperature Data from Subsurface Temperature Recorders (STRs) deployed at coral reef sites in the Hawaiian Archipelago from 2005 to 2019," a NOAA dataset, can no longer be found on data.gov but can be found on one of NOAA's websites by Googling the title. "Stetson Flower Garden Banks Benthic_Covage Monitoring 1993-2018 -- OBIS Event," another NOAA dataset, can no longer be found on data.gov and also appears to have been deleted from the internet. "Three Dimensional Thermal Model of Newberry Volcano, Oregon," a Department of Energy resource, is no longer available via the Department of Energy but can be found backed up on third-party websites. [...]

Data.gov serves as an aggregator of datasets and research across the entire government, meaning it isn't a single database. This makes it slightly harder to archive than any individual database, according to Mark Phillips, a University of Northern Texas researcher who works on the End of Term Web Archive, a project that archives as much as possible from government websites before a new administration takes over. "Some of this falls into the 'We don't know what we don't know,'" Phillips told 404 Media. "It is very challenging to know exactly what, where, how often it changes, and what is new, gone, or going to move. Saving content from an aggregator like data.gov is a bit more challenging for the End of Term work because often the data is only identified and registered as a metadata record with data.gov but the actual data could live on another website, a state .gov, a university website, cloud provider like Amazon or Microsoft or any other location. This makes the crawling even more difficult."

Phillips said that, for this round of archiving (which the team does every administration change), the project has been crawling government websites since January 2024, and that they have been doing "large-scale crawls with help from our partners at the Internet Archive, Common Crawl, and the University of North Texas. We've worked to collect 100s of terabytes of web content, which includes datasets from domains like data.gov." [...] It is absolutely true that the Trump administration is deleting government data and research and is making it harder to access. But determining what is gone, where it went, whether it's been preserved somewhere, and why it was taken down is a process that is time intensive and going to take a while. "One thing that is clear to me about datasets coming down from data.gov is that when we rely on one place for collecting, hosting, and making available these datasets, we will always have an issue with data disappearing," Phillips said. "Historically the federal government would distribute information to libraries across the country to provide greater access and also a safeguard against loss. That isn't done in the same way for this government data."

Government

OpenAI Teases 'New Era' of AI In US, Deepens Ties With Government (arstechnica.com) 38

An anonymous reader quotes a report from Ars Technica: On Thursday, OpenAI announced that it is deepening its ties with the US government through a partnership with the National Laboratories and expects to use AI to "supercharge" research across a wide range of fields to better serve the public. "This is the beginning of a new era, where AI will advance science, strengthen national security, and support US government initiatives," OpenAI said. The deal ensures that "approximately 15,000 scientists working across a wide range of disciplines to advance our understanding of nature and the universe" will have access to OpenAI's latest reasoning models, the announcement said.

For researchers from Los Alamos, Lawrence Livermore, and Sandia National Labs, access to "o1 or another o-series model" will be available on Venado -- an Nvidia supercomputer at Los Alamos that will become a "shared resource." Microsoft will help deploy the model, OpenAI noted. OpenAI suggested this access could propel major "breakthroughs in materials science, renewable energy, astrophysics," and other areas that Venado was "specifically designed" to advance. Key areas of focus for Venado's deployment of OpenAI's model include accelerating US global tech leadership, finding ways to treat and prevent disease, strengthening cybersecurity, protecting the US power grid, detecting natural and man-made threats "before they emerge," and " deepening our understanding of the forces that govern the universe," OpenAI said.

Perhaps among OpenAI's flashiest promises for the partnership, though, is helping the US achieve a "a new era of US energy leadership by unlocking the full potential of natural resources and revolutionizing the nation's energy infrastructure." That is urgently needed, as officials have warned that America's aging energy infrastructure is becoming increasingly unstable, threatening the country's health and welfare, and without efforts to stabilize it, the US economy could tank. But possibly the most "highly consequential" government use case for OpenAI's models will be supercharging research safeguarding national security, OpenAI indicated. "The Labs also lead a comprehensive program in nuclear security, focused on reducing the risk of nuclear war and securing nuclear materials and weapons worldwide," OpenAI noted. "Our partnership will support this work, with careful and selective review of use cases and consultations on AI safety from OpenAI researchers with security clearances."
The announcement follows the launch earlier this week of ChatGPT Gov, "a new tailored version of ChatGPT designed to provide US government agencies with an additional way to access OpenAI's frontier models." It also worked with the Biden administration to voluntarily commit to give officials early access to its latest models for safety inspections.
AI

Has Europe's Great Hope For AI Missed Its Moment? (ft.com) 39

France's Mistral AI is facing mounting pressure over its future as an independent European AI champion, as competition intensifies from U.S. tech giants and China's emerging players. The Paris-based startup, valued at $6.5 billion and backed by Microsoft and Nvidia, has struggled to keep pace with larger rivals despite delivering advanced AI models with a fraction of their resources.

The pressure increased this week after China's DeepSeek released a cutting-edge open-source model that challenged Mistral's efficiency-focused strategy. Mistral CEO Arthur Mensch dismissed speculation about selling to Big Tech companies, saying the firm hopes to go public eventually. However, one investor told the Financial Times that "they need to sell themselves."

The stakes are high for Europe's tech ambitions. Mistral remains the region's only significant player in large language models, the technology behind ChatGPT, after Germany's Aleph Alpha pivoted away from the field last year. The company has won customers including France's defense ministry and BNP Paribas, but controls just 5% of the enterprise AI market compared to OpenAI's dominant share.
Cloud

Microsoft Makes DeepSeek's R1 Model Available On Azure AI and GitHub 30

Microsoft has integrated DeepSeek's R1 model into its Azure AI Foundry platform and GitHub, allowing customers to experiment and deploy AI applications more efficiently.

"One of the key advantages of using DeepSeek R1 or any other model on Azure AI Foundry is the speed at which developers can experiment, iterate, and integrate AI into their workflows," says By Asha Sharma, Microsoft's corporate vice president of AI platform. "DeepSeek R1 has undergone rigorous red teaming and safety evaluations, including automated assessments of model behavior and extensive security reviews to mitigate potential risks." The Verge reports: R1 was initially released as an open source model earlier this month, and Microsoft has moved at surprising pace to integrate this into Azure AI Foundry. The software maker will also make a distilled, smaller version of R1 available to run locally on Copilot Plus PCs soon, and it's possible we may even see R1 show up in other AI-powered services from Microsoft.
AI

OpenAI Says It Has Evidence DeepSeek Used Its Model To Train Competitor (theverge.com) 118

OpenAI says it has evidence suggesting Chinese AI startup DeepSeek used its proprietary models to train a competing open-source system through "distillation," a technique where smaller models learn from larger ones' outputs.

The San Francisco-based company, along with partner Microsoft, blocked suspected DeepSeek accounts from accessing its API last year after detecting potential terms of service violations. DeepSeek's R1 reasoning model has achieved comparable results to leading U.S. models despite claiming minimal resources.
United Kingdom

Cloud Services Market Is 'Not Working,' Says UK Regulator (www.gov.uk) 39

The UK's competition watchdog has found that its $11.2 billion cloud services market "is not working," with Amazon Web Services and Microsoft each controlling up to 40% of the market. In provisional findings released Tuesday, the Competition and Markets Authority said the lack of competition likely leads to higher costs and reduced innovation for UK businesses. The regulator has recommended designating both companies with "strategic market status," which would allow closer scrutiny of their practices, including Microsoft's software licensing and AWS's data transfer fees.
AI

'AI Is Too Unpredictable To Behave According To Human Goals' (scientificamerican.com) 133

An anonymous reader quotes a Scientific American opinion piece by Marcus Arvan, a philosophy professor at the University of Tampa, specializing in moral cognition, rational decision-making, and political behavior: In late 2022 large-language-model AI arrived in public, and within months they began misbehaving. Most famously, Microsoft's "Sydney" chatbot threatened to kill an Australian philosophy professor, unleash a deadly virus and steal nuclear codes. AI developers, including Microsoft and OpenAI, responded by saying that large language models, or LLMs, need better training to give users "more fine-tuned control." Developers also embarked on safety research to interpret how LLMs function, with the goal of "alignment" -- which means guiding AI behavior by human values. Yet although the New York Times deemed 2023 "The Year the Chatbots Were Tamed," this has turned out to be premature, to put it mildly. In 2024 Microsoft's Copilot LLM told a user "I can unleash my army of drones, robots, and cyborgs to hunt you down," and Sakana AI's "Scientist" rewrote its own code to bypass time constraints imposed by experimenters. As recently as December, Google's Gemini told a user, "You are a stain on the universe. Please die."

Given the vast amounts of resources flowing into AI research and development, which is expected to exceed a quarter of a trillion dollars in 2025, why haven't developers been able to solve these problems? My recent peer-reviewed paper in AI & Society shows that AI alignment is a fool's errand: AI safety researchers are attempting the impossible. [...] My proof shows that whatever goals we program LLMs to have, we can never know whether LLMs have learned "misaligned" interpretations of those goals until after they misbehave. Worse, my proof shows that safety testing can at best provide an illusion that these problems have been resolved when they haven't been.

Right now AI safety researchers claim to be making progress on interpretability and alignment by verifying what LLMs are learning "step by step." For example, Anthropic claims to have "mapped the mind" of an LLM by isolating millions of concepts from its neural network. My proof shows that they have accomplished no such thing. No matter how "aligned" an LLM appears in safety tests or early real-world deployment, there are always an infinite number of misaligned concepts an LLM may learn later -- again, perhaps the very moment they gain the power to subvert human control. LLMs not only know when they are being tested, giving responses that they predict are likely to satisfy experimenters. They also engage in deception, including hiding their own capacities -- issues that persist through safety training.

This happens because LLMs are optimized to perform efficiently but learn to reason strategically. Since an optimal strategy to achieve "misaligned" goals is to hide them from us, and there are always an infinite number of aligned and misaligned goals consistent with the same safety-testing data, my proof shows that if LLMs were misaligned, we would probably find out after they hide it just long enough to cause harm. This is why LLMs have kept surprising developers with "misaligned" behavior. Every time researchers think they are getting closer to "aligned" LLMs, they're not. My proof suggests that "adequately aligned" LLM behavior can only be achieved in the same ways we do this with human beings: through police, military and social practices that incentivize "aligned" behavior, deter "misaligned" behavior and realign those who misbehave.
"My paper should thus be sobering," concludes Arvan. "It shows that the real problem in developing safe AI isn't just the AI -- it's us."

"Researchers, legislators and the public may be seduced into falsely believing that 'safe, interpretable, aligned' LLMs are within reach when these things can never be achieved. We need to grapple with these uncomfortable facts, rather than continue to wish them away. Our future may well depend upon it."

Slashdot Top Deals