Biotech AI

OpenAI Starts Offering a Biology-Tuned LLM (arstechnica.com)

An anonymous reader quotes a report from Ars Technica: On Thursday, OpenAI announced it had developed a large language model specifically trained on common biology workflows. Called GPT-Rosalind after Rosalind Franklin, the model appears to differ from most science-focused models from major tech companies, which have generally taken a more generic approach that works for various fields. In a press briefing, Yunyun Wang, OpenAI's Life Sciences Product Lead, said the system was designed to tackle two major roadblocks faced by current biology researchers. One is the massive datasets created by decades of genome sequencing and protein biochemistry, which can be too much for any one researcher to take in. The second is that biology has many highly specialized subfields, each with its own techniques and jargon. So, for example, a geneticist who finds themselves working on a gene that's active in brain cells might struggle to understand the immense neurobiological literature.

Wang said the company had taken an LLM and trained it on 50 of the most common biological workflows, as well as on how to access the major public databases of biological information. Further training has resulted in a system that can suggest likely biological pathways and prioritize potential drug targets. "We're connecting genotype to phenotype through known pathways and regulatory mechanisms, infer likely structural or functional properties of proteins, and really leveraging this mechanistic understanding," Wang said. To address LLMs' tendencies toward sycophancy and overenthusiasm, OpenAI says it has tuned the model to be more skeptical, so it's more likely to tell you when something is a bad drug target. There was a lot of talk about GPT-Rosalind's "reasoning" and "expert-level" abilities. We were told that the former was defined as being able to work through complex, multi-step processes, while the latter was derived from the model's performance on a handful of benchmarks.
Access to GPT-Rosalind is currently limited "due to concerns about the model's potential for harmful outputs if asked to do something like optimize a virus's infectivity," notes Ars. Only U.S.-based organizations can request access at the moment.


Comments Filter:
  • On the very good side, this will lower the cost and lead times for new drugs.

    On the bad side, nation-states, terrorists, and even just Evil Agents Of Chaos[TM] who have access to tools like this and the knowledge to (ab)use them will be able to unleash biological chaos on the world.

    Imagine if someone created a virus that infected everyone, spread rapidly, but was asymptomatic or had only common-cold-like-symptoms on everyone but their intended target, but it killed their target. The target could be an indi

    • by HiThere ( 15173 )

      We can't do that yet, and may never be able to be that specific. Trying to do it, however, could be exceedingly dangerous.

      N.B.: All bacteria and viruses have a very high mutation rate.

From the transcript, about 43 minutes into a public conversation with Eric Schmidt from Apr 10, 2025: https://www.youtube.com/watch?... [youtube.com]
        ====
        "Question: Thanks for the great conversation so far. Leonard Justin. I'm a PhD student at MIT. Um, I was wondering if you could just discuss a bit more some of the risks you see coming specifically with respect to biology and how we should go about mitigating those. What's the role of the AI developers? What's the role of government? Um,

        • by HiThere ( 15173 )

          Generating bad pathogens is quite plausible. Generating narrowly targeted ones that will stay narrowly targeted is currently implausible, and probably will remain so until well after the singularity. It would require designing genomes that were strongly error correcting. Elephants and naked mole rats do a reasonable job of that, but I don't think it's plausible for bacteria.

Specialized models of various kinds are nothing new; even forcing LLMs into a "specialized" mode using instructions and "external knowledge" is old news.

Nothing radical is happening here, and you can be sure there is already more than one CRISPR toolkit, working without safety restrictions, hooked to a model at some outfit owned by scam slopman, not to mention thiel and elona's.

      It is here whether you like it or not, so consider mitigation, not whining.

Genocides and assassinations are already a thing; no need to make things complicated with an unstable virus. Any state or big criminal organization can kill a family or carry out a genocide. States can already hire/train people to do research on biological weapons, no need to wait for a bio vibecoding tool. AI companies like to be overdramatic about their tools; it's part of marketing. GPT-2 was "too dangerous to be released" https://slate.com/technology/2... [slate.com] . Anthropic does the same shit with its Mythos model.
  • "Hey Bio-GPT, how can we stop psychopathic, society-ruining lunatics like Sam Altman from being born. Is there some sort of test or genetic correction?"
This is the same as curing cancer and virtually every disease, because if there was a way to get a "large" payload (say 10 kb of RNA, or ideally 30 kb, like a coronavirus) into every cell efficiently, you could cure cancer. With 10 kb it can be done with genius-level bioengineering skill. With 30 kb it's trivial.

Should point out that humanity's "best" tool today for doing this is the adenovirus capsid, but it has 3 major shortcomings: it only holds about 4 kb of code, it can't get into all cells efficiently (better than the alternatives, but not good enough), AND it can practically only be dosed once (dose it again after a week or two and the immune system destroys it).

  • i wouldn't pay for an OpenAI product, that's an ethical issue. And forget data sovereignty. I don't feel comfortable supporting people like Sam Altman and the CEO of Anthropic who I'm sure spends his mornings admiring children in school yards.

But that said, every university in Europe is producing the same thing, and let's be honest, training new models has become a lot easier these days.
