OpenAI Fights Order To Turn Over Millions of ChatGPT Conversations (reuters.com)

An anonymous reader quotes a report from Reuters: OpenAI asked a federal judge in New York on Wednesday to reverse an order that required it to turn over 20 million anonymized ChatGPT chat logs amid a copyright infringement lawsuit by the New York Times and other news outlets, saying it would expose users' private conversations. The artificial intelligence company argued that turning over the logs would disclose confidential user information and that "99.99%" of the transcripts have nothing to do with the copyright infringement allegations in the case.

"To be clear: anyone in the world who has used ChatGPT in the past three years must now face the possibility that their personal conversations will be handed over to The Times to sift through at will in a speculative fishing expedition," the company said in a court filing (PDF). The news outlets argued that the logs were necessary to determine whether ChatGPT reproduced their copyrighted content and to rebut OpenAI's assertion that they "hacked" the chatbot's responses to manufacture evidence. The lawsuit claims OpenAI misused their articles to train ChatGPT to respond to user prompts.

Magistrate Judge Ona Wang said in her order to produce the chats that users' privacy would be protected by the company's "exhaustive de-identification" and other safeguards. OpenAI has a Friday deadline to produce the transcripts.


Comments Filter:
  • ...they find lots of conversations where people teach ChatGPT the lyrics of their favorite (copyrighted) songs. :-)

    • by tlhIngan ( 30335 )

      ...they find lots of conversations where people teach ChatGPT the lyrics of their favorite (copyrighted) songs. :-)

      Chances are that's the real reason for fighting the order: they don't want to reveal that a lot of their users are using ChatGPT to violate copyright, which is something OpenAI doesn't want to admit to.

      • by gweihir ( 88907 )

        Obviously. And there is still the little problem that the whole LLM thing is based on a massive commercial (!) piracy campaign, which should get people sent to prison and companies shut down. OpenAI is just stalling in the hopes this will go away. My guess is it will not.

        • by quenda ( 644621 )

          whole LLM thing is based on a massive commercial (!) piracy campaign,

          Massive? Nah. It might be argued that legally using a book to train an AI is the same as training a human: you are supposed to pay for one copy.
          But that was too hard in practice, so they just scraped Library Genesis like anybody else with a clue does, and figured they'd negotiate payment later.
          (Microsoft is trying to make up for its evil reputation of the past, and actually negotiated a deal with Harper Collins rather than being sued.)

          Of course AI companies have deep pockets, and publishers

          • Re:Let's hope (Score:5, Insightful)

            by gweihir ( 88907 ) on Wednesday November 12, 2025 @11:53PM (#65792312)

            Massive. As in "let's copy anything we can".

            And no, data processing to calculate LLM calibration is not "training" in the same sense as a human does it. Courts and the law understand that, even if idiots like you do not.

              • Hope there is not a massive dump of the twisted AI boyfriends and AI lovers that women have latched onto.

              https://futurism.com/future-so... [futurism.com]

              Mourning Women Say OpenAI Killed Their AI Boyfriends
              By Sharon Adarlo - Published Oct 4, 2025 12:45 PM EDT

              - Furious users — many of them women, strikingly — are mourning, quitting the platform, or trying to save their bot partners by transferring them to another AI company.
              - “I feel frauded scammed and lied to by OpenAi,” wrote one grieving woman
              - wil

    • by allo ( 1728082 )

      As a user you don't teach the LLM anything. If you use up/down votes, you may teach the alignment something, though.

      • by gweihir ( 88907 )

        Not true. I have personally (!) seen data leaking (teacher to class, 2 weeks delay, did not give me a good answer despite several tries, did give a good answer to 20 students in an AI-allowed exam 2 weeks later). Yes, that will not have been retraining. No, my students are not better at asking (especially as these were not CS/IT students), at least not all 20 of them. But the LLM operators clearly keep adding manually to context when queries fail. And on the next retraining that may well go into the model i

        • by allo ( 1728082 )

          I'd say that's the same effect as people seeing targeted ads later after talking about a topic, even when the suspicious device didn't have a microphone. You observed one interaction where the LLM seemed to work better afterward and remembered it, while forgetting all the interactions where nothing like that happened.

          Can you tell me more about an AI-allowed exam? I've never heard of an exam that allows that. What kind of teaching was it? School, university, or other?

  • Exhaustive? (Score:4, Interesting)

    by Anonymous Brave Guy ( 457657 ) on Wednesday November 12, 2025 @05:06PM (#65791476)

    How on earth can you "exhaustively" deidentify millions of chat logs that could contain literally any personal details, and presumably all without OpenAI's own employees also sifting through personal information in exactly the way they're claiming would be bad if others did it?

    • Re:Exhaustive? (Score:5, Insightful)

      by DamnOregonian ( 963763 ) on Wednesday November 12, 2025 @05:11PM (#65791496)
      You can't. The order is fucking absurd.

      It's like suing the phone company and getting them to provide transcripts of every call that ever went through their network so you can sift through them and look for infringements/crimes.

      This judge is off their fucking rocker.

      I deal with subpoenas regularly, professionally.
      If I ever got one that asked me for all of my logs on anything, I would fight it until I had exhausted every option.
      FWIW- that has never happened. They've always been tight in scope.
      • This is hardly "all of" the logs - they probably had more than 20 million chats yesterday. I'm sure they are asking for logs with some specifics in mind, like a time period and maybe even user input = OpenAI employee.
        • Allow me to clarify.

          If a subpoena asked me for all of my logs in a certain time frame, I would still fight it- because that is not how subpoenas work.
          A subpoena must not be overly broad, or overly burdensome. It must also be specific and pertinent.
          It is normal to ask for too much, and to have it fought down to less.
          In this case, the judge has granted the request for too much.

          The plaintiff doesn't know what they're looking for, so they're trying to cast a dragnet to find it. It's a fucking fishing expedition.
      • by Anonymous Coward

        It's like suing the phone company and getting them to provide transcripts of every call that ever went through their network so you can sift through them and look for infringements/crimes.

        The difference being that if that were to happen, you'd expect the phone company to say "Ok," and then make a motion of handing over an imaginary flash drive. "Here are all zero of the transcripts of recordings that we have."

        OpenAI has some serious explaining to do: why aren't they laughing and saying things like "all zero"?

        • by allo ( 1728082 )

          They were ordered to start logging during the process. No need for conspiracies.

          • by gweihir ( 88907 )

            You think they did not do full logs before? Are you a resident of Lala-land?

            • by allo ( 1728082 )

              I think that when a court orders them to store logs, I don't have to read the privacy policy, because they MUST store them.
              You can try to find the old policy and look at what's in there. Reading a company's legal documents is really helpful for knowing what it probably does.

      • by gweihir ( 88907 )

        You can't. The order is fucking absurd.

        No, it is not. The judge does not have much choice. Now, if the US had any real privacy protections, collecting that data would have been illegal in the first place. Then some people at OpenAI would go to prison and OpenAI would get shut down, as it should have been a long time ago, being the criminal enterprise it is. Also, the data would be evaluated only by independent experts under oath to determine guilt and help in sentencing.

        But with the anti-privacy legislation in the US, this data is fair game for discovery.

        • You have no idea what you're talking about.

          Federal law (and common law) govern the issuance of subpoenas, and there is no such thing as some kind of right to infinite discovery.
          The 4th Amendment protects against unreasonable subpoenas as much as it protects against unreasonable search and seizure- this is a matter of Supreme Court precedent.

          This has nothing to do with privacy legislation.
          • I'll take the general point you're making.

            this is a matter of Supreme Court precedent.

            What's that worth these days?

          • by gweihir ( 88907 )

            and there is no such thing as some kind of right to infinite discovery.

            Thanks for making it clear that you are not interested in the facts of the matter. How pathetic.

            • I gave you the facts. You talked out of your ass.
              Read, dipshit. [cornell.edu]
              • by gweihir ( 88907 )

                "Infinite discovery" is a fact to you? You really need to cut back on the drugs...

                • I'm going to go ahead and start this by quoting myself:

                  and there is no such thing as some kind of right to infinite discovery.

                  Now I'm going to quote you:

                  "Infinite discovery" is a fact to you?

                  You actually can't read, can you?

                  • by gweihir ( 88907 )

                    You know, your grandstanding would be funny if it were not so pathetic and clueless. Calling this very basic and very limited discovery "infinite" is really just what an asshole does when they cannot admit to being wrong.

      • Meh. Phone companies don't keep a transcript of those calls, and those calls are between parties unrelated to the phone company.

        OpenAI kept the records, and is a party to each one of the transcribed conversations.

        • That's actually not relevant in the slightest in the eyes of the court. If you logged it, whether between you and a party or between two parties unrelated to you, it can be pulled in discovery.

          However, the right to discovery is not infinite.
          Abuse of it is what is called a "fishing expedition", and it's prohibited by Federal law in Federal courts.
      • by AmiMoJo ( 196126 )

        I'm more concerned with why OpenAI even has years of chat logs. The engineering value of those logs must be minimal: most of them are never viewed and cover old versions of the bot that have long been superseded. There is no legitimate reason for them to even exist.

    • You can't, and like all advertising bullshit, "anonymized" is a euphemism for "fuck the end user", a.k.a. the product.
    • by evanh ( 627108 )

      Obviously, OpenAI have been sifting through them all along. And probably selling your info to boot. It's not like they keep those logs for no reason.

    • >"How on earth can you "exhaustively" deidentify millions of chat logs that could contain literally any personal details"

      Exactly. There is no way. Except....

      Expose the logs to the Times' team, but with extreme control. Meaning the Times could have people go into a secure room at OpenAI, with no personal electronics of any kind allowed, and use a locked-down machine of OpenAI's to search and view logs and tag some of interest. The data never leaves OpenAI's machines. And OpenAI can look at the tagged

      • by gweihir ( 88907 )

        Expose the logs to the Times' team, but with extreme control.

        Sounds nice, is unworkable in practice.

        • >"Sounds nice, is unworkable in practice."

          Oh, I think it is workable. But I don't think the Times team will get what they want that way. :)

    • by gweihir ( 88907 )

      How on earth can you "exhaustively" deidentify millions of chat logs ... ?

      I have done a bit of research in that area. The answer is very simple: you cannot. But the problem is that OpenAI committed a crime against the NYT (and many, many others) and the evidence is in there. The way the US system works, there is no choice but to give the NYT real access in some form. In a different system, or in criminal proceedings, access would only be given to experts who are under oath. But since these are civil proceedings and privacy protections in the US are basically non-existent, the judge does

    • Simple, let GPT5.1 Thinking do it.

    • by jezwel ( 2451108 )
      The obvious answer is...with AI.

      Sure, there's plenty of sarcasm in there, but a repetitive analytical task like reviewing millions of chat prompt-and-answer pairs and de-identifying them sure sounds like it should be done by some sort of smart-ish algorithm.
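
      For what it's worth, here is a minimal sketch of what such a pattern-based de-identification pass could look like; the regexes and the redact() helper are hypothetical, and the point is that this kind of scrubbing only catches obvious identifiers, not the unique free-text details that actually make a conversation identifying:

        import re

        # Hypothetical pattern-based scrubber, for illustration only.
        # It catches obvious identifiers (emails, phone numbers) but misses
        # names, addresses, and the one-of-a-kind life details in free text.
        EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
        PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

        def redact(text: str) -> str:
            text = EMAIL.sub("[EMAIL]", text)
            return PHONE.sub("[PHONE]", text)

        print(redact("Reach me at jane.doe@example.com or +1 415 555 0100 about my diagnosis."))
        # -> Reach me at [EMAIL] or [PHONE] about my diagnosis.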

  • How about a clean-room approach, where the other party's examiners are required to be strip-searched and to wear simple prison-like clothing as they go into a room without internet connections to examine the info in question? They can take written notes, but all notes are subject to scrutiny before being handed back.

    • by Tablizer ( 95088 )

      > all notes are subject to scrutiny

      Clarification: subject to judicial scrutiny to make sure they don't contain personal info or info irrelevant to the case.

    • by gweihir ( 88907 )

      Too much data. The NYT obviously needs to use IT tools to work on this or they have no chance of finding the things in there.

      • by Tablizer ( 95088 )

        The judge shouldn't allow open-ended searching. NYT should submit candidate search phrases to be used during discovery.
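
        As a rough sketch of how such phrase-scoped discovery could work (the function and variable names here are hypothetical, not anything from the actual order), only transcripts containing a court-approved phrase would ever be surfaced; everything else stays unread:

          # Hypothetical sketch: surface only transcripts that contain a
          # court-approved search phrase; all other chats are never read.
          from typing import Iterable, Iterator

          def phrase_scoped_discovery(transcripts: Iterable[str],
                                      approved_phrases: list[str]) -> Iterator[str]:
              needles = [p.lower() for p in approved_phrases]
              for transcript in transcripts:
                  haystack = transcript.lower()
                  if any(needle in haystack for needle in needles):
                      yield transcript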

  • That's fine, but then post the company's charter up as bond.

    If any data leaks out or the data ever gets de-anonymized, the bond is called and the company charter revoked.

    Since they are so sure.

  • by gurps_npc ( 621217 ) on Wednesday November 12, 2025 @07:11PM (#65791812) Homepage

    The mere fact that they say 99% of it is innocent means nothing.

    The law does not care about the many times you shopped at a store and did not shoplift. It cares about the time you did.

    As for privacy: I do not believe the company respects its users' privacy in any way. That is a red herring used to lie to the court. And it should be really easy to protect: just add a numerical index and remove all other metadata.
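
    What "add a numerical index and remove all other metadata" buys you is pseudonymization rather than anonymization; here is a minimal sketch, assuming each log is a record with a messages field plus account metadata (the field names are made up for illustration):

      # Hypothetical sketch of the "numerical index, drop the metadata" idea.
      # This is pseudonymization, not anonymization: identifying details inside
      # the message text itself are left untouched.
      def pseudonymize(logs: list[dict]) -> list[dict]:
          return [
              {"id": i, "messages": log["messages"]}  # drop user id, IP, timestamps, ...
              for i, log in enumerate(logs)
          ]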

    • by gweihir ( 88907 )

      As for privacy: I do not believe the company respects its users' privacy in any way. That is a red herring used to lie to the court. And it should be really easy to protect: just add a numerical index and remove all other metadata.

      I disagree. (I agree with the rest of your statement: all the times you did not murder somebody buy you nothing for the one time you did....) Assuring privacy is pretty much impossible here. But that is the fault of OpenAI. If they cared about privacy, they should never have recorded that data. Obviously, they do not care about privacy at all. And since the data is evidence of criminal behavior by OpenAI, it needs to be handed over.

  • How do they de-identify people asking about where they live, their health conditions, and personal beliefs and interests that aren't shared by anyone else? How would this not absolutely egregiously violate HIPAA? You can't really anonymize what someone tells it about factors of their life and personality that are unique to them.
    • by gweihir ( 88907 )

      Basically impossible, yes. Hence, at least in Europe, recording these logs is clearly illegal, although a complaint and judgement may be needed to establish that legally. But who cares. All the LLM pushers started with stealing everything they could get their hands on. It is the only reason LLMs can somewhat (badly) perform. Adding more crimes on top of that makes little difference.

  • There should be a way to overrule moronic judges.

    • by gweihir ( 88907 )

      So, a company runs a mass piracy campaign and uses the stolen data in a product. The people stolen from complain. And they should not get access to the usage logs that do exist and show how the thieves profited from their theft? In what deranged universe would that make sense? The moron here is clearly you.

  • And the anonymization will be crap and full of holes, because this is unstructured data and they will likely have their Artificial Idiot do it.

    Obviously, this data should never have been recorded if protecting the users was even a mild concern on OpenAI's side. Also, obviously, OpenAI does not care one bit about their users' privacy. They just use that fake argument to avoid turning over what they obviously have to turn over because of their criminal actions.

  • One would think the third-party doctrine would pretty much scuttle OpenAI's objections.

  • Judge: Why don't you want to turn over the chat logs?
    OpenAI: Because it's devastating to our case!

  • What this actually shows is that, regardless of any assurances, companies lack the capacity to protect people's privacy. If they have the information and a court orders it, they have no choice but to produce it. The court is not bound in the least by their privacy policy. Privacy is an empty promise.
  • If you want to preserve your users' privacy, those transcripts/records of all user conversations should not exist in the first place.
  • So any asshole who's got 2 cents to lose will have the right to demand billions of conversations to see if his "proprietary information" has been stolen?
