Judge Orders OpenAI to Hand Over 20 Million ChatGPT Logs in Copyright Fight


A federal judge in New York has ordered OpenAI to produce roughly 20 million anonymized ChatGPT user logs to The New York Times, prominent authors, and other plaintiffs who claim the AI company unlawfully used copyrighted material to train its models.

In a short order issued Monday, U.S. District Judge Sidney H. Stein upheld an earlier ruling by U.S. Magistrate Judge Ona T. Wang, rejecting OpenAI’s objections and clearing the way for the large-scale data production.

The judge agreed with the magistrate’s conclusion that the logs may be relevant to the dispute whether or not they directly reference the plaintiffs’ works.


Privacy Arguments Rejected

OpenAI had challenged the discovery order by arguing that user privacy concerns were not properly weighed, citing a Second Circuit decision involving wiretapped phone calls. Judge Stein dismissed that comparison.

“Privacy interests in the wiretapped recordings of private phone conversations in [SEC v.] Rajaratnam are stronger than the privacy interests in users’ conversations with ChatGPT, which users voluntarily disclosed to OpenAI and which OpenAI retains in the normal course of its business,” the judge wrote.

He added that the wiretapping case “focused on the potential illegality of the wiretaps,” while no party in the current litigation claims OpenAI unlawfully possesses ChatGPT conversation data.

Court Declines Narrower Discovery Proposal

Judge Stein also rejected OpenAI’s suggestion that it should be allowed to run targeted keyword searches across the logs instead of turning over the full dataset.

“OpenAI identifies no caselaw requiring a court to order the least burdensome discovery possible or to explain specifically why it rejects a party’s discovery proposal,” the judge said.

The ruling leaves intact Judge Wang’s November order compelling production of the logs in their entirety.

Broader Copyright Battle

The discovery dispute is part of sweeping multidistrict litigation accusing OpenAI and its financial partner Microsoft of infringing copyrighted works by using protected material to train large language models without authorization. The lawsuits also allege violations of the Digital Millennium Copyright Act.

The cases involve a wide range of plaintiffs, including The New York Times and other news organizations, as well as the Authors Guild and well-known writers such as George Saunders, Scott Turow, Jonathan Franzen, and Jia Tolentino.

In April, the Judicial Panel on Multidistrict Litigation consolidated pretrial proceedings for the cases in the U.S. District Court for the Southern District of New York, where Judge Wang is overseeing discovery.

Counsel for the parties did not immediately respond to requests for comment.

The case is In re: OpenAI Inc. Copyright Infringement Litigation, case number 1:25-md-03143, in the U.S. District Court for the Southern District of New York.