OpenAI Accused of Using Paywalled O’Reilly Books for AI Training

A recent paper from the AI Disclosures Project claims that OpenAI trained its GPT-4o model on paywalled content from O’Reilly Media without proper licensing. The study, co-authored by Tim O’Reilly, the company’s founder, used a method designed to detect copyrighted material in AI training data. Its findings suggest that GPT-4o recognizes non-public O’Reilly books more strongly than older models do. OpenAI has faced scrutiny over its training practices, and although it has licensing agreements with some content providers, the controversy continues. OpenAI did not respond to requests for comment.