OpenAI’s ChatGPT Search is struggling to accurately cite news publishers, according to a study by Columbia University’s Tow Center for Digital Journalism.
The report found frequent misquotes and incorrect attributions, raising concerns among publishers about brand visibility and control over their content.
Additionally, the findings call into question OpenAI’s stated commitment to responsible AI development in journalism.
Background On ChatGPT Search
OpenAI launched ChatGPT Search last month, claiming it collaborated extensively with the news industry and incorporated publisher feedback.
This contrasts with the original 2022 rollout of ChatGPT, where publishers discovered their content had been used to train the AI models without notice or consent.
Now, OpenAI allows publishers to specify via the robots.txt file whether they want to be included in ChatGPT Search results.
However, the Tow Center’s findings suggest publishers face the risk of misattribution and misrepresentation regardless of their participation choice.
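The opt-out mechanism works through standard robots.txt directives. Below is a minimal sketch of what a publisher's configuration might look like; per OpenAI's crawler documentation, `OAI-SearchBot` governs inclusion in ChatGPT Search, while `GPTBot` governs use of content for model training. The exact paths shown are illustrative.

```
# robots.txt sketch (illustrative): opt in to ChatGPT Search
# while opting out of model training.

# OAI-SearchBot controls appearance in ChatGPT Search results.
User-agent: OAI-SearchBot
Allow: /

# GPTBot controls whether content may be used for training.
User-agent: GPTBot
Disallow: /
```

As the Tow Center's findings suggest, however, these directives only control crawling; they don't prevent a publisher's content from surfacing through syndicated or copied versions hosted elsewhere.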
Accuracy Issues
The Tow Center evaluated ChatGPT Search’s ability to identify sources of quotes from 20 publications.
Key findings include:
- Of 200 queries, 153 responses were incorrect.
- The AI rarely acknowledged its mistakes.
- Phrases like “possibly” were used in only seven responses.
ChatGPT often prioritized pleasing users over accuracy, which could mislead readers and harm publisher reputations.
Additionally, researchers found ChatGPT Search is inconsistent when asked the same question multiple times, likely due to the randomness baked into its language model.
Citing Copied & Syndicated Content
Researchers found that ChatGPT Search sometimes cites copied or syndicated articles instead of original sources, likely due to publisher crawling restrictions or system limitations.
For example, when asked for a quote from a New York Times article (currently involved in a lawsuit against OpenAI and blocking its crawlers), ChatGPT linked to an unauthorized version on another site.
Even with MIT Technology Review, which allows OpenAI’s crawlers, the chatbot cited a syndicated copy rather than the original.
The Tow Center found that all publishers risk misrepresentation by ChatGPT Search:
- Enabling crawlers doesn’t guarantee visibility.
- Blocking crawlers doesn’t prevent content from showing up.
These issues raise concerns about OpenAI’s content filtering and its approach to journalism, which may push people away from original publishers.
OpenAI’s Response
OpenAI responded to the Tow Center’s findings by stating that it supports publishers through clear attribution and helps users discover content with summaries, quotes, and links.
An OpenAI spokesperson stated:
“We support publishers and creators by helping 250M weekly ChatGPT users discover quality content through summaries, quotes, clear links, and attribution. We’ve collaborated with partners to improve in-line citation accuracy and respect publisher preferences, including enabling how they appear in search by managing OAI-SearchBot in their robots.txt. We’ll keep enhancing search results.”
While the company has worked to improve citation accuracy, OpenAI says specific misattribution issues are difficult to address, though it remains committed to improving its search product.
Looking Ahead
If OpenAI wants to collaborate with the news industry, it should ensure publisher content is represented accurately in ChatGPT Search.
Publishers currently have limited power and are closely watching legal cases against OpenAI. Outcomes could impact content usage rights and give publishers more control.
As generative search products like ChatGPT change how people engage with news, OpenAI must demonstrate a commitment to responsible journalism to earn user trust.
Featured Image: Robert Way/Shutterstock