The rapid evolution of the digital landscape is punctuated by remarkable innovations in artificial intelligence (AI). Yet these developments come with their own share of challenges.
One such issue, increasingly capturing the attention of industries ranging from academia to journalism, is the fabrication of information by AI, and specifically by OpenAI's ChatGPT. OpenAI recently closed a multiyear investment deal with Microsoft Corp. reported to be worth $10 billion.
From Personal Experience To Academic Confirmation
I have personally experienced this problem on multiple occasions.
Seeking to verify these concerns, I tasked ChatGPT with retrieving quotes from a particular person, meticulously feeding it precise information. To my surprise, it furnished plausible responses fashioned in the speaker's style, but they were not genuine quotes.
Even when armed with copious context and explicit directives not to fabricate quotes, ChatGPT still invented remarks. It packaged these counterfeit quotes as if the person in question had voiced them verbatim, attributing them to the person and enclosing them in quotation marks.
This practice represents a stark misrepresentation, as the source never uttered these words. Despite my prompts' stipulations, ChatGPT continued to invent quotes the interviewee never said, even after I supplied it with a Q&A transcript.
Even though I took care to keep the process faithful to the provided information, these fabrications underscored the need for rigorous fact-checking.
The Research On ChatGPT Inaccuracies

This growing concern was brought into sharp focus by the study "High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content," conducted by Mehul Bhattacharyya, Valerie M. Miller, Debjani Bhattacharyya and Larry E. Miller.
Through an analysis of 30 medical papers generated by ChatGPT-3.5, each containing at least three references, the researchers uncovered startling results: of the 115 references generated by the AI, 47% were completely fabricated, 46% were authentic but used inaccurately and only 7% were authentic and accurate.
Their findings reflect the larger concern that ChatGPT can fabricate not only citations but entire articles and bylines that never existed. This propensity, known as "hallucination," is now seen as a significant threat to the integrity of information across many fields.
Benzinga contacted OpenAI for commentary. While nobody at the company was available to comment, the team pointed Benzinga toward a new blog post and research paper addressing some of these issues.
The 'Hallucination' Issue: Is GPT Tripping?
The repercussions of this phenomenon are notably severe within the journalism industry. Instances of reporters finding their names falsely attributed to non-existent articles or sources are escalating.
One such victim of this trend was AI researcher and author Kate Crawford, whose name was inaccurately linked to criticisms of podcaster Lex Fridman. Moreover, ChatGPT's ability to concoct references for completely non-existent research studies raises the risk of propagating disinformation, ultimately undermining the credibility of legitimate news sources.
This has happened many other times. For instance:
- The Guardian reported that ChatGPT invented entire Guardian articles and bylines that the newspaper never actually published. This has been noted as a worrying side effect of democratizing a technology that can't reliably distinguish truth from fiction.
- Journalists at USA Today discovered that ChatGPT generated citations for non-existent research studies claiming that access to guns doesn't raise the risk of child mortality.
- Noah Smith shared a tweet showing how he asked ChatGPT to write an argument against wage boards in the style of economist Arin Dube. The AI completely fabricated the quote and also mischaracterized the stance of Dube, who is, in reality, a proponent of wage boards. It also cited a non-existent article as its source.
While AI's prowess is beyond doubt, recent events have cast a somber shadow by highlighting its potential for misuse. At the heart of the storm is once again OpenAI's ChatGPT, which has run amok in the field of law, presenting unforeseen dilemmas and legal conundrums.
This AI-induced chaos unfolded around the case of Mata v. Avianca, in which Mata's legal counsel employed ChatGPT to find prior court decisions pertinent to the case. What seemed like an innovative application of AI turned into a bizarre sequence of events, with ChatGPT fabricating non-existent cases.
In response to queries about the validity of these cases, ChatGPT went a step further. It concocted elaborate details of the made-up cases, which were then presented as screenshots in the court filings.
The lawyers even sought the AI's "confirmation" of the fabricated cases, presenting it as part of their legal argument. It was an incredible twist that, in the words of lawyer Kathryn Tewson, demands attention: "did you see the 'cases'? You have GOT to see the cases."
A hearing is slated for the coming month to discuss potential sanctions on the lawyers, sanctions that could threaten their livelihoods and reputations. This incident echoes another recent predicament in which an Indian judge consulted ChatGPT while grappling with the question of granting bail to a murder suspect.
Highlighting the severity of the issue, a Reddit user shared their concern about people's blind faith in AI's capabilities, even in sensitive areas like medicine or law.
The release of GPT-4, the latest model behind ChatGPT, has elicited mixed responses. On one hand, this AI model, capable of scoring in the 90th percentile on the U.S. bar exam and posting strong scores on SAT math tests, signifies a leap in AI development. On the other, it raises concerns about the risks of misuse and the potential for large-scale disinformation and cyberattacks.
Elon Musk and OpenAI CEO Sam Altman have been vocal about the need for AI regulation.
"Society, I think, has a limited amount of time to figure out how to react to that, how to regulate that, how to handle it," Altman has said.
Altman is particularly concerned about malicious uses of the technology. GPT-4 could be used for large-scale disinformation, he said, and because it can write computer code, it could also be used for offensive cyberattacks.
Altman has also addressed the "hallucination" issue.
"The thing that I try to caution people the most is what we call the 'hallucinations problem,'" Altman said in an interview with ABC News. "The model will confidently state things as if they were facts (but they) are entirely made up."
Musk, a co-founder of OpenAI who recently said he invested approximately $50 million in the company, has previously expressed concerns about AI. His perspective is notable because he is also the CEO of Tesla.
Tesla employs AI to enhance the performance of its autonomous vehicles, improving their ability to understand and respond to their surroundings. If Tesla's push into AI is subject to some of the same risks that affect other generative AI systems, including hallucinations, the prospect of autonomous, AI-assisted vehicles operating en masse with little to no human input becomes all the more interesting.
Reports from 2023 suggest a potential collaboration between Tesla and OpenAI that could further strengthen Tesla's AI operations. Yet this could also amplify the potential risks if the AI hallucination problem isn't effectively addressed. There is no evidence to suggest that Tesla vehicles as they are constructed now are a danger to society.
Other tech leaders, such as Sundar Pichai, Ginni Rometty, Marc Benioff and Satya Nadella, have also voiced concerns about AI's negative potential, adding to the growing clamor for regulation and oversight.
AI is undoubtedly part of the future, at least insofar as it promises to clear administrative hurdles and boost productivity. As we rocket into this new era, these incidents serve as a stark reminder of the challenges and responsibilities that come with the power of AI. Balancing the promise and perils of AI is a task that demands our utmost attention, vigilance and discernment.
Understanding Hallucinations And Potential Solutions
OpenAI is actively refining its models to mitigate hallucination issues and is exploring ways to let users customize ChatGPT's behavior while setting safeguards to prevent misuse. Strategies like training AI on narrower, vetted datasets could reduce instances of hallucination, albeit at the risk of limiting the AI's creativity and breadth of knowledge.
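To make the dataset-vetting idea concrete, here is a minimal sketch that filters a hypothetical training corpus down to an allow-list of vetted source types before any fine-tuning would take place. The record format and source labels are assumptions made for illustration only, not a description of any real training pipeline.

```python
# Illustrative sketch only: restrict a hypothetical training corpus to an
# allow-list of vetted source types before fine-tuning. The record format and
# source names are assumptions made for this example, not a real pipeline.
from typing import Dict, List

VETTED_SOURCES = {"peer_reviewed_journal", "official_statistics", "curated_encyclopedia"}

def filter_vetted(corpus: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only documents whose 'source_type' appears on the vetted allow-list."""
    return [doc for doc in corpus if doc.get("source_type") in VETTED_SOURCES]

corpus = [
    {"text": "A peer-reviewed finding ...", "source_type": "peer_reviewed_journal"},
    {"text": "An anonymous forum post ...", "source_type": "web_forum"},
]
print(len(filter_vetted(corpus)))  # 1 -- the forum post is excluded
```

The trade-off described above is visible even at this toy scale: the stricter the allow-list, the less material, and therefore less breadth, the resulting model has to learn from.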
Other potential solutions include refining the user interface so that sections of generated text are flagged with varying levels of confidence, and grounding responses in external databases. However, these remedies come with their own challenges, such as data privacy concerns and the technical feasibility of integrating large amounts of external data.
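As a rough illustration of the confidence-flagging idea, the sketch below assumes a hypothetical model output that pairs each generated token with a log probability and wraps low-probability runs in a visible marker for human review. The Token record, the flag_low_confidence helper and the -2.0 threshold are assumptions made for this example; they do not correspond to any vendor's actual API.

```python
# Minimal sketch of flagging low-confidence spans in generated text, assuming a
# hypothetical output format that exposes per-token log probabilities.
from dataclasses import dataclass
from typing import List

@dataclass
class Token:
    text: str
    logprob: float  # log probability the model assigned to this token

def flag_low_confidence(tokens: List[Token], threshold: float = -2.0) -> str:
    """Wrap consecutive low-probability tokens in a [LOW CONFIDENCE: ...] marker."""
    pieces, run = [], []
    for tok in tokens:
        if tok.logprob < threshold:
            run.append(tok.text)  # accumulate a low-confidence run
        else:
            if run:
                pieces.append("[LOW CONFIDENCE:" + "".join(run) + "]")
                run = []
            pieces.append(tok.text)
    if run:  # flush a trailing low-confidence run
        pieces.append("[LOW CONFIDENCE:" + "".join(run) + "]")
    return "".join(pieces)

# Toy example: the obviously fictitious journal name is given low logprobs.
sample = [
    Token("The study", -0.3),
    Token(" appeared in", -0.6),
    Token(" the Journal of", -2.4),
    Token(" Entirely Invented Results", -3.1),
    Token(".", -0.2),
]
print(flag_low_confidence(sample))
# -> The study appeared in[LOW CONFIDENCE: the Journal of Entirely Invented Results].
```

The point of the sketch is the design choice rather than the particular threshold: surfacing uncertainty to the reader keeps a human in the loop instead of letting fabricated details pass as confident prose.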
As we delve deeper into the era of AI, the challenge lies in leveraging its remarkable text-generating abilities while mitigating the risk of misinformation. Navigating this intricate balance calls for a steadfast dedication to truth and accuracy, along with a willingness to adapt and evolve amidst the fast-paced landscape of artificial intelligence.