
Too Much Trust in AI Poses Unexpected Threats to the Scientific Process

It’s vital to “keep humans in the loop” to avoid humanizing machine-learning models in research

Sliced, glitchy illustration of a scientist and a robot, evoking artificial intelligence and the singularity.

Moor Studio/Getty Images

Machine-learning models are quickly becoming common tools in scientific research. These artificial intelligence systems are helping bioengineers discover new potential antibiotics, veterinarians interpret animals’ facial expressions, papyrologists read words on ancient scrolls, mathematicians solve baffling problems and climatologists predict sea-ice movements. Some scientists are even probing large language models’ potential as proxies or replacements for human participants in psychology and behavioral research. In one recent example, computer scientists ran ChatGPT through the conditions of the Milgram shock experiment—the famous study on obedience in which people gave what they believed were increasingly painful electric shocks to an unseen person when told to do so by an authority figure—and other well-known psychology studies. The artificial intelligence model responded much as humans did: 75 percent of simulated participants administered shocks of 300 volts and above.

But relying on these machine-learning algorithms also carries risks. Some of those risks are commonly acknowledged, such as generative AI’s tendency to spit out occasional “hallucinations” (factual inaccuracies or nonsense). Artificial intelligence tools can also replicate and even amplify human biases about characteristics such as race and gender. And the AI boom, which has given rise to complex, trillion-parameter models, requires water- and energy-hungry data centers that likely have high environmental costs.

One big risk is less obvious, though potentially very consequential: humans tend to automatically attribute a great deal of authority and trust to machines. This misplaced faith could cause serious problems when AI systems are used for research, according to a paper published in early March in Nature.


“These tools are being anthropomorphized and framed as humanlike and superhuman. We risk inappropriately extending trust to the information produced by AI,” says the new paper’s co-author Molly Crockett, a cognitive psychologist and neuroscientist at Princeton University. AI models are human-made products, and they “represent the views and positions of the people who developed them,” says Lisa Messeri, a Yale University sociocultural anthropologist who worked with Crockett on the paper. Scientific American spoke with both researchers to learn more about the ways scientists use AI—and the potential effects of trusting this technology too much.

[An edited transcript of the interview follows.]

Why did you write this paper?

LISA MESSERI: [Crockett] and I started seeing and sharing all sorts of large, lofty promises of what AI could offer the scientific pipeline and scientific community. When we really started to think we needed to write something was when we saw claims that large language models could become substitutions for human subjects in research. These claims, given our years of conversation, seemed wrong-footed.

MOLLY CROCKETT: I have been using machine learning in my own research for several years, [and] advances in AI are enabling scientists to ask questions we couldn’t ask before. But, as I’ve been doing this research and observing that excitement among colleagues, I have developed a sense of uneasiness that’s been difficult to shake.

Beyond using large language models to replace human participants, how are scientists thinking about deploying AI?

CROCKETT: Previously we helped write a response to a study in [Proceedings of the National Academy of Sciences USA] that claimed machine learning could be used to predict whether research would [be replicable] just from the words in a paper.... That struck us as technically implausible. But more broadly, we’ve discovered that scientists are talking about using AI tools to make their work more objective and to be more productive.

We found that both of those goals are quite risky and open up scientists to producing more while understanding less. The worry is that we’re going to think that these tools are helping us to understand the world better, when in reality they might actually be distorting our view.

MESSERI: We categorize the AI uses we observed in our review into four categories: the Surrogate, the Oracle, the Quant and the Arbiter. The Surrogate is what we’ve already discussed—it replaces human subjects. The Oracle is an AI tool that is asked to synthesize the existing corpus of research and produce something, such as a review or new hypotheses. The Quant is AI that is used by scientists to process the immense amount of data out there—maybe produced by those machine surrogates. AI Arbiters are like [the tools described] in the [PNAS] replication study [Crockett] mentioned, tools for evaluating and adjudicating research. We call these visions for AI because they’re not necessarily being executed today in a successful or clean way, but they’re all being explored and proposed.

For each of these uses, you’ve pointed out that even if AI’s hallucinations and other technical problems are solved, risks remain. What are those risks?

CROCKETT: The overarching metaphor we use is this idea of monoculture, which comes from agriculture. Monocultures are very efficient. They improve productivity. But they’re vulnerable to being invaded by pests or disease; you’re more likely to lose the whole crop when you have a monoculture versus a diversity of what you’re growing. Scientific monocultures, too, are vulnerable to risks such as errors propagating throughout the whole system. This is especially the case with the foundation models in AI research, where one infrastructure is being used and applied across many domains. If there’s some error in that system, it can have widespread effects.

We identify two kinds of scientific monocultures that can arise with widespread AI adoption. The first is the monoculture of knowing. AI tools are only suited to answer certain kinds of questions. Because these tools boost productivity, the overall set of research questions being explored could become tailored to what AI is good at.

Then there’s the monoculture of the knower, where AI tools come to replace human thinkers. And because AI tools have a specific standpoint, this eliminates the diversity of different human perspectives from research production. When you have many different kinds of minds working on a scientific problem, you’re more likely to spot false assumptions or missed opportunities.

Both monocultures could lead to cognitive illusions.

What do you mean by illusions?

MESSERI: One example that’s already out there in psychology is the illusion of explanatory depth. Basically, when someone in your community claims they know something, you tend to assume you know that thing as well.

In your paper you cite research demonstrating that using a search engine can trick someone into believing they know something—when really they only have online access to that knowledge. And students who use AI assistant tools to respond to test questions end up thinking they understand a topic better than they do.

MESSERI: Exactly. Building off that one illusion of explanatory depth, we also identify two others. First, the illusion of exploratory breadth, where someone thinks they’re examining more than they are: There are an infinite number of questions we could ask about science and about the world. We worry that with the expansion of AI, the questions that AI is well suited to answer will be mistaken for the entire field of questions one could ask. Then there’s the risk of an illusion of objectivity. Either there’s an assumption that AI represents all standpoints or there’s an assumption that AI has no standpoint at all. But at the end of the day, AI tools are created by humans coming from a particular perspective.

How can scientists avoid falling into these traps? How can we mitigate these risks?

MESSERI: There’s the institutional level where universities and publishers dictate research. These institutions are developing partnerships with AI companies. We have to be very circumspect about the motivations behind that.... One mitigation strategy is just to be incredibly forthright about where the funding for AI is coming from and who benefits from the work being done on it.

CROCKETT: At the institutional level, funders, journal editors and universities can be mindful of developing a diverse portfolio of research to ensure that they’re not putting all the resources into research that uses a single AI approach. In the future, it might be necessary to consciously protect resources for the kinds of research that can’t be addressed with AI tools.

And what sort of research is that?

CROCKETT: Well, as of right now, AI cannot think like a human. Any research about human thought and behavior, and also qualitative research, is not addressable with AI tools.

Would you say that in the worst-case scenario, AI poses an existential threat to human scientific knowledge production? Or is that an overstatement?

CROCKETT: I don’t think that it’s an overstatement. I think we are at a crossroads around how we decide what knowledge is and how we proceed in the endeavor of knowledge production.

Is there anything else you think is important for the public to really understand about what’s happening with AI and scientific research?

MESSERI: From the perspective of reading media coverage of AI, it seems as though this is some preordained, inevitable “evolution” of scientific and technical development. But as an anthropologist of science and technology, I would really like to emphasize that science and tech don’t proceed in an inevitable direction. It is always human-driven. These narratives of inevitability are themselves a product of human imagination and come from mistaking the desires of some for a prophecy for all. Everyone, even nonscientists, can be part of questioning this narrative of inevitability by imagining the different futures that might come true instead.

CROCKETT: Being skeptical about AI in science doesn’t require being a hater of AI in science and technology. We love science. I’m excited about AI and its potential for science. But just because an AI tool is being used in science does not mean that it is automatically better science.

As scientists, we are trained to deny our humanness. We’re trained that human experience, bias and opinion have no place in the scientific method. The future of autonomous, AI “self-driving” labs is the pinnacle of realizing that sort of training. But increasingly we are seeing evidence that diversity of thought, experience and training among the humans who do the science is vital for producing robust, innovative and creative knowledge. We don’t want to lose that. To keep the vitality of scientific knowledge production, we need to keep humans in the loop.