This article was originally published in Knowable Magazine.
Several years ago, Christian Rutz started to wonder whether he was giving his crows enough credit. Rutz, a biologist at the University of St. Andrews, in Scotland, and his team were capturing wild New Caledonian crows and challenging them with puzzles made from natural materials before releasing them again. In one test, birds faced a log with drilled holes that contained hidden food; they could get the food out by bending a plant stem into a hook. If a bird didn’t try within 90 minutes, the researchers removed it from the data set.
But, Rutz says, he soon began to realize that he was not, in fact, studying the skills of New Caledonian crows. He was studying the skills of a subset of New Caledonian crows that quickly approached a weird log they’d never seen before—maybe because they were especially brave or reckless.
The team changed their protocol: They gave the more hesitant birds an extra day or two to get used to their surroundings, then tried the puzzle again. “It turns out that many of these retested birds suddenly start engaging,” Rutz says. “They just needed a little bit of extra time.”
More and more scientists are realizing that animals, like people, are individuals: They have distinct tendencies, habits, and life experiences that may affect how they perform in an experiment. That means, some researchers argue, that much published research on animal behavior may be biased. Studies claiming to show something about a species as a whole—the distance that green sea turtles migrate, for example, or how chaffinches respond to the song of a rival—may say more about individual animals that were captured or housed in a certain way, or that share certain genetic features. That’s a problem for researchers who seek to understand how animals sense their environments, gain new knowledge, and live their lives.
“The samples we draw are quite often severely biased,” Rutz says. “This is something that has been in the air in the community for quite a long time.”
In 2020, Rutz and his colleague Michael Webster, also at the University of St. Andrews, proposed a way to address this problem. They called it STRANGE.
Why “STRANGE”? In 2010, an article in Behavioral and Brain Sciences suggested that the people studied in much of published psychology literature are WEIRD—drawn from Western, educated, industrialized, rich, and democratic societies—and are “among the least representative populations one could find for generalizing about humans.” Researchers might draw sweeping conclusions about the human mind when, really, they’ve studied only the minds of, say, undergraduates at the University of Minnesota.
A decade later, Rutz and Webster, drawing inspiration from WEIRD, published a commentary in the journal Nature called “How STRANGE Are Your Study Animals?”
They proposed that their fellow behavior researchers consider several factors about their study animals: social background, trappability and self-selection, rearing history, acclimation and habituation, natural changes in responsiveness, genetic makeup, and experience.
“I first began thinking about these kinds of biases when we were using mesh minnow traps to collect fish for experiments,” Webster says. He suspected—and then confirmed in the lab—that more active sticklebacks were more likely to swim into these traps. “We now try to use nets instead,” Webster says, to catch a wider variety of fish.
That’s trappability. Other factors that might make an animal more trappable than its peers, besides its activity level, include a bold temperament, lack of experience, or simply being hungrier for bait.
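The sampling problem Webster describes can be illustrated with a toy simulation. This is a hypothetical sketch, not an analysis from his study: it assumes a population of fish whose "activity" scores are uniform between 0 and 1, and that a trap catches a fish with probability equal to its activity score, while a net catches fish at random.

```python
import random

random.seed(42)

# Hypothetical population of 10,000 fish with activity scores on [0, 1].
population = [random.random() for _ in range(10_000)]

# Trap-based sampling: assume capture probability equals activity,
# so bolder, more active fish are over-represented.
trapped = [a for a in population if random.random() < a]

# Net-based sampling: assume every fish is equally likely to be caught.
netted = random.sample(population, len(trapped))

def mean(xs):
    return sum(xs) / len(xs)

print(f"population mean activity: {mean(population):.2f}")  # ~0.50
print(f"trapped-sample mean:      {mean(trapped):.2f}")     # ~0.67, biased high
print(f"netted-sample mean:       {mean(netted):.2f}")      # ~0.50
```

Under these assumptions the trapped sample overstates the population's average activity by about a third, while the random-net sample tracks the true mean, which is the kind of distortion the STRANGE factors are meant to flag.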
Other research has shown that adult female pheasants housed in groups of five performed better on a learning task (figuring out which hole contained food) than those housed in groups of three—that’s social background. Jumping spiders raised in captivity were less interested than wild spiders in videos of prey (rearing history), and honeybees learned best in the morning (natural changes in responsiveness). And so on.
It might be impossible to remove every bias from a group of study animals, Rutz says. But he and Webster want to encourage other scientists to think through STRANGE factors with every experiment, and to be transparent about how those factors might have affected their results.
“We used to assume that we could do an experiment the way we do chemistry—by controlling a variable and not changing anything else,” says Holly Root-Gutteridge, a postdoctoral researcher at the University of Lincoln, in the United Kingdom, who studies dog behavior. But research has uncovered individual patterns of behavior—scientists sometimes call it “personality”—in all kinds of animals, including monkeys and hermit crabs.
“Just because we haven’t previously given animals the credit for their individuality or distinctiveness doesn’t mean that they don’t have it,” Root-Gutteridge says.
This failure of human imagination or empathy mars some classic experiments, Root-Gutteridge and co-authors noted in a 2022 paper focused on animal-welfare issues. For example, experiments by the psychologist Harry Harlow in the 1950s involved baby rhesus macaques and fake mothers made from cloth or wire. They allegedly gave insight into how human infants form attachments. But given that these monkeys were torn from their mothers and kept unnaturally isolated, the authors ask, are the results really generalizable? Or do Harlow’s findings apply only to his uniquely traumatized animals?
“All this individual-based behavior, I think this is very much a trend in behavioral sciences,” says Wolfgang Goymann, a behavioral ecologist at the Max Planck Institute for Biological Intelligence and the editor in chief of Ethology. The journal officially adopted the STRANGE framework in early 2021, after Rutz, who is one of the journal’s editors, suggested it to the board.
Goymann didn’t want to create new hoops for already overloaded scientists to jump through. Instead, he says, the journal simply encourages authors to include a few sentences in their methods and discussion sections addressing how STRANGE factors might bias their results (or how they’ve accounted for those factors).
“We want people to think about how representative their study actually is,” Goymann says.
Several other journals have recently adopted or recommended using the STRANGE framework, and since their 2020 paper, Rutz and Webster have run workshops, discussion groups, and symposia at conferences. “It’s grown into something that is bigger than we can run in our spare time,” Rutz says. “We are excited about it, really excited, but we had no idea it would take off in the way it did.”
His hope is that widespread adoption of STRANGE will lead to findings in animal behavior that are more reliable. The problem of studies that can’t be replicated has lately received much attention in certain other sciences—human psychology in particular.
The psychologist Brian Nosek, the executive director of the Center for Open Science, in Charlottesville, Virginia, and a co-author of the 2022 paper “Replicability, Robustness, and Reproducibility in Psychological Science” in the Annual Review of Psychology, says that animal researchers face challenges similar to those confronting researchers who study human behavior. “If my goal is to estimate human interest in surfing, and I conduct my survey on a California beach, I am not likely to get an estimate that generalizes to humanity,” Nosek says. “When you conduct a replication of my survey in Iowa, you may not replicate my finding.”
The ideal approach, Nosek says, would be to gather a study sample that’s truly representative—but that can be difficult and expensive. “The next-best alternative is to measure and be explicit about how the sampling strategy may be biased,” he says.
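One way to act on Nosek's "measure and be explicit" advice is to correct a biased sample when the sampling mechanism is known. The sketch below is hypothetical and not from the article: it reuses the earlier toy assumption that each animal's capture probability equals its activity score, and applies inverse-probability (Horvitz-Thompson-style) weighting to recover the population mean from a trap-biased sample.

```python
import random

random.seed(0)

# Hypothetical population of 10,000 animals with activity scores on [0, 1];
# assume each animal's capture probability equals its activity score.
population = [random.random() for _ in range(10_000)]
sample = [a for a in population if random.random() < a]

# Naive estimate from the biased sample overstates the true mean (~0.50).
naive = sum(sample) / len(sample)

# Weight each captured animal by the inverse of its capture probability.
weights = [1 / a for a in sample]
weighted = sum(w * a for w, a in zip(weights, sample)) / sum(weights)

print(f"naive sample mean: {naive:.2f}")
print(f"weighted estimate: {weighted:.2f}")  # close to the true mean
```

The catch, of course, is that this correction requires knowing (or modeling) how catchable each animal is, which is exactly the kind of information the STRANGE framework asks researchers to report.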
That’s just what Rutz hopes STRANGE will achieve. If researchers are more transparent and thoughtful about the individual characteristics of the animals they’re studying, he says, others might be better able to replicate their work—and be sure that the lessons they’re taking away from their study animals are meaningful, not quirks of experimental setups. “That’s the ultimate goal,” Rutz says.
In his own crow experiments, he doesn’t know whether giving shyer birds extra time changed his overarching results. But it did give him a larger sample size, which can mean more statistically robust results. And, he says, if studies are better designed, it could mean that fewer animals need to be caught in the wild or tested in the lab in order to reach firm conclusions. Overall, he hopes that STRANGE will be a win for animal welfare.
In other words, what’s good for science could also be good for the animals—seeing them “not as robots,” Goymann says, “but as individual beings that also have a value in themselves.”