Recently, there has been some hubbub over whether or not you should hug your dog. The Psychology Today blog post that reignited the debate, by Dr. Stanley Coren of the University of British Columbia, titled “The Data Say ‘Don’t Hug the Dog!’” claimed to answer this question with science. Dr. Coren collected 250 photographs of people hugging dogs from the Internet, and analyzed them for visual cues that indicate stress in dogs. He found that 81 percent of these photos contained dogs displaying one or more stress signals, indicating that the dog did not like being hugged. Rachel Feltman took issue with the coverage of this op-ed in a brilliant critique of how science news is covered by the media (I will reiterate many of her points here with more detail).
In response to the attention, Dr. Coren has released some additional details in a Facebook post that allows us to dive a little bit deeper into his pilot study. In the post, he uses a statistical analysis to provide support for his conclusions.
Again, these are not peer-reviewed data, but with these details, I can evaluate the problems with the design as a window into the type of comments he would have gotten had he submitted the paper for peer review instead of advertising it in the media first. And, as is frequently the case, the devil is in the details. Let me explain why I would reject these data if I were asked to peer review this study.
Big problem #1: bias in sampling. Dr. Coren states that the photographs were “randomly” selected from a search of Internet photographs. Apparently, he did not include all of the search results, or even a fixed number of the first results, which indicates that he made choices of his own beyond the inclusion criteria he gave. This is problematic because what you think of as “random” is often not.
In science, “random” is a descriptor for a very specific kind of sampling that must be driven by a truly random process, such as throwing perfectly weighted dice or observing radioactive decay. Humans cannot pick random numbers, and therefore cannot randomly sample.
When you “randomly” select something not directly guided by a truly random process, you are using haphazard selection. What’s the problem with haphazard selection? One word: bias. Although you think you are picking “at random,” unconscious bias seeps in like storm water through a cracked foundation. The Transportation Security Administration (TSA), for example, is infamous for “randomly” selecting Sikh men in turbans and women in hijabs for additional screening. Not necessarily because TSA agents are openly biased against people of particular faiths, but because unconscious biases drive their choices while appearing to them to be random.
Although Dr. Coren may not have a conscious bias for including pictures of dogs that look unhappy in photos (i.e., he wasn’t trying specifically to pick more pictures with dogs that look unhappy than ones that seemed to enjoy hugging), the facts are that he feels strongly that dogs don’t like hugs already, and he did nothing to control for his unconscious bias. This should give any reader pause in drawing strong conclusions about hugging based on the quality of the data.
So how would one fix this problem in picture selection? It’s actually quite easy. There are many protocols that allow people to choose based on random number generators. To go back to our TSA example, the Administration has recently introduced an app that removes the decision from a human and gives it to a random number generator (technically it’s a pseudorandom number generator, but hey, let’s not get bogged down). For every person entering the line, the app generates a random number and if the number is below a threshold, the traveler is selected for extra screening. This removes all the biases (unconscious or otherwise) of human choice.
Of course, one doesn’t need to spend $1.4 million on a random number generator (although someone failed to mention this to the TSA). You can access a free random number generator here. (Generate any random number with a click of a button! Play with it for all the time you’d like!) A simple protocol for including 50 percent of the photos that fit the other criteria would be:
- Identify a potential picture of a dog and a person hugging or otherwise interacting.
- Generate a random number between 1 and 100.
- If the number is between 1 and 50, include the picture; if the number is between 51 and 100, do not include the picture.
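For readers who would rather script it, the three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the photo file names are hypothetical, and Python's `random` module is a pseudorandom generator rather than a truly random source.

```python
import random


def include_photo(rng, inclusion_percent=50):
    """Draw a number from 1 to 100 and include the photo if it is at or below the cutoff."""
    draw = rng.randint(1, 100)  # both endpoints are inclusive
    return draw <= inclusion_percent


def sample_photos(candidates, inclusion_percent=50, seed=None):
    """Apply the inclusion rule to every candidate photo that already meets the other criteria."""
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    return [photo for photo in candidates if include_photo(rng, inclusion_percent)]


# Hypothetical candidate list; in practice these would be the photos found by the search.
candidates = [f"photo_{i}.jpg" for i in range(20)]
selected = sample_photos(candidates, inclusion_percent=50, seed=2016)
print(len(selected), "of", len(candidates), "photos included")
```

The point of the design is that the person doing the search never decides which photo goes in; the generator does, and recording the seed lets someone else reproduce the same selection.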
And this doesn’t even touch on the bias of scoring these photos, which also needs to be controlled to trust the results!
Big problem #2: no control condition for the study. Most, although not all, of science can be boiled down to testing specific hypotheses to find out if the effect you are seeing is due to your hypothesis or a null hypothesis. In Dr. Coren’s posts, the question he is asking (do dogs dislike being hugged?) is addressed by the hypothesis that dogs being hugged will exhibit stress signals more often than you would otherwise expect (here, the expectation is that 50 percent of photos will show a dog exhibiting some stress signal). The null hypothesis is that dogs being hugged would exhibit some stress signals at the same rate as you would expect (about 50 percent, according to Dr. Coren).
But how valid is this figure of 50 percent? He admits in his Facebook post that it is a guess. While there’s nothing wrong with taking an educated guess to form a null hypothesis, is it a good guess that 50 percent of dogs in similar situations (minus the hugging) would show at least one stress signal?
The problem here is that this hypothesis assumes that only the hugging would be responsible for stress signals in these photos, not any other part of the situation of photographing that dogs might also find stressful. I can attest to how much my dog dislikes being photographed: in the picture below, you can see her throwing several of the same signals including whale eye, turned head, and ears back; no hugging necessary.
(Don’t worry, she got a treat for putting up with my camera invading her seven-foot bubble of personal space!) Even one of Dr. Coren’s example photos used to illustrate the stress signals was of a dog not being hugged. So what if the situation of being photographed is enough to increase the rate at which you’d expect to see stress signals? Maybe it isn’t 50 percent, but 75 percent? And what about other parts of the situation, separate from even the act of taking the picture, that you can’t glean from a photograph?
How could you control for the other situational stress that might occur together with photographing someone hugging their dog? A better comparison is in order: not just selecting and scoring photos of people hugging dogs, but selecting and scoring dogs by themselves in similar situations (minus hugging). It is simple, relies on the same criteria and scoring, and could be done just as quickly. The inclusion criterion could be the dog at the same relative distance from the camera. This would establish at least a basic control on the setting and provide a better null hypothesis value for testing than simply guessing.
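With a control set in hand, the comparison becomes hugged dogs versus unhugged dogs rather than hugged dogs versus a guess. One standard way to compare two such rates is a two-proportion z-test, sketched below. This is my choice of test for illustration, not something from Dr. Coren's posts, and the counts in the example are invented purely to show how the function is called.

```python
import math


def two_proportion_z_test(stressed_hug, n_hug, stressed_ctrl, n_ctrl):
    """One-sided test of whether the hugged group shows stress signals at a higher rate."""
    p_hug = stressed_hug / n_hug
    p_ctrl = stressed_ctrl / n_ctrl
    # Pooled rate under the null hypothesis that both groups share one rate
    pooled = (stressed_hug + stressed_ctrl) / (n_hug + n_ctrl)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_hug + 1 / n_ctrl))
    if se == 0:
        raise ValueError("rates are all 0 or all 1; the test is undefined")
    z = (p_hug - p_ctrl) / se
    # Upper-tail probability of the standard normal distribution
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Invented counts, NOT data from the study: 203 of 250 hugged dogs and
# 190 of 250 control dogs scored as showing at least one stress signal.
z, p_value = two_proportion_z_test(203, 250, 190, 250)
print(f"z = {z:.2f}, one-sided p = {p_value:.3f}")
```

If the control photos turned out to show stress signals nearly as often as the hugging photos, a test like this would report little evidence that hugging is the cause.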
So, you may ask, what’s the use of going to so much trouble to split hairs with a pilot study like this? Well, first off, I’ll quote Rachel Feltman, “Welcome to science, it’s a pain in the butt.” The reason that advances happen, that we trust science so much, is exactly this hair-splitting scrutiny over the quality of the data and the interpretation that goes along with it. Without these, we have very little way of knowing if we are seeing a real effect or simply tricking ourselves into a conclusion. If bad sampling causes the observed rate to drop by 7 percentage points, a pretty small drop, the 95 percent confidence interval drops to between 0.6822 and 0.7906. And if it turns out that closer to 70 percent (0.70) of dogs show stress signals simply by being photographed, a value that falls inside that interval, then all of a sudden the results are no longer significant!
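To make that reasoning concrete, here is a minimal sketch that computes a 95 percent confidence interval for an observed proportion and checks whether a proposed baseline falls inside it. It uses the Wilson score interval, which is an assumption on my part; the counts are derived from the figures above (250 photos, and 81 percent minus 7 points is 74 percent, or 185 photos), and 203 is simply 81 percent of 250 rounded to a whole photo.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion; z = 1.96 corresponds to 95 percent confidence."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def baseline_is_plausible(successes, n, baseline):
    """True if the baseline rate lies inside the interval, i.e. the result is not significant."""
    low, high = wilson_interval(successes, n)
    return low <= baseline <= high


for label, count in [("81 percent observed", 203), ("74 percent observed", 185)]:
    low, high = wilson_interval(count, 250)
    print(f"{label}: interval {low:.4f} to {high:.4f}, "
          f"0.70 baseline plausible: {baseline_is_plausible(count, 250, 0.70)}")
```

Under these assumptions the interval at 74 percent runs from roughly 0.68 to 0.79, so a 70 percent baseline cannot be ruled out, while the interval at 81 percent sits entirely above 0.70. Neither problem alone erases the result; the two together do.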
Second, it’s a matter of how the results are reported in the popular media. As Feltman’s article points out, there have been many breathless stories of science “proving” that you shouldn’t hug your dog. Few people understand that science proves nothing: ideas are simply supported or unsupported by evidence. The strength of that support comes in part from the quality of the data. Here, there is very poor quality data being given more time and attention than it deserves. It is a good idea, perhaps even a correct one, but scientists and the media owe it to the public to put those data into the proper context. So do the data say “don’t hug the dog!”?
Not even close.
Stay skeptical, my friends.
Lindsay Waldrop, PhD, is a biologist and dog enthusiast who will start as an assistant professor in August 2016.