I'm enjoying the media furor about Daryl Bem's recent psi-research. It's always entertaining to see scientists getting hot and bothered about parapsychology.
Bem, a highly regarded Cornell University psychology professor, has described nine experiments that he carried out over the past eight years. The one that attracted the most attention involved getting students to look at a computer screen that showed two curtains and to guess which one had an erotic image behind it. In fact both spaces were blank, and the image was randomly assigned by the computer after the subject had guessed. Bem says that volunteers showed a slight tendency to identify the space with the 'rewarding' sexy picture, picking it 53% of the time. In trials where the picture held no particular interest the results were the 50% expected by chance.
Another gave a spin to a classic memory test, in which subjects are given a list of words and asked to identify a suitable category for half of them (if the word was 'tiger' the category would be 'animal', for instance). Tested on their recall later, subjects are more likely to remember the words they categorized. In Bem's version, the process was reversed, so that subjects first tried to recall words, and only afterwards carried out the categorization element. But the words they picked to categorize tended towards those that they had earlier been able to recall. Bem suggests that this shows that practicing a set of words after the recall test does, in fact, 'reach back in time' to facilitate the recall of those words.
In a parapsychology journal the report might not have attracted much notice. But it has been accepted for publication in the mainstream Journal of Personality and Social Psychology, which makes it earth-shattering, apparently.
A New York Times piece last week appeared to be on the verge of panic. It said scientists were 'mortified' and quoted Ray Hyman saying: 'It's craziness, pure craziness. I can't believe a major journal is allowing this work in. I think it's just an embarrassment for the entire field.' Hyman thought Bem might be carrying out a practical joke. In the comments thread others agreed: perhaps the prank had a purpose, to expose the shortcomings of his own discipline.
The problem is that Bem is not your garden-variety ghost-hunting whacko. He's a senior psychologist, top of his field. His work is in all the textbooks. He wrote the textbooks, for God's sake! What's he doing advocating this ESP nonsense?
Some think the publication itself is overrated, but that doesn't really work either: journalists describe it as one of psychology's flagship journals. Others say the magazine knows nothing about statistics...oops, its editor, Charles Judd, is said to be one of the world's leading stats experts. So sceptics are anxiously looking for other explanations, for instance appealing to Occam's Razor and/or Randi's prize - 'if the shrink wins I'll believe it'. Or instant dismissal: 'This study is junk' (from the noted philosopher of science The Amazing Kreskin).
Two approaches in particular offer possibilities. One is the familiar 'experimental flaws' ploy which Ray Hyman pioneered in the 1980s, and which has become the standard CSI(COP) response. James Alcock has used it to make the prosecution case in the Skeptical Inquirer, quite effectively, I thought. Bem's work was extended over many years and was subject to frequent revision, which makes it easy to bury in a mass of complaints about procedural 'messiness'.
In fact here the professional sceptics are being upstaged by conservative statisticians, who it appears have an axe to grind with the ways statistics are used in the social sciences and medicine. Their idea is that the threshold of significance in experiments in these disciplines is far too generous, with the result that all kinds of unlikely claims are validated. They argue instead for the use of Bayesian methods, named after the eighteenth-century mathematician Thomas Bayes, who wanted the numbers to be weighted to take account of what was observable in the outside world. If the claims that the stats seem to validate are inherently unlikely, then they should be downgraded accordingly.
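For anyone who wants to see what this weighting looks like in practice, here is a minimal sketch of the basic Bayesian update: prior odds multiplied by a 'Bayes factor' (how much better the data fit the claim than the alternative) give posterior odds. The function name and every number in it are my own illustrative assumptions, not figures from Bem or his critics.

```python
def posterior_probability(prior: float, bayes_factor: float) -> float:
    """Update a prior probability using a Bayes factor (odds form of Bayes' rule)."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical evidence: data assumed to be 100 times more likely if the claim is true.
bayes_factor = 100.0

# Hypothetical priors: an agnostic reader versus a committed sceptic.
for label, prior in [("agnostic", 0.5), ("sceptic", 1e-6)]:
    print(label, posterior_probability(prior, bayes_factor))
```

The same evidence moves the agnostic reader close to certainty and barely moves the sceptic at all, which is why the choice of prior carries so much of the argument.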
In the past, statistics hasn't been the most successful area of attack against parapsychology. Statisticians themselves have been among its most vocal supporters, beating off complaints about improper statistical analysis. There's a lot of uninformed pontificating, as in this comment from the NYT thread:
What people apparently don't want to realize is that the laws of probability absolutely require you to take all prior information (i.e. results of past experiments) into account. The very, very vast majority of past ESP experiments has produced negative results. If you conduct thousands of ESP experiments, you are virtually certain to obtain extraordinary results once in a while, but picking out the "winners" is unsound and unscientific.
Well it would be if that's what's going on, but parapsychologists insist this so-called 'file drawer' problem is a myth, and even the professional debunkers aren't seriously pursuing it. But then you'd need to take an interest to know that.
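The commenter's statistical point, taken on its own, is easy to demonstrate. The sketch below simulates a batch of pure-chance guessing experiments and counts how many cross the conventional significance threshold anyway. The function name, the experiment sizes and the use of a normal approximation are all my assumptions, and it says nothing about whether selective reporting actually happens in parapsychology.

```python
import math
import random


def simulate_null_experiments(n_experiments: int = 2000,
                              n_trials: int = 100,
                              seed: int = 0) -> float:
    """Return the fraction of pure-chance experiments that look 'significant'."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_experiments):
        # Each trial is a 50/50 guess with no real effect present.
        hits = sum(rng.random() < 0.5 for _ in range(n_trials))
        # Normal approximation to the binomial, two-sided test at roughly p < 0.05.
        z = (hits - n_trials * 0.5) / math.sqrt(n_trials * 0.25)
        if abs(z) > 1.96:
            significant += 1
    return significant / n_experiments


print(simulate_null_experiments())
```

Roughly one experiment in twenty comes out as a 'winner' by chance alone, so the objection only has force if the losers really are being left in the drawer.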
However this looks to be a bit different, as the sceptics here are arguing not that the method has been wrongly applied, but that a different method is required altogether. Some statisticians have apparently been campaigning about this for years, with regard to the social sciences, and this controversy is a perfect opportunity for them to press their point. There have been a couple of rebuttals of Bem's findings along these lines (here and here).
But really, is it that different from anything else that sceptics have complained about over the years? For you don't have to be a mathematician to see the problem with applying Bayesian techniques. It's not just that the whole notion of weighting the numbers to allow for inherent unlikeliness is suspect, it's that the specific factors the Bayesians identify as relevant are so arbitrary: No mechanistic theory for precognition and no idea of how brain processes could produce it. If it's true, the world should be full of powerful psychics, but it isn't. There's no real-life evidence that people can feel the future. No one has won Randi's prize.
These are all loaded with assumptions about what should be the case, and exhibit a complete ignorance of the data that describes the phenomenon, as it appears in real-life situations. So using them to weight a statistical experiment is pointless, like carefully measuring out the ingredients for a cake, and then adding random handfuls of flour, sugar and flavouring until the result 'feels right'.
What it all seems to boil down to is the complaint that the margin of significance is insufficient to support the claim. In practice that's what sceptics feel anyway, otherwise we wouldn't be arguing about it any more. All this does is give their gut-rejection a veneer of scientific respectability.
It's certainly true that the effect size in Bem's experiments is quite small, as it is in parapsychology generally. A lot here has to do with the size of the sample. An average 51% result where 50% is the chance mean is taken to be significant in psychokinesis studies, for instance, only because huge numbers of trials are involved. And it's widely recognised that the effect sizes reported by parapsychology are as great as or greater than, for instance, the link between aspirin and heart attack prevention, calcium intake and bone mass, second hand smoke and lung cancer, and condom use and HIV prevention - none of which are especially controversial. So in theory, there's a case to answer from Bem's work.
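To see why sample size matters so much, here is a rough sketch of how the same 51% hit rate fares against a 50% chance baseline as the number of trials grows. It uses a simple normal approximation to the binomial. The function names and trial counts are my own choices for illustration, not figures from any particular study.

```python
import math


def z_score(hit_rate: float, n_trials: int, chance: float = 0.5) -> float:
    """Standard score of an observed hit rate against the chance rate."""
    standard_error = math.sqrt(chance * (1.0 - chance) / n_trials)
    return (hit_rate - chance) / standard_error


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard normal score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical trial counts: the effect stays at 51% throughout.
for n in (100, 10_000, 1_000_000):
    z = z_score(0.51, n)
    print(n, round(z, 2), two_sided_p_value(z))
```

With a hundred trials a 51% result is indistinguishable from chance; at around ten thousand it reaches the usual 5% threshold; with a million it is overwhelming, even though the effect itself has not changed.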
As it happens, I don't think these experiments alone provide particularly good evidence of psi, but they do confirm a long-established trend in parapsychology. More to the point, they offer a new experimental approach, which I guess is what Bem is more concerned with than making grand claims about psi's existence. Like Rupert Sheldrake's staring experiment, it's quite easy to do and he wants others to try it. Apparently there have been three failed replication attempts, so far, which has given heart to sceptics, but for Bem it's still early days.
I'd be interested to hear what statisticians like Jessica Utts, for instance, have to say about the Bayesian critics, and it will be interesting to see how the controversy pans out. For the moment it seems clear that the sceptics have a new argument to run with - the NYT in particular is keen to push it, as in this follow-up piece a couple of days ago.
All this aside, I'm encouraged by the responses the Bem debate has been throwing up. The comments thread on the first NYT piece contained the usual snarky responses, but there were also a number of thoughtful ones. Quite a few urged that it's the job of science to investigate, not to suppress or make a priori assumptions. Some recognized quantum entanglement effects validate the possibility of psi and others described their own experiences, rebutting claims that there's no real-life experience of it.
And a lot of the Internet coverage has been informed and positive, e.g. this Huffington Post article. Psychologists in the past have been fiercely critical but these two pieces in Psychology Today are informed and interested (here and here). My sense is there are plenty of professionals out there in the scientific mainstream who take parapsychology seriously, and who won't take the sceptical brouhaha completely at face value.