It's Madness - Madness I Tell You!
January 12, 2011
I'm enjoying the media furor about Daryl Bem's recent psi-research. It's always entertaining to see scientists getting hot and bothered about parapsychology.
Bem, a highly-regarded Cornell University psychology professor, has described nine experiments that he carried out over the past eight years. The one that attracted the most attention involved getting students to look at a computer screen that showed two curtains and to guess which one had an erotic image behind it. In fact both spaces were blank, and the image was randomly assigned by the computer after the subject had guessed. Bem says that volunteers showed a slight tendency to identify the space with the 'rewarding' sexy picture, with a hit rate of 53%. In trials where the picture held no particular interest the results were the 50% expected by chance.
Another gave a spin to a classic memory test, in which subjects are given a list of words and asked to identify a suitable category for half of them (if the word was 'tiger' the category would be 'animal', for instance). Tested on their recall later, subjects are more likely to remember the words they categorized. In Bem's version, the process was reversed, so that subjects first tried to recall words, and only afterwards carried out the categorization element. But the words they picked to categorize tended towards those that they had earlier been able to recall. Bem suggests that this shows that practicing a set of words after the recall test does, in fact, 'reach back in time' to facilitate the recall of those words.
In a parapsychology journal the report might not have attracted much notice. But it has been accepted for publication in the mainstream Journal of Personality and Social Psychology, which makes it earth-shattering, apparently.
A New York Times piece last week appeared to be on the verge of panic. It said scientists were 'mortified' and quoted Ray Hyman saying: "It's craziness, pure craziness. I can't believe a major journal is allowing this work in. I think it's just an embarrassment for the entire field." Hyman thought Bem might be carrying out a practical joke. In the comments thread others agreed: perhaps the prank had a purpose, to expose the shortcomings of his own discipline.
The problem is that Bem is not your garden-variety ghost-hunting whacko. He's a senior psychologist, top of his field. His work is in all the textbooks. He wrote the textbooks, for God's sake! What's he doing advocating this ESP nonsense?
Some think the publication itself is overrated, but that doesn't really work either: journalists describe it as one of psychology's flagship journals. Others say the magazine knows nothing about statistics...oops, its editor, Charles Judd, is said to be one of the world's leading stats experts. So sceptics are anxiously looking for other explanations, for instance appealing to Occam's Razor and/or Randi's prize - 'if the shrink wins I'll believe it'. Or instant dismissal: 'This study is junk' (from the noted philosopher of science The Amazing Kreskin).
Two approaches in particular offer possibilities. One is the familiar 'experimental flaws' ploy which Ray Hyman pioneered in the 1980s, and which has become the standard CSI(COP) response. James Alcock has used it to make the prosecution case in the Skeptical Inquirer, quite effectively, I thought. Bem's work was extended over many years and was subject to frequent revision, which makes it easy to bury in a mass of complaints about procedural 'messiness'.
In fact here the professional sceptics are being upstaged by conservative statisticians, who it appears have an axe to grind with the ways statistics are used in the social sciences and medicine. Their idea is that the threshold of significance in experiments in these disciplines is far too generous, with the result that all kinds of unlikely claims are validated. They argue instead for the use of Bayesian methods, after the eighteenth-century mathematician Thomas Bayes, who wanted the numbers to be weighted to take account of what was observable in the outside world. If the claim that the stats seem to validate is inherently unlikely, then they should be downgraded accordingly.
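For readers who want to see what that weighting amounts to, here is a minimal sketch of the odds form of Bayes' rule: posterior odds equal prior odds multiplied by a 'Bayes factor' that measures how strongly the data favour the claim. The function name, the Bayes factor of 20 and both priors are hypothetical numbers chosen purely for illustration; none of them comes from Bem's paper or from his critics.

```python
def posterior_probability(prior: float, bayes_factor: float) -> float:
    """Update a prior probability using the odds form of Bayes' rule."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical strength of evidence: data 20 times more likely if the claim is true.
BAYES_FACTOR = 20.0

open_minded = posterior_probability(0.5, BAYES_FACTOR)    # assumed 50/50 prior
sceptical = posterior_probability(0.001, BAYES_FACTOR)    # assumed 1-in-1000 prior

print(f"Open-minded reader: {open_minded:.3f}")
print(f"Sceptical reader:   {sceptical:.3f}")
```

The same evidence leaves the first reader nearly convinced and the second still below 2%, so in this sketch the choice of prior does most of the work.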
In the past, statistics hasn't been the most successful area of attack against parapsychology. Statisticians themselves have been among its most vocal supporters, beating off complaints about improper statistical analysis. There's a lot of uninformed pontificating, as in this comment from the NYT thread:
What people apparently don't want to realize is that the laws of probability absolutely require you to take all prior information (i.e. results of past experiments) into account. The very, very vast majority of past ESP experiments has produced negative results. If you conduct thousands of ESP experiments, you are virtually certain to obtain extraordinary results once in a while, but picking out the "winners" is unsound and unscientific.
Well it would be if that's what's going on, but parapsychologists insist this so-called 'file drawer' problem is a myth, and even the professional debunkers aren't seriously pursuing it. But then you'd need to take an interest to know that.
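The arithmetic behind the commenter's worry is easy enough to sketch. Assuming every experiment tests an effect that isn't there, that the experiments are independent, and that the significance threshold is the conventional 5%, the chance of at least one 'extraordinary' result grows quickly with the number of studies. The function names and the counts are illustrative assumptions; the sketch says nothing about whether such a pile of unreported studies actually exists, which is the point in dispute.

```python
def chance_of_false_positive(num_experiments: int, alpha: float = 0.05) -> float:
    """Probability that at least one of several null experiments looks significant."""
    return 1.0 - (1.0 - alpha) ** num_experiments


def expected_false_positives(num_experiments: int, alpha: float = 0.05) -> float:
    """Average number of null experiments that cross the significance threshold."""
    return num_experiments * alpha


# Hypothetical study counts, for illustration only.
for count in (1, 20, 1000):
    chance = chance_of_false_positive(count)
    expected = expected_false_positives(count)
    print(f"{count:>5} experiments: P(at least one hit) = {chance:.4f}, "
          f"expected hits = {expected:.1f}")
```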
However this looks to be a bit different, as the sceptics here are arguing not that the method has been wrongly applied, but that a different method is required altogether. Some statisticians have apparently been campaigning about this for years, with regard to the social sciences, and this controversy is a perfect opportunity for them to press their point. There have been a couple of rebuttals of Bem's findings along these lines (here and here).
But really, is it that different from anything else that sceptics have complained about over the years? For you don't have to be a mathematician to see the problem with applying Bayesian techniques. It's not just that the whole notion of weighting the numbers to allow for inherent unlikeliness is suspect; it's that the specific factors the Bayesians identify as relevant are so arbitrary: No mechanistic theory for precognition and no idea of how brain processes could produce it. If it's true, the world should be full of powerful psychics, but it isn't. There's no real-life evidence that people can feel the future. No one has won Randi's prize.
These are all loaded with assumptions about what should be the case, and exhibit a complete ignorance of the data that describes the phenomenon, as it appears in real-life situations. So using them to weight a statistical experiment is pointless, like carefully measuring out the ingredients for a cake, and then adding random handfuls of flour, sugar and flavouring until the result 'feels right'.
What it all seems to boil down to is the complaint that the margin of significance is insufficient to support the claim. In practice that's what sceptics feel anyway, otherwise we wouldn't be arguing about it any more. All this does is give their gut-rejection a veneer of scientific respectability.
It's certainly true that the effect size in Bem's experiments is quite small, as it is in parapsychology generally. A lot here has to do with the size of the sample. An average 51% result where 50% is the chance mean is taken to be significant in psychokinesis studies, for instance, only because huge numbers of trials are involved. And it's widely recognised that the effect sizes reported by parapsychology are as great as or greater than, for instance, the link between aspirin and heart attack prevention, calcium intake and bone mass, second hand smoke and lung cancer, and condom use and HIV prevention - none of which are especially controversial. So in theory, there's a case to answer from Bem's work.
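To see why the number of trials matters so much, here is a rough sketch using the normal approximation to the binomial: the same 51% hit rate is unremarkable over a hundred guesses and overwhelming over a hundred thousand. The trial counts are hypothetical, picked only to show the scaling, and the function names are my own; they don't describe any particular study.

```python
import math


def z_score(hit_rate: float, num_trials: int, chance: float = 0.5) -> float:
    """Standard errors by which the observed hit rate exceeds chance."""
    standard_error = math.sqrt(chance * (1.0 - chance) / num_trials)
    return (hit_rate - chance) / standard_error


def one_sided_p_value(z: float) -> float:
    """Upper-tail probability of a standard normal variable beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical trial counts, for illustration only.
for trials in (100, 10_000, 100_000):
    z = z_score(0.51, trials)
    print(f"{trials:>7} trials at 51%: z = {z:.2f}, p = {one_sided_p_value(z):.2e}")
```

The effect size is identical in all three rows; only the amount of data changes, and with it the significance.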
As it happens, I don't think these experiments alone provide particularly good evidence of psi, but they do confirm a long-established trend in parapsychology. More to the point, they offer a new experimental approach, which I guess is what Bem is more concerned with than making grand claims about psi's existence. Like Rupert Sheldrake's staring experiment, it's quite easy to do and he wants others to try it. Apparently there have been three failed replication attempts, so far, which has given heart to sceptics, but for Bem it's still early days.
I'd be interested to hear what statisticians like Jessica Utts, for instance, have to say about the Bayesian critics, and it will be interesting to see how the controversy pans out. For the moment it seems clear that the sceptics have a new argument to run with - the NYT in particular is keen to push it, as in this follow-up piece a couple of days ago.
All this aside, I'm encouraged by the responses the Bem debate has been throwing up. The comments thread on the first NYT piece contained the usual snarky responses, but there were also a number of thoughtful ones. Quite a few urged that it's the job of science to investigate, not to suppress or make a priori assumptions. Some recognized quantum entanglement effects validate the possibility of psi and others described their own experiences, rebutting claims that there's no real-life experience of it.
And a lot of the Internet coverage has been informed and positive, e.g. this Huffington Post article. Psychologists in the past have been fiercely critical but these two pieces in Psychology Today are informed and interested (here and here). My sense is there are plenty of professionals out there in the scientific mainstream who take parapsychology seriously, and who won't take the sceptical brouhaha completely at face value.
People such as Ray Hyman are completely dishonest. He and his ilk blast parapsychologists for not publishing in mainstream journals and then when they do, they attack the credibility of the journals for allowing a parapsychologist to publish. This is the old persecutor's trick of stopping you from publishing and then blasting you in public for not publishing.
Posted by: Kris | January 12, 2011 at 03:18 PM
You might find this decidedly fretful piece from Douglas Hofstadter amusing. (In fairness, some of the other contributors are pretty sensible.)
http://www.nytimes.com/roomfordebate/2011/01/06/the-esp-study-when-science-goes-psychic/a-cutoff-for-craziness
Posted by: BenSix | January 12, 2011 at 04:36 PM
Yes, plenty of handwringing here! But it's good to see the Brits, Wiseman and Goldacre, keeping cool under fire.
I like this remark, by Anthony Gottlieb:
It’s very suspicious that hard evidence of paranormal powers only ever seems to show up in laboratories. If people really can predict the future in extrasensory (and extra-rational) ways, how come they only seem to manage it when ESP researchers ask them to do something trivial, like guess a playing card or a picture?
Perhaps I'll send him a copy of Randi's Prize.
Posted by: Robert McLuhan | January 12, 2011 at 05:02 PM
Typical "skeptic". Suppress that which does not fall into your world view. Heaven help us all if these "skeptics" ever gain control of governments; it will make the inquisition look like child's play.
Posted by: Kris | January 12, 2011 at 05:40 PM
The moderator of this blog: http://www.talyarkoni.org/blog/2011/01/10/the-psychology-of-parapsychology-or-why-good-researchers-publishing-good-articles-in-good-journals-can-still-get-it-totally-wrong/comment-page-1/#comment-3035
chalked out the usual "why hasn't Bem applied for Randi's prize" line. My reply, I hope, clarifies things!
Posted by: michael duggan | January 13, 2011 at 01:19 AM
Nice post. I love it. Waiting your new posts. Thank you...
Posted by: Devremülkler | January 13, 2011 at 01:54 PM
Robert – first I’ll mention that I received your book in the post recently. I haven’t had time yet to do more than skim the contents, but the parts I have looked at are well laid out and seem to form some tight arguments. What I like in particular is that the book is set out in sections that can be read as independent pieces in their own right. That is useful for me right now because I am not able to set aside several hours at a time to get into hefty tomes (and it is a very involved book – I applaud you for being able to put it all together). No doubt I’ll be able to make some comments at a later date.
With regard to Bem’s research, I don’t think anyone who supports the paranormal hypothesis should get too excited just yet. He has reported results that are statistically significant, typically between 1.7 and 3% above chance expectation, but that does not imply that anything paranormal is happening. Some other researchers will likely assume that he has been less than careful in his methodology, for example, and of course the acid test is replication – of which there have been at least three failures so far.
Let me quote the physicist Ernest Rutherford: “If your result needs a statistician then you should design a better experiment.”
That, for me, sums it up nicely. As I have suggested many times here, why don’t the psychics do their stuff without any ambiguity? It’s all very well to argue about statistical analysis of alleged paranormal ability, which only leads to further argument about the validity of those statistical analyses, etc., ad infinitum, but we really need to see solid results.
I wonder if we could find a “holistic” car mechanic, for example. The believers in the paranormal could take their broken-down cars there to be repaired, and the rest of us could take our broken-down cars to a time-served mechanic. And then we compare the results.
When I have had to use the services of a qualified mechanic in the past, I have had no problem (except for a couple of cowboys). But I have certainly never had to pay for a car to be returned to me still broken down, but then being told that, statistically, it works, but if it doesn’t then it’s my own fault for not having enough faith. But that is what happens when someone visits a psychic or a faith healer or anyone else who claims to be getting their “powers” from the “other side.”
There really is a simple bottom line here: can someone who claims to have paranormal powers actually demonstrate it? Can Uri Geller really bend a chrome vanadium spanner with “psychic powers”? Some people say yes. I would say to Geller, do it with a spanner that I supply, and I will be convinced (never mind James Randi and the rest of the world, just convince me).
Additional note: I followed Michael Duggan’s link. If you read the whole article, you will find an excellent analysis of Bem’s work, and an illustration of some of the problems inherent in statistical analysis and interpretation of results.
Posted by: Harley | January 15, 2011 at 11:56 PM
Not surprisingly, the three failed replication attempts (one of which clearly suffered from severe limitations) receive much attention, while successful replication(s) are ignored. Unscientific prejudice in online media at its finest.
See http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1715954 for:
Batthyany, A. (2010). Retrocausal Habituation and Induction of Boredom: A Successful Replication of Bem (2010; Studies 5 and 7). Social Science Research Network, Working Paper Series.
Also see:
Savva, L., Child, R. & Smith, M. D. (2004). The Precognitive Habituation Effect: An Adaptation Using Spider Stimuli. The Parapsychological Association Convention 2004, pp. 223 - 229.
and
Parker, A., & Sjödén, B. (2010). Do some of us habituate to future emotional events? Journal of Parapsychology, 74, 99–115.
Posted by: vo | February 02, 2011 at 04:57 PM
[I just noticed that my above comment could be misconstrued. I'm not accusing anyone on this blog of "prejudice"; rather, that was in reference to the NYTimes Piece, for example.]
Posted by: vo | February 02, 2011 at 05:07 PM