The following article is reprinted from Zetetic Scholar #10, December 1982, pp. 50-65, with the permission of the publisher (Marcello Truzzi, Dept. of Sociology, Eastern Michigan University, Ypsilanti, MI 48197). The article is copyright © 1982 by Marcello Truzzi.

This article and correspondence between Richard Kammann, Marcello Truzzi, and George Abell were probably largely responsible for the publication of George Abell, Paul Kurtz, and Marvin Zelen's "The Abell-Kurtz-Zelen 'Mars Effect' Experiments: A Reappraisal" in the Spring 1983 (vol. 7, no. 3) issue of Skeptical Inquirer. The "Reappraisal" was authored by Abell and published at his instigation.

Richard Kammann was associate professor of psychology at the University of Otago in New Zealand, co-author with David Marks of The Psychology of the Psychic (1980, Prometheus Books), and a Fellow of CSICOP until his resignation in 1981 as a result of the "Mars Effect" controversy. He died in 1984. -- Jim Lippard


THE TRUE DISBELIEVERS:
Mars Effect Drives Skeptics to Irrationality
(Part I)

RICHARD KAMMANN

What follows is a two-part analysis of the Mars Effect Controversy surrounding the tests conducted by prominent Fellows of the Committee for the Scientific Investigation of Claims of the Paranormal. An advance copy of this article was sent to the Executive Council of that Committee, and a letter announcing the article and offering a free copy was sent to all of the Fellows and Scientific Consultants of the organization. Thus, all persons associated with the Committee have been urged to comment on this paper or to persuade others representing the Committee to reply. We hope that this will elicit responses to Dr. Kammann's critique (we have thus far received no responses to Patrick Curry's critique in ZS#9 that might appear in ZS#11). But if replies appear elsewhere (e.g., in THE SKEPTICAL INQUIRER), we will so inform ZS readers in ZS#11. Of course, as with all ZS materials, readers are welcome to enter these dialogues.

Perhaps it should be mentioned that the first part of this article by Dr. Kammann was originally submitted for possible publication to THE HUMANIST, PSYCHOLOGY TODAY, and THE SKEPTICAL INQUIRER, all of whom declined to publish it. I mention this not to impugn any of these publications or their policies--there are many good reasons why a publication may reject an excellent article. Rather, I mention it to point out that this article was not solicited by ZS and was not originally written for ZS. -- M. Truzzi

In recent years psychologists have become increasingly fascinated by the versatility of the true believer in finding reasons to go on believing in spite of clear evidence to the contrary. These belief-preserving maneuvers are most readily seen in everyday aberrations like racial prejudice, superstition, religion, and the slogans of politicians, but it is now recognized that they also occur in the very halls of science from which "truth" is supposed to be broadcast with dispassionate, value-free objectivity. Kuhn has written about the "paradigms" that organize scientific thought in each field of study at each point in time, while psychologist Michael Mahoney (The Scientist as Subject) has documented a number of signs of fallible logic and irrational conviction among scientists, especially social scientists.

This article is a case study in which a small group of anti-pseudoscience skeptics fall back on a remarkable line of illogic and defensiveness when confronted with intractable data suggesting that the position of Mars in the sky when one is born has an effect on the likelihood of becoming a sports champion. In the July 1982 issue of Psychology Today, UCLA astronomer George Abell reviews the data on the Mars effect, a quasi-astrological claim by French scientists Michel and Francoise Gauquelin, and presents his reasons for disbelieving the claim. Again in the August 30, 1982, Newsweek, Abell says, "The Gauquelins have no way of proving they did not cheat." Of course, such a statement can be made against any scientific claim. Such a slur may only be raised properly where there are positive reasons for suspicion, as there are in this case--not by the Gauquelins, but by Abell and his collaborators!

Abell's ultimate line of attack against the Mars correlation is his argument that the Gauquelins' data may not be trustworthy. To support this case, Abell points to data produced by the Gauquelins in response to a control group challenge issued by Abell's collaborator Marvin Zelen in The Humanist (Jan./Feb. 1976). The essence of the challenge was for Michel Gauquelin to produce the Mars data on ordinary people for comparison with the Mars effect on sports champions. Incidental to this test, the Gauquelins extracted a subsample of 303 of their original 2,088 European champions. According to Abell, these champion data have two anomalies in them. First, there were "disparities" among the three regional samples from Paris, the rest of France, and Belgium. Second, the fact that the size of the Mars effect in this 303 subsample happens to match almost perfectly the size of the effect in the original sample of 2,088 champions is seen by Abell as "improbably good," meaning that it was too good to have plausibly occurred by chance alone.

It is incredible that Abell should have produced these erroneous arguments for the fourth time in spite of repeated warnings by many critics over four years that they are incorrect. Nevertheless, having entered them (once again) into the public record, he says "Although we suspect that the Gauquelins' sample was not random, we can imagine ways that bias could have entered without intentional cheating." He then provides an analogy that makes the Gauquelins look grossly incompetent if they are not to be accused of cheating. Not satisfied with the damage done by this innuendo, Abell adds the thought, "I find it hard to believe that they [the Gauquelins] would intentionally falsify the data, but of course personal feelings are irrelevant in the scientific evaluation of a claim."

Following Abell's dictum, I here put aside my personal loyalties to him, to the Committee for the Scientific Investigation of Claims of the Paranormal (CSICOP) which he represents as a member of its Council, and to those other Councilors of CSICOP whose talents and purposes I have always admired. In Part 1 of this two-part paper I shall show that Abell, along with Professor Paul Kurtz, the Chairman of CSICOP and former editor of The Humanist, and Professor Marvin Zelen, statistician at Harvard University and Fellow of CSICOP, have persisted in offering to the public a set of demonstrably false statistical arguments against the Mars effect in spite of four years of continuous and steadily mounting criticism of their illogic.

Also in this Part, I report a hypothesis that I worked out to explain away some of these errors, and even their unsinkability, as non-malicious acts of blind prejudice, and explain why I was unable to extend that scenario to cover all the errors committed. Many of these errors were initially exposed by Dennis Rawlins in a smashing attack on CSICOP in the October 1981 issue of Fate magazine. Readers who are not aware of this whole controversy, and the vehement behind-the-scenes dispute it has created for CSICOP, can refer to the Chart for the cast of characters and the chronology of main events.


Cast of Characters

  1. CSICOP

  2. The Trio (who committed the errors)

  3. The Man who Complained

  4. The Victims of the Errors

Brief Chronology


The Zelen Challenge

Readers who have followed the Mars feud already know that Michel and Francoise Gauquelin found that European sports champions were born in Mars sectors 1 and 4 at a rate of 22% instead of the 17% expected by chance. However, the definition of "chance" requires making assumptions, for example, about the times of day at which people are born, which are not truly random. While one can try to take all such factors into account to calculate a chance baseline, Marvin Zelen proposed a shortcut in 1976 by looking at the Mars sectors of ordinary people to see how often they are born in sectors 1 and 4. As a method of finding the right baseline, the Zelen challenge is a definitive test.

The first error by the skeptics occurs in the funny way Zelen designed this challenge. Quite logically he said that the control group should be born at the same times and places as the champions. He suggested using 100 or 200 of the original champions to locate the matched control group. Practically nobody noticed in the fine print of Zelen's statistical design that he planned to see if the Mars effect in these 100 or 200 champions was above the baseline effect in their birth mates.

The catch-22 is the small sample size Zelen suggested. If there really is a Mars effect of 22% above 17%, a sample of 100 champions is far too small to detect the effect reliably. The Gauquelins not only spotted the error, but presented Zelen with a mathematical proof of it. As far as I know, Zelen has never admitted the point.
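The sample-size objection can be checked with a rough power calculation. The sketch below is a simplification and is not Zelen's actual design, which compared champions with their birth mates. It assumes a one-sided, one-sample test of a 17% baseline against a true rate of 22%, uses the normal approximation, and the function name approx_power is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n, p0=0.17, p1=0.22, alpha=0.05):
    """Approximate chance of rejecting H0: p = p0 when the true rate is p1.

    Assumes a one-sided test and the normal approximation to the binomial.
    """
    norm = NormalDist()
    # Smallest observed proportion that would be declared significant.
    critical = p0 + norm.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Probability of exceeding that proportion if the true rate is p1.
    se_alt = sqrt(p1 * (1 - p1) / n)
    return 1 - norm.cdf((critical - p1) / se_alt)


for n in (100, 200, 303, 2088):
    print(n, round(approx_power(n), 2))
```

Under these assumptions a sample of 100 detects a real 22% effect less than half the time, while the full sample of 2,088 detects it almost always.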

Taking up a corrected version of the Zelen challenge, the Gauquelins deleted a part of their champions group because it would be too difficult to get their control data, and used as many of the remaining champions as they could. Since the local French birth records offices would not always supply the data, the champions group dwindled to 303, but through them, a large control group of 16,756 non-athletes was located. When Zelen analyzed the data, the control group baseline came in almost perfectly, at 16.4%, and the 303 champions incidentally came in with a Mars effect of 21.8%, both as Michel Gauquelin had predicted.

Even the observed Mars effect in the 303 champions subsample was significant at the .02 level. This says that if there really is no Mars effect, and we ran the experiment 100 times, always using 303 champions, we should only observe a value as high as 21.8% in a mere 2 experiments. By a scientific rule of thumb, the investigator is allowed to claim a real effect whenever the "chance probability" of the observed result falls below 5 in 100 experiments. (This arbitrary .05 threshold will appear over and over again here as the "litmus test of truth.")
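To make the meaning of a significance level concrete, the following sketch computes an exact binomial tail probability: the chance of seeing at least the observed number of key-sector champions if the true rate were only the baseline. It assumes a one-sample test against a fixed 16.4% baseline and a count of 66 champions (21.8% of 303, rounded). Zelen's own test compared champions with matched controls, so this sketch is not expected to reproduce the .02 figure exactly.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_champions = 303
baseline = 0.164
# Count reconstructed from the rounded percentage; an assumption.
observed = round(0.218 * n_champions)

p_value = binomial_upper_tail(observed, n_champions, baseline)
print(observed, p_value)
```

A tail probability below .05 is what the rule of thumb treats as evidence of a real effect.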

The Skeptics' Reply to the Gauquelins

Let us be clear about the Gauquelins' unquestionable victory for the Mars effect--among 16,756 ordinary people, Mars was in sectors 1 and 4 for 16.4% of their births, just as expected, while for 2,088 European sports champions it was in these sectors for 21.6% of their births, a difference that is totally outside the realm of mere chance. The control group result also eliminates a number of statistical doubts raised by the skeptical Belgian Para Committee in their attempt to disown their own positive verification of the Mars claim.
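The size of this difference can be gauged with a two-proportion z-test. The sketch below is illustrative only: the counts are reconstructed from the rounded percentages above, and the control group was matched to the 303 subsample rather than to all 2,088 champions, so the comparison is looser than a properly matched design.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """z statistic for the difference between two independent proportions."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (hits_a / n_a - hits_b / n_b) / se


# Counts rebuilt from the rounded percentages in the text; assumptions.
champion_hits = round(0.216 * 2088)
control_hits = round(0.164 * 16756)

z = two_proportion_z(champion_hits, 2088, control_hits, 16756)
# One-sided tail probability under the null hypothesis of equal rates.
p_one_sided = NormalDist().cdf(-z)
print(round(z, 2), p_one_sided)
```

Under these assumptions the z statistic is well above 5, far beyond the usual .05 threshold.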

Nevertheless, this strong evidence in favor of planetary influences de-materialised after Zelen, Kurtz and Abell performed their statistical numerology on the data in the November/December 1977 issue of The Humanist alongside the "good news" report from the Gauquelins. To accomplish this, the trio completely ignored the original sample of 2,088 champions and proceeded to bludgeon the subsample of 303 champions that had merely been used to locate the matched control group of non-athletes.

Their first method was sample-splitting. They divided the 303 champions data into three geographical regions, or alternatively split them into two parts according to sources of the data. The Mars-positive percentages were: 32% (Paris), 21% (France-minus-Paris), 15% (Belgium) and again, 21% (Gauquelin data), and 22% (Para Committee data). Out of these five small sub-sub-samples, only the 32% for Paris was statistically "significant," that is, reliably above the baseline level of 16.4%. This was good news for the skeptical trio since a private line from Mars to Paris should seem absurd to the most starry-eyed believer.

They failed to mention, however, that non-significance in the sub-sub-groups should occur automatically by the reduction in the size of the samples. (Small samples have large fluctuation zones or standard errors; to achieve the .05 level, the observed Mars effect needs to be only 20.3% among five hundred people, but must climb to 26.7% among fifty people.) The absurdity of all this sample-splitting was clearly demonstrated six months prior to the trio's article by Michel Gauquelin who pointedly showed Zelen how to break the Paris data into seven smaller samples to get rid of the Mars effect altogether!
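The dependence of the significance threshold on sample size can be sketched with the normal approximation. The code below assumes a 16.4% baseline and a two-sided .05 level; the exact figures depend on those choices (baseline, one-sided or two-sided test), so it need not match the numbers quoted above to the decimal. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def significance_threshold(n, baseline=0.164, alpha=0.05):
    """Smallest observed proportion that reaches significance in a sample of n.

    Assumes a two-sided test and the normal approximation to the binomial.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sqrt(baseline * (1 - baseline) / n)
    return baseline + z * standard_error


for n in (50, 100, 303, 500, 2088):
    print(n, round(100 * significance_threshold(n), 1))
```

The threshold falls steadily as the sample grows, which is why cutting a sample into small pieces makes non-significance nearly automatic.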

A here-and-there Mars effect, however, led the skeptics to hint darkly about possible flaws in the data collection. With a Mars effect occurring only in Paris, they referred to "possible irregularities," "striking differences," or now to "disparities" in the subsets of data. In his first private analysis of the data, Zelen concluded his memo with the bald statement, "There is not enough information to verify how the sample was drawn," in spite of the fact that Michel Gauquelin had long before sent him three detailed descriptions of the sampling procedures which were entirely straightforward and barred Gauquelin himself from influencing the data.

The claim of disparities among the sub-sub-groups is simply incorrect statistics. The correct method for such a claim is not to compare each group with the 16.4% baseline, but to compare them directly with each other, as any statistics professor can tell you. Although the trio never did this, Eric Tarkington and Dennis Rawlins both did it independently and reported no significant differences among the groups. Therefore there is no basis whatsoever to say that there is a bigger Mars effect in one place (not even Paris) than there is in another place (not even Belgium). The different sub-sub-group percentages could all result normally from a constant Mars effect of 22% in all categories. There are no anomalies. There are no disparities. There is nothing special about Paris. There is no evidence that the Gauquelins' data are in any way unusual, except for the Mars effect itself.
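A direct comparison of groups with each other is usually done with a chi-square test of homogeneity. The sketch below shows the mechanics only. The counts are made up for illustration and are NOT the Gauquelin data, and this is not necessarily the procedure Tarkington or Rawlins used.

```python
from math import exp


def homogeneity_chi_square(groups):
    """Chi-square statistic for equal hit rates across groups.

    groups is a list of (hits, total) pairs. Returns (chi2, degrees of freedom).
    """
    total_hits = sum(hits for hits, _ in groups)
    total_n = sum(n for _, n in groups)
    pooled = total_hits / total_n
    chi2 = 0.0
    for hits, n in groups:
        expected_hits = n * pooled
        expected_misses = n * (1 - pooled)
        chi2 += (hits - expected_hits) ** 2 / expected_hits
        chi2 += ((n - hits) - expected_misses) ** 2 / expected_misses
    return chi2, len(groups) - 1


# Made-up counts (30%, 20%, 15% hit rates); hypothetical, not real data.
illustrative_groups = [(12, 40), (30, 150), (15, 100)]
chi2, dof = homogeneity_chi_square(illustrative_groups)

# With exactly 2 degrees of freedom the chi-square tail is exp(-chi2 / 2).
assert dof == 2
p_value = exp(-chi2 / 2)
print(round(chi2, 2), round(p_value, 3))
```

In this made-up example the rates range from 15% to 30%, yet the test does not reach the .05 level, which illustrates how widely small groups can differ by chance alone.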

When more criticisms mounted after the trio published the 1977 paper, they persisted doggedly. They said it was only proper to explore the effect in "recognizable subsets" of the data (such as that well-known place "France-minus-Paris"???). After all, they said, the Gauquelins themselves had looked at sub-sub-groups. This was technically true, but with a major difference. The Gauquelins' breakdowns were scientifically more rational, such as the one between larger and smaller localities. More importantly, the Gauquelins did not try to run statistical tests on these small samples. Their purpose was only to show that there was at least a sign of the Mars effect in the different categories.

The next error, removal of the females, requires a background fact first. In Zelen's first and private analysis of the 303 champions, Paris was highly significant, France-minus-Paris was "marginally significant" at the .06 level, and Belgium was not significant. Coming into print, the trio noted that there were too few female champions for a proper analysis and dropped all 9 of them from the calculations, along with the control-group women. A small side-effect was that the significance level in France-minus-Paris happened to slip from .06 to .09, now called "not significant," while the whole sample slipped from .02 to .04. (The number 303 should actually read 294 in the preceding paragraphs.)

Of course, the .04 result was also doomed. In the bizarrest maneuver of all, the trio argued that the overall Mars effect now depended merely on the results of a single champion since, if only one more champion had been born outside a key sector, all results would fade into non-significance! Even Randi the magician could hardly match such a comprehensive vanishing trick!

This reasoning violates the basic principles of statistical analysis. A significance test only makes sense when applied to the actual data, not the data upped or downed a few notches to suit the researcher's personal prejudices. The main result did not depend on a single champion, but on all 63 of the key-sector athletes needed to beat the chance prediction of about 51. The trio's illogic invites any number of games with the data. The Gauquelins could ask, but what if there was one more champion born inside a key sector, or for that matter, ten more? And does the result only depend on one champion considering that three out of the nine deleted females were also Mars-positive?

Meanwhile, everybody but Dennis Rawlins had forgotten about the original huge sample of 2,088 champions. In a letter to Paul Kurtz in 1978, Rawlins roughly estimated the significance level in that group, that is, the odds against there being no real effect behind the 22% result, at 1 in 10 million. It was fully four years later in the Winter 1981-82 issue of The Skeptical Inquirer before Abell and Kurtz (sans Zelen) acknowledged that the data from the full 2,088 champions sample "would seem to be statistically significant" and went on to say, "It was not a conscious omission. Indeed, the point was made very clearly in the Gauquelins' companion paper published in the same issue of The Humanist (Nov/Dec 1977)." That sounds better--until you read the Gauquelins' paper and discover that the alleged point isn't there!

The striking feature of all these fallacies was not just their unmooring from the anchors of logic but their unsinkability in four years of competent statistical bombardment by Michel Gauquelin, Elizabeth Scott, Dennis Rawlins and Ray Hyman. Indeed, the trio's reply to Rawlins' "sTarbaby" bomb, where the errors were laid out in blunt English, was to refurbish and re-circulate the whole lot in a paper sent privately by Abell and Kurtz to the Fellows of CSICOP, presumably to ward off a possible stampede of resignations. Remarkably enough--it even worked.

A Scenario for Innocence?

After seven months of investigation which included extensive correspondence with members of CSICOP's Executive Council, I developed a theory which classified the errors--even their indestructibility in the face of criticism--out of the arena of deliberate distortion, and into the theatre of blind prejudice. After receiving a thought-provoking letter from CSICOP Councilor Ray Hyman, I re-studied the documents from the point of view of innocent errors, in which case Rawlins' suggestion of cynical manipulation might go away. This interpretation is, I still believe, correct--as far as it goes.

With the extra help of unpublished background papers and letters, here is what I imagine took place. When the control group data unexpectedly came in at 17%, an exasperated Marvin Zelen immediately turned a skeptical eye back on the 303 subsample of champions and quickly made two erroneous discoveries. Analyzing the champions data a page at a time (which he did) the results bounced around--in the Belgium pages there was no Mars effect at all, while in the Paris pages it was highly significant. (A third set of pages created the arbitrary region "France-minus-Paris.") Without thinking twice about what he was doing, he incorrectly ran significance tests on these subsets against the baseline to confirm his suspicion of "disparities."

Another thing he happened across at this stage was a "striking similarity" between the Mars effect of 21.8% in the sample of 303 champions, and the 21.6% level in the parent group of 2,088 champions. These were obviously too close to be an accident, he thought, and with the versatility of a professional statistician he demonstrated a significant non-difference(!) at the .049 level. This appears in his original memorandum, but was not published in the 1977 paper. It only surfaced in the private 1981 memorandum to CSICOP's Fellows, and of course, as Abell's "improbably good" argument in the July 1982 issue of Psychology Today.

In spite of its professional flourish, this analysis was faulty because Zelen did not take into account the different pairs of data points in the whole set of results. There were probably several ways in which an unusually large or small difference could occur among the Mars sector percentages by chance, any one of which would seem suspicious when looked at by itself. Only if such possible pairs were listed in advance could Zelen make a correct test, in which case his "finding" would have disappeared.
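The multiple-comparisons problem described here can be sketched numerically. The code below assumes six percentages available for pairing (a hypothetical number chosen for illustration) and treats the pairwise tests as independent, which they are not strictly, so the result is only a rough guide.

```python
from math import comb


def family_wise_error(num_comparisons, alpha=0.05):
    """Chance that at least one of several independent tests reaches alpha
    when no real effect exists in any of them."""
    return 1 - (1 - alpha) ** num_comparisons


num_percentages = 6  # hypothetical count of percentages in the results
num_pairs = comb(num_percentages, 2)

print(num_pairs, round(family_wise_error(num_pairs), 3))

# A Bonferroni-style correction divides the threshold by the number of pairs.
corrected_alpha = 0.05 / num_pairs
print(0.049 < corrected_alpha)
```

With that many candidate pairs, a lone result at the .049 level is unremarkable, and it would not survive a correction even if only two pairs had been eligible.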

The illusion that the Mars effect occurs only in Paris and that two sub-samples had a "striking similarity" convinced Zelen that the Gauquelins had somehow produced a biased sample. He was immediately so suspicious that he closed his memorandum with his unjustified complaint on the lack of sampling information from the Gauquelins.

At this point a process of subjective validation took over which I have outlined in The Psychology of the Psychic (Marks and Kammann, 1980) to account for the persistence of false beliefs in the face of contrary evidence. The model says that once a belief or expectation is found, especially one that resolves uncomfortable uncertainty, it biases the observer to notice new information that confirms the belief, and to discount evidence to the contrary. This self-perpetuating mechanism consolidates the original error and builds up an overconfidence in which the arguments of opponents are seen as too fragmentary to undo the adopted belief.

As soon as Zelen's suspicion about a biased sample was shared among the trio, it was quickly noted that the Belgium data, where there was no Mars result, was the only subset not collected by the Gauquelins; furthermore, Paul Kurtz then remembered some anomalies that he had glossed over as unimportant when he had previously spot-checked the Gauquelins' records in France. With all of such factors pointing so "clearly" to untrustworthy data, the answer was apparently in hand, even if not provable or fully publishable.

Partly because the minor pieces of this argument were not published, including the "striking similarity" effect, critics like Dennis Rawlins, Elizabeth Scott and Ray Hyman were seen as nitpicking and lacking the big picture. (They were also not offering any good alternative to demolish the so-called Mars effect.) For example, when it was later shown that the differences among the three regions were not significant, the trio would not be impressed--such a test cannot prove that there are no differences either, and the (imagined) negative result from Belgium was felt to be too "revealing" to be given up.

But It Doesn't Cover the Territory

Although the evidence seems strong to me that the preceding scenario correctly describes how the trio jumped into the stew, it does not cover several other errors. For these I had to construct a separate mini-scenario for each case. Although I was not at all confident about these, it seemed worthwhile to pursue an emerging innocence theory as far as it would go.

In this direction, the trio's removal of the females might have been, as they eventually claimed, the simplest way to balance the sexes between the control group and the 303 champions group. (This also required them to drop the female half of the control group, but it was already so large that this had no effect.)

The nonsensical one-single-champion argument was possibly a didactic device to explain to nonstatistical readers how really weak is the evidence for a Mars effect conveyed by a .04 significance level.

The trio overlooked the massive significance in the full 2,088 champions sample as a result of Zelen's pre-occupation with a perfect statistical design--in a technical sense, only the 303 champions subsample was precisely matched with the non-champions control group.

Why then did Abell and Kurtz later claim that this "2,088 error" had been separately covered by the Gauquelins when it had not been? Perhaps they were under so much pressure from "sTarbaby" they rushed into print with a statement based on memory rather than re-reading. But this seems odd because that argument is the centerpiece of their entire one-page public defense against "sTarbaby."

How viable, then, is a comprehensive innocence theory? While each mini-scenario is plausible enough in isolation, their summation with the main scenario is hardly reassuring and, to be accepted, requires a conclusion of pervasive ineptitude. Unfortunately, the trio have never bothered to comment one way or the other. But perhaps the only important scientific point is that the errors exist and are still being repeated.

What About the Mars Effect?

The bottom line is that an apology is owed the Gauquelins for the mis-treatment of their data, and the aspersions cast on their authenticity. I don't wish to convey that I'm a believer, because I also have skeptical reservations about the Mars effect. What makes this claim suspect is the scientific perversity of the proposition that the location of Mars in the sky at the time a person is born has some effect on that person's athletic performance 30 or 40 years later. It has been repeatedly noted that the natural forces that emanate from Mars, such things as gravitational pull, electromagnetic radiation, and so on, are infinitesimally tiny and must be effectively zero when compared with such familiar earth-bound objects as hills, buildings and even furniture.

The sector locations of Mars do not even reflect its distance from earth, nor are the key sectors simply above and below the horizon in some sensible way that could produce an earth-shadow effect. Even worse, the Gauquelins believe the effect does not appear in merely good athletes, but only shows up in top-top champions. There is no precedent in biology or psychology for a performance factor, such as motivation or temperament, to become suddenly operative when skill reaches an extraordinary level.

Nevertheless, the Mars effect has been once replicated by the skeptics of the Belgian Para Committee (whose gyrations in disclaiming it would make another interesting case study) and once not replicated on U.S. champions by Rawlins, Kurtz, Abell and Zelen. It has, therefore, a residual prima facie case as a valid scientific anomaly. But let us be clear--nobody has produced more massive evidence against the claims of traditional astrology than Michel Gauquelin, himself (e.g., in The Skeptical Inquirer, Spring 1982, pp. 57-65).

In the long run, the skeptics' worthy battle against superstition and pseudoscience must rely on trustworthy evidence, rational self-correcting debate, and their ability to supply normal explanations for paranormal claims. It is not the belief in astrology, ESP, UFOs or mythical monsters that is the real problem of our times, but the possibility that rationality itself will be submerged under social dis-locations arising from the economic and technological pressures that interfere with our collective will to live out our human potential in peace and without poverty. Thus, I believe that the method of debate is more important to the advancement of rationality than the specific debunking of popular fantasies. I object to Abell's interpretation of the Mars effect, not because we differ widely in the conclusion, but because we differ in the mode of argument.

None of which is to retract my general agreement with the goals of CSICOP or my admiration for the many enlightening articles that regularly appear in its journal, The Skeptical Inquirer. Knowing as I do how unpopular the skeptic can be in a society of believers, I appreciate the skeptics' need for social support by like-minded thinkers, and the urge to produce a definitive rebuttal to every paranormal claim. Oversimplification and errors are inevitable not only in debunking, but in all exercises in rationality and science, and no harm is done as long as they are amenable to open debate and correction. Unfortunately, CSICOP, as the self-avowed champion of rationality, refuses to admit publicly that the errors occurred and refuses to take any action to stop their endless repetition by Abell in his public attack on the Gauquelins' reputations.

Footnote

  1. The sources for this paper are available in the unpublished manuscript "Statistical Numerology in the Skeptics' Response to the Mars Effect" by R. Kammann, available from CSICOP, 1203 Kensington Ave., Buffalo, NY 14215. An excellent review of the larger controversy which also considers the disputes arising from the later Mars test on U.S. champions is found in Patrick Curry's "Research on the Mars Effect," Zetetic Scholar, 1982, #9, 34-53, along with 9 additional commentaries on pages 54-83. The author's subjective validation concept is presented in chapters 11-13 of The Psychology of the Psychic by D. Marks and R. Kammann (Buffalo: Prometheus Books, 1980). Thanks are due to Philip Klass, Paul Kurtz, George Abell and Ray Hyman for their patience in responding to my correspondence and to Dennis Rawlins, Marcello Truzzi, Piet Hein Hoebens, and Michel Gauquelin for supplying background documents.

(Part II)

Recap of Part 1: Faced with unfaultable evidence of a connection between the position of planet Mars at birth and success in sports, skeptical Professors Paul Kurtz, George Abell and Marvin Zelen repeatedly offered fallacious statistics to deny astrology's only ray of hope. Focusing only on a small section of the Mars data, deleting the favorable results for females, dividing the sub-sample into tiny bits and applying the wrong statistical tests, the trio still could not get rid of the Mars effect. They ultimately argued that it was based on faulty data, due either to incompetence or cheating by Michel Gauquelin of France, who produced the original finding.

"Our society has opted for a complete free-for-all of conflicting theories. But if it is this chaotic, who will ensure that there is law and order? Who will guard the truth? The answer is: CSICOP will!" With these words, Douglas R. Hofstadter began his glowing account of the Committee for the Scientific Investigation of Claims of the Paranormal in the February 1982 issue of Scientific American. In Hofstadter's account, CSICOP is a small and heroic band of nonsense fighters providing a steady buoy of rationality in a vast sea of public superstition.

It is remarkable that Hofstadter's piece appeared four months after Dennis Rawlins published "sTarbaby" in the October 1981 issue of Fate. In Rawlins' detailed account, the Council of CSICOP had covered up a scandalous demonstration of irrationality by three of their most prominent members, including the Chairman, Paul Kurtz. If Hofstadter did not know about "sTarbaby" then--proof of the pudding--CSICOP hadn't told him, but it is more likely he accepted the CSICOP party line that Rawlins was just a raving malcontent.

On the surface, this is plausible. The trouble with "sTarbaby" on first reading is that the case is too strong, and the cover-up too deep to be entirely believable. Like the other Fellows of CSICOP, I couldn't accept that Dennis Rawlins was the single honest and correct person on a nine-man Council consisting of men of such stature and reputation as Martin Gardner (whose mathematical games column in Scientific American had just been taken over by Hofstadter), Professor Ray Hyman, the Amazing Randi and Kendrick Frazier. In fact, Rawlins seemed to grasp at straws to include these bystanders in the conspiracy plot. It seemed more likely that Rawlins had let his anger get out of control and was seeing connivance in the most innocent remarks. This attitude might then explain why his analysis of the Mars effect had been ignored, and why he was eventually voted off the Council and out of CSICOP. Undoubtedly Rawlins was making a mountain out of a molehill.

After seven months of research, I have come to the opposite conclusion. CSICOP has no good defense of the trio's Mars fiasco and has progressively trapped itself, degree by irreversible degree, into an anti-Rawlins propaganda campaign, into suppression of his evidence, and into stonewalling against other critics. In short, progressively stuck on the trio's tarbaby.

It is now vital to understand where CSICOP lost its bearings and how far off course it has drifted. I shall briefly review the essential events leading up to "sTarbaby" as I have confirmed them since, after which I present my personal experience of Council's modus operandi in the months following.

How Did the Trio Go Wrong?

Michel Gauquelin had already run into one group of irrational skeptics in the Belgian Para Committee who, upon unexpectedly confirming the Mars effect, dismissed their results. They fastened on the fact that babies are not born equally often during the 24 hours of the day and supposed that this could produce a spurious Mars effect. In effect, they suggested that everybody, not just sports champions, has a Mars effect.

Dennis Rawlins, then on the Council of CSICOP and its astrology subcommittee, checked this argument out mathematically and found it to be irrelevant. Nevertheless, Zelen, Kurtz and Abell grabbed the Belgian theory and publicly challenged Gauquelin to produce a control group of nonchampions. Michel and Francoise Gauquelin promptly accepted this "definitive test" as the trio called it and, as Rawlins predicted, won hands down. There was no Mars effect for ordinary people.

George Abell sensibly wrote Paul Kurtz saying the Gauquelins had won that round, and he suggested getting on with the new test on American athletes. Rawlins used this "smoking gun" letter as proof that the trio knew the true situation right from the start, but the case is not strong. Abell specifically asks in the letter what Zelen saw in the data. Meanwhile, as I described in Part 1, Zelen fancied he saw two anomalies in the data that suggested a biased sample. In my "subjective validation" scenario, Zelen's erroneous statistics became the starting point for the trio's private belief that the Gauquelins had probably cheated. By the time the paper got to print, Zelen's skeptical approach had replaced Abell's; although the trio did not openly accuse the Gauquelins of fraud, they smothered the victory under a blanket of bogus side issues, partly achieved by deleting the favorable Mars results for female champions.

Against an "innocent goofs" theory, the trio was warned before publication that their statistics were wrong, once by Michel Gauquelin and once by Elizabeth Scott, Professor of Statistics at the University of California, Berkeley. (Rawlins was not consulted.) Even worse, after the paper came out, neither Scott nor Gauquelin could get space in The Humanist for a reply.

How Did CSICOP Go Wrong?

After Rawlins read the trio's 1977 paper, he set out with documented good will to educate Chairman Kurtz in statistical reasoning. This seemed to go well during 1978, especially when Kurtz called upon Rawlins to analyze the data for the American sports champions. Rawlins had every reason to believe that the Zelen-Kurtz-Abell errors would fade into history.

He was in for a shock. After Rawlins completed all the computer runs on the U.S. data (no Mars effect), Kurtz announced the trio would present Rawlins' results in a major CSICOP press conference in December 1978. When Kurtz refused to budge on this, Rawlins appealed to the other Councilors for help. It was at this crucial point that Council took its first and fatal wrong turn and embarked on a course they could not subsequently reverse. They ignored Rawlins' complaints. To avoid an embarrassing public split, Rawlins was promised a debate with Zelen and Abell in front of Council. After Rawlins did not blow the whistle with the press, this debate evaporated. When he finally got the floor in Council meeting, he met a wall of resistance.

We can only guess the thoughts of the Councilors. If the idea had been planted that Rawlins was jealous over getting a back seat at the press conference, his anger was explained, but only if Councilors missed the merits of his case. Alternatively, perhaps Paul Kurtz was so indispensable as the group's leader that a reprimand was unthinkable. ("sTarbaby" focuses on Council's obsession with its public relations image.)

After all this flak, it would be only reasonable to expect that Council would at least stop any repetition of the trio's past errors, but just the opposite occurred. The Zelen-Kurtz-Abell analysis was re-published a year later in CSICOP's own journal, The Skeptical Inquirer, overriding severe criticisms by one referee, statistician and Councilor Ray Hyman, which confirmed Rawlins. Rawlins was to cover only the technical aspects of the U.S. test, and claimed in "sTarbaby" that editor Ken Frazier censored his full protest of the trio's errors. Meanwhile, he sent a memorandum out to most of the Fellows but to no useful effect.

A year later, Rawlins was voted off the Council and soon after was quietly dropped from the list of Fellows. "sTarbaby" was his reply.

Council's Response to "sTarbaby"

After reading the Rawlins expose in Fate magazine, I was only sure of one thing--if any part of his story were true, I could count on Gardner, Hyman, Randi and Frazier to set the record straight. It was a very long time before I gave up that belief.

CSICOP's first reply was to circulate some photocopied old letters to show (ad hominem) that Rawlins was a habitual troublemaker. Two months later, CSICOP mailed out two privately authored white papers, without taking an official stance.

In "The Status of the Mars Effect," Abell, Kurtz and Zelen simply re-hashed all the statistical errors that Rawlins (Gauquelin, Scott, Hyman, Tarkington) had protested. I did not see this, however, until I had spent hours analyzing four years of published statistics--the errors were even worse than Rawlins had stated, but most Fellows would never learn this.

"Crybaby" was written by Councilor Philip Klass. Although it offered to refute the cover-up charge, it ignored practically every specific point that Rawlins had made. Instead it offered a blatant ad hominem attack on Rawlins' motives and personality, bolstered with rhetorical ploys--including crude mis-quotation.

Believing that a full understanding would still get this fiasco straightened out, I sent in a 28-page report called "Personal Assessment of the Mars Controversy." I came to three conclusions: (a) the scientific errors were gross, (b) Paul Kurtz was not guilty of a cover-up on grounds of lack of statistical understanding, (c) CSICOP was guilty of a cover-up by not taking Rawlins seriously, while "Crybaby" was a disgrace.

This report went to Council in December 1981, underlined by my resignation as a Fellow, and my request that it be circulated to all the Fellows. This was not done. Two months later it was casually described as a "lengthy letter" from me along with other routine news in a general CSICOP bulletin. The ho-hum context was so effective I yawned myself.

As my report was going in, the next issue of The Skeptical Inquirer (Winter 1981-82) was coming out with one good move and two bad ones. The good move was to give Rawlins space for a completely uncensored final article, which Rawlins unfortunately wasted on an unreadable script.

The first bad move was a boxed one-page Statement signed by the nine members of the Council, with George Abell now in Rawlins' vacant seat. It asserted starkly that there was nothing to hide and no cover-up. Without giving any useful evidence, it declared that Rawlins' entire jam-packed article in the same issue "contains many demonstrably false and defamatory claims." (Name them!) It referred to all of Rawlins' "assertions and innuendoes" as being based on "half-truth and distortion." (The scientific errors alone disprove this claim.) Worst of all, Council offered for sale the hopeless "Status" and "Crybaby" papers, now officially endorsed by CSICOP. In a flurry of group-think, the whole Council lunged at the tarbaby.

On the facing page, George Abell and Paul Kurtz (sans Zelen) acknowledged only one of the major science errors (while repeating another one). They now claimed their failure to analyze the full sample of 2088 champions, rather than the small sub-sample of 303 champions, was merely an oversight and blandly understated that the total Mars effect "would seem to be statistically significant." With unconscionable bravado they falsely declared that the Gauquelins had already covered this point in their 1977 companion paper.

I still doggedly believed that the Trustworthy Four would come forth as soon as "Personal Assessment" had been digested, but the response over several months was dead silence. I racked my brain for an explanation--was my analysis wrong, had I spoken too harshly, did they still not understand?

Meanwhile, Paul Kurtz and Councilor Philip Klass each sent me a long letter which I naively took to be personal correspondence. (Kurtz' letter was marked CONFIDENTIAL.) Although I eventually disputed both letters, especially the nonsense by Klass, I only learned much later that both letters had been distributed to counter "Personal Assessment" for the few members who had asked to see it (my "lengthy letter"). Kurtz also refused my request to send his letter to Rawlins. The control on information was pervasive.

The Klass letter started a long and exasperating exchange in which he talked about everything but the statistical errors and the real cover-up. He kept me busy for a while answering irrelevant questions, while periodically attacking my objectivity, intelligence or integrity. From time to time, he threatened to expose my cover-up of scientific evidence he imagined he had uncovered. After he regularly ignored all my serious answers and questions, I nicknamed him T.B. Diago--the best defense is a good offense. He eventually fell back on the traditional Council stance--he didn't understand statistics.

Around March, Zetetic Scholar featured a review of Mars and CSICOP with a lead article by Patrick Curry who not only agreed with Rawlins and me about the Zelen test fiasco, but presented a good case for more bungling in the U.S. champions test. But Council had already adopted the line that ZS editor Marcello Truzzi was on a "vendetta" kick. Ad hominem be thy name.

Fleeting Rays of Sunshine

Still getting no response from the 4 stony faces on CSICOP's Mt. Rushmore, I submitted a completely new paper fully documenting all the scientific errors with sources and omitting all charges of a cover-up by Council. Called "Statistical Numerology in the Skeptics' Response to the Mars Effect," and strictly limited to a small circle of addressees, this paper finally got some results.

George Abell produced 71 pages of explanations and apologies, accepting "Numerology" with two minor disclaimers (both wrong). Ray Hyman concurred on the errors but saw them as ordinary slip-ups in the process of science. Many scientists, he argued, try to publish nonsense but are blocked by a strong system of peer reviews and editorial control. Of course, there were no such controls for The Humanist or The Skeptical Inquirer, especially since Paul Kurtz had ultimate control on both. Ken Frazier agreed that a shorter and softer version of "Numerology" could be published in The Skeptical Inquirer but emphasized that nobody was interested in this dull old topic.

My faith in the goodness of CSICOP now flowering, I set to work on a readable third version of the paper. With Hyman's case for ordinary human errors humming in my head, I hit upon the subjective validation scenario for some of the errors (see Part 1) and even convinced myself that the whole cover-up was merely selective perception by Rawlins. The happy ending was in full sight.

The glow didn't last long. Frazier cabled that the editorial board was split and that I should shorten the paper severely. Meanwhile, my innocence theory was cracking under the strain to cover all the errors, and I sensed that no version I could write would be acceptable. Strong letters from Martin Gardner and Philip Klass now defined the situation as "resolved" by the Abell and Hyman letters. I was now exhausted and feeling the pressure to pronounce the benediction.

A reply to Hyman from parapsychologist R.A. McConnell said, "Nonsense. What we are talking about is elementary statistics--Abell and Zelen's specialty--and a third professor who is enhancing his status by lending his name in a field in which he presumably has no competence whatsoever. Of course, I'll buy your claim of no conscious dishonesty. Neither was it a chance occurrence. Unexamined dishonesty is rampant in this world. I don't see how you can excuse scientists' publicly trading upon their professional reputations when they are not willing to exert self-discipline." I tried to ignore McConnell, but that phrase "unexamined dishonesty" kept haunting me. The happy ending was slipping from my grasp.

My new revision made a desperate attempt--to the point of bias--to classify the whole fiasco as a set of silly mistakes. But now the trio looked so pathetically inept that I foresaw another wall of resistance and resentment. Just as I sent it in, I learned that Abell was working on his own version, a confessional piece to be cosigned by Paul Kurtz if not by the now hibernating Marvin Zelen. Aha, this is an even better ending--I withdrew my paper from The Skeptical Inquirer.

But this happy ending didn't occur either. Abell now echoed Frazier that nobody was interested in this topic any more. (Given the massive and malicious attack on Rawlins and the massive information control afterwards, this was obnoxiously true.) He now hinted that a short note in Marcello Truzzi's Zetetic Scholar might be sufficient. Next, his first draft of the piece actually repeated more errors than it corrected and continued the innuendoes about the Gauquelins' honesty. There was no more talk of even Kurtz co-signing; and, soon after, Abell said he was too busy with other work to get to the Mars paper very soon. Numbly I urged him not to let it fade away.

I did not know, of course, that Abell was at the same moment hitting the streets with a new round of the old arguments in the July 1982 issue of Psychology Today. He even worked in a new red herring--that only Gauquelin's own data showed the Mars Effect, thus dropping the Para Committee's replication. Still fixated on the final chorus of joy, I desperately tried to dismiss this new folly as a hapless hold-over from the bad old days. At least some errors had disappeared. But this denial mechanism couldn't hold. Abell had now launched the trio's fourth round of slurs against Gauquelin's integrity in five years, in spite of a relentless barrage of strong criticism against the statistical nonsense being used. This censure now included "sTarbaby" in October, "Personal Assessment" in December, Patrick Curry's paper in March, and "Numerology" in April, none of which deflected Abell in July. Thus CSICOP had chosen for its inner circle a habitually erroneous skeptic to replace Dennis Rawlins, whose competence and integrity had proved to be exemplary in the Mars debate.

When the whole record is examined over five years, there is almost no instance in which merit wins out over self-serving bias. The one clear exception was providing Rawlins a carte blanche space in The Skeptical Inquirer, and even this was undermined by a flurry of simultaneous misstatements. Not only is the trio, in spite of all private admissions, publicly unstoppable, but Council backs them every inch of the way and gives Paul Kurtz almost total control over CSICOP's information flow. If the Fellows and Scientific Consultants of CSICOP do not put a stop to this, who do they think will?


Postscript

Subsequent to the publication of the above article, as noted in my remarks at the very beginning, the article by Abell which Kammann mentions in the last section was--at long last--published. Kurtz and Zelen gave their approval and lent their names as co-authors as well. After a few letters--including one from Gauquelin--in the Fall 1983 Skeptical Inquirer, nothing more was published on the "Mars Effect" in the journal's pages for nearly a decade, until Suitbert Ertel's reanalysis of the CSICOP U.S. test was published in the Winter 1992-93 issue. Ertel's method of grading athletic eminence via citation counts in a set of sports encyclopedias showed a "Mars Effect" in CSICOP's own data.

The French skeptics organization (CFEPP) began a "Mars Effect" replication attempt in the early 1980's, completing their data collection by 1986. Although Prometheus Books has been advertising the publication of a book reporting the results for nearly a year as of this writing, problems with the manuscript have delayed publication until February 1996.

Another analysis of the CFEPP's data is to be published in March 1996, in Suitbert Ertel and Kenneth Irving's book, The Tenacious Mars Effect (London: The Urania Trust), to which I contributed the foreword and an abbreviated version of my work-in-progress chronology of events and publications relating to the "Mars effect." The full version may be found in rich text format (RTF, Interchange Format) on my web site, at http://www.discord.org/~lippard/mars-effect-chron.rtf. This file is currently 221K in length, and I still have hundreds of pages of documents to catalog.

CSICOP has continued to demonstrate that it has problems dealing with internal criticism, most recently with its mishandling of reports of unattributed copying in the work of CSICOP Fellow Robert Baker. -- Jim Lippard, 16 December 1995