The tool works by comparing the baseline data, such as the height, sex, weight and blood pressure of trial participants, to known distributions of these variables in a random sample of the populations.
If the baseline data differs significantly from expectation, this could be a sign of errors or data tampering on the part of the researcher, since if datasets have been fabricated they are unlikely to have the right pattern of random variation. In the case of Japanese scientist, Yoshitaka Fuji, the detection of such anomalies triggered an investigation that concluded more than 100 of his papers had been entirely fabricated.
Klein said that in some cases the anomalies could be put down to “misinterpretation, statistical error or plain simple mistakes”, such as transcribing data wrongly or mislabelling a column.
The latest study identified 90 trials that had skewed baseline statistics, 43 of which with measurements that had about a one in a quadrillion probability of occurring by chance.