Saturday, 30 April 2011

The devil is inaccurate


Charles Williams always insisted on what he termed accuracy - a trait essential to an editor of the Oxford University Press; but more than this, C.W. regarded inaccuracy as a sin: characteristic of evil.

And he was right!


What is accuracy? The main components are validity and precision.

Validity means that a measurement is truly representative of what it claims to measure.

Precision refers to the statistical exactitude of a measurement.

So, if we were measuring the average height of the adult English population, the estimate might be valid but not precise: valid if the sample was 1000 randomly chosen subjects (because a random sample is representative of the whole population), but not precise if the scale was only segmented in metre lengths - because the measure would only be to the nearest metre.

A measurement using a one millimetre scale applied to a non-random sample (e.g. the first 1000 people you found in the telephone directory or met in the street, or 1000 women but no men) would be 1000 times more precise (because measured in millimetres not metres) but would not be valid, because the sample would not be representative of the English population.
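
The second case can be sketched as a small simulation. This is only an illustration under invented assumptions: the height figures for men and women are hypothetical round numbers, not survey data, and the population is simulated. It compares a random sample of 1000 with a sample of 1000 women and no men, both read on the same one millimetre scale, and reports how far each sample average falls from the population average.

```python
import random
import statistics

# Hypothetical parameters, chosen only for illustration (not real survey data).
MEN_MEAN_MM, MEN_SD_MM = 1750, 70
WOMEN_MEAN_MM, WOMEN_SD_MM = 1620, 65

rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated population: (sex, height read to the nearest millimetre).
population = (
    [("M", round(rng.gauss(MEN_MEAN_MM, MEN_SD_MM))) for _ in range(50_000)]
    + [("F", round(rng.gauss(WOMEN_MEAN_MM, WOMEN_SD_MM))) for _ in range(50_000)]
)
true_mean = statistics.mean(h for _, h in population)

# Valid sample: 1000 people drawn at random from the whole population.
random_sample = [h for _, h in rng.sample(population, 1000)]

# Non-valid sample: 1000 women and no men, read on the same millimetre scale.
women = [h for sex, h in population if sex == "F"]
women_sample = rng.sample(women, 1000)

# Distance of each sample average from the population average, in millimetres.
random_error = abs(statistics.mean(random_sample) - true_mean)
women_error = abs(statistics.mean(women_sample) - true_mean)

print(f"population mean:     {true_mean:.1f} mm")
print(f"random sample error: {random_error:.1f} mm")
print(f"women-only error:    {women_error:.1f} mm")
```

Under these assumptions both samples are read to the millimetre, yet the women-only average is off by far more than a millimetre: the fineness of the scale does nothing to correct a sample that does not represent the population.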


As a professional epidemiologist I fought a constant, losing, battle to emphasize that validity is more important than precision.

e.g. and references.

It is more accurate to have imprecise but valid knowledge than precise but non-valid knowledge - yet precisely measured garbage is the material of modern science, administration and politics.


This applies to everything in life - it is always infinitely better to be approximately right than precisely wrong.

The sin of inaccuracy is in claiming or assuming that precision somehow compensates for invalidity, or that greater precision somehow renders validity irrelevant.

This sin is endemic in modern administration - large, complex, quantitative databases are regarded as both essential and sufficient for policy - despite the fact that the information in such databases is always invalid.

Always invalid because the process of data collection is not-even-trying to be valid - the data collection is indeed part of the policy, designed to support policy and not trying to understand the world.


I once termed this system Infostat -


So accuracy properly implies maximum validity as an iron rule, and precision only as an optional aspiration.

And this is not a technical, methodological point: it is a moral imperative.