Should be required reading.
http://neptune.spacebears.com/cars/stories/margins.html
When looking at the fluctuation in the numbers over the course of our oil study, it's natural to ponder the margin of error for the tests. Because even though we suspend disbelief and accept that the testing found exactly 2,930 mg/kg of calcium, deep down inside we know that no test is perfect. At best it is a very, very good approximation.
So just how good is the approximation? Most people tend to think of this as a margin of error expressed as a percentage. Well, sometimes that's true, but in the world of science it's seldom that easy. What we call margin of error, science geeks call precision, and they measure it in terms of repeatability and reproducibility:
Repeatability
* Same operator
* Same laboratory
* Same equipment
* Same conditions
* Identical sample
Reproducibility
* Different operator
* Different laboratory
* Equivalent equipment
* Equivalent conditions
* Identical sample
In a nutshell, repeatability is when you make back-to-back passes at a dragstrip; reproducibility is when you let your buddy borrow your car and he races it at a different dragstrip. Based on the standards we've been reading, the margin of error for reproducibility is two to four times larger than for repeatability. Sometimes this margin of error is expressed as a percentage, but not always. With any margin of error statement, check the unit in which the error is stated (percentage, parts per million, or whatever it may be).
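For anyone who wants to play with the idea, here is a rough sketch in Python of checking two results against a precision limit that may be stated either as a percentage or in absolute units. Everything in it is an assumption made for illustration: the names PrecisionLimit and within_limit are made up, the 5% and 15% limits are hypothetical (chosen only so that reproducibility is three times looser than repeatability), and the second calcium reading is invented. None of it comes from an actual test standard or from the study's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrecisionLimit:
    """A precision limit and the unit it is stated in (hypothetical helper)."""
    value: float
    unit: str  # "percent" (of the mean result) or "absolute" (same unit as the result)

    def allowed_difference(self, mean_result: float) -> float:
        # Convert the limit to the same unit as the results being compared.
        if self.unit == "percent":
            return abs(mean_result) * self.value / 100.0
        if self.unit == "absolute":
            return self.value
        raise ValueError(f"unknown unit: {self.unit!r}")


def within_limit(first: float, second: float, limit: PrecisionLimit) -> bool:
    """Return True if two results differ by no more than the limit allows."""
    mean_result = (first + second) / 2.0
    return abs(first - second) <= limit.allowed_difference(mean_result)


# Hypothetical limits, NOT taken from any real standard.
repeatability = PrecisionLimit(5.0, "percent")
reproducibility = PrecisionLimit(15.0, "percent")  # three times looser

# 2,930 mg/kg is the calcium figure quoted in the article;
# the second reading is invented for the example.
first_reading = 2930.0
second_reading = 3200.0

print(within_limit(first_reading, second_reading, repeatability))
print(within_limit(first_reading, second_reading, reproducibility))
```

With these made-up numbers the two readings differ by 270 mg/kg, which fails the tighter repeatability limit but passes the looser reproducibility limit. The point of carrying the unit around with the limit is the same as the warning above: a "5" means something very different as a percentage than as parts per million.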
Now, a complicating factor is that when we look at oil degradation trends, we're not actually using an identical sample as required by both measures of precision: each test point is a new sample (to continue the analogy, your buddy is using a car nominally identical to yours, rather than actually using your car). We have no basis at all to expect precision within the repeatability range over the course of this study, so we will instead hope to see precision within the reproducibility range.
To get an idea of the real-world reproducibility of oil analysis, we subjected Blackstone Laboratories to a blind test. When we drew the 6,000-mile sample, we drew a second sample immediately following the first. We submitted the second sample by way of accomplice Les Carnes, who -- lucky dog -- "owns" a 2003 Corvette made of vaporware. Les sent in the sample for us and forwarded the results. Blackstone was therefore "blind" to the true origins of the sample, ruling out any potential bias at the lab. The results were quite favorable, well within reproducibility ranges for most tests.
It's worth noting that this is not a perfect reproducibility test. Though the oil samples were very similar, there is no way to ensure that they were identical. The second sample came out of the pan several minutes after the first, giving the oil more time to settle and cool. The second sample also had to travel to Texas before going to the lab; in all, there was a six-day gap between the two tests, a point mostly of importance to TBN, which can sometimes age unpredictably. And then there is simply the random distribution of particulates: the oil cannot be assumed to be a uniform mixture.