Science is a method for uncovering knowledge (i.e. the scientific method). It requires postulation of a testable outcome (a hypothesis) that is put to trial either by experimentation or observation. A scientific experiment or analysis can only support or reject a given hypothesis. It can’t by itself tell you what is really going on. After many experiments or analyses fail to reject a hypothesis, it is elevated to the level of a theory, which must continue to be tested. Hypotheses that can explain all existing observations and withstand the test of time become accepted as scientific principles or laws (e.g. thermodynamics or evolution). This doesn’t mean they are correct, just that they suffice to explain what we see or think we know. The point is that science is a specific way to answer questions. You can’t, for example, postulate that oil brand A is better than oil brand B at protecting an engine and prove that hypothesis using the scientific method. But you can use science to test whether there is a statistically significant difference in protection between brands A and B by postulating that there is no difference between the oils. This is what is called the null hypothesis. If the data show that the null hypothesis cannot explain the results, then the null hypothesis is rejected and alternative hypotheses must be considered (brand A is better than B or vice versa).
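To make the null hypothesis idea concrete, here is a minimal sketch of one way to test "no difference between the oils": a permutation test on the difference in mean wear metal readings. The function name, the iron readings, and the choice of a permutation test (rather than, say, a t-test) are all assumptions made for illustration; none of them come from real UOA data.

```python
import random
from statistics import mean

def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test of the null hypothesis 'no difference in means'.

    Returns an estimated p-value: the share of random relabelings whose
    mean difference is at least as large as the one actually observed.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the brand labels are interchangeable,
        # so shuffle them and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)

# Hypothetical iron readings in ppm (made-up numbers, not real UOA results).
brand_a = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0]
brand_b = [13.0, 14.0, 12.0, 15.0, 14.0, 15.0]

p_value = permutation_test(brand_a, brand_b)
print(f"p-value: {p_value:.3f}")
```

A small p-value (conventionally below 0.05) would be grounds to reject the null hypothesis and start considering the alternatives. A large one only means the data failed to reject it, not that the oils are proven equal.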
As 3MP correctly pointed out, there are 2 ways to answer oil performance questions scientifically using UOA data. The first is to conduct controlled experiments where all variables are the same except for the brand of oil. As several have mentioned, this is almost impossible to do in the real world. But here is how it could be done: Several identical cars would need to be driven under very similar conditions using 2 different oils. Analysis would be performed at identical intervals. The actual number of identical cars needed for a valid set of data would have to be determined in a pilot experiment that would measure the variance in wear metal accumulation between vehicles using the same oil brand. The greater the variation between engines (due to manufacturing processes), the more cars would be needed for a valid scientific study. Statisticians do this determination all the time in what are called training studies. All of the cars would then need to start the final study simultaneously to control for variables such as temperature and road conditions, and be driven under a carefully prescribed mixture of city and highway conditions. There is going to be variation in wear rates, as determined by oil analysis, within the brand A and brand B data sets due to different engines and different drivers. But by postulating that there is no difference between the mean values for wear metals between the brand A and brand B data sets, statistical analysis will either support or reject the hypothesis.
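The link between pilot-study variance and the number of cars can be sketched with the standard normal-approximation formula for comparing two means, n = 2 * ((z_alpha + z_beta) * sigma / delta)^2 per group. Everything specific here is an assumption for illustration: the function name `cars_per_oil`, the pilot readings, the 5% significance level, the 80% power target, and the smallest difference worth detecting. A real study design would settle those choices with a statistician.

```python
import math
from statistics import NormalDist, stdev

def cars_per_oil(pilot_values, min_difference, alpha=0.05, power=0.80):
    """Approximate number of cars needed per oil brand.

    pilot_values:   wear metal readings from pilot cars on the SAME oil
    min_difference: smallest difference in mean wear worth detecting
    Uses the normal approximation for a two-sided, two-sample comparison.
    """
    if min_difference <= 0:
        raise ValueError("min_difference must be positive")
    sigma = stdev(pilot_values)                    # car-to-car variation
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    n = 2 * ((z_alpha + z_beta) * sigma / min_difference) ** 2
    return math.ceil(n)

# Hypothetical pilot readings in ppm of iron (made-up numbers).
consistent_engines = [10.0, 11.0, 10.0, 11.0, 10.0, 11.0]
variable_engines = [5.0, 15.0, 8.0, 14.0, 6.0, 12.0]

print(cars_per_oil(consistent_engines, min_difference=2.0))
print(cars_per_oil(variable_engines, min_difference=2.0))
```

The second call asks for many more cars than the first, which is the point made above: the more the engines vary on the same oil, the larger the fleet needed to see a real difference between brands.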
The second way to use science to extract useful information on brand performance is to collect a very large amount of uncontrolled UOA data (different engines, different sample intervals, etc.) and run what is known as a meta-analysis. There are statistical techniques that will look for trends in this type of data. In general, they work by sorting out all possible variables and comparing each against all other variables. To be valid, a very large number of samples is required. I have actually used one of these techniques on a set of 100 UOA results and was unable to find any correlation between oil brand and wear metal accumulation rates. The only correlation that was statistically significant was between miles on the oil and wear metal accumulation. But this was not a scientific study because the sample size was too small. Again, the sample size necessary to see a meaningful correlation increases with the amount of variation seen with a given brand of oil.
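As a rough illustration of "comparing each variable against all other variables", the sketch below computes a Pearson correlation for every pair of numeric columns in a small table of UOA results. This is not the technique used on the 100 results mentioned above, which the post does not name. The column names and readings are made up, and a categorical variable such as oil brand would need a different test (for example an analysis of variance) rather than a plain correlation.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / sqrt(sxx * syy)

def pairwise_correlations(columns):
    """Correlate every pair of columns in a {name: values} mapping."""
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }

# Hypothetical UOA results (made-up numbers), one entry per sample.
uoa = {
    "miles_on_oil": [3000, 5000, 7500, 4000, 6000, 10000],
    "iron_ppm": [8, 13, 20, 11, 15, 27],
    "silicon_ppm": [9, 7, 12, 8, 10, 9],
}

for pair, r in pairwise_correlations(uoa).items():
    print(pair, round(r, 2))
```

With only a handful of samples, any coefficient this produces is suggestive at best, which is the same sample-size caveat raised above.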
So is UOA scientific? Not in the formal sense. But that doesn’t in any way detract from the usefulness of UOA for predictive maintenance. It is a science-based methodology, just like medicine. If a physician always waited until all possible test results were in before making a decision, he would lose a lot of patients. Most of the impact of science on human activities is through science-based methodologies (engineering, drug approval, safety standards, etc.) rather than pure science.