I can certainly give some "top level" viewpoint in regard to my comments. Just please realize that the world of statistical analysis (I do statistical process quality control for a living) is not something one will understand fully in a short web-session. But I believe I can make it consumable for most in just a post or two.
In simple terms, standard deviation (aka stdev, or "sigma") is a measure of how much a set of numbers varies; it represents the "normal" variation away from the average value of those numbers. Consider the following ...
think of these 10 numbers: 6,4,6,4,6,4,6,4,6,4 ... the "average" of those is "5", and the variation away from "5" is very low.
think of these 10 numbers: 10,0,10,0,10,0,10,0,10,0 ... the average of those is still "5", and the variation is MUCH greater.
Averages (aka the "mean") are not the same as standard deviation (aka "variation").
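For anyone who wants to check those two lists themselves, here is a minimal sketch using Python's built-in statistics module. The names "tight" and "wide" are just labels I made up for the two lists. It shows both the population stdev (pstdev) and the sample stdev (stdev); which one a given lab report uses is something you would have to confirm with the lab.

```python
from statistics import mean, pstdev, stdev

# The two example lists from above; the variable names are illustrative only.
tight = [6, 4, 6, 4, 6, 4, 6, 4, 6, 4]
wide = [10, 0, 10, 0, 10, 0, 10, 0, 10, 0]

for name, data in (("tight", tight), ("wide", wide)):
    # Same mean for both lists, very different spread.
    print(
        f"{name}: mean={mean(data)}, "
        f"population stdev={pstdev(data):.3f}, "
        f"sample stdev={stdev(data):.3f}"
    )
```

Both lists report the same mean, while the stdev of the second list comes out about five times larger than the first.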
So that explains what stdev is.
Now we need to understand why we need so many samples to get an ACCURATE value for stdev ... Look at the graph below ...
It represents the 95% confidence interval around a stdev value of "1" for any given sample size. What you see are the two response curves (upper and lower) showing how far the true standard deviation could sit from that value of "1". Standard deviation is the "variation" of a data stream (typically a process measurable). The "Y" axis (vertical) is the amount of deviation away from the value of 1. The "X" axis (horizontal) is the quantity of samples taken. As you can see, the uncertainty in the stdev gets VERY broad as the sample size gets smaller. Typically 30 is a minimum, and 50 is preferred. Anything much over 50 does not greatly increase the accuracy; anything past 100 does nothing to refine the data in pragmatic terms. Having only 20 samples does not truly give a good indication of stdev; there is too much inaccuracy in the math below 30 samples.
At 100 samples, the actual value of "1" could be as low as .85 and as high as 1.17; the difference being .32 or about 1/3 the value of "1".
At 50 samples, the actual value of "1" could be as low as .825 and as high as 1.25; the difference being .425 or less than 1/2 of the value of "1".
At 30 samples, the actual value of "1" could be as low as .8 and as high as 1.35; the difference being .55, or about 1/2 of the value of "1".
At 20 samples, the actual value of "1" could be as low as .75 and as high as 1.5; the difference being .75, or about 3/4 the value of "1".
At 10 samples, the actual value of "1" could be as low as .7 and as high as 1.8; the difference being 1.1, or more than the actual value of 1 itself!
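The figures above can be roughly reproduced with a short script. This is only a sketch under stated assumptions: it assumes the data are roughly normally distributed, it uses the standard chi-square confidence interval for a standard deviation, and it swaps in the Wilson-Hilferty approximation for the chi-square quantiles so that nothing beyond the Python standard library is needed. The function names chi2_quantile and stdev_interval are my own, not from any package.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty); close, not exact."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * sqrt(c)) ** 3


def stdev_interval(n, s=1.0, confidence=0.95):
    """Approximate interval for the true stdev, given sample stdev s from n samples.

    Assumes roughly normal data.
    """
    df = n - 1
    alpha = 1.0 - confidence
    lower = s * sqrt(df / chi2_quantile(1.0 - alpha / 2.0, df))
    upper = s * sqrt(df / chi2_quantile(alpha / 2.0, df))
    return lower, upper


for n in (10, 20, 30, 50, 100):
    low, high = stdev_interval(n)
    print(f"n={n:3d}: {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

The printed ranges should land close to the values read off the graph; small differences come from the approximation and from eyeballing the curves.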
As you can see, the accuracy of stdev falls off sharply below 30 samples. It does not increase dramatically over 50 samples. That range (30-50 samples) is the "sweet spot" where you get the best bang for the buck. You can take a LOT more samples, but you'll not get a huge increase in accuracy for the effort. You can take a bit fewer than 30 samples, but your accuracy falls off horridly.
I have thousands upon thousands of UOAs in my database. I don't really need that many, but it does help see trending year over year. But the accuracy of the data really is just as good when I look at 50 samples, versus 500 samples. Once I have 50 samples, I have a very good understanding of the typical (normal) process variation.
Knowing the mean (aka the "average" value) is only half of the story. You must also understand the variation (standard deviation) away from the mean.
This is just an application of what most would understand to be called "Six sigma" control.
Mathematically, you can get an "average" using as few as two numbers.
Mathematically, you can get a "standard deviation" using as few as two numbers.
But for them to be accurate, you need a lot more than that.
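To see that in action, here is a small simulation sketch. It assumes a made-up process that is normally distributed with a true stdev of exactly 1, draws many small batches from it, and records how far the calculated stdev wanders. The function name stdev_spread, the trial count, and the seed are arbitrary choices of mine for illustration.

```python
import random
from statistics import stdev


def stdev_spread(n, trials=2000, seed=0):
    """Return (min, max) of the sample stdev over repeated batches of n values.

    Each batch is drawn from a simulated process whose true stdev is 1.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = [
        stdev([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    ]
    return min(estimates), max(estimates)


for n in (3, 20, 50):
    low, high = stdev_spread(n)
    print(f"n={n:2d}: calculated stdev ranged from {low:.2f} to {high:.2f}")
```

With only a handful of numbers per batch, the calculated stdev swings widely even though the underlying process never changed; with 50 per batch the swing is far tighter.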
So my comment regarding your data sheet is that they certainly can give you a stdev value for 20 samples, but it's not an accurate, trustworthy value. You need 10 more samples AT A MINIMUM to really get a decent stdev value.
This is why I tell folks to quit looking at one or three or five UOAs and thinking that they understand how the equipment is performing. You can look at one or two UOAs, compared to macro data, and have a very good understanding. You CANNOT look at even 20 UOAs, of micro data, and yet know what is "normal".
When you get a chance, please read the article about UOA normalcy; it will help you understand stuff better.
It is critically important to understand the differences between macro and micro data sources.
https://bobistheoilguy.com/used-oil-analysis-how-to-decide-what-is-normal/
There's a LOT more to it than just this, but this is at least where most folks can get a decent understanding of why I complain so much about understanding "variation" of UOAs. Too few people here understand, but think they know it all. I'm not a chemist; I cannot explain tribology in those terms and I must defer to those who have chemical backgrounds. But results-driven data (like UOAs), I know really well; it's what I do for a living. I cannot tell you what goes into a lube bottle and how it interacts with all the other chemicals in the bottle. But I can darn well tell you what happens when those bottles come out of the crankcase!