Any BITOGers know how to calculate Standard Deviation (statistics)?

Thanks n35, but I am sure you have me confused with another member of BITOG. But thanks so much, it is a pleasure to be around so many good people that bring so much knowledge to one place. I am a student here.
C'mon, you're the fellow who kept us enthralled with the epic tale of the Expedition, and now your efforts to find an affordable home in a nice place. Great reading!
 
GON, I had to take an applied statistics class in graduate school and the most valuable thing I was taught was to remember the phrase "...lies, **** lies, and statistics..."
 
Lol…I’m taking a business stats course for my MBA and I’m doing my homework as I post this. I make a table of mean, sum of squares, etc, and calculate variance and standard deviation.

So, the values of (X-u)^2 have to be plugged into the variance formula, THEN take the square root of that number to get S.D.


The sum of squares was 42428, then divide by 4 = 10607, and take the sqrt = 103
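For anyone following along, here is a rough sketch of that last step in Python. The sum of squares is the figure from the post; n = 5 is only an assumption, inferred from the "divide by 4" (n - 1) step. It also shows what happens if you divide by n instead, which is the population formula.

```python
import math

# Sum of squares as quoted in the post. n = 5 is an assumption inferred
# from the "divide by 4" step (n - 1 = 4).
sum_of_squares = 42428
n = 5

# Sample formula: divide by n - 1.
sample_variance = sum_of_squares / (n - 1)
sample_sd = math.sqrt(sample_variance)

# Population formula: divide by n.
population_variance = sum_of_squares / n
population_sd = math.sqrt(population_variance)

print(f"sample:     variance = {sample_variance:.1f}, SD = {sample_sd:.1f}")
print(f"population: variance = {population_variance:.1f}, SD = {population_sd:.1f}")
```

With these numbers the sample SD comes out at about 103 and the population SD at about 92, so which divisor the course expects makes a visible difference when there are only a handful of data points.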
Glad I didn't decide on an MBA if I had to take statistics again!

I'm doing homework for an MA in Organizational Management as I post this. "The Functions of Modern Management" is this course. Yuck.
 
My current view homework…
I like math and while this is nowhere near the difficulty level of differential equations or linear algebra it's still kinda fun to review this stuff again.
 
This brings back fond and distant memories of when I was doing my MBA.

OP, try to understand the concept of standard deviation and things will get much easier.

Your sample size is too small and that's why you are getting a wonky answer. But 92 seems to be reasonable.
 
For no good reason, I recall from my linear algebra course that A x B is NOT equal to B x A. Got fooled by that on the first midterm in September or October 1975. Tempus fugit.

I remember a bit more calculus, but nothing of Fourier analysis or Laplace transforms.

Never used a single one of them in the workplace.
 
https://en.wikipedia.org/wiki/Standard_deviation

In a nutshell, calculate the square (so positive or negative doesn't matter) of how far each sample is from the mean (average), then average these squared differences (add them up and divide by how many samples there are), and you get the variance.

Standard deviation is the square root of the variance.
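A minimal sketch of those steps in Python, using made-up numbers purely for illustration. It divides by the number of samples, as described above, which is the population version of the formula; the cross-check uses `statistics.pvariance` from the standard library.

```python
import math
import statistics

# Made-up values, purely for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the mean (average).
mean = sum(data) / len(data)

# Step 2: squared distance of each value from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 3: average the squared differences to get the variance
# (dividing by n, i.e. the population formula).
variance = sum(squared_diffs) / len(data)

# Step 4: the standard deviation is the square root of the variance.
sd = math.sqrt(variance)

# Cross-check against the standard library's population variance.
assert math.isclose(variance, statistics.pvariance(data))

print(mean, variance, sd)
```

If the data are a sample from something larger, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.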
And the practical application of SD is that a normal distribution follows a bell curve, and one SD on either side of the mean covers about 68% of the population, and two SDs on either side cover about 95%.

I suspect this sort of analysis gets done by auto manufacturers - "Based on our testing, if we build to standard X, our timing chains will last 100,000 miles on average with a standard deviation of 20,000 miles. That means 16% of the chains will fail before 80,000 miles, and 2% will fail before 60,000 miles. In almost all cases we're safe as far as warranty goes, but can we afford that many disgruntled customers? Let's spend $5 more per car, and take the mean up to 150,000 miles."

(Totally made-up example, but this is how I imagine SD being applied in industry.)
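The made-up timing chain numbers can be checked with Python's `statistics.NormalDist`. This is only a sketch: it assumes chain life really is normally distributed, and for the improved design it assumes the standard deviation stays at 20,000 miles, which the example does not say.

```python
from statistics import NormalDist

# The made-up figures from the example: mean 100,000 miles, SD 20,000 miles.
chain_life = NormalDist(mu=100_000, sigma=20_000)

# Share of chains expected to fail before a given mileage (left tail).
fail_before_80k = chain_life.cdf(80_000)   # one SD below the mean
fail_before_60k = chain_life.cdf(60_000)   # two SDs below the mean

# Improved design: mean raised to 150,000 miles.
# Assumption: the SD is unchanged at 20,000 miles.
improved_life = NormalDist(mu=150_000, sigma=20_000)
improved_fail_before_80k = improved_life.cdf(80_000)

print(f"fail before 80k: {fail_before_80k:.1%}")
print(f"fail before 60k: {fail_before_60k:.1%}")
print(f"improved design, fail before 80k: {improved_fail_before_80k:.3%}")
```

The first two come out at roughly 15.9% and 2.3%, in line with the 16% and 2% quoted in the example.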
 
Yes, typically you need at least 3 sigma (3 standard deviations) to have confidence something is "working", and depending on the application it can go up to 6 or so too.

The most important thing I learned in manufacturing 20 years ago is that it is not just these numbers but what the distribution looks like that matters. The standard deviation percentages only hold if the distribution is a bell curve. If there are 2 bell curves in the graph you should fix that first, instead of hiding behind a fuzzy math equation and saying it is "good enough". Many people without a science / engineering background, especially those in finance, fail to pay attention to that.
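To illustrate the two-bell-curves problem, here is a small sketch with made-up data. The helper names and the cluster centers (90 and 110) are hypothetical. Both data sets end up with a mean near 100 and an SD near 10, but only the single bell curve puts about 68% of its points within one SD of the mean.

```python
from statistics import NormalDist, fmean, pstdev

def evenly_spaced_sample(dist, n):
    """Return n deterministic values at evenly spaced quantiles of dist."""
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]

def share_within_one_sd(values):
    """Fraction of values lying within one SD of their own mean."""
    m, s = fmean(values), pstdev(values)
    return sum(m - s <= v <= m + s for v in values) / len(values)

# One bell curve centered on 100 with SD 10.
one_hump = evenly_spaced_sample(NormalDist(100, 10), 1000)

# Two narrow bell curves (hypothetical clusters at 90 and 110).
two_humps = (evenly_spaced_sample(NormalDist(90, 2), 500)
             + evenly_spaced_sample(NormalDist(110, 2), 500))

for name, values in (("one hump", one_hump), ("two humps", two_humps)):
    print(f"{name}: mean = {fmean(values):.1f}, SD = {pstdev(values):.1f}, "
          f"within 1 SD = {share_within_one_sd(values):.0%}")
```

The summary numbers look nearly identical, which is why plotting the distribution before trusting the SD is worth the effort.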
 
Laplace and Fourier transforms are what made me realize I am never going to be a real electrical engineer.
 
I'm pretty much convinced that school gives you a vocabulary and exposes you to the concepts, and getting through shows your potential employer that you have the ability to learn. Beyond that, though, you learn the important stuff on the job.

We used to joke, while doing stuff like pulling an ATV out of a swamp, "Good thing I get to use the calculus now!"
 
Agreed on the dreaded bimodal distribution!

I get how you would want better than 98% reliability for many things (seat belts, air bags, child seats, smoke detectors, etc.)

So if two sigmas take the failure rate down to 2%, what do three (and more) sigmas do? 0.5%? 0.1%?

And regarding misunderstanding statistics, did you hear about the two mathematicians who drowned walking across a stream with an average depth of 24"?
 
Never heard that joke, but I have always been a disbeliever in statistics to begin with and probably always will be.

I once got 8 groups of dots on a graph instead of a "bell curve" like graph after a sensor upgrade. In the end I narrowed down some of the math and added an adjustment for human input, and those 8 groups converged into 1 group that was 8x smaller. A statistician would just have told people to buy a cheaper sensor to save money because the result is the same.
 
@PandaBear, per Wiki's "68-95-99.7" article, the standard deviation percentages are as follows:

1: 68.27
2: 95.45
3: 99.73
4: 99.99

The tails get pretty thin at both ends pretty quickly.
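Those percentages can be reproduced with the error function in Python's standard library. A rough sketch, assuming a perfectly normal distribution; the one-tail column is the share beyond k standard deviations on one side only, which is the number that matters for a "fails too early" question.

```python
import math

def coverage_within(k):
    """Share of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

def one_tail_beyond(k):
    """Share of a normal population more than k SDs out on one side only."""
    # erfc keeps precision for the very small tails at large k.
    return math.erfc(k / math.sqrt(2)) / 2

for k in range(1, 7):
    print(f"{k} sigma: within = {coverage_within(k):.7%}, "
          f"one tail = {one_tail_beyond(k):.3e}")
```

At 3 sigma the one-sided tail is about 0.135%, which answers the "what do three sigmas do?" question from earlier in the thread.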

One common application of the Bell Curve and SD is w.r.t. intelligence.

100 is typically defined as average, and SD is 15.

For a world population of c. 8B:

1.27B (approximately 1 in 6) would have an IQ > 115

182M > 130 (approximately 1 in 45)

10.8M > 145 (approximately 1 in 750)

253,000 > 160 (approximately 1 in 31,600)
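For anyone who wants to run the head counts themselves, here is a sketch using `statistics.NormalDist`. It assumes IQ is exactly normal with mean 100 and SD 15 and a round 8 billion people; the helper name is hypothetical.

```python
from statistics import NormalDist

# Assumptions: IQ ~ Normal(100, 15), world population of 8 billion.
iq = NormalDist(mu=100, sigma=15)
world_population = 8_000_000_000

def people_above(threshold):
    """Expected number of people with an IQ above the threshold."""
    share_above = 1 - iq.cdf(threshold)   # upper tail only
    return share_above * world_population

for threshold in (115, 130, 145, 160):
    count = people_above(threshold)
    print(f"IQ > {threshold}: {count:,.0f} people "
          f"(about 1 in {world_population / count:,.0f})")
```

Only the upper tail counts here, so each share is half of what is left outside the matching 68-95-99.7 band.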

All that to say, when you look at Einstein or John von Neumann, both of whom were more like 1 in a billion (or better), you can imagine how high their IQs must have been. 190? 205?
 
One should never overlook the topics of population size versus sample size, especially when trying to understand standard deviation. When the population is small, it's bad form to "sample"; better to review the entire population to avoid skewing results. Large populations typically cannot be fully studied, hence the need to sample.

Another concern with using std dev is that its inherent inaccuracy in small sample sets is, well, disturbing to say the least. Any decent sample should include AT LEAST 30 data points, and preferably 50. It's laughable to see a standard deviation calculated when the total population is only 5 data points, etc.
 
n = 100 used to be the standard (back when I was learning this stuff).
 
There are many different formulas to determine the correct sample size (n) when the population is very large; the selection of the proper one depends on several criteria too in-depth to discuss here.

100 may or may not be enough; that would depend on several characteristics in the population and the study intent.


My point is that the accuracy of the standard deviation is heavily dependent upon the sample quantity. The accuracy falls off precipitously under 30 samples. It gets fairly accurate at 50. As you continue to take samples, the accuracy gets better, but the return on the effort diminishes. The accuracy at 1000 samples versus 100 samples doesn't improve that much; you'll not get 10x the accuracy for taking 10x more samples (1000 vs 100).
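One way to put rough numbers on that diminishing return is the common large-sample approximation for normally distributed data, where the standard error of a sample SD is about sigma / sqrt(2(n - 1)). The sketch below uses that approximation, which is my assumption and not a formula from this thread. It is only a rule of thumb, and it is at its shakiest for the very small n where it looks worst.

```python
import math

def relative_se_of_sd(n):
    """Approximate standard error of a sample SD, as a fraction of the true SD.

    Uses the large-sample approximation 1 / sqrt(2 * (n - 1)) for normal data.
    """
    return 1 / math.sqrt(2 * (n - 1))

for n in (5, 30, 50, 100, 1000):
    print(f"n = {n:>4}: SD uncertain by roughly +/- {relative_se_of_sd(n):.1%}")

# How much tighter the estimate gets going from 100 to 1000 samples.
improvement = relative_se_of_sd(100) / relative_se_of_sd(1000)
print(f"100 -> 1000 samples: about {improvement:.1f}x tighter")
```

Under this approximation the uncertainty is roughly 35% at n = 5, 13% at n = 30, 10% at n = 50, 7% at n = 100 and 2% at n = 1000, so 10x the samples buys only about 3.2x the precision.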
 