# Standard deviation and average are poor statistical measures of latency - Journal of Omnifarious

## Nov. 24th, 2008

### 10:39 am - *Standard deviation and average are poor statistical measures of latency*

I've noticed that ping and a few other similar utilities that measure network latency have begun to include an interesting statistic. They show the standard deviation of all the latencies gathered from each individual ping packet. I think this is bad statistics.

If I am not mistaken standard deviation is based on the idea that your sample set follows a normal distribution, a bell-curve. Network ping times do not follow this distribution. I would guess that network ping times follow a power-law curve in which the majority of ping times hover just above the theoretical minimum value for the path, with increasingly rare outliers arbitrarily far from that value.

It would be nice to have some sort of statistical measure that more accurately reflected this distribution. Perhaps something like a measure of how shallow the curve was. The shallower it is, the more uncertainty there is.

That also means that the mean ping time is a poor measure. There should be some measure of a power law curve where you can guess that 50% of the values would be below and 50% would be above.

The reason I'm guessing that ping times follow a power law curve is that I remember seeing research showing that network traffic burstiness displays scale-invariant properties. Basically, a measure of traffic spikes looked approximately the same at almost any scale you wanted to examine. Scale invariance, fractal patterns and power law curves are strongly related.

And this brings to mind another issue. Given the widespread applicability of Benford's Law, it's clear that scale invariance is a property of many statistical sample sets. Yet it seems that standard bell-curve distributions are considered the default. IMHO, power law curve based statistics are what should be taught in High School, not the mean/mode/median/standard-deviation 'normal distribution' based statistics that are currently taught.

Incidentally, the widespread applicability of Benford's Law also lends even more support to the already overwhelming evidence that scale invariance is the default property of almost any network, a hypothesis that is thoroughly explored in Linked: The New Science of Networks.

**Current Mood:** contemplative

**esoterrica**

## Teaching moment ahoy!

*If I am not mistaken standard deviation is based on the idea that your sample set follows a normal distribution, a bell-curve.*

Not so--standard deviation is the square root of the variance, which is a property of all parametric statistical distributions and can be calculated for any sample. Perhaps the power-law curve could be expressed by an exponential distribution? Exponential distributions are commonly used for problems which involve waiting for something to occur, and an exponential distribution would be a good theoretical fit for the power-law curve. The variance of an exponential random variable is the square of its expected value, so the standard deviation you mentioned would also be the mean--and now you can express ping time with a well-known and well-behaved distribution!

*There should be some measure of a power law curve where you can guess that 50% of the values would be below and 50% would be above.*

That would be the median, which is also on the Wikipedia page. If you know the mean and assume an exponential distribution, you can calculate this.
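As a rough illustration of the two points above, here is a minimal Python sketch. It assumes, purely for the sake of example, that ping times are exponentially distributed with a mean of 20 ms; the simulated data, the function names and the 20 ms figure are all made up and are not taken from any real ping output. It shows that the mean, standard deviation and median can be computed for any sample, and that for exponential data the standard deviation comes out close to the mean and the median close to the mean times ln 2.

```python
import math
import random
import statistics


def summarize(samples):
    """Return (mean, standard deviation, median) of a sample."""
    return (
        statistics.mean(samples),
        statistics.stdev(samples),
        statistics.median(samples),
    )


def simulate_exponential(mean_ms, n, seed=0):
    """Draw n values from an exponential distribution with the given mean.

    This is simulated data, standing in for real ping measurements.
    """
    rng = random.Random(seed)
    # expovariate takes the rate (1 / mean), not the mean itself.
    return [rng.expovariate(1.0 / mean_ms) for _ in range(n)]


if __name__ == "__main__":
    samples = simulate_exponential(mean_ms=20.0, n=100_000)
    mean, stdev, median = summarize(samples)
    print(f"sample mean   = {mean:.2f} ms")
    print(f"sample stdev  = {stdev:.2f} ms (theory: equal to the mean)")
    print(f"sample median = {median:.2f} ms")
    print(f"mean * ln 2   = {mean * math.log(2):.2f} ms (theory: the median)")
```

With a small sample the agreement will be much looser, and real ping times may not be exponential at all.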

*IMHO, power law curve based statistics are what should be taught in High School, not the mean/mode/median/standard-deviation 'normal distribution' based statistics that are currently taught.*

I did not encounter statistics until I hit undergrad. I think you are mixing populations and samples--mode is most often considered when dealing with sample summaries, and mean, median and SD can be calculated for any sample, and also for well-behaved distributions. (If you want to see a wonky distribution, check out the Cauchy distribution.) In order to understand distributions you have to be fluent in multivariate calculus, and to *really* understand statistical theory you have to know measure theory, so I would be surprised if any high school course moved beyond calculating summary statistics.

**omnifarious**

## Re: Teaching moment ahoy!

Thanks for pointing me at that Wikipedia page. :-) And I appreciate your response, but sometimes it seems like you assume I know very little, and at other times it seems like you assume I know a lot more than I do, and it's sort of frustrating to read, so my response probably sounds a little rough.

I have never taken a statistics course myself. And I agree with you that 'mode' has nothing really to do with what I'm interested in. I only talked about it because I remember it from the basic (and rather lacking in explanations of fundamentals) statistics I had in HS as part of math. It was one of the measures we were told how to derive, along with the mean, the median and the standard deviation (though the latter was considered a bit advanced).

I was unaware that the 'mean' and 'median' were concepts that existed apart from the standard bell-curve distribution. I note that the method of computing the mean of an exponential distribution is not by adding together all your samples and dividing by the number of samples. So while the concept is the same, the method of computing it is different from how you would do so for a standard distribution.

What I meant was that the most useful distribution to teach people the rote means of computing the various parameters for (like the mean) is the exponential distribution. They don't have to understand it in detail, they just have to know how to compute the parameters and some rough idea of what the parameter means for the expected values.

Edited at 2008-11-25 01:02 am (UTC)

**esoterrica**

## Re: Teaching moment ahoy!

From what you now know about the sample you could make some guesses about the distribution. Since we are dealing with time, all possible ping times are greater than zero. You have hypothesized that Benford's Law makes sense in this case, so the probability of observing a low ping time is quite high, and there are few extreme ping times. Finally, you are waiting for something to occur. The exponential distribution fits these criteria. Assuming you know everything about the distribution is a pretty big jump, but not an unreasonable one in this case. The exponential distribution is characterized by a particular density function, seen here. With the density function you can use calculus to come up with expressions for the population mean (aka expected value), variance (standard deviation squared), median, and other population summaries. These values will probably not match the sample values unless the sample is pretty big.
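For reference, the calculus mentioned above works out as follows for the exponential distribution. The rate parameter $\lambda$ and the symbols $X$ and $m$ are notation introduced here, not something from the discussion itself.

$$
\begin{aligned}
f(x) &= \lambda e^{-\lambda x}, \qquad x \ge 0,\ \lambda > 0 \\
\operatorname{E}[X] &= \int_0^\infty x\,\lambda e^{-\lambda x}\,dx = \frac{1}{\lambda} \\
\operatorname{Var}(X) &= \operatorname{E}[X^2] - \operatorname{E}[X]^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2} \\
\int_0^{m} \lambda e^{-\lambda x}\,dx &= 1 - e^{-\lambda m} = \tfrac{1}{2}
\quad\Longrightarrow\quad m = \frac{\ln 2}{\lambda} = \operatorname{E}[X]\,\ln 2
\end{aligned}
$$

So the standard deviation $\sqrt{\operatorname{Var}(X)} = 1/\lambda$ equals the mean, and the median $m$ is about 69% of the mean.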

I hope that made some sort of sense...

Edited at 2008-11-25 01:51 am (UTC)

**omnifarious**

## Re: Teaching moment ahoy!

Ahh, that makes a whole lot of sense. Thank you for explaining.

My basic complaint is that all that is shown to someone using these programs is the sample analysis. I think it would be much more useful to show an analysis of a particular distribution assuming that the samples fit it.

Ping time basically measures how long it will take for a packet sent from one computer to reach the destination, be sent back and reach the original computer. The amount of time it takes information to reach one computer from another could be guessed to be half the ping time.

Ping time is affected by the traffic load between the source and destination. I am guessing that it fits an exponential distribution because the load over time tends to.

I guess one problem would be that the samples will not always fit that distribution in all networking situations. That might be hard for a program to notice and account for, especially after only a few samples.

But telling me the sample mean isn't very helpful because it doesn't really do a very good job of giving me a solid prediction I can use. And the standard deviation is frequently larger than the sample mean, which tells me that it's also utterly useless for making any decent predictions about what will happen.

If no distribution is assumed, telling me the sample median would be much more helpful. And then it would be most useful to come up with some kind of a variance measure that is bi-valued (i.e. -x +y) because it will likely vary between a little lower than the median and a lot higher.
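One way such a bi-valued measure might look, as a minimal sketch: report the sample median together with the distance down to a low percentile and up to a high percentile. The choice of the 10th and 90th percentiles and the function name `median_with_spread` are arbitrary assumptions made for illustration, and the ping times in the example are invented.

```python
import statistics


def median_with_spread(samples, low_pct=10, high_pct=90):
    """Return (median, minus, plus) for a sample.

    minus is the distance from the median down to the low percentile,
    plus is the distance from the median up to the high percentile.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    median = statistics.median(samples)
    low = cuts[low_pct - 1]
    high = cuts[high_pct - 1]
    return median, median - low, high - median


if __name__ == "__main__":
    # Invented ping times in milliseconds, with one large outlier.
    ping_ms = [10.1, 10.3, 10.2, 10.4, 10.2, 11.0, 10.3, 35.7, 10.2, 12.8]
    median, minus, plus = median_with_spread(ping_ms)
    print(f"{median:.2f} ms  -{minus:.2f} / +{plus:.2f}")
```

For skewed data like this the "plus" side comes out much larger than the "minus" side, which is the asymmetry a single standard deviation hides.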

**sparklewench**

## Re: Teaching moment ahoy!

I was interested to reply but hesitant because of the whole issue of guys spouting off and feeling inferior if I have something to say that corrects them based on my education. That's a topic in my life right now, not just in reference to this interaction.

All that said, I think school children should be taught to visualize population curves and distributions as a matter of course in grade school. As soon as they can graph, they can begin to conceptualize groups of measurements. I think our society would be better if people could converse at that level of abstraction. For example, I am a lefty liberal who is against the minimum wage concept, or rent control. Why? I want to cut off the lower tail of poverty levels rather than shift the mean up. If we put the hump higher up, we still have tails into the very low (income level) region. Minimum wage and rent control simply shift the distribution up and maybe reduce absolute numbers in those tails on the graph, but don't get at the core problem they try to address. blah blah blah.

If we had to pick a single, non-bell-shaped curve to work with I would want to teach a Gaussian. The beauty of bell-shaped curves is that they typically represent what we hope for on a grade distribution, so students naturally relate to them more. Plus, symmetry makes people comfy.

So much of math is about being comfortable with the topics. Having the confidence to be wrong, a safe space to be wrong and learn, not be ridiculed. Emotional stuff that gets in the way.

**omnifarious**

## Re: Teaching moment ahoy!

Well, actually I wouldn't have minded if it was all pitched at a level that was below me. What I found frustrating was the inconsistency of the response. I felt like she chose to tell me that mode wasn't in the same class as median and mean, then assumed I understood the difference between stuff about the sample data vs. stuff about the population as a whole and how that difference applied to what I was saying.

If you noticed, she was much more careful the second time and pitched it all at a level that made sense to me. And it's clear she knows more about and understands statistics better than I do, and it's OK with me.

I agree with you that kids in high school and possibly even elementary school should be taught about population distributions. It's not that hard a concept.

**foxfirefey**

## Re: Teaching moment ahoy!

statistically to me.

**esoterrica**

## Re: Teaching moment ahoy!

**omnifarious**

## Re: Teaching moment ahoy!

My favorite mixture of the mathematical and the erotic has to be "The Cyberiad" by Stanislaw Lem. But, it isn't exactly a pick-up line. :-)

**omnifarious**

## Re: Teaching moment ahoy!

*chuckle*

**iceprincess1010**

**omnifarious**

What's CI? :-) I am not particularly well-versed in statistics.

**iceprincess1010**

**omnifarious**

## df?

By 'df', do you mean distribution function?

**iceprincess1010**

## Re: df?

**omnifarious**

## Re: df?

If I gave you a bunch of data, would you be willing to plough through it and tell me some interesting and useful statistical details I could glean about it and similar data sets?

**iceprincess1010**

## Re: df?

**omnifarious**

## Re: df?

I have the data now. The easiest form for me to give it to you is to give you a URL to some ASCII files that look like this:

The first value is the time since t0 at which the ping was sent, and the second value is how many milliseconds the ping took to get back. One thing I'm not sure how to deal with is pings that don't come back. I'm sure that data might be useful, but there is no good way to represent it in that format.
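A minimal sketch of how files like that might be read in Python. The exact layout is an assumption: this supposes one ping per line with the two values separated by whitespace. The handling of lost pings is a purely hypothetical convention, not part of the format described above: a line holding only the send time is read as a ping that never came back and is recorded as `None`.

```python
def parse_ping_log(lines):
    """Parse lines of '<seconds since t0> <round trip in ms>'.

    Returns a list of (sent_at, rtt_ms) tuples. A line with only a send
    time is treated as a lost ping (rtt_ms is None). That convention is
    an assumption made for this sketch.
    """
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        sent_at = float(fields[0])
        rtt_ms = float(fields[1]) if len(fields) > 1 else None
        records.append((sent_at, rtt_ms))
    return records


def returned_rtts(records):
    """Return only the round-trip times of pings that came back."""
    return [rtt for _, rtt in records if rtt is not None]


if __name__ == "__main__":
    # Invented example lines, not real measurements.
    example = ["0.0 10.2", "1.0 10.4", "2.0", "3.0 35.7"]
    records = parse_ping_log(example)
    print(records)
    print(returned_rtts(records))
```

Keeping lost pings in the record list, rather than dropping them while parsing, leaves the loss rate available for later analysis.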