Flutterby™! : Let the outliers out

2012-05-03 19:43:24.18266+02 by Dan Lyke 8 comments

Put away the bell curve: Most of us aren't average:

"If you had a superstar performer working at your factory, well, that person could not do [a] better job than the assembly line would allow," Aguinis said. "If you unconstrain the situation and allow people to perform as best as they can, you will see the emergence of a small minority of superstars who contribute a disproportionate amount of the output."

[ related topics: Interactive Drama Work, productivity and environment Heinlein ]

comments in ascending chronological order:

#Comment Re: made: 2012-05-03 23:56:23.771767+02 by: jeff

I don't really see how this study disproves the notion of a normal distribution curve. In fact, many of the statements in the article actually reinforce the notion of one:

"In each of these kinds of industries, we found that a small minority of superstar performers contribute a disproportionate amount of the output."

Isn't that exactly what a normal distribution would indicate? A small number of people (perhaps 2-3 standard deviations to the arbitrary right of the mean) actually contribute the most? By any criteria one can use?

(The operative word is "small") In other words, the number of outliers is "small."

#Comment Re: made: 2012-05-04 05:28:53.846164+02 by: ebwolf

"These superstars, moreover, accounted for much of the success of the group as a whole. The vast majority of the others in the group, Aguinis said, were actually performing below the mathematical average."

In other words, significantly more than half the population perform below the average. Which means the average is being skewed by the performance of the outliers. In a bell curve, half the population is below the average and half above. If you compared the median to the mean, you would see the median falling well below the mean.
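The median-versus-mean point can be checked with a small simulation. This is only a rough sketch: the Pareto shape `alpha=2.0`, the normal parameters (mean 100, standard deviation 15), the sample size, and the seed are all illustrative assumptions, not figures from the article or the study.

```python
import random
import statistics


def share_below_mean(values):
    """Return the fraction of values strictly below the arithmetic mean."""
    mean = statistics.fmean(values)
    return sum(v < mean for v in values) / len(values)


def compare_distributions(n=10_000, alpha=2.0, seed=42):
    """Summarise one normal sample and one Pareto sample (assumed parameters)."""
    rng = random.Random(seed)
    samples = {
        # Symmetric bell curve: about half the sample falls below the mean.
        "normal": [rng.gauss(100.0, 15.0) for _ in range(n)],
        # Heavy right tail: a few large values drag the mean above the median.
        "pareto": [rng.paretovariate(alpha) for _ in range(n)],
    }
    return {
        name: {
            "mean": statistics.fmean(xs),
            "median": statistics.median(xs),
            "share_below_mean": share_below_mean(xs),
        }
        for name, xs in samples.items()
    }


if __name__ == "__main__":
    for name, stats in compare_distributions().items():
        print(name, stats)
```

Under these assumptions the normal sample should put roughly half its values below the mean, while the Pareto sample should put well over half below it, with the median sitting below the mean.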

#Comment Re: made: 2012-05-04 05:39:38.314248+02 by: jeff

Agreed, you're referring to a bi-modal distribution curve, in that particular example, Eric.

But the sample they are referring to is not taken from the entire random distribution for the characteristics they are measuring, right? (We might need to dig deeper to better understand EXACTLY what they were measuring.)

From my view it's still a poorly written and largely inconsistent article.

#Comment Re: made: 2012-05-04 13:19:36.316529+02 by: Larry Burton

What they are measuring is the effect of firing everyone two standard deviations below average performance. With only average performers and above average performers supplying output you don't have the leveling effect of the below average performers.

#Comment Re: made: 2012-05-04 15:56:47.218078+02 by: jeff

This would be a fun one to draw up on a whiteboard. :)

#Comment Re: made: 2012-05-04 17:31:22.410881+02 by: Dan Lyke

It's worth clicking through to The Best And The Rest: Revisiting The Norm of Normality of Individual Performance. Basically the argument is that when we impose the Gaussian distribution of our expectations on an actual sample that's closer to a Paretian distribution, we end up with a bad fit, and we'd get a better match if we started from the assumption that the data follows the latter curve.
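One way to see the bad fit is to count the "superstars" a Gaussian model would predict and compare that with what a heavy-tailed sample actually contains. The sketch below is an illustration under assumptions, not the paper's method: it draws from a Pareto distribution with an assumed shape `alpha=3.0`, fits a normal curve by matching the sample mean and standard deviation, and compares the share of values more than `k=3` standard deviations above the mean.

```python
import math
import random
import statistics


def tail_comparison(n=50_000, alpha=3.0, k=3.0, seed=7):
    """Return (observed, predicted) upper-tail shares beyond mean + k*sd.

    observed:  share of a Pareto sample above the threshold
    predicted: share a normal curve with the same mean and sd would put there
    """
    rng = random.Random(seed)
    sample = [rng.paretovariate(alpha) for _ in range(n)]

    mean = statistics.fmean(sample)
    # Population standard deviation, computed directly from the sample.
    sd = math.sqrt(statistics.fmean([(x - mean) ** 2 for x in sample]))

    threshold = mean + k * sd
    observed = sum(x > threshold for x in sample) / n
    # One-sided normal tail: P(Z > k) = erfc(k / sqrt(2)) / 2.
    predicted = 0.5 * math.erfc(k / math.sqrt(2))
    return observed, predicted


if __name__ == "__main__":
    observed, predicted = tail_comparison()
    print(f"observed share above threshold:  {observed:.5f}")
    print(f"normal-model predicted share:    {predicted:.5f}")
```

With these assumed parameters the heavy-tailed sample should hold several times more extreme performers than the fitted normal curve allows for, which is the kind of mismatch the paper describes.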

#Comment Re: made: 2012-05-06 15:34:00.914798+02 by: jeff

That's a good read Dan.

One takeaway for me is that the term "outlier" should have a context specifically related to the type of distribution where it is being used. An outlier in one distribution may not constitute an outlier in a different distribution, and vice versa.

For example, one cannot use "outliers" in the same context when comparing a normal distribution to a power-law distribution or to a bi-modal distribution.

Interesting stuff and Pareto had an interesting life.

#Comment Re: made: 2012-05-07 17:58:44.866693+02 by: m

Experimentalists believe in the Normal curve because they believe that it has been proved by theoreticians. Theoreticians believe in the Normal curve because they believe it has been observed by experimentalists.

But for any given curve, the area or probability outside n standard deviations is given by Tchebysheff's inequality.

0 <= area <= 1/n^2

So the probability lying more than two standard deviations from the mean, for any given curve, is between 0 and p=0.25. There is also a formula for continuous curves, but I don't recall its name. My recollection (getting faulty) is that the greatest possible area outside two standard deviations for a continuous curve is about 11%.
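The two bounds can be put side by side in a few lines. Tchebysheff's (Chebyshev's) inequality caps the area beyond k standard deviations at 1/k^2 for any distribution with finite variance. The roughly 11% figure appears to match the Vysochanskij-Petunin inequality, which gives 4/(9 k^2) but applies to unimodal distributions rather than to every continuous curve; that identification is an assumption here, since the comment does not name the formula. The function names below are made up for illustration.

```python
import math


def chebyshev_bound(k):
    """Upper bound on P(|X - mean| >= k*sd) for any finite-variance distribution."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / k**2)


def vysochanskij_petunin_bound(k):
    """Upper bound on the same tail area for unimodal distributions.

    The 4 / (9 k^2) form holds only for k > sqrt(8/3), about 1.633.
    """
    if k <= math.sqrt(8.0 / 3.0):
        raise ValueError("this form of the bound requires k > sqrt(8/3)")
    return 4.0 / (9.0 * k**2)


def normal_two_sided_tail(k):
    """Exact P(|Z| >= k) for a standard normal, for comparison."""
    return math.erfc(k / math.sqrt(2))


if __name__ == "__main__":
    k = 2.0
    print(f"Chebyshev bound:             {chebyshev_bound(k):.4f}")
    print(f"Vysochanskij-Petunin bound:  {vysochanskij_petunin_bound(k):.4f}")
    print(f"Normal curve, exact:         {normal_two_sided_tail(k):.4f}")
```

At two standard deviations this gives 0.25 for the general bound, about 0.111 for the unimodal bound, and about 0.046 for an actual normal curve.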
