The Central Limit Theorem, Grading, and Uniqueness

Originally published on Ribbrish

It was in July 2013 that the department asked me to explain the grading system at the institute to the incoming undergraduate students. Always eager to show off, I readily agreed and prepared a rather emotional speech about dreams, capabilities and grades. My main point was that their grades were more a reflection of their choices than of their capabilities. I got so carried away by my rant on choices and capabilities that Dumbledore must have started turning in his grave as I turned his epic one-line quote from Harry Potter and the Chamber of Secrets into a boring, dull speech for first years.
Two hours later, when I re-read the entire thing, I laughed my head off and decided to speak on the grading system extempore (one of the better decisions of my life). I spoke on why fewer than 10% of students get the much coveted dassis, or perfect 10 grades (I am using the word coveted, not courted, as courting also requires effort in that direction). A few more people got 9s, a major chunk got 6s, 7s and 8s, and those who just managed passing grades or failed were almost equal in number to those getting the highest grades. In other words, the probability of a student randomly selected from the class doing extremely well or fabulously badly was almost equal, and rather low, whereas the probability of the same student lying in the middle rungs of the grading ladder was higher than either of these. I must again emphasize here that I firmly believe these grades are more an indicator of choice than of capability. They only measure how interested one is in the subject and the amount of attention he or she is ready to devote to its study.
However, coming back to the main theme, it is interesting to see how a sizable chunk of the students appearing for an exam lies in the middle, neither at the top nor at the bottom. This is a common natural phenomenon, observed not only in exam marks but almost everywhere. As another example from human behaviour, consider the arrival times of guests at a party (not according to north Indian standard time).
If the party is set for, say, 7 in the evening, then a small fraction of the people (like me) will arrive at 6:30 and start clamouring for food. The rate of arrival will increase as the clock ticks towards the official start time. This rate will peak around fifteen minutes after the start of the party and then begin to decrease. Of course, people will continue to pour in, but at continuously decreasing rates. Finally, a few people will pop up when everyone is sloshed and tired from dancing.
(The word “party” is not to be confused with the phrase “Punjabi wedding”, as in the case of the latter the probability of finding anyone but stray dogs at the venue at the designated time is lower than that of India winning the next football world cup. All the times written in a Punjabi wedding card are at least 2 hours ahead of the actual schedule.)
However, the arrival of guests at a party follows the same pattern as the grades: slow at first, increasing with time, peaking around the mean time of arrival, and then decreasing until all the guests have arrived. That is to say, the probability of a random guest arriving at a given time depends on the difference between that time and the mean time of arrival, and it decreases as this difference increases. This is similar to the pattern of marks scored by students in exams. A similar pattern may be observed if we check the yellowness of the leaves that have fallen from trees. Very few leaves are absolutely green; as the degree of yellowness increases, the number of such leaves also increases; beyond that, the number of leaves with a darker shade is again low.
If we take graph paper and plot the probability of a student scoring some marks against the marks scored, the number of leaves of a given degree of yellowness against that degree of yellowness, or the number of guests arriving at any given time against the time, we get a similarly shaped curve: the notoriously famous bell-shaped curve, also known as the Gaussian curve. It has a main lobe in the middle and two tails that decay rapidly on either side. The mathematical expression for such a curve looks like

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

Where µ represents the mean of x, the quantity being measured, which may be interpreted as the average marks of the students, the average time of arrival, or the average degree of yellowness of the leaves. The symbol σ represents the standard deviation, that is, how far a random sample lies from the average, on average. Sounds confusing, doesn't it? Well, let me explain. Consider two people, A and B, where A has $110 and B has $90. The average wealth per person is $100, and each of them is $10 away from it, so the average deviation from the average is $10. Consider another group, C and D, where C has $195 and D has $5. The average wealth per person is still $100, but the average deviation from the average has risen to $95. We may say that the second group has a greater disparity. The standard deviation of a random phenomenon is a measure of this disparity. (Probability theory purists: please note that I am not using the phrase random variable, in order to avoid defining it. Next you will want me to use the axiomatic definition of probability.) The point to note here is that most natural phenomena follow the same pattern. There is plenty of room at both the top and the bottom, and the middle is a phenomenally crowded place. The question now is: why?
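For readers who like to check the arithmetic, here is a minimal sketch of the two-group wealth example in Python. The helper name `spread` is made up for illustration. It computes the average, the average absolute deviation from that average, and the population standard deviation (`statistics.pstdev`). For two people sitting symmetrically around the average, the last two happen to coincide; in general they differ, since the standard deviation squares the deviations before averaging them.

```python
from statistics import mean, pstdev

def spread(wealth):
    """Return (average, average absolute deviation, population standard deviation).

    `spread` is a hypothetical helper written for this example only.
    """
    avg = mean(wealth)
    avg_abs_dev = mean([abs(w - avg) for w in wealth])
    return avg, avg_abs_dev, pstdev(wealth)

group_ab = [110, 90]   # A has $110, B has $90
group_cd = [195, 5]    # C has $195, D has $5

for name, group in (("A and B", group_ab), ("C and D", group_cd)):
    avg, avg_abs_dev, sigma = spread(group)
    print(f"{name}: average = {avg}, average deviation = {avg_abs_dev}, sigma = {sigma}")
```

Both groups have the same average of $100, but each member of the second group sits $95 away from it rather than $10, which is the greater disparity the standard deviation is meant to capture.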

Why do most of the natural phenomena behave in this absurd manner?
To answer this, we must look at a common thread that runs through all of these phenomena: all of them are random. That is, when we pick a random student, a random leaf, or a random guest, we do not possess any knowledge about its conditions. Therefore we do not know how the parameter we are measuring will behave.
For a student, his grades are a reflection of the time spent by him studying the subject, his understanding of the fundamentals, the hours of sleep taken before the exam, any personal issues that might distract him, the evilness of the invigilator, etc. For a leaf, these may be the nutrition received by the tree, the winds, birds trying to pluck it, etc. In short, the values of these parameters may depend on so many different parameters that it is not possible to count them. The funny thing is that each of these parameters is in itself random. (Example: think of the leaves that fall when a schoolboy is going to a party after his last exam and decides to shake a tree for the sheer delight of it.)

The final value of these random parameters may therefore be taken as the summed effect of all these independent random phenomena. Quite difficult to comprehend, isn't it? This is where the central limit theorem comes into play. Loosely, the theorem states that the probability distribution of a random value which is the sum of a large number of independent random contributions, none of which dominates the rest, is approximately Gaussian. This fits in well with all our natural observations, and wherever a large number of parameters affect the value of another parameter, the central limit theorem may be applied.
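The theorem is easy to try out on a computer. The sketch below is only an illustration under assumptions of my own choosing: each "measurement" is the sum of 50 independent factors, each factor is a uniform random number between 0 and 1 (nothing bell-shaped about it), and the names `sum_of_factors` and `fraction_within` are invented for this example. If the sums are roughly Gaussian, the fractions of samples within 1, 2 and 3 standard deviations of the mean should land close to the familiar 68%, 95% and 99.7%.

```python
import random
from statistics import mean, pstdev

def sum_of_factors(n_factors, rng):
    """One 'measurement': the sum of n_factors independent uniform(0, 1) draws."""
    return sum(rng.random() for _ in range(n_factors))

def fraction_within(samples, k):
    """Fraction of samples lying within k standard deviations of their mean."""
    mu, sigma = mean(samples), pstdev(samples)
    return sum(abs(s - mu) <= k * sigma for s in samples) / len(samples)

rng = random.Random(0)  # fixed seed (arbitrary choice) so the run is repeatable
samples = [sum_of_factors(50, rng) for _ in range(20000)]

for k in (1, 2, 3):
    print(f"within {k} sigma: {fraction_within(samples, k):.3f}")
```

Swapping the uniform draws for some other well-behaved distribution, for example coin flips returning 0 or 1, should give much the same fractions, which is the point of the theorem.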

Looking at the curve again, we see that the probability of finding a sample from a Gaussian-distributed population exhibiting a certain value of a parameter decreases drastically as we move away from the average. In other words, the farther from the average you are, the more unique you are, with respect to that parameter.
But this uniqueness is with respect to only a given parameter. We humans are a mixture of so many things: our sense of humor, our ability to sing, to dance, to do math problems, to love, to hate, and countless other things. We choose many of these things, but then again there is no prescribed formula for these choices; they are again random. To us they may not seem random, but that is how randomness is. While talking statistics, we don't talk about an individual, or about a small group of individuals; we talk about large numbers, in the millions or even billions.
Yet the funny thing is that, in all this randomness, if we calculate the probability of a person having exactly the same traits as you, that probability comes out to be negligibly small. You are therefore unique: like no one else, and yet unique like everyone else.
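To put rough numbers on "decreases drastically", here is a small sketch that computes the chance of a Gaussian sample landing more than k standard deviations away from the average, on either side. It uses the standard identity P(|x − µ| > kσ) = erfc(k/√2); the function name `prob_beyond` is mine, chosen for this example.

```python
import math

def prob_beyond(k):
    """Probability that a Gaussian sample lies more than k standard
    deviations from the mean, counting both tails."""
    return math.erfc(k / math.sqrt(2))

for k in (1, 2, 3, 4):
    print(f"more than {k} sigma away: {prob_beyond(k):.6f}")
```

Roughly 32% of samples lie more than one standard deviation from the average, about 4.6% more than two, about 0.27% more than three, and fewer than one in ten thousand more than four.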
