Information Theory, Statistics and Life

Originally published on Ribbrish


I wanted to make the title something like Life, the universe, and everything, so that I could start off with a catchphrase. But since comparing these titles would be like comparing mess food with home-cooked food, I gave up.

For eight years now, I have been trying to learn about the behavior of communication systems. Newton, on his deathbed, is supposed to have said something similar to “I have collected pebbles from a beach”. Going by those standards, I am unsure whether I am capable enough to say, “During the past six years I have managed to get some sand from the same beach to stick to my hands.” But as they say, an empty vessel makes much noise. So here I am, to make noise.

It is said that middle age is when broadness of mind and narrowness of waist interchange places. That way, I can say that I was born middle-aged. One peculiar (and rather annoying) habit that people tend to develop in their middle age is preaching armchair philosophy. Going by the urges that I have had to compose this write-up, I believe that I have acquired another folly of middle age (apart from my narrow mind and broad waist). Here, I am going to talk about how the little (and when I say little I mean absolutely little) statistical signal processing that I have learnt over the past years applies to everyday life.

In each of the next few sections, I will take an idea from statistical signal processing or communication theory and beat it to death trying to explain some real-world phenomenon using it. I do warn you, however, dear reader, that many (or all) of these theories may seem to you like the product of a deranged mind, in which case you would be right. So without further ado, I seek your permission to begin.

Measuring Information

All of us, at some point in our lives, come across people who have mastered the art of speaking without saying anything at all. People who make statements like “If water is falling from the sky then it is raining”.


For those who do not understand what I mean, look at the image above to understand what I am talking about. The poor ACP Pradyuman and his gang are infamous for making obvious statements or rather statements that contain absolutely no information at all. Therefore, we would be right to believe that obvious statements contain no information at all, and consequently, the less obvious a statement is, the more information it contains. Based on this idea, some of the pioneers in Information theory have proposed to measure the information content of events by measuring their unlikeliness. The logic behind this measurement is the same as described previously. The less likely an event is, the more information its occurrence will contain. As an example, consider the following news items.

  1. Celebrity acquitted of all charges after going on a shooting spree
  2. India wins the cricket world cup
  3. India wins the football world cup

The first news item is likely to be ignored by most people as it contains absolutely no information. It is a well-established fact that celebrities cannot commit crimes, so reporting their acquittal is an absolute waste of space. The second headline is sure to evoke some interest because, despite being fatigued with too much cricket all through the year, most of us still love the game. The last headline is sure to be an eyeball grabber. Winning in any sport except cricket is sure to evoke angry responses from all the cricket-sponsoring agencies, or rather, fans. So my point here is that the unlikeliness of an event can be treated as a measure of its information content, and this is the information that we communicate.
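The usual way to put a number on this unlikeliness is the self-information of an event, I = log2(1/p) bits, where p is the probability of the event. The snippet below is only a minimal sketch of that idea: the function name and the three probabilities are made up for illustration and are not measurements of anything.

```python
import math


def self_information(probability: float) -> float:
    """Information content, in bits, of an event with the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return math.log2(1.0 / probability)


# Hypothetical probabilities, chosen only to reproduce the ordering in the text.
headlines = {
    "Celebrity acquitted of all charges": 0.99,
    "India wins the cricket world cup": 0.1,
    "India wins the football world cup": 0.0001,
}

for headline, p in headlines.items():
    print(f"{headline}: {self_information(p):.2f} bits")
```

A certain event (p = 1) carries zero bits, and the number of bits grows without bound as the event becomes rarer, which is exactly the ordering of the three headlines.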

Channel Capacity and Free Lunches

When I was first taught Shannon’s channel capacity theorem, it was in its classical single-antenna form. However, I am putting the more modern multi-antenna form of the equation over here.

C = min(M, N) W log(1 + S/N)

This is said to be the holy grail, the “Shanon-Greal”, of electronic communication systems, and communication engineers are supposed to spend their entire lives chasing it. For the uninitiated, it represents the amount of information that can be sent over a communication medium. On the left-hand side, the symbol “C” represents the channel capacity, or in simpler words, the amount of information (or data) that can be sent over a communication channel. I would like to clarify that the word information is used here for any form of communication intelligence that is being sent from a source to a destination. This can be text, video, voice or anything else. The word channel is used here for the medium over which the communication takes place.

The symbol “W” represents the available communication spectrum bandwidth (the same spectrum of 2G scam fame). This is somewhat like a pathway over which the information can be sent. In simpler words, the more the bandwidth, the broader the “Information Highway”, and the greater the amount of information that can be transmitted. The ratio “S/N” is the signal-to-noise ratio at the receiver. In simpler terms, it may be taken as a measure of the power with which we are transmitting (analogous to how loudly we are shouting). Also, for those unfamiliar with the log function, its value increases with an increase in its argument, but the increase is nonlinear. That is, we cannot say that if I double the power, the channel capacity will be doubled, whereas in the case of bandwidth this statement is true. Lastly, the term min(M,N) represents the smaller of the number of antennas at the transmitter and the number at the receiver (this N counts antennas and is not the noise N in S/N). This simply translates to: the more antennas you can afford, the more data you can transmit.
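The difference between doubling the bandwidth and doubling the power is easy to check numerically. The sketch below evaluates the equation as written above, with two assumptions of mine: the logarithm is taken to base 2, so the result is in bits per second, and doubling the transmit power is treated as doubling S/N with the noise held fixed. The function name and the numbers are illustrative only.

```python
import math


def capacity(bandwidth_hz: float, snr: float,
             tx_antennas: int = 1, rx_antennas: int = 1) -> float:
    """Capacity in bit/s from C = min(M, N) * W * log2(1 + S/N).

    snr is the linear signal-to-noise ratio (not dB).
    """
    return min(tx_antennas, rx_antennas) * bandwidth_hz * math.log2(1.0 + snr)


# Hypothetical baseline: 1 MHz of bandwidth, S/N = 15, one antenna at each end.
base = capacity(1e6, 15.0)
double_bandwidth = capacity(2e6, 15.0)
double_power = capacity(1e6, 30.0)
double_antennas = capacity(1e6, 15.0, tx_antennas=2, rx_antennas=2)

print(f"baseline:          {base / 1e6:.2f} Mbit/s")
print(f"double bandwidth:  {double_bandwidth / base:.2f}x")
print(f"double power:      {double_power / base:.2f}x")
print(f"double antennas:   {double_antennas / base:.2f}x")
```

Doubling the bandwidth or the antenna count at both ends doubles the capacity, while doubling the power buys noticeably less than that.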

The simplest implication of this equation is that “there are no free lunches in this world”: we have to pay for anything that we want. We cannot have a single-antenna, high-data-rate system transmitting at low power over a small bandwidth. If we want a high data rate, we have to pay either in bandwidth, or in transmitted power, or in the number of antennas at the transmitter and the receiver. This is similar to our life, where everything comes at a cost. Want a high-paying job? Sacrifice your interests. Want an interesting job near your hometown? There are better-paid ones far away. It is all about the trade-offs or the bargains that one is ready to accept. Note also that these costs do not affect the channel capacity equally: bandwidth and antennas scale it linearly, whereas power enters only through the logarithm.

Also, we may achieve the same capacity by several different combinations of the involved parameters, and which combination is feasible depends on the design constraints of the system. A satellite system, for example, can afford a large number of antennas and a high bandwidth, but it cannot afford good SNRs due to the large distances involved. On the other hand, a mobile phone system can have access to a large bandwidth but is constrained by its physical dimensions and limited battery power. Therefore, each communication system, like every person, is unique and has its own constraints for achieving the channel capacity. The solution for each system is also unique, and it would be imprudent to compare the performance of any two systems based on only one of these parameters, just as it would be imprudent to compare any two people based only on their achievements. Everyone has his or her own constraints and, what is more, everyone has his or her own definition of the channel capacity. While comparing these systems, it is good to keep in mind the trade-offs involved. With people, it is not possible to know these trade-offs, and therefore actual comparison becomes next to impossible.
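To see how two very different designs can land on the same capacity, one can invert the equation and ask what S/N a system needs for a given target. This is a rough sketch under the same assumptions as the equation above with a base-2 logarithm. The two design points are invented for illustration and do not describe any real satellite or handset.

```python
import math


def required_snr(target_bps: float, bandwidth_hz: float, antennas: int) -> float:
    """Linear S/N needed to reach target_bps, from C = k * W * log2(1 + S/N).

    antennas stands for k = min(M, N), the smaller of the two antenna counts.
    """
    return 2.0 ** (target_bps / (antennas * bandwidth_hz)) - 1.0


def to_db(ratio: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


target = 8e6  # hypothetical target of 8 Mbit/s

# Design A: plenty of bandwidth and antennas, so a poor S/N is enough.
snr_a = required_snr(target, bandwidth_hz=4e6, antennas=2)
# Design B: narrow bandwidth and a single antenna, so it must pay in S/N.
snr_b = required_snr(target, bandwidth_hz=1e6, antennas=1)

print(f"Design A needs S/N = {snr_a:.0f} ({to_db(snr_a):.1f} dB)")
print(f"Design B needs S/N = {snr_b:.0f} ({to_db(snr_b):.1f} dB)")
```

Both designs reach the same 8 Mbit/s, but judging them on S/N alone would make one look far better than the other, which is the point about comparing systems (or people) on a single parameter.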