I’m always fascinated by where curiosity takes me while learning physics, and recently it has only accelerated. It’s like having a child-like eagerness to learn how things around us work, asking the most complex questions: What was the beginning of the Universe? How did it evolve? Is there an end to it? How did life come to exist on Earth? Is there life beyond Earth? What is life? And the simplest questions too: What is sound? Why does the sky look blue? Why doesn’t a ball bounce as high the second time as it did the first? There’s something beautiful in everything if you look and observe deeply enough.

Imagine taking a time machine back to the very beginning of the Universe, to the very moment it began to expand. Isn’t that fascinating to think about? The last stars will die about 120 trillion years from now (our own Sun will die roughly 5 billion years from now, ending up as a white dwarf), followed by some 10^108 years of nothing but black holes. Imagine being there at the very end of the last black hole, while we humans are born and die in less than a blink on that grand cosmic scale. If facts like these don’t boggle your mind and make you curious enough to learn, I don’t know what else will!

Okay! Let’s get back to what I wanted to talk about.

Entropy - Meet The Invisible Force That’s Shaping Our Lives And The Universe

I was writing an article (it’ll be out soon!) that explains in simple terms what GPTs (Generative Pre-Trained Transformers) are, why they have captured the imagination of so many people around the world, and what the state of AI will look like in the near future. While trying to explain how these models (e.g. GPT-4, Claude-3, LLaMA-2) are getting better at what they do, I stumbled upon a concept that drives them to improve, and it is the very same fundamental force that guides you, me, a cat, a dog, our Sun and everything else in the Universe. It’s one of the most misunderstood concepts in the whole of physics, and yet so fundamental that we rarely acknowledge it.

It is also probably the only evidence we have so far that time flows in one direction, and it may even be the reason that life exists on Earth or elsewhere.

Entropy

Entropy is a measure of the disorder, randomness, or uncertainty associated with the outcome of an event. In simple terms, it explains why everything in life gets more complicated, not less, as time goes on.

Entropy increases over time, and it is the reason why relationships get complicated, your room gets messier, battles are fought and people die, revolutions occur, your body gets weaker as you age, the heat from your coffee spreads out, businesses fail, empires fall, and so on. It is a natural, irreversible process: in the long run, disorder always increases.

Entropy was discovered by the German physicist and mathematician Rudolf Clausius, who studied the mechanical theory of heat, work that led to the formulation of the Second Law of Thermodynamics.

He explained that heat always flows from a body at a higher temperature to one at a lower temperature, and that the reverse, heat flowing from a colder body to a hotter one, is so improbable that it does not happen without an accompanying change elsewhere.

Have you ever noticed how, when you rub your hands together and then hold someone else’s, the warmth flows from your hands to theirs like a generous gift of coziness, but you cannot get that same warmth back from them unless they rub their own hands? In summer, turn off your AC and the heat from the surroundings seeps into your house, making it hotter and hotter over time, a completely natural process; in the colder months, no matter what you do, your room will not warm up without a heater. It’s the same reason your coffee cools down the longer it is left out: the heat spreads from the coffee into the room all by itself. But if you want to heat cold water to make coffee, you have to do some work; it cannot happen without an accompanying change, i.e., without a power source you cannot heat cold water.
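
To put rough numbers on that one-way flow, here is a small back-of-the-envelope sketch in Python. The temperatures, the amount of heat, and the assumption that both bodies are large enough for their temperatures to stay roughly constant are all illustrative assumptions, not measurements:

```python
# Back-of-the-envelope sketch of Clausius's observation: heat flowing
# "downhill" from hot to cold always increases the total entropy.
# All numbers below are assumed for illustration.

T_HOT = 360.0   # hot coffee, in kelvin (assumed)
T_COLD = 295.0  # room air, in kelvin (assumed)
Q = 500.0       # heat that leaks from the coffee into the room, in joules (assumed)

# Treat both as reservoirs whose temperatures barely change during the transfer.
dS_coffee = -Q / T_HOT   # coffee loses entropy as it gives up heat
dS_room = +Q / T_COLD    # room gains entropy as it absorbs that heat
dS_total = dS_coffee + dS_room

print(f"Entropy change of coffee: {dS_coffee:+.3f} J/K")
print(f"Entropy change of room:   {dS_room:+.3f} J/K")
print(f"Total entropy change:     {dS_total:+.3f} J/K  (> 0: the process runs by itself)")

# Reversing the flow (heat moving from the cool room into the hot coffee on its
# own) would flip the signs and make the total negative, which is why it never
# happens without work being done, e.g. by a heater or a heat pump.
```

The sum comes out positive precisely because the heat arrives at a lower temperature than it left, which is Clausius’s statement in miniature.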

Clausius also observed that in a steam engine, which converts thermal energy into mechanical energy, only a fraction of the energy was actually being used to do useful work. Where did the rest of the energy go?

This curiosity led to the discovery of entropy, and Rudolf Clausius summarised his findings as:

Image From Veritasium

Just as your room gets hotter and hotter in summer without a constant supply of cool air from the AC, just as you might one day wake up to divorce papers if you don’t keep working on your relationship, and just as you end up in bad physical shape if you don’t work out regularly, Clausius found that, over time, things move from an ordered state to a more disordered one. (Similarly, he figured out that in a steam engine 100% of the energy was never utilised: energy kept spreading out over time, and constant work had to be done to keep things moving.)

Derek from the YouTube channel Veritasium beautifully explains the entire concept of entropy

Entropy also explains why time always moves in one direction.

The increase of disorder or entropy is what distinguishes the past from the future, giving a direction to time.

Stephen Hawking, A Brief History of Time

Brian Cox Explaining The Arrow Of Time

The Arrow of Time dictates that as each moment passes, things change, and once these changes have happened, they are never undone. Permanent change is a fundamental part of what it means to be human. We all age as the years pass by — people are born, they live, and they die. I suppose it’s part of the joy and tragedy of our lives, but out there in the universe, those grand and epic cycles appear eternal and unchanging. But that’s an illusion. See, in the life of the universe, just as in our lives, everything is irreversibly changing.

If you observe closely, entropy is everywhere; it’s part of our everyday lives.

  • Companies and businesses fail if we are not constantly putting in the work to keep things moving: adapting to change, hiring bright minds, and moving on from people who don’t fit. Keeping things in an orderly state requires constant energy expenditure, and to minimise entropy over time companies spend money, resources and time making sure the business does not slide into bankruptcy.

  • In everyday life: say you clean your room every single day and it looks amazing (an orderly state). Now imagine that over time you become lazy and leave the room as it is for days and months. It gets messy as hell, and without constant work (i.e. keeping your room clean every day) it will always drift into a more disordered state. There are only a few ways for a room to be ordered, but millions of ways for it to be disordered.

  • This is why the most basic things in life, such as working out regularly, learning a skill, or keeping your relationships healthy, feel complicated: it takes constant effort to stop things from falling apart.

You might think all of this sounds sad or pointless, right? But imagine a world where everything is amazing, everything works perfectly, and there is no entropy at all. What would such a world look like? Amazing, awesome, right? ... I don’t think so. If that were the case, humans might not have any purpose at all; we would become lazy, and without problems and suffering there would be no place for creativity and innovation, and hence no need for progress. Just imagine that world. Even a person who has everything is always trying to figure out their next purpose in life.

This is why those who acknowledge that entropy is a natural part of everyday life, and that it takes constant effort to fight, are the ones who end up taking advantage of it.

Take Elon Musk

The kind of everyday entropy he deals with on a regular basis is unimaginable to people like you and me. Take SpaceX and Tesla, Musk’s rocket and electric car companies: by embracing the inevitable chaos of rocket launches and the energy transition in cars, Musk turns entropy from a challenge into an opportunity for the advancement of the human species.

Entropy In Artificial Intelligence

“Can entropy ever be reversed?”

“We both know entropy can’t be reversed. You can’t turn smoke and ash back into a tree.”
“Do you have trees on your world?”
The sound of the Galactic AC startled them into silence. Its voice came thin and beautiful out of the small AC-contact on the desk. It said: “THERE IS INSUFFICIENT DATA FOR A MEANINGFUL ANSWER” - Isaac Asimov, The Last Question

Just as we humans learn to be less chaotic by putting in constant effort to get better every day, we need to teach AI models to be less chaotic and to improve over time. This is where the concept of entropy becomes so fundamental in building ML models that are responsible, ethical, and that do what they are intended to do.

The entire concept of entropy is built on probabilities. To understand exactly how entropy plays an important role in AI, we need to understand three concepts:

  1. Information Content
    Information content is the amount of information, or more intuitively the amount of surprise, associated with the outcome of an event. Less probable events carry high information content, and more probable events carry low information content.
    Suppose Roger Federer and Carlos Alcaraz are in the final of the Wimbledon Championship. The probability of Federer winning (based on his form, his many past titles, and the fact that he is playing against a kid) is judged to be much higher than Alcaraz’s. Say the predicted probabilities are Federer: 90%, Alcaraz: 10%. But after the final, against all the odds, Alcaraz beats Federer. The information content of that outcome is very high compared with what you would have gained from a Federer win (which would not have been a surprise, since you expected it with high probability).
    P(Federer) > P(Alcaraz), hence IC(Federer) < IC(Alcaraz)
    In general, IC(x) = -log_b(P(x)), where the base b sets the unit (b = 2 gives bits).

    IC(Federer) = -log_2(0.9) ≈ 0.152 bits
    IC(Alcaraz) = -log_2(0.1) ≈ 3.32 bits

  2. Entropy
    We know entropy is the measure of the randomness or uncertainty associated with an event. From the scenario above, the true probability distribution (once the match has been played) is:
    P = {P(Federer), P(Alcaraz)} = {0, 1}
    Entropy is the average information content across all possible outcomes of the true distribution: H(P) = −∑ P(x) log(P(x))
    Since the true outcome is certain (Alcaraz wins with probability 1), and using the convention that 0·log(0) = 0, the entropy is:

    H(P) = −(0·log_2(0) + 1·log_2(1)) = 0 bits

    Entropy is 0, reflecting that there is no uncertainty in the true distribution when one outcome (Alcaraz winning) is certain.

  3. Cross-Entropy

    Cross-entropy measures how well a predicted probability distribution matches the actual (true) distribution. In our case the true probability distribution is P = {0, 1} and the predicted probability distribution is Q = {0.9, 0.1}, ordered as (Federer, Alcaraz).

    Given two discrete probability distributions, P and Q, over the same random variable X, the cross-entropy between P and Q is the expected information content when using Q to describe outcomes that actually follow P:

    H(P, Q) = −∑ P(x) log(Q(x))
    For our scenario:

    H(P, Q) = −[0·log_2(0.9) + 1·log_2(0.1)] ≈ 3.32 bits

    Information content: ≈ 3.32 bits for Alcaraz’s win, measuring how surprising the actual outcome was under the predicted probabilities.
    Entropy: 0 bits, showing there is no uncertainty in the true outcome (Alcaraz’s win is certain after the fact).
    Cross-entropy: ≈ 3.32 bits, reflecting the extra surprise (inefficiency) of using the predictions instead of the true distribution. The short Python sketch just below reproduces these numbers.
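
To make those three numbers concrete, here is a minimal Python sketch that reproduces the calculations above. The helper functions and the probabilities (0.9/0.1 predicted, 0/1 true) are just the assumed values from this Federer vs. Alcaraz example, not from any particular library:

```python
import math

def information_content(p, base=2):
    """Surprise of an outcome that occurs with probability p (bits when base=2)."""
    return -math.log(p, base)

def entropy(P, base=2):
    """Average information content of the true distribution P."""
    # Convention: outcomes with probability 0 contribute nothing (0 * log 0 -> 0).
    return -sum(p * math.log(p, base) for p in P if p > 0)

def cross_entropy(P, Q, base=2):
    """Expected surprise when outcomes follow P but we score them with Q."""
    return -sum(p * math.log(q, base) for p, q in zip(P, Q) if p > 0)

# Distributions ordered as (Federer, Alcaraz); values assumed in the example above.
Q_pred = (0.9, 0.1)  # predicted before the final
P_true = (0.0, 1.0)  # what actually happened: Alcaraz won

print(f"IC(Federer wins) = {information_content(0.9):.3f} bits")       # ~0.152
print(f"IC(Alcaraz wins) = {information_content(0.1):.3f} bits")       # ~3.322
print(f"H(P)             = {entropy(P_true):.3f} bits")                # 0.000
print(f"H(P, Q)          = {cross_entropy(P_true, Q_pred):.3f} bits")  # ~3.322
```

Notice that the cross-entropy collapses to the information content of the outcome that actually happened; those 3.32 bits are exactly the penalty the prediction pays for giving Alcaraz only a 10% chance.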

  • So even in AI, the idea is to train ML models so that, over time, the distribution of probabilities they predict gets closer and closer to the actual true distribution, i.e. the cross-entropy is driven down. This is the core of how today’s GPT-style models such as ChatGPT, Claude-3 and Gemini work: they just predict the next word in the sequence, and training minimises the cross-entropy loss so that the predicted next word matches the true distribution as closely as possible. For example, if the training data contains the sample ‘The cat runs’ and we prompt the model with ‘The cat ______’, it should complete it with ‘runs’. Initially the entropy is so high that the model outputs random junk, but over time the loss is reduced and you start to see good results (the toy sketch below illustrates this).
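
Here is a toy sketch of that training objective. The three-word vocabulary, the single sentence ‘The cat runs’, and the two sets of predicted probabilities are all made-up assumptions for illustration; real GPT-style models do the same thing over a vocabulary of tens of thousands of tokens with probabilities produced by a neural network:

```python
import math

# Toy next-word prediction setup: after the prompt "The cat ...",
# which word comes next? Everything here is an illustrative assumption.
vocab = ["runs", "sleeps", "banana"]
true_next_word = "runs"

# One-hot "true" distribution: all probability on the word that actually follows.
P_true = [1.0 if w == true_next_word else 0.0 for w in vocab]

# Two hypothetical model states: early in training vs. later in training.
Q_early = [0.34, 0.33, 0.33]  # clueless model, near-uniform guess
Q_later = [0.90, 0.07, 0.03]  # model that has picked up the pattern

def cross_entropy(P, Q, base=2):
    """Cross-entropy loss: expected surprise of using Q when P is the truth."""
    return -sum(p * math.log(q, base) for p, q in zip(P, Q) if p > 0)

print(f"Loss early in training: {cross_entropy(P_true, Q_early):.3f} bits")  # ~1.56
print(f"Loss later in training: {cross_entropy(P_true, Q_later):.3f} bits")  # ~0.15

# Training adjusts the model's parameters to push this loss down, i.e. to move
# the predicted next-word distribution closer to the true one, which is the
# sense in which the model's uncertainty about "what comes next" shrinks.
```

As that loss shrinks, the model goes from spitting out random junk to producing sensible completions, which is the ‘reducing entropy over time’ described in the bullet above.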

References

  1. https://www.youtube.com/watch?v=DxL2HoqLbyA

  2. https://fs.blog/entropy/

  3. https://calteches.library.caltech.edu/4326/1/Time.pdf