In my last post, part 1 of Algorithms for life, I discussed some learnings from the book Algorithms to Live By. This post continues where that one left off. Let's jump right into it.
This is one of the meatier parts of the book. Life is filled with probabilistic occurrences, and those who can navigate uncertainty generally live much more fulfilling lives. The book discusses Bayes' theorem, which is about using "priors", or existing knowledge about a given situation, to make predictions about the future. I have discussed Bayes' theorem in detail in my post on Thinking fast & slow.
Laplace's law can be used to make general predictions after a series of observations. Imagine having to predict the overall proportion of winning tickets in a lottery after seeing how many tickets won in the last 10 draws. The law states that the probability of the next ticket winning is the number of winning tickets seen so far plus one, divided by the total number of tickets drawn plus two.
(w+1)/(n+2), where w is the number of winning tickets and n is the total number of tickets drawn.
So if there were 2 winners in the first 10 draws, the probability of a winning ticket in the next draw is 3/12, or 1/4. This is a very simple example, but the same rule can be used to make predictions in a lot of situations.
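As a quick illustration, here is a tiny sketch of the rule (my own code, not from the book):

```python
# Laplace's rule of succession: w successes observed out of n trials.
def laplace_estimate(w, n):
    return (w + 1) / (n + 2)

print(laplace_estimate(2, 10))  # 3/12 = 0.25
```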
Another interesting principle to internalise is the Copernican principle (closely related to the Lindy effect), which formalises the intuition that old is gold: something that has been around for x years is likely to stay around for roughly x more years. A similar style of reasoning was used in World War II to estimate how many tanks the Germans were producing every month, based on the serial numbers of tanks captured by the Allies.
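For the curious, the standard frequentist estimate for that "German tank problem" looks like this (my addition, not a formula from the book):

```python
# Estimate the total number of tanks produced from the serial numbers
# observed on captured tanks: take the largest serial seen and correct
# upwards for the size of the sample.
def estimate_total(serials):
    m, k = max(serials), len(serials)
    return m + m / k - 1

print(estimate_total([61, 19, 56, 24, 102]))  # ~121 tanks estimated
```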
Finally, one of the most important concepts in probability is the distributions that surround us. A distribution describes how likely different outcomes are across a population. The normal distribution, or bell curve, has most occurrences clustered around the centre, so you can predict the average and be right most of the time; for example, someone at the median age is likely to live longer than someone very near the extreme end of the curve. Power law distributions are those in which most items are not clustered around the average but spread across extremes; events like box office earnings, venture capital returns, etc., follow a power law. The book also discusses Erlang distributions, which can be thought of as sitting between the two: the tails are fatter than normal but not as fat as a power law. For events following an Erlang distribution, the best prediction is a constant amount more, regardless of history: how many more people will trickle into a party, or how much longer a phone call will go on, is roughly the same fixed amount no matter how long you have already waited. Erlang distributions are also used in call centres to estimate the number of agents required to handle a given number of calls.
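Here is a rough Monte Carlo sketch of those three prediction styles (my own illustration, with made-up parameters): given that something has already lasted time t, what total duration should we expect under each distribution?

```python
import numpy as np

rng = np.random.default_rng(0)
samples = {
    "normal (predict the average)": rng.normal(loc=75, scale=10, size=100_000),
    "power law (predict a multiple)": (rng.pareto(a=1.5, size=100_000) + 1) * 10,
    "erlang (predict a constant amount more)": rng.gamma(shape=2, scale=5, size=100_000),
}

for name, x in samples.items():
    for t in (10, 20, 40):
        survivors = x[x > t]          # cases that have already lasted past t
        if survivors.size:
            print(f"{name}: lasted {t} so far -> expect total ~{survivors.mean():.1f}")
```

The normal case keeps predicting roughly the mean, the power law case predicts a multiple of how long things have already gone on, and the Erlang case predicts a fixed amount more each time.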
Overfitting is when we fit a model to a situation and end up making it too complex. In life too, we tend to overthink and develop a complicated understanding of what's happening around us, when in reality the truth is often far simpler. Overfitting is a common problem in machine learning, and one way to avoid it is to add a regularisation term that penalises the model for being too complex. In life, we can apply the same idea through Occam's razor: the simplest explanation is the most likely to be true. This is a very powerful principle to keep in mind, and it probably also explains our evolutionary wiring of heuristics and our inherent fast thinking.
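A minimal sketch of what that regularisation term does (my own toy example, not from the book): fit a high-degree polynomial to a few noisy points, with and without an L2 penalty.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 12)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)  # noisy samples of a simple curve

X = np.vander(x, 10)  # degree-9 polynomial features: plenty of room to overfit

# Ordinary least squares: chases the noise with huge coefficients.
w_plain = np.linalg.lstsq(X, y, rcond=None)[0]

# Ridge regression: the penalty `lam` pushes coefficients toward zero,
# trading a little training error for a much simpler curve.
lam = 1e-3
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

print("largest coefficient without penalty:", round(float(np.abs(w_plain).max()), 1))
print("largest coefficient with penalty:   ", round(float(np.abs(w_ridge).max()), 1))
```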
Taleb has written extensively about how to avoid being fooled by randomness and how to build yourself to thrive in uncertainty. This book discusses the interesting concept of simulated annealing: letting a system warm up (introducing randomness so it can jump out of local maxima) and then cool down (becoming increasingly greedy, like hill climbing), repeating the cycle if needed, to land on what looks like a global maximum. This technique was used by the authors of this famous paper to lay out chips better than other known techniques of the time.
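A bare-bones sketch of the idea (my illustration, not the paper's code): maximise a bumpy function by sometimes accepting worse moves while "hot", and becoming increasingly greedy as the temperature cools.

```python
import math
import random

def f(x):
    return math.sin(5 * x) + math.sin(2 * x)  # a function with several local maxima

x = random.uniform(0, 6)
temperature = 2.0
while temperature > 1e-3:
    candidate = x + random.gauss(0, 0.5)       # random nearby move
    delta = f(candidate) - f(x)
    # Always accept improvements; accept downhill moves with a probability
    # that shrinks as the temperature drops.
    if delta > 0 or random.random() < math.exp(delta / temperature):
        x = candidate
    temperature *= 0.99                         # cooling schedule

print(f"found x = {x:.2f}, f(x) = {f(x):.2f}")
```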
Introducing a bit of randomness can also break stalemates and steer us towards more creative solutions. Disruptive brainstorming sessions, like the one described by Atlassian, are quite popular: random concepts are thrown in and participants are asked to incorporate them into their ideas, often leading to novel innovations.
Another advantage of randomness is in building algorithms like Bloom filters, which let us be reasonably sure of an answer quickly rather than seek out the perfectly exact solution every time. Contrary to common belief, this trade-off is preferable in many situations, such as cyber security.
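A toy Bloom filter looks something like this (illustrative only, not production code): membership checks that may return false positives but never false negatives.

```python
import hashlib

class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Derive several bit positions per item from independent hashes.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("malicious-url.example")
print(bf.might_contain("malicious-url.example"))  # True
print(bf.might_contain("unseen-url.example"))     # almost certainly False
```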
Game theory is a deep field that studies how people interact in different situations. Looked at through this lens, almost all interactions are games, with each player looking to maximise their payoff. The famous prisoner's dilemma is a game where two players can either cooperate or defect. If both cooperate, they each get a payoff of 3. If both defect, they each get a payoff of 1. If one defects and the other cooperates, the defector gets 5 and the cooperator gets 0. In this setup, defecting is the optimal strategy for both players. The book discusses how this game plays out in real life and how it can be used to predict how people will behave in different situations.
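A small sketch of that payoff matrix, checking that defection is the best response no matter what the other player does (my own code):

```python
payoffs = {  # (my move, their move) -> my payoff
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}

for their_move in ("cooperate", "defect"):
    best = max(("cooperate", "defect"), key=lambda my: payoffs[(my, their_move)])
    print(f"if they {their_move}, my best response is to {best}")
# -> defect in both cases: (defect, defect) is the equilibrium.
```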
The price of anarchy is the gap between the payoffs arising from the "always defect" and "always cooperate" strategies, i.e. the fully competitive versus fully cooperative outcomes in a prisoner's-dilemma-like game. A low price of anarchy means the system is stable and doesn't need a centralised managing authority. A high price of anarchy, like the one in the prisoner's dilemma above, calls for external intervention. The book discusses how the price of anarchy can be used to predict the stability of a system and to design mechanisms around us. The classic tragedy of the commons is an example: the profits from overgrazing common land accrue to you alone, while the losses get amortised over everyone, a textbook case of a high price of anarchy.
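For the payoffs above, a back-of-the-envelope comparison of the two outcomes looks like this (my own sketch):

```python
cooperate_payoff = 3   # each player, if both cooperate
defect_payoff = 1      # each player, if both defect

cooperative_total = 2 * cooperate_payoff   # 6
equilibrium_total = 2 * defect_payoff      # 2
print("price of anarchy (ratio):", cooperative_total / equilibrium_total)  # 3.0
```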
Mechanism design is the reverse of game theory: instead of analysing a game for its equilibria, we use our knowledge of equilibria to design a game whose equilibrium is beneficial for all the players. Imagine that in the prisoner's dilemma there is a Godfather of a crime syndicate who bans defection in jail (or else the accused sleeps with the fishes). This moves the equilibrium from always defect to always cooperate.
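Sketching the Godfather's intervention on the same game (my illustration): a heavy penalty on defectors flips the dominant strategy.

```python
penalty = 10  # the Godfather's punishment for defecting
payoffs = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5 - penalty,
    ("defect", "defect"): 1 - penalty,
}

for their_move in ("cooperate", "defect"):
    best = max(("cooperate", "defect"), key=lambda my: payoffs[(my, their_move)])
    print(f"if they {their_move}, my best response is to {best}")
# -> cooperate in both cases: the equilibrium has moved to mutual cooperation.
```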
I run a startup called Harmonize. We are hiring and if you’re looking for an exciting startup journey, please write to jobs@harmonizehq.com. Apart from this blog, I tweet about startup life and practical wisdom in books.