
Occam's Razor

dshl

Well-Known Member
I'm sure we've all heard this principle being referenced over and over in various situations. However, I've never really got the value of it. Maybe someone can elucidate this value.

The idea that you should prefer the simplest explanation of an event in terms of its likelihood of being true: that's the bit that eludes me.

So the question is:

Why would a more complicated explanation of something be less likely to be true, and if it isn't, then what's the point of Occam's Razor?
 
Last edited:
I'm sure we've all heard this principle being referenced over and over in various situations. However, I've never really got the value of it. Maybe someone can elucidate this value.

The idea that you should prefer the simplest explanation of an event in terms of its likelihood of being true: that's the bit that eludes me.

So the question is:

Why would a more complicated explanation of something be less likely to be true, and if it isn't, then what's the point of Occam's Razor?
I believe it's a popular misconception (held by me, until I likewise got a bit curious a while ago and looked into it) that it's about simplicity and complexity.

Rather, it's actually about the number of assumptions each explanation requires. The fewer, the better, as each assumption introduces an additional probability of error :)

So the 'simplicity' is more in terms of how many holes you have to fill in for an explanation to be true.

I think. Someone else'll be along soon, hopefully...
 
I think Occam's Razor is a rule of thumb covering a number of ideas, including that you shouldn't invent new causes for things that can be explained by existing causes, and that you should assume a more commonplace phenomenon is occurring rather than a relatively rare one.
 
One of the mysteries of the universe is that it makes sense.

An example of this would be the principle of least action. The universe appears to prefer efficiency, and it may be that it couldn't be otherwise, although that's a tricky question.
 
[image: chickenrazor.jpg]
 
The razor just sits in his washbag these days since he had the laser treatment.
 
I believe it's a popular misconception (held by me, until I likewise got a bit curious a while ago and looked into it) that it's about simplicity and complexity.

Rather, it's actually about the number of assumptions each explanation requires. The fewer, the better, as each assumption introduces an additional probability of error :)

So the 'simplicity' is more in terms of how many holes you have to fill in for an explanation to be true.

I think. Someone else'll be along soon, hopefully...
Thanks, but I don't get that, in terms of increasing the probability of error.

It's not a plate-spinning contest where the more plates I have to spin, the more likely it is that one will crash.

Complicated and simple explanations don't have any universal likelihood of being more or less true than each other, no?
 
I think Occam's Razor is a rule of thumb covering a number of ideas, including that you shouldn't invent new causes for things that can be explained by existing causes, and that you should assume a more commonplace phenomenon is occurring rather than a relatively rare one.
If it's more commonplace then the question has already been answered.

I thought the point is that you're supposed to favour the simpler explanation, not the more commonplace one.
 
Last edited:
If it's more commonplace then the question has already been answered.

I thought the point is that you're supposed to favour the simpler explanation, not the more commonplace one.
To be clear, when I say the question has already been answered, I mean the question of likelihood.
 
If it's more commonplace then the question has already been answered.

I thought the point is that you're supposed to favour the simpler explanation, not the more commonplace one.
They're often the same.

Colds and other minor viruses are more commonplace than lots of other diseases, so if I feel unwell it's more likely to be one of those than something very rare. So in this case the simpler explanation is the more likely.
 
The way I see it, more complex solutions require more "work" of one kind or another, and everything in the universe tries to do as little work as possible.
 
They're often the same.

Colds and other minor viruses are more commonplace than lots of other diseases, so if I feel unwell it's more likely to be one of those than something very rare. So in this case the simpler explanation is the more likely.
I'll have to give this more thought, and I have to go now; however, my first thought is that if I use the commonness approach, then I'd say it's more common that we explore and look for explanations in situations that are uncommon.

Also, just to add before I go:

Hypothetically: even if it could be demonstrated that the simpler the explanation, the more likely it is to be true (which I haven't yet accepted), why on earth is it constantly used as though that's where we should stop?

For instance, let's say someone is explaining a terrorist attack and claims the simplest explanation was X. Why would Occam's principle be used to stop there and not consider other, less likely versions (sometimes only slightly less likely), like, say, that the terrorist had other motives, or took a different route, or had further collaborators, or that there was a wider conspiracy, etc.?

But again, that could just be a common misuse of the term; and again, that's accepting that the simplest explanation is the most likely.
 
I believe it's a popular misconception (held by me, until I likewise got a bit curious a while ago and looked into it) that it's about simplicity and complexity.

Rather, it's actually about the number of assumptions each explanation requires. The fewer, the better, as each assumption introduces an additional probability of error :)

So the 'simplicity' is more in terms of how many holes you have to fill in for an explanation to be true.

I think. Someone else'll be along soon, hopefully...
Sorry I know I said I was going but again before I go:

For instance, a friend of mine, Bob, had hundreds of pounds stolen from under his bed. He claimed it was the landlord, which means the landlord had to travel there at the right time (so he must have been observing when they leave), he had to look in the right place, etc.

He called the police, and when they came round they suggested that perhaps it was his wife who stole it.

My friend knows it wasn't his wife, because that doesn't even make sense: he knows his wife; they have lived together as one entity for decades (financially and otherwise, meaning she doesn't even need to steal it); they worked out together that it was probably the landlord; and basically it's a bloody stupid suggestion.

However, it is the simplest explanation: I mean, they do live together and she knows where the money is, so I guess it's the most likely to be right, lol.

That's the difference between likelihood and simplicity, to my mind.
 
Last edited:
It is only a heuristic. It's not always right. Or at least, to make it right, you have to shift your perspective.

Newton's law of gravity is nice and simple. It is a force of attraction between two objects that is proportional to the masses of the objects and falls off as the square of the distance between them. It can be expressed with a nice simple equation:

F = G·m₁·m₂ / r²


But it isn't quite right. It's conceptually entirely different from the more predictively accurate description in Einstein's General Relativity, which says that gravity is a result of the bending of space-time by mass. In contrast to Newton's lovely simple equation, Einstein's equation is rather difficult, involving a tricky tensor term (T):

Rμν − ½R·gμν + Λgμν = (8πG/c⁴)·Tμν


As stated here,

The deceptively simple looking equation is actually comprised of ten distinct highly non-linear partial differential equations which describe the behaviour of the system. To write each of the ten equations out in full, in their most general form without any grouping of the terms, would literally take hundreds of pages.

Newton's law is nice and simple, but it contains an explanatory gap: it doesn't explain why gravity happens. Famously, Newton acknowledged this by stating 'Hypotheses non fingo'. So we might prefer complex answers if they provide explanatory power, and also if they are generalisable. That's key. Over-fitting hypotheses so that they perfectly explain one set of circumstances but cast no light on anything else is not great. Ptolemy's system for planetary motion ended up being very ugly; that is a clue that you've missed something. But again, only as a heuristic.
 
Thanks, but I don't get that, in terms of increasing the probability of error.

It's not a plate-spinning contest where the more plates I have to spin, the more likely it is that one will crash.

Complicated and simple explanations don't have any universal likelihood of being more or less true than each other, no?
Imagine that I have a phenomenon that seemingly can be explained by an equation f(x1, x2, … , x50). This equation alone explains 45% of the variability in my data.

Now imagine I introduce another variable, x51. No matter what value I pick for x51, I still can’t get the new model to explain more than 45% of the data.

Do I need variable x51? Does it help? Should I assume I need it in spite of the fact that it doesn't help, and carry it along for the ride, or should I just act as if it is unnecessary?

If I’m carrying it along for the ride, what value should I pick for it? After all, the data doesn’t support any particular answer. Anything I pick is almost certainly completely wrong.
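A minimal sketch of this idea in Python, using made-up data (the names and numbers here are purely illustrative; x51 is constructed as pure noise, which is the assumption being tested):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
y = 2.0 * x1 + rng.normal(size=n)   # x1 genuinely drives y; the rest is noise
x51 = rng.normal(size=n)            # a candidate extra variable, unrelated to y

def r_squared(X, y):
    """Fraction of variability explained by a least-squares fit with intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

r2_without = r_squared(x1.reshape(-1, 1), y)
r2_with = r_squared(np.column_stack([x1, x51]), y)

# Adding x51 barely moves the explained variability, so the razor says drop it.
print(f"without x51: {r2_without:.3f}, with x51: {r2_with:.3f}")
```

Whether an extra term earns its keep by improving the fit is exactly the question the razor asks; here it doesn't, so it's excess baggage.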
 
Imagine that I have a phenomenon that seemingly can be explained by an equation f(x1, x2, … , x50). This equation alone explains 45% of the variability in my data.

Now imagine I introduce another variable, x51. No matter what value I pick for x51, I still can’t get the new model to explain more than 45% of the data.

Do I need variable x51? Does it help? Should I assume I need it in spite of the fact that it doesn't help, and carry it along for the ride, or should I just act as if it is unnecessary?

If I’m carrying it along for the ride, what value should I pick for it? After all, the data doesn’t support any particular answer. Anything I pick is almost certainly completely wrong.

Thanks, and forgive me as I'm writing this quite quickly.

I believe I understand what you're saying, but you've already decided in your example that x51 doesn't increase the 45%. What if it did?

So to put it in more visual terms:

Theory: the simpler the explanation, the more likely it is to be true.

An apple is missing from a table in a room. What happened to it?

Simple explanation = A. Someone walked in and stole the apple.
Complex explanation = B. Someone walked in, did a jig, sneezed, then stole the apple.

If we go by the theory it's A, as doing a jig and sneezing is more to get right and thus less likely. Here the theory is correct.


However again:

It was Burt. What did he do?

Simple explanation = A. Burt walked in and stole the apple.
Complex explanation = B. Burt walked in, did a jig, sneezed, then stole it.

Burt is known to steal apples, and he usually does a jig and sneezes beforehand. If we go by the theory it's A, but here the theory is wrong: B is more likely. With no further parameters and no particular restrictions on its application, we should basically take the theory at face value. The theory is what it says it is, and here it is wrong.

Now you can't just define 'simple' as 'likely' and 'complexity' as 'unlikeliness', and say that B is not more complex because Burt normally does that (whereas if he did something out of the ordinary, that would be more 'complex'), because that would make the theory pointless. It would be like saying that what is more likely is more likely and what is less likely is less likely.
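Putting rough, invented numbers on the Burt example (treating account A as "the bare walk-in, nothing else", so that A and B are genuinely rival accounts):

```python
# Invented probabilities for Burt's known habits: pure illustration
p_jig_sneeze_first = 0.9   # Burt almost always does a jig and sneezes first
p_plain_walk_in = 0.1      # the bare walk-in, with nothing else

p_A = p_plain_walk_in      # the "simple" account: no jig, no sneeze
p_B = p_jig_sneeze_first   # the "complex" account, matching his known habit

# Under these assumed numbers, the longer, more detailed story wins.
print(f"P(A) = {p_A}, P(B) = {p_B}")
```

With any numbers reflecting Burt's habit, the more detailed account comes out more likely, which is the point being argued: detail and improbability are not the same thing.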
 
Whether or not the thief sneezed and/or did a jig has little bearing on the fate of the apple, so both eventualities can be excluded from any theoretical model of events without loss of relevant data.
 
Whether or not the thief sneezed and/or did a jig has little bearing on the fate of the apple, so both eventualities can be excluded from any theoretical model of events without loss of relevant data.
The question was: What did he do?
 
Whether or not the thief sneezed and/or did a jig has little bearing on the fate of the apple, so both eventualities can be excluded from any theoretical model of events without loss of relevant data.
I was assuming that you were referring to the second example. But if you are referring to the first, 'what happened to it', all we need to do is tweak that question slightly to 'what happened?' and we end up with the jig and the sneeze being relevant.

Yes, thanks; it's probably best to alter the question slightly to better illustrate what I'm saying. But I won't edit it.
 
It is only a heuristic. It's not always right. Or at least, to make it right, you have to shift your perspective.
Sorry I didn't respond, but I think this first line sort of agrees with me: the 'not always right' bit. However, I don't believe it is generally understood and applied as a heuristic. It's applied and thought of as an all-encompassing theory, and often applied in a logically wrong way.
 
It’s not “the simpler the theory, the more likely it is to be true”. It’s “don’t add parameters beyond the point that they increase the predictive power of your model”. These are completely different things. If a new parameter does increase the predictive power of your model then great. Nobody is saying that a one-dimensional model is necessarily the best option.
 
It’s not “the simpler the theory, the more likely it is to be true”. It’s “don’t add parameters beyond the point that they increase the predictive power of your model”. These are completely different things. If a new parameter does increase the predictive power of your model then great. Nobody is saying that a one-dimensional model is necessarily the best option.
tbf there are different interpretations. What you say is as per Wittgenstein, who said:

Occam's Razor is, of course, not an arbitrary rule nor one justified by its practical success. It simply says that unnecessary elements in a symbolism mean nothing. Signs which serve one purpose are logically equivalent; signs which serve no purpose are logically meaningless.

But others have a different take, eg this New Scientist writer:

The principle can be applied in many fields of science and logic. If two computer programs do the same job, for example, the shorter one, in which less code can go wrong, is probably preferable. Or if you are a doctor and a patient turns up complaining of a blocked nose, it is more likely they have a common cold than a rare immune-system disorder.

Occam’s razor.

I think you're right, though, regarding Occam's original intention with his razor:

Frustra fit per plura quod potest fieri per pauciora ("It is futile to do with more things that which can be done with fewer").

Occam was mostly talking about God, mind you, not science, ironically enough!
 
Sorry I didn't respond, but I think this first line sort of agrees with me: the 'not always right' bit. However, I don't believe it is generally understood and applied as a heuristic. It's applied and thought of as an all-encompassing theory, and often applied in a logically wrong way.
Being boring, I think you're right but also that kabbes is right.

The idea that you shouldn't use unnecessary steps in a solution kind of is an all-encompassing theory, hence Wittgenstein dealt with it to produce a logical justification. But the more general idea, as outlined by you and by that New Scientist article, is merely a heuristic and should not be applied as if it were anything else.

As often happens, the point of disagreement has its root in semantics. Resolve those differences and you resolve the point of disagreement. Which is very Wittgensteinian, appropriately enough.
 
“don’t add parameters beyond the point that they increase the predictive power of your model”.
In the context of theorising about an event, you can't know which parameters to leave out unless you subscribe to another theory for evaluating those parameters; that theory here is 'the simpler the explanation, the more likely'.

Occam's razor is normally applied in situations where we don't know what has gone on and we are theorising in order to explain it.

The way you talk about it suggests a kind of testing scenario where there is feedback. So, for instance: step 1, try adding x to the formula; step 2, test it out; step 3, if it increases predictability, keep it; if not, chuck it. I believe this is what you are saying Occam's Razor is, and technically perhaps it is; I don't know. If it is, I agree: it is a perfectly logical principle.

However, whenever I have heard it used, it has not been used in this fashion. Perhaps it's just generally misused.

So, for example, to my mind it's more the case that someone would use it to say that Lee Harvey Oswald as a lone gunman is the simplest explanation and therefore more likely than CIA involvement, which to me is logically flawed. As shown in the second example in my earlier post, simplicity does not necessarily equal higher likelihood.

Yes, agreed, we seem to be talking about two different things.
 
Last edited:
I believe it's a popular misconception (held by me, until I likewise got a bit curious a while ago and looked into it) that it's about simplicity and complexity.

Rather, it's actually about the number of assumptions each explanation requires. The fewer, the better, as each assumption introduces an additional probability of error :)

So the 'simplicity' is more in terms of how many holes you have to fill in for an explanation to be true.

I think. Someone else'll be along soon, hopefully...
Sorry to quote you again, LC, but this poster seems to claim that Occam's razor is about simplicity increasing the odds: the more complicated a theory is, the more holes you have to plug.

I think this would be the equivalent of someone betting only on who scores the first goal in a football match, as opposed to a combined bet on who scores the first goal, when they score it, and whether they'll take their shirt off: or something like that. Obviously the simple bet is the most likely to give returns (albeit smaller returns).

But the problem I have with this is that, in the context of theorising about an event, it doesn't necessarily increase your chances of getting to the truth. If the truth happens to be complex (in the sense of involving a number of aspects), then this principle of sticking with simplicity will either direct you to an incorrect conclusion or direct you to only a fraction of the full explanation.
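The combined-bet point can be sketched with invented odds (all three probabilities below are assumptions for illustration, and the legs are treated as independent):

```python
# Invented odds for illustration; legs assumed independent
p_first_scorer = 0.25   # chance of naming the first goalscorer
p_right_minute = 0.10   # chance of also naming the minute
p_shirt_off = 0.50      # chance of also calling the celebration

p_simple_bet = p_first_scorer
p_combined_bet = p_first_scorer * p_right_minute * p_shirt_off

# Every extra leg multiplies in another chance of being wrong.
print(f"simple: {p_simple_bet}, combined: {p_combined_bet}")
```

The multiplication shows why each added claim shrinks the probability of the whole story being right, even though each leg on its own might be quite likely.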
 
Little Baby Jesus, sorry, I haven't got time to go over everything, but you quoted the following:


The principle can be applied in many fields of science and logic. If two computer programs do the same job, for example, the shorter one, in which less code can go wrong, is probably preferable. Or if you are a doctor and a patient turns up complaining of a blocked nose, it is more likely they have a common cold than a rare immune-system disorder.

I see these two examples as different:

The first (computer programs) opts for simplicity due to mathematical probability.

The second (the blocked nose) uses knowledge of commonality, and has nothing necessarily to do with simplicity. A common cold could biologically be just as complicated as an immune-system disorder. In a parallel universe, an immune-system disorder could be more common than a cold, and would therefore be the go-to answer instead.

With the second example, again, you can't claim that something more common is simpler, as that would make Occam's Razor redundant. It would be like simply saying that something more likely is more likely and something less likely is less likely.
 