You said there was a mathematical model that could show 'why and when one changes from ad-hoc modifications to current theory to shifting to an entirely new theory'. You didn't explain what that meant, and you knew it wouldn't be clear to non-mathematicians, which is why you said it. There are a couple of excellent mathematicians here; I'm not one of them. You hoped to intimidate me. I understood you as saying that you can explain the progress of science by applying a mathematical model, divorced from the activity that produces it, i.e. no need for social science here. You then edited your post to explain what you meant, but again did it in a highly arrogant way, meant to let us all know how superior you are, and still didn't provide clarity.
Maths doesn't make me queasy. People applying inappropriately abstract concepts to the messy stuff of human society, that makes me feel queasy, because it seems mad, divorced from life. If that's not what you were doing, then fair enough, but you write in a way that makes it very difficult to understand what you mean; it's like you're arguing with a version of yourself.
I don't think my personal visceral reactions do constitute arguments; they're descriptions, not explanations. Sometimes that's all we have, and it's imperious to insist otherwise.
Ok maybe we got off on the wrong foot here, and I wasn't trying to intimidate you, I guess I was just assuming you had more background knowledge. As for clarity, let me restart and try to be more "pedagogical" rather than "argumentative" for lack of a better way of putting it.
In science - specifically physics - what we do is use models to explain observations. These models are expressed in precise mathematical terms. For example, if I let some objects fall to the ground, then I can observe the height from which I let them fall and the time they took to hit the ground. These constitute observations; they are the data. I can then use a model, for example Newtonian gravity, to explain these observations. In particular, I can use this model to predict the time the objects took to hit the ground based on the height from which they fell. Fundamentally, that's what science is: finding models to explain and predict observations. The question then becomes: out of a number of different models for explaining a given set of observations, which should I choose? That's the problem of model selection. There are many ways to do this, but since the advent of algorithmic information theory in the 1960s and 1970s there is one related set of formalisms that is particularly suited to formalizing what are called paradigm shifts (which are, in and of themselves, merely instances of model selection), and that is where Kolmogorov and Turing come in.
Let's introduce some notation first. Both observations and models will be given as a finite string of symbols from a finite alphabet, enclosed in quotes. For example, if we say that the symbol L represents letting go of an object and the symbol D represents the object falling down, then "LDLDLDLD" represents the observation of letting go of an object 4 times and watching it fall down 4 times. Two dots (..) will represent an arbitrarily large yet still finite number of repetitions of the preceding characters. For example, "LD.." represents letting go of an object and watching it fall down an arbitrarily large number of times. Vertical bars | | will be used to denote length, for example |"LDLD"| = 4. Lastly, -> represents a mapping and will be used in model descriptions. This will become clear in a moment.
Now suppose we have the observation "LD.." and we are looking for a model for this observation. In particular, and this is the crucial point, note that the observation has a certain pattern. Writing it out a bit longer, "LDLDLDLD..", you can see there's a pattern there: every L is succeeded by a D. The other part of the crucial point is that such strings can be compressed. We could write the data "LD.." as the combination of "L->LD" and "L..". We'll call the original "LD.." the data, "L->LD" the model, and "L.." the compressed data. The original data can be recovered by applying the model to the compressed data. In our little toy example the model "L->LD" is a crude law of gravity, saying that if you let go of something then it falls down. This brings us to model selection: which model should I choose? The answer is that you should choose the model for which the concatenation of the model and the compressed data has minimum length (roughly speaking, the minimum such achievable length is called the Kolmogorov complexity of the data).
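To make the toy example concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names apply_model and description_length are made up for this post, the ".." is replaced by a fixed n = 50 repetitions, and "length" is simply counted in symbols.

```python
def apply_model(rule: str, compressed: str) -> str:
    """Expand compressed data with a rule written as 'X->Y' (every X becomes Y)."""
    lhs, rhs = rule.split("->")
    return compressed.replace(lhs, rhs)


def description_length(rule: str, compressed: str) -> int:
    """Length of model plus compressed data, counted in symbols (an assumed measure)."""
    return len(rule) + len(compressed)


n = 50                      # assumed finite stand-in for the ".." notation
data = "LD" * n             # the observation "LD.."
rule = "L->LD"              # the model: a crude law of gravity
compressed = "L" * n        # the compressed data "L.."

# The original data is recovered by applying the model to the compressed data.
assert apply_model(rule, compressed) == data

print(len(data))                             # 100 symbols with no model at all
print(description_length(rule, compressed))  # 55 symbols for model + compressed data
```

With these numbers the model pays for itself: 5 symbols of model save 50 symbols of data, so "L->LD" is preferred over storing the raw observation.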
Now for how this ties into the question of either adding an ad-hoc modification to an existing model or switching to an entirely different model (normal science vs paradigm shift). Suppose you have data D and two competing models M and M' giving the same compression ratio of the data, with |M| << |M'|. At this point you're going with M, given that it leads to a smaller length of model + compressed data. Now suppose new data comes in that isn't explained by M (those "new facts" that you referred to). We have the choice of either adding an ad-hoc modification to M or switching to M'. At first we'd add an ad-hoc modification to M, making M slightly larger. If we keep doing this, though, then at some point we'll have made M so large that it has become larger than M': rather than |M| < |M'| it is now |M| > |M'|, and hence we switch from M to M' - that's the paradigm shift. That's what I meant by saying that formalisms exist in which the notion of why and when to switch from adding ad-hoc modifications to existing models to using an entirely different model can be made precise.
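The switching point can be sketched in a few lines of Python. This is a toy under loud assumptions, not an actual procedure from the literature: the function name first_switch, the fixed patch_cost per ad-hoc modification, and the example sizes are all invented for illustration, and it assumes M' explains the new data without any patching and that both models compress the data equally well, so only model length matters.

```python
def first_switch(len_m: int, len_m_prime: int, patch_cost: int,
                 max_patches: int = 1000):
    """Return the number of ad-hoc patches after which the patched M is
    longer than M', or None if that never happens within max_patches.

    Assumes every patch adds the same number of symbols to M (a simplification).
    """
    if patch_cost <= 0:
        raise ValueError("patch_cost must be positive")
    for k in range(max_patches + 1):
        if len_m + k * patch_cost > len_m_prime:
            return k
    return None


# Hypothetical sizes: |M| = 10, |M'| = 40, each ad-hoc modification adds 4 symbols.
print(first_switch(10, 40, 4))  # 8, since 10 + 8*4 = 42 > 40 but 10 + 7*4 = 38 is not
```

So in this made-up case the first seven anomalies are absorbed by patching M (normal science), and the eighth tips the balance towards M'.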
And lastly for the disclaimers. First of all, I have of course left out a whole lot of stuff and taken some shortcuts to try to make it easy, and so there are several technical inaccuracies in the above. Secondly, this particular formalism (and there are many related ones) isn't how this is done in practice, foremost because Kolmogorov complexity is uncomputable (there is no possible algorithm that will give you the Kolmogorov complexity of an arbitrary string), and also because actually formalizing actual physics models and data in this way would be an incredible amount of work for little practical benefit. What it does do, however, is give a precise formal notion as to why and when either ad-hoc modifications to existing models are made or entirely different models are substituted.
Edit: As a last disclaimer, since you brought up the social sciences, if your models and data are not mathematically precise then they are not amenable to algorithmic information theory and the above obviously doesn't apply. I'm not familiar enough with the social sciences to readily tell, so for the sake of argument just assume I'm talking about physics only.