The Talk.Origins Archive Post of the Month: December 2007

Subject:    The Relationship of Gaps to Thresholds
Date:       04 Dec 2007
Message-ID: 1bc29aa6-af69-4610-867b-b06ea2922c7f@d61g2000hsa.googlegroups.com

Beginning from comments by Sean Pitman:
>>> In this same line, your wild idea that the minimum likely
>>> gap distance is therefore always the minimum possible
>>> distance (i.e., one mutation) is equally inane.

>> Quite the opposite. What I say is that, for those functions
>> that *did* evolve, they did so *because* there was a short
>> pathway (in number of mutational steps) available. If you
>> remove the short pathway, the function will not evolve. That
>> is, those functions that *have* evolved represent a biased
>> sample.

> It is your position that everything evolved. The argument is
> over that position. You cannot assume the truth of the very
> position that is being questioned. You have to come up with
> some sort of evidence to support this position. You can't
> just assume it as a given in this particular discussion.

Yes. *If* my mechanism is correct and new functions arise by modification of old structures, I would expect, especially for recently evolved 'new' functions, that there should be substantial sequence similarity between the old structure and the new 'function' that is derived from it. But, importantly, for recently evolved 'new' functions, there would be greater than necessary similarity. That is, like hemoglobin's globin genes, there will be similarity in features that are not *required* for function, such as the placement of introns. And, to the extent that selection is conservative, there will be specific retention of selectively relevant sites. Enough so that genes can be grouped into families of proteins. Now, we can actually check if this, in fact, is what we see.

*If* Sean's model of evolution were correct, each new function must start at some average distance away (that distance being a function of the size of the end product) and must *independently* evolve by a random walk. That would only produce families of genes if there were only one way to *independently* produce a particular function. Otherwise, and especially if there were more than one possible way to produce a function from scratch, in each organism that formed such a function, that function would randomly be this structure that can perform the function or that structure that can perform the function. Yet, what we *see* is lineage-specific structures that perform a function. That, at the very least, tells us that once a function has formed, the specific structure producing it must be transmitted in a lineage-specific fashion.

But we know that, although there is a bias favoring the first functional sequence or structure, since organisms with that can fill the niches where such a sequence or structure is useful, independent generation of functions can and does occur. In each case of this, however, we can clearly observe that the new structure arose by an independent process (e.g., the fins of fish, whales, and penguins; the rotary flagella of eubacteria and archaea). Moreover, such structures and functions are retained (although sometimes secondarily lost) within lineages determined independently. Moreover, we know that there are structures that can perform a function that 'nature' has not found (or, if found, was not utilized because an already more optimized structure for performing that function existed, making the new version selectively irrelevant).

Now Sean does not say so, but his *real* model involves magical poofing of each protein sequence *independently* in *each* created organism at the time that that organism was magically poofed into existence by a magical poofer. The only reason I can think of for why a magical poofer would produce the pattern we see of lineage-specific variation within genes with the same function, and also why it would intentionally produce independent structures to perform a function in different lineages rather than borrow a structure, is the same reason why humans (the only intelligent agents we know the motives of) fraudulently produce a historical pattern or record: in order to deceive other humans. If that is what he wants his god to be, that is his business.

*If* my mechanism of evolution is correct, then the entirety of known protein structures will be 'clustered' rather than being found in widely separated randomly placed 'islands' (precisely because not all of protein structure space has been explored). The structure of the space bounded by and including the functional sequences that *do* exist will be organic, with flows outward and along useful axes as nearby structure space is gradually explored.

*If* Sean's model of evolution is correct, then I would expect to see a random pattern representing the randomly placed functional structures in total sequence space. You can make out randomly generated 'clusters' or constellations that happen by chance alone, but you cannot drape a skin over the entire structure. Further, an analysis of the pattern of proteins in structure space, in my model, would show that *existing* proteins are NOT randomly distributed in total structure space. They should be clustered. In Sean's idea that evolution is a random search for randomly distributed functional structures (and, again, I point out that I am using structure here because structure better correlates to function than sequence does), one would *expect* existing functional proteins to be randomly distributed in structure space. They are not.
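The two expectations above can be contrasted in a toy simulation. This is only a sketch under loud assumptions: "structure space" is reduced to a 2-D plane, the function names (`uniform_points`, `branching_points`, `mean_nearest_neighbor`) and the step size are hypothetical choices made for illustration, and nothing here uses real protein data. It shows only that points generated by small modifications of existing points sit much closer to their neighbors than independently placed points do.

```python
import math
import random

def uniform_points(n, rng):
    """Toy 'random islands' model: n points placed independently in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def branching_points(n, step, rng):
    """Toy 'modify what already exists' model: each new point is a small
    random displacement from a randomly chosen existing point."""
    pts = [(0.5, 0.5)]  # a single starting structure (assumption)
    while len(pts) < n:
        x, y = rng.choice(pts)
        pts.append((x + rng.gauss(0, step), y + rng.gauss(0, step)))
    return pts

def mean_nearest_neighbor(pts):
    """Average distance from each point to its closest other point."""
    total = 0.0
    for i, p in enumerate(pts):
        total += min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
    return total / len(pts)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200
uniform_nn = mean_nearest_neighbor(uniform_points(n, rng))
branching_nn = mean_nearest_neighbor(branching_points(n, 0.01, rng))

print("mean nearest-neighbor distance, independent placement:", uniform_nn)
print("mean nearest-neighbor distance, branching from existing:", branching_nn)
```

A nearest-neighbor statistic is just one simple way to tell the two patterns apart; any real test would have to be run on actual structure data, as the discussion of the Choi and Kim paper further down indicates.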

In Sean's *real* idea about how he really thinks protein structures came into existence (by magical poofing by a magical poofer, a cheap magician of a god, and one apparently intent on producing a false historical pattern), one could expect the observed pattern because that would be part of the deception.

So, do I merely *assume* that there is a detectable pattern in protein structure and sequence that differs from the one that would be generated by Sean's ideas, which are a simple-minded minor modification of the "747 in a tornado" strawman evolutionary model? No. There are ways to test these models. They have different expectations. And, though this may be a surprise to Sean, I actually agree that the model of evolution he is arguing against has been falsified (although it may have produced some minor chance 'surprises'). The problem is that falsifying Sean's strawman model of evolution does not falsify real evolutionary models.

Sean also produces bogus numerology to support his 'falsification' of a model of evolution that no evolutionary biologist would hold. Telling us that the ratio of functional cytochrome c sequences to total sequence space is the ratio of 'beneficial sequences' to 'non-beneficial sequences' is one example; assuming that total sequence space is the relevant denominator to represent total trials is another.

> Of course, if a particular system did in fact evolve it couldn't have
> crossed a large gap distance of the size I'm suggesting - I agree with
> that. The distance would have to have been quite small indeed from
> what came before. The question is, is such a small gap distance at
> all likely to have ever existed for certain types of functional
> systems? That is the real question here.

And your numerology-based response is irrelevant, because the denominator you use is irrelevant.

> If the size of the minimum gap distance is in fact related to the
> minimum structural threshold requirements for a system, then the
> answer to this question for higher level systems is no. It is not at
> all likely or reasonable to suggest that the gap distance would ever
> have been small enough to cross - - even given trillions upon
> trillions of years of time.

And this is an assertion based on numerology alone. If you really want to demonstrate that there is a correlation between gap size and total size, you need to present *data*, not numerology. Otherwise it is merely an assertion based on an assumption.

> And, it is not likely that any higher level system will ever
> be evolved in the future even over trillions upon trillions of
> years of time.
>
> What do I mean by a higher-level system? I mean a system that
> has a minimum structural threshold requirement of more than
> 1,000 specifically arranged amino acid residues and/or codons
> of genetic real estate. Is this threshold requirement the gap
> distance as you keep claiming? No. Absolutely not. As I've
> noted for you more times than I can count now, the gap
> distance is always smaller than the threshold size.

BUT, as I keep pointing out, you do not mention gap distance UNTIL I remind you that that is the number you need. I am merely trying to get you to *honestly* present *your* argument rather than hide *your* argument by only using the size number. I am NOT lying about *your* argument when I point out that the number you need is "gap size" and not "total size". I am trying to get you to *honestly* present *your* argument.

> They are not the same thing. Is there a relationship between
> the two? There most certainly is and this relationship is
> linear.

Except that you have presented no actual data to support this. You are merely *assuming* (actually, repeatedly asserting) that the relationship is linear. I see 'new' modifications of old proteins (such as resistance to antibiotics) that are not a function of the size of the old proteins. I also see many differences that are functionally irrelevant. I see evidence of *quantitative* and *qualitative* features (secondary substrate, minor activities) in current proteins which can be amplified or changed into bifunctional proteins. And I also observe that substitutability in sequence, if anything, gets larger with size (the real correlation is with the fraction of sequence that is involved with the substrate, which generally gets smaller as the protein gets larger).

> A linear increase in the threshold limitation results in a
> linear increase in the minimum gap distance that exists
> between anything in a gene pool and the closest potentially
> beneficial system to that gene pool at a given level of
> minimum threshold requirements.
>
> Where is the evidence for this relationship? For one thing,
> it is very clear, even to mainstream science, that the vast
> majority of potential sequences would not beneficial to a
> particular organism in a particular environment. Even Richard
> Dawkins in his book, The Blind Watchmaker, first chapter,
> notes the following:
>
> "But, however many ways there may be of being alive, it is
> certain that there are vastly more ways of being dead, or
> rather not alive."
>
> This fact is so obvious that it is very difficult for anyone
> to deny - even you if you are honest with yourself.

I don't deny it. Any time you have a complex system, no matter how it came into being, there are more ways to destroy it than to improve it. But that doesn't tell us squat about how the system came into existence. That is one of Behe's fallacies about IC systems.

> The next step is to consider the relationship this fact has to
> increasing minimum structural threshold requirements. How
> many additional potentially beneficial sequences/structures
> are added to sequence space with an increased threshold
> requirement of just one amino acid? While the exact figure
> cannot be known, what can be known is the trend. The size of
> sequence/structure space increases 20 fold. It is 20 times
> bigger than it was less one residue. According to Richard
> Dawkins's argument, what ratio of this increase should be
> comprised of potentially beneficial sequences/structures?

Somewhere between 0 and 20, depending on whether that particular amino acid plays a crucial role, is irrelevant to function (say at the carboxy end), or only plays a role in maintaining a secondary structure. Besides, although there are means for either reducing or increasing the size of a protein by one or a few aa residues (and many functionally equivalent proteins with such deletions and additions exist in nature), there is no necessary increase in the ratio of non-beneficial to beneficial proteins from adding or deleting a single aa residue or even changing one. IOW, even here there is no significant correlation between adding an aa residue and adding non-functional sequences.
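The "somewhere between 0 and 20" answer can be put as simple arithmetic. The sketch below is an illustration only: the function name and the starting fraction of 1/1000 are hypothetical, and it assumes the added position acts independently of the rest of the sequence and that no previously non-functional sequence becomes functional. Under those assumptions, adding one residue multiplies sequence space by 20 and multiplies the number of functional sequences by the number of residues tolerated at the new position.

```python
from fractions import Fraction

ALPHABET_SIZE = 20  # the standard amino acids

def functional_fraction_after_adding(fraction_before, tolerated):
    """Functional fraction of sequence space after appending one position
    at which `tolerated` of the 20 residues preserve function.

    Assumption (illustration only): the new position acts independently,
    so the functional count scales by `tolerated` and the total by 20.
    """
    if not 0 <= tolerated <= ALPHABET_SIZE:
        raise ValueError("tolerated must be between 0 and 20")
    return fraction_before * Fraction(tolerated, ALPHABET_SIZE)

before = Fraction(1, 1000)  # hypothetical starting fraction, not a measured value
for tolerated in (0, 1, 10, 20):
    after = functional_fraction_after_adding(before, tolerated)
    print(tolerated, "tolerated residues:", before, "->", after)
```

When every residue is tolerated at the added position, the fraction does not change at all, which is the point: the 20-fold growth of sequence space by itself says nothing about how the ratio moves.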

> Obviously, the ratio of the 20 fold increase that might
> actually be beneficial would be a very small fraction. The
> increase in potentially non- beneficial sequences vastly
> outpaces the increase in potentially beneficial sequences.
> This is true in all language/information systems.

Do you actually have any data to support your assertion for actual proteins? No. An argument by analogy does not work here. Protein "meaning" or "function" is not a direct consequence of sequence. Your numerology below is meaningless GIGO wrt the systems you need to look at, real proteins.

> To illustrate this point, simply consider the English language
> system. How many defined meaningful 2-character sequences are
> there? Well, out of 677 possibilities, there are 96 of them.
> This creates a ratio of meaningful vs. non-meaningful of 1:6.
> What happens when the threshold increases to 3-characters? The
> number of meaningful sequences also increases to 972, but the
> number of non-meaningful sequences increases much much more to
> 16,604 for a ratio of 1:17. Let's increase the threshold to
> 7-character sequences. The number of meaningful sequences
> increases to over around 25,000 while the number of
> non-meaningful sequences increases to 8,031,785,176 for a
> ratio of less than 1 in 30,000.
>
> Do you notice a pattern here? The very same thing happens in
> all language/information systems to include computer code and
> genetics. For further confirmation, I've shown you the decline
> in the ratio for protein-based systems that already exist in
> living things. All one has to do is do a blast search to
> compare the number of existing systems at different levels to
> the size of sequence space at that level while adding a
> generous degree of flexibility for each unique system (like
> 1e90 per 100aa - a generous suggestion given the numbers
> listed in literature). I've also given you papers such as the
> one by Choi and Kim which demonstrate this exponential decline
> in ratio as well as the linear increase in gap distance with
> an increase in minimum size requirements:
>
> http://www.pnas.org/cgi/content/full/103/38/14056

Bullshit. That paper does not tell you that. And *I* was the one who pointed it out to you. That paper *explicitly* points out that the proteins of life are NOT randomly distributed in structure space. Nowhere in the paper is there *any* statement that there is a linear increase in gap distance with an increase in size. You seem to be willing that interpretation into existence by looking at a figure that does not include all functional proteins that have ever existed, but only a sampling of such proteins. And even in that figure, large functional proteins appear to be the result of an organic outgrowth due to combining smaller ones, as would be expected by the *real* evolution model.

Sean, your interpretation of that paper is pure imaginary bullshit and is unrelated to what that paper presents. What that paper clearly shows is that all the functional proteins that have been found have been found within an organically connected single search space and not found in randomly distributed sites throughout total structure space. Your 'evolution model', the strawman idea of the "747 in a tornado" model, would predict the latter.

> Yet, somehow, despite the obvious implications of the data,
> you refuse to recognize that your notion of small gap
> distances existing at high levels is simply untenable.

That is because you seem to be looking at black and claiming it is white wrt this paper. Your misinterpretation of the data in this paper puts your incompetence in accurately stating what Yockey's numbers really represent (which is not the ratio of "beneficial sequences" to "non-beneficial sequences") to shame.

> You keep saying, because we know things evolved, the gap
> distances had to be small. That notion is based on nothing
> but blind faith. You have absolutely no genetic or molecular
> evidence of any kind to back up this bald assertion when it
> comes to actually demonstrating these small gaps you believe
> exist at such high levels. They simply don't exist.

I certainly do have supporting evidence against your model of what you call 'evolution'. And the evidence supports my model: that not all sequence space has been searched and that the search involved repeated searches from pre-existing functional sequences that have found nearby functional sequences. Certainly there is no evidence that each organism evolved each function independently of each other, purely on the basis of lineage-limited changes of proteins with the same function. I have described this above.

> This is supported by the fact that no novel system of function
> has been shown to evolve beyond the 1,000aa threshold level.

This is just an assertion based on shifting definitions of what "the 1000 aa threshold level" means and what constitutes a "novel system of function". Apparently you mean that no one claims to start with a random sequence 1000 aa long that has, according to you, a gap of 300 aa which must change completely randomly with no intermediate functions *at all* until some magical moment when it hits a *randomly* distributed functional island. My point is that this is irrelevant.

> It just doesn't happen and statistically it will not happen
> and therefore almost certainly never did happen.

Actually, Sean, I *agree* that it would not happen by the mechanism you imply, starting from some random sequence some large number of sequences (300? 50?) away from some teleologic functional goal. My argument is that this is a strawman model of evolution, not that it is possible for the functions that have arisen to have arisen this way.

> Not even one of your proposed steppingstones along the pathway
> toward higher-level systems, like flagellar motility, has ever
> been shown to evolve, not one step.

I agree. Not a single step in the evolution of flagella occurred by the strawman mechanism you propose. Instead, each step occurred by a short sequence of events from a pre-existing *functional* structure. And I have indeed shown, by virtue of a model system, that the acquisition of a *function* (rotary motility) can arise in a single step from pre-existing systems that could, each, easily have useful functions in a cell (because related structures do).

> The reason for this, obviously, is that the gaps between your
> proposed steppingstones are much much farther apart than you
> and others like Kenneth Miller seem to realize.

> You both need to go back to the data and actually consider
> that what seems morphologically like a small enough gap is
> actually huge statistically.

Only when you depart from actual evidence and resort to numerology equivalent to saying that no one can find a Starbucks because, if you divide the area covered by Starbucks by the totality of surface space on the earth, the probability that one would be close to you is quite small.
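The Starbucks point can be sketched in a toy sequence space. Everything below is an illustrative assumption rather than data: the 30-residue length, the family size of 100, and the `grow_family` helper are hypothetical, and the "functional" family is simply constructed by single substitutions from existing members. The sketch computes two different numbers: the fraction of total sequence space the family occupies, and the distance from each member to its nearest fellow member.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 one-letter codes

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def grow_family(seed, size, rng):
    """Build a toy family in which every new member is exactly one
    substitution away from some member that already exists."""
    family = [seed]
    seen = {seed}
    while len(family) < size:
        parent = rng.choice(family)
        pos = rng.randrange(len(parent))
        new_residue = rng.choice([aa for aa in AMINO_ACIDS if aa != parent[pos]])
        child = parent[:pos] + new_residue + parent[pos + 1:]
        if child not in seen:
            seen.add(child)
            family.append(child)
    return family

rng = random.Random(1)  # fixed seed so the run is repeatable
length = 30
seed = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
family = grow_family(seed, 100, rng)

# The "area covered by Starbucks divided by the surface of the earth" number.
global_fraction = len(family) / len(AMINO_ACIDS) ** length

# The "how far is the nearest one from where I already stand" number.
nearest = [min(hamming(s, t) for t in family if t != s) for s in family]

print("fraction of total sequence space occupied:", global_fraction)
print("largest nearest-neighbor distance within the family:", max(nearest))
```

The nearest-neighbor distance of one step is true here by construction, so this demonstrates nothing about real proteins. It only shows that a vanishingly small global ratio and a short local gap are answers to two different questions.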

> Yet, you evolutionists stick with your hand waving and bald
> assertions that the morphologic gaps are small enough without
> ever doing any real statistical analysis on your assumptions.
> There are no mathematical calculations or analyses in
> literature when it comes to the estimated time needed for your
> proposed mechanism to cross any of these suggested
> steppingstones of yours.

The amount of time needed to find a bacterium resistant to streptomycin is one day. The amount of time needed to find not one but two different mutations that generated a functional motility system *different* from the original system was less than a month. Both of those involved changes in function of quite large systems.

> All that one can find are assumptions that given millions of
> years such a gap would certainly be crossed. That's because
> those who makes such assumptions don't actually bother with
> any real numbers or statistical calculations when it comes to
> evaluating the actual mechanism proposed.

That is because the numbers that you present are GIGO numbers. And the real mechanism is not one in which such numbers are relevant. Historical constraint is more important.

Using historical constraints in protein structure to detect "magical poofing" in Intelligent Design
