Steve Jones has reiterated his two main points in this discussion as:
>1. The second law of thermodynamics cannot be ignored in evolutionary
>theory because Evolution and entropy are headed in opposite
>directions; and
>
>2. Naturalistic evolution is contradicted by the second law of
>thermodynamics, *unless there is a pre-existing energy-conversion
>system*:
I side with Pim and Brian on these points. The first part of 1. is true,
but the second part is not. It is true that the SLOT cannot be ignored
in evolutionary theory, but this is not saying much. Since the SLOT
concerns all spontaneous nonequilibrium processes in all macroscopic
physical systems, it also is, by default, relevant for biological systems
(which are not in equilibrium and which are macroscopic enough to contain
very many degrees of freedom) in particular, and therefore, indirectly,
for biological theories such as evolutionary theories. Saying that the SLOT
cannot be ignored is kind of like saying that gravity cannot be ignored.
It's true, but it goes without saying. But, it has not been demonstrated
that "evolution and entropy are headed in opposite directions". On the
contrary, Prigogine has shown how the SLOT *drives* self-organization at
the macroscopic level via the spontaneous formation of dissipative
structures in far-from-equilibrium systems whose disequilibrium is
maintained by sufficiently strong externally enforced gradients of various
intensive thermodynamic potentials across a system's boundaries. It is
true that not all the mechanisms of such self-organization have been shown
in detail for the *very* complicated behaviors, metabolisms, and
organizations of living things, and *especially* for their origin (the
abiogenesis problem),
but the results on *much* simpler systems *are* suggestive. It should be
kept in mind, though, that such a suggestion is neither proof nor hard
evidence. Just how encouraging such a suggestion is for the possible
demonstration of a naturalistic abiogenesis scenario depends, to a great
extent, on how optimistic one is (by prior disposition) about such a
program's chances for success. Steve regards the program as hopelessly
impossible, while Pim considers it far more likely than that. I am
personally doubtful, but am not committed to its impossibility. In any
event, there is no warrant for claiming that 'evolution and entropy are
headed in opposite directions'. At most, one may claim that certain
macroevolutionary steps (and abiogenesis, too, for that matter) are too
large to happen with a reasonable probability, under the conditions
supposed, in the time allowed, with the supposed prior populations
present.
Just because something may not be likely under naturalistic conditions,
and just because entropy and the SLOT are based on probability
considerations regarding a system's microscopic degrees of freedom, that
is not a valid reason to conclude that entropy and the SLOT support the
probabilistic argument concerning the thing's low probability of
occurrence. IOW, not everything related to improbable events relates to
the SLOT.
Steve's second point that "Naturalistic evolution is contradicted by the
second law of thermodynamics, *unless there is a pre-existing
energy-conversion system*" is not correct either. The existence or absence
of an "energy-conversion system" has nothing to do with the SLOT. In the
special case of certain far-from-equilibrium systems Prigogine has shown
that the SLOT can, under certain circumstances, drive the spontaneous
formation of macroscopic dissipative structures that *function as*
energy-conversion systems. For instance, the disequilibrium maintained in
the
earth's ocean-atmosphere system allows the formation of hurricanes (or
typhoons or cyclones depending on where they form). Such a storm can be
thought of as a kind of heat engine doing macroscopic work on its
environment as energy is taken from the warm ocean surface and dissipated
as waste heat in the upper atmosphere, which then cools by radiation into
outer space and by advection into colder climes via the motion of the
tropical Hadley cells. Some spontaneous processes work without any special
energy-conversion mechanism, some of them grow their own energy-conversion
mechanisms, and some of them operate using previously existing such
mechanisms. In any event, the presence or absence of such a mechanism is
not a concern of the SLOT.
Lest Steve think I'm picking on him, my next point mostly agrees with his
opposition to the contention by Pim (and later by Brian and the
authorities that Brian quotes) that "the information theory entropy has no
relationship to the entropy as defined by thermodynamics". Before I get
into this
point let me mention that a couple of the quotes that Steve gave from his
daughter's introductory physics book by Giancoli are misleading and just
about hopelessly confused. So, even though Steve is right about there
being a relationship between thermodynamic entropy and information
theoretic entropy, his appeals to Giancoli's authority weaken his
otherwise stronger case. It is not always a good idea to use low-level
textbooks when arguing a subtle point about physics. Such books by their
very nature have to distort the actual situation (as best as it is known
by physicists) in order to make what they do present understandable to the
neophyte reader. Some books are better than others at minimizing these
distortions. Often the textbook's author is confused him/herself about
some of the more subtle points of advanced physics, and it shows in
his/her simplified treatments of the point in the textbook.
I agree with Steve that information theoretic entropy *does* have a close
relationship to thermodynamic entropy. It's just not a close enough
relationship for the SLOT to necessarily have anything to say about the
behavior of an info-entropy in a non-thermodynamic context, such as in
quantifying the information contained in biological information-bearing
molecules, and making origin-of-life calculations such as those of
Yockey.
I agree with Pim and Brian that no such general info-theoretic 2nd law
exists (AFAWK).
The concept of entropy (first named by Clausius) has been repeatedly
generalized and abstracted from its thermodynamic roots. Boltzmann was
the first to realize the statistical connection between the logarithm of
the number of microscopic states accessible to an isolated macroscopic
system in equilibrium and the (macro)state function which, when changed
quasistatically, changes by the amount of heat reversibly absorbed divided
by the system's current absolute temperature. Later Gibbs extended the
idea to include systems that were not necessarily isolated but were
allowed to exchange some thermal energy with their fixed-temperature
environment, or were allowed to exchange some of their volume with their
fixed-pressure environment, or were allowed to exchange some of their
particles with their fixed-chemical-potential environment. Under these
more general situations Boltzmann's simple entropy formula: S = k*ln(W)
breaks down. It was Gibbs who figured out the formula:
S = -k*SUM_r(ln(p_r)*p_r) for the equilibrium distribution of microscopic
states. In this case the summed parameter r labels each of the possible
microstates of the system allowed by the environmental conditions and p_r
represents the probability that the system happens to be in the r^th
microstate at any given time. When Shannon
used the same formula (except for a trivial change of scale units) in the
context of communication theory he wisely (following von Neumann's
suggestion) called the function the entropy. This was, I think, the first
non-stat mech non-thermodynamic use of the entropy concept. Since then
the concept has flowed, with no small impetus by E.T. Jaynes, throughout
the relatively new field of information theory and throughout the
established mathematical field of Bayesian probability theory more
generally. It is now, essentially, standard practice to consider the
concept of entropy to be both the functional: S = -SUM_r(log(p_r)*p_r)
and its value defined over *any* generic probability distribution {p_r}
defined on any generic (assumed countable) probability space {r}. The
meaning of the entropy for a given probability distribution is that it is
the average minimal amount of information necessary to determine the exact
outcome r for a sample drawn from an ensemble characterized by the
probability distribution {p_r}. IOW, the entropy is the average (minimal)
amount of information about an outcome (averaged over all the outcomes)
*not* contained in the distribution and which is needed *besides* the
information contained in the distribution to specify which one of the
possible outcomes actually obtains when a sample is drawn from that
distribution. The base to which the logarithm in the above formula is
taken defines the units that the entropy comes in. If the base is 2 then
the entropy is in bits; if it is 10 then the entropy is in decimal digits;
if the base is 256 then the entropy is in bytes; if the base is 2^8192
then the entropy is in kbytes. If the base is e^(1/(1.380649 x 10^(-23)))
then the entropy is in joules/kelvin. Thus, in converting entropy from one
unit to another we have: 1 bit = log_10(2) decimal digit = (1/8) byte =
(1/8192) kbyte = ln(2)*1.380649*10^(-23) J/K. As one might surmise, this
last case is *only* used in a thermodynamic context. (There is no other
use in probability theory for such a crazy base of logarithms.)
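
The entropy functional and the unit conversions above can be checked
numerically. What follows is only a minimal sketch: the function names,
the example distribution, and the use of the exact SI value of
Boltzmann's constant are my own illustrative assumptions.

```python
import math

# Assumption: Boltzmann's constant in J/K (the exact SI value).
K_B = 1.380649e-23


def entropy(probs, base=2.0):
    """Return -SUM_r p_r*log_base(p_r), skipping zero-probability outcomes."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p) for p in probs if p > 0) / math.log(base)


def entropy_in_units(probs):
    """Express one entropy value in each of the units discussed in the text."""
    bits = entropy(probs, base=2.0)
    return {
        "bits": bits,
        "decimal_digits": bits * math.log10(2),
        "bytes": bits / 8,
        "kbytes": bits / 8192,
        "joules_per_kelvin": bits * math.log(2) * K_B,
    }


# Hypothetical four-outcome distribution, chosen so the arithmetic is easy.
example = [0.5, 0.25, 0.125, 0.125]
units = entropy_in_units(example)

for name, value in units.items():
    print(f"{name:>18}: {value:.6g}")
```

Changing the base only rescales the result, which is why a single
number in bits fixes all the others. For a uniform distribution over W
outcomes the same function with base e returns ln(W), which is
Boltzmann's formula with k set to 1.
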
As implied above, the thermodynamic entropy of a macroscopic system is a
*special case* of the (now) more general concept of entropy used in
Bayesian probability theory. In particular, the probability distribution
whose entropy is the thermodynamic entropy for a macroscopic system is
the distribution of possible accessible microscopic states consistent with
the system's macroscopic description, i.e. its macrostate. In general, a
given macrostate description determines the corresponding microstate
distribution by the following algorithm: Of the space of all conceivable
microstate distributions {{p_r}} which are compatible with a given
macrostate, the particular one which is the actual {p_r} distribution is
the one which maximizes the S({p_r}) functional, and that maximal value of
S *is* the thermodynamic entropy for that macrostate. This algorithm works
whether or not a given system is in thermodynamic equilibrium. In the
*special case* of an *isolated* macroscopic system which is not in
equilibrium, what happens is that the system's macrostate changes with
time in some perceptible way at the macroscopic level, and as it does the
corresponding entropy S of the maximizing microstate distribution
consistent with the time-dependent macrostate changes with time as well.
The SLOT states that this time-dependent S value is a monotonically rising
one. Once the system reaches thermal equilibrium its macrostate no longer
changes with time, and thus the corresponding S value ceases to rise as
well. This maximal S value is the equilibrium S value for the system and
it depends only on a few extensive macroscopic parameters which
characterize the equilibrium macrostate (i.e. total internal energy, total
volume, total number of particles of each species, etc.).
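
To make the maximization algorithm concrete, here is a toy sketch. It
assumes a hypothetical four-level system whose macrostate is specified
by nothing but its mean energy, and it relies on the standard result
that the maximizer under that single constraint has the Boltzmann form
p_r proportional to exp(-beta*E_r). The energy levels, the target mean,
and all function names are illustrative assumptions.

```python
import math


def entropy_nats(probs):
    """Return -SUM_r p_r*ln(p_r), i.e. the entropy with k set to 1."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def boltzmann(energies, beta):
    """Distribution proportional to exp(-beta*E_r), normalized to sum to 1."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def mean_energy(energies, probs):
    return sum(e * p for e, p in zip(energies, probs))


def max_entropy_distribution(energies, target_mean, lo=-50.0, hi=50.0):
    """Bisect on beta until the Boltzmann distribution has the target mean.

    The mean energy falls as beta rises, so a mean that is too high
    means beta has to be increased.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, boltzmann(energies, mid)) > target_mean:
            lo = mid
        else:
            hi = mid
    return boltzmann(energies, 0.5 * (lo + hi))


# Hypothetical macrostate: four energy levels, mean energy fixed at 1.0.
energies = [0.0, 1.0, 2.0, 3.0]
p_star = max_entropy_distribution(energies, target_mean=1.0)

# Another distribution compatible with the same macrostate, for comparison.
p_other = [0.5, 0.0, 0.5, 0.0]

print("maximizing distribution:", [round(p, 4) for p in p_star])
print("S(maximizer) =", entropy_nats(p_star))
print("S(other)     =", entropy_nats(p_other))
```

Both distributions have the same mean energy, so both are compatible
with the macrostate, but only the maximizer's entropy would count as
the thermodynamic entropy of this toy system.
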
Since thermodynamic/stat mech entropy is a *special case* of the more
general info-theoretic entropy concept, one cannot say that the
thermodynamic entropy has nothing to do with the info-theoretic entropy
concept. After all, the *meaning* of the thermo entropy *is* that it is
the average minimal information necessary to determine, with certainty,
the exact microscopic state of a thermodynamic system, given just the
macroscopic description of that system. As such we see that the
thermo/stat mech entropy *is* an info-theoretic entropy.
But, just because the thermo entropy happens to be a special case of the
general info-theoretic entropy concept, that does *not* mean that the
SLOT applies to *other* different special cases of that concept. In
particular, there is no reason to suppose that the SLOT has anything to do
with entropy measures defined on the distributions of possible base-pair
or amino acid sequences used in various naturalistic origin-of-life
scenarios. I personally think (agreeing with Yockey) that such scenarios
tend to be quite untenable from a probabilistic point of view, but such
conclusions have nothing to do with the SLOT itself.
As if there weren't enough confusion already about entropy and the SLOT,
many authors unfortunately muddy the water even further by appropriating
the term 'entropy' and using it in information-theoretic contexts that go
beyond the above -SUM(p*log(p)) definition. As such they sometimes use it
and the terms 'information', 'disorder', 'uncertainty', 'complexity', etc.
with an assortment of sometimes incompatible and sometimes overlapping
meanings. I believe that the term 'entropy' should be limited to the
above definition. This does not unduly restrict the concept, since there
are many different contexts which define many different probability
distributions, each with its own entropy. Once one considers how various
conditional, marginal, and joint distributions may be related in some
context, one sees how correlations and other such relationships may be
expressed by various entropy differences, sums, and inequalities between
the different entropies defined for the different, but related,
distributions. In all such cases the essential entropy idea is that it is
a measure of the *average* information *missing* about the particular
special outcome for some situation characterized by a probability
distribution.
Now usually, at least to my mind, the term 'uncertainty' also invokes the
image of some information missing about some situation. When there is
uncertainty, the actual state of affairs is not nailed down. Thus, I
believe it is quite permissible to say that the entropy of a distribution
is a particular (nonparametric) measure of the uncertainty associated with
that distribution. I would like to keep the term 'uncertainty' as a
conceptually broader idea than that of entropy. That is, entropy is (or
should be) considered as a *particular* (logarithmic average) measure of
the uncertainty associated with a given distribution. There can/should be
other allowed uncertainty measures though. For instance, the standard
deviation or the variance of a distribution should also be considered as
other (parametric) measures of 'uncertainty'. Entropy (and, more
generally, uncertainty) is a property of a statistical ensemble. It is
determined by the distribution and is a global average property. It is
*not* a property of a given configuration or a realization of some system
unless that realization itself defines some probability distribution. For
instance, suppose we consider a distribution of all possible sequences of
10 bits (say the uniform distribution generated by independently tossing
10 fair coins and associating the outcome 'heads' with the '1' bit and the
outcome 'tails' with the '0' bit). It is proper (according to my
definitions of things, at least) to say that this statistical ensemble of
all 1024 different bit patterns of 10-bit strings has an entropy of 10
bits (just work out the math and this is what comes out). It is not proper
to say that the string 1111100000 has less entropy than the string
1001110110. This is because neither string by itself defines a probability
distribution. They are just two samples taken from a distribution which
*does* have an entropy, however.
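
The coin-toss arithmetic can be worked out explicitly. The sketch below
builds the ensemble of all 10-bit strings and computes its entropy. The
biased-coin ensemble is my own added assumption, included only to show
that the entropy belongs to the distribution and not to any one string.

```python
import math
from itertools import product


def entropy_bits(probs):
    """Return -SUM_r p_r*log2(p_r) for an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def coin_string_ensemble(n, p_heads=0.5):
    """Map every n-bit string to its probability under n independent tosses."""
    ensemble = {}
    for bits in product("01", repeat=n):
        ones = bits.count("1")
        ensemble["".join(bits)] = p_heads**ones * (1 - p_heads) ** (n - ones)
    return ensemble


fair = coin_string_ensemble(10)
biased = coin_string_ensemble(10, p_heads=0.9)  # hypothetical biased coins

print("number of strings:    ", len(fair))
print("entropy, fair coins:  ", entropy_bits(fair.values()), "bits")
print("entropy, biased coins:", entropy_bits(biased.values()), "bits")
print("P(1111100000) =", fair["1111100000"])
print("P(1001110110) =", fair["1001110110"])
```

Under the fair-coin ensemble the two strings are equally probable
samples, and the only entropy in sight is that of the ensemble itself.
The same two strings also appear in the biased ensemble, whose entropy
is lower.
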
As far as the much-abused terms 'disorder' and 'order' go, I wish to keep
them distinguished from that of uncertainty in general, and entropy in
particular. To my mind the ideas of 'order' and 'disorder' conjure up
a particular quality of the arrangement of the parts of a whole.
Thus both order and disorder are properties of individual particular
arrangements of things, not of global distributions of such arrangements.
A perfectly (or most) ordered arrangement is one which is easiest to
specify and requires little explanation to define. A (mostly) ordered
arrangement is one that differs little from a fiducial most-ordered one,
and consequently requires little extra explanation as to how to construct
it by perturbing a most-ordered configuration. A disordered configuration
is one that requires lots of extra instructions about how to construct it,
as there are few patterns which could be exploited to shorten the
instructions.
There is an information-theoretic measure involving such ideas called the
'complexity' or sometimes the 'Chaitin-Kolmogorov complexity'. The CK
complexity of a given arrangement or configuration of something is just
the length in bits (or any other convenient unit of information) of the
shortest possible *complete* description of that arrangement or
configuration. Thus, the complexity measures *both* a configuration's
intrinsic "complicatedness" and the "number" of its parts *and* the
disorder of the arrangement of those many parts. We can then use the CK
complexity idea to measure the disorder of a particular arrangement by
taking the difference of the CK complexity of that arrangement minus the
corresponding CK complexity of a most-ordered configuration (where a
most-ordered configuration is one with a smallest CK complexity taken over
all possible arrangements). Taking this difference effectively subtracts
off the "intrinsic complicatedness" associated with a given system and
just measures the disorder in the arrangement of those pre-existing parts.
Thus, the disorder ought to be considered as the CK complexity associated
with the arrangement of the system's parts, and the CK complexity of a
most-ordered configuration of the system is a measure of how intrinsically
complicated those parts are by themselves.
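
The CK complexity itself cannot be computed in general, but the
disorder-as-a-difference idea can be illustrated with a crude stand-in.
The sketch below assumes that the zlib-compressed size of a
configuration is a usable upper-bound proxy for its shortest
description, and that an all-zeros arrangement is a most-ordered one.
Both are my assumptions for illustration, as are the function names and
the particular configurations.

```python
import random
import zlib


def description_length_bits(config: bytes) -> int:
    """Compressed size in bits: a rough upper-bound proxy for CK complexity."""
    return 8 * len(zlib.compress(config, 9))


def disorder_bits(config: bytes, most_ordered: bytes) -> int:
    """Proxy disorder: description length in excess of a most-ordered one."""
    return description_length_bits(config) - description_length_bits(most_ordered)


n = 4096  # number of two-state "parts", stored one per byte as 0 or 1

# Assumed most-ordered arrangement: every part in state 0.
most_ordered = bytes(n)

# A mostly ordered arrangement: the same, with three parts perturbed.
perturbed = bytearray(n)
for position in (100, 2000, 3500):
    perturbed[position] = 1
nearly_ordered = bytes(perturbed)

# A disordered arrangement: each part set by a seeded pseudo-random toss.
rng = random.Random(0)
disordered = bytes(rng.getrandbits(1) for _ in range(n))

for name, config in [("most ordered", most_ordered),
                     ("nearly ordered", nearly_ordered),
                     ("disordered", disordered)]:
    print(f"{name:>15}: proxy disorder = {disorder_bits(config, most_ordered)} bits")
```

All three arrangements have the same number and kind of parts, so the
subtraction removes that common overhead and leaves only what it costs
to describe how the parts are arranged.
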
One might be tempted to identify the entropy of a given distribution of
possible states of a system with the statistical average (expectation) of
the CK complexity of each of its possible realizations. This is not quite
right. The CK complexity measures the information needed to completely
reconstruct a given realization. The entropy averages, however, only the
information necessary to choose which realization obtains out of the
set of choices given by the probability space. One doesn't need to
reconstruct the outcome from scratch, just pick it out of the set of all
possible ones via the value of some label which uniquely numbers each
possibility.
What about the term 'information'? This term is used much more
consistently in the field and usually means the *quantity* of symbols (as
measured by the length of a sequence of such symbols) taken from some
finite symbol set (e.g. '0' and '1' for bit sequences) needed to encode or
characterize some situation. The term does *not* necessarily have anything
to do with the "meaning" or the quality of the information conveyed by
those symbols. Thus, if a digital version of a given work of Shakespeare
were PK-zipped into the smallest possible file that could be reconstituted
into a printed version of that work, the length of that file in bytes
would be close to (a little longer than) the CK complexity of that work,
where the information unit for the complexity measure is the byte. Say
this file happens to come out 250 kbytes long. Then the CK complexity is a
little less than 250 kbytes and the file, being a 250 kbyte file,
represents 250 kbytes of information that needs to be stored somewhere.
This file represents just as much information as any other 250 kbyte file,
even if the other one is just a file consisting entirely of white noise of
purely random bytes. Information-theoretic measures make no value
judgements about the meaning or the quality of the information quantified.
A symphony which has been maximally compressed down to a length of its CK
complexity has essentially the same amount of information (as defined by
information theory) associated with it as a white noise string of random
bytes of the same length, and is, essentially, statistically
indistinguishable from it.
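
One crude way to see the resemblance between compressed data and noise
is to compare byte-frequency entropies. The sketch below is only
suggestive: zlib is far from a maximal compressor, a byte histogram is
a single weak statistic, and the word list is a made-up stand-in for a
real text. All names in it are illustrative assumptions.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy_bits(data: bytes) -> float:
    """Entropy, in bits per byte, of the empirical byte-frequency distribution."""
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


rng = random.Random(1)

# Stand-in "text": words drawn at random from a small hypothetical vocabulary.
vocabulary = ["to", "be", "or", "not", "that", "is", "the", "question",
              "whether", "tis", "nobler", "in", "mind", "suffer",
              "slings", "arrows"]
text = " ".join(rng.choice(vocabulary) for _ in range(20000)).encode("ascii")

compressed = zlib.compress(text, 9)

# Pseudo-random bytes of the same length as the compressed file.
noise = bytes(rng.getrandbits(8) for _ in range(len(compressed)))

print("text:       %6d bytes, %.3f bits/byte" % (len(text), byte_entropy_bits(text)))
print("compressed: %6d bytes, %.3f bits/byte" % (len(compressed), byte_entropy_bits(compressed)))
print("noise:      %6d bytes, %.3f bits/byte" % (len(noise), byte_entropy_bits(noise)))
```

The raw text uses only a handful of byte values, while the compressed
file and the noise both spread their bytes almost evenly over all 256
values. The compressed file still decompresses back to the text, of
course, which is exactly the "meaning" that these measures ignore.
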
What about the notion of 'specified complexity'? Well, I have yet to
encounter a definition of it which is as objectively quantifiable as the
other info-theoretic concepts discussed above. Even though the precise
meanings of the terms discussed above sometimes vary from author to
author, they can be, and are, used consistently by each author with
well-defined objective definitions--even if different authors don't always
agree on what those objective definitions ought to be. I should point out
that I don't necessarily believe that no objectively quantifiable
definition of 'specified complexity' can exist in principle. I just have
not yet seen one.
In order to minimize the confusion related to the field of information
theory I would like to see all the workers in this field adopt a
standardized set of definitions for the terms that I have discussed here.
I would prefer that the standard set closely match my suggested set, but I
am flexible on this point.
David Bowman
dbowman@gtc.georgetown.ky.us