Text 78, 169 lines
Written 2004-09-17 06:09:00 by Guy Hoelzer (1:278/230)
Subject: Re: Dawkins gives incorre
=================================
in article cicd21$f2f$1@darwin.ediacara.org, Tim Tyler at tim@tt1lock.org
wrote on 9/16/04 8:55 AM:
> Guy Hoelzer <hoelzer@unr.edu> wrote or quoted:
>> in article ci7mqk$24qd$1@darwin.ediacara.org, Tim Tyler at tim@tt1lock.org
>>> Guy Hoelzer <hoelzer@unr.edu> wrote or quoted:
>>>> in article chvng2$2hqs$1@darwin.ediacara.org, Tim Tyler at
>>>> tim@tt1lock.org:
>>>>> Guy Hoelzer <hoelzer@unr.edu> wrote or quoted:
>>>>>> in article chsg65$1hqg$1@darwin.ediacara.org, Tim Tyler at
>>>>>> tim@tt1lock.org:
>>>>>>> Guy Hoelzer <hoelzer@unr.edu> wrote or quoted:
[snip]
>>> A frequency is normally a measurement of the number of times that a
>>> repeated event occurs per unit time.
>>
>> I am aware of that definition, but I am using a different conventional
>> meaning. This distinction might be a source of some of our differences.
>> The definition I am using is the one I believe to be most commonly used in
>> the biological sciences, and it is well represented by the one expressed by "A
>> Dictionary of Ecology, Evolution, and Systematics." It reads:
>>
>> "The number of items belonging to a category or class; the number of
>> occasions that a given species occurs in a series of examples."
>>
>> This dictionary does not list any other definitions for "frequency."
>
> I note that that still doesn't result in a series of numbers that add
> up to 1.0.
Good point. That definition failed to mention that the frequency is "the
number of items belonging to a [particular] category or class" DIVIDED BY
THE TOTAL NUMBER OF ITEMS IN THE POPULATION. That is why they must add to
unity.
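To make that concrete, here is a quick sketch in Python (the allele counts
are made up, just to illustrate the arithmetic):

    # Hypothetical allele counts at one locus in a sample (made-up numbers)
    counts = {"A1": 12, "A2": 30, "A3": 18}

    total = sum(counts.values())          # 60 items in the sample

    # Frequency in the population-biology sense = count / total
    freqs = {allele: n / total for allele, n in counts.items()}

    print(freqs)                 # {'A1': 0.2, 'A2': 0.5, 'A3': 0.3}
    print(sum(freqs.values()))   # 1.0 (up to floating-point rounding)

However you partition the sample into categories, those proportions must sum
to one.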
[snip]
>> A good resource for learning about AIC and its application (IMHO) is the
>> book:
>>
>> Burnham, K. P., and D. R. Anderson. 1998. Model selection and inference: a
>> practical information-theoretic approach. Springer-Verlag, New York, New
>> York, USA. 353 pp.
>>
>> The authors explain why Kullback-Leibler information is more fundamental
>> than Shannon information and show that it is more general (it includes
>> Shannon information). It is Kullback-Leibler that is assumed under the AIC
>> paradigm, which does not posit an hypothetical observer, according to the
>> authors. Instead, they argue, the set of AIC values (or adjusted analogues,
>> such as AICc) that you get out of a comparative analysis express the
>> relative distance of competing models from objective Truth. That claim took
>> me by surprise when I first ran across it, but you really have to examine
>> the theory closely to make an informed judgment about it.
>
> I had never heard of Kullback-Leibler information.
>
> I visited http://googleduel.com/ with the terms
>
> "shannon information" and "Kullback-Leibler information"
>
> Shannon information won by more than 100 to 1.
I am not surprised, but this is very misleading with regard to the
fundamentals of information theory. I think that if you ask anyone deeply
involved with information theory, they will tell you that KL information is
more general and fundamental to the theory than Shannon information, which
is a subset of KL information.
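To sketch the connection numerically, here is a toy example in Python (the
two distributions are invented, and this is just the textbook definitions,
not anything taken from Burnham and Anderson):

    import math

    # Two hypothetical probability distributions over four categories
    p = [0.40, 0.30, 0.20, 0.10]   # the "true" distribution
    q = [0.25, 0.25, 0.25, 0.25]   # a model; here, the uniform distribution

    # Shannon entropy of p (in nats)
    H_p = -sum(pi * math.log(pi) for pi in p)

    # Kullback-Leibler information (relative entropy) of p relative to q
    D_pq = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

    print(H_p)                      # ~1.2799 nats
    print(D_pq)                     # ~0.1064 nats
    print(math.log(len(p)) - H_p)   # equals D_pq because q is uniform

Against a uniform model q, D(p || q) = log(n) - H(p), and in general the
cross entropy decomposes as H(p, q) = H(p) + D(p || q), which is one way of
seeing the Shannon quantities as a special case of the KL quantity.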
> Maybe an option for you would be to use one of the terms
> referring to this quantity - if it is what you are talking
> about.
I am pretty sure those terms refer to the same quantity I have been
discussing, although I appreciate that the sharing of terms among models can
lead to confusion.
> The terms "relative entropy", "divergence", "directed divergence",
> and "cross entropy" all appear to refer to this metric.
>
> The metric represents a measure of distance between two probability
> distributions. If the distributions are given, then the metric does not
> depend on who measures it.
>
> However Shannon information does not normally consider the probabilities
> it is considering to be given and agreed-upon in advance - instead it
> allows the possibility that different observers may have different
> information about the events and may make different estimates of
> their probabilities. In the terminology of relative entropy,
> they would be said to be considering different models.
>
> If you calculate the /relative entropy/ between the predictions of
> different models and some fixed set of observations then you would
> indeed arrive at different values.
>
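To make that concrete, here is a toy sketch in Python (the observed
frequencies and the two models' predictions are invented): the KL distance
from a fixed set of observations to each model's predictions does indeed
differ from model to model, with the model that sits closer to the data
getting the smaller value.

    import math

    def kl(p, q):
        # Kullback-Leibler distance D(p || q) in nats
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

    # One fixed set of observed category frequencies (made-up numbers)
    observed = [0.50, 0.30, 0.20]

    # Predictions of two competing models (also made up)
    model_a = [0.45, 0.35, 0.20]
    model_b = [0.70, 0.20, 0.10]

    print(kl(observed, model_a))   # ~0.006 -- closer to the data
    print(kl(observed, model_b))   # ~0.092 -- farther from the data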
>>>>> They always add up to 1.0 - like probabilities do.
>>>>
>>>> Like frequencies always do.
>>>
>>> Frequencies are usually measured in Hertz - and never add up to a
>>> dimensionless quantity such as 1.0.
>>>
>>> Indeed, adding the values of frequencies together is usually a bad move:
>>> since 1hz+2hz != 3hz.
>>
>> Under the definition provided above frequencies must always add to one if
>> you have included all possible types in your data. For example, if you
>> consider the frequency of each allele present in a data set, those
>> frequencies must add to one.
>
> How could they possibly - if the frequency is defined to be a count
> of the number of occurrences of an item in a set?
>
> Frequencies have no upper bound. They can become as large as you like.
Surely you recognize that it is conventional to talk about allele
frequencies, or any other type of frequency in biological populations,
being constrained to add to unity. There is no doubt in my mind that your
definition excludes the conventional usage in population biology in this
case.
> You appear to be talking about a proportion of some sort - not a frequency.
Yes. In population biology a frequency is a proportion. You may not like
it for perfectly good reasons, but that is the convention. It is in all the
textbooks and it is used that way in the literature.
> Your unorthodox definition of frequency appears to match your unusual
> definition of information. This sort of thing seems bound to cause
> communication problems :-|
We obviously have different orthodoxies. I granted you that my use of the
term "information" is different from the original use by Shannon, and argued
that this convention has changed over the decades. I also grant you that
this convention hasn't changed in all scientific circles, so there are
surely others using it as you do. I wrote that I did not have any data to
estimate the proportion (I almost wrote frequency) of the overall scientific
community that uses "information" in these two ways, so I would not argue
that your way is invalid. It is just a difference we need to be careful of
in our communications.
On the other hand, I am certain that my use of "frequency" is the one used
by virtually ALL population biologists and evolutionary biologists. It may
not be the "orthodox" definition, but it is certainly the conventional one.
>>> It doesn't appear to be what you are talking about - but it shares
>>> the element of observer-independence (though it tends to become
>>> language-dependent in the process).
>>
>> You are correct that this is not exactly what I am talking about, but I do
>> not see how it is observer-dependent. [...]
>
> I said it had "observer-*in*dependence" not "observer-dependence".
Sorry. I read your sentence too quickly. My bad!
Cheers,
Guy