Text 40, 174 lines
Written 2004-09-13 16:43:00 by Guy Hoelzer (1:278/230)
Subject: Re: Dawkins gives incorre
=================================
in article chvng2$2hqs$1@darwin.ediacara.org, Tim Tyler at tim@tt1lock.org
wrote on 9/11/04 1:34 PM:
> Guy Hoelzer <hoelzer@unr.edu> wrote or quoted:
>> in article chsg65$1hqg$1@darwin.ediacara.org, Tim Tyler at tim@tt1lock.org:
>>> Guy Hoelzer <hoelzer@unr.edu> wrote or quoted:
[snip]
>> Are you arguing that treating p_i as frequency is almost never done, or that
>> this practice has not increased in frequency? Or are you just arguing that
>> you don't think it has become sufficiently common to call it a transition?
>
> p_i is /always/ the probability of the i'th symbol arising.
>
> Sometimes the probabilities are determined completely by symbol frequencies
> - but the p_i's are never frequencies.
If they are "determined completely by symbol frequencies" then they are
frequencies. You might argue that the use of frequencies is merely a way of
estimating the underlying probabilities, but I would respond that this then
results in a method that is no longer sensitive to the perceptual
differences among observers, which is what I have been arguing all along.
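
To make the frequentist reading concrete, here is a minimal sketch in
Python (my own illustration, not anything from Shannon): take each p_i to
be the observed relative frequency of the i'th symbol and plug it into
H = -sum_i p_i log2(p_i). Every observer handed the same message computes
the same number.

  from collections import Counter
  from math import log2

  def empirical_entropy(message):
      # Take p_i to be the observed relative frequency of symbol i,
      # then compute H = -sum(p_i * log2(p_i)), in bits per symbol.
      counts = Counter(message)
      n = len(message)
      return -sum((c / n) * log2(c / n) for c in counts.values())

  # Any two observers given the same message get the same value:
  print(empirical_entropy("ABRACADABRA"))  # ~2.04 bits/symbol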
I must say I am quite surprised at your continuing insistence that this
model of information is unlike anything in the minds of scientists
publishing in this area. Are you unaware of the current debate over the
meanings of both entropy and information in the context of order/disorder
(dispersion)? How do you explain information-theoretic methods of
analysis, such as the Akaike Information Criterion (AIC, sketched below),
whose application has been growing rapidly? It is fundamental to these
methods that they yield precisely the same result in the hands of every
scientist, so that they are repeatable and verifiable. The role of the
perceiver, which was Shannon's initial concern, has been dropped from
information theory by many.
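
For reference, the criterion itself is simple: AIC = 2k - 2 ln(L-hat),
where k is the number of fitted parameters and L-hat is the maximized
likelihood of the model given the data. A minimal sketch, with purely
illustrative numbers of my own:

  def aic(k, max_log_likelihood):
      # Akaike Information Criterion: AIC = 2k - 2*ln(L-hat).
      # Lower is better: it trades goodness of fit against complexity.
      return 2 * k - 2 * max_log_likelihood

  # E.g., a 3-parameter model that attains ln(L-hat) = -42.0:
  print(aic(3, -42.0))  # 90.0

Two scientists fitting the same model to the same data obtain the same
ln(L-hat), and therefore the same AIC, with no role for the perceiver.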
> They always add up to 1.0 - like probabilities do.
Like frequencies always do.
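
A trivial check of that point, with my own example message:

  from collections import Counter

  msg = "ABRACADABRA"
  freqs = [c / len(msg) for c in Counter(msg).values()]
  print(sum(freqs))  # 1.0 (up to floating-point rounding)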
>>>>> An observer who knows what symbol is coming next (because he
>>>>> has seen a very similar message before) will assign different
>>>>> probabilities to the symbols from an observer who is seeing
>>>>> the message for the first time - and both will assign different
>>>>> probabilities from an observer who is not even familiar with
>>>>> the language the message is written in.
>>>>
>>>> This is a nice description of the (severe IMHO) limitations of the
>>>> "telegraph"-context-laden version of the theory that Shannon originally
>>>> devised for his telegraph-company employer. With all of your protestation
>>>> about my lack of fidelity to Shannon's original context, you haven't
>>>> suggested any reasons why treating p_i as a frequency, rather than a
>>>> probability, is problematic. Can you think of any such problems? [...]
>>>
>>> The ones above?
>>
>> I didn't see any problems suggested in your previous post or in the material
>> I snipped above, which was a description of how probabilities and
>> frequencies differ. Your argument then seemed to consist merely of saying
>> that Shannon originally meant p_i to be a probability, rather than a
>> frequency, to which I already agreed. None of that addresses my question.
>
> If you used frequencies, it would be equivalent to considering what new
> information would be learned by an observer with very little brain - whose
> only knowledge of the future consists of measuring symbol frequencies, and
> assuming what has happened before will happen again.
>
> Such observers are surely not common.
You're thinking of information like a Bayesian, but it is not necessary to
invoke any role of observer perception at all once you switch to the
frequency paradigm. Frequencies are real, not imagined. They can be
observed with strong objectivity and are verifiable. Indeed, from this
point of view information exists in the utter absence of any observers.
This is the stance that must be assumed by those, like Hawking and Akaike,
who apply information theory outside the context of inter-individual
signaling.
>>> p_i can only be treated as a frequency, *if* the source is something like
>>> a RNG - where the probability of symbols being emitted is constant - and
>>> does not depend on the history of the stream or environmental conditions.
>>
>> That may be the condition under which a probability and a frequency are
>> interchangeable, but it still does not address the issue at hand. Given the
>> differences between probabilities and frequencies, why isn't it better to
>> think of p_i as a frequency instead of a probability as Shannon first had in
>> mind?
>
> Currently Shannon's information represents the "surprise value" of
> a message - an estimate of how unlikely the observer thinks it is
> to be received.
>
> I.e.:
>
> ``Shannon's entropy measure came to be taken as a measure of the
> uncertainty about the realization of a random variable. It thus served
> as a proxy capturing the concept of information contained in a message
> as opposed to the portion of the message that is strictly determined
> (hence predictable) by inherent structures.''
>
> -
> http://www.all-science-fair-projects.com/science_fair_projects_encyclopedia/Information_entropy
I am not surprised that an encyclopedia would provide the historical view on
the origin of information theory, but I don't find it a convincing source on
the model currently used in the application of information theory by
scientists.
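
Still, for concreteness, here is what the quoted "surprise value" reading
measures: the self-information -log2(p) of a symbol, which varies with the
probability each observer assigns to it (the numbers below are my own,
purely illustrative):

  from math import log2

  def surprisal(p):
      # Self-information ("surprise value"), in bits, of an event
      # to which an observer assigns probability p.
      return -log2(p)

  # The same symbol surprises different observers by different amounts:
  print(surprisal(0.5))   # 1.0 bit if the observer expected it half the time
  print(surprisal(0.01))  # ~6.64 bits if the observer thought it unlikely

It is precisely this observer-dependence that the frequency reading
discards.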
> An estimate of how likely a particular, stupid observer thinks it
> is to be received seems likely to be lacking in utility by comparison.
As in the comparison of frequentist statistics with Bayesian statistics, one
advantage the ignorant observer has is a lack of bias. However, this comment
is irrelevant to my argument, because my argument does not involve observers,
even stupid ones.
>>> That is certainly not the general case - and it is not the case with many
>>> common sorts of message streams either.
>>>
>>>> If not, then don't you think it is worth considering the more extensive
>>>> version of the theory?
>>>
>>> It isn't "more extensive".
>>
>> It is absolutely more extensive in its potential application and in the
>> breadth of its explanatory power, because it overcomes the limitation of
>> having to approximate identical states among observers before they can
>> agree upon the information content of a data set (an observation).
>
> It is equivalent to calculating Shannon information for a rather dumb
> observer who is unable to perform simple logical reasoning :-(
>
> It doesn't completely overcome the supposed "problem" of information being
> subjective - since agents can still differ on the issue of the frequency
> of source symbols - depending on how much of the stream they have seen.
This is not a problem for either the theory or its application, which
demands the ability to share data and to verify the results of analyses.
The "bit stream", or data represented in higher dimensions (e.g., as a
2-D matrix), does not differ among scientists.
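
A sketch of both halves of that point, using a made-up stream of my own:

  from collections import Counter

  stream = "AABABBBAABAABBAB"  # made-up example stream

  def freq_estimate(seen):
      # Relative frequency of each symbol in the portion seen so far.
      return {s: c / len(seen) for s, c in Counter(seen).items()}

  # Agents that have seen different prefixes can disagree...
  print(freq_estimate(stream[:4]))  # {'A': 0.75, 'B': 0.25}
  print(freq_estimate(stream[:8]))  # {'A': 0.5, 'B': 0.5}

  # ...but two scientists handed the complete stream compute identical
  # frequencies, which is the repeatability my argument relies on.
  print(freq_estimate(stream))      # {'A': 0.5, 'B': 0.5}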
> ...and it would mean that the term "information" no longer represented
> the "suprise value" of a message to most observers - and that's pretty
> much the point of the term.
IMHO you have failed to recognize that many others have already let go of
this definition without much concern. After all, the word "information"
was in common usage long before the notion of "surprise value" was attached
to it, and that tag has not prevented other views from arising since.
> I don't see the resulting metric as being of much interest.
>
> The term "information" already has a good meaning - and what you are
> describing isn't it.
I guess we will have to agree to disagree on this issue. I will try to
avoid confusion with my usage in the future by emphasizing degree of
structure (the continuum of order-disorder), although I will continue to
apply the structuralist definition of "information" when responding to the
posts of others using that term, and I will try to be explicit about it.
Regards,
Guy