Ugly duckling theorem

The Ugly duckling theorem is an argument showing that classification is not really possible without some sort of bias. More particularly, it assumes finitely many properties combinable by logical connectives, and finitely many objects; it asserts that any two different objects share the same number of properties. The theorem is named after Hans Christian Andersen's 1843 story "The Ugly Duckling", because it shows that a duckling is just as similar to a swan as two duckling are to each other. It was proposed by Satosi Watanabe in 1969.

Mathematical formula

Suppose there are n things in the universe, and one wants to put them into classes or categories. One has no preconceived ideas or biases about what sorts of categories are "natural" or "normal" and what are not. So one has to consider all the possible classes that could be, all the possible ways of making sets out of the n objects. There are such ways, the size of the power set of n objects. One can use that to measure the similarity between two objects: and one would see how many sets they have in common. However one can not. Any two objects have exactly the same number of classes in common if we can form any possible class, namely . To see this is so, one may imagine each class is a represented by an n-bit string, with a zero for each element not in the class and a one for each element in the class. As one finds, there are such strings.
As all possible choices of zeros and ones are there, any two bit-positions will agree exactly half the time. One may pick two elements and reorder the bits so they are the first two, and imagine the numbers sorted lexicographically. The first numbers will have bit #1 set to zero, and the second will have it set to one. Within each of those blocks, the top will have bit #2 set to zero and the other will have it as one, so they agree on two blocks of or on half of all the cases. No matter which two elements one picks. So if we have no preconceived bias about which categories are better, everything is then equally similar. The number of predicates simultaneously satisfied by two non-identical elements is constant over all such pairs and is the same as the number of those satisfied by one. Thus, some kind of inductive bias is needed to make judgements; i.e. to prefer certain categories over others.

Boolean functions

Let be a set of vectors of booleans each. The ugly duckling is the vector which is least like the others. Given the booleans, this can be computed using Hamming distance.
However, the choice of boolean features to consider could have been somewhat arbitrary. Perhaps there were features derivable from the original features that were important for identifying the ugly duckling. The set of booleans in the vector can be extended with new features computed as boolean functions of the original features. The only canonical way to do this is to extend it with all possible Boolean functions. The resulting completed vectors have features. The Ugly duckling theorem states that there is no ugly duckling because any two completed vectors will either be equal or differ in exactly half of the features.
Proof. Let x and y be two vectors. If they are the same, then their completed vectors must also be the same because any Boolean function of x will agree with the same Boolean function of y. If x and y are different, then there exists a coordinate where the -th coordinate of differs from the -th coordinate of. Now the completed features contain every Boolean function on Boolean variables, with each one exactly once. Viewing these Boolean functions as polynomials in variables over GF, segregate the functions into pairs where contains the -th coordinate as a linear term and is without that linear term. Now, for every such pair, and will agree on exactly one of the two functions.
If they agree on one, they must disagree on the other and vice versa.

Discussion

would be to introduce a constraint on how similarity is measured by limiting the properties involved in classification, say between A and B. However Medin et al. point out that this does not actually resolve the arbitrariness or bias problem since in what respects A is similar to B: “varies with the stimulus context and task, so that there is no unique answer, to the question of how similar is one object to another”. For example, "a barberpole and a zebra would be more similar than a horse and a zebra if the feature striped had sufficient weight. Of course, if these feature weights were fixed, then these similarity relations would be constrained". Yet the property "striped" as a weight 'fix' or constraint is arbitrary itself, meaning: "unless one can specify such criteria, then the claim that categorization is based on attribute matching is almost entirely vacuous".
Stamos has attempted to solve the Ugly Ducking Theorem by showing some judgments of overall similarity are non-arbitrary in the sense they are useful:
Unless some properties are considered more salient, or ‘weighted’ more important than others, everything will appear equally similar, hence Watanabe wrote: “any objects, in so far as they are distinguishable, are equally similar".
In a weaker setting that assumes infinitely many properties, Murphy and Medin give an example of two putative classified things, plums and lawnmowers:

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...