**How to be Typical: A Philosophical Study**

**How scientists select the most representative object**

[This is an abridged version of my article of the same title in Chinese.]

Type, in the sense of a typical member representative of a class^{note 1}, is important in scientific research. Generally, if a scientist picks the type as the subject of study, the research is likely to be at its most efficient, if not necessarily successful. This paper attempts to show, using a quantitative approach, how a type is determined.

A type is the most representative member of a class, and scientists make every effort to select it for study. To find the laws of planetary motion, Kepler chose, from among the known planets, the relationship between Mars and the sun as the type, for two reasons.

(1) A natural reason, which is of interest to the philosophy of science: Mars is the planet easiest to observe, for instance.

(2) A social or historical reason, which is of interest to the sociology of science and pertains to a scientific activity: Tycho had already accumulated massive data about the movement of Mars.

Unfortunately, because of reason (2), scientists sometimes have to give up on the objectively typical or representative member of the class, the type. The following discussion focuses on the type determined for a natural as opposed to social reason only.

**1.** Let's assume that X_{i}, a member of some class X, has properties (a_{1}, a_{2}, ..., b_{1}, b_{2}, ...), where all a's are essential properties^{note 2} and all b's are non-essential (accidental or occasional) properties. Note two facts:

(1) the degree of conspicuousness of the properties in X_{i}, i.e. how noticeable they are, differs from one property to another;

(2) the number of b's differs from one X_{i} to another.

Thus, when social factors are equal or negligible, X_{T} is the type of the class X if and only if

(i) a's in X_{T} are the most conspicuous;

(ii) X_{T} has the fewest b's;

and (iii) b's in X_{T} are the least conspicuous.

Conspicuousness takes into account how relevant each of {a_{i}} (i.e. (a_{1}, a_{2}, ...)) or {b_{i}} is to the experiment in question: the more important properties are assigned greater weights, so that we pay more attention to them because they're more noticeable. It's also necessary to note that (ii) may influence (i) and (iii), although (ii) is neither a sufficient nor a necessary condition for (i) and (iii). If X_{T} has sufficiently many b's, then its a's are not the most conspicuous, and its b's may not be the least conspicuous. Therefore, to make (i), (ii) and (iii) independent of each other, conspicuousness is required to be an independent characteristic of a given property. For instance, the statement that plant p_{1} has more conspicuous phototropism than plant p_{2} says only that, under the same conditions, p_{1} bends toward light at a greater angle than p_{2} does; it does not say, for example, that p_{2} is so luxuriant that its bending angle is harder to observe or measure.

To better understand (i), (ii) and (iii), let's look at the case when {a_{i}} is an empty set, i.e., with no common properties.

**2.** In some cases, {a_{i}} is an empty set. This is the so-called family resemblance.^{note 3} For instance, X_{1}(a,b), X_{2}(b,c), X_{3}(c,a) form a class that is not defined by any property common to its members, and yet people give it a class name (a universal or collective concept). So philosophers believe that such a class must be defined by family resemblance plus convention. Let's study how to define a type in this case.

Suppose there are 3 members in class X: X_{1}(a,b), X_{2}(b,c) and X_{3}(c,a,d), where the letters in parentheses denote their properties. Let NP denote the number of occurrences of a property in the whole class, i.e. the number of members that have it. The NP's of a, b, c, d are then 2, 2, 2, 1, respectively. Following the order of the properties in X_{1}, X_{2}, X_{3}, let's write the members as X_{1}[2,2], X_{2}[2,2], X_{3}[2,2,1]. By intuition, when other conditions are equal (all properties are equally conspicuous, all social factors are the same, etc.), the best candidate for the type of X is X_{1} or X_{2}, because X_{3} has an "eccentric" property shared by no other member, which makes X_{3} more atypical.

Suppose instead we have X_{1}(a,b,e), X_{2}(b) and X_{3}(a,c,d), so that X_{1}[2,2,1], X_{2}[2], X_{3}[2,1,1]. By intuition, a scientist would most likely pick X_{1} as the type, because it possesses, to the greatest extent, the properties that are most widely shared among class members: it has two shared properties, a and b, whereas X_{2} and X_{3} have only one each, b and a respectively. Furthermore, X_{3} is even less likely than X_{2} to be selected as the type, because X_{3} has two "eccentric" properties, c and d.
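The bracket notation in the two examples above can be reproduced mechanically. The following is only a minimal sketch: the function name `np_signatures` and the encoding of each member as a string of property letters are illustrative assumptions, not part of the original argument.

```python
from collections import Counter


def np_signatures(members):
    """Map each member to the NP values of its properties, in listed order.

    NP of a property = number of class members that possess it.
    `members` maps a member name to a string of property letters
    (a hypothetical encoding chosen for this sketch).
    """
    np_count = Counter(p for props in members.values() for p in set(props))
    return {name: [np_count[p] for p in props] for name, props in members.items()}


# The two example classes from the text.
first = {"X1": "ab", "X2": "bc", "X3": "cad"}
second = {"X1": "abe", "X2": "b", "X3": "acd"}

print(np_signatures(first))   # X1[2,2], X2[2,2], X3[2,2,1]
print(np_signatures(second))  # X1[2,2,1], X2[2], X3[2,1,1]
```

Here the code only counts; deciding which signature is "most typical" is left to the intuition described in the text.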

Summarizing these tests, we arrive at a preliminary yet fundamental principle, one that should not be changed casually.

(iv) Other conditions being equal, the type X_{T} in class X must satisfy: the parentheses of X_{T} contain as many as possible (or all) of the properties with large NP_{T} and as few as possible (or none) of the properties with small NP_{T}, and the larger NP_{T}, the better. (Note: NP_{i} and NP_{T} denote the NP of any given property of the i-th class member and of the type, respectively.)

However, how big must NP be to count as big? First we notice that in the case of family resemblance, it's always true that NP < NM, where NM is the number of class members. In addition, the total number of occurrences of properties with a given value of NP is n * NP (i.e. n times NP), n being the number of distinct properties with that value of NP.
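Both observations can be checked on the earlier example X_{1}(a,b,e), X_{2}(b), X_{3}(a,c,d). This is a small sketch under the same assumed string encoding of properties; the variable names are illustrative only.

```python
from collections import Counter

members = {"X1": "abe", "X2": "b", "X3": "acd"}
nm = len(members)  # NM: number of class members

# NP of each property: how many members possess it.
np_count = Counter(p for props in members.values() for p in set(props))

# n: number of distinct properties sharing each NP value.
n_by_np = Counter(np_count.values())

# Occurrences of properties with a given NP value: n * NP.
occurrences_by_np = {np_value: n * np_value for np_value, n in n_by_np.items()}

# Family resemblance: no property is shared by every member, so NP < NM.
is_family_resemblance = all(np_value < nm for np_value in np_count.values())

print(dict(n_by_np), occurrences_by_np, is_family_resemblance)
```

The occurrence counts add up to the total number of letters written in all parentheses, which is a quick sanity check on any hand-made class.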

Now let's construct a class C with these members: X_{1}[4,3,2], X_{2}[4,3,2,1], X_{3}[4], X_{4}[4,2,2,1], X_{5}[3,1]. By intuition and rule (iv), should we select X_{3} or X_{1} as X_{T}?^{note 4} Here X_{1} and X_{2} obviously differ from the X_{1}[2,2] and X_{2}[2,2] of the earlier example, where the numbers in brackets were the same. If it's required that only one type be determined, scientists may have different opinions. So we need a rule that roughly describes actual scientific activity. Hence rule (v) or (vi).

(v) Other conditions being equal, the type X_{T} of class X must satisfy: r_{i} is the greatest, where r_{i} = Σ(big NP_{i}) - Σ(small NP_{i}), a (big NP) is an NP greater than or equal to (mid NP), a (small NP) is an NP smaller than (mid NP), and (mid NP) = (NP_{max} + NP_{min}) / 2.

or

(vi) Other conditions being equal, the type X_{T} of class X must satisfy: r_{i} is the greatest, where r_{i} = Σ(big NP_{i})^{2} - Σ(small NP_{i})^{2}, a (big NP) is an NP greater than or equal to (mid NP), a (small NP) is an NP smaller than (mid NP), and (mid NP) = (Σ NP) / m, where the sum runs over every occurrence of every property in X and m is the total number of such occurrences, i.e. (mid NP) is the average of all NP values.

In principle, we can conjure up an infinite number of rules for calculating r_{i}. Which one is the most likely to be accepted by scientists can only be ascertained by experiment. According to (v), (mid NP) of class C is 2.5, and X_{T} is X_{1}: r_{1} = 5, r_{3} = 4. According to (vi), (mid NP) is 18/7 ≈ 2.6, and X_{T} is again X_{1}: r_{1} = 21 while r_{3} is only 16. Does this tell us that scientists tend to select X_{1} rather than X_{3} as the type? Certainly not. That we made up rules (v) and (vi) out of intuition doesn't mean we can't make up a rule (vii) under which r_{3} > r_{1}. For instance, if we change the r_{i} in (v) to r_{i} = Σ(big NP_{i}) - 2Σ(small NP_{i}), then r_{1} = 3 and r_{3} = 4. If the scientist happens to have a personality that abhors certain interfering factors, the coefficient in front of Σ(small NP_{i}) may be even bigger than 2!
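The three scoring rules can be compared side by side on class C. The sketch below is illustrative only: the helper names `all_entries` and `scores` are hypothetical, and for rule (vi) it assumes that (mid NP) is averaged over every bracketed entry, the reading that yields the 18/7 quoted above.

```python
# Class C from the text: each member is listed by the NP values of its properties.
CLASS_C = {
    "X1": [4, 3, 2],
    "X2": [4, 3, 2, 1],
    "X3": [4],
    "X4": [4, 2, 2, 1],
    "X5": [3, 1],
}


def all_entries(cls):
    """Flatten every bracketed NP value (one per property occurrence)."""
    return [v for nps in cls.values() for v in nps]


def scores(cls, mid, power=1, penalty=1):
    """r_i = sum(big NP ** power) - penalty * sum(small NP ** power)."""
    result = {}
    for name, nps in cls.items():
        big = sum(v ** power for v in nps if v >= mid)
        small = sum(v ** power for v in nps if v < mid)
        result[name] = big - penalty * small
    return result


entries = all_entries(CLASS_C)
mid_v = (max(entries) + min(entries)) / 2  # rule (v): midpoint of the NP range
mid_vi = sum(entries) / len(entries)       # rule (vi): mean over occurrences (assumed reading)

r_v = scores(CLASS_C, mid_v)               # rule (v)
r_vi = scores(CLASS_C, mid_vi, power=2)    # rule (vi)
r_vii = scores(CLASS_C, mid_v, penalty=2)  # rule (vii): small NPs weighted by 2

for label, r in [("(v)", r_v), ("(vi)", r_vi), ("(vii)", r_vii)]:
    print(label, r, "type:", max(r, key=r.get))
```

Under these assumptions, rules (v) and (vi) rank X_{1} highest and rule (vii) ranks X_{3} highest, so the choice of type depends on the coefficient, as the text argues.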

**3.** Now let's combine the two cases where there are and there are not common properties. All the properties in the last section are equivalent to the b in section **1**. But the question is whether it is the best strategy to follow (i), (ii) and (iii) when property a exists.

The answer is negative. Occasional heterogeneity is sometimes very important. A lemon that has turned green is no longer a normal member of the lemon class, whereas a lemon 150 cm^{3} in volume still is. Yet neither the green color nor a volume of 150 cm^{3} is a common property of lemons. So when property a exists, a rough determination of the type may follow rules (i), (ii) and (iii). But a better and more demanding method of determination is the combination of (i) and (viii), where

(viii) Occasional properties are treated with the rule reached by quantifying rule (iv) and ignoring common properties.

The result of quantifying (iv) is (v), (vi) or (vii). Thus (ii) is completely modified. Doing this may seem to set property conspicuousness aside, i.e. to leave (i) and (iii) out of consideration, but this is not really the case. Quantification of (iv) at least considers the conspicuousness of occasional properties; the coefficient 2 in the r_{i} formula of (vii) implies this. Needless to say, conspicuousness deserves more research. (viii) is semi-quantitative, but it does not contain (i).

**4.** The mathematical treatment outlined above is idealized. In reality, even setting social factors aside, selection of the type is heavily influenced by the conspicuousness of properties, their relevance to the current work, and other factors that are difficult to quantify. Furthermore, the above discussion rests on a rarely satisfied hypothesis of closed classes: that the number of class members is finite. (This is satisfied in the case of Kepler's planetary research, but science primarily targets open classes.) Even with a closed class, it may be practically impossible to calculate NP, r_{i} etc. directly because of the large number of class members. Another obvious problem is that an object in the real world has an infinite number of properties, only some of which can be selected in research. Can the selection rules be formalized? In addition, calculation of r_{i} is complicated by the fact that the number of properties cannot be accurately determined, and by the hierarchy of properties (e.g. being an activity is a higher-level property than being a game)^{note 5}. These issues make it an urgent task to improve the application of rule (iv).

We may adopt an expedient solution to these problems in order to narrow the gap between formal rules and actual scientific activity. The problem of open classes and the problem of an infinite number of properties are both treated as a problem about a new class C* (an infinite number of properties also forms a C*). Thus,

(1) In C*, select part of the properties more relevant to the current work for further selection;

(2) Group the infinite number of properties of C* into a finite number of parts, each part being a subclass with an infinite number of properties. Take these subclasses as the members of C* and treat them with rules (i) and (viii).

Those two methods should be used together. It may be advantageous to use (2) first. Whichever is used first, however, intuition comes into play.

Yong Huang

1991, 2004

_________________

Notes:

[1] Take definition 4b of the word "type" in the Merriam-Webster dictionary, "a typical and often superior specimen", although "superior" is not necessarily needed. Another appropriate word is "prototype", in the sense of "someone or something that has the typical qualities of a particular group, kind, etc." in the same dictionary. Note that this article is unrelated to the common debate in philosophy about types, as in Bertrand Russell's theory of types. Instead, think of this study as one on the meaning of the word "typical".

[2] "Essential properties" are to be understood as the common properties or features of a class X. Common properties can indeed be used to define a class; this is an important characteristic of essential properties. Whether essential and accidental (or occasional) properties really exist is not discussed in this paper, which adopts the fairly widely accepted view. See the entry "Relations, Internal and External" by R. Rorty in

[3] Take games as an example. Are all games entertaining? No, some are purely competitive. Do all games result in wins and losses? No, the game in which a child throws a ball at the wall and catches it when it bounces back doesn't involve winning or losing. No matter what "common" feature you find for games, I can find a counter-example: something that has to be called a game but does not have this "common" feature. Therefore we see overlapping similarities and cannot find one single common feature (see Wittgenstein's *Philosophical Investigations*, Sect. 66). Obviously, if these common features or properties are to be reasonable, then (1) they cannot be ubiquitous ones, such as existence or cognizability (properties of fundamental philosophical significance), nor higher-level properties for a specific class, such as "being an activity" for games; (2) they cannot be a tautological property, such as S "being S"; (3) they cannot be an intersection of all class members' properties. The reason is simple: we phenomenalize essential properties into common properties, but we still require that they be able to define a class.

But whether Wittgenstein extrapolated family resemblance to all entities is controversial. This paper takes the opinion of most philosophers: some entities follow the family resemblance rule, while some others have common essential properties.

[4] Why do we consider only X_{3} and X_{1}? If we disregard X_{3} for the moment, then in terms of the likelihood of being X_{T}, X_{1} > X_{2} (because 1 is definitely a small NP), X_{2} > X_{4} (because rule (iv) requires that the larger NP_{T}, the better), and obviously X_{1} > X_{5}. But before we decide how big NP must be to count as big, and settle on a criterion for comparing the number of NP's, we cannot compare X_{3} with X_{1}, X_{2}, or X_{4}.

[5] All this is closely related to our inability to tell with absolute certainty what objects or facts in the world are simple and basic (see Wittgenstein's *Philosophical Investigations*, Sect. 46, where he criticized his early logical atomism).