
www.btinternet.com/~adrian.larner/database/pcl07

PLATOCLAST
ON DATA

Lecture VII
The Same Again

 

 

Let me start by congratulating you for having the persistence to be here again today. Those of you who are here, that is. The ground we’ve tried to cover – data structure, manipulation, and integrity, and a nod towards interpretation – takes about an hour in your average relational data base course. We’ve managed structure in two lectures, manipulation in two, and a mere approach to integrity in two, with sundry animadversions and digressions on interpretation; not to say agonising. Ah well, no pain no gain, they tell me.

At least we’ve seen it’s interpretation that’s a real pig. And every time we’ve tackled it, whether the Classical interpretation of the FOPC, or the Entity interpretation that we looked at last lecture, we’ve found problems in getting to grips with the notion of identity. What’s more, the only safe sort of interpretation we’ve found is self-interpretation, either the Classical, where a record is a proposition made true by the record itself, or the Entity interpretation, where the record represents itself. And we guess, I mean, I’m sure, that self-interpretation is safe because we have a pretty simple criterion of identity of records: they are the only things that don’t give us identity problems.

In this lecture, I’m going to try to tidy up this identity question, and get a really firm base for further advance. I want to discuss whether the notion of absolute identity is supportable, and what sort of things the additional types that go with it might be. But first of all, I’m going to revise and complete what I’ve said about relative identities.

 

 

Economy of Means

 

One thing I want to stress is this: that I try never to make any move that my opponents in argument can reasonably object to. For instance, I object to other people’s use of sets, as you know. But could they object to my use of “sum-objects”: what we get in the TT-DD by using “+” to add records (or, more generally, any entities – sense 1 or sense 2, real or theoretical)? I think not, because I can give an interpretation of sum-objects in terms of sets. Non-mathematicians: believe me and switch off for a few seconds.

I introduce the notion of “congruence”. Let “is within” be the ancestral of “is a member of”, i.e. x is within y if x is a member, or a member of a member, or a member of a member of a member, or ..., of y. I now say that two sets are congruent if and only if all and only the non-sets that are within one are also within the other. And now sum-objects amount to the equivalence classes over congruence of sets that have a non-set within them. One sum-object, so defined, overlaps another if some member of the one and some member of the other are such that something is within each of them.[1] Non-mathematicians switch on again.
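For readers who prefer code to prose, the congruence test can be sketched in Python. This is only an illustration under stated assumptions: sets are modelled as `frozenset`s, anything that is not a `frozenset` is treated as a non-set, and the names `atoms_within` and `congruent` are inventions of the sketch, not part of the lecture’s apparatus.

```python
def atoms_within(s):
    """Collect the non-sets within s: members, members of members, and so on."""
    found = set()
    for member in s:
        if isinstance(member, frozenset):
            found |= atoms_within(member)   # descend into a nested set
        else:
            found.add(member)               # a non-set: record it
    return found


def congruent(s, t):
    """Two sets are congruent iff exactly the same non-sets are within each."""
    return atoms_within(s) == atoms_within(t)


# Hypothetical records, named by strings purely for illustration.
s = frozenset({"r1", frozenset({"r2"})})
t = frozenset({frozenset({"r1", "r2"})})
u = frozenset({"r1"})

print(congruent(s, t))  # r1 and r2 are within both s and t
print(congruent(s, u))  # r2 is not within u
```

On this reading a sum-object corresponds to a whole class of mutually congruent sets; here `s` and `t` fall in the same class and `u` in another.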

So the believers in sets can’t object to anything I do. And likewise, the believers in absolute identity (and I suspect that they are the same bunch) can’t object either. For they must allow relative identity predicates: they’re just ordinary symmetric and transitive predicates. Consequently, I can reassure you: you are safe with me. At least I won’t lead you astray, though I may not lead you in places where others tread.

 

 

Relative Identity

 

Second quick digression: I’m not going to talk about “identification” in the sense of picking things out – only in the sense of being the same such-and-such.

To relative identity predicates then, or – as you will have appreciated – criteria of identity: they come to the same thing. You’ll remember a sort of trick we played in getting the criterion of application of records: we used the criterion of identity, “is the same record as”. We said: “x is a record” means “there is something that is the same record as x”, or – equivalently – “x is the same record as x”, “x is the same record as itself”.

 

 

Naming

 

And I want to say that, given a full enough criterion of identity, we can always define a criterion of application in this way. But we can’t go the other way: we can’t define, say, “x is the same person as y” on “x is a person”, even if we allow ourselves absolute identity. Many years ago a small baby was baptised “Geoffrey”: yes, it was none other than myself. That is, it was the same person as me. But that does not amount to it being a person and being (absolutely) the same as me. Of course it wasn’t absolutely the same as me: it was smaller, for one thing. Probably prettier. Less considerate, in respect of its bodily functions anyway. Probably not made of any of the stuff I’m made of.

Actually, you may well ask: then what entitles you to its name? And the answer is – you will see it must be – that it received its name under a criterion of identity. When the name “Geoffrey” was given to it, that name was, by that very naming, given to each thing that is (was, is, and will be) the same person as it. We could, if we wished, give a baby a name under the criterion of identity “is made of the same stuff as”: such a name given that many years ago would now apply (not very usefully) to a widely scattered object. But fortunately that criterion was not used at my baptism; “is the same person as” was.

We all appreciate that the monadic predicate “is a father” is – let us say – relationally based. It has to be defined, it is understandable only when defined, on the relation (the dyadic predicate), “is father of”: to be a father is to be the father of something, someone. And, surprising as it may seem, ordinary count nouns – person, record, chair – that is, the predicates “is a person”, “is a record”, “is a chair”, are relationally based. They are based on the dyadic predicates: “is the same person as”, “is the same record as”, “is the same chair as”.

 

 

Counting

 

We have seen how this is a prerequisite to naming such things: a name is given under a criterion of identity. It is also a prerequisite to counting such things. Suppose I give you a pack of cards and say: count that! You can’t. What will you say: 52 cards? or 4 suits? or 2 colours? or 13 denominations? Can you count a data base? So many fields? or field-types? or field values? or records? or stored records? or record types?

I will tell you how to count. It’s surprising what you learn on this course, isn’t it? Look at this chart:

aAbcdeEEfghij
JkLmnOpqrsStT
TTuvVwxXXXyzZ
0123456788+,.
I use the term “alphabetic letter” in this sense; there are precisely 26 alphabetic letters. They are “A”, “B”, “C”, and so on to “Z”. Uppercase “A” is the same alphabetic letter as lowercase letter “a”. I hope you latch on to what I mean by “is the same alphabetic letter as”: big “P” is the same alphabetic letter as little “p”, but not the same alphabetic letter as big “Q”.

I’ll tell you the criterion of application as well: the only things that are alphabetic letters are single character inscriptions on paper, or projection foils, or carved in granite, or whatever. So “x is the same alphabetic letter as y” means:

x is a single character inscription and y is a single character inscription and ((x is an A and y is an A) or (x is a B and y is a B) or ... (x is a Z and y is a Z))
And now we can set about counting the alphabetic letters on the chart. To begin with, nothing on the chart is tagged. We say “zero”. Then, repeatedly, we pick any untagged alphabetic letter (remember: a single character inscription is an alphabetic letter if it is the same alphabetic letter as itself); we tag it with the successor of the last uttered number (and we utter that number, thus “one” for the first, “two” for the next, and so on), and ipso facto we tag each thing on the chart that is the same alphabetic letter as it. When no further untagged alphabetic letter is to be found on the chart, the last uttered number is the required count.

I think you’ll find it’s 26: they are all there. So that is how to use a relative identity to count: and I defy anyone to count in any significantly different way. Of course, we could use a different criterion of identity: we could use “is the same cased letter as”, distinguishing uppercase A from lowercase a, etc., with a maximum possible count of 52. Or we could count under “is the same single character inscription as”, or “is the same alphanumeric as” (maximum count 36).
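The tagging procedure just described can be tried out in a short Python sketch. Everything beyond the chart itself is an assumption made for illustration: each inscription is modelled as a `(row, column, glyph)` triple, tags are kept in a side table rather than written on the inscriptions, and the function names (`count_under`, `same_alphabetic_letter`, and so on) are hypothetical.

```python
CHART = [
    "aAbcdeEEfghij",
    "JkLmnOpqrsStT",
    "TTuvVwxXXXyzZ",
    "0123456788+,.",
]

# Assumption of this sketch: an inscription is a (row, column, glyph) triple.
inscriptions = [
    (r, c, glyph)
    for r, line in enumerate(CHART)
    for c, glyph in enumerate(line)
]


def is_az(glyph):
    """True for the 52 cased forms of the letters A to Z."""
    return glyph.isascii() and glyph.isalpha()


def same_alphabetic_letter(x, y):
    return is_az(x[2]) and is_az(y[2]) and x[2].lower() == y[2].lower()


def same_cased_letter(x, y):
    return is_az(x[2]) and is_az(y[2]) and x[2] == y[2]


def same_alphanumeric(x, y):
    gx, gy = x[2], y[2]
    both = gx.isascii() and gx.isalnum() and gy.isascii() and gy.isalnum()
    return both and gx.lower() == gy.lower()


def same_inscription(x, y):
    return x[:2] == y[:2]   # same place on the chart


def count_under(things, same):
    """Count `things` under the relative identity predicate `same`."""
    tags = {}            # position in the list -> number it was tagged with
    last_uttered = 0     # we begin by saying "zero"
    for i, x in enumerate(things):
        # Skip what is already tagged, and what the criterion does not
        # apply to at all (x is not the same such-and-such as itself).
        if i in tags or not same(x, x):
            continue
        last_uttered += 1
        # Ipso facto, tag everything that is the same such-and-such as x.
        for j, y in enumerate(things):
            if same(x, y):
                tags[j] = last_uttered
    return last_uttered


print(count_under(inscriptions, same_alphabetic_letter))  # 26
print(count_under(inscriptions, same_cased_letter))       # 34
print(count_under(inscriptions, same_alphanumeric))       # 35
print(count_under(inscriptions, same_inscription))        # 52
```

The same chart and the same procedure give four different counts; only the predicate passed as `same` changes.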

 

 

Trying to Count Under Absolute Identity

 

But we couldn’t use absolute identity. Suppose that you tried. “Zero”, you say. That’s safe enough. You point to the lowercase “a” and tag it “one”. You try to point somewhere else. “What about that lowercase ‘a’?” I ask. “Just counted that”, says you. “Untagged”, I reply.

You see: you counted it as “one”, and so tagged it, when it was untagged. Now I say: it’s not absolutely the same, is it? It’s tagged. Anyway, as you pointed at it, I noticed that a minuscule portion of ink fell off it. So you have to count the lowercase “a” again.

I think you can see that you’re on a hiding to nothing. Whereas, using the relative identity criterion, we’re safe. We say: I don’t care if it’s tagged or untagged, pristine or slightly decayed, it’s one and the same alphabetic letter, and it gets counted once and once only under that criterion.[2]

Now you think I’m having you on; you think I’m raising spurious problems: that mere tagging with a number doesn’t count (so to speak). But it does. That’s why some things can’t be counted, even with a good criterion of identity: all the non-negative integers for instance (Johnny von Neumann spotted that). You say “Zero”; you point at zero and you tag it “one”. Now you have to count the tag: it’s a number. And when you count it, you tag it “two”. Well, you see the problem. But it gives a pretty definition of “successor”, doesn’t it? The successor of a number is the tag you give it when you try to count just zero under the criterion of identity “is the same non-negative integer as”.

 

 

Concrete (Non-additional) Types

 

And now it is, I hope, crystal clear how I can talk about non-additional types. Remember that an alphabetic letter is a single character inscription: that is how I defined it. And yet there are more single character inscriptions that are alphabetic letters on the chart than there are alphabetic letters on the chart. But there are no alphabetic letters over and above – in addition to – the inscriptions.

So I will say – now you’re all experts in schematic predicates – even if each F is a G, it doesn’t follow that there are at least as many Gs as Fs. Not if you count Gs under the criterion of identity “is the same G as”, and Fs under “is the same F as”. But do remember that you can’t define “is the same G as” on “is a G”: you have to do it the other way round. To be a G is to be the same G as something.

 

 

Systemic Identity Neither Absolute Nor Persistent

 

I don’t want to pour any more scorn on absolute identity: if its defenders want to defend it, let them try. I have spoken about systemic identity, the sort of identity we get when we have a theory. And systemic identity has the character that is called “Leibnizian”: if “=” is our systemic identity, and we have a=b, where “a” and “b” are proper names – that is, and at last I can define “proper name”, names given under the systemic identity – then any proposition containing “a” has the same truth value as (is true or false together with) any proposition formed from it by substituting “b” for “a” at any occurrence. But remember, remember: systemic identity is theory-relative. Add just one more primitive predicate to a theory and you can get a new theory, possibly with a different systemic identity. The old systemic identity becomes merely a relative identity, no longer Leibnizian.

But this just shows that systemic identity is not absolute identity. For, if we have a theory, and extend its vocabulary, and perhaps add postulates, all the old theorems – the truths of the former theory – become theorems of the new theory. So, suppose that a is systemically identical to b in the old theory, T. Let’s say: a is the same T-entity as b; and that’s a theorem of T. Then, when we add vocabulary and postulates to T, getting U, a may not be systemically identical to b in U, but the proposition, “a is the same T-entity as b”, is a theorem of U, because it was a theorem of T.

But obviously “being the same T-entity” can’t mean “being absolutely the same as”, for a is not, we suppose, the same U-entity as b. So a can’t be absolutely the same thing as b.
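A toy version of this argument can be written down in Python. It is a deliberately crude sketch: a “theory” is reduced to a list of monadic predicates, systemic identity to agreement on all of them, and the entities, the predicates and every name below are hypothetical.

```python
def systemically_identical(a, b, predicates):
    """True when no primitive predicate of the theory tells a from b."""
    return all(p(a) == p(b) for p in predicates)


# Hypothetical entities and predicates, invented for this illustration.
a = {"partno": "P1", "supplier": "S1"}
b = {"partno": "P1", "supplier": "S2"}


def is_part_p1(x):
    return x["partno"] == "P1"


def comes_from_s1(x):
    return x["supplier"] == "S1"


theory_T = [is_part_p1]
theory_U = theory_T + [comes_from_s1]   # T extended by one new predicate


def same_T_entity(x, y):
    """The old systemic identity, kept on in U as a relative identity."""
    return systemically_identical(x, y, theory_T)


print(systemically_identical(a, b, theory_T))  # indistinguishable in T
print(systemically_identical(a, b, theory_U))  # distinguishable in U
print(same_T_entity(a, b))                     # still holds once U is adopted
```

One added predicate is enough to separate `a` from `b`, while “is the same T-entity as” goes on holding between them.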

I hope you begin to see the problem even in retreating from absolute identity to systemic identity in our data bases. It requires only one new predicate to be added, perhaps even only one new proposition, to create a data base in which things that were formerly systemically identical become distinguishable. So when Mr Date says that “entities in the real world are distinguishable ... they have a unique identification of some kind”, and that the primary key of a record performs this unique identification function in a data base, you know that he’s on to a losing ticket.[3] Are we really not safe to add a new relation, perhaps even a single new record, without checking whether our previous systemic identity remains systemic in the updated data base?

 

 

Relative Identity Persistent

 

By contrast, remember the REGISTRATION relation, where we made no pretence that the primary key, PERSONNO, uniquely distinguished an entity; merely that it uniquely distinguished a person. There we could add a column and change the primary key; yet PERSONNO continued to distinguish persons. And surely this is the sort of thing we should expect, for we know pretty well what we mean – our users know what they mean – by “being the same person”. No-one on this earth, however, knows what it means to “be the same”, “be the same thing”, or “be the same entity”, and they would be hard put to say, given a reasonably complex data base, what it meant even to “be said by the data base to be the same”, i.e. to be systemically identical, or indistinguishable in the theory.

So it’s rather pleasant to discover that we don’t have to bother with that. Though probably some people will still be afraid that if they don’t stick with absolute identity, the world will fall apart. Something terrible will happen, just like if you don’t get down to the bottom of the stairs before the flushing stops. But we know what happens if you stick with absolute identity and get it wrong: you get the Join Trap. And that is something terrible: falsehood implied by truth.

 

 

Abstract (Additional) Types (Sets)

 

I can see all the mathematicians getting twitchy: they’ve remembered what we might call “Mr Date’s Way Out”: stick with the Classical interpretation, or the Entity interpretation, but treat things like PARTNO values as names of additional types. Then the parts delivered by different suppliers can be different parts (different part instances) but of the same (additional) type. And what is this type? what sort of entity is a part-type? “A set”, I hear you say: it’s the set of all parts of that type. And I want to say: even if you don’t say it’s a set, it is very like a set isn’t it? It will do until a real set comes along.

I put aside the implications for users like myself: Nominalists who deny the existence of sets and such abstract objects. One implication is, for those that want to be user-friendly: you can’t build a data base that reflects our view of the world, because we nominalists have a set-free view of the world.[4]

There is, of course, an alternative interpretation. We could take what is represented by a PARTNO to be a description of a part; a description that happens to hold true of many parts from different suppliers. Then the different, but commonly PARTNOed parts, supplied by different suppliers would simply be: parts falling under the same description. Where, we might ask, could such a description be found? And, of course, the answer is: the description is just the PART record, keyed on PARTNO. So, once again, a retreat to self-interpretation puts everything right. But we knew we had that retreat: what we were looking for was a way forward.

Well, why not sets? Remember we are here asking not “Why not use set theory as our foundational theory?” but “Why not interpret values, such as PARTNO values, as naming sets (of parts)?” And the simple answer is that the notion of a set is incomprehensible: users cannot grasp it, because people cannot grasp it. It is gibberish.

We have seen Russell’s “paradox”, which mathematicians have made various ad hoc adjustments to avoid.[5] But I want now to consider the problem of understanding an apparently simple notion that we will need if we are to treat objects as sets: cardinality.

 

 

Abstraction and Cardinality

 

We have two things we need to maintain, for at least the sorts of sets that we are going to use:

The axiom schema of abstraction, in effect, that to be an F is to be a member of the set of Fs.
 
The cardinality of the set of Fs is the number of Fs there are (the count of Fs).
If we can’t maintain both of these, for ordinary finite sets, where Russell’s paradox doesn’t arise, then we might as well abandon the whole game. Very well: think back to the number of alphabetic letters on the chart. Let us talk about single character inscriptions, i.e. our variables, “x”, “y”, etc. will range over such inscriptions. I have given a definition of what I mean by “is the same alphabetic letter as”. From this definition I formally define:
x is an alphabetic letter =df x is the same alphabetic letter as x.
And this amounts to: x is a single character inscription and is an A or is a B or ... or is a Z. I think you will now find that the cardinality of the set of alphabetic letters on the chart somewhat exceeds 26; because there are, in this set, all the single character inscriptions that are alphabetic letters. So the cardinality rule fails: the cardinality of the set of alphabetic letters on the chart turns out not to be the same number as the count of alphabetic letters on the chart.

And this is where we should stop. Set theory fails. But we will allow the mathematicians to coerce us. “An alphabetic letter”, they say, “is a type. A single character inscription strictly is not an alphabetic letter, but is of that type, or a member of that set.”

Very well. Though I see nothing wrong with my definition, I will yield. The alphabetic letters we now have are: the set of As, the set of Bs, ... the set of Zs. We have the count right; it is 26; and that is the cardinality of the set of alphabetic letters (which is a set of sets). But notice that we cannot now say that a particular inscription is an alphabetic letter, for no inscription is a member of the set of alphabetic letters (so we would break the first rule, that to be an F is to be a member of the set of Fs).[6]

And we have to rephrase our question: because there are no alphabetic letters on the chart, only inscriptions that are members of alphabetic letters. We have to ask: how many alphabetic letters contain a member inscribed on the chart?

So much I concede. But now I introduce a symmetric and transitive predicate. But I define it formally:

x is the same alpha-inscription as y =df FOR SOME s, s is an alphabetic letter, and x is a member of s, and y is a member of s.
And I formally define:
x is an alpha-inscription =df x is the same alpha-inscription as x.
And obviously “is the same alpha-inscription as” is our old “is the same alphabetic letter as”, and “is an alpha-inscription” is our old “is an alphabetic letter”. But notice that both these predicates are defined, and defined using only the apparatus demanded by the set theoretician.

As we have seen, in a very obvious sense, there are exactly 26 alpha-inscriptions on the chart. If you count them and get more than 26, you must have counted the same alpha-inscription more than once. I.e. you must have miscounted. Yet the cardinality of the set of alpha-inscriptions on the chart is greater than 26. So, in this case, the second rule breaks down again: the cardinality of the set of Fs is not the same number as the count of Fs.

But now the set theoreticians can’t insist on re-interpretation, for the predicates used have been defined on (are a mere abbreviation for) their own notation.[7]
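The gap between cardinality and count can be checked on a small example. The following Python sketch makes several assumptions that are not in the lecture: a five-inscription chart invented for the purpose, inscriptions modelled as `(position, glyph)` pairs, and each alphabetic letter restricted to the inscriptions actually on that chart. All names are hypothetical.

```python
# A small hypothetical chart, so the numbers are easy to check by eye.
chart = list(enumerate("aAbB1"))   # inscriptions as (position, glyph)

# The set-theoretic reading: each alphabetic letter is a set of inscriptions.
# Only letters with a member on the chart are kept.
alphabetic_letters = set()
for ch in "abcdefghijklmnopqrstuvwxyz":
    members = frozenset(x for x in chart if x[1].lower() == ch)
    if members:
        alphabetic_letters.add(members)


def same_alpha_inscription(x, y):
    """FOR SOME s: s is an alphabetic letter, x is in s, and y is in s."""
    return any(x in s and y in s for s in alphabetic_letters)


def is_alpha_inscription(x):
    return same_alpha_inscription(x, x)


def count_under(things, same):
    """Count by tagging, once per thing under the criterion `same`."""
    tags = {}
    last_uttered = 0
    for i, x in enumerate(things):
        if i in tags or not same(x, x):
            continue
        last_uttered += 1
        for j, y in enumerate(things):
            if same(x, y):
                tags[j] = last_uttered
    return last_uttered


# Cardinality of the set of alpha-inscriptions on the chart.
cardinality = len({x for x in chart if is_alpha_inscription(x)})
# Count of alpha-inscriptions under "is the same alpha-inscription as".
count = count_under(chart, same_alpha_inscription)

print(cardinality, count)
```

Both predicates are built from set membership alone, yet the two numbers differ on this chart.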

 

 

Impossibility of Counting Under Absolute Identity

 

And what does it come to in the end? How does the set theoretician count the members of a set? Using what criterion of identity (to avoid counting the same one twice)? Absolute identity. Alas, no-one can count under absolute identity: is a set of alphabetic letters, say the set of Zs, to be the set of things that are Zs? Or, if it is the set of single character inscriptions that are Zs, then each single character inscription will have to be a set of things that are the same single character inscription. And to this madness there is no end.[8]

It is an axiom of set theory that s is the same set as t when s has all and only the same members as t. And this is easy enough to check if the members of s and of t are themselves sets: we simply reapply the common membership test to find if this member of s is the same set as that member of t. But at some point we have to stop: when we reach members that are not sets. Then how do we tell whether this is the same as that? Use absolute identity?

 

 

Sets as Interpretations

 

I should, I suppose, remark that one solution open to the pure set theoretician (if such there be) is to admit no non-sets whatsoever. You see, you don’t need non-sets because you can have the empty set, the set containing just the empty set, the set containing those two sets, and so on. But somehow I don’t think our users will want to interpret their records in terms of sets of sets of sets of sets ...
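For the curious, the first few such pure sets can be built in Python with nothing but `frozenset`. This is a sketch of the familiar von Neumann construction; the function name `successor` and the variable names are choices made here, not the lecture’s.

```python
def successor(s):
    """The set containing every member of s, together with s itself."""
    return s | frozenset({s})


empty = frozenset()           # the empty set
first = successor(empty)      # the set containing just the empty set
second = successor(first)     # the set containing those two sets

print(len(empty), len(first), len(second))
```

No non-set appears anywhere in the construction, which is the point of the proposal.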

What does it mean in practical terms? When we have just a PART table in our data base, with key PARTNO, then it is fine for the user to treat parts as things (not sets); and likewise, when we have a REGISTRATION table with key PERSONNO, the user treats persons as things (I mean, as persons – not as sets). But then we add new tables, saying that one and the same part comes from different suppliers, or one and the same person has different registrations. And the user has to change their interpretations. A part (what is represented by a PART record) becomes a type, or set of part instances. We don’t have to go quite that far: a part could become merely a set of supplier-parts, where a supplier-part is a part-type received from a single supplier. Likewise, a person becomes a type, or set of registrations. Well, for purposes of theory, you might – with a fair following wind – persuade me to construe a person as a set. But if you think you can make safe interpretation of a data base depend on users taking persons to be sets, you’ve got another think coming.

And – if that isn’t bad enough – every time we modify the data base, adding new record types, or even a new record (or removing record types or records), we have to reinterpret everything. What was a set of non-sets gets reinterpreted as a set of sets of non-sets; what was a non-set becomes a set. And you cheat on the rule that a theory has as theorems all the theorems of its subtheory; that theory U, created by adding vocabulary and postulates to theory T, has all T-theorems as its U-theorems. Formally this remains true, but you reinterpret all the T-theorems: you say they mean something else, they are no longer about parts but about sets of parts, or whatever.

So much for the data base being the enterprise’s view of the world: every minor modification requires a total rethinking of what that world comprises.

 

 

Appeal to Simplicity

 

And yet, look at what the set theoreticians do: counting by, first, applying our relative identities to define equivalence classes; second, forming the set of those equivalence classes; third, finding the cardinality of that set. This is not an alternative to counting using a relative identity predicate: it is counting using such a predicate (there’s no other way to count), but with a massive theoretical overload of sets and sets of sets. So they do everything I do, but I avoid their horrors.
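The three steps can be spelled out on the pack of cards from earlier in the lecture. The Python below is a sketch under assumptions: cards are modelled as `(rank, suit)` pairs, and the helper names (`equivalence_classes`, `count_set_theoretically`, and the `same_...` predicates) are hypothetical.

```python
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
RED = {"diamonds", "hearts"}

# Assumption of this sketch: a card is a (rank, suit) pair.
pack = [(rank, suit) for suit in SUITS for rank in RANKS]


def same_card(x, y):
    return x == y


def same_suit(x, y):
    return x[1] == y[1]


def same_colour(x, y):
    return (x[1] in RED) == (y[1] in RED)


def same_denomination(x, y):
    return x[0] == y[0]


def equivalence_classes(things, same):
    """Step one: apply the relative identity to sort things into classes."""
    classes = []
    for x in things:
        if not same(x, x):
            continue                  # the criterion does not apply to x
        for cls in classes:
            if same(x, cls[0]):
                cls.append(x)         # x joins an existing class
                break
        else:
            classes.append([x])       # x starts a new class
    return classes


def count_set_theoretically(things, same):
    classes = equivalence_classes(things, same)       # step one
    set_of_classes = {frozenset(c) for c in classes}  # step two
    return len(set_of_classes)                        # step three


for criterion in (same_card, same_suit, same_colour, same_denomination):
    print(criterion.__name__, count_set_theoretically(pack, criterion))
```

Step one cannot be carried out without the predicate passed as `same`; steps two and three add the sets and the sets of sets on top of it.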

And the same thing applies in naming: I simply apply a name to a person, or to a part, under a certain criterion of identity: “is the same person as” or “is the same part-type as”. But they apply such a name (as a proper name) to a set of all – what shall I say? – presentations of a person, or to a set of all instances of a part-type. And they end up telling me that I’m not really Geoffrey Platoclast, I’m really just a member (or several members, because I change even as I speak) of the set called “Geoffrey Platoclast”. Or perhaps I’m a member of a member of that set, or a subset of a member of it: but they can’t tell us which until we tell them absolutely everything we want to say.

 


But we never know everything that we might eventually say, so why don’t we just admit that I am indeed Geoffrey Platoclast: I am the very same person as he. The set theoreticians, if they succeed, will end up saying that anyway; they say much more, but what more is to be said?


Copyright © 1993, 2001 Adrian Larner. The author asserts all moral rights.