(U03)
www.btinternet.com/~adrian.larner/database/pcl05

PLATOCLAST Lecture V

As I said at the very beginning of this course, the Relational Model is concerned with data structure, manipulation, and integrity. And we have spent quite a lot of time on structure, because we had first to acquaint ourselves with two important tools: the FOPC and the use of criteria of application and identity. But this investment was worthwhile: it gave us pretty crisp informal definitions of the important concepts: fields, records, tables, etc.; and it allowed us to develop a formalisation of these concepts in the TT-DD. We looked aside at interpretation, and again we exploited the FOPC to give us what we called the Classical interpretation of relations. And then, within a single lecture, we covered the formalisation and the Classical interpretation of the principal data manipulations: restriction, projection, and join. All these, you will remember, applied to records, and only by extension to relations: a restriction or projection of a relation is merely the relation that comprises that restriction or projection of its records; and the join of two relations is the relation that comprises the joins of records taken pairwise from them. We haven't touched the issue of integrity yet, and we are still not ready to touch it because it turns on the concept of Null data. In order to understand Null data we must invest yet more time in understanding these tools of ours: the FOPC and, specifically, criteria of identity. This understanding will also lead, with any luck, to a solution of that question about the essentiality of relations.

Identity

I’m going to start by talking about identity. Indeed, I’m going to talk about identity for all of this lecture (though you won’t appreciate that that’s what I’m doing), and probably I’ll go on talking about identity right through the next lecture. It’s an important topic. You may have noticed, when you saw our general definition of join, restriction, intersection, extension, and so on (the ANDing operation), that it turns on identity. Here’s the definition again:
But consider this relation:

Natural Composition

And now do their natural composition, that is, their natural join followed by projecting away the join column(s): in this case, ADMIRED. We get:
The interpretation of this result (the Classical interpretation) is:

Names as Values

Well, we’ve spoken of three quite charming girls, Susan, Helen, and Marjorie. I wanted to mention Anne, as well, but she (like Marjorie) admires no-one; and alas, no-one admires her. So you’ll notice that we can’t have a relation telling us about all four girls and about their various admirations for each other. That’s because we would need two predicates, such as “... admires ...” and “... is a girl”, and we can’t express two predicates in one relation. Why not? It’s because relations have to be rectangular, though normal two-dimensional tables don’t have to be rectangular. Well, let’s beat that problem, and also change the admirations a little:
Let me immediately quell any fears you may have that the Classical interpretation has been destroyed: I reinterpret the ADMIRED predicate as “ADMIRER is characterised by the admiration direction ADMIRED”. The values under ADMIRED are “admiration directions” (showing how the girls admire), and they may be girls’ names: “Susan”, for instance, means “the admiration of Susan by someone else”. The other values are “herself”, meaning “the admiration of herself by someone”; and “no-one”, meaning “the admiration of no-one by someone”. The composed relation now has the interpretation:

Names and Implication

The FOPC has this rule:
So what went wrong? Before our cunning reinterpretation, the values “no-one” and “herself” were not names. We did not interpret them as names. Of course, the FOPC did: we lied to it, and it is, although valid and dependable, utterly gullible (it’s a machine). To put it another way, we’ve interpreted Join and Restriction as conjunctions (ANDings) and Projection as a quantification. But we were wrong to interpret them in this way unless our values are indeed names. If they’re not names then, for instance, projection is not a quantification; indeed, it’s not a valid implication at all, and that’s why it takes us from truth to falsity. So you can begin to see, I hope, why I was pretty confident that values had to be interpreted as names. Only that interpretation allows the data manipulations to work, i.e. to be underwritten by the FOPC. Abandon the insistence on names, and the manipulations will derive falsehoods from truths: your average user won’t thank you for that.
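A minimal sketch in Python of how pseudo-names can mislead the composition. The lecture's own table is not reproduced here, so the `ADMIRES` rows below are invented for illustration, as are the helper names `natural_composition`, `derived_pairs`, and `actually_admired`. The composition pairs up admirers that share an ADMIRED value; read Classically, each derived pair asserts that there is someone whom both girls admire.

```python
# Hypothetical rows (ADMIRER, ADMIRED); the lecture's own table is not shown here.
ADMIRES = {
    ("Susan", "herself"),
    ("Helen", "herself"),
    ("Marjorie", "no-one"),
    ("Anne", "no-one"),
}


def natural_composition(left, right):
    """Join left(A, B) with right(B, C) on B, then project B away."""
    return {(a, c) for (a, b1) in left for (b2, c) in right if b1 == b2}


def derived_pairs(admires):
    """Distinct admirers sharing an ADMIRED value, each pair listed once."""
    flipped = {(admired, admirer) for (admirer, admired) in admires}
    return {(x, y) for (x, y) in natural_composition(admires, flipped) if x < y}


def actually_admired(admirer, value):
    """Who is really admired, reading the pseudo-names as intended."""
    if value == "no-one":
        return set()
    if value == "herself":
        return {admirer}
    return {value}


derived = derived_pairs(ADMIRES)

# Pairs that truly share an admired person, under the intended reading.
truly_shared = {
    (x, y)
    for (x, vx) in ADMIRES
    for (y, vy) in ADMIRES
    if x < y and actually_admired(x, vx) & actually_admired(y, vy)
}

print(sorted(derived))       # pairs the composition claims
print(sorted(truly_shared))  # pairs that really share an admired person
```

With these assumed rows the composition claims that Helen and Susan admire the same person, and likewise Anne and Marjorie, although under the intended reading no two of the girls admire anyone in common. The join and projection are computed correctly; the falsehood comes from treating “herself” and “no-one” as names.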

Intentional Predicates

You may think that’s bad: we now have to be very careful to have only names as values; but things are worse. Not everything that looks like a predicate is a predicate, as far as the FOPC is concerned. Take for example:
If we “project over the second column”, i.e. quantify the first place, we get:
But now try it the other way round:
Anyway, it certainly doesn’t follow from Hasdrubal worshipping some god that there is some god that he worships. A place such as the second place in “... worships ...” is known as “intentional”: it stops the quantification working. And the reason is, I hope, obvious. “Moloch” is not in our parlance a name (though it was in Hasdrubal’s: it was the name of his god). There are many intentional predicates, that is, predicates with at least one intentional place. The etymology of “intentional” gives a good impression of the sort of predicates they are: intendo arcum in, I draw the bow at. (I can draw the bow at a deer when, in fact, there is no deer there: it was just a trick of the light.) Verbs of seeking, wanting, aiming, and planning usually have intentional places.[3] I walk into a bookshop:

Shakespearean Predicates

So you have all noted very carefully that a relation is a predicate, an entirely non-intentional predicate, and a row of such a relation is formed by inserting names (names of actual things) in each of its places. And you think you’re safe then? Is Douglas here? Right, consider the predicate “I regard ... as a slacker”. And assume that if we insert the name “Douglas” in its place we get a true proposition: I regard Douglas as a slacker. And let us say that this place is not intentional, so we can properly conclude that there is someone that I regard as a slacker. And “Douglas” is certainly a name. So we should be all right. But I want to tell you about someone whom I do not regard as a slacker. The lad that delivers my morning milk, Jim, is up every morning at four o’clock. He’s never late, and always cheerful. He lives with his crippled mother, who is a widow, and as well as his milk round he does all the housework, and works at a correspondence course in the evening. (I hope you’re all moved by this story, especially you, Douglas!) As I said, I certainly do not regard Jim as a slacker. But suppose (unknown to me and, on the face of it, unlikely) that Jim, whom I have seen only muffled up against the cold on winter mornings, is none other than, is the very same person as ... Douglas! And suppose you record these facts in your relational data base, or equivalently tell them to the FOPC: that I do not regard Jim as a slacker, and that Jim is identical to Douglas. The FOPC imposes yet another restriction on predicates: they must be, as Professor Geach says (and he says quite a lot of what I’ve said and will say to you), “Shakespearean”[4]. A rose by any other name would smell as sweet. The FOPC says: a predicate true of something is true of it under any name. If Douglas is Jim, then Douglas smells sweet if and only if Jim smells sweet: that’s Shakespearean.
But the FOPC takes all predicates as Shakespearean, so it would conclude: I do not regard Douglas as a slacker. But I jolly well do regard Douglas as a slacker, so once again we’ve gone from truth (well, supposed truth, unless Douglas really is Jim) to falsehood. What can we say? Although “Douglas” is a name, and the place in “I regard ... as a slacker” is non-intentional, that predicate is not acceptable to the FOPC: it’s not Shakespearean. Things are getting worse and worse, aren’t they?
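The substitution step can be sketched in Python under some loudly stated assumptions: facts are stored as `(predicate, name, holds)` triples, identities as pairs of names, and the helpers `leibniz_closure` and `contradictions` are hypothetical names introduced here, not anything from the lecture. The fact that Jim smells sweet is likewise invented, simply to set a well-behaved predicate beside the troublesome one. The sketch does what the FOPC is described as doing: it treats every predicate as Shakespearean and copies each fact to every name declared identical.

```python
def leibniz_closure(facts, identities):
    """Copy every fact about a name to all names declared identical to it."""
    alias = {}
    for a, b in identities:
        # Merge the groups of a and b so that identity is treated as transitive.
        group = alias.get(a, {a}) | alias.get(b, {b})
        for name in group:
            alias[name] = group
    return {
        (predicate, other, holds)
        for (predicate, name, holds) in facts
        for other in alias.get(name, {name})
    }


def contradictions(facts):
    """(predicate, name) pairs recorded as both true and false."""
    return {(p, n) for (p, n, holds) in facts if (p, n, not holds) in facts}


SLACKER = "I regard ... as a slacker"
SWEET = "... smells sweet"  # assumed fact, for contrast only

facts = {
    (SLACKER, "Douglas", True),
    (SLACKER, "Jim", False),
    (SWEET, "Jim", True),
}
identities = [("Jim", "Douglas")]

closed = leibniz_closure(facts, identities)
clashes = contradictions(closed)
print(sorted(clashes))
```

Before the closure the recorded facts contain no clash; after it, the slacker predicate is both asserted and denied of Douglas and of Jim, while the sweet-smelling predicate carries over to Douglas without trouble.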
What amazes me is that all these people toil away designing data bases, but they don’t pause to ask “Is this an intentional predicate?”, or “Is this place Shakespearean?” And have you seen the values they use? Of course, I’ve been fooling you by carefully chosen examples: Abraham and Isaac, Susan and Helen. But what do we see in real data bases? Dates, counts, colours, flags that distinguish overseas from domestic suppliers. Are these names?

Self-Interpretation

On the face of it, the chances of anyone getting a data base design right, even to the extent of not generating totally misleading responses to queries, seem very slim indeed. They might make you wonder whether the FOPC, and therefore the Relational Model, is appropriate at all. Let us suppose you have designed a data base, and you have ignored all these problems: some of your values aren’t names; some of your record types represent intentional predicates; some, even of your non-intentional places (field-types), are not Shakespearean. And you come to me and say, “Professor Platoclast” or “Geoffrey”, you say “Can you help me out without doing a complete redesign?” And, of course, you offer an appropriate fee. And I can help you out without changing your design at all. I look at each of your record-types, and I design a paper form that has just the same field-types. And I fill in an appropriate one of these forms for each record in your data base. And then I re-interpret your records. For example, suppose you have a VACANCY record, with field-types (columns) VACNO, JOB, YEARSEXP, etc. It means, you say:
Where does this get you? Well, your data base records no longer constitute statements about the wider world, merely about all my paper forms. So to understand what they mean in terms of the wider world, you’ll have to understand my forms. But that’s not too difficult a task for most people: we can read filled-in forms. I should mention, of course, that one day it will dawn on you that the paper forms are quite superfluous: they’re just copies of your records. So you could throw them away, and when you have a record that means, “We have a vacancy form ...”, you can take it that the form it refers to is itself (I didn’t say paper form, did I?) But by the time that occurs to you, I’ll have spent my well-earned fee on the good things of life.
This sort of interpretation, let’s call it self-interpretation, is one way to use the Classical interpretation in safety. But it works because it’s very unambitious: it purports to say nothing about the wider world, only about our records. All further interpretation, all attempts to work out whether Susan and Helen admire the same person, or whether there is something that Hasdrubal worships, or whether Abraham is paternal grandfather of Jacob, are left entirely to the user. And you may feel, as many people have felt, that we should be a bit more ambitious than that. But, of course, it’s better to be unambitious and right than ambitious and wrong, so self-interpretation is perfectly respectable; but boring.

The Join Trap

Let’s look now at a notorious problem of data interpretation, noted very early by Dr Codd, and called by him the “Connection Trap”, because he thought it characterised the sort of non-relational data base that had connections (pointers and the like) between records.[5] It’s now called the “Join Trap”, because it occurs in relational data bases as well. We have these relations:
We perform the natural composition of SUPPLIES and USES (joining on PART):
Before we know what the problem is, we can be pretty confident that self-interpretation would solve it: it would. But the user might then misinterpret the forms to exactly the same effect (tough on the user: self-interpretation means washing our hands of that responsibility).

Proper Names

The problem is that the PART value P, although it is a name, is not a proper name. It doesn’t name one and only one thing distinguishable by the FOPC together with the predicates used in the data base. Indeed, we can see that here we have something called P and supplied by S, there we have something (clearly something else) called P and supplied by T. At least, it’s clearly something else if nothing is supplied by both S and T. We could say, of course, that something is supplied by both S and T, namely P. And in that case the above interpretation would be true:
So, if we are not content with self-interpretation, we have yet another problem to consider: using names in our data bases that are not proper names. But how avoidable is this? You see, we might have just PART records and PROJECT records if we obtained all our parts either from a single supplier or by manufacturing them in-house, and we marked that in the PART records. I assume here that any kind of part is either bought in or made in-house, but not both. We wouldn’t have SUPPLIES records. Then PART might well be a proper name: we might be happy to say that the same thing (the same PART) was used in different projects. But we might then change our practices, and have to add the SUPPLIES records. And what was a proper name, P for instance, would become another sort of name: a common name. Of course, there’s nothing wrong with common names: “student” is a good common name, a name borne by each of you here today. It’s just that the FOPC can’t handle common names as names. So it seems that every time we add a record-type to our data base we have to check to see whether any of our proper names have become common names, and then what? Redesign and reinterpret the data base? Might we end up having to give a name to, and keep a record for, each part instance: each individual nut, screw, washer, and bolt? Should we now retreat to self-interpretation? Or can we find a better way? Not before this lecture finished five minutes ago we can’t.
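The Join Trap discussed above can be sketched in Python. The lecture's SUPPLIES and USES tables are not reproduced here, so everything concrete below is an assumption: the project names `J` and `K`, the three-column `DELIVERIES` relation standing in for what really happens to each part instance, and the helper `natural_composition`. The sketch treats P as a common name borne by two different things, one supplied by S and one by T.

```python
# Hypothetical ground truth (SUPPLIER, PART, PROJECT): whose P goes to which project.
DELIVERIES = {("S", "P", "J"), ("T", "P", "K")}

# The two binary relations the data base holds are projections of that truth.
SUPPLIES = {(s, p) for (s, p, _) in DELIVERIES}
USES = {(p, j) for (_, p, j) in DELIVERIES}


def natural_composition(left, right):
    """Join left(A, B) with right(B, C) on B, then project B away."""
    return {(a, c) for (a, b1) in left for (b2, c) in right if b1 == b2}


# Joining on PART and projecting it away gives (SUPPLIER, PROJECT) pairs.
composed = natural_composition(SUPPLIES, USES)

# Pairs actually backed by some part instance, and the pairs that are not.
supported = {(s, j) for (s, _, j) in DELIVERIES}
spurious = composed - supported

print(sorted(composed))
print(sorted(spurious))
```

Under these assumptions the composition yields four supplier and project pairs, two of which correspond to no delivery at all. If P were a proper name, one thing supplied by both S and T, all four pairs would be true, which is why the trap depends on the kind of name P is and not on the join itself.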

The FOPC and Identity

(In the editor’s opinion, the following brief exposition by Platoclast fits chronologically at this point. Its exact provenance is unknown, being recorded by Ms Genudomini on a loose sheet of paper used by her as a bookmark.) I really don’t know, my angel, whether there is a better way. I hope so, or I’m going to run out of things to say before half-term. I think there is, and it all turns on this notion of identity. When you think about it, the problem of the non-names, like “no-one” and “herself”, is that they don’t denote one and the same admired person on each use (“no-one” doesn’t denote any person, and “herself” denotes different persons if it “denotes” at all). The problem with “Moloch” is that it doesn’t denote anything (thank goodness), so it certainly doesn’t denote the same thing on each use: what isn’t there isn’t there to be the same. Its place in “Hasdrubal worshipped ...” is intentional in this sense: it can be filled with such a pseudo-name and yet create a true proposition. Non-intentional places, the only sort recognised by the FOPC, demand real names: that’s what justifies the implication of an existential quantification. What’s wrong with the non-Shakespearean place? The problem is that whatever’s meant by the identity, “Douglas is the same as Jim”, it isn’t Leibnizian[6] identity: it doesn’t allow substitution of the one name for the other in all contexts salva veritate (that means: without going from truth to falsehood). We know that however identical Douglas and Jim may be, there’s at least one predicate true of one but false of the other, namely “I regard ... as a slacker”. So in one sense at least they are not identical, the one being regarded by me as a slacker, and the other not. And the problem with part P in the Join Trap is that “P” denotes a part supplied by S, and denotes a part not supplied by S, so we get the same name for two things which our predicates distinguish. Self-interpretation gets rid of all these problems. It works by ensuring that we talk only about inscriptions, on paper or on a magnetic medium, and we have a very simple method of telling whether this is the same inscription as that, or is typographically identical to that. Self-interpretation works by making identity unproblematical.

So, my sweet, all these problems with interpretation turn on the notion of identity, and I’m going to hack at that on Thursday, after the usual liquid lunch.

Copyright © 1993, 2001 Adrian Larner. The author asserts all moral rights.