www.btinternet.com/~adrian.larner/database/pcl05

PLATOCLAST
ON DATA

Lecture V
Predicates and Names

 

 

As I said at the very beginning of this course, the Relational Model is concerned with data structure, manipulation, and integrity. And we have spent quite a lot of time on structure, because we had first to acquaint ourselves with two important tools: the FOPC and the use of criteria of application and identity. But this investment was worth while: it gave us pretty crisp informal definitions of the important concepts: fields, records, tables, etc.; and it allowed us to develop a formalisation of these concepts in the TT-DD.

We looked aside at interpretation, and again we exploited the FOPC to give us what we called the Classical interpretation of relations. And then, within a single lecture, we covered the formalisation and the Classical interpretation of the principal data manipulations: restriction, projection, and join. All these, you will remember, applied to records, and only by extension to relations: a restriction or projection of a relation is merely the relation that comprises that restriction or projection of its records; and the join of two relations is the relation that comprises the joins of records taken pairwise from them.

We haven't touched the issue of integrity yet, and we are still not ready to touch it because it turns on the concept of Null data. In order to understand Null data we must invest yet more time in understanding these tools of ours: the FOPC and, specifically, criteria of identity. This understanding will also lead, with any luck, to a solution of that question about the essentiality of relations.

 

 

Identity

 

I’m going to start by talking about identity. Indeed, I’m going to talk about identity for all of this lecture (though you won’t appreciate that that’s what I’m doing), and probably I’ll go on talking about identity right through the next lecture. It’s an important topic.

You may have noticed, when you saw our general definition of join, restriction, intersection, extension, and so on – the ANDing operation – that it turns on identity. Here’s the definition again:

Rabc = Pab & Qbc
Obviously, either to specify or to implement this ANDing, indeed, to understand it at all, we need to know what it means for the b in Pab to be “the same” as the b in Qbc. And you might think that this was very simple, very obvious.
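
A minimal Python sketch of this ANDing, assuming relations are modelled as sets of tuples. The function name `natural_join` and the sample values are illustrative and not from the lecture. Notice that the whole operation rests on the `==` test: the code simply takes for granted that we know when two b values are "the same".

```python
def natural_join(p, q):
    """Return R such that R(a, b, c) holds exactly when P(a, b) and Q(b, c) hold."""
    # The join condition is an identity test on the shared place b.
    return {(a, b, c) for (a, b) in p for (b2, c) in q if b == b2}


# Hypothetical sample data, chosen only to exercise the definition.
P = {("a1", "b1"), ("a2", "b2")}
Q = {("b1", "c1"), ("b1", "c2"), ("b3", "c3")}

R = natural_join(P, Q)
print(sorted(R))
```

Only the rows that agree on b survive; whatever notion of sameness `==` implements is the notion of identity the join inherits.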

But consider this relation:

ADMIRATION:       ADMIRER    ADMIRED
 
                  Susan      Helen
                  Helen      Marjorie
                  Susan      Marjorie
It means: Susan admires Helen, and Helen admires Marjorie, and Susan admires Marjorie. And now copy it, permute the columns, and rename ADMIRER to ADMIREDBY (it means just the same – ADMIREDBY admires ADMIRED):
BEINGADMIRED:     ADMIRED    ADMIREDBY
 
                  Helen      Susan
                  Marjorie   Helen
                  Marjorie   Susan

 

 

Natural Composition

 

And now do their natural composition, that is, their natural join followed by projecting away the join column(s) – in this case, ADMIRED. We get:

ADMIRER    ADMIREDBY
 
Susan      Susan
Helen      Helen
Helen      Susan
Susan      Helen

The definition of natural composition is (combining those for natural join and projection):
Rac = FOR SOME y, Pay & Qyc
(y is used to quantify the ADMIRED columns, a is a value in ADMIRER, and c in ADMIREDBY.)

The interpretation of this result – the Classical interpretation – is:

ADMIRER admires someone, who is admired by ADMIREDBY.
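
The composition can be reproduced mechanically. This is a sketch only, again assuming relations are sets of tuples; the helper name `compose` is hypothetical, and the data are the three ADMIRATION rows given earlier.

```python
def compose(p, q):
    """Natural composition: R(a, c) holds when, for some y, P(a, y) and Q(y, c)."""
    return {(a, c) for (a, y) in p for (y2, c) in q if y == y2}


# ADMIRATION as (ADMIRER, ADMIRED) pairs.
ADMIRATION = {("Susan", "Helen"), ("Helen", "Marjorie"), ("Susan", "Marjorie")}

# BEINGADMIRED is the same relation with its columns permuted: (ADMIRED, ADMIREDBY).
BEINGADMIRED = {(admired, admirer) for (admirer, admired) in ADMIRATION}

result = compose(ADMIRATION, BEINGADMIRED)
for admirer, admired_by in sorted(result):
    print(admirer, admired_by)
```

The set comprehension plays the part of FOR SOME y: a pair appears in the result as soon as at least one y links the two rows, and the y itself is projected away.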

 

 

Names as Values

 

Well, we’ve spoken of three quite charming girls, Susan, Helen, and Marjorie. I wanted to mention Anne, as well, but she (like Marjorie) admires no-one; and alas, no-one admires her. So you’ll notice that we can’t have a relation telling us about all four girls and about their various admirations for each other. That’s because we would need two predicates, such as “... admires ...” and “... is a girl”, and we can’t express two predicates in one relation. Why not? It’s because relations have to be rectangular, though normal two-dimensional tables don’t have to be rectangular.

Well, let’s beat that problem, and also change the admirations a little:

ADMIRATION:       ADMIRER    ADMIRED
 
                  Anne       no-one
                  Marjorie   no-one
                  Helen      herself
                  Helen      Marjorie
                  Susan      herself
It means: Anne admires no-one, and Marjorie admires no-one, and Helen admires herself, and Helen admires Marjorie, and Susan admires herself. But when we do the composition we get:
ADMIRER    ADMIREDBY
 
Anne       Anne
Anne       Marjorie
Marjorie   Anne
Marjorie   Marjorie
Helen      Helen
Helen      Susan
Susan      Helen
Susan      Susan
And, taking the second row and the sixth row shown, we interpret as:
Anne admires someone, who is admired by Marjorie.
Helen admires someone, who is admired by Susan.
But who are these persons admired by Anne and Marjorie, and by Helen and Susan? They are: no-one and herself.[1]
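
To see which ADMIRED value is doing the work in each composed row, one can keep the join values instead of projecting them away. A sketch under the same sets-of-tuples assumption; `compose_with_witnesses` is a hypothetical helper, not part of the lecture.

```python
from collections import defaultdict


def compose_with_witnesses(p, q):
    """Map each composed pair (a, c) to the set of join values y that produced it."""
    witnesses = defaultdict(set)
    for (a, y) in p:
        for (y2, c) in q:
            if y == y2:
                witnesses[(a, c)].add(y)
    return dict(witnesses)


# The revised ADMIRATION relation as (ADMIRER, ADMIRED) pairs.
ADMIRATION = {
    ("Anne", "no-one"),
    ("Marjorie", "no-one"),
    ("Helen", "herself"),
    ("Helen", "Marjorie"),
    ("Susan", "herself"),
}
BEINGADMIRED = {(admired, admirer) for (admirer, admired) in ADMIRATION}

witnesses = compose_with_witnesses(ADMIRATION, BEINGADMIRED)
print(witnesses[("Anne", "Marjorie")])
print(witnesses[("Helen", "Susan")])
```

The machinery treats "no-one" and "herself" exactly as it treats "Marjorie": as values that are equal to themselves. Nothing in the join can tell that they were never meant as names.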

Let me immediately quell any fears you may have that the Classical interpretation has been destroyed: I reinterpret the ADMIRED predicate as “ADMIRER is characterised by the admiration direction ADMIRED”. The values under ADMIRED are “admiration directions” (showing how the girls admire), and they may be girls’ names: “Susan”, for instance, means “the admiration of Susan by someone else”. The other values are “herself”, meaning “the admiration of herself by someone”; and “no-one”, meaning “the admiration of no-one by someone”.

The composed relation now has the interpretation:

There is some admiration direction that characterises both ADMIRER and ADMIREDBY.
And that is absolutely correct. Susan and Helen are both characterised by the self-admiration direction, and Anne and Marjorie by the non-admiration direction. So what went wrong under the original interpretation?

 

 

Names and Implication

 

The FOPC has this rule:

From Fa to conclude FOR SOME x, Fx.
I.e. that Fa (a proposition formed by applying a predicate, F, to a name, a) implies FOR SOME x, Fx (the proposition formed by applying the predicate to a variable, and existentially quantifying the variable). And this is what justifies Projection, i.e. ensures that if a record represents – if it is – a true proposition then any projection of it is a true proposition. These data manipulations – restriction, projection, and join – are implications of the FOPC: that’s what makes them reliable. The FOPC being valid, its implications never go from truth to falsehood.

So what went wrong? Before our cunning reinterpretation, the values “no-one” and “herself” were not names. We did not interpret them as names. Of course, the FOPC did: we lied to it, and it is – although valid and dependable – utterly gullible (it’s a machine). To put it another way, we’ve interpreted Join and Restriction as conjunctions (ANDings) and Projection as a quantification. But we were wrong to interpret them in this way unless our values are indeed names. If they’re not names then, for instance, projection is not a quantification; indeed, it’s not a valid implication at all, and that’s why it takes us from truth to falsity.

So you can begin to see, I hope, why I was pretty confident that values had to be interpreted as names. Only that interpretation allows the data manipulations to work, i.e. to be underwritten by the FOPC. Abandon the insistence on names, and the manipulations will derive falsehoods from truths: your average user won’t thank you for that.

 

 

Intentional Predicates

 

You may think that’s bad: we now have to be very careful to have only names as values; but things are worse. Not everything that looks like a predicate is a predicate, as far as the FOPC is concerned. Take for example:

...worships ...
Consider an instance:
Hasdrubal worships Moloch.
I assume here that you have all done the pre-requisite course in Phoenician Theology, so you know that Hasdrubal (who was Hannibal’s brother, he of the elephants) and his fellow Carthaginians offered their newborn children as burnt offerings to their god, Moloch.

If we “project over the second column”, i.e. quantify the first place, we get:

FOR SOME x, x worships Moloch.
I.e. someone worships Moloch; there is (timeless “is”) someone that worships Moloch. And this is OK.

But now try it the other way round:

FOR SOME y, Hasdrubal worships y.
I.e. Hasdrubal worships something (some god); there is something that Hasdrubal worships. But there isn’t: there is no such thing. These gods, the horrid gods of the Carthaginians, are – as Isaiah says – “nothings”[2]. Isaiah, of course, knew these dreadful people as “Philistines”.

Anyway, it certainly doesn’t follow from Hasdrubal worshipping some god that there is some god that he worships. A place such as the second place in “... worships ...” is known as “intentional”: it stops the quantification working. And the reason is, I hope, obvious. “Moloch” is not – in our parlance – a name (though it was in Hasdrubal’s: it was the name of his god).
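
The gap between the mechanical projection and the existential claim it is supposed to license can be sketched as follows. Everything here beyond the single row “Hasdrubal worships Moloch” is an assumption made for illustration: the helper names, and in particular the explicit `DOMAIN` set standing in for “the things there actually are”.

```python
def has_row_for(relation, subject):
    """Mechanical projection: is there any row at all with this subject?"""
    return any(x == subject for (x, _y) in relation)


def exists_witness(relation, subject, domain):
    """Existential claim proper: does some actually existing thing fill the second place?"""
    return any(x == subject and y in domain for (x, y) in relation)


WORSHIPS = {("Hasdrubal", "Moloch")}

# Assumed domain of things that actually exist; "Moloch" is deliberately absent.
DOMAIN = {"Hasdrubal", "Hannibal"}

print(has_row_for(WORSHIPS, "Hasdrubal"))             # the projection keeps a row
print(exists_witness(WORSHIPS, "Hasdrubal", DOMAIN))  # but nothing in the domain backs it
```

A relational system performs only the first check. It has no `DOMAIN` to consult, so it is up to the designer to ensure that every value in the column names something.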

There are many intentional predicates, that is, predicates with at least one intentional place. The etymology of “intentional” gives a good impression of the sort of predicates they are: intendo arcum in, I draw the bow at. (I can draw the bow at a deer when, in fact, there is no deer there: it was just a trick of the light.) Verbs of seeking, wanting, aiming, and planning usually have intentional places.[3]

I walk into a bookshop:

Good morning, Sir, can I help you?
 
I’m looking for a detective novel.
 
Which detective novel is it that you’re looking for?
 
Oh, there isn’t a detective novel that I’m looking for. I’m just looking for a detective novel.
 
I’m sorry, Sir?
 
I’m not looking for a particular detective novel, not – so to speak – a specific book, just a detective novel.
 
I regret, Sir, that we sell only specific books. All our detective novels are particular ones. We don’t stock any indefinite ones whatsoever.
Somehow we tread, carefree but safely, through this intentional maze. But the FOPC doesn’t: it simply doesn’t support intentional predicates. Of course, we can use the non-intentional places of such predicates in the FOPC: “... worships Moloch” and “... is looking for a detective novel” are good, non-intentional monadic predicates.

 

 

“Shakespearean” Predicates

 

So you have all noted very carefully that a relation is a predicate, an entirely non-intentional predicate, and a row of such a relation is formed by inserting names – names of actual things – in each of its places. And you think you’re safe then?

Is Douglas here? Right, consider the predicate “I regard ... as a slacker”. And assume that if we insert the name “Douglas” in its place we get a true proposition: I regard Douglas as a slacker. And let us say that this place is not intentional, so we can properly conclude that there is someone that I regard as a slacker. And “Douglas” is certainly a name. So we should be all right.

But I want to tell you about someone whom I do not regard as a slacker. The lad that delivers my morning milk, Jim, is up every morning at four o’clock. He’s never late, and always cheerful. He lives with his crippled mother, who is a widow, and as well as his milk round he does all the housework, and works at a correspondence course in the evening. (I hope you’re all moved by this story, especially you, Douglas!) As I said, I certainly do not regard Jim as a slacker.

But suppose, unknown to me, and – on the face of it, unlikely – that Jim, whom I have seen only muffled up against the cold on winter mornings, is none other than, is the very same person as ... Douglas! And suppose you record these facts in your relational data base, or – equivalently – tell them to the FOPC: that I do not regard Jim as a slacker, and that Jim is identical to Douglas.

The FOPC imposes yet another restriction on predicates: they must be, as Professor Geach says (and he says quite a lot of what I’ve said and will say to you), they must be “Shakespearean”[4]. A rose by any other name would smell as sweet. The FOPC says: a predicate true of something is true of it under any name. If Douglas is Jim, then Douglas smells sweet if and only if Jim smells sweet – that’s Shakespearean. But the FOPC takes all predicates as Shakespearean, so it would conclude: I do not regard Douglas as a slacker. But I jolly well do regard Douglas as a slacker, so once again we’ve gone from truth (well, supposed truth, unless Douglas really is Jim) to falsehood.

What can we say? Although “Douglas” is a name, and the place in “I regard ... as a slacker” is non-intentional, that predicate is not acceptable to the FOPC: it’s not Shakespearean. Things are getting worse and worse, aren’t they?
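
What the FOPC does with the Douglas and Jim facts can be imitated in a few lines. This is a sketch under stated assumptions: facts are stored as (predicate, name, truth value) triples, identities as pairs of names, and only directly stated identities are followed (no transitive closure). The function and predicate names are hypothetical.

```python
def substitute_identicals(facts, identities):
    """Close a set of (predicate, name, truth_value) facts under substitution of identical names."""
    # Build a symmetric alias table from the stated identities.
    aliases = {}
    for a, b in identities:
        aliases.setdefault(a, {a}).add(b)
        aliases.setdefault(b, {b}).add(a)
    closed = set(facts)
    for pred, name, value in facts:
        for other in aliases.get(name, {name}):
            closed.add((pred, other, value))
    return closed


def contradictions(facts):
    """Return the (predicate, name) pairs asserted both true and false."""
    return {(pred, name) for (pred, name, value) in facts if (pred, name, not value) in facts}


FACTS = {
    ("regarded_as_slacker", "Douglas", True),
    ("regarded_as_slacker", "Jim", False),
}
IDENTITIES = {("Jim", "Douglas")}

closed = substitute_identicals(FACTS, IDENTITIES)
print(sorted(contradictions(closed)))
```

Before the identity is recorded the two facts sit together quite consistently. The contradiction appears only once every predicate is treated as Shakespearean, which is precisely what the substitution step does.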

What amazes me is that all these people toil away designing data bases, but they don’t pause to ask “Is this an intentional predicate?”, or “Is this place Shakespearean?” And have you seen the values they use? Of course, I’ve been fooling you by carefully chosen examples: Abraham and Isaac, Susan and Helen. But what do we see in real data bases? Dates, counts, colours, flags that distinguish overseas from domestic suppliers. Are these names?

 

 

Self-Interpretation

 

On the face of it, the chances of anyone getting a data base design right, even to the extent of not generating totally misleading responses to queries, seem very slim indeed. They might make you wonder whether the FOPC, and therefore the Relational Model, is appropriate at all.

Let us suppose you have designed a data base, and you have ignored all these problems: some of your values aren’t names; some of your record types represent intentional predicates; some, even of your non-intentional places (field-types), are not Shakespearean. And you come to me and say, “Professor Platoclast” – or “Geoffrey”, you say – “Can you help me out without doing a complete redesign?” And, of course, you offer an appropriate fee.

And I can help you out without changing your design at all. I look at each of your record-types, and I design a paper form that has just the same field-types. And I fill in an appropriate one of these forms for each record in your data base. And then I re-interpret your records.

For example, suppose you have a VACANCY record, with field-types (columns) VACNO, JOB, YEARSEXP, etc. It means, you say:

We have a vacancy, numbered VACNO, for a JOB, with YEARSEXP years of experience, etc.
And a particular example is:
We have a vacancy, numbered 105, for a widget gauger, with 7 years of experience, etc.
And you’ll remember that I have a paper form, a VACANCY form, which says:
VACNO: 105, JOB: widget gauger, YEARSEXP: 7 etc.
So now I give you a new interpretation of your record-type:
We have a vacancy form, with a vacancy number called VACNO, a job description called JOB, a years of experience entry called YEARSEXP, etc.
And of your particular record:
We have a vacancy form, with a vacancy number called “105”, a job description called “widget gauger”, a years of experience entry called “7”, etc.
And, behold: as long as you keep one of my forms for each of your records, your data base now meets all the requirements of the FOPC. 7 may not be a name, but “7” certainly is: it’s the name of 7. “We have a vacancy for ...” may have an intentional place, but the place in “We have a form with a job description called ...” is not intentional. I see Douglas is looking worried: just try telling your data base management system that, instead of Douglas being identical to Jim, which it accepts without demur, the name “Douglas” is identical to the name “Jim”: even the daftest software can tell that’s false.

Where does this get you? Well, your data base records no longer constitute statements about the wider world, merely about all my paper forms. So to understand what they mean in terms of the wider world, you’ll have to understand my forms. But that’s not too difficult a task for most people: we can read filled-in forms. I should mention, of course, that one day it will dawn on you that the paper forms are quite superfluous: they’re just copies of your records. So you could throw them away, and when you have a record that means, “We have a vacancy form ...”, you can take it that the form it refers to is itself. (I didn’t say paper form, did I?) But by the time that occurs to you, I’ll have spent my well-earned fee on the good things of life.

This sort of interpretation, let’s call it self-interpretation, is one way to use the Classical interpretation in safety. But it works because it’s very unambitious: it purports to say nothing about the wider world, only about our records. All further interpretation, all attempts to work out whether Susan and Helen admire the same person, or whether there is something that Hasdrubal worships, or whether Abraham is paternal grandfather of Jacob, are left entirely to the user. And you may feel, as many people have felt, that we should be a bit more ambitious than that. But, of course, it’s better to be unambitious and right than ambitious and wrong, so self-interpretation is perfectly respectable; but boring.

 

 

The Join Trap

 

Let’s look now at a notorious problem of data interpretation, noted very early by Dr Codd, and called by him the “Connection Trap”, because he thought it characterised the sort of non-relational data base that had connections (pointers and the like) between records.[5] It’s now called the “Join Trap”, because it occurs in relational data bases as well.

We have these relations:

SUPPLIES:      SUPPLIER     PART
 
               S            P
               T            P
               ...          ...

USES:          PROJECT      PART
 
               J            P
               J            P
               ...          ...
Notice that a part can be supplied by many suppliers and used in many projects; and a supplier can supply, or a project use, many different parts. We assume the following interpretations:
Supplier SUPPLIER supplies part PART.
Project PROJECT uses part PART.
Thus, as shown, supplier S supplies part P, and supplier T supplies part P, and Project J uses part P. (And perhaps those suppliers, and others, supply other parts, and perhaps other suppliers supply part P, and perhaps other projects use part P and other parts, and perhaps project J uses other parts.)

We perform the natural composition of SUPPLIES and USES (joining on PART):

SUPPLIER   PROJECT
 
S          J
T          J
It means (and don’t quibble, I’m mechanically applying the FOPC):
Supplier SUPPLIER supplies something that is used in Project PROJECT.
And specifically:
Supplier S supplies something that is used in Project J.
And it’s easy to see that this could be false while the interpretations of the SUPPLIES and USES relations were true. (Suppose that everything used by project J comes from supplier T – why not?) So there must be something wrong with ... what? No! Not the FOPC, but the interpretation of SUPPLIES, or of USES, or both.
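
The trap is easy to reproduce. In this sketch the SUPPLIES and USES rows are those shown above; the `DELIVERED` relation is a hypothetical extra, not in the lecture’s tables, invented only to represent the case where everything project J uses comes from supplier T.

```python
def compose_on_part(supplies, uses):
    """Pair each supplier with each project whose rows share a PART value."""
    return {(s, j) for (s, part) in supplies for (j, part2) in uses if part == part2}


SUPPLIES = {("S", "P"), ("T", "P")}  # (SUPPLIER, PART)
USES = {("J", "P")}                  # (PROJECT, PART)

composed = compose_on_part(SUPPLIES, USES)

# Hypothetical finer-grained facts: (SUPPLIER, PART, PROJECT) for parts
# actually delivered to a project. Here all of J's P parts come from T.
DELIVERED = {("T", "P", "J")}
actual = {(s, j) for (s, _part, j) in DELIVERED}

print(sorted(composed))           # what the composition asserts
print(sorted(composed - actual))  # asserted, but not backed by any delivery
```

The composition itself is faultless; the surplus row arises because the two rows mentioning P are joined as though they were about one and the same thing.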

Before we know what the problem is, we can be pretty confident that self-interpretation would solve it: it would. But the user might then misinterpret the forms to exactly the same effect (tough on the user – self-interpretation means washing our hands of that responsibility).

 

 

Proper Names

 

The problem is that the PART value P, although it is a name, is not a proper name. It doesn’t name one and only one thing distinguishable by the FOPC together with the predicates used in the data base. Indeed, we can see that here we have something called P and supplied by S, there we have something (clearly something else) called P and supplied by T. At least, it’s clearly something else if nothing is supplied by both S and T.

We could say, of course, that something is supplied by both S and T, namely P. And in that case the above interpretation would be true:

Supplier S supplies something that is used in Project J
namely P. But you might think it a strange thing, this P. And when project J failed because of defective parts, supplier S might think it a strange thing to be sued for damages.

So, if we are not content with self-interpretation, we have yet another problem to consider: using names in our data bases that are not proper names. But how avoidable is this? You see, we might have just PART records and PROJECT records – if we obtained all our parts either from a single supplier or by manufacturing them in-house, and we marked that in the PART records. I assume here that any kind of part is either bought in or made in-house, but not both. We wouldn’t have SUPPLIES records. Then PART might well be a proper name: we might be happy to say that the same thing (the same PART) was used in different projects.

But we might then change our practices, and have to add the SUPPLIES records. And what was a proper name, P for instance, would become another sort of name: a common name. Of course, there’s nothing wrong with common names, “student” is a good common name, a name borne by each of you here today. It’s just that the FOPC can’t handle common names as names.

So it seems that every time we add a record-type to our data base we have to check to see whether any of our proper names have become common names, and then what? Redesign and reinterpret the data base? Might we end up having to give a name to, and keep a record for, each part instance? each individual nut, screw, washer, and bolt?

Should we now retreat to self-interpretation? Or can we find a better way? Not before this lecture finished five minutes ago we can’t.

 

 

The FOPC and Identity

 

 

(In the editor’s opinion, the following brief exposition by Platoclast fits chronologically at this point. Its exact provenance is unknown, being recorded by Ms Genudomini on a loose sheet of paper used by her as a bookmark.)

I really don’t know, my angel, whether there is a better way. I hope so, or I’m going to run out of things to say before half-term.

I think there is, and it all turns on this notion of identity. When you think about it, the problem of the non-names, like “no-one” and “herself”, is that they don’t denote one and the same admired person on each use (“no-one” doesn’t denote any person, and “herself” denotes different persons – if it “denotes” at all). The problem with “Moloch” is that it doesn’t denote anything (thank goodness), so it certainly doesn’t denote the same thing on each use: what isn’t there isn’t there to be the same. Its place – in “Hasdrubal worshipped ...” – is intentional in this sense: it can be filled with such a pseudo-name and yet create a true proposition. Non-intentional places, the only sort recognised by the FOPC, demand real names: that’s what justifies the implication of an existential quantification.

What’s wrong with the non-Shakespearean place? The problem is that whatever’s meant by the identity, “Douglas is the same as Jim”, it isn’t Leibnizian[6] identity: it doesn’t allow substitution of the one name for the other in all contexts salva veritate (that means: without going from truth to falsehood). We know that however identical Douglas and Jim may be, there’s at least one predicate true of one but false of the other, namely “I regard ... as a slacker”. So in one sense at least they are not identical, the one being regarded by me as a slacker, and the other not.

And the problem with part P in the Join Trap is that “P” denotes a part supplied by S, and denotes a part not supplied by S, so we get the same name for two things which our predicates distinguish.

Self-interpretation gets rid of all these problems. It works by ensuring that we talk only about inscriptions, on paper or on a magnetic medium, and we have a very simple method of telling whether this is the same inscription as that, or is typographically identical to that. Self-interpretation works by making identity unproblematical.

 

So, my sweet, all these problems with interpretation turn on the notion of identity, and I’m going to hack at that on Thursday, after the usual liquid lunch.


Copyright © 1993, 2001 Adrian Larner. The author asserts all moral rights.