SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Tuesday, May 02, 2006

Has E/R had a negative impact on db

SQL Apprentice Question
I don't like entities. In fact I despise entities, as the enemy of good
information philosophy. You see I just don't accept their existence.
There is no magical wrapper surrounding some construct that turns it
into a nicely formed 'thing' or 'object'. Sure, we can invent them, but
there is never any inherent truth to their boundaries. Now in the
constraints of a piece of my OO code, which is only going to be used by
other parts of my code, and whose domain I have total control over this
is not a problem. But everywhere else in information modelling it is,
as it encourages one to view those boundaries as non-decomposable.


To me the situation get worse when one then adds relationships between
entities. What on earth is the difference between an entity (an
association of attributes) and a relationship (an association of
entities), except for the fact that a relationship is constrained to
being binary. Nothing. In fact all they are, are special cases of n-ary
relationships, or 'associative entities'. Everything we want to model
is ultimately an associative entity, or better put everything we want
to model is an n-ary _relationship_.


Now Codd grokked this. I am sure he did. That's why the RM has no
'links' and pivots on the information principle. Chen did not -
remember his E/R model was originally intended to be a direct
competitor to RM, not just a tool for conceptual modeling.


Okay, so for those in the know this isn't an issue and E/R is a useful
tool. But for those not in the know (which appears to be a lot of the
industry) it promotes the fallacy of the Entity/Relationship
distinction, of impenetrable wrappers, and encourages the mindset that
has lead to OODBMS, XML databases, etc.


So I put to you that Chen's E/R has had a greater deleterious influence
to fashion in db theory than any other paper in the last 30 years.

Celko Answers
One problem is that a relationship in a ER diagram is a line between
boxes. That puts us in a binary-only world. ORM, on the other hand,
allows n-ary relationships. And the diagram has no other information
about the nature of the relationship, like ORM does.

The map becomes the territory. And when you look at query generating
tools based on ER diagrams, they tend "follow the lines", so to answer
a questlon like "Who was in the mail room the day we invoiced XYZ
company?" becomes a chain of joins instead of a match on two date
columns.

No comments: