Monday, March 28, 2011

Database Design to Enforce Cardinality

Database designers need to put as much data integrity rules as possible in their design artifacts with rigorous verification that nothing is missed. If relationships are not reflected in RDM (Relational Data Model), then they need to be stated somewhere in the documents. When writing code for accessing data, no matter it is about an atomic action such as CRUD, or about a composite action that comprises a set of the atomic actions, developers need a clear vision about the data relationships. In development environment, developers get this vision from within database tools. The tool won't show undefined RDM relationships. Database architect should not expect developers to guess them.

The question is, can all possible data relationships be modeled with RDM? To answer this question, we need first answer another question, what are the forms of all possible data relationships? This is a fairly broad question and it could be hard to answer generally. But from OOA (Object-Oriented Analysis) models, we can see three types of relationships have been identified: (1) is-a, (2) has, and (3) association. "is-a" says that one must have a parent and it inherits the data from the parent. On the other hand, "has", including aggregation and composition, addresses ownership. Aggregation addresses that one owns other entities, proprietarily, meaning that when one is deleted, the “others” should be deleted in cascade. In contrary, composition relationship is less strong ownership, even one owns, but the ownees could be owned by others in a sharing manner. So when the owner is deleted, the ownees should not be deleted cascadely. Finally, association is a special type of "has" - it has a third entity involved to express relationships between another two entities.

"has" relationship has another perspective to look at - cardination. Cardination attribute is to specify the numbers one may have. As we have seen in OOM, cardinations of 1, 0..1, 1..m, 0..m are the ones people put attention to.
Now back to our question, can we implement all the relationships and cardinations in database design? And how?

In RDM, the only weapon for defining relationship is the FK - foreign key. With FK we really say "reference". When one refers to another, it implies that "the 'another' 'has' the 'one'". But it can also imply that "the 'one' belongs to the 'another'". However, the important difference between "has" and "belongs" is that in "has" case the "another" must exists firstly; while in "belongs" case the "one" has to be existed firstly. E.g., "A has B and C" means there is an A, and B and C has foreign key relationship to A. If we say "B and C belongs to A", it means B and C have to existed, while A is optionally existed. If A is not existed, then B and C has the FK value null.

In order to implement the "has" and "belongs", as well as the cardination that comes as part of the "has" relationship, we can use "unique" and "nullable" domain integrity tools together with FK. Following table shows that types of relationships "has", but "belongs" can be realized if we flip over the relationships.

R1: 1 - 0..m FK, not null
R2: 1 - 0..1 FK, not null, unique
R3: 0..1 - 0..m FK, null
R4: 0..1 - 0..1 FK, null, unique

From above table, both "is" and "has" can be implemented with R1 and R2. R1 indicates that one can have many of others; R2 says that one can only have maximum one other. On top of this, we can implement proprietary ownership by applying cascade actions to the foreign key.

For R2 implementation, FK and PK can be on one column. That implements "is" relationship.

R3 and R4 implement “belongs” relationship; the referrers may or may not belong to a referee.

No comments: