ABSTRACT
Entity-Relationship (ER) diagrams are frequently used for data modeling and database design. The Unified Modeling Language (UML) is dominant in the programming area but has not been widely adopted in the database area. I describe the history of UML as inspired by ER diagrams and argue that the use of a suitable variant of UML incorporates the benefits of ER diagrams as well as the advantages of a modeling language used by the programming community.
Keywords: UML, ER diagram, Data modeling
1. UML AND DATA MODELING
UML is not well accepted by many people in the database field, despite a heritage that derives from ER diagrams. Both database and programming language people make assumptions about the role of modeling, UML, and ER diagrams, assumptions that I feel are limiting. Used correctly, UML is an excellent tool for data modeling. In this paper, I describe some of the history of UML and argue for a broader understanding of modeling.
2. ASSUMPTIONS
The prevailing viewpoint among both database and programming language people is that UML is a tool for building applications using object-oriented programming languages, a tool that is not very suitable for database design. Under this viewpoint, UML is identified with the principles of Smalltalk and, by transference since the decline of that language, with C++ and Java. I would summarize those principles as follows:
* identity of objects is the fundamental concept
* data is stored within encapsulated objects
* data in objects is accessed only through methods
* navigation through a network of data is accomplished by a series of method calls
* objects are connected by references stored in objects
* inheritance is of major importance in organizing types and reusing implementation
* models are for designing programs
* the system modeling focus is "data in motion"
These information-hiding principles prevent programs from becoming dependent on detailed data structures and encourage reuse of program fragments, at the cost of some rigidity in how data is organized and accessed and an inability to optimize navigation chains.
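The principles listed above can be illustrated with a minimal sketch. The `Employee` and `Department` classes are hypothetical names chosen for illustration only and do not come from the original discussion. The sketch assumes Python's convention of marking private state with a leading underscore.

```python
class Department:
    """Encapsulated object: state is private by convention."""

    def __init__(self, name):
        self._name = name  # data is stored inside the object

    def name(self):
        # data is reached only through a method
        return self._name


class Employee:
    """Connected to a Department by a reference stored in the object."""

    def __init__(self, name, department):
        self._name = name
        self._department = department  # stored reference, not a key value

    def name(self):
        return self._name

    def department(self):
        return self._department


# Navigation through the network of data is a series of method calls.
alice = Employee("Alice", Department("Research"))
dept_name = alice.department().name()

# Identity is fundamental: two objects holding equal data remain distinct.
d1 = Department("Research")
d2 = Department("Research")
same_identity = d1 is d2
```

The chain `alice.department().name()` shows the trade-off described above: the caller never depends on how either object stores its data, but each hop is a separate call that a query optimizer cannot rearrange.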
Relational databases follow a different set of principles:
* serializable ("pure") data values are the fundamental concept
* data is stored in open (unencapsulated) tables
* data in tables is freely available...





