Short Answers
- 1.
What is data modeling?
Data modeling is both a technique for modeling data and its relationships and a
graphical representation of a database. It is a method for determining what data and
relationships should be stored in the database. The data model is not concerned with
how the data should be stored.
- 2.
What is a useful technique for identifying entities in a written description of a
data modeling problem?
One useful approach to identifying entities is to underline all nouns in the problem
description. Most nouns are possible entities. Select an entity that seems central
to the problem and describe it. Then move to the next one.
- 3.
When do you label relationships?
Typically, relationship arcs should not be labeled. Only when there are
two relationships between two entities or if the relationship is not easily inferred
should you label the arc.
- 4.
When is a data model well formed and when is it high fidelity?
A data model is well formed when all attributes and entities have meaningful and
unique names. Entities, identifiers, and relationships should be meaningful to the
client.
A high-fidelity data model faithfully describes the real world. All relationships are
recorded and are of the correct degree. There are no compromises and all exceptions
have been taken into account.
A well-formed data model is syntactically correct and a high-fidelity data model is
complete, understandable, and accurate.
- 5.
How do you handle exceptions when data modeling?
Exceptions should be handled by the data model. Actually, it is recommended to go
"hunting for exceptions" to avoid a low-fidelity data model. One
should ask the client questions such as:
- Is it always like that?
- Would there be any situations where this could be an m:m
relationship?
- Have there ever been any exceptions?
- Are things likely to change in the future?
- 6.
Describe the different types of entities.
There are five types of entities: independent, dependent, intersection, aggregate,
and subordinate. Independent entities are often the starting points of the database.
They are the entities most prominent in the client's mind and are typically easy to
name.
Dependent entities are entities which rely on a different entity for their existence
and identification. An indication of a dependent entity is a bar on the dependent
entities end of the arc.
Intersection entities are a byproduct of m:m relationships. Intersection entities are
sometimes hard to name, in which case the names of the two related independent
entities are concatenated.
An aggregate entity is created when several different entities have similar
attributes that are distinguished only by a prefix or suffix to keep the names
unique. Aggregate entities typically become independent entities.
A subordinate entity is used to store data about an entity that can vary among
instances. A subordinate entity is useful when an entity consists of mutually
exclusive classes that have different descriptions. For example, the entity
"animal" would have to store different information depending on
the type of animal described.
- 7.
Why might a data model grow?
A data model might grow as you expand the boundaries of the application. Some 1:m
relationships will become m:m relationships and you will add entities, attributes,
and relationships to faithfully describe the real world.
- 8.
Why might a data model contract?
A data model might shrink when you generalize structures. Some entities might
actually all be part of a more general entity and thus could be combined into one
entity.
- 9.
How do you indicate ordering of instances in a data model?
To handle ordering of the instances in a database, create a new attribute which
contains the order of the entity. This way one does not have to store the rows in
order and it is also possible to create two or more orders for each entity. The SQL
clause ORDER BY can be used to print out the rows in order.
- 10.
What is the difference between a synonym and a homonym?
Synonyms: Different words are used to describe the same concept.
Homonyms: The same word is used for different concepts.
Data Modeling
Create a data model from the following narratives, which are sometimes intentionally
incomplete. You will have to make some assumptions. Make certain you state these
alongside your data model. Define the identifier(s) and attributes of each entity.
- 1.
The president of a book wholesaler has told you that she wants information about
publishers, authors, and books.
- 2.
A university has many subject areas (e.g., MIS, Romance languages). Professors
teach in only one subject area, but the same subject area can have many professors.
Professors can teach many different courses in their subject area. An offering of a
course (e.g., Data Management 457, French 101) is taught by only one professor at a
particular time.
- 3.
Kids'n'Vans retails minivans for a number of manufacturers. Each manufacturer
offers several models of its minivan (e.g., SE, LE, GT). Each model comes with a
standard set of equipment (e.g., the Acme SE comes with wheels, seats, and an engine).
Minivans can have a variety of additional equipment or accessories (radio, air
conditioning, automatic transmission, airbag, etc.), but not all accessories are
available for all minivans (e.g., not all manufacturers offer a driver's side airbag).
Some sets of accessories are sold as packages (e.g., the luxury package might include
stereo, six speakers, cocktail bar, and twin overhead foxtails).
- 4.
Steve operates a cinema chain and has given you the following information:
I have many cinemas. Each cinema can have multiple theaters. Movies are shown
throughout the day starting at 11 a.m. and finishing at 1 a.m. Each movie is given a
two-hour time slot. We never show a movie in more than one theater at a time, but we do
shift movies among theaters because seating capacity varies. I am interested in knowing
how many people, classified by adults and children, attended each showing of a movie. I
vary ticket prices by movie and time slot. For instance, Lassie Get Lost at 11 a.m. is
50 cents for everyone but is 75 cents at 11 p.m.
- 5.
A university gymnastics team can have as many as 10 gymnasts. The team competes
many times during the season. A meet can have one or more opponents and consists of four
events: vault, uneven bars, beam, and floor routine. A gymnast can participate in all or
some of these events though the team is limited to five participants in any event.
- 6.
A famous Greek shipping magnate, Stell, owns many container ships. Containers are
collected at one port and delivered to another port. Customers pay a negotiated fee for
the delivery of each container. Each ship has a sailing schedule that lists the ports
the ship will visit over the next six months. The schedule shows the expected arrival
and departure dates. The daily charge for use of each port is also recorded.
- 7.
A medical center employs several physicians. A physician can see many patients,
and a patient can be seen by many physicians though not always on the one visit. On any
particular visit, a patient may be diagnosed to have one or more illnesses.
- 8.
A telephone company offers a 10 percent discount to any customer who phones
another person who is also a customer of the company. To be eligible for the discount,
the pairing of the two phone numbers must be registered with the telephone company.
Furthermore, for billing purposes, the company records both phone numbers, start time,
end time, and date of call.
- 9.
Global Trading (GT), Inc. is a conglomerate. It buys and sells businesses
frequently and has difficulty keeping track of what strategic business units (SBUs) it
owns, in what nations it operates, and what markets it serves. For example, the CEO was
recently surprised to find that GT owns 25 percent of Dundee's Wild Adventures,
headquartered in Zaire, that has subsidiaries operating tours of Australia, Zaire, and
New York. You have been commissioned to design a database to keep track of GT's
businesses. The CEO has provided you with the following information: SBUs are
headquartered in one country, not necessarily the United States. Each SBU has
subsidiaries or foreign agents, depending on local legal requirements, in a number of
countries. Each subsidiary or foreign agent operates in only one country but can operate
in more than one market. GT uses the standard industrial code (SIC) to identify a market
(e.g., newspaper publishing). The SIC is a unique four-digit code. While foreign agents
operate as separate legal entities, GT needs to know in what countries and markets they
operate. On the other hand, subsidiaries are fully or partly-owned by GT, and it is
important for GT to know who are the other owners of any subsidiary and what percentage
of the subsidiary they own. It is not unusual for a corporation to have shares in
several of GT's subsidiary companies and for several corporations to own a portion of a
subsidiary. Multiple ownership can also occur at the SBU level.
- 10.
A real estate investment company owns many shopping malls. Each mall contains many
shops. To encourage rental of its shops, the company gives a negotiated discount to
retailers who have shops in more than one mall. Each shop generates an income stream
that can vary from month to month because rental is based on a flat rental charge and a
negotiated percentage of sales revenue. Also, each shop has monthly expenses for
scheduled and unscheduled maintenance. The company uses the data to compute its monthly
net income per square meter for each shop and for ad hoc querying.
- 11.
Draw a data model for the following table taken from a magazine that evaluates
consumer goods. The reports follow a standard fashion of listing a brand and model,
price, overall score, and then an evaluation of a series of attributes, which can vary
with the product. For example, the sample table evaluates stereo systems. A table for
evaluating microwave ovens would have a similar layout, but different features would be
reported (e.g., cooking quality).
Brand and model |
Price |
Overall
score |
Sound
quality |
Taping
quality |
FM
tuning |
CD
handling |
Ease
of use |
Phillips SC-AK103 |
140 |
620 |
Very good |
Good |
Very good |
Excellent |
Fair |
Panasonic MC-50 |
215 |
55 |
Good |
Good |
Very good |
Very good |
Good |
Rio G300 |
165 |
38 |
Good |
Good |
Fair |
Very good |
Poor |
- 12.
Draw a data model for the following freight table taken from a mail order catalog.
Merchandise
subtotals |
Regular delivery
710 days |
Rush delivery
45 business days |
Express delivery
12 business days |
Up to $30.00 |
$4.95 |
$9.95 |
$12.45 |
$30.01$65.00 |
$6.95 |
$11.95 |
$15.45 |
$65.01$125.00 |
$8.95 |
$13.95 |
$20.45 |
$125.01+ |
$9.95 |
$15.95 |
$25.45 |