📚 Seth MB

Search

Search IconIcon to open search

Databases: Design

Last updated May 26, 2023 Edit

# Setting up

# Entity Relationship Diagram (ERD)

Example: A school has classes that are taught by a single teacher and attended by one or more students. Each class is taught in a single room. Students can study more than one subject but teachers can only teach one subject.

Identify the entities.

Entities: school, classes, teacher, room, students, subject

The school may want additional entities, such as reports, registers, books, invoices etc. It simply depends upon the use case for the database and what the school wishes to track/do with the database.

Some entities may not be needed. So for instance a school would not be required in this database as all data is stored solely for the school—if multiple schools needed to use this database then the school entity might make sense. So we can exclude school from our entity list.

Next, we take all of the entities and think about how they are related to all other entities.

# Naming conventions

Entities should be named in a uniform way. The convention is to be consistent, and people typically name everything in the singular.

So we would store classes as class and students as student.

This does not technically matter in terms of database performance or functionality, it’s just something that makes it easier for humans to understand.

# Relationships

There are 3 types of relationship possible between entities.

# One-to-one

# One-to-many

Heinz is the supplier of baked beans and ketchup. However ketchup and baked beans only have 1 supplier, whilst Heinz has 2 products in this scenario.

Each product can only have 1 supplier.

# Many-to-many

Relationship Types

Avoid any relationship except one-to-many. It is significantly easier to work with one-to-many and it makes your database more flexible.


Many-to-many relationships are difficult to model.

So we break up the relationship and make a link entity

So instead of having a many-to-many link between subject and teacher, we have a one-to-many relationship between teacher and subject teacher and subject and subject teacher. This reduces wasted space to 0 in our model and also makes queries easier to manage.

With a one-to-one relationship, there is exactly one instance of each entity for each instance of another entity. Although this is OK and does provide the possibility to expand the system in future, it might make sense to combine the two.

So House and Alarm (with a one-to-one relationship) could be replaced with a single entity called Alarm Installation.

Normalization is required to ensure that data is stored in the most proper way. So for any one data point, there is only one way it should be stored.

If the same type of information is stored more than once - ie, product details for an order, then this is a repeating group. It should NOT be stored in the same entity as the order. So you would make a new entity to store the repeating group.

An instance of the repeating group entity will be created for every new line required.


# Data Dictionary

# Primary Key Fields

Keys are how we create relationships in a relational database.

# Normalization

Important

A database is in third normal form if every attribute in every entity is dependant on the key, the whole key, and nothing but the key.

Reference to the key is primary or composite key.

# Third Normal Form

A name is not atomic because it contains at least 2 pieces of information “FirstName” and “LastName”. Storing it just as “Name” would be possible, but would greatly complicate things later on.

It is also possible to break information into more pieces of information than necessary. For example, if your address was “23 Chester Close"” you might try and break it down into “23” and “Chester Close”. However as it is the first line of their address, it is a single piece of information. This does depend upon use case of course, because if you just want to find people living on a certain street, then you might wish to collect street names as seperate entities.

# Reasons for normalization

Databases