Subscribe RSS
Database Primer – Part 1 Oct 25
She didnt get it till she read this

This is Sheila - shes fit as well as mental

Part 1 | Part 2 | Part 3

Building a relational database

Databases are extremely powerful data storage mechanisms, I use them everyday. A lot of people have extremely limited experience with relational databases, using them as they would an Excel document. This sequnce of posts aims to demonstrate how easy they are to use and how best to go about converting an everyday scenario into a useful database that can support your web site or application.

Part 1: Creating a database from a scenario

I believe it is always better to first PLAN what it is you are trying to achieve, so first we’ll construct a reason to have a database in the first place. [If you know how to normalise data then jump to here and skip the scenario]

The Scenario

Okay, we are going to model our Surf Shack Member’s Club, storing details about the members within the club. We’re gonna convert this into a normalised set of data tables and then use those tables in a few queries.

This is the data (a list of surfers, their gender and their skill level), as it might be presented scribbled onto a piece of paper:

BABES

Sheila – Mental

Kylie  - Spectator

Danni  - Advanced

DUDES

Hank   – Spectator

Brad   – Beginner

Joey   – Mental

Tom    - Professional

Jason  - Professional

The data in this format is not very useful, its certainly not much use for programming so we need to convert it into a data table. We can easily identify three core items of information: person name, their gender and their skill level. You might enter it into a single Excel-like table like this:

+----+--------+--------+--------------+
| ID | Name   | Gender | Skill Level  |
+----+--------+--------+--------------+
|  1 | Sheila | Babe   | Mental       |
|  2 | Kylie  | Babe   | Spectator    |
|  3 | Danni  | Babe   | Advanced     |
|  4 | Hank   | Dude   | Spectator    |
|  5 | Brad   | Dude   | Beginner     |
|  6 | Joey   | Dude   | Mental       |
|  7 | Tom    | Dude   | Professional |
|  8 | Jason  | Dude   | Professional |
+----+--------+--------+--------------+

Whilst Excel might be able to do some funky stuff with this data it really isnt very good like this. For instance what if you add a new member with a skill level of beginner, what value do you choose for the Skill Level – “Beginner:, “Amateur”, “Lamer”…?

To ‘normalise’ this data into a set of useful, independantly updateable, tables we need to analyse the objects invovled. These are simply:

  • Surfers – the people themselves
  • Gender – the available genders (Dude/Babe)
  • Skill Level – the available skill levels (Beginner, Amateur…)

Developing the model

The relationship between Surfer and Gender is many to one. This means that each surfer (in this example) can have only 1 gender, but a gender can be applied to zero or more surfers (there are three surfers with a gender of Babe). I like to use diagrams to illustrate these relationships. The great thing about a simple object diagram like this is that you can at a glance see all the relationships going on. You can map these relationships directly into table structures as we shall see later. This relationship can therefore  be represented like this:

SURFER *------1 GENDER 

This diagram shows two objects and their relationship. The fact they are related is indicated by the line, and the way they relate by the symbols at the end of each line (* means zero or more, 1 means exactly 1). You can read the relationship in two ways, depending on which object is in context, e.g:

  • You can read this left to right as: “Each surfer has exactly 1 gender”
  • And right to left: “Each Gender may be related to zero or more surfers”

The relationship between Surfer and Skill Level is the same:

SURFER *------1 SKILL

There is no direct relationship between Gender and Skill Level other than that implied through the surfer. So the full model is as follows:

GENDER 1------* SURFER *------1 SKILL

Now we know our objects (Gender, Surfer and Skill) and we know their relationships to each other. We can now convert this to a set of SQL data tables as follows.

Creating the data tables

The Skill table does not need to know anything about the Surfer table or the Gender table. We can tell this because all of its relationship lines, of which there is only one, END in “*”. It is essentially just a look up table for the Surfer objects. We need to store:

  • a unique identifier (primary key),
  • the label for the skill level
  • thats it!

Here is the table:

[skill]

+----+--------------+
| id | level        |
+----+--------------+
|  1 | Spectator    |
|  2 | Beginner     |
|  3 | Amateur      |
|  4 | Advanced     |
|  5 | Professional |
|  6 | Mental       |
+----+--------------+

The Gender table has exactly the same relationships, here is that table:

[gender]
+----+--------+
| id | gender |
+----+--------+
|  1 | Babe   |
|  2 | Dude   |
+----+--------+

The surfer table is more interesting. From the simple object diagram above we can see it has two relationships. I find reading from the object through the relationship to the related object helps with understanding it.
SURFER *--------1 GENDER
"surfer.....has..1...gender"

and

SURFER *--------1 SKILL
"surfer.....has..1..skill"

Therefore the surfer table needs the following fields:

  • a unique identifier
  • a link to skill (by id)
  • a link to gender (by id)
  • a name
[surfer]
+----+----------+-----------+--------+
| id | skill_id | gender_id | name   |
+----+----------+-----------+--------+
|  1 |        6 |         1 | Sheila |
|  2 |        1 |         1 | Kylie  |
|  3 |        4 |         1 | Danni  |
|  4 |        1 |         2 | Hank   |
|  5 |        2 |         2 | Brad   |
|  6 |        6 |         2 | Joey   |
|  7 |        5 |         2 | Tom    |
|  8 |        5 |         2 | Jason  |
+----+----------+-----------+--------+

We now have a set of ‘normalised’ data tables that we can use to do a vast variety of query look ups. You can see the final surfer table is pretty similar to the original ‘Excel’ data table but the lookups to Skill and Gender are by ‘id’ rather than text string. The most important thing about this is we can change the values of, say, the Skill table and every surfer related to that skill will have their data updated in any queries we do.

Category: SQL, Tutorials  | Tags: , ,
You can follow any responses to this entry through the RSS 2.0 feed. You can skip to the end and leave a response. Pinging is currently not allowed.
Leave a Reply