KayeILSQL-01-Database Concepts: A Relational Approach

Contents

Based On

Components of a Business Information System

Information and Data

Basic Database Terminology

Types of Relationships

The Relational Database Model

Integrity Rules

Two Theoretical Relational Languages

References

Based On

This text on SQL is based on the Oracle Database using the EP-European Parliament as Data Model.

The European Parliament (EP) is the parliamentary institution of the European Union (EU).

The European Parliament is elected by the citizens of the European Union to represent their interests.

Its origins go back to the 1950s and the founding Treaties. Since 1979 its members have been directly elected by the citizens of the EU.

Elections are held every five years, and every EU citizen is entitled to vote, and to stand as a candidate, wherever they live in the EU. Parliament thus expresses the democratic will of the European Union's nearly 500 million citizens and it represents their interests in discussions with the other EU institutions.

The European Parliament model and data used in this SQL text, are based on the elections which were held in June 2004. The European Parliament of 2004 elections had 785 members from all 27 EU countries.

The sources for the data, which are all freely obtainable from the European Union website, and from several leaflets of the EU institutions, are listed in the References section.

These sources are used to create the fictitious Business Narrative of the "Data Modelling and Normalization" Chapter.

We called this Business Narrative; "the Story of the European Union" which explains about the EU and EP, for purposes of Data Modeling.

The Story of the European Union is the last Chapter of Kaye is Learning SQL.

Beautiful Crape Myrtle Tree — Beautiful Lagerstroemia / Crape Myrtle Tree

Components of a Business Information System

A Business Information System Consists of the Following Components:

Hardware

Computer "Hardware" such as servers and data storage devices running in a network environment.
A Server can be a Supercomputer, a Desktop Computer, a Laptop, or even a Hand-Held Computer.
Data Storage Devices enable Businesses to backup data, restore and recover data, under normal conditions or in the case of an unforeseen disaster.

Components of a Business Information System - Hardware

Operating System Software / OS

Operating System software, that runs on the hardware, on which the Database Management System Software operates.
Operating System software, which is usually abbreviated as OS or O/S, handles details of the operations of the Hardware.

Components of a Business Information System - OS

Database Management System Software / DBMS

A Database Management System Software package or DBMS is responsible for the Management and Administration of the Database.
It is responsible for properly managing various database structures in an effective and efficient manner, including
Database security, user management, backup/recovery and performance monitoring.

Components of a Business Information System - DBMS

Database

A user-developed and implemented “Database” that contains Tables and Business Data, and many other database objects.
A Database also contains a Data Dictionary that holds information about all the database objects.
Data Dictionary of the Database is used by the Database Management System Software while managing and administering the Database.

Components of a Business Information System - Database

Applications

Programs that enable data management,
Allowing a business to create, modify, share and distribute information about their products, services, and processes,
Programs that enable document management,
Programs that enable a business to structure relationships across different sets of information,
Programs that enable a business to automate their business processes,
Programs that enable a business to generate Reports...
All these programs are called Applications.
Applications enable Business Processing and Business Intelligence.

Components of a Business Information System - Applications

... and Users

Components of a Business Information System - Users

Users are personnel who use the system.

They can be IT Staff such as;

Database Administrators
Database Designers
Business Analysts
Application Developers

Or, they can be End Users or Business Users such as;

Directors
Managers
Project Leaders
Accountants
Call Center Staff
Warehouse Staff
Shipping Staff

In short, anybody who is responsible of doing at least one piece of work in their workplace ...

We simply call all of them "User".

Information and Data

A report which is one of the application programs in a Business Information System, helps users take their business decisions.

Sometimes, these decisions have to be made immediately – they can be daily decisions, hourly decisions or even minutely decisions.

Sometimes decisions have to be made whose results will affect a business in the long run.

These reports are also known as the "Business Intelligence" of a company.

Whether they will affect this very minute, or a ten years’ span, the users need the "information" in these reports..

Information that is necessary to take the right decisions – the decisions that are vital for a business to operate and to be successful.

Information itself uses "data" stored in the Database and managed by the Database Management System Software .

Data are the raw materials.
They are the atoms.

And when the data are processed – that is when they are manipulated, collected and organized, the result is information.

A Simple Basic BI Report created by matplotlib — A Simple Basic BI Report (matplotlib / jupyter notebook)

Another BI Report created by matplotlib — Another Basic BI Report (matplotlib / jupyter notebook)

Another Representation of the BI Report above created by matplotlib — Another Representation of the BI Report above (matplotlib / jupyter notebook)

Heatmap of European Parliament Seats as of August 2008 (seaborn / jupyter notebook)

Basic Database Terminology

A New Definition for Database

Database

A database is an electronic store of data.
A database is a repository that stores information about "things".

a repository is a location for storage, often for safety or preservation

A database contains relationships among those different "things".

Below are some of the basic terms used, to describe the building blocks of a database.

Entity

An Entity can be;

a country
a person
an institution
a product

An Entity is a thing of significance about which information needs to be known.

Example

If you were a statistician in EUROSTAT-the statistical office of the European Commission, you would like to have all the information about the European Union member countries.

A COUNTRY is an Entity Type or an Entity Class.

However, each COUNTRY in the European Union is an Entity Instance .

For example, Belgium and Poland are Entity Instances of the Entity Type COUNTRY.

There is only one Entity Type/Class of COUNTRY, whereas there are twenty-seven Instances of this entity.

Another Example

If you were working in the DG(Directorate-General) Personnel of the European Parliament , you would like to have all the information about the Members of the European Parliament(MEPs).

A MEMBER OF THE PARLIAMENT is an Entity Type or an Entity Class, and each Member of the Parliament is an Entity Instance.

There is only one Entity Type/Class of MEMBER OF THE PARLIAMENT, whereas there are 785 Instances of this entity.

Attribute

An attribute is a characteristic that describes an entity.

Example

Some of the likely "attributes" for a country are;

country name
capital
population
area
gdp
and so on...

Another Example

Some of the likely "attributes" for an MEP are;

last name
first name
political group
country of citizenship
salary information
start date
and so on...

Relationship

Reationships are interactions between Entities.

They are described by "verbs".

Example

A Member of the European Parliament(MEP) belongs to a Political Group.

Therefore, we can say that the Relationship between the Entity MEP and the Entity POLITICAL GROUP is "belongs to".

A New Definition for Database

Now that we know briefly about the concepts of "Entity" and "Relationship", we can say that;

a Database is made up of a Collection of Entities and the Relationships between these Entities.

Example

European Parliament's database may be made up of a Collection of the Entities "COUNTRY", "POLITICAL GROUP", "MEMBER OF THE PARLIAMENT", "DELEGATION", "ROLE" and the Relationships between these Entities.

Types of Relationships

When we design a database, we need to create Entities each of which is a collection of entity instances. For example Entity POLITICAL GROUP and the Entity MEP (Members of the European Parliament).

When we design a database, we also have to establish the "relationships" between these "entities".

Depending on the type of interaction, the relationships are classified into three categories:

One-to-One Relationship

One-to-Many Relationship

Many-to-Many Relationship

How to Determine the Relationship Type

One-to-One Relationship

A one-to-one relationship is written as 1:1 in short form.

Entities X and Y have a 1:1 relationship if the following conditions hold:

An entity instance in Entity X has only one matching instance in Entity Y
An entity instance in Entity Y has only one matching instance in Entity X as well.

Let's say in our database, there is an Entity called COUNTRY, and an Entity called CAPITAL.

Entity COUNTRY contains all the countries in the world.

Each country in the world is an instance of this Entity.

Entity CAPITAL contains all the capitals in the world.

Each capital in the world is an instance of this Entity.

The RELATIONSHIP between these two entities is:

A COUNTRY has one and only one CAPITAL, and a CAPITAL belongs to one and only one COUNTRY.

We say that, the entities COUNTRY and CAPITAL have a 1:1 relationship.

One-to-Many Relationship

A one-to-many relationship is written as 1:M in short form.

Entities X and Y have a 1:M relationship if the following conditions hold:

An entity instance in Entity X has many matching instances in Entity Y,
An entity instance in Entity Y has only one matching instance in Entity X.

Let's say in our database, there is an Entity called POLITICAL GROUP, and an Entity called MEMBER OF THE PARLIAMENT.

The Entity POLITICAL GROUP contains all the political groups in the European Parliament.

Each political group in the European Union is an instance of this Entity.

The Entity MEMBER OF THE PARLIAMENT contains all the Members of the European Parliament(MEPs).

Each MEP is an instance of this Entity.

The RELATIONSHIP between these two entities is:

A POLITICAL GROUP in the European Parliament has many MEMBERs OF THE PARLIAMENT, but a MEMBER OF THE PARLIAMENT belongs to one and only one POLITICAL GROUP.

We say that, the entities POLITICAL GROUP and MEMBER OF THE PARLIAMENT have a 1:M relationship.

Many-to-Many Relationship

A many-to-many relationship is written as M:M or M:N in short form.

Entities X and Y have a M:M relationship if the following conditions hold:

An entity instance in Entity X has many matching instances in Entity Y,
An entity instance in Entity Y has many matching instances in Entity X.

Let's say in our database, there is also an entity called ROLE.

We already know that there is an Entity called MEMBER OF THE PARLIAMENT in our database.

The entity ROLE contains all the roles held by the MEPs, that is all the roles held by the Members of the European Parliament.

Each Role held by an MEP is an instance of this entity.

The Entity MEMBER OF THE PARLIAMENT contains all the Members of the European Parliament (MEPs).

Each MEP is an instance of this Entity.

Each MEP in the European Parliament has at least one Role.

The RELATIONSHIP between these two entities is:

A MEMBER OF THE PARLIAMENT may have many ROLEs, and a ROLE may be held by more than one (by many) MEMBERs OF THE PARLIAMENT.

We say that the entities MEMBER OF THE PARLIAMENT and ROLE have a M:M relationship.

How to Determine the Relationship Type

To determine the Type of Relationship between the Entity X and the Entity Y, we can ask the following two questions:

Two questions to determine the Relationship Type

Depending on the answers to these two questions, we can determine the type of relationship, between the two entities.

How this is achieved, is illustrated below:

Answers to Two questions to determine the Relationship Type

The Relational Database Model

If a Database Management System is based on a Relational Database Model, then it is also known as a Relational Database Management System - an RDBMS.

In this course, we base our learnings on the Relational Database Model.

Sample Database Tables

Tuple

Degree

Domain

Key - Unique Identifier

Primary Key

Composite Primary Key

Secondary Key

Surrogate Key

Foreign Key

E.F.Codd (Edgar F. "Ted" Codd) developed the "relational database model" in 1970.

This new concept is published in the Communications of the Association for Computing Machinary (CACM), in his article named "A Relational Model of Data for Large Shared Data Banks".

The model is based on Mathematical Set Theory.

It uses "relation" as the building block of the database.

The relation is represented by a two-dimensional, flat structure also known as a "table".

The user views the data in a logical, two-dimensional structure.

A table is a matrix of rows and columns in which;

each row represents an entity instance
each column represents an attribute

We refer to a Relation as an Entity at the Data Modeling and Database Design stage.

When the physical database is created, Entities become Tables.

We will use the following concepts interchangeably for the rest of this chapter:

Relation-Entity-Table

We refer to a Relation as an Entity at the Data Modeling and Database Design stage.

When the physical database is created, Entities become Tables.

Entity Instance-Row

A table is a matrix of rows and columns in which;

each row represents an entity instance
each column represents an attribute

Attribute-Column

A table is a matrix of rows and columns in which;

each row represents an entity instance
each column represents an attribute

Sample Database Tables

We will use the following Relations-Entities-Tables of a Sample European Research Institution to learn new terminology:

PROJECTS
PROGRAM_PROJECTS
PROGRAMS_LAST_YEAR
PROGRAMS_THIS_YEAR
STAFF
DEPARTMENTS

Sample Database Tables
Relation-Entity-Table	Definition
PROJECTS	Holds details of research projects. A research project may span a period of several years.
PROGRAM_PROJECTS	Holds details of programs. A program can contain several numbers of different projects.
PROGRAMS_LAST_YEAR	Contains details of programs that consist of research projects, their countries and locations, which started last year.
PROGRAMS_THIS_YEAR	Contains details of programs that consist of research projects, their countries and locations, which started this year.
STAFF	Holds details of staff responsible of each project and their department number.
DEPARTMENTS	Holds details of departments.

PROJECTS
project no	project type	description	main sponsor	cost (million Euros)
100	ATOMS	Nano Technology	Futuristic	2.37
200	BITS	Information Technology	Guru	6.19
300	GENES	Bio Technology	Longevity	4.48
400	NEURONS	Neuro Science	Cyberlabs	1.49

PROJECTS has;

5 Attributes/Columns
4 Entity Instances/Rows

PROGRAM_PROJECTS
program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

PROGRAM_PROJECTS has;

3 Attributes/Columns
10 Entity Instances/Rows

PROGRAMS_LAST_YEAR
program no	location	country
10	Istanbul	Turkey
20	Brussels	Belgium
30	Luxembourg	Luxembourg

PROGRAMS_LAST_YEAR has;

3 Attributes/Columns
3 Entity Instances/Rows

PROGRAMS_THIS_YEAR
program no	location	country
10	Istanbul	Turkey
30	Luxembourg	Luxembourg
40	Warsaw	Poland
60	Prague	Czech Republic

PROGRAMS_THIS_YEAR has;

3 Attributes/Columns
4 Entity Instances/Rows

STAFF
staff id	name	dg id	program_no
802	YILDIZ	400	10
804	HEYNEN	400	20
806	KANUITH	400	30
808	ZABOROWSKI	400	40
1020	OTTOVA	400	60

STAFF has;

4 Attributes/Columns
5 Entity Instances/Rows

DEPARTMENTS
dg id	dg name
200	DG Communications
400	DG Innovation and Research
600	DG Legal Service

DEPARTMENTS has;

2 Attributes/Columns
3 Entity Instances/Rows

Tuple

In relational terminology, a Row in a Table or an Instance of an Entity is referred to as a Tuple.

It is assumed that there is no pre-defined order to rows of a table.
Also, no two rows of a table have the exact same set of values.

Degree

The number of columns in a table is called the Degree of the relation.

For instance, if a table has four columns, then the table is of degree 4.

Just like the order of rows is not important in a Relational Database Model, the order of the columns is not important either.

Domain

The set of all possible values that a column may have is called the Domain of that column.

An Example for Domain

The domain of the "dg_name" column of the "DEPARTMENTS" table is the set of all possible Directorate-Generals in this Research Institution.

This domain can be;

DG Communications
DG Innovation and Research
DG Legal Service
DG Finance
DG Human Resources
DG Technical Services
... and so on.

Another Example for Domain

Domain of the "country" column of the PROGRAMS_THIS_YEAR table, is the set of all countries that take part in the European Research Project, namely, all the EU Member and Candidate countries;

Austria, Belgium, Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia, FYR of Macedonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, Turkey and the United Kingdom.

In our tables, the columns;

dg_id
staff_id
program_no
project_no
number_of_projects

all have numeric values, but their use is different. Therefore they do not have the same domain.

Two domains are the same only if they have the same values and the same use.

Key - Unique Identifier

A Key is a minimal set of columns used to uniquely identify each row in a table.

If a single column can be used to identify each row, there is no need to use two columns as a key.

An Example for a Single Column Key

In PROGRAMS_LAST_YEAR and PROGRAMS_THIS_YEAR, "program_no" column uniquely identifies each row.

PROGRAMS_LAST_YEAR
program no	location	country
10	Istanbul	Turkey
20	Brussels	Belgium
30	Luxembourg	Luxembourg

PROGRAMS_THIS_YEAR
program no	location	country
10	Istanbul	Turkey
30	Luxembourg	Luxembourg
40	Warsaw	Poland
60	Prague	Czech Republic

In these tables, the column "program_no" is a Single Column Key.

Another Example for a Single Column Key

In PROJECTS, "project_no" column uniquely identifies each row.

PROJECTS
project no	project type	description	main sponsor	cost (million Euros)
100	ATOMS	Nano Technology	Futuristic	2.37
200	BITS	Information Technology	Guru	6.19
300	GENES	Bio Technology	Longevity	4.48
400	NEURONS	Neuro Science	Cyberlabs	1.49

In PROJECTS, the column "project_no" is a Single Column Key.

Example for a Key with More Than One Column

In PROGRAM_PROJECTS, none of the columns can uniquely identify a row.

In this table, a combination of columns must be used to uniquely identify a row.

This combination is "program_no" and "project_no" cloumns.

PROGRAM_PROJECTS
program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

In the PROGRAM_PROJECTS table, "program_no" and "project_no" columns together can be used as a key, to uniquely identify each row.

UNIQUE IDENTIFIER

A KEY Column, or a Combination of Columns that act as a KEY , is sometimes called a UNIQUE IDENTIFIER.

Primary Key

When a Single Column is used as a Unique Identifier , then this column is known as the Primary Key of the table.

For example, "project_no" column is the Primary Key of the PROJECTS table.

Composite Primary Key

When a Combination of Columns is used as a Unique Identifier, then these columns are known as;

The Composite Primary Key
The Composite Key.

Secondary Key

Sometimes a more human approach is used to identify or retrieve a row from a table.

This is because, it is not possible to remember primary key values such as;

staff identification number
department identification number
project number
... and so on.

In such cases,

staff name
department name
project type
project name
... and so on,

can be used for data retrieval.

Such a key is known as a Secondary Key.

Surrogate Key

If none of the columns in a table is a candidate for the primary key, then database designers can use an extra column as the primary key.

In some cases, this can be preferred to having a composite primary key.

Such a key, which is a Unique Identifier for the row it belongs to, is known as a Surrogate Key.

A new column such as "country_identification_number" which will act as the Surrogate Key, can be added in a table to identify a country.
A new column such as "program_project_identification_no" which will act as the Surrogate Key, can be added in a table to identify a project within a program.
A new column such as "sponsor_number" which will act as the Surrogate Key, can be added in a table to identify a sponsor.

Surrogate means "substitute".
Values of surrogate keys serve only as surrogates or substitutes for the entities they stand for.
They carry no additional information or meaning whatsoever.

One way of generating column values for the Surrogate Key Columns in Oracle is, to use the Sequence object.

Foreign Key

In a relational database, tables are related to each other through a common column.

A column in a table that references a column in another table is known as a Foreign Key.

For example, the "project_no" column is a Foreign Key column in the PROGRAM_PROJECTS table, that References the "project_no" column in the PROJECTS table.

PROGRAM_PROJECTS
program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

PROJECTS
project no	project type	description	main sponsor	cost (million Euros)
100	ATOMS	Nano Technology	Futuristic	2.37
200	BITS	Information Technology	Guru	6.19
300	GENES	Bio Technology	Longevity	4.48
400	NEURONS	Neuro Science	Cyberlabs	1.49

Integrity Rules

Although we will learn about the concepts of integrity later in this SQL Course in detail, here is an early introduction.

In a database managed by a Relational Database Management System, it is very important that the data in the underlying tables are kept consistent.

If consistency is compromised, then data become unusable.

Integrity Rule #1 Entity Integrity

Integrity Rule #2 Referential Integrity

This fact led to the following two Integrity Rules:

Entity Integrity
Referential Integrity

Integrity Rule #1 Entity Integrity

Column values of a Primary Key MAY NOT be Null.
The Primary Key provides the means of uniquely identifying a row, or an entity.
Column values of a Primary Key MUST be Unique.

Integrity Rule #1 Entity Integrity

NULL

A Null Value means a value that is;
- not known
- not entered
- not defined
- not applicable
A zero or a space is NOT considered to be a Null Value.

Integrity Rule #1 Entity Integrity

If the Primary Key value in a row is Null, then we do not have enough information about the row to uniquely identify it.

Integrity Rule #1 Entity Integrity

A Relational Database Management System software strictly follows the "Entity Integrity" rule,
and, DOES NOT allow the users and/or application programs to enter a row WITHOUT a Unique and Non-Null value in the Primary Key column.

Integrity Rule #1 Entity Integrity

Please note that the Designers, DBAs and Developers of a Database Design Team MUST define the Primary Key for the RDBMS Software to follow this rule.

Integrity Rule #2 Referential Integrity

A Foreign Key value MAY be a Null Value,
or it must exist as a value of the Primary Key in the Referenced Table.

For example, the "project_no" column is a Foreign Key column in the PROGRAM_PROJECTS table, that References the "project_no" column which is the Primary Key of the PROJECTS table.

PROGRAM_PROJECTS
program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

PROJECTS
project no	project type	description	main sponsor	cost (million Euros)
100	ATOMS	Nano Technology	Futuristic	2.37
200	BITS	Information Technology	Guru	6.19
300	GENES	Bio Technology	Longevity	4.48
400	NEURONS	Neuro Science	Cyberlabs	1.49

Integrity Rule #2 Referential Integrity

Oracle RDBMS does not allow you to input a value into the Foreign Key column, if it does not exist as a Primary Key value in another table.
It allows you to leave the Foreign Key column value as a Null.

If a user and/or an application program enters a value in the Foreign Key column of a Table, Oracle cross-references the Referenced Table's Primary Key column to confirm the existence of such a value.

Integrity Rule #2 Referential Integrity

Please note that the Designers, DBAs and Developers of a Database Design Team MUST define the Foreign Key and the Primary Key it references, for the RDBMS Software to follow this rule.

Two Theoretical Relational Languages

E.F. Codd suggested Two Theoretical Relational Languages to use with the Relational Model:

Relational Algebra - a Procedural Language
Relational Calculus - a Nonprocedural Language

In the database systems available today, nonprocedural Structured Query Language SQL is used as a data-manipulation language.

The two theoretical languages which will be explained below, have provided the basis for SQL.

Relational Algebra

Relational Calculus

Basis for SQL

Relational Algebra

Relational algebra is a procedural language.

It is the manipulative part of Relational Model as suggested by E.F. Codd.

"Relational Algebra is basically a Set of Operators that take Relations as their operands and return a Relation as their result."

The user accomplishes the desired results, by using a Set of Operations in a sequence.
It uses Operations on Relations to produce new Resulting Relations.
These Resulting Relations are then used for subsequent Operations.

The nine Operations used by Relational Algebra are as follows:

Applications of Relational Algebra

Weaknesses of Relational Algebra as a Programming Language

Strengths of Relational Algebra as a Programming Language

UNION

The UNION of two tables results in retrieval of all rows that are in one or both tables.

The duplicate rows are eliminated from the resulting table -- the resulting table does not contain two rows with identical data values.

Union Compatiblility

Two tables that satisfy the requirements above are said to be Union Compatible.

In Mathematical Set Theory, a UNION can be performed on any two sets.
In Relational Algebra, a UNION can only be performed on UNION COMPATIBLE TABLES.

If we want to see all the programs that took place last year and programs taking place this year, we can perform a UNION on the PROGRAMS_LAST_YEAR and PROGRAMS_THIS_YEAR tables.

If we call the resulting table TABLE_A, the operation can be denoted by:

TABLE_A = PROGRAMS_LAST_YEAR UNION PROGRAMS_THIS_YEAR

PROGRAMS_LAST_YEAR
program_no	location	country
10	Istanbul	Turkey
20	Brussels	Belgium
30	Luxembourg	Luxembourg

PROGRAMS_THIS_YEAR
program_no	location	country
10	Istanbul	Turkey
30	Luxembourg	Luxembourg
40	Warsaw	Poland
60	Prague	Czech Republic

TABLE_A = PROGRAMS_LAST_YEAR UNION PROGRAMS_THIS_YEAR

TABLE_A
program_no	location	country
10	Istanbul	Turkey
20	Brussels	Belgium
30	Luxembourg	Luxembourg
40	Warsaw	Poland
60	Prague	Czech Republic

INTERSECTION

The INTERSECTION of two tables produces a table with rows that are in BOTH tables.
Two tables MUST be UNION COMPATIBLE to perform an INTERSECTION on them.

If we want to find out the programs that took place BOTH LAST YEAR and CONTINUING THIS YEAR, we can perform an INTERSECTION operation on the PROGRAMS_LAST_YEAR and PROGRAMS_THIS_YEAR tables.

If we call the resulting table of this INTERSECTION OPERATION, TABLE_B, then the operation can be denoted by:

TABLE_B = PROGRAMS_LAST_YEAR INTERSECTION PROGRAMS_THIS_YEAR

PROGRAMS_LAST_YEAR
program_no	location	country
10	Istanbul	Turkey
20	Brussels	Belgium
30	Luxembourg	Luxembourg

PROGRAMS_THIS_YEAR
program_no	location	country
10	Istanbul	Turkey
30	Luxembourg	Luxembourg
40	Warsaw	Poland
60	Prague	Czech Republic

TABLE_B = PROGRAMS_LAST_YEAR INTERSECTION PROGRAMS_THIS_YEAR

TABLE_B
program_no	location	country
10	Istanbul	Turkey
30	Luxembourg	Luxembourg

The resulting table TABLE_B shows us those programs which started last year or before and continuing this year.

DIFFERENCE

The DIFFERENCE of two tables produces a table with rows that are present in the FIRST table but NOT in the SECOND table.
Two tables MUST be UNION COMPATIBLE to perform a DIFFERENCE on them.

If we want to find out the programs which took place LAST YEAR, BUT DO NOT CONTINUE THIS YEAR, we can perform a DIFFERENCE operation on the PROGRAMS_LAST_YEAR and PROGRAMS_THIS_YEAR tables.

If we call the resulting table of this DIFFERENCE OPERATION, TABLE_C, then the operation can be denoted by:

TABLE_C = PROGRAMS_LAST_YEAR DIFFERENCE PROGRAMS_THIS_YEAR

PROGRAMS_LAST_YEAR
program_no	location	country
10	Istanbul	Turkey
20	Brussels	Belgium
30	Luxembourg	Luxembourg

PROGRAMS_THIS_YEAR
program_no	location	country
10	Istanbul	Turkey
30	Luxembourg	Luxembourg
40	Warsaw	Poland
60	Prague	Czech Republic

TABLE_C = PROGRAMS_LAST_YEAR DIFFERENCE PROGRAMS_THIS_YEAR

TABLE_C
program_no	location	country
20	Brussels	Belgium

The resulting table TABLE_C shows us those programs which took place last year and do not continue this year.

Just as in Mathematics, in Relational Algebra also, A - B may not be equal to B - A.

If we want to find out those programs which DID NOT EXIST LAST YEAR, BUT STARTED THIS YEAR, we can perform a DIFFERENCE operation on the PROGRAMS_THIS_YEAR and PROGRAMS_LAST_YEAR tables.

If we call the resulting table of this DIFFERENCE OPERATION, TABLE_D, then the operation can be denoted by:

TABLE_D = PROGRAMS_THIS_YEAR DIFFERENCE PROGRAMS_LAST_YEAR

PROGRAMS_THIS_YEAR
program_no	location	country
10	Istanbul	Turkey
30	Luxembourg	Luxembourg
40	Warsaw	Poland
60	Prague	Czech Republic

PROGRAMS_LAST_YEAR
program_no	location	country
10	Istanbul	Turkey
20	Brussels	Belgium
30	Luxembourg	Luxembourg

TABLE_D = PROGRAMS_THIS_YEAR DIFFERENCE PROGRAMS_LAST_YEAR

TABLE_D
program_no	location	country
40	Warsaw	Poland
60	Prague	Czech Republic

The resulting table TABLE_D shows us those programs which started this year.

PROJECTION

The PROJECTION Operation allows us to create a Resulting Table "Based on Desirable Columns" from all existing columns in a table.
The "Undesired Columns" are ignored.

The PROJECTION Operation returns VERTICAL SLICES from a table.

For example, we might like to list only the DESCRIPTION and the MAIN_SPONSOR columns of the PROJECTS table.

PROJECTS
project no	project type	description	main sponsor	cost (million Euros)
100	ATOMS	Nano Technology	Futuristic	2.37
200	BITS	Information Technology	Guru	6.19
300	GENES	Bio Technology	Longevity	4.48
400	NEURONS	Neuro Science	Cyberlabs	1.49

PROJECTION Operation is indicated by;

including the Table name
a list of Desired Columns in Brackets

If we call the Resulting Table of this PROJECTION Operation, TABLE_E, then the operation can be denoted by:

TABLE_E = PROJECTS(description, main_sponsor)

TABLE_E
description	main sponsor
Nano Technology	Futuristic
Information Technology	Guru
Bio Technology	Longevity
Neuro Science	Cyberlabs

The Desired Columns are returned.
The Resulting Table has the same number of rows as the original table but fewer columns.

SELECTION

The SELECTION Operation allows us to create a Resulting Table with "Rows Based on a Condition or Conditions" from all the rows in a Table.
To Create a Condition the following can be used;

the Conditional Operators (=, !=, <>, >, >=, <, <=)
the Logical Operators (AND, OR, NOT)
Columns
Values

The SELECTION Operation returns HORIZONTAL SLICES from a table.

For example, we might like to list only "those projects which cost more than 3 million Euros" in the PROJECTS table.

PROJECTS
project no	project type	description	main sponsor	cost (million Euros)
100	ATOMS	Nano Technology	Futuristic	2.37
200	BITS	Information Technology	Guru	6.19
300	GENES	Bio Technology	Longevity	4.48
400	NEURONS	Neuro Science	Cyberlabs	1.49

SELECTION Operation is indicated by;

the SEL Operator
Table Name
Conditions

If we call the Resulting Table of this SELECTION Operation, TABLE_F, then the operation can be denoted by:

TABLE_F = SEL(PROJECTS: cost > 3)

TABLE_F
project no	project type	description	main sponsor	cost (million Euros)
200	BITS	Information Technology	Guru	6.19
300	GENES	Bio Technology	Longevity	4.48

The Rows that satisfy the Given Conditions are returned.
The Resulting Table has the same number of columns as the original table but fewer rows.

PRODUCT

The PRODUCT of two tables is a combination of all rows in both tables.
It is also known as a CARTESIAN PRODUCT.
It can cause huge results with big tables.

If the First Table has X number of Rows, and the Second Table has Y number of Rows, then the Resulting Table has X TIMES Y number of Rows.
If the First Table has M number of Columns, and the Second Table has N number of Columns, then the Resulting Table has M PLUS N number of columns.

An Example on the PRODUCT Operation

For simplicity we take two tables with one column each and perform the PRODUCT Operation on them.

TABLE_1
T1_COLUMN1
A
B
C

TABLE_1 has 1 Column and 3 Rows.

TABLE_2
T2_COLUMN1
10
20

TABLE_2 has 1 Column and 2 Rows.

What will the Result of the PRODUCT Operation be?

If we call the Resulting Table of this PRODUCT Operation, TABLE_G, then the operation can be denoted by:

TABLE_G = TABLE_1 PRODUCT TABLE_2

TABLE_G
T1_COLUMN1	T2_COLUMN1
A	10
A	20
B	10
B	20
C	10
C	20

The Resulting Table of the PRODUCT Operation on TABLE_1 and TABLE_2, which is TABLE_G, has 6 Rows and 2 Columns.

Another Example on the PRODUCT Operation

Now, let us take the STAFF and the DEPARTMENTS tables, and perform the PRODUCT operation on these two tables.

STAFF table has 5 rows and 4 columns:

STAFF
staff id	name	dg id	program no
802	YILDIZ	400	10
804	HEYNEN	400	20
806	KANUITH	400	30
808	ZABOROWSKI	400	40
1020	OTTOVA	400	60

DEPARTMENTS table has 3 rows and 2 columns:

DEPARTMENTS
dg id	dg name
200	DG Communications
400	DG Innovation and Research
600	DG Legal Service

Cartesian Product of these two tables, will produce a Result Table with 5x3=15 Rows, and 4+2=6 Columns as seen below:

TABLE_R = STAFF PRODUCT DEPARTMENTS

TABLE_R
staff id	name	dg id	program no	dg id	dg name
802	YILDIZ	400	10	200	DG Communications
802	YILDIZ	400	10	400	DG Innovation and Research
802	YILDIZ	400	10	600	DG Legal Service
804	HEYNEN	400	20	200	DG Communications
804	HEYNEN	400	20	400	DG Innovation and Research
804	HEYNEN	400	20	600	DG Legal Service
806	KANUITH	400	30	200	DG Communications
806	KANUITH	400	30	400	DG Innovation and Research
806	KANUITH	400	30	600	DG Legal Service
808	ZABOROWSKI	400	40	200	DG Communications
808	ZABOROWSKI	400	40	400	DG Innovation and Research
808	ZABOROWSKI	400	40	600	DG Legal Service
1020	OTTOVA	400	60	200	DG Communications
1020	OTTOVA	400	60	400	DG Innovation and Research
1020	OTTOVA	400	60	600	DG Legal Service

ASSIGNMENT

ASSIGNMENT Operation creates a new table from existing tables.
We have been using ASSIGNMENT throughout all the other operations above.
ASSIGNMENT Operation is denoted by an = sign.
ASSIGNMENT Operation gives us an ability to name new tables that are based on other tables.

We have already used the ASSIGNMENT (=) Operator in the previous examples as seen below:

TABLE_A = PROGRAMS_LAST_YEAR UNION PROGRAMS_THIS_YEAR

TABLE_D = PROGRAMS_THIS_YEAR DIFFERENCE PROGRAMS_LAST_YEAR

JOIN

JOIN is one of the most important operations because of its ability to get related data from more than one table.
JOIN is based on COMMON SET OF VALUES of two columns.
To be able to perform a JOIN Operation on two tables, the JOIN columns;
- DO NOT HAVE TO have the same name,
- but they HAVE TO have the SAME DOMAIN.

If we are interested in STAFF information, but we also want to see the Department Name which is NOT IN THE STAFF TABLE, then a JOIN Operation can be carried out using the STAFF and the DEPARTMENTS tables.

The dg_id column is the common column having the same domain in both tables, and it will be used for the JOIN Operation.

TABLE_H = JOIN(STAFF,DEPARTMENTS : dg_id=dg_id)

This expression is read as:

JOIN a row in the STAFF table with a row in the DEPARTMENTS table, where the dg_id value in the STAFF table is equal to the dg_id value in the DEPARTMENTS table.

In Relational Algebra, the logic behind the JOIN Operation can be depicted as follows:

Step 1

A PRODUCT Operation on the STAFF and DEPARTMENTS tables is performed first.

T1 = STAFF PRODUCT DEPARTMENTS

The Resulting Table T1 will have 15 Rows and 6 Columns.

T1
staff id	name	dg id	program no	dg id	dg name
802	YILDIZ	400	10	200	DG Communications
802	YILDIZ	400	10	400	DG Innovation and Research
802	YILDIZ	400	10	600	DG Legal Service
804	HEYNEN	400	20	200	DG Communications
804	HEYNEN	400	20	400	DG Innovation and Research
804	HEYNEN	400	20	600	DG Legal Service
806	KANUITH	400	30	200	DG Communications
806	KANUITH	400	30	400	DG Innovation and Research
806	KANUITH	400	30	600	DG Legal Service
808	ZABOROWSKI	400	40	200	DG Communications
808	ZABOROWSKI	400	40	400	DG Innovation and Research
808	ZABOROWSKI	400	40	600	DG Legal Service
1020	OTTOVA	400	60	200	DG Communications
1020	OTTOVA	400	60	400	DG Innovation and Research
1020	OTTOVA	400	60	600	DG Legal Service

Step 2

A SELECTION Operation on the Resulting Table T1 of Step 1 is performed next, selecting only those rows where dg_id = dg_id in T1.

T2 = SEL(T1: dg_id=dg_id)

Those rows in T1 where dg_id = dg_id are highlighted below.

T1
staff id	name	dg id	program no	dg id	dg name
802	YILDIZ	400	10	200	DG Communications
802	YILDIZ	400	10	400	DG Innovation and Research
802	YILDIZ	400	10	600	DG Legal Service
804	HEYNEN	400	20	200	DG Communications
804	HEYNEN	400	20	400	DG Innovation and Research
804	HEYNEN	400	20	600	DG Legal Service
806	KANUITH	400	30	200	DG Communications
806	KANUITH	400	30	400	DG Innovation and Research
806	KANUITH	400	30	600	DG Legal Service
808	ZABOROWSKI	400	40	200	DG Communications
808	ZABOROWSKI	400	40	400	DG Innovation and Research
808	ZABOROWSKI	400	40	600	DG Legal Service
1020	OTTOVA	400	60	200	DG Communications
1020	OTTOVA	400	60	400	DG Innovation and Research
1020	OTTOVA	400	60	600	DG Legal Service

The Resulting Table T2 of this SELECTION Operation on table T1, T2 = SEL(T1: dg_id=dg_id) can be seen below:

T2
staff id	name	dg id	program no	dg id	dg name
802	YILDIZ	400	10	400	DG Innovation and Research
804	HEYNEN	400	20	400	DG Innovation and Research
806	KANUITH	400	30	400	DG Innovation and Research
808	ZABOROWSKI	400	40	400	DG Innovation and Research
1020	OTTOVA	400	60	400	DG Innovation and Research

Step 3

A PROJECTION Operation is performed next on Table T2, eliminating the DUPLICATE dg_id Columns in T2.

T3 = T2(staff_id, name, program_no, dg_id, dg_name )

Table T3 is the Result Table of the JOIN Operation.

T3
staff id	name	program no	dg id	dg name
802	YILDIZ	10	400	DG Innovation and Research
804	HEYNEN	20	400	DG Innovation and Research
806	KANUITH	30	400	DG Innovation and Research
808	ZABOROWSKI	40	400	DG Innovation and Research
1020	OTTOVA	60	400	DG Innovation and Research

The Resulting Table TABLE_H

The Resulting Table TABLE_H of this JOIN Operation on STAFF and DEPARTMENTS tables, which is explained in detail in Step 1, Step 2, and Step 3 above can now be seen below.

TABLE_H = JOIN(STAFF,DEPARTMENTS : dg_id=dg_id)

TABLE_H
staff id	name	program no	dg id	dg name
802	YILDIZ	10	400	DG Innovation and Research
804	HEYNEN	20	400	DG Innovation and Research
806	KANUITH	30	400	DG Innovation and Research
808	ZABOROWSKI	40	400	DG Innovation and Research
1020	OTTOVA	60	400	DG Innovation and Research

DIVISION

DIVISION Operation in Relational Algebra is similar to the DIVISION Operation in Mathematics.

Problem

We want to find out those Projects that take place in each and every Program.

PROGRAM_PROJECTS table has all the Programs, and the Projects that are in the Programs.
PROJECTS table has all the available Projects.

PROGRAM_PROJECTS
program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

PROJECTS
project no	project type	description	main sponsor	cost (million Euros)
100	ATOMS	Nano Technology	Futuristic	2.37
200	BITS	Information Technology	Guru	6.19
300	GENES	Bio Technology	Longevity	4.48
400	NEURONS	Neuro Science	Cyberlabs	1.49

Answer

To be able to find the Answer to the Question "Which Projects take place in each and every Program?" we perform the DIVISION Operation.

Using the DIVISION Operation between the PROGRAM_PROJECTS table and the PROJECTS table, we create a Result Table.

TABLE_I = PROGRAM_PROJECTS DIVISION PROJECTS

The Relationship between this new Result Table TABLE_I and the PROGRAM_PROJECTS and the PROJECTS tables is that; the Result Table TABLE_I holds only those Projects that are in each and every Program.

Project 100 takes place in each and every Program.
Project 200 takes place in Programs 10 and 20.
Project 300 takes place in Programs 20 and 30.
Project 400 takes place only in Program 30.

TABLE_I = PROGRAM_PROJECTS / PROJECTS

Project no 100, whose project type is 'ATOMS' and described as 'Nano Technology', takes place in each and every Program of this Research Institution.

Hence, the Result Table TABLE_I of this DIVISION Operation will only hold one row with project_no=100.

TABLE_I
project no
100

In Relational Algebra, DIVISION identifies rows in one table that have certain relationship to all rows in another table.

By using the Division Operation, a Relationship creates a New Entity from an Existing Entity.

Let E={a, b, c, ...} be the set of inhabitants of a village.
Let R be a relation, where x R y denotes that x and y are related, that is they are relatives.
E/R = {Q: x is an element of E}
Q which is the quotient of the division is a new entity.
Q holds those inhabitants who are related to each other.

Applications of Relational Algebra

Problem 1

Solution for Problem 1

Another Solution for Problem 1

Problem 2

Solution for Problem 2

Problem 1

Referring to the Sample Database Tables, find out the "name of staff" who was responsible of the Projects in Belgium, Last Year.

Solution for Problem 1

Step 1

Use the SELECTION Operation on the PROGRAMS_LAST_YEAR table to find the program where the country is Belgium.

TABLE_A = SEL(PROGRAMS_LAST_YEAR: country='Belgium')

PROGRAMS_LAST_YEAR
program no	location	country
10	Istanbul	Turkey
20	Brussels	Belgium
30	Luxembourg	Luxembourg

The program that took place last year in Belgium is program_no=20.

Result Table TABLE_A of this SELECTION Operation is seen below.

TABLE_A
program no	location	country
20	Brussels	Belgium

Step 2

Use the JOIN Operation on the STAFF table and the Result Table TABLE_A from Step 1 to find the staff who is responsible of the chosen program.

TABLE_B = JOIN(STAFF,TABLE_A: program_no= program_no)

STAFF
staff id	name	dg id	program_no
802	YILDIZ	400	10
804	HEYNEN	400	20
806	KANUITH	400	30
808	ZABOROWSKI	400	40
1020	OTTOVA	400	60

We can see above the row in the STAFF table where program_no=20.

This gives us the Resulting Table TABLE_B.

TABLE_B
staff id	name	dg id	program_no
804	HEYNEN	400	20

Step 3

Now that we have the staff responsible for the program in Belgium last year, as seen in TABLE_B of Step 2, we can use the PROJECTION Operation on TABLE_B to find his/her name.

TABLE_C = TABLE_B(name)

TABLE_C
name
HEYNEN

The answer is HEYNEN.

This resumes the Solution of Problem 1.

Another Solution for Problem 1

An alternative solution to this problem is seen below.

Step 1

Use the JOIN Operation on PROGRAMS_LAST_YEAR and STAFF to build up the Result Table TABLE_1.

TABLE_1= JOIN(PROGRAMS_LAST_YEAR,STAFF: program_no=program_no)

TABLE_1
program no	location	country	staff id	name	dg id
10	Istanbul	Turkey	802	YILDIZ	400
20	Brussels	Belgium	804	HEYNEN	400
30	Luxembourg	Luxembourg	806	KANUITH	400

Step 2

Use the SELECTION Operation on TABLE_1 to build up the Result Table TABLE_2.

TABLE_2=SEL(TABLE_1: country='Belgium')

TABLE_2
program no	location	country	staff id	name	dg id
20	Brussels	Belgium	804	HEYNEN	400

Step 3

Use the PROJECTION Operation on TABLE_2 to build up the Result Table TABLE_3.

TABLE_3=TABLE_2(name)

TABLE_3
name
HEYNEN

This resumes the alternative solution for Problem 1.

Problem 2

Referring to the Sample Database Tables, find out the Descriptions and the Main Sponsors for the Projects in Belgium.

Solution for Problem 2

Step 1

Use the UNION Operation to put all the Programs together, since we do not know whether the Programs in Belgium took place Last Year or they are taking place This Year.

Resulting Table T1 = PROGRAMS_LAST_YEAR UNION PROGRAMS_THIS_YEAR

T1
program_no	location	country
10	Istanbul	Turkey
20	Brussels	Belgium
30	Luxembourg	Luxembourg
40	Warsaw	Poland
60	Prague	Czech Republic

Step 2

Use the SELECTION Operation on the Resulting Table T1 from Step 1, to find the Programs in Belgium.

Resulting Table T2 = SEL(T1: country='Belgium')

T2
program_no	location	country
20	Brussels	Belgium

Step 3

Use the JOIN Operation on the Resulting Table T2 from Step 2, and table PROGRAM_PROJECTS to find out all the Projects in this Program.

Resulting Table T3 = JOIN(T2,PROGRAM_PROJECTS: program_no=program_no)

PROGRAM_PROJECTS
program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

T3
program no	location	country	project no	number of projects
20	Brussels	Belgium	100	1
20	Brussels	Belgium	300	1

Step 4

Use the JOIN Operation on the Resulting Table T3 from Step 3, and table PROJECTS to find out all the information about the selected Projects.

Resulting Table T4 = JOIN(T3,PROJECTS: project_no=project_no)

PROJECTS
project no	project type	description	main sponsor	cost (million Euros)
100	ATOMS	Nano Technology	Futuristic	2.37
200	BITS	Information Technology	Guru	6.19
300	GENES	Bio Technology	Longevity	4.48
400	NEURONS	Neuro Science	Cyberlabs	1.49

T4
program no	location	country	project no	number of projects	project type	description	main sponsor	cost
20	Brussels	Belgium	100	1	ATOMS	Nano Technology	Futuristic	2.37
20	Brussels	Belgium	300	1	GENES	Bio Technology	Longevity	4.48

Step 5

Use the PROJECTION Operation on the Resulting Table T4 from Step 4, to retrieve the Descriptions and Names of the Main Sponsors of the Projects in Belgium.

Resulting Table T5 = T4 (description, main_sponsor)

T5
description	main sponsor
Nano Technology	Futuristic
Bio Technology	Longevity

This resumes the Solution for Problem 2.

Weaknesses of Relational Algebra as a Programming Language

Relational Algebra is a Procedural Language in the sense that the User is required to use a Series of Operations to obtain a required Result.
Solutions can be difficult to develop.
Relational Algebra CANNOT Group related information together.
It CANNOT Perform Calculations on Numeric Values.
It CANNOT Sort Rows in any particular order.
Printing information with formatting is out of the question.

Strengths of Relational Algebra as a Programming Language

The strength of Relational Algebra as a Programming Language is that its implementation forms the basis of fourth-generation query languages.
These languages are supported by many other tools, which provide users with full application-development capabilities.

Relational Calculus

Relational Calculus is a non-procedural language.

The programmer specifies the data requirement, and the system generates the operations needed to produce a table with the required data.

Relational Calculus uses the general syntax:

Result = (Column List): Expression

The list of columns is on the left of the colon(:)
The expressions and conditions are on the right of the colon(:)

Problem 1

Solution for Problem 1 in Relational Calculus

Solution for Problem 1 in Relational Algebra

Problem 2

Solution for Problem 2 in Relational Calculus

Solution for Problem 2 in Relational Algebra

Problem 3

Solution for Problem 3 in Relational Calculus

Solution for Problem 3 in Relational Algebra

Problem 1

Referring to the Sample Database Tables, find those programs that contain project number 400.

Solution for Problem 1 in Relational Calculus

In Relational Calculus, the User/Programmer writes down the expressions necessary to solve this problem as follows:


(r.program_no) : 
r in PROGRAM_PROJECTS 
AND r.project_no = 400

In this expression, r is known as a row variable.

The expression is read as;

program_no in row r, where r is a row in the PROGRAM_PROJECTS table and project_no in row r is 400

Each row in PROGRAM_PROJECTS is examined using the condition to the right of the column.

PROGRAM_PROJECTS
program no	project no	number of_projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

The Resulting Table T contains the program_no column where its value is 30.

Resulting Table T
program no
30

This solution is in Relational Calculus.

In Relational Calculus, the system generated the required operations and produced the Resulting Table T with the required data, using the expressions given by the User/Programmer.

Solution for Problem 1 in Relational Algebra

Solution for the same problem in Relational Algebra would look like the following.

Step 1

Use the SELECTION Operation on the PROGRAM_PROJECTS table to select the rows where project_no=400.

Resulting Table T1 = SEL(PROGRAM_PROJECTS : project_no = 400)

PROGRAM_PROJECTS
program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

Resulting Table T1
program no	project no	number of projects
30	400	3

Step 2

Use the PROJECTION Operation on the Resulting Table T1 from Step1, to select the required column program_no.

Resulting Table T2 = T1(program_no)

Resulting Table T2
program no
30

This resumes the Solution for Problem 1, using Relational Algebra.

Problem 2

Referring to the Sample Database Tables, find the"staff_id" and "name" of staff members who work in department DG Innovation and Research.

Solution for Problem 2 in Relational Calculus

In Relational Calculus, the User/Programmer writes down the expressions necessary to solve this problem as follows:

        
(r.staff_id, r.name) : 
r in STAFF 
AND s in DEPARTMENTS
AND r.dg_id= s.dg_id 
AND s.dg_name = 'DG Innovation and Research'

The Resulting Table T is as follows:

Resulting Table T
staff id	name
802	YILDIZ
804	HEYNEN
806	KANUITH
808	ZABOROWSKI
1020	OTTOVA

This solution is in Relational Calculus.

In Relational Calculus, the system generated the required operations and produced the Resulting Table T, using the expressions given by the User/Programmer.

Solution for Problem 2 in Relational Algebra

Solution for the same problem in Relational Algebra would look like the following.

Step 1

Use the SELECTION Operation on DEPARTMENTS Table to find the rows where Department Name (dg_name) is 'DG Innovation and Research'.

Resulting Table T1 = SEL(DEPARTMENTS : dg_name = 'DG Innovation and Research')

DEPARTMENTS
dg id	dg name
200	DG Communications
400	DG Innovation and Research
600	DG Legal Service

Resulting Table T1
dg id	dg name
400	DG Innovation and Research

Step 2

Use the JOIN Operation on the Resulting Table T1 from Step 1, and the STAFF Table, to find those rows where their Department IDs(dg_id) are equal.

Resulting Table T2 = JOIN(T1,STAFF : dg_id = dg_id)

STAFF
staff id	name	dg id	program no
802	YILDIZ	400	10
804	HEYNEN	400	20
806	KANUITH	400	30
808	ZABOROWSKI	400	40
1020	OTTOVA	400	60

Resulting Table T2
staff id	name	dg id	program no	dg name
802	YILDIZ	400	10	DG Innovation and Research
804	HEYNEN	400	20	DG Innovation and Research
806	KANUITH	400	30	DG Innovation and Research
808	ZABOROWSKI	400	40	DG Innovation and Research
1020	OTTOVA	400	60	DG Innovation and Research

Step 3

Use the PROJECTION Operation on the Resulting Table T2 from Step 2 to find the values of columns staff_id and name.

Resulting Table T3 = T2(staff_id, name)

Resulting Table T3
staff id	name
802	YILDIZ
804	HEYNEN
806	KANUITH
808	ZABOROWSKI
1020	OTTOVA

This resumes the Solution for Problem 2, using Relational Algebra.

Problem 3

Referring to the Sample Database Tables, find the staff name, department name, and location for Projects taking place This Year for the Staff responsible for Program Number 10.

Solution for Problem 3 in Relational Calculus

In Relational Calculus, the User/Programmer writes down the expressions necessary to solve this problem as follows:


(e.name, d.dg_name, p.location) :  
e in STAFF			
AND d in DEPARTMENTS
AND p in PROGRAMS_THIS_YEAR
AND e.dg_id = d.dg_id
AND e.program_no = p.program_no
AND p.program_no = 10

The Resulting Table T is seen below:

Resulting Table T
name	dg name	location
YILDIZ	DG Innovation and Research	Istanbul

This solution is in Relational Calculus.

In Relational Calculus, the system generated the required operations and produced the Resulting Table T with the required data, using the expressions given by the User/Programmer.

Solution for Problem 3 in Relational Algebra

Solution for the same problem in Relational Algebra would look like the following.

Step 1

Use the SELECTION Operation on Table PROGRAMS_THIS_YEAR to find those rows, where program_no=10.

Resulting Table T1 = SEL(PROGRAMS_THIS_YEAR : program_no = 10)

PROGRAMS_THIS_YEAR
program no	location	country
10	Istanbul	Turkey
30	Luxembourg	Luxembourg
40	Warsaw	Poland
60	Prague	Czech Republic

Resulting Table T1
program no	location	country
10	Istanbul	Turkey

Step 2

Use the JOIN Operation on the Resulting Table T1 from Step 1, and the STAFF table on the common values of the column program_no.

Resulting Table T2 = JOIN(T1,STAFF : program_no=program_no)

STAFF
staff id	name	dg id	program no
802	YILDIZ	400	10
804	HEYNEN	400	20
806	KANUITH	400	30
808	ZABOROWSKI	400	40
1020	OTTOVA	400	60

Resulting Table T2
staff id	name	dg id	program no	location	country
802	YILDIZ	400	10	Istanbul	Turkey

Step 3

Use the JOIN Operation on the Resulting Table T2 from Step 2, and the DEPARTMENTS Table on the common values of the column dg_id.

Resulting Table T3 = JOIN(T2,DEPARTMENTS : dg_id = dg_id)

DEPARTMENTS
dg id	dg name
200	DG Communications
400	DG Innovation and Research
600	DG Legal Service

Resulting Table T3
staff id	name	dg id	program no	location	country	dg name
802	YILDIZ	400	10	Istanbul	Turkey	DG Innovation and Research

Step 4

Use the PROJECTION Operation on the Resulting Table T3 from Step 3, to select the columns name, dg_name and location only.

Resulting Table T4 = T3(name, dg_name, location)

Resulting Table T4
name	dg name	location
YILDIZ	DG Innovation and Research	Istanbul

This resumes the Solution for Problem 3, using Relational Algebra.

Basis for SQL

Relational Calculus is the basis for the non-procedural Structured Query Language SQL, which is supported by many relational database systems.

References

C.J. Date , An Introduction to Database Systems 7th Edition, Addison-Wesley ISBN 0-201-38590-2
Richard Barker , Case*Method, Entity Relationship Modelling, Addison-Wesley ISBN 0-201-41696-4
Nilesh Shah, Database Systems Using Oracle, A Simplified Guide to SQL and PL/SQL, Prentice Hall, ISBN 0-13-101857-4
James Canton, The Extreme Future, Plume, ISBN 978-0-452-28866-9

ISBN 92-79-03653-X , How the European Union works: Booklet by EU Publications Office
ISBN 978-92-824-2203-8, Information handbook of the Council of the European Union, DG Press, General Secretariat of the Council
European Union
https://europa.eu/european-union/index_en
European Council
https://www.consilium.europa.eu/en/
European Parliament
https://www.europarl.europa.eu/portal/en
The Official Directory of the European Union (Who is Who)
https://op.europa.eu/en/web/who-is-who

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of_projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of_projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of_projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1

program no	project no	number of projects
10	100	2
10	200	4
20	100	1
20	300	1
30	100	2
30	200	2
30	300	4
30	400	3
40	100	2
60	100	1