Contents
The contents of this work are based on;
This text on SQL is based on the EP-European Parliament as Data Model.
The European Parliament (EP) is the parliamentary institution of the European Union (EU).
The European Parliament is elected by the citizens of the European Union to represent their interests.
Its origins go back to the 1950s and the founding Treaties. Since 1979 its members have been directly elected by the citizens of the EU.
Elections are held every five years, and every EU citizen is entitled to vote, and to stand as a candidate, wherever they live in the EU. Parliament thus expresses the democratic will of the European Union's nearly 500 million citizens and it represents their interests in discussions with the other EU institutions.
Please Note:
The sources for the data, which are all freely obtainable from the European Union website, and from several leaflets of the EU institutions, are listed in the References section.
These sources are used to create the fictitious Business Narrative of the "Data Modelling and Normalization" Chapter.
We called this Business Narrative; "the Story of the European Union" which explains about the EU and EP, for purposes of Data Modeling.
The Story of the European Union is the last Chapter of Kaye is Learning SQL.
This text on SQL is based on the Oracle Database as RDBMS — Relational Database Management System.
Although the SQL Statements used in this text, apply specifically to the Oracle Database implementation of SQL, they are generally compatible with many other RDBMS SQL implementations.
The knowledge in the following sections are based on my personal work experience, as well as on several Oracle Documentation, which are freely downloadable from Oracle websites.
You can find a list of these documents in the References.
The name of the documents may change in time.
You can refer to the Oracle Documentation if you need more detailed information on any of the topics in this text.
Any errors present, are mine.
A Query is an operation that retrieves data from one or more tables or views.
We call a Query NESTED within another SQL Statement a Subquery.
We want to write a report that displays those countries which joined the EU later than Sweden.
We initially have to find out Sweden's join date.
We write the following query.
This is our first query.
select join_date from countries where country_id='SE';
Here is the result.
Sweden's join date is '01-JAN-95'.
We then write a second query, which retrieves countries with JOIN_DATE > '01-JAN-95'.
select country_id, country_name, join_date, gdp_pps, gdp from countries where status='M' and join_date > '01-JAN-95' order by join_date, country_id;
Here is the result.
Here are the countries which joined the EU later than Sweden.
We wrote the report required from us, in two steps, using two consecutive queries.
First Query In the First Query, we retrieved Sweden's join date: '01-JAN-95'
Second Query In the Second Query, we used the Output of the First Query, in the Comparison Operator > (join_date > '01-JAN-95') to complete the report.
However, we can solve this problem by using Only One Query.
We can solve the problem by writing Only One Query.
This is where the Subquery comes in.
select country_id, country_name, join_date, gdp_pps, gdp from countries where status='M' and join_date > (select join_date from countries where country_id='SE') order by join_date, country_id;
We have exactly the same result as before.
Let's examine this query.
We will now examine the structure of a Subquery, based on the Solution of the Report above.
select country_id,
country_name,
join_date,
gdp_pps,
gdp
from countries
where status='M'
and join_date > (select join_date
from countries
where country_id='SE')
order by join_date, country_id;
The script in black is the TOP-LEVEL QUERY.
The TOP-LEVEL QUERY can also be called; the MAIN QUERY / OUTER QUERY / PARENT QUERY / PARENT STATEMENT.
The script (in red and within brackets) is the SUBQUERY.
The SUBQUERY is nested within the PARENT STATEMENT.
A Query Nested within another SQL Statement is called a SUBQUERY.
A SUBQUERY can also be called an INNER QUERY.
The PARENT STATEMENT of a SUBQUERY is NOT ALWAYS a SELECT Statement.
The PARENT STATEMENT (TOP-LEVEL SQL Statement) in which the SUBQUERY is NESTED, can ALSO be an INSERT, UPDATE or a DELETE Statement.
A SUBQUERY can contain ANOTHER SUBQUERY.
You can nest up to 255 levels of Subqueries in a Nested Subquery.
For the limit on the number of Nested Subqueries within Subqueries, you can check the latest Oracle Documentation.
In any Subquery, OTHER THAN a Correlated Subquery, the INNER Query (the SUBQUERY);
The order of execution is like two sequential queries:
A Subquery can be executed in a:
There are many alternative terms for the Top-Level Query and the Subquery.
A Top-Level Query can also be called a Main Query / Outer Query / Parent Query / Parent Statement.
A Subquery can also be called a Child Query / Inner Query.
In this Chapter we will mostly use the following terms:
Below is the Basic Syntax of a Subquery.
Please keep in mind that there are other syntaxes.
This syntax helps us to understand the subquery operators.
SELECT select_list
FROM table_name
WHERE expression subquery operator
(SELECT select_list
FROM table_name)
The subquery operator can be;
These are the Single-Row Operators:
Single-Row Operators are used with Single-Row Subqueries.
A Subquery which returns Only One Row from the Inner Select Statement is a Single-Row Subquery.
These are the Multiple-Row Operators:
Multiple-Row Operators are used with Multiple-Row Subqueries.
A Subquery which returns More Than One Row from the Inner Select Statement is a Multiple-Row Subquery.
Retrieve those countries that joined the EU On The Same Date as the country with id 'SE'.
select country_id, country_name, capital, join_date from countries where status='M' and join_date = (select join_date from countries where country_id='SE') order by country_id;
Here is the output of this SELECT Statement.
Retrieve those countries that joined the EU Before The Joining Date of the country with id 'SE'.
select country_id, country_name, capital, join_date from countries where status='M' and join_date < (select join_date from countries where country_id='SE') order by join_date, country_id;
Here is the output of this SELECT Statement.
Retrieve the country which has the Minimum Purchasing Power Standard among the EU Member Countries.
select country_id, country_name, capital, gdp_pps, gdp, population, join_date from countries where status='M' and gdp_pps = (select min(gdp_pps) from countries where status='M');
Here is the output of this SELECT Statement.
An Aggregate Function (Group Function) is used in the subquery to select the minimum purchasing power standard MIN(GDP_PPS) among the member countries.
Retrieve the details of those countries which joined the EU before Sweden(SE), and which have a purchasing power standard greater than the United Kingdom(UK).
select country_id, country_name, capital, join_date, gdp_pps from countries where status='M' and join_date < (select join_date from countries where country_id='SE') and gdp_pps > (select gdp_pps from countries where country_id='UK') order by gdp_pps desc;
This SELECT Statement uses two subqueries:
In our database, in the COUNTRIES table, there are three distinct values for the column STATUS.
These three distinct values of the STATUS column are:
We can say that we have three categories of STATUS.
We want retrieve the minimum gdp_pps for different categories of STATUS, IF these minimum values of gdp_pps are GREATER THAN the minimum gdp_pps of countries with a status of 'C'.
select status, min(gdp_pps) from countries group by status having min(gdp_pps) > (select min(gdp_pps) from countries where status='C');
Here is the output of the query.
The INNER Query returns the "minimum gdp_pps MIN(GDP_PPS) of Candidate countries".
HAVING Clause eliminates those rows in the MAIN Query, that have minimum gdp_pps values less than or equal to the value returned by the INNER Query.
The SELECT statement below wants to retrieve those countries which joined the EU on the same date as the country 'SWEDEN'.
select country_id, country_name, gdp_pps, join_date from countries where join_date = (select join_date from countries where country_name='SWEDEN');
Here is the output of the SELECT statement.
We can see that NO Rows are retrieved.
Below is the Subquery.
select join_date from countries where country_name='SWEDEN'
We can see below that the Subquery Returns NO Rows.
This is because, the country name is stored as 'Sweden' which is INITCAP, and NOT in UPPERCASE.
The value of the JOIN_DATE retuened is NULL.
The MAIN Query compares NULL with the values of the JOIN_DATE column in the whole table, using the EQUAL TO = operator.
where join_date = (select join_date from countries where country_name='SWEDEN');
In fact, there is more than one country which have NULL values in their JOIN_DATE columns. These are the countries with status of 'C' and 'O'.
Comparison of two NULLS using the "= Operator" (and also using the other Comparison Operators) yields a NULL.
Therefore, the condition above will yield a NULL.
A row will ONLY be RETURNED if the condition yields a TRUE, therefore NO ROWS are returned.
Within the set of countries when grouped by STATUS (M=Member, C=Candidate, O=Others), we want to display ONLY the sets with the highest average gdp_pps.
We want the report to display the "STATUS", "minimum gdp_pps", "average gdp_pps", and the "maximum gdp_pps" of such a set.
The query below, with a HAVING Clause retrieves the required information.
select status, MIN(GDP_PPS), AVG(gdp_pps), MAX(GDP_PPS) from countries group by status having AVG(gdp_pps) = (select max(avg(gdp_pps)) from countries group by status) order by status
We can see the output of this query below.
Below is another query giving the same result.
select status, MIN(GDP_PPS), AVG(gdp_pps), MAX(GDP_PPS) from countries group by status having AVG(gdp_pps) = (select max(avgdp) from (select status, avg(gdp_pps) avgdp from countries group by status) ) order by status
MULTIPLE-ROW Subqueries return MORE THAN ONE Row.
MULTIPLE-ROW Operators:
We want to retrieve those countries which have the HIGHEST GDP(Gross Domestic Product) within each STATUS ('M', 'C', 'O') Group.
We write the following query.
SELECT COUNTRY_ID, COUNTRY_NAME, CASE STATUS WHEN 'M' THEN 'Member' WHEN 'C' THEN 'Candidate' WHEN 'O' THEN 'Other' END STATUS_CLASS, GDP FROM COUNTRIES WHERE GDP = (SELECT MAX(GDP) FROM COUNTRIES GROUP BY STATUS) ORDER BY GDP DESC;
Here is the output.
We receive the following error:
ORA-01427: single-row subquery returns more than one row
01427. 00000 - "single-row subquery returns more than one row"
We receive this error, because the Subquery returns multiple rows, whereas the "=" is a Single-Row Operator and EXPECTS only ONE ROW to be returned from the Subquery.
Here is the Subquery on its own.
We can see that it returned three rows, one per Status Group.
Our mistake is that, we are using a Single-Row operator (=) for the Subquery.
. . WHERE GDP = (SELECT MAX(GDP) FROM COUNTRIES GROUP BY STATUS)
If we alter the SELECT statement as below and use a Multiple-Row Operator instead, we do get the result set we want.
SELECT COUNTRY_ID, COUNTRY_NAME, CASE STATUS WHEN 'M' THEN 'Member' WHEN 'C' THEN 'Candidate' WHEN 'O' THEN 'Other' END STATUS_CLASS, GDP FROM COUNTRIES WHERE GDP IN (SELECT MAX(GDP) FROM COUNTRIES GROUP BY STATUS) ORDER BY GDP DESC;
Countries with the HIGHEST GDP(Gross Domestic Product) within each STATUS ('M', 'C', 'O') Group are displayed.
The Main Query appears to the server as the following.
SELECT COUNTRY_ID, COUNTRY_NAME, CASE STATUS WHEN 'M' THEN 'Member' WHEN 'C' THEN 'Candidate' WHEN 'O' THEN 'Other' END STATUS_CLASS, GDP FROM COUNTRIES WHERE GDP IN (10509000000000, 2322000000000, 319000000000) ORDER BY GDP DESC;
ANY Operator and SOME Operator are SYNONYMS.
ANY compares a Value to EACH Value returned by a Subquery.
The Condition must hold for AT LEAST ONE Value in the Result Set of the Subquery.
= ANY is equivalent to IN.
< ANY is equivalent to LESS THAN the MAXIMUM.
> ANY is equivalent to GREATER THAN the MINIMUM.
Report Requirement
We are asked to create a report of those countries which are Members, and which have GDPs LESS THAN ANY of the Candidate countries. The result set will be in descending GDP values.
SELECT COUNTRY_ID, COUNTRY_NAME, GDP FROM COUNTRIES WHERE GDP < ANY (SELECT GDP FROM COUNTRIES WHERE STATUS='C') AND STATUS='M' ORDER BY GDP DESC;
Here is the Result Set of the Query.
Below is the INNER Query / SUBQUERY.
SELECT GDP FROM COUNTRIES WHERE STATUS='C'
This is what the INNER Query / SUBQUERY returns.
< ANY is equivalent to LESS THAN the MAXIMUM.
The MAXIMUM value in the Result Set of the SUBQUERY is 319000000000.
Therefore, the MAIN QUERY appears to the Server as the following.
SELECT COUNTRY_ID, COUNTRY_NAME, GDP FROM COUNTRIES WHERE GDP < 319000000000 AND STATUS='M' ORDER BY GDP DESC;
These are the rows of Member Countries, whose GDPs are LESS THAN THE MAXIMUM GDP in the RESULT SET OF THE INNER QUERY.
We achieved this result using the < ANY Operator:
SELECT COUNTRY_ID, COUNTRY_NAME, GDP FROM COUNTRIES WHERE GDP < ANY (SELECT GDP FROM COUNTRIES WHERE STATUS='C') AND STATUS='M' ORDER BY GDP DESC;
ALL compares a Value to EVERY Value returned by a Subquery.
The Condition must hold for ALL Values in the Result Set of the Subquery.
!= ALL is equivalent to NOT IN.
< ALL is equivalent to LESS THAN the MINIMUM.
> ALL is equivalent to GREATER THAN the MAXIMUM.
Report Requirement
We are asked to create a report of those countries which are Members, and which have GDPs LESS THAN ALL of the Candidate countries. The result set will be in descending GDP values.
SELECT COUNTRY_ID, COUNTRY_NAME, GDP FROM COUNTRIES WHERE GDP < ALL (SELECT GDP FROM COUNTRIES WHERE STATUS='C') AND STATUS='M' ORDER BY GDP DESC;
Here is the Result Set of the Query.
Zero rows returned, i.e. NO ROWS are returned.
Below is the INNER Query / SUBQUERY.
ORDER BY is for displaying the data more meaningfully. It is not in the actual subquery.
SELECT GDP FROM COUNTRIES WHERE STATUS='C' ORDER BY GDP
This is what the INNER Query / SUBQUERY returns.
< ALL is equivalent to LESS THAN the MINIMUM.
The MINIMUM value in the Result Set of the SUBQUERY is 5000000000.
Therefore, the MAIN QUERY appears to the Server as the following.
SELECT COUNTRY_ID, COUNTRY_NAME, GDP FROM COUNTRIES WHERE GDP < 5000000000 AND STATUS='M' ORDER BY GDP DESC;
The query returns NO ROWS.
This is because, there is NO Member Country with a GDP which is LESS THAN the MINIMUM GDP in the Result Set of the INNER QUERY.
We achieved this result using the < ALL Operator:
SELECT COUNTRY_ID, COUNTRY_NAME, GDP FROM COUNTRIES WHERE GDP < ALL (SELECT GDP FROM COUNTRIES WHERE STATUS='C') AND STATUS='M' ORDER BY GDP DESC;
We want to find out those Political Groups that DO NOT HAVE any Members of the European Parliament.
As an answer to the Report Requirement, we create and run the following query.
select pg_id, pg_code, pg_name from political_groups where pg_id NOT IN (select distinct pg_id from meps);
This is the Result Set of the query.
The query returns NO ROWS !?
We now run the query above which returned NO ROWS, as two separate Queries:
Here is the Outer Query WITHOUT the WHERE Clause for the INNER QUERY.
ORDER BY Clause is added for display purposes only.
select pg_id, pg_code, pg_name from political_groups order by pg_id;
The Outer Query returned 9 rows.
These are the 9 Political Groups of the European Parliament.
They MIGHT NOT all have Members of the Parliament.
We now run the INNER QUERY on its own.
We include the ORDER BY Clause for display purposes only.
select distinct pg_id from meps order by pg_id;
Here is the output of the INNER QUERY.
These are the Political Groups of the European Parliament, that HAVE Members of the Parliament.
The last row with a NULL Value indicates the Non-Attached Members of the European Parliament.
As we know from our Data Model, the Non-Attached Members of the Parliament have a NULL PG_ID.
The result set of the Outer Query is: (20, 40, 60, 80, 100, 140, 160, 200901, 200902)
The result set of the Inner Query is: (20, 40, 60, 80, 100, 140, 160, NULL)
Looking at the Result Sets of the Outer Query and the Inner Query, the logical result we expect to see is the following two political groups: (200901, 200902)
PG_ID | PG_CODE | PG_NAME |
---|---|---|
200901 | NEW-EN | Group of the New Energy |
200902 | IE-IT | Group of the Innovation Economy and Information Technology |
Obviously, these are the two Political Groups that DO NOT HAVE any Members.
But, why don't they appear in the Result Set of Our Query?
Why is the Result Set of Our Query seen below, has No Rows?
select pg_id, pg_code, pg_name from political_groups where pg_id NOT IN (select distinct pg_id from meps);
Question: When does a Query Return any Rows?
Answer: A Query Returns Rows, when the Logical Condition/Conditions in the WHERE Clause Evaluate to TRUE.
Question: What can the Logical Condition/Conditions in the WHERE Clause Evaluate to?
Answer: The Logical Condition/Conditions in a WHERE Clause can evaluate to one of the below:
Question: What does the Condition in our query seen below, Evaluate to?
where pg_id NOT IN (select distinct pg_id from meps);
Answer: This is a lengthy answer.
We will see how to proceed, in the next section.
The query in our example is seen below:
select pg_id,
pg_code,
pg_name
from political_groups
where pg_id NOT IN
(select distinct pg_id
from meps);
During query execution, the WHERE Clause Condition becomes:
(20, 40, 60, 80, 100, 140, 160, 200901, 200902)
NOT IN
(20, 40, 60, 80, 100, 140, 160, NULL)
Each value in the First Set (Result Set of the OUTER Query) is compared with each value in the Second Set (Result Set of the INNER Query).
There are Nine 9 Values in the First Set, therefore there will be Nine 9 Comparison Conditions.
Any Comparison Condition which yields TRUE, will return that value from the First Set.
REMINDER
WHERE variable_01 NOT IN (variable_02, variable_03)
IS IDENTICAL TO
WHERE (variable_01 != variable_02) AND (variable_01 != variable_03)
The value 20 in the First Set is compared with each of the values in the Second Set.
20!=20 AND 20!=40 AND 20!=60 AND 20!=80 AND 20!=100 AND 20!=140 AND 20!=160 AND 20!=NULL
which evaluates to;
FALSE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND NULL
which evaluates to;
FALSE
Therefore, the First Condition does NOT RETURN any rows.
Hence, the row with pg_id = 20 is NOT SELECTED.
The value 40 in the First Set is compared with each of the values in the Second Set.
40!=20 AND 40!=40 AND 40!=60 AND 40!=80 AND 40!=100 AND 40!=140 AND 40!=160 AND 40!=NULL
which evaluates to;
TRUE AND FALSE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND NULL
which evaluates to;
FALSE
Therefore, the Second Condition does NOT RETURN any rows.
Hence, the row with pg_id = 40 is NOT SELECTED.
Similarly, all of the Third to Seventh Conditions evaluate to FALSE.
Hence, they DO NOT RETURN any rows either.
60, 80, 100, 140 and 160 are NOT SELECTED
The value 200901 in the First Set is compared with each of the values in the Second Set.
200901!=20 AND 200901!=40 AND 200901!=60 AND 200901!=80 AND 200901!=100 AND 200901!=140 AND 200901!=160 AND 200901!=NULL
which evaluates to;
TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND NULL
which evaluates to;
NULL
Therefore, the Eighth Condition does NOT RETURN any rows.
Hence, the row with pg_id = 200901 is NOT SELECTED.
The value 200902 in the First Set is compared with each of the values in the Second Set.
200902!=20 AND 200902!=40 AND 200902!=60 AND 200902!=80 AND 200902!=100 AND 200902!=140 AND 200902!=160 AND 200902!=NULL
which evaluates to;
TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND TRUE AND NULL
which evaluates to;
NULL
Therefore, the Ninth Condition does NOT RETURN any rows.
Hence, the row with pg_id = 200902 is NOT SELECTED.
NONE of the NINE Comparison Conditions returned TRUE .
Therefore, NO ROWS are RETURNED.
The query should be re-written as follows:
select pg_id, pg_code, pg_name from political_groups where pg_id NOT IN (select distinct pg_id from meps where pg_id is NOT NULL);
We added the WHERE Condition seen below into the INNER QUERY.
This WHERE Condition makes sure that there will be only NON NULL values in the Result Set of the INNER QUERY.
where pg_id is NOT NULL
We can now see the Political Groups that DO NOT HAVE any Members of the Parliament just as yet.
"Success is not final;
Failure is not fatal.
It is the courage to continue that counts."
— Winston Churchill
We want to retrieve details of those countries which have a GDP MORE THAN the Average GDP of their Status Group.
Please remember that there are three Groups of Status in our Database:
In the Query which brings us the required rows, we use an INNER Query within the FROM Clause.
This INNER Query in the FROM Clause is treated like a "temporary table" during the lifetime of the query.
An INNER Query within the FROM Clause of a SELECT Statement is also called an INLINE VIEW.
Here is our Query:
SELECT C_ALL.COUNTRY_NAME, C_ALL.STATUS, C_ALL.GDP, ROUND(C_AVG_GDP.AVG_GDP) "Average GDP of Status Group" FROM COUNTRIES C_ALL, (SELECT STATUS, AVG(GDP) AVG_GDP FROM COUNTRIES GROUP BY STATUS) C_AVG_GDP WHERE C_ALL.STATUS = C_AVG_GDP.STATUS AND C_ALL.GDP > C_AVG_GDP.AVG_GDP ORDER BY C_ALL.STATUS, C_ALL.GDP DESC;
Here is the Result Set of the query:
We want to retrieve those Member Countries which have the TOP-5 Purchasing Power Standard.
To be able to get the required results, we make use of the pseudocolumn ROWNUM.
For each row returned by a query, the ROWNUM pseudocolumn returns "a number indicating the ORDER" in which the row is selected from a table or from a set of joined rows.
The First Row selected has a ROWNUM of 1 , the Second Row selected has a ROWNUM of 2, and so on.
ROWNUM can be used to LIMIT the number of rows returned by a query, as we will do in our example.
The Initial Step is, to Order the Countries in Descending Order of their Purchasing Power Standard.
The column GDP_PPS holds the Purchasing Power Standard.
Here is our Step 1 Query:
SELECT COUNTRY_NAME, GDP_PPS FROM COUNTRIES WHERE STATUS='M' ORDER BY GDP_PPS DESC;
The Query of Step 1 will be used as the INNER Query in the FROM Clause of the MAIN Query in Step 2.
As we know from the previous section, Result Set of the INNER Query in the FROM Clause is treated like a "temporary table".
The Result Set of the INNER Query is seen below.
The rows will be selected EXACTLY in the order seen below from the INNER Query "temporary table".
The Step 2 Query (MAIN Query) seen below, uses the Query of Step 1 as an INNER Query in the FROM Clause (INLINE VIEW).
SELECT ROWNUM "Rank", COUNTRY_NAME "Top Five", GDP_PPS "PPS" FROM (SELECT COUNTRY_NAME, GDP_PPS FROM COUNTRIES WHERE STATUS='M' ORDER BY GDP_PPS DESC) WHERE ROWNUM <=5;
Our Answer to the Problem is seen below.
We retrieved those Member Countries which have the TOP-5 Purchasing Power Standard.
There is an ORDER BY Clause in the INNER Query:
We placed the WHERE Condition with ROWNUM in the MAIN Query:
This condition LIMITED the number of rows to 5.
We forced the WHERE Condition with ROWNUM to be applied AFTER the rows are ORDERED in the INNER Query.
As a result, the query returned the FIRST FIVE Countries with the HIGHEST GDP_PPS, in DESCENDING order of GDP_PPS.
This type of query is referred to as a TOP-N Query.
In this section, we had an example of a TOP-N Query with the Pseudocolumn ROWNUM.
We want to retrieve those Countries which have the TOP-3 Purchasing Power Standard within their Status Group.
In our report, we want to display:
The Report Output we want to obtain is seen below.
To be able to get the required results, we make use of the Analytic Function ROW_NUMBER.
Here is our INNER QUERY:
SELECT STATUS, COUNTRY_NAME, GDP_PPS, ROW_NUMBER() OVER (PARTITION BY STATUS ORDER BY GDP_PPS DESC) STATUS_RANK FROM COUNTRIES WHERE GDP_PPS IS NOT NULL
Below is the Result Set of the INNER Query.
The complete Result Set is seen below.
We have the following expression in the INNER QUERY.
ROW_NUMBER() OVER (PARTITION BY STATUS ORDER BY GDP_PPS DESC) STATUS_RANK
Let's analyze this expression by each Clause.
This Clause DIVIDES the COUNTRIES table into SUBSETS by the values in the column STATUS.
The values in the column STATUS are 'C', 'M', and 'O'.
Hence we now have three SUBSETS of the COUNTRIES table.
This Clause ORDERS the COUNTRIES in DESCENDING ORDER of their GDP_PPS .
Hence, these Two Clauses together;
This Clause ASSIGNS each COUNTRY a RANK WITHIN EACH SUBSET , which are already ordered in descending GDP_PPS.
This whole expression above, created a New Column.
We now give this new column an ALIAS:
This new column DOES NOT EXIST in the Countries table.
It ONLY EXISTS during the lifetime of the query.
Once again, we display our INNER QUERY and its Result Set below.
SELECT STATUS, COUNTRY_NAME, GDP_PPS, ROW_NUMBER() OVER (PARTITION BY STATUS ORDER BY GDP_PPS DESC) STATUS_RANK FROM COUNTRIES WHERE GDP_PPS IS NOT NULL
Our INNER QUERY Result Set is ready to be used in the OUTER QUERY.
Using the query below, we find the Countries with the three highest GDP_PPS WITHIN each Status Group.
We placed the WHERE Condition with the New Column in the OUTER Query:
This condition which uses the New Column STATUS_RANK, LIMITS the number of rows within Each STATUS Group to 3 .
SELECT STATUS_RANK, STATUS, COUNTRY_NAME, GDP_PPS FROM (SELECT STATUS, COUNTRY_NAME, GDP_PPS, ROW_NUMBER() OVER (PARTITION BY STATUS ORDER BY GDP_PPS DESC) STATUS_RANK FROM COUNTRIES WHERE GDP_PPS IS NOT NULL) WHERE STATUS_RANK <=3 ORDER BY STATUS, GDP_PPS DESC, COUNTRY_NAME;
Below is the Result Set of the OUTER Query.
In this section, we had an example of a TOP-N Query with the Analytic Function ROW_NUMBER.
A Correlated Subquery is performed when the Subquery REFERENCES a column in a table in the MAIN Query (OUTER Query or the PARENT Statement).
The PARENT Statement can be a SELECT, INSERT, UPDATE or a DELETE Statement in which the Subquery is nested.
In any Subquery, OTHER THAN a Correlated Subquery, the INNER Query (the SUBQUERY);
A Correlated Subquery however, EXECUTES ONCE for EACH CANDIDATE ROW considered by the MAIN Query(OUTER Query/PARENT Query).
In other words, the Correlated Subquery (Inner Query ) is DRIVEN BY the MAIN Query(OUTER Query/PARENT Query).
MAIN Query(OUTER Query/PARENT Query) Retrieves a Candidate Row FIRST.
Let's have an Example of a Correlated Subquery.
Find those Countries which have Purchasing Power Standard values MORE THAN the Average Purchasing Power Standard of their Status Group.
In this case, the Correlated Subquery specifically computes the Average Purchasing Power Standard of EACH Status Group (STATUS column).
However, the Value of the Status Group in the Correlated Subquery DEPENDS ON the Value of the Status Column in EACH ROW processed by the MAIN Query(OUTER Query/PARENT Query).
The following SELECT Statement returns Countries which have Purchasing Power Standard values MORE THAN their Status Group Average.
select outer_c.country_id, outer_c.country_name, outer_c.status, outer_c.gdp_pps from countries outer_c where outer_c.gdp_pps > (select avg(inner_c.gdp_pps) from countries inner_c where inner_c.status = outer_c.status) order by outer_c.status, outer_c.gdp_pps desc;
Here is the Result Set of the Query.
EXISTS Operator is used to test whether a value retrieved by the MAIN Query exists in the Result Set of the INNER Query (Correlated Subquery).
If the INNER Query (Correlated Subquery) Returns AT LEAST ONE ROW:
If the INNER Query (Correlated Subquery) Returns NO ROWS:
Requirement:
Display details of those Political Groups that have at least one MEP (Member of the Parliament) belonging to them.
Answer:
To answer this requirement, we write the Query below.
select outer_q.pg_id, outer_q.pg_code, outer_q.pg_name from political_groups outer_q where EXISTS (select 1 from meps inner_q where inner_q.pg_id = outer_q.pg_id);
Here is the Result Set of the Query.
EXISTS Operator ensures that the search in the INNER QUERY (Correlated Subquery) does NOT CONTINUE when AT LEAST ONE MATCH is found for the Political Group.
We can replace the EXISTS Operator with the IN Operator, and use the column pg_id instead of a CONSTANT in the INNER SELECT, and obtain the same result set, as seen below.
select outer_q.pg_id, outer_q.pg_code, outer_q.pg_name from political_groups outer_q where outer_q.pg_id IN (select distinct inner_q.pg_id from meps inner_q where inner_q.pg_id is not null);
Here is the output of the query.
NOT EXISTS Operator is used to test whether a value retrieved by the MAIN Query DOES NOT EXIST in the Result Set of the INNER Query (Correlated Subquery).
If the INNER Query (Correlated Subquery) Returns AT LEAST ONE ROW:
If the INNER Query (Correlated Subquery) Returns NO ROWS:
Requirement:
Display details of those Political Groups that have currently NO MEPs (Members of the Parliament) belonging to them.
Answer:
To answer this requirement, we write the Query below.
select outer_q.pg_id, outer_q.pg_code, outer_q.pg_name from political_groups outer_q where NOT EXISTS (select 1 from meps inner_q where inner_q.pg_id = outer_q.pg_id);
Here is the Result Set of the Query(NOT EXISTS).
We can replace the NOT EXISTS Operator with the NOT IN Operator, and use the column pg_id instead of a CONSTANT in the INNER SELECT, and obtain the same result set, as seen below.
select outer_q.pg_id, outer_q.pg_code, outer_q.pg_name from political_groups outer_q where outer_q.pg_id NOT IN (select distinct inner_q.pg_id from meps inner_q where inner_q.pg_id is not null);
Here is the Result Set of the Query (NOT IN).
This resumes the Chapter on Subqueries.