1. SQL JOIN statement – Worksheet 6
SQL Queries - Basics
Worksheet - 6
JOIN STATEMENT
1 Summarize the rules for single table query processing.
To generate the query results for a SELECT statement, follow these steps:
1. Start with the table named in the FROM clause.
2. If there is a WHERE clause, apply its search condition to each row of the table,
retaining those rows for which the search condition is TRUE, and discarding those rows
for which it is FALSE or NULL.
3. For each remaining row, calculate the value of each item in the select list to produce a
single row of query results. For each column reference, use the value of the column in
the current row.
4. If SELECT DISTINCT is specified, eliminate any duplicate rows of query results
that were produced.
5. If there is an ORDER BY clause, sort the query results as specified.
The rows generated by this procedure comprise the query results.
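The five steps above can be traced on a small example. This is a minimal sketch using Python's built-in sqlite3 module; the table OFFICES and its columns CITY, REGION and SALES, as well as all the data, are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical table and data, used only to trace the five steps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OFFICES (CITY TEXT, REGION TEXT, SALES REAL)")
conn.executemany(
    "INSERT INTO OFFICES VALUES (?, ?, ?)",
    [
        ("Denver", "Western", 186042.0),       # condition FALSE -> discarded
        ("Los Angeles", "Western", 835915.0),  # condition TRUE  -> kept
        ("New York", "Eastern", 692637.0),     # condition TRUE  -> kept
        ("Atlanta", "Eastern", None),          # condition NULL  -> discarded
        ("Chicago", "Eastern", 735042.0),      # condition TRUE  -> kept
    ],
)

# Step 1: FROM OFFICES. Step 2: WHERE keeps only rows where the test is TRUE.
# Step 3: the select list produces REGION for each kept row.
# Step 4: DISTINCT removes the duplicate 'Eastern'. Step 5: ORDER BY sorts.
rows = conn.execute(
    "SELECT DISTINCT REGION FROM OFFICES WHERE SALES > 500000 ORDER BY REGION"
).fetchall()
print(rows)  # [('Eastern',), ('Western',)]
```

Note that the Atlanta row is dropped even though its SALES value is not known to be small: a NULL search condition is treated like FALSE in step 2.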
2 Summarize the rules for two table query processing.
Consider a query based on two tables, such as:
"List all orders, showing the order number and amount, and the name and credit limit of the
customer who placed it."
Prof. Mukesh N. Tekwani Page 1 of 10
The ORDERS table contains the order number and amount of each order, but doesn't have
customer names or credit limits.
The CUSTOMERS table contains the customer names and credit limits, but it does not have
any information about orders.
But, there is a link between these two tables. In each row of the ORDERS table, the CUST
column contains the customer number of the customer who placed the order, which
matches the value in the CUST_NUM column in one of the rows in the CUSTOMERS
table. Hence, the SELECT statement that handles the request must somehow use this link
between the tables to generate its query results.
Here is the procedure to query both these tables:
1. Start by writing down the four column names for the query results. Then move to the
ORDERS table, and start with the first order.
2. Look across the row to find the order number (112961) and the order amount
($31,500.00) and copy both values to the first row of query results.
3. Look across the row to find the number of the customer who placed the order (2117),
and move to the CUSTOMERS table to find customer number 2117 by searching the
CUST_NUM column.
4. Move across the row of the CUSTOMERS table to find the customer's name ("J.P.
Sinclair") and credit limit ($35,000.00), and copy them to the query results table.
You've generated a row of query results! Move back to the ORDERS table, and go to the
next row. Repeat the process, starting with Step 2, until you run out of orders.
Each row of query results draws its data from a specific pair of rows, one from the
ORDERS table and one from the CUSTOMERS table.
Each pair of rows is found by matching the contents of corresponding columns from the
two tables.
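The manual procedure above can be sketched in code and compared with a single two-table query. This is only an illustration using Python's sqlite3 module. The column names CUST and CUST_NUM and the first order come from the text; the column names ORDER_NUM, AMOUNT, COMPANY and CREDIT_LIMIT and the second order and customer are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMERS (CUST_NUM INTEGER PRIMARY KEY, COMPANY TEXT, CREDIT_LIMIT REAL);
CREATE TABLE ORDERS (ORDER_NUM INTEGER PRIMARY KEY, AMOUNT REAL, CUST INTEGER);
INSERT INTO CUSTOMERS VALUES (2117, 'J.P. Sinclair', 35000.00);
INSERT INTO CUSTOMERS VALUES (2111, 'Example Co.', 50000.00);  -- hypothetical row
INSERT INTO ORDERS VALUES (112961, 31500.00, 2117);
INSERT INTO ORDERS VALUES (113012, 3745.00, 2111);             -- hypothetical row
""")

# Steps 1-4 by hand: walk the ORDERS table, and for each order look up
# the matching customer row by searching the CUST_NUM column.
orders = conn.execute(
    "SELECT ORDER_NUM, AMOUNT, CUST FROM ORDERS ORDER BY ORDER_NUM"
).fetchall()
manual = []
for order_num, amount, cust in orders:
    company, credit_limit = conn.execute(
        "SELECT COMPANY, CREDIT_LIMIT FROM CUSTOMERS WHERE CUST_NUM = ?", (cust,)
    ).fetchone()
    manual.append((order_num, amount, company, credit_limit))

# The same result from one query that matches CUST against CUST_NUM.
joined = conn.execute("""
    SELECT ORDER_NUM, AMOUNT, COMPANY, CREDIT_LIMIT
    FROM ORDERS, CUSTOMERS
    WHERE CUST = CUST_NUM
    ORDER BY ORDER_NUM
""").fetchall()

print(manual == joined)  # True
```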
3 What is a JOIN?
Usually a query will have to refer to two or more tables to find all the information it
requires. This happens because in a relational database, data is intentionally split up into
multiple tables in order to achieve modularization or normalization of data.
To deal with this fragmentation of data, we need a JOIN statement in SQL. A JOIN
statement combines data from two or more tables into a single result set. The tables are not
actually merged; they only appear to be merged in the rows returned by the query.
Multiple joins can be used to consolidate data from many tables.
There are two types of JOINs: inner join and outer join. The major difference between
these two is that an outer join includes rows in the result set even when the conditions
specified in the JOIN statement are not met, whereas an inner join does not return rows
that fail the JOIN condition. When the join condition in an outer join is not met
(taking a left outer join as the example), columns from the first table are returned
normally, but columns from the second table are returned with no value, that is, as NULLs.
Page 2 of 10 mukeshtekwani@hotmail.com
4 INNER JOIN
The INNER JOIN keyword returns rows when there is at least one match in both tables. If
there are rows in "Persons" that do not have matches in "Orders", those rows will NOT be
listed.
SELECT column_name(s)
FROM table_name1
INNER JOIN table_name2
ON table_name1.column_name=table_name2.column_name
Example:
SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo
FROM Persons
INNER JOIN Orders
ON Persons.P_Id = Orders.P_Id
ORDER BY Persons.LastName
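To see which rows an inner join keeps, the example can be run on a few sample rows. This is a minimal sketch using Python's sqlite3 module. It assumes the second table is named Orders, as in the description above; the O_Id column and all the data are hypothetical, and a second sort key is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (P_Id INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
CREATE TABLE Orders (O_Id INTEGER PRIMARY KEY, OrderNo INTEGER, P_Id INTEGER);
-- Hypothetical sample rows: person 2 has placed no orders.
INSERT INTO Persons VALUES (1, 'Hansen', 'Ola');
INSERT INTO Persons VALUES (2, 'Svendson', 'Tove');
INSERT INTO Persons VALUES (3, 'Pettersen', 'Kari');
INSERT INTO Orders VALUES (1, 77895, 3);
INSERT INTO Orders VALUES (2, 44678, 3);
INSERT INTO Orders VALUES (3, 22456, 1);
""")

rows = conn.execute("""
    SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo
    FROM Persons
    INNER JOIN Orders
    ON Persons.P_Id = Orders.P_Id
    ORDER BY Persons.LastName, Orders.OrderNo
""").fetchall()

for row in rows:
    print(row)
# ('Hansen', 'Ola', 22456)
# ('Pettersen', 'Kari', 44678)
# ('Pettersen', 'Kari', 77895)
```

The person with no matching order (Svendson) does not appear, while the person with two orders appears twice, once per matching pair of rows.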
Example of INNER JOIN
Consider a database XYZLTD. The tables in this database are as follows:
Table: Customers
CustomerNumber int NOT NULL ,
LastName char(30) NOT NULL,
FirstName char(30) NOT NULL,
StreetAddress char(30) NOT NULL,
City char(20) NOT NULL,
State char(3) NOT NULL,
PinCode char(6) NOT NULL
Table: Orders
OrderNumber int NOT NULL ,
OrderDate datetime NOT NULL ,
CustomerNumber int NOT NULL ,
ItemNumber int NOT NULL ,
Amount numeric(9, 2) NOT NULL
Table: Items
ItemNumber int NOT NULL ,
Description char(30) NOT NULL ,
Price numeric(9, 2) NOT NULL
Suppose we wish to join the tables Orders and Customers. There are two different syntaxes
to join these two tables. The first method is called the legacy (old) method and is as
follows:
Old Method:
SELECT customers.CustomerNumber, orders.Amount
FROM customers, orders
WHERE customers.CustomerNumber = orders.CustomerNumber
This is an inner join. If no order exists for a given customer, that customer is
omitted completely from the list.
The ANSI / SQL-92 syntax is as follows and this is preferable:
SELECT customers.CustomerNumber, orders.Amount
FROM customers JOIN orders
ON (customers.CustomerNumber = orders.CustomerNumber)
Consider the following example, using the old syntax, where we join 3 tables:
SELECT customers.CustomerNumber, orders.Amount,
items.Description
FROM customers, orders, items
WHERE customers.CustomerNumber = orders.CustomerNumber
AND orders.ItemNumber = items.ItemNumber
We write the ANSI/SQL-92 version of the same as follows:
SELECT customers.CustomerNumber, orders.Amount,
items.Description
FROM customers JOIN orders
ON (customers.CustomerNumber = orders.CustomerNumber)
JOIN items ON (orders.ItemNumber = items.ItemNumber)
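The two syntaxes can be checked against each other on the XYZLTD tables. This is a minimal sketch using Python's sqlite3 module; the table definitions follow the ones given above, while every data row is an assumption made up for illustration. An ORDER BY clause is added to both queries only so that the results can be compared row by row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
    CustomerNumber int NOT NULL, LastName char(30) NOT NULL,
    FirstName char(30) NOT NULL, StreetAddress char(30) NOT NULL,
    City char(20) NOT NULL, State char(3) NOT NULL, PinCode char(6) NOT NULL);
CREATE TABLE Orders (
    OrderNumber int NOT NULL, OrderDate datetime NOT NULL,
    CustomerNumber int NOT NULL, ItemNumber int NOT NULL,
    Amount numeric(9, 2) NOT NULL);
CREATE TABLE Items (
    ItemNumber int NOT NULL, Description char(30) NOT NULL,
    Price numeric(9, 2) NOT NULL);
-- Hypothetical rows: customer 2 has no orders.
INSERT INTO Customers VALUES (1, 'Shah', 'Anil', '1 Main Road', 'Mumbai', 'MH', '400001');
INSERT INTO Customers VALUES (2, 'Rao', 'Meera', '2 Park Street', 'Pune', 'MH', '411001');
INSERT INTO Items VALUES (10, 'Keyboard', 500.00);
INSERT INTO Items VALUES (20, 'Mouse', 250.00);
INSERT INTO Orders VALUES (100, '2004-05-06', 1, 10, 1000.00);
INSERT INTO Orders VALUES (101, '2004-05-07', 1, 20, 250.00);
""")

# Legacy syntax: tables listed in FROM, join conditions in WHERE.
old_style = conn.execute("""
    SELECT customers.CustomerNumber, orders.Amount, items.Description
    FROM customers, orders, items
    WHERE customers.CustomerNumber = orders.CustomerNumber
    AND orders.ItemNumber = items.ItemNumber
    ORDER BY orders.OrderNumber
""").fetchall()

# ANSI / SQL-92 syntax: join conditions in ON clauses.
ansi_style = conn.execute("""
    SELECT customers.CustomerNumber, orders.Amount, items.Description
    FROM customers JOIN orders
    ON (customers.CustomerNumber = orders.CustomerNumber)
    JOIN items ON (orders.ItemNumber = items.ItemNumber)
    ORDER BY orders.OrderNumber
""").fetchall()

print(old_style == ansi_style)                # True
print([row[2] for row in ansi_style])         # ['Keyboard', 'Mouse']
```

Both forms are inner joins, so the customer without orders is absent from both results.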
7 A simple example of JOIN statement
Consider two tables as shown below:
Customers:
CustomerID  FirstName  LastName  Email                  DOB         Phone
1           John       Smith     John.Smith@yahoo.com   2/4/1968    626 222
2           Steven     Goldfish  goldfish@fishhere.net  4/4/1974    323 455
3           Paula      Brown     pb@herowndomain.org    5/24/1978   416 323
4           James      Smith     jim@supergig.co.uk     10/20/1980  416 327
Sales:
CustomerID  Date       SaleAmount
2           5/6/2004   100.22
1           5/7/2004   99.95
3           5/7/2004   122.95
3           5/13/2004  100.00
4           5/22/2004  555.55
The SQL JOIN clause is used whenever we have to select data from 2 or more tables.
To use the SQL JOIN clause to extract data from 2 (or more) tables, we need a
relationship between certain columns in these tables.
As we can see, these 2 tables have a common field called CustomerID, and based on that we
can extract information from both tables by matching their CustomerID columns.
Consider the following SQL statement:
SELECT Customers.FirstName, Customers.LastName,
SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers, Sales
WHERE Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
The SQL statement above selects each distinct customer (first and last name) together
with the total amount that customer has spent. The join condition is specified in the
WHERE clause and says that the 2 tables have to be matched on their respective
CustomerID columns.
Here is the result of this SQL statement:
FirstName  LastName  SalesPerCustomer
John       Smith     99.95
Steven     Goldfish  100.22
Paula      Brown     222.95
James      Smith     555.55
The SQL statement above can be re-written using the SQL JOIN clause like this:
SELECT Customers.FirstName, Customers.LastName,
SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
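Both forms of the statement can be run on the sample data to confirm that they give the same totals. This is a minimal sketch using Python's sqlite3 module; only the columns needed by the query are created, and the column types are assumptions. The totals are rounded in Python because the amounts are stored as floating-point numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Sales (CustomerID INTEGER, Date TEXT, SaleAmount REAL);
INSERT INTO Customers VALUES (1, 'John', 'Smith');
INSERT INTO Customers VALUES (2, 'Steven', 'Goldfish');
INSERT INTO Customers VALUES (3, 'Paula', 'Brown');
INSERT INTO Customers VALUES (4, 'James', 'Smith');
INSERT INTO Sales VALUES (2, '5/6/2004', 100.22);
INSERT INTO Sales VALUES (1, '5/7/2004', 99.95);
INSERT INTO Sales VALUES (3, '5/7/2004', 122.95);
INSERT INTO Sales VALUES (3, '5/13/2004', 100.00);
INSERT INTO Sales VALUES (4, '5/22/2004', 555.55);
""")

WHERE_FORM = """
SELECT Customers.FirstName, Customers.LastName,
       SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers, Sales
WHERE Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
"""

JOIN_FORM = """
SELECT Customers.FirstName, Customers.LastName,
       SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
"""

def totals(sql):
    """Run the query and return {(first, last): total rounded to 2 places}."""
    return {(first, last): round(total, 2)
            for first, last, total in conn.execute(sql)}

where_totals = totals(WHERE_FORM)
join_totals = totals(JOIN_FORM)
print(where_totals == join_totals)  # True
print(join_totals)
```

Paula Brown's two sales are combined into one row by GROUP BY, which is why the result has four rows although the Sales table has five.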
There are 2 types of SQL JOINS – INNER JOINS and OUTER JOINS. If we don't put
INNER or OUTER keywords in front of the SQL JOIN keyword, then INNER JOIN is
used.
The INNER JOIN selects rows from both tables only where there is a match between
the columns we are matching on. If a customer in the Customers table has not yet
placed any orders (there are no entries for this customer in the Sales table), this
customer will not be listed in the result of our SQL query above.
If the Sales table has the following rows:
CustomerID  Date      SaleAmount
2           5/6/2004  100.22
1           5/6/2004  99.95
and we use the same SQL JOIN statement from above, we get the following result:
FirstName  LastName  SalesPerCustomer
John       Smith     99.95
Steven     Goldfish  100.22
Even though Paula and James are listed as customers in the Customers table they won't be
displayed because they haven't purchased anything yet.
But what if we want to display all the customers and their sales, no matter if they have
ordered something or not? We can do that with the help of SQL OUTER JOIN clause.
SQL OUTER JOIN:
The second type of SQL JOIN is called SQL OUTER JOIN. The 2 sub-types covered here are
LEFT OUTER JOIN and RIGHT OUTER JOIN (standard SQL also defines FULL OUTER JOIN).
The LEFT OUTER JOIN, or simply LEFT JOIN, selects all the rows from the first table
listed after the FROM keyword, whether or not they have matches in the second table.
If we slightly modify our last SQL statement to:
SELECT Customers.FirstName, Customers.LastName,
SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers LEFT JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
and the Sales table still has the following rows:
CustomerID  Date      SaleAmount
2           5/6/2004  100.22
1           5/6/2004  99.95
The result will be the following:
FirstName  LastName  SalesPerCustomer
John       Smith     99.95
Steven     Goldfish  100.22
Paula      Brown     NULL
James      Smith     NULL
Thus, we have selected everything from the Customers (first table). For all rows from
Customers, which don’t have a match in the Sales (second table), the SalesPerCustomer
column has amount NULL.
The RIGHT OUTER JOIN, or just RIGHT JOIN, behaves exactly like the SQL LEFT JOIN, except
that it returns all rows from the second table (the right table in our SQL JOIN statement).
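The difference between the inner join and the left outer join can be seen by running both on the two-row Sales table. This is a minimal sketch using Python's sqlite3 module; the column types are assumptions, and the helper function totals is hypothetical. SQL NULL values come back as None in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Sales (CustomerID INTEGER, Date TEXT, SaleAmount REAL);
INSERT INTO Customers VALUES (1, 'John', 'Smith');
INSERT INTO Customers VALUES (2, 'Steven', 'Goldfish');
INSERT INTO Customers VALUES (3, 'Paula', 'Brown');
INSERT INTO Customers VALUES (4, 'James', 'Smith');
INSERT INTO Sales VALUES (2, '5/6/2004', 100.22);
INSERT INTO Sales VALUES (1, '5/6/2004', 99.95);
""")

QUERY = """
SELECT Customers.FirstName, Customers.LastName,
       SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers {join} Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
"""

def totals(join):
    """Run the query with the given join keyword; return {(first, last): total}."""
    rows = conn.execute(QUERY.format(join=join)).fetchall()
    return {(first, last): total for first, last, total in rows}

inner = totals("INNER JOIN")
left = totals("LEFT JOIN")

print(sorted(inner))             # only John Smith and Steven Goldfish
print(left[("Paula", "Brown")])  # None, i.e. NULL: no matching Sales rows
```

Writing the tables in the opposite order with RIGHT JOIN (Sales RIGHT JOIN Customers) describes the same result as this LEFT JOIN. The sketch avoids RIGHT JOIN because older SQLite versions do not support it.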
5 Explain non-equi join.
The term join applies to any query that combines data from two tables by comparing the
values in a pair of columns from the tables. Joins based on equality between matching
columns (equi-joins) are by far the most common joins, but SQL also allows you to join
tables based on other comparison operators. Here's an example where a greater than (>)
comparison test is used as the basis for a join:
Example 1:
List all combinations of salespeople and offices where the salesperson's quota is more than
the office's target.
SELECT NAME, QUOTA, CITY, TARGET
FROM SALESREPS, OFFICES
WHERE QUOTA > TARGET
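The non-equi join can be tried on a few rows to see that every salesperson/office pair is tested, not only pairs that share a key. This is a minimal sketch using Python's sqlite3 module; the column names come from the query above, while the column types and all the data are assumptions. An ORDER BY clause is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SALESREPS (NAME TEXT, QUOTA REAL);
CREATE TABLE OFFICES (CITY TEXT, TARGET REAL);
-- Hypothetical sample rows.
INSERT INTO SALESREPS VALUES ('Rep A', 350000);
INSERT INTO SALESREPS VALUES ('Rep B', 275000);
INSERT INTO OFFICES VALUES ('Denver', 300000);
INSERT INTO OFFICES VALUES ('New York', 575000);
""")

# Each of the 2 x 2 = 4 row pairs is tested with QUOTA > TARGET.
rows = conn.execute("""
    SELECT NAME, QUOTA, CITY, TARGET
    FROM SALESREPS, OFFICES
    WHERE QUOTA > TARGET
    ORDER BY NAME, CITY
""").fetchall()

print(rows)  # [('Rep A', 350000.0, 'Denver', 300000.0)]
```

Only one of the four possible pairs passes the test: Rep A's quota exceeds Denver's target, and no other quota exceeds any target.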
6 What is meant by self-join? Explain with an example.
Some multi-table queries involve a relationship that a table has with itself. For example,
suppose you want to list the names of all salespeople and their managers. Each salesperson
appears as a row in the SALESREPS table, and the MANAGER column contains the
employee number of the salesperson's manager. It would appear that the MANAGER
column should be a foreign key for the table that holds data about managers. In fact it is—
it's a foreign key for the SALESREPS table itself!
If we tried to express this query like any other two-table query involving a foreign
key/primary key match, it would look like this:
SELECT NAME, NAME
FROM SALESREPS, SALESREPS
WHERE MANAGER = EMPL_NUM
This SELECT statement is illegal because of the duplicate reference to the SALESREPS
table in the FROM clause. You might also try eliminating the second reference to the
SALESREPS table:
SELECT NAME, NAME
FROM SALESREPS
WHERE MANAGER = EMPL_NUM
This query is legal, but it won't do what you want it to do! It's a single-table query, so SQL
goes through the SALESREPS table one row at a time, applying the search condition:
MANAGER = EMPL_NUM
The rows that satisfy this condition are those where the two columns have the same
value—that is, rows where a salesperson is their own manager. There are no such rows, so
the query would produce no results—not exactly the data that the English-language
statement of the query requested.
To understand how SQL solves this problem, imagine there were two identical copies of
the SALESREPS table, one named EMPS, containing employees, and one named MGRS,
containing managers, as shown in Figure below. The MANAGER column of the EMPS
table would then be a foreign key for the MGRS table, and the following query would
work:
Example:
List the names of salespeople and their managers.
SELECT EMPS.NAME, MGRS.NAME
FROM EMPS, MGRS
WHERE EMPS.MANAGER = MGRS.EMPL_NUM
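Because the columns in the two tables have identical names, all of the column references
are qualified. SQL does not actually need two copies of the table: naming SALESREPS twice
in the FROM clause, with a different alias for each reference (FROM SALESREPS EMPS,
SALESREPS MGRS), has the same effect.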
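The self-join can be run against a single SALESREPS table by declaring EMPS and MGRS as aliases of that one table. This is a minimal sketch using Python's sqlite3 module; the column names come from the text, while the column types and the three sample rows are assumptions. An ORDER BY clause is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SALESREPS (EMPL_NUM INTEGER PRIMARY KEY, NAME TEXT, MANAGER INTEGER);
-- Hypothetical sample rows: Sam has no manager (NULL).
INSERT INTO SALESREPS VALUES (106, 'Sam', NULL);
INSERT INTO SALESREPS VALUES (104, 'Bob', 106);
INSERT INTO SALESREPS VALUES (101, 'Dan', 104);
""")

# The single-table attempt: legal, but it only finds people who manage themselves.
single = conn.execute(
    "SELECT NAME, NAME FROM SALESREPS WHERE MANAGER = EMPL_NUM"
).fetchall()

# The self-join: one table, two aliases, every column reference qualified.
pairs = conn.execute("""
    SELECT EMPS.NAME, MGRS.NAME
    FROM SALESREPS EMPS, SALESREPS MGRS
    WHERE EMPS.MANAGER = MGRS.EMPL_NUM
    ORDER BY EMPS.NAME
""").fetchall()

print(single)  # []
print(pairs)   # [('Bob', 'Sam'), ('Dan', 'Bob')]
```

Sam does not appear as an employee in the result, because his MANAGER value is NULL and so matches no EMPL_NUM.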
6 What is a table alias?
As described in the previous section, table aliases are required in queries involving
self-joins. However, you can use an alias in any query. For example, if a query refers to
another user's table, or if the name of a table is very long, the table name can become
tedious to type as a column qualifier. Consider this query, which references the
BIRTHDAYS table owned by the user named SAM:
Example:
List names, quotas, and birthdays of salespeople.
SELECT SALESREPS.NAME, QUOTA, SAM.BIRTHDAYS.BIRTH_DATE
FROM SALESREPS, SAM.BIRTHDAYS
WHERE SALESREPS.NAME = SAM.BIRTHDAYS.NAME
This becomes easier to read and type when the aliases S and B are used for the two tables:
List names, quotas, and birthdays of salespeople.
SELECT S.NAME, S.QUOTA, B.BIRTH_DATE
FROM SALESREPS S, SAM.BIRTHDAYS B
WHERE S.NAME = B.NAME
The FROM clause specifies the tag that is used to identify the table in qualified column
references within the SELECT statement. If a table alias is specified, it becomes the table
tag; otherwise, the table's name, exactly as it appears in the FROM clause, becomes the tag.
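The long and the aliased forms of the query can be compared directly. This is a minimal sketch using Python's sqlite3 module. SQLite has no user-owned tables, so an attached in-memory database named SAM is used here as a stand-in for "the user named SAM"; that substitution, the column types and the sample rows are all assumptions. An ORDER BY clause is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for another user's schema: a second database attached under the name SAM.
conn.execute("ATTACH DATABASE ':memory:' AS SAM")
conn.executescript("""
CREATE TABLE SALESREPS (NAME TEXT, QUOTA REAL);
CREATE TABLE SAM.BIRTHDAYS (NAME TEXT, BIRTH_DATE TEXT);
-- Hypothetical sample rows.
INSERT INTO SALESREPS VALUES ('Rep A', 350000);
INSERT INTO SALESREPS VALUES ('Rep B', 275000);
INSERT INTO SAM.BIRTHDAYS VALUES ('Rep A', '1970-01-15');
INSERT INTO SAM.BIRTHDAYS VALUES ('Rep B', '1982-06-30');
""")

# Fully qualified form: the table names themselves are the tags.
long_form = conn.execute("""
    SELECT SALESREPS.NAME, QUOTA, SAM.BIRTHDAYS.BIRTH_DATE
    FROM SALESREPS, SAM.BIRTHDAYS
    WHERE SALESREPS.NAME = SAM.BIRTHDAYS.NAME
    ORDER BY SALESREPS.NAME
""").fetchall()

# Aliased form: S and B are the tags declared in the FROM clause.
short_form = conn.execute("""
    SELECT S.NAME, S.QUOTA, B.BIRTH_DATE
    FROM SALESREPS S, SAM.BIRTHDAYS B
    WHERE S.NAME = B.NAME
    ORDER BY S.NAME
""").fetchall()

print(long_form == short_form)  # True
print(short_form)
```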