SQL Server Nation
For all of your SQL Server needs.
When an outer join becomes an inner join

Enclosed you will find how an improperly constructed OUTER JOIN query can turn into an INNER JOIN.

An OUTER JOIN is described as a join where you want all data from one table along with any matching data from another table.  In SQL Server the default behavior of using JOIN creates an INNER JOIN.  But sometimes even if you use LEFT JOIN or LEFT OUTER JOIN, the query that you build may end up being an INNER JOIN.

The first thing we will do is setup some test tables.  Before you critique the tables, realize that this is simple test data and is not what we would build for an employee employer table.  The data sample below will add three employees.  Two of those employees will be associated with employers in the #Employers table.  Both of those employers will have a simple address.

Temporary tables:

set nocount on
go
create table #Employees
(EmployeeId int primary key clustered
,EmployerId int null
,FirstName varchar(20) not null
,LastName varchar(20) not null
)
create table #Employers
(EmployerId int primary key clustered
,EmployerName varchar(30) not null
)
create table #EmployerAddress
(EmployerId int primary key clustered
,EmployerCity varchar(30) not null
,EmployerState varchar(3) not null
)
go

 

Test data:

insert into #Employees(EmployeeId,EmployerId,FirstName,LastName)values(1,1,'Brandon','Galderisi')
insert into #Employees(EmployeeId,EmployerId,FirstName,LastName)values(2,2,'Tim','Chapman')
insert into #Employees(EmployeeId,EmployerId,FirstName,LastName)values(3,null,'Mike','Jones')

insert into #Employers(EmployerId,EmployerName)values(1,'Company A')
insert into #Employers(EmployerId,EmployerName)values(2,'Company B')

insert into #EmployerAddress(EmployerId,EmployerCity,EmployerState)values(1,'Anytown','USA')
insert into #EmployerAddress(EmployerId,EmployerCity,EmployerState)values(2,'Washington','DC')
go

 

Now let's look at the first simple query.  This query simply selects all #Employees with the #Employer records that are available.

select * from #Employees ee
left outer join #Employers er
on ee.EmployerId = er.EmployerId

 

 

Now let's add in the #Employer_Address table.

select * from #Employees ee
left outer join #Employers er
on ee.EmployerId = er.EmployerId
left outer join #EmployerAddress era
on er.EmployerId = era.EmployerId

 

 

So far so good.  The LEFT OUTER JOINs are working correctly still displaying all of the employees regardless of them having a listed employer.  Now let's change it.  Suppose you want to show all employees, but the employer only if they are in Washington.  So how would you do it?  Your first instinct may be to add criteria EmployerCity = 'Washington'.  Let's see what that does.

select * from #Employees ee
left outer join #Employers er
on ee.EmployerId = er.EmployerId
left outer join #EmployerAddress era
on er.EmployerId = era.EmployerId
where era.EmployerCity = 'Washington'

What you will see that not only did it remove the Employer information, but it filtered down to only employees who work for that employer.

 

 

One way to solve this is to create a DERIVED table and JOIN to it.  Since we only want employers who are in Washington, the DERIVED table will have an inner join on #Employers and #EmployerAddress.

select * from #Employees ee
left outer join
(select er.EmployerId,er.EmployerName,era.EmployerCity,era.EmployerState
from #Employers er
INNER join #EmployerAddress era
on er.EmployerId = era.EmployerId
where era.EmployerCity = 'Washington'
) er
on ee.EmployerId = er.EmployerId

 

What you will see is that now we have all 3 employees.  But we only list the employer information for those in Washington.

 

 

I hope that you found this post useful and that it brings light to the fact that improperly constructed JOIN statements can yield undesired results.


Posted 20 May 2009 9:00 PM by BrandonGalderisi
Filed under: ,
Copyright SQL Server Nation 2010
Powered by Community Server (Non-Commercial Edition), by Telligent Systems