The Difference Between SQL Join Conditions Into "ON" And "WHERE" Clauses
Some people may think that both approaches will return the same results every time. This is not completely true. Yes in some cases both approaches will return the same results but in other cases they won't. Let's check the case below and we will get the whole thing at the end.
Assume that we have two tables:
- Departments table which includes all departments which exist at a certain company
- Employees table which includes all employees in the same company
The two tables can be as follows
- Browse to http://sqlfiddle.com
- In the site tool bar on the upper left side there is a drop down list to choose the SQL engine you wish to use, so choose "MS SQL Server 2008" ...... I was so happy to find "MS SQL Server 2012" in the list :)
So, first let's build our schema. To do so, paste the code below into the text area on the left and hit the "Build Schema" button.
create table Departments ( [deptID] [int] NOT NULL , Name nvarchar(max) , CONSTRAINT pk_departments_pid PRIMARY KEY(deptID) ); create table Employees ( [empID] [int] IDENTITY(1,1) NOT NULL , Name nvarchar(max) , DepartmentId int , CONSTRAINT pk_employees_pid PRIMARY KEY(empID) , CONSTRAINT FK_employees_Department FOREIGN KEY (DepartmentId) REFERENCES Departments (deptID) );
So now we have our two tables as in the picture
Now, let's test our two approaches and see what happens. Assume that for some reason we only care about all departments except department with deptID = 3.
Adding Condition To The "WHERE" Clause
Copy and paste the code below into the text area on the right, on the "Run SQL" button there is a small arrow pointing down, hit this arrow and choose "Tabular Output" and finally hit "Run SQL".
insert into Departments (deptID, Name) values(1, 'Dept1'),(2, 'Dept2'),(3, 'Dept3'),(4, 'Dept4'),(5, 'Dept5'); insert into Employees (Name, DepartmentId) values('Ahmed', 1),('Tarek', 2),('Hasan', 2),('Saleh', 2),('Selim', 3); select e.Name as empName , d.Name as deptName from Employees as e left outer join Departments as d on e.DepartmentId = d.deptID where d.deptID <> 3;
After running the code you will get the results as in the picture below
As you can see there should be the record (Selim, Dept3) but it is filtered out because we added a condition to the "WHERE" clause which states that we only need records matching to all departments except the departments which have the "deptID" column equal to "3".
Adding Condition To The "ON" Clause
Copy and paste the code below into the text area on the right, on the "Run SQL" button there is a small arrow pointing down, hit this arrow and choose "Tabular Output" and finally hit "Run SQL".
insert into Departments (deptID, Name) values(1, 'Dept1'),(2, 'Dept2'),(3, 'Dept3'),(4, 'Dept4'),(5, 'Dept5'); insert into Employees (Name, DepartmentId) values('Ahmed', 1),('Tarek', 2),('Hasan', 2),('Saleh', 2),('Selim', 3); select e.Name as empName , d.Name as deptName from Employees as e left outer join Departments as d on (e.DepartmentId = d.deptID and d.deptID <> 3);
After running the code you will get the results as in the picture below
As you can see there should be the record (Selim, Dept3) but instead we have the record (Selim, NULL) and this is because we added the condition to the "ON" clause to filter out the departments with "deptID" equal to "3". So, just before the joining the table "Departments" is filtered according to the condition and then the joining is performed. Since the joining is outer left joining (knowing that the "Employees" table is the one on the left), then whatever values in the "Departments" table are all records in the "Employees" table will return. This time the employee "Selim" has no corresponding department in the "Departments" table because it is already filtered out by the condition just before the joining occurred.
Conclusion
When adding the condition to the "WHERE" clause, the joining is executed first and the filtering based on the condition is the last thing to happen. While when adding the condition to the "ON" clause the filtering is applied to the tables first then the joining happens.
So, in which cases do both approaches return the same results? This happens in case of inner joins as in inner joins if a record in the "Employees" table doesn't have a corresponding record in the "Departments" table then this record will be filtered out. So, even if the condition is added to the "ON" clause, the department record is filtered out before joining, joining is applied, then the joining itself will filter out the whole employee record because it has no corresponding department.
That's it. Hope you find this post useful.