Union vs. Union All

By using the UNION or UNION ALL operators we can combine multiple result sets into one result set.
UNION OPERATOR (Alias: DISTINCT UNION ORDERED LIST): is used to combine multiple result sets into one result set and will remove any duplicates rows that exist.  Basically it is performing a DISTINCT operation across all columns in the result set.
UNION ALL OPERATOR:  is used to combine multiple result sets into one result set, but it does not remove any duplicate rows.  Because this does not remove duplicate rows this process is faster, but if you don’t want duplicate records you will need to use the UNION operator instead.
Performance TIP:  Compared UNION ALL operator, UNION operator has the extra overhead of removing duplicate rows and sorting results. So, If we know that all the records returned by our query is unique from union then use UNION ALL operator instead of UNION Operator.
 Following are the constraints for using UNION/UNION ALL Operator:
  • All the query’s which need to combine need to have the same number of columns
  • Column should be of the same data type/compatible data types
  • ORDER BY clauses can only be issued for the overall result set and not within each result set
  • Column names of the final result set will be from the first query
Let us understand all these Constraints/Rules from Examples:
By using the below script we are creating a Sample DataBase Named:UNIONDEMO. Then in this database we are creating two tables Customers and Employees and in these tables populating the sample data.
CREATE DATABASE UNIONDEMO
GO
USE UNIONDEMO
GO
CREATE TABLE dbo.Customers(CustomerID int,Name Varchar(50),City Varchar(50))
GO
INSERT INTO dbo.Customers VALUES(1,'Monty Biradar','Bangalore'),
        (2,'Shashank Biradar','Mysore') GO
Create Table dbo.Employees(EmployeeId int,EmployeeName varchar(50),
                              City Varchar(50))
GO
INSERT INTO dbo.Employees VALUES(1,'Raju Patil','Mumbai'),
        (2,'Praveen Patil','Mumbai'),
        (3,'Krishna Kumar','Mumbai'),
        (4,'Ravi BIradar','Mumbai')
GO
SELECT * FROM dbo.Customers WITH(NOLOCK)
SELECT * FROM dbo.Employees WITH(NOLOCK)
GO
Example 1) Usage of UNION AND UNION ALL Operator

Observe that the Customer Table has 2 records, so the UNION statement is returning only 2 rows, by removing duplicate 2 rows. Where as the UNION ALL operator is returning all the 4 rows including the duplicates.
Example 2) MisMatch in the No of Columns in the select queries combined by the UNION Operator:
Query: SELECT EmployeeName,City
FROM DBO.Employees WITH(NOLOCK)
UNION
SELECT *
FROM DBO.Customers WITH(NOLOCK) 
OutPut:
Msg 205, Level 16, State 1, Line 1 All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.
Reason for the above error is:First statement of the UNION has two columns in the select list where as Second statement has 3 columns (i.e. * means all the columns in the Customers table)
Example 3) No of columns matching but mismatch in the data type of the columns
Query:
SELECT EmployeeID,City
FROM DBO.Employees WITH(NOLOCK)
UNION
SELECT Name,City
FROM DBO.Customers WITH(NOLOCK)
 
Output:
Msg 245, Level 16, State 1, Line 1 Conversion failed when converting the varchar value 'Monty Biradar' to data type int.
Reason for this error is: first column in the first statement of the UNION is of type INT and the first column in the Second statement is VACHAR, due to this incompatible data type we are seeing an error here.
Now we can re-write this query to work as below.  Also by seeing the column name in the result we can conclude that the column names in the result are always taken from the first statement of the UNION clause.
Query:
SELECT CAST(EmployeeID AS VARCHAR(50)) EmpIDORName,City
FROM DBO.Employees WITH(NOLOCK)
UNION
SELECT Name,City
FROM Customers WITH(NOLOCK)
OutPut:
EmpIDORName            City
---------------------- --------------
1                      Mumbai
2                      Mumbai
3                      Mumbai
4                      Mumbai
Monty Biradar          Bangalore
Shashank Biradar       Mysore
(6 row(s) affected)
 Example 4: Union Statement not only eliminates duplicate rows, but output is the sorted list
UNION Operator returns sorted list:
Query:
SELECT Name
FROM Customers WITH(NOLOCK)
UNION
SELECT EmployeeName
FROM DBO.Employees WITH(NOLOCK)
Output:
Name
--------------------
Krishna Kumar
Monty Biradar
Praveen Patil
Raju Patil
Ravi BIradar
Shashank Biradar
(6 row(s) affected)
UNION All operator doesn't sort the result:
Query:
SELECT Name
FROM Customers WITH(NOLOCK)
UNION ALL
SELECT EmployeeName
FROM DBO.Employees WITH(NOLOCK)
Output:
Name
---------------------
Monty Biradar
Shashank Biradar
Raju Patil
Praveen Patil
Krishna Kumar
Ravi BIradar
(6 row(s) affected)
 In the Next articel I will discuss the difference between UNION and UNION ALL operator, by showing execution plans, IO’s, time etc.

No comments:

Post a Comment