The OUTPUT clause

by Allan Svelmøe Hansen March 20, 2010 12:25

The output clause is a really nifty thing in SQL Server 2008, which provides you access to the same "inserted" and "deleted" tables you get access to via triggers, but from within the same query.
To illustrate, suppose we have a table:

CREATE TABLE [dbo].[OutputEx]
   
[Key] [int] IDENTITY(1,1) NOT NULL, 
   
[Value] [nchar](10) NULL, 
   
[SecondValue] [datetime2](7) NULL, 
 
CONSTRAINT [PK_OutputEx] PRIMARY KEY CLUSTERED  

   
[Key] ASC 
)WITH (PAD_INDEX  = OFFSTATISTICS_NORECOMPUTE  = OFFIGNORE_DUP_KEY = OFF
ALLOW_ROW_LOCKS  
= ON,
ALLOW_PAGE_LOCKS  = ONON [PRIMARY] 
ON [PRIMARY] 

GO 

ALTER TABLE [dbo].[OutputEx] ADD  CONSTRAINT [DF_OutputEx_SecondValue]  DEFAULT (GETDATE()) FOR [SecondValue] 

This is a table which has an identity column in [Key] and a datetime value field in SecondValue2 which defaults to the current date (GETDATE())

Now, if I insert a [Value] into this table SQL Server will automatically insert the identity value and the current date SecondValue, but if I wanted to get these automated values out, I would have had to select them out manually afterwards.
This might not seem like much of a problem given these two values (Identity and date) however it might be much more complex constraints and default values.

So what I can do is use the OUTPUT clause like this:

INSERT INTO OutputEx Value)
OUTPUT inserted.[key]inserted.secondvalue         
VALUES  'a')  
 
This will provide me with a result of the identity value inserted into [key] and the date value inserted into SecondValue.
I can use this result in an outer query by nesting the insert in a sub-query, or I can insert it into another table directly like this:

DECLARE @T TABLE(ID INT[Date] DATETIME2)

INSERT INTO OutputEx Value)
OUTPUT inserted.[key]inserted.secondvalue  INTO @T
VALUES  'a')
 

 
My table variable @T will now hold the values inserted
This also works with the "deleted" table instead of the inserted and thus both in insert/update and deleted queries.
It is a useful technique.

Bookmark and Share DotnetKicks dotnetshoutout

Selecting latest order

by Allan Svelmøe Hansen October 31, 2009 12:15

A common problem to solve is to list, for example the current price for a product or latest order for a customer or similar.
It is a problem which quickly can seem complex, but once understanding the situation, then – as everything else – it is relative simple, and I’ll show how using both window functions (in this instance, RANK) from SQL Server 2005+ and using a sub-query in case working on an earlier version, or other databases.

I’ll take offset in the Northwind database, which I’ve installed on a SQL Server 2008.
It has an Order table with a foreign key to a Customer table; however for this the Order table is the only one of interest:
The Order table contains the following information:
SELECT [OrderID]
      
,
[CustomerID]
      
,
[EmployeeID]
      
,
[OrderDate]
      
,
[RequiredDate]
      
,
[ShippedDate]
      
,
[ShipVia]
      
,
[Freight]
      
,
[ShipName]
      
,
[ShipAddress]
      
,
[ShipCity]
      
,
[ShipRegion]
      
,
[ShipPostalCode]
      
,
[ShipCountry]
  
FROM [Northwind].[dbo].[Orders]
 


In this situation, we’ll get the latest order per customer, but the problem is the same as if you’d need the current price – the only thing which differs will be the tables and where clauses and so on.
Solving the issue using the RANK window function, and a common table expression, it would look like this:

;WITH CTE AS (
SELECT 
RANK() OVER (Partition BY T1.CustomerID ORDER BY OrderDate DESCAS OrderRank
,
T1.
*
FROM 
Orders T1
)
SELECT FROM CTE WHERE OrderRank 
1
ORDER BY CustomerID
   

What happens here is that we use RANK to give us a number partitioned (grouped) by CustomerID and sorted by OrderDate descending. This will give each row selected the number/rank that order is historically, from newest to earliest. That means each row which have the rank 1 will be the latest/newest order.
You can then expand on the joins inside/outside the common table expression to get information about customers or order details or what not.

It is also solvable without using the window function RANK, and would look for example something like this:

SELECT 
FROM 
Orders T1
INNER JOIN 
(
   
SELECT CustomerIDMAX(OrderDateAS 
OrderDate
   
FROM 
Orders T2
   
GROUP BY 
T2.CustomerID
T3 ON T3.OrderDate T1.OrderDate AND T1.CustomerID 
T3.CustomerID
ORDER BY T1.CustomerID
    

Here we utilize a sub-query where we select the CustomerID and the highest order date from Orders, which we then can join into Orders again with a self-join on customer id and order date.

Note that these are just examples. Many similar solutions exists, but they all follow the same methodology.
I’ve also not taken optimizaiton or anything like that into account. It was just to illustrate a solution to a common problem.

 

Bookmark and Share DotnetKicks dotnetshoutout

Using INNER JOIN/SELF JOIN to allow for smaller indexes.

by Allan Svelmøe Hansen May 15, 2009 16:26

In databases, I often need foreign keys in my tables, because I’ll want to use them to select content out from my tables. However this can often result in either bad index utilization in the selection or making additional indexes based on the foreign key(s) and the content I need to select out.
This in turn can result in ending up with “many” indexes and sometimes many big indexes.

However a method can be to make smaller indexes and use an INNER JOIN to join into the table an extra time.

I’m going to show the pattern with a relative simple example to illustrate, because it is about the pattern more so then the specifics layout, content and size of the table.
It is just a pattern/technique to keep in mind and have in the SQL toolbox.
Suppose you have two tables of a similar pattern to this:

Tables for self join example
(click for larger size)

I’ve let the SQL Server create my clustered indexes based on the primary key, which means I have a clustered index over my primary key(s).
When having to look up rows in JoinOne based on the foreign key, it’ll often look like this:

SELECT 
FROM 
JoinOne T1
WHERE FKOne <VALUE>
  

Because I’ve added no other indexes to the tables, I’ll get an Index Scan or Table Scan when running the above query, which - as we - know is not a good thing most of the time.
This usually leads to the creating of a second index which indexes over my foreign key.
However if the table is large, and if I’ll need to extract many/most of the columns (like in my example with SELECT *), it can mean I’ll have to make either a complete index or one with many included columns, just ordered by my foreign key first.

If I make a small index with my foreign key and my primary key second, I can use this to join into my table again, and then after that use my primary key.
This is an index to illustrate:

Index for self join example 
(click for larger size)

Then I can make the following query:

SELECT T2.
FROM 
JoinOne T1
INNER JOIN JoinOne T2 
ON
 
T1.PKOne 
T2.PKOne
 
AND T1.PKTwo 
T2.PKTwo
 
AND T1.PKThree 
T2.PKThree
WHERE T1.FKOne <VALUE>
  

When looking at its execution plan, it’ll show two index seeks instead of my previous index scan. This way I have a small index as possible, but maintain seeks in my execution.
Of course there are some considerations one need to take with this pattern. Firstly – if the tables are “small”, then the scan in itself might be alright, or the overhead of keeping a complete index for the foreign key is neglible.

However the advantage of the pattern is that I can have smaller indexes on (very) large tables, which means less overhead when inserting/updating – but still have only index seeks in my selections execution plans.
It is a useful technique in my opinion – when used at the right times, which of course is on a case by case evaluation as always.  

Bookmark and Share DotnetKicks dotnetshoutout

Filter parameters in a Stored Procedure

by Allan Svelmøe Hansen March 02, 2009 20:19

Something I see often is difficulty in using “filters” in a stored procedure used to select – for example – products based on various different restrictions. Often I see people using dynamically written SQL in the code-layer and then sent to the SQL Server and EXEC.

However a different method of doing this is to utilize the logic of the OR functionality in the Stored Procedure.

Suppose you have a table containing products. These products have – for example – a weight column, height, length and other such physical attributes. Now suppose you want to select all products from this table, but also give the option to filtering the select further – say below a specific weight and/or below a specific length and so on.
This is doable the following way:

  DECLARE @Weight INT = NULL
  
DECLARE @Length INT = 
NULL
  
DECLARE @Width INT = 
NULL
  
  
SELECT 

  
FROM 
Product
  
WHERE (@Weight IS NULL OR (ProductWeight <= @Weight
))
  AND (
@Length IS NULL OR (ProductLength <= @Length
))
  AND (
@Width IS NULL OR (ProductWidth <= @Width))
  

Note this is 2008 syntax, if using 2000 or 2005 remember you can’t assign value to a declaration in the same line.

Also if using it in a Stored Procedure, the declarations are usually input parameters – but just default them to (for example) NULL.

When running this, with the null assignment, you’ll get all rows out, because all rows will fulfill the first part of the OR logic.

However if changing one, or more, of the variables to an actual value, the SQL will filter based on those values, because – as the variables no longer hold the value “null”, the second part of the OR statement comes into play.
So suppose you have the following values instead:

  DECLARE @Weight INT = 10
  
DECLARE @Length INT = 
15
  
DECLARE @Width INT = 20
  

The select will only select out those product rows which have a Weight less than or equal to 10, a Length equal to or less than 15 and likewise with Width and 20.
Omit one value and you’ll not filter for that one either, so 

  DECLARE @Weight INT = NULL
  
DECLARE @Length INT = 
10
  
DECLARE @Width INT = NULL
  

Will select all products which have a length less than or equal to 10 – regardless of Weight and Width.

Of course, me using <= is basically irrelevant, as any logical operation can be performed within the second part of the OR statement.

An inherit disadvantage to this type of procedure over the more specialized ones are the difficulty in utilizing indexes. Due to the OR nature of the procedure, you’ll be unable to utilize the indexes optimal, so it is naturally a concern which you must consider when using such approach.

Bookmark and Share DotnetKicks dotnetshoutout

Difference in assignment using SELECT and SET

by Allan Svelmøe Hansen September 27, 2008 09:08

Basically – there are two ways (at least?) of selecting information into a variable from a table/select query.

You can either SELECT @Variable = COLUMN FROM TABLE or you can SET @Variable = (SELECT COLUMN FROM TABLE).

There is one big difference between the two one needs to be aware of. The first method will not set a new value to @Variable if there isn’t returned a row from the select statement – but the second one will.

This is easy to see by running a query like this where the SELECT will return no rows:

DECLARE 
@Value 
INT
DECLARE 
@i 
INT
SET 
@i 
0

WHILE 
@i 100 
BEGIN
   SELECT 
@Value MyID FROM Table_1 WHERE MyValue IS 
NULL
   
IF @Value IS 
NULL 
           
SET @Value 
@i
   
PRINT(@Value
)
   
SET @i @i 
1
END 

SET 
@i 
0
SET @Value 
0

Now the first impression will be that the values 0 to 99 will be printed out, but that is wrong. You’ll get 100 prints of 0. That is because the SELECT doesn’t return any rows so the assignment doesn’t happen, because it occurs within the SELECT statement.

Whereas if you change the query to this:

WHILE @i 100 BEGIN
   SET 
@Value (SELECT MyID FROM Table_1 WHERE MyValue IS 
NULL)
   
IF @Value IS 
NULL 
           
SET @Value 
@i
   
PRINT(@Value
)
   
SET @i @i 
1
END 

You’ll get the expected assignment of 0 to 99 when run, because the SELECT doesn't return a row, the SET will assign NULL. A big difference in the execution of the assignment.

Bookmark and Share DotnetKicks dotnetshoutout

Tags: ,

SQL

Powered by BlogEngine.NET 1.6.1.0
Theme by Mads Kristensen | Modified by Mooglegiant

About:
Allan Svelmøe Hansen

My real name is Allan Svelmøe Hansen.
I live in Denmark, where I work as a developer for hedal:kruse:brohus using SQL Server and the .NET framework since 2004.
My primary fields of expertise is back-end data integration, database design and optimization.


       View Allan Svelmøe Hansen's profile on LinkedIn     

Disclaimer

The opinions expressed herein are my own personal opinions and thoughts and does not represent my employers view in any way, nor are my results guaranteed for all situations.
Content is presented "as is", with no warranty.