Selecting latest order

by Allan Svelmøe Hansen October 31, 2009 12:15

A common problem to solve is to list, for example the current price for a product or latest order for a customer or similar.
It is a problem which quickly can seem complex, but once understanding the situation, then – as everything else – it is relative simple, and I’ll show how using both window functions (in this instance, RANK) from SQL Server 2005+ and using a sub-query in case working on an earlier version, or other databases.

I’ll take offset in the Northwind database, which I’ve installed on a SQL Server 2008.
It has an Order table with a foreign key to a Customer table; however for this the Order table is the only one of interest:
The Order table contains the following information:
SELECT [OrderID]
      
,
[CustomerID]
      
,
[EmployeeID]
      
,
[OrderDate]
      
,
[RequiredDate]
      
,
[ShippedDate]
      
,
[ShipVia]
      
,
[Freight]
      
,
[ShipName]
      
,
[ShipAddress]
      
,
[ShipCity]
      
,
[ShipRegion]
      
,
[ShipPostalCode]
      
,
[ShipCountry]
  
FROM [Northwind].[dbo].[Orders]
 


In this situation, we’ll get the latest order per customer, but the problem is the same as if you’d need the current price – the only thing which differs will be the tables and where clauses and so on.
Solving the issue using the RANK window function, and a common table expression, it would look like this:

;WITH CTE AS (
SELECT 
RANK() OVER (Partition BY T1.CustomerID ORDER BY OrderDate DESCAS OrderRank
,
T1.
*
FROM 
Orders T1
)
SELECT FROM CTE WHERE OrderRank 
1
ORDER BY CustomerID
   

What happens here is that we use RANK to give us a number partitioned (grouped) by CustomerID and sorted by OrderDate descending. This will give each row selected the number/rank that order is historically, from newest to earliest. That means each row which have the rank 1 will be the latest/newest order.
You can then expand on the joins inside/outside the common table expression to get information about customers or order details or what not.

It is also solvable without using the window function RANK, and would look for example something like this:

SELECT 
FROM 
Orders T1
INNER JOIN 
(
   
SELECT CustomerIDMAX(OrderDateAS 
OrderDate
   
FROM 
Orders T2
   
GROUP BY 
T2.CustomerID
T3 ON T3.OrderDate T1.OrderDate AND T1.CustomerID 
T3.CustomerID
ORDER BY T1.CustomerID
    

Here we utilize a sub-query where we select the CustomerID and the highest order date from Orders, which we then can join into Orders again with a self-join on customer id and order date.

Note that these are just examples. Many similar solutions exists, but they all follow the same methodology.
I’ve also not taken optimizaiton or anything like that into account. It was just to illustrate a solution to a common problem.

 

Bookmark and Share DotnetKicks dotnetshoutout

The IsNumeric Trap

by Allan Svelmøe Hansen September 04, 2009 08:13

As mentioned – casually – in the msdn documentation the IsNumeric function will return 1 for some values which aren’t actually numbers.  The currency sign $ is mentioned as well are plus/minus.
Period and comma will also be returning 1.

This essentially means that you can’t be sure that it is actually a number/numeric value which passes the IsNumeric check.
And least of all, you can’t be assured of the semantic value returned is what you expect.
This can – if unaware and not paying attention – be a problem when dealing with number values for countries which do not follow the same period/decimal convention as the US.

Also – “funnily” enough due to the caveats in the IsNumeric, you can’t actually be assured that it can be converted to a Numeric data type. (It can normally always be converted into a Money, but then the name is misleading :) )
To illustrate notice for example the following snippet:

DECLARE @VAR VARCHAR(20
SET @VAR '€,.,,'   
  

This will pass a “IsNumeric” check with the result 1, but it can’t be cast into a numeric/decimal. It can be cast into money but will return 0.00
And that is despite it is not in any form an actual numeric value.
There are some semantic checks built into the IsNumeric such as you can’t have values infront of the currency sign, so

SET @VAR '123$' 

would fail a “IsNumeric” check. Also some checks on the numbers of “plus/minus” signs and so on......
So when using IsNumeric, it is time to be careful and not just accept a success for actual success.

Bookmark and Share DotnetKicks dotnetshoutout

Tags:

SQL

Local variables and batch scope

by Allan Svelmøe Hansen July 14, 2009 10:36

I was recently debugging a problem in a Stored Procedure where the wrong values where inserted into a table, in the middle of a long running cursor operation.
The culprit turned out to be the scope of a variable in SQL versus the scope most (object orientated) developers are used to in the language they usually code in.

The problem can be illustrated with this syntax:
DECLARE @Counter INT = 1

WHILE @Counter 
BEGIN
   DECLARE 
@Var VARCHAR(MAX

   
IF @Counter 

       
SET @Var 
'SomeValue'
   
SELECT 
@Var
   
SET @Counter +
1
END

The way scope works for variables in T-SQL is "batch scope", whereas most OO developers are used to the scope being limited within the innermost context.

The way it functions in SQL is that in the first iteration of the while loop the variable "@Var" gets declared but not assigned a value.
Thus, when selecting it out, the result will be NULL.

However in the second iteration, the variable is already declared once, so SQL Server will not "recreate" it, because the scope is "batch scope".
So here it’ll just be assigned the value "SomeValue".

The third iteration is like the second. The variable is already declared, and because it in the previous iteration had the value set to "SomeValue", that value will persist in this iteration, and all subsequent iterations until another value is set.

This is a source of errors if not careful, because the object orientated developer will – if not aware of this – most likely read the above as only iteration 2 will contain the value "SomeValue", whereas all other iterations will contain NULL.
But because of the batch scope, it is not so.

As the result from the query shows:

Bookmark and Share DotnetKicks dotnetshoutout

Tags: ,

SQL

Sorting integers in a string using XQuery

by Allan Svelmøe Hansen June 29, 2009 18:13

I stumbled across the task of sorting numbers present in a comma separated string, based on their numerical value in SQL.
Curious as always I thought it would be possible to solve such a problem using XML and XQuery in SQL and indeed it was.

Given a numerical string, then in SQL Server 2005 and up, you can do something like this:
DECLARE @X XML ''

DECLARE @STR VARCHAR(255CAST(@X.query(
'for $i in (11, 9, 10)
                  order by $i
                  return fn:concat(xs:string($i), ",")'
AS VARCHAR(255
))
                  
SELECT LEFT(@STRLEN(@STR) -1)
  
(Remember, in SQL 2005, you need to put assignment on another line, this is 2008 syntax)

The output from this query will be the value: '9, 10, 11'

Now, it might not be special or terrible useful as such – but the point of the post is also much more that untraditional methods for solving problems exists, and one can think outside the box and solve many problems by doing so.

A problem with this is that if you want the input created dynamically - you either need to build the entire segment within a string and use EXEC, or perhaps you can use the sql:column and sql:variable to help you along.

Bookmark and Share DotnetKicks dotnetshoutout

Tags: , ,

SQL | XML

Using INNER JOIN/SELF JOIN to allow for smaller indexes.

by Allan Svelmøe Hansen May 15, 2009 16:26

In databases, I often need foreign keys in my tables, because I’ll want to use them to select content out from my tables. However this can often result in either bad index utilization in the selection or making additional indexes based on the foreign key(s) and the content I need to select out.
This in turn can result in ending up with “many” indexes and sometimes many big indexes.

However a method can be to make smaller indexes and use an INNER JOIN to join into the table an extra time.

I’m going to show the pattern with a relative simple example to illustrate, because it is about the pattern more so then the specifics layout, content and size of the table.
It is just a pattern/technique to keep in mind and have in the SQL toolbox.
Suppose you have two tables of a similar pattern to this:

Tables for self join example
(click for larger size)

I’ve let the SQL Server create my clustered indexes based on the primary key, which means I have a clustered index over my primary key(s).
When having to look up rows in JoinOne based on the foreign key, it’ll often look like this:

SELECT 
FROM 
JoinOne T1
WHERE FKOne <VALUE>
  

Because I’ve added no other indexes to the tables, I’ll get an Index Scan or Table Scan when running the above query, which - as we - know is not a good thing most of the time.
This usually leads to the creating of a second index which indexes over my foreign key.
However if the table is large, and if I’ll need to extract many/most of the columns (like in my example with SELECT *), it can mean I’ll have to make either a complete index or one with many included columns, just ordered by my foreign key first.

If I make a small index with my foreign key and my primary key second, I can use this to join into my table again, and then after that use my primary key.
This is an index to illustrate:

Index for self join example 
(click for larger size)

Then I can make the following query:

SELECT T2.
FROM 
JoinOne T1
INNER JOIN JoinOne T2 
ON
 
T1.PKOne 
T2.PKOne
 
AND T1.PKTwo 
T2.PKTwo
 
AND T1.PKThree 
T2.PKThree
WHERE T1.FKOne <VALUE>
  

When looking at its execution plan, it’ll show two index seeks instead of my previous index scan. This way I have a small index as possible, but maintain seeks in my execution.
Of course there are some considerations one need to take with this pattern. Firstly – if the tables are “small”, then the scan in itself might be alright, or the overhead of keeping a complete index for the foreign key is neglible.

However the advantage of the pattern is that I can have smaller indexes on (very) large tables, which means less overhead when inserting/updating – but still have only index seeks in my selections execution plans.
It is a useful technique in my opinion – when used at the right times, which of course is on a case by case evaluation as always.  

Bookmark and Share DotnetKicks dotnetshoutout

Powered by BlogEngine.NET 1.6.1.0
Theme by Mads Kristensen | Modified by Mooglegiant

About:
Allan Svelmøe Hansen

My real name is Allan Svelmøe Hansen.
I live in Denmark, where I work as a developer for hedal:kruse:brohus using SQL Server and the .NET framework since 2004.
My primary fields of expertise is back-end data integration, database design and optimization.


       View Allan Svelmøe Hansen's profile on LinkedIn     

Disclaimer

The opinions expressed herein are my own personal opinions and thoughts and does not represent my employers view in any way, nor are my results guaranteed for all situations.
Content is presented "as is", with no warranty.