I was faced with a problem where I had to do a price update based on some product replacement logic.

This meant that I needed to calculate an existing price, compare it to the replacement product's price, and write whichever was highest into the final structure.
Since this had to be done in an update query covering potentially thousands of records, the problem was how to get the calculation logic into the update query without resorting to cursors and without calling the calculation function multiple times per row.

A solution presented itself because I knew that I could make variable assignments in UPDATE statements and reuse those assignments, so I could wrap the calculation in a function, assign its result to a variable, and then use that variable in a CASE expression to decide. This meant I only had to call the calculation function once per row and could use the result for multiple things.

The technique is easy to illustrate using this simplified, made-up example:


--I'm using a unique identifier to mimic the result of my function.
DECLARE @testTable TABLE (idx INT, updatedValue UNIQUEIDENTIFIER)
--used in the update to assign a value to.
DECLARE @ui UNIQUEIDENTIFIER

--insert some test data
INSERT INTO @testTable ( idx, updatedValue )
VALUES ( 1, NULL ),
       ( 2, NEWID() ),
       ( 3, NULL )

--verify inserted data
SELECT * FROM @testTable AS TT

--now, do the update, by first assigning a new value to @ui
--from a function, and then use that value to compare to the content of @testTable
UPDATE @testTable
SET @ui = NEWID(),
    updatedValue = CASE
                       WHEN updatedValue IS NULL THEN @ui
                       ELSE updatedValue
                   END

--verify updated data
SELECT * FROM @testTable AS TT
--notice that rows idx 1 and 3 have received a new updatedValue,
--whereas idx 2 has retained its value.

It’s a technique you wouldn’t really expect from SQL, knowing how it handles set-based execution, but well – it seems to work.
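
Mapped back to the original price problem, the pattern would look roughly like this. Note this is a sketch only – the table, column and function names (dbo.Products, dbo.fnCalcExistingPrice, ReplacementPrice, FinalPrice) are hypothetical stand-ins for the real logic; the shape of the query is the point:

--A sketch only: the names are hypothetical, the shape is what matters.
DECLARE @calcPrice MONEY

UPDATE P
SET @calcPrice = dbo.fnCalcExistingPrice(P.ProductID), --call the calculation once per row
    FinalPrice = CASE
                     WHEN @calcPrice > P.ReplacementPrice THEN @calcPrice
                     ELSE P.ReplacementPrice
                 END
FROM dbo.Products AS P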

I have made a script which shrinks all log files on a given database server.

It was needed because we had a (development) server that was running low on space at regular intervals, and the main culprit was growing/large log files from the ever-changing number of databases on the server.
Now, as always – remember there are consequences for doing this; it will mess with your backup plan if you rely on transaction log backups, so use it after full backups and… well, use your common sense :)
The script checks for offline databases and ignores those.

DECLARE @dbfile VARCHAR(255)
DECLARE @isOnline BIT
DECLARE @isfull BIT

--skip databases owned by sa (owner_sid 0x01), which includes the system databases
DECLARE db_cursor CURSOR FOR
SELECT name, state FROM sys.databases AS D WHERE owner_sid <> 0x01

OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @dbfile, @isOnline

WHILE @@FETCH_STATUS = 0 BEGIN
    PRINT('Working on: ' + @dbfile + '; Online: ' + LTRIM(STR(@isOnline)))

    --state 0 means the database is ONLINE
    IF @isOnline = 0 BEGIN
        --recovery_model 1 means FULL recovery
        SELECT @isfull = CASE WHEN recovery_model = 1 THEN 1 ELSE 0 END
        FROM sys.databases AS D
        WHERE name = @dbfile

        IF @isfull = 1 BEGIN
            PRINT('Setting DB: ' + @dbfile + ' to simple')
            EXEC('ALTER DATABASE [' + @dbfile + '] SET RECOVERY SIMPLE')
        END ELSE BEGIN
            PRINT('DB: ' + @dbfile + ' was already simple, skip setting to simple')
        END

        --shrink every log file in the database
        EXEC
        (
        'USE [' + @dbfile + ']
        DECLARE file_cursor CURSOR FOR
        SELECT name FROM sys.database_files AS DF WHERE type_desc = ''LOG''
        DECLARE @file VARCHAR(255)
        OPEN file_cursor
        FETCH NEXT FROM file_cursor INTO @file
        WHILE @@FETCH_STATUS = 0 BEGIN
            PRINT(''Shrinking file: '' + @file)
            DBCC SHRINKFILE(@file)
            FETCH NEXT FROM file_cursor INTO @file
        END
        CLOSE file_cursor
        DEALLOCATE file_cursor
        '
        )

        --restore the recovery model if it was FULL to begin with
        IF @isfull = 1 BEGIN
            PRINT('Setting DB: ' + @dbfile + ' to full')
            EXEC('ALTER DATABASE [' + @dbfile + '] SET RECOVERY FULL')
        END ELSE BEGIN
            PRINT('DB: ' + @dbfile + ' was simple, skip setting to full')
        END
    END ELSE BEGIN
        PRINT('Database is offline, skip')
    END

    PRINT(CHAR(13) + CHAR(10))

    FETCH NEXT FROM db_cursor INTO @dbfile, @isOnline
END
CLOSE db_cursor
DEALLOCATE db_cursor

		

Handling namespaces in SQL XML Bulk Load can be a headache, as there’s not really much documentation about it. However, while laborious, it basically follows the normal XML Schema structure.
I won’t go deep into the SQL XML Bulk Load syntax in this blog post; if you’re looking for that, I’d advise you to read my SQL XML Bulk Load – basic XSD syntax or SQL XML Bulk Load – more XSD syntax blog posts before this one.

I’ve made this simplified example to illustrate. Suppose you have an XML file something akin to this:

<?xml version="1.0" encoding="utf-8"?>
<Catalogue xmlns:sdt="urn:oasis:names:specification:ubl:schema:xsd:SpecializedDatatypes-2"
           xmlns="urn:oasis:names:specification:ubl:schema:xsd:Catalogue-2"
           xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
           xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Catalogue-2 UBL-Catalogue-2.0.xsd">
  <cbc:UBLVersionID>2.0</cbc:UBLVersionID>
   <cac:ValidityPeriod>
    <cbc:StartDate>2011-03-02</cbc:StartDate>
  </cac:ValidityPeriod>
   <cac:CatalogueLine>
    <cbc:ID>11002</cbc:ID>    
  </cac:CatalogueLine>
   <cac:CatalogueLine>
    <cbc:ID>11003</cbc:ID>
  </cac:CatalogueLine>
</Catalogue>

I have two custom namespaces, and a mixture of elements within each namespace. This is simplified, but once the logic is known, it is just a matter of expanding it with more namespaces and more mapping.

First we need to make the base XML schema/XSD file. A way to handle namespaces in the XSD file is to use import statements and split each namespace’s elements into their own separate file. That technique works for SQL XML Bulk Load as well.
So we’ll make the base XSD like this:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"            
           xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
           xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sql="urn:schemas-microsoft-com:mapping-schema">

  <xs:annotation>
    <xs:appinfo>
      <sql:relationship name="SomeName"
                        parent="Import"
                        parent-key="PK"
                        child="Import2"
                        child-key="FK" />
    </xs:appinfo>
  </xs:annotation>

  <xs:import namespace="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" schemaLocation="cbc.xsd" />
  <xs:import namespace="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" schemaLocation="cac.xsd" />

  <xs:element name="Catalogue" sql:relation="Import">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="cbc:UBLVersionID" />
        <xs:element ref="cac:ValidityPeriod" />
        <xs:element maxOccurs="unbounded" ref="cac:CatalogueLine" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Notice that each of the 3 elements simply has a reference to another element. So the XML tag cbc:UBLVersionID is in the XSD simply an element with ref="cbc:UBLVersionID", just as the cac:ValidityPeriod element is also just a reference.
The key here, however, is that I’ve imported two namespaces with a schema location:

 <xs:import namespace="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" schemaLocation="cbc.xsd" />
<xs:import namespace="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" schemaLocation="cac.xsd" />

These lines will tell the base schema file where to look for the specific definitions for the namespaces.

Now, I can make a specific schema file for the cbc namespace and the cac namespace.
The cac one looks like this:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:tns="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
                attributeFormDefault="unqualified" elementFormDefault="qualified"
                targetNamespace="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
           xmlns:sql="urn:schemas-microsoft-com:mapping-schema"
           xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:import namespace="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" schemaLocation="cbc.xsd" />

  <xs:element name="ValidityPeriod" sql:is-constant="1" >
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="cbc:StartDate" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="CatalogueLine" sql:relationship="SomeName" sql:relation="Import2">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="cbc:ID"  />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Here the same principle of using an import statement to map a namespace

<xs:import namespace="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" schemaLocation="cbc.xsd" />

is used, similar to the base file.

The cac file is used to define the contents of all cac namespace elements (the naming of the files is irrelevant; I simply named them this way to help myself keep track). And where needed in this cac file, a reference is made to an element in the cbc file (in this case cbc:StartDate and cbc:ID).
The cbc file will then qualify all cbc elements:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:tns="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
           attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
  <xs:element name="UBLVersionID" type="xs:decimal" sql:field="VersionID" />
  <xs:element name="StartDate" type="xs:date" sql:field="DateX" />
  <xs:element name="ID" type="xs:unsignedShort" sql:field="LineID" />
</xs:schema>

As can be seen, sql:field, sql:relationship and sql:relation are all used across the 3 files.
In the base file, I’ve defined a relationship, and I’m using that relationship in the cac file around a cbc element.
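
For reference, the sql annotations above imply target tables along these lines. This is a sketch only – the table and column names come from sql:relation, sql:field and the relationship definition, but the data types and key setup are assumptions:

--A sketch of the tables implied by the mapping; types and keys are assumptions.
CREATE TABLE Import
(
    PK INT IDENTITY(1,1) PRIMARY KEY, --parent-key used by the "SomeName" relationship
    VersionID DECIMAL(9, 1),          --mapped from cbc:UBLVersionID via sql:field
    DateX DATE                        --mapped from cbc:StartDate via sql:field
)

CREATE TABLE Import2
(
    FK INT,                           --child-key, filled in through the relationship
    LineID INT                        --mapped from cbc:ID via sql:field
)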

Running this through SQL XML Bulk Load, I get my data imported into the two tables.

This was just a simplified example, but I hope it helps to show how the syntax is built up in an XSD and then spiced up with some SQL mapping.

For more insight into SQL XML Bulk Load and the syntax, you can check out my other posts on the subject.

Mar 02 2011

When using COUNT with LEFT JOINs, it is worth noting that it does not count NULL values when counting over something other than *.
This is useful if you wish to count how many of the rows fulfill some sort of join clause.

Take the following example.
I’ve created two tables consisting of a unique identifier and a text column. I’ve added test data, and for every two rows I placed into Table_1, I placed one row into Table_2. Then I made a left join query which looks like:

SELECT *
FROM Table_1 AS T
LEFT JOIN Table_2 AS T2 ON T.PK = T2.FK

Which provides an output like this:

[Image: LEFT JOIN example for using COUNT]

Now to illustrate the difference in count, run a query like this:

SELECT COUNT(*) AS AllRows, COUNT(PK) AS RowsFromLeft, COUNT(FK) AS RowsFromRight
FROM Table_1 AS T
LEFT JOIN Table_2 AS T2 ON T.PK = T2.FK

Which provides the following output:

[Image: COUNT result for LEFT JOIN]
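
If you want to reproduce this, a setup along these lines should give the same picture. This is a sketch only – the key columns match the queries above, but the text column and the test data are assumptions:

--A sketch of the test setup: two rows in Table_1 for every row in Table_2.
CREATE TABLE Table_1 (PK UNIQUEIDENTIFIER DEFAULT NEWID(), SomeText VARCHAR(50))
CREATE TABLE Table_2 (FK UNIQUEIDENTIFIER, SomeText VARCHAR(50))

DECLARE @id UNIQUEIDENTIFIER = NEWID()
INSERT INTO Table_1 (PK, SomeText) VALUES (@id, 'row 1'), (NEWID(), 'row 2')
INSERT INTO Table_2 (FK, SomeText) VALUES (@id, 'child of row 1')

--COUNT(*) counts every joined row; COUNT(FK) skips the NULLs produced by the LEFT JOIN.
SELECT COUNT(*) AS AllRows, COUNT(PK) AS RowsFromLeft, COUNT(FK) AS RowsFromRight
FROM Table_1 AS T
LEFT JOIN Table_2 AS T2 ON T.PK = T2.FK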

I’ve seen many more or less creative solutions in my time for counting rows from the right table in a LEFT JOIN, because few people seem to read the documentation on something as simple as COUNT.
However, it is often well worth taking a peek.

I ran into the error: "The select list for the INSERT statement contains fewer items than the insert list. The number of SELECT values must match the number of INSERT columns."
Well I’ve run into it often.

Normally this simply means you have specified fewer values in the SELECT than there are columns in the INSERT column list. For example:

INSERT INTO Table1
( ID, [Column], Column1 )
SELECT 1,
'Value'

Now, that’s easily rectified by adding the value you need or removing the column you do not need from the INSERT list.

However, I also noticed that in my situation I had all the values needed, but I was lacking a comma – like this:

INSERT INTO Table1
( ID, [Column], Column1 )
SELECT 1,
'Value1'
'Value2'

Notice the lack of comma between Value1 and Value2.

This gives the mentioned error and not a syntax error in the SELECT statement. The reason is that SQL Server treats 'Value2' as a column alias for 'Value1' – a string literal following an expression without a comma is read as an alias – and therefore the SELECT only produces 2 columns.
Only one such alias can follow an expression, though. So if I were to forget the first comma as well (SELECT 1 'Value1' 'Value2'), I would get an "Incorrect syntax near 'Value2'" error message alerting me to the issue. But forget one comma between 2 columns and you get no such information.
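
A quick way to see what is actually happening is to run the SELECT on its own:

--The missing comma turns 'Value2' into a column alias for 'Value1',
--so this returns two columns: an unnamed column containing 1,
--and a column named Value2 containing the text 'Value1'.
SELECT 1,
'Value1'
'Value2'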

I didn’t see a mention of this in the MSDN documentation.

I’m migrating my blog system to WordPress, so hopefully the blog will be up in full force over the weekend.

Using @@ROWCOUNT is nice at times to get the number of rows affected by the last SQL statement. However, sometimes you get an unexpected result.
One such situation is with the PRINT statement, which is often used as a debugging statement in a query.
If we look at this SQL statement:

SELECT 1
UNION
SELECT 2

SELECT @@ROWCOUNT

Then @@ROWCOUNT will naturally return the value 2.
However, if we insert a PRINT statement just before the @@ROWCOUNT, like this:

SELECT 1
UNION
SELECT 2

PRINT 'test'

SELECT @@ROWCOUNT

@@ROWCOUNT is now suddenly 0.

That means the PRINT statement resets @@ROWCOUNT, despite it being just a printed message.
Something to be aware of if you are relying on @@ROWCOUNT.
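
A minimal sketch of a workaround: capture @@ROWCOUNT into a variable immediately after the statement you care about, before any PRINT gets in the way:

DECLARE @rows INT

SELECT 1
UNION
SELECT 2

--grab the value before anything else can reset it
SET @rows = @@ROWCOUNT

PRINT 'test'

SELECT @rows --still 2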

Normally, when you have rows from one table that you want to move over into another, you’ll have to run both an update query, to get your existing rows updated with the new values, and an insert query, to get the new rows over to your target. That means you have to write two queries. With MERGE, you only need one, as it performs both the update and the insert.
So let’s take a look at it.

First, I’ll create 2 tables with dummy data; a tblSource and a tblTarget.
The scripts for creating the tables and data can be found here merge_setup_20100521.sql (3.58 kb), but basically it’s just a target and source table with an ID, ValA, ValB and ValC with a clustered index on ID and some dummy data.
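The script isn’t reproduced here, but the shape of the tables is roughly this – a sketch only; the exact types and the dummy data are in the download, and the merge tables (TblTargetMerge/TblSourceMerge) follow the same shape:

--A sketch of the table shape used in the comparison; actual types and data are in the linked script.
CREATE TABLE dbo.tblTargetInUp
(
    ID INT NOT NULL,
    ValA INT,
    ValB INT,
    ValC INT
)
CREATE CLUSTERED INDEX IX_tblTargetInUp_ID ON dbo.tblTargetInUp (ID)

CREATE TABLE dbo.tblSourceInUp
(
    ID INT NOT NULL,
    ValA INT,
    ValB INT,
    ValC INT
)
CREATE CLUSTERED INDEX IX_tblSourceInUp_ID ON dbo.tblSourceInUp (ID)
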
So to do the update/insert it would look like this:
UPDATE MyTarget SET
    MyTarget.ValA = MySource.ValA,
    MyTarget.ValB = MySource.ValB,
    MyTarget.ValC = MySource.ValC
FROM dbo.tblTargetInUp AS MyTarget
INNER JOIN dbo.tblSourceInUp AS MySource ON MyTarget.ID = MySource.ID

INSERT INTO dbo.tblTargetInUp
SELECT ID, ValA, ValB, ValC
FROM dbo.tblSourceInUp AS MySource
WHERE NOT EXISTS (SELECT 1 FROM dbo.tblTargetInUp AS MyTarget WHERE MyTarget.ID = MySource.ID)

If we look at the execution plans, I get an estimated cost of 9.1754 for the update and 2.48725 for the insert, meaning a combined cost of 11.66265 for 33,333 rows updated and 33,334 rows inserted, into a target of 66,666 rows.

Now, do remember that the estimated cost is just a number for how the query runs in my environment; it can’t be taken as an absolute number and transferred to another system – I’m only interested in the relative comparison with the MERGE. More data, more indexes, more variations will all affect the actual numbers. Also, looking at the execution plan, it is clear that we fire two queries, meaning that all the overhead which goes into running one query is doubled. But that was how you had to do merges in the past.

Now, with the merge syntax we can do it like this:
MERGE dbo.TblTargetMerge AS MyTarget
USING
(
    SELECT * FROM dbo.TblSourceMerge AS TS
) AS MySource
ON MyTarget.ID = MySource.ID
WHEN MATCHED THEN
    UPDATE SET
        MyTarget.ValA = MySource.ValA,
        MyTarget.ValB = MySource.ValB,
        MyTarget.ValC = MySource.ValC
WHEN NOT MATCHED THEN
    INSERT (ID, ValA, ValB, ValC)
    VALUES (MySource.ID, MySource.ValA, MySource.ValB, MySource.ValC)
;

Note that the syntax takes both the update and the insert, in the WHEN MATCHED and WHEN NOT MATCHED clauses.
For the complete overview of the syntax, I’ll refer you to the documentation by Microsoft: MERGE (Transact-SQL).
But basically – you MERGE into a table using a source, and then define the ON clause (as you would a join), and then specify the WHEN MATCHED and the WHEN NOT MATCHED clauses.

One thing I’ll expand on myself, though, is the OUTPUT clause, which can also be coupled onto the MERGE. I mentioned the OUTPUT clause recently.
The important thing is that you can add $action to the OUTPUT clause and get information about whether the data was updated or inserted, meaning whether the row was matched or not matched.
Like this:

MERGE dbo.TblTargetMerge AS MyTarget USING
<…snipped for being brief…>
OUTPUT $action,
rest_of_select_here
;

Nifty.
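
For completeness, a fuller sketch of what that could look like against the same example tables (the columns in the OUTPUT list are chosen purely for illustration):

--A sketch only: the same MERGE as above, with $action added to the OUTPUT clause.
MERGE dbo.TblTargetMerge AS MyTarget
USING (SELECT * FROM dbo.TblSourceMerge AS TS) AS MySource
ON MyTarget.ID = MySource.ID
WHEN MATCHED THEN
    UPDATE SET MyTarget.ValA = MySource.ValA,
               MyTarget.ValB = MySource.ValB,
               MyTarget.ValC = MySource.ValC
WHEN NOT MATCHED THEN
    INSERT (ID, ValA, ValB, ValC)
    VALUES (MySource.ID, MySource.ValA, MySource.ValB, MySource.ValC)
OUTPUT $action, inserted.ID, inserted.ValA --$action returns 'INSERT' or 'UPDATE' per row
;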
Anyways – once we’ve built this query, we can look at the execution plan, and here we clearly see it is handled as one query.
And in my case, the estimated cost is 8.407 for the same number of rows as above, meaning we’ve saved about 27.9% just by changing syntax.

Now, the structure I used for the comparison is very simple – a straightforward matching of ID to ID and then just inserting/overwriting everything – so the actual result may vary (naturally). However, the MERGE syntax does appear to be faster, with the added bonus of keeping the operation combined into one statement rather than divided into two different queries.

I must admit, I do like the MERGE syntax myself, once I learned to read and write it.

May 20 2010

I’ve seen many uses, and some misuses, of the STR function over time.

One of the more "easy to spot" problems is that the STR function returns the char datatype, meaning it will pad the result with spaces and people need to trim the result.

Today I saw an even worse issue. The STR function takes a float argument. This means that if you feed it a string (yes, I’ve seen it done), it will implicitly convert that string to a float if it can. That in itself can cause an error, but even worse, it can cause a difficult-to-find bug.
Suppose you have STR('0001') and run that; after trimming you’ll end up with the result '1'. Why?
Because '0001' is converted to a float, which – as we all know – is 1, which is then cast back to a string.
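
A quick illustration (assuming the default STR length of 10):

SELECT STR('0001')        --'         1' : the string is implicitly converted to the float 1, then padded with leading spaces
SELECT LTRIM(STR('0001')) --'1' : the leading zeros are gone after the round trip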

May 19 2010

I’ve decided to change my previous domain “www.sqlstuff.dk” to this new one, “www.execsql.com” – I apologize for any inconvenience this might cause, but it is better done now than later :)

All access to sqlstuff should get a 301 redirect to the new www.execsql.com domain and hopefully the transition should be smooth.

So welcome to EXEC(@sql) and bye bye to SQLStuff.dk :)