SQL Good Idea/Bad Idea

30 October 2009

The following is a SQL function that splits the string provided as an argument into a table of values on the comma character:

CREATE FUNCTION Split
    (@Argument NVARCHAR(MAX))
RETURNS
    @List TABLE (Item NVARCHAR(20))
AS
BEGIN
    -- While a comma still exists in the argument
    WHILE (CHARINDEX(',', @Argument) > 0)
    BEGIN
        -- Add substring from start of argument to comma to result list (trim string to remove spaces)
        INSERT INTO @List
            VALUES (RTRIM(LTRIM(SUBSTRING(@Argument, 1, CHARINDEX(',', @Argument) - 1))))

        -- Remove the item from argument
        SET @Argument = SUBSTRING(@Argument, CHARINDEX(',', @Argument) + 1, LEN(@Argument))
    END

    -- Add last item to the list
    INSERT INTO @List VALUES (RTRIM(LTRIM(@Argument)))

    RETURN
END
GO

So if passed the argument ‘1, 2, 3, 4′, the function will return a table containing ‘1′, ‘2′, ‘3′ and ‘4′. The reason I wrote the function was because I had to wrap a set of queries in SQL stored procedures to provide an interface. The queries were pretty complex but for this example, I will only use a single table, let’s say a table containing employee information:

CREATE TABLE Employee
(
    ID INT PRIMARY KEY,
    Name NVARCHAR(200),
    Salary INT
)
GO

We need to wrap a query that returns the salary for a given list of employees, the list being provided as a comma-separated string of IDs. There are multiple ways to do this, I will just give a good idea and a bad idea.

Good Idea

CREATE PROCEDURE GetSalaries
    (@Employee_IDs NVARCHAR(MAX))
AS
    SELECT ID, Salary
        FROM Employee
        WHERE ID IN (SELECT CAST (Item AS INT) FROM Split(@Employee_IDs))
GO

Using the initial function, the input is split into individual IDs and cast as INT. ID and Salary are extracted from the table with the condition that ID is in the list. So for the sample data

INSERT INTO Employee VALUES (1, 'John Doe', 10000)
INSERT INTO Employee VALUES (2, 'Jane Doe', 20000)
INSERT INTO Employee VALUES (3, 'John Doe the 2nd', 30000)
GO

Calling

GetSalaries '1, 2, 4'

should return

1 10000
2 20000

(as ID 4 doesn’t exist in the table and ID 3 is not in the argument).

Bad Idea

CREATE PROCEDURE GetSalaries
    (@Employee_IDs NVARCHAR(MAX))
AS
    EXEC ('SELECT ID, Salary FROM Employee WHERE ID IN (' + @Employee_IDs + ')')
GO

The above procedure, using dynamic SQL, should behave similar to the previous procedure:

GetSalaries '1, 2, 4'

should still return

1 10000
2 20000

It’s even shorter. And it doesn’t require a Split function to parse the input. So why I call this a bad idea? The stored procedure will be provided as an interface, meaning the input will be beyond our control. And input could also be

GetSalaries '1); DROP TABLE Employee --'

Run it and see what happens ;)


SSIS Tips and Tricks

11 February 2009

Working on an SSIS package today I ran into two problems:

Cannot fetch a row from OLE DB provider “BULK” for linked server

Although a set of possible solutions is provided on the MSDN forums here, none of the ones I tried worked in my case. It seems that this error may also come up if you are using an OLE DB source as one of the sources of the transformation and a SQL Server destination.

I managed to fix the problem by using an OLE DB destination instead of a SQL Server destination. No big difference between the two, but it seems SSIS raises this error for SQL Server destination.

Class not registered (when using an Excel source)

If you are using an SSIS transformation with an Excel source, the above error occurs when trying to run in 64 bit mode. This, I found out, is because the Jet engine (the database engine behind Excel, Access etc.) doesn’t have any 64 bit drivers. The Visual Studio designer runs in 32 bit mode, so there will be no problems at design time but when running, if the machine is a 64 bit one, you will get this error.

To correct this, disable the 64 bit runtime for your project. Go to Project -> Properties, select Debugging and set Run64BitRuntime to False.


SQL DML Triggers

17 December 2008

In this post I will talk about DML (Data Manipulation Language) triggers. There is another kind of triggers, DDL (Data Definition Language) triggers, but I won’t cover them here. As always, you can get more detailed information on MSDN.

SQL Server triggers are stored procedures that execute when either an Insert, an Update or a Delete takes place on a table or a view. Triggers can fire before the actual action takes place (FOR triggers), instead of the action (INSTEAD OF triggers) or after the action (AFTER trigger). For exemple, on a table Example_Table, a trigger can be set up as

CREATE TRIGGER tr_Example_Table_IUD ON Example_Table
FOR INSERT, UPDATE, DELETE
AS
    PRINT 'Trigger fired'

Now whenever one of the actions is applied on the table, SQL Server will output ‘Trigger fired’. Whenever a trigger is fiered, two temporary tables are made available: inserted and deleted. The inserted table contains all rows that will/are/were added to the table and the deleted table contains all rows that will/are/were removed from the table. An explicit update table doesn’t exist because an update is considered to be a delete followed by an insert, so the deleted table contains the old rows and the inserted table contains the modified rows.

An great use of triggers is to enable insert/update/delete capabilities on a view. If we have, for example, a view that joins two tables – Example_View, which is setup as SELECT a.ID, a.A_Column, b.B_Column FROM Example_Table_A a JOIN Example_Table_B b ON a.ID = b.ID, we won’t be able to insert data into it directly. But this can be done with a trigger.

CREATE TRIGGER tr_Example_View_I ON Example_View
INSTEAD OF INSERT
AS
BEGIN
    INSERT INTO Example_Table_A (ID, A_Column)

        SELECT ID, A_Column
            FROM inserted
    INSERT INTO Example_Table_B (ID, B_Column)
        SELECT ID, B_Column
            FROM inserted
END

Another example would be always storing the last modified time for every row of a table. If table Example_Table has an ID primary key and a Date_Modified column, the following trigger can be set:

CREATE TRIGGER tr_Example_Table_U ON Example_Table
AFTER UPDATE
AS
    UPDATE tr_Example_Table
        SET Date_Modified = getdate()
        WHERE ID IN (SELECT ID FROM inserted)

Note that no modifications can be applied on the inserted and deleted tables themselves, so the update cannot be made on the inserted table in a FOR trigger.


SQL Stuff

10 December 2008

Am reușit să fac o grămadă de chestii cu SQL Server, care îmi este din ce în ce mai drag. Din păcate timpul nu îmi permite să explic tot ce aș vrea. Dar promit să scriu câte ceva când voi avea un program mai lejer. În domeniul acesta sunt convins că nu vin cu mari descoperiri – majoritatea lucrurilor le învăț la rândul meu de pe MSDN, dar sunt niște chestii atât de drăguțe.

Așa că, pe scurt, despre ce aș putea scrie: bulk import și export (cu fișiere ce specifică formatul și bcp), SQL Server Integration Services și ETL (în Visual Studio și nu numai), triggers (FOR/INSTEAD OF/AFTER INSERT/UPDATE/DELETE), poate și ceva despre securitate și privilegii (schemas, roles, users).

Îmi pare bine că, în sfârșit, am aprofundat domeniul.


SQL Server and the Command Line

1 December 2008

To execute a SQL script from the command line

From the command line, use SQLCMD. SQLCMD can be used with all kinds of parameters: -S servername to specify a server to connect to, -U username -P password to supply login in case Windows Authentication is not used, -i inputfile to specify a SQL script file to be executed etc. -? shows the entire list of parameters.

So for executing a script on the default server using Windows Authentication, the following command can be used:

sqlcmd -i myscript.sql

In case SQLCMD fails to connect to SQL Server, make sure that the SQL Server instance accepts Named Pipes and TCP/IP connections (in SQL Server Configuration Manager, under Network Configuration – Protocols, check that Named Pipes and TCP/IP are enabled) and also that Windows Firewall allows SQL Server through.

To execute a command line from a SQL script

From the script, use xp_cmdshell. For example, to list all files in the current folder (dir command), the syntax is

EXEC xp_cmdshell 'dir'

Standard output is copied into a temporary table containing a single nvarchar(255) column, each row corresponding to a stdout line. To store the result for further processing, a table can be used:

CREATE TABLE temp (Line NVARCHAR(255))
INSERT INTO temp EXEC xp_cmdshell 'dir'

Also, an execution result code from applications is returned by the command. To check the code, the value can be stored in an INT variable.

DECLARE @ResultCode INT
EXEC @ResultCode = xp_cmdshell 'dir'

The xp_cmdshell stored procedure is disabled by default. It can be enabled either by using the SQL Server Surface Area Configuration tool, or with the following commands:

EXEC sp_configure 'xp_cmdshell', 1
GO
RECONFIGURE
GO


Cursors and Dynamic SQL

28 November 2008

Although what I’ll talk about next isn’t rocket science and most people with some database development experience already know this, I hope there will still be some who will find it useful.

I am currently working on a SQL Server 2005 application. Whenever I had to develop a solution involving databases until now, I developed all the application logic in C# and used SQL Server only for storing data. Maybe I created a small stored procedure or view, but never got to actually implementing anything serious using only SQL. Up until now.

Suppose you have multiple data sources, each uploading data into a table and data sources can be added/removed from the solution after it is deployed. You need to copy all data from all sources into a single table (all tables have the same structure). Now this being a SQL Server application, it comes naturally that the best place to store the list of data sources is a table. Unfortunately, copying data from all tables cannot be done in a single statement. First, you need the SQL equivalent of stepping through each row. This is done using a cursor, as follows:

DECLARE Source CURSOR FOR SELECT SourceName FROM TableListingSources
DECLARE @SourceName VARCHAR(100)
OPEN Source
FETCH NEXT FROM Source INTO @SourceName

WHILE @@FETCH_STATUS = 0
BEGIN
    /* Copying things will come here */
    FETCH NEXT FROM Source INTO @SourceName
END

CLOSE Source
DEALLOCATE Source

First statement declares a cursor, assigning it a query. Second statement declares a variable which will be used to store the data retrieved by the cursor. The the source is opened. FETCH NEXT grabs the following row of the query result and stores it in the given variables. Multiple fields can be retrieved and stored by separating them with commas. @@FETCH_STATUS is 0 as long as the cursor is able to retrieve a row – in other words, as long as the end of the table isn’t reached. Cleanup at the end – the cursor must be closed and deallocated.

So executing the above, the variable @SourceName will take the value of each table name stored in the TableListingSources table. Second problem is copying data from each table, given its name. Although looking very elegant, the following statement won’t work:

INSERT INTO Destination SELECT * FROM @SourceName

There are places in a SQL statement where variables cannot be specified. And this is where Dynamic
SQL comes into play. Instead of the above statement, you can write:

EXEC (‘INSERT INTO Destination SELECT * FROM ‘ + @SourceName)

EXEC, or EXECUTE, can send to SQL Server any string and SQL Server will try to interpret it as a statement. So whenever you need to generate statements on the fly, depending on some variables that can’t be directly placed into statements, you can use EXEC and build a string instead.

More information on cusors and dynamic SQL can be found on MSDN.