SQL Server 2005 Pattern Matching with Dynamically Supplied Patterns
I recently ran into a situation wherein I needed to identify records in a database that did not match a certain set of patterns. I needed a way to specify various patterns without having to dip into SQL Server every time I wanted to add or remove a new pattern. For example, I wanted to check the values in the table to see if they matched this pattern: "AA999AA" -- two characters, followed by three numbers, followed by two characters.
I had at my disposal, of course, a couple different options:
- Pull the records out of the database and execute a regular expression in .Net code against each row to identify those records that did not match the desired pattern; but this required me to extract all of the records from the database in order to run my pattern check...and there were *lots* of records to be queried, Or
- Write a stored procedure that would iterate through each record in the table looking for violators of a pattern; but the obvious issue with this narrow approach had to do with the dynamic nature and variableness of the patterns...how could you make them easy to modify without having to create another table that would require a DBA to go in and update, yada-yada-yada...
But after discussing the situation with my practice manager, Steve Saxon, he pointed my co-worker, Brian Scott, and myself, in the right direction; in essence, option 3.
- Employ a good mix of .Net and SQL Server to accomplish the system requirements. In essence, take advantage of the ease of update that a .Net configuration file gives (this proved to be the source for all patterns that were valid for the system per state), and allow SQL Server to do what it does best...query data. So our .Net code would read the patterns from the configuration file and pass them off to SQL Server to do the work.
I have to admit, prior to this project I wasn't aware that SQL Server could perform pattern matching when performing a query (similar to a regular expression)...
When executing a Pattern Match query on a dynamically supplied pattern, there are a number of things to consider: performance, security, readability, etc. The below code retrieves a dynamically provided list of patterns for a give state, and then builds a SQL injection-attack- proof query/statement.
CREATE PROCEDURE [dbo].[MyStoredProcedure]
@pattern NVARCHAR(2000),
@filter CHAR(2)
AS
BEGIN
DECLARE @processingPattern VARCHAR(100),
@sql NVARCHAR(1000)
-- Patterns table that will hold itemized patterns after they are parsed/split.
DECLARE @Patterns TABLE
(
Pattern VARCHAR(100),
IsProcessed BIT NOT NULL DEFAULT (0)
)
INSERT INTO @Patterns (Pattern)
SELECT Pattern from dbo.funcParsePattern(@pattern, ',')
-- Build dynamic query.
SET @sql = 'SELECT Field1,
Field2,
Field3,
.
.
.
FROM table1'
-- Dynamic definition of parameters and parameter values
DECLARE @paramDefinition NVARCHAR(2000);
DECLARE @paramValues NVARCHAR(2000);
-- Process each pattern...identify all valid plate numbers.
DECLARE @counter INT
SET @counter = 0
WHILE ((SELECT COUNT(*) FROM @Patterns WHERE IsProcessed = 0) > 0)
BEGIN
SELECT TOP 1 @processingPattern = Pattern
FROM @Patterns
WHERE IsProcessed = 0
-- Add to @sql
IF (@counter == 0)
SET @sql = @sql + ' WHERE Field1 NOT LIKE @Pattern' + CAST(@counter AS VARCHAR(10))
ELSE
SET @sql = @sql + ' AND Field1 NOT LIKE @Pattern' + CAST(@counter AS VARCHAR(10))
-- Add to @paramDefinition
SET @paramDefinition = @paramDefinition + ', @Pattern' + CAST(@counter AS VARCHAR(10)) + ' VARCHAR(100)'
-- Add to @paramValues
SET @paramValues = @paramValues + ', @Pattern' + CAST(@counter AS VARCHAR(10)) + ' = ''' + @processingPattern + ''''
SET @counter = @counter + 1
UPDATE @Patterns
SET IsProcessed = 1
WHERE Pattern = @processingPattern
END
DECLARE @sql2 NVARCHAR(4000)
SET @sql2 = 'sp_executesql N''' + @sql + ''', N''' + @paramDefinition + ''', ' + @paramValues
EXEC sp_executesql @sql2
END
You'll notice at the beginning of the stored procedure there is a reference to a user defined function:
INSERT INTO @Patterns (Pattern)
SELECT Pattern from dbo.funcParsePattern(@pattern, ',')
The user defined function provides an itemized list of patterns from those that were supplied. The patterns are passed to the sproc with a delimiter.
Below is the code of the user defined function:
ALTER FUNCTION [dbo].[funcParsePattern]
(
@List NVARCHAR(2000),
@SplitOn NVARCHAR(5)
)
RETURNS @ReturnValue TABLE
(
Pattern NVARCHAR(100)
)
AS
BEGIN
WHILE (CHARINDEX(@SplitOn, @List) > 0)
BEGIN
INSERT INTO @ReturnValue (Pattern)
SELECT Pattern = LTRIM(RTRIM(SUBSTRING(@List, 1, CHARINDEX(@SplitOn, @List)-1)))
SET @List = SUBSTRING(@List, CHARINDEX(@SplitOn, @List) + LEN(@SplitOn), LEN(@List))
END
INSERT INTO @ReturnValue (Pattern)
SELECT Pattern = LTRIM(RTRIM(@List))
RETURN
END
This solution proved to not only be safe and efficient, but most importantly, it met the system requirements. And since the deployment of this solution, patterns have been both added and removed without a DBA getting his/her hands dirty! I simply updated the application's configuration file.
Comments