September 27, 2008
@ 11:48 AM
One of the really killer features included in SQL Server 2005 was Common Table Expressions. One of the really nice uses for them is recursive queries. Imagine any kind of hierarchical set of date (org. chart, security which allows nested roles, parts/assemblies, etc.). You can use CTE's to walk up or down these trees to build it's result set. Let's look at a simple example of this. I'm going to create a table named "ItemGroups" which is nothing more than a listing of items which have a PK, Title, Description, and foreign key to a parent it may be a child of.
CREATE TABLE [dbo].[ItemGroups](
	[iid] [int] IDENTITY(1,1) NOT NULL,
	[Title] [varchar](60) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
	[Description] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
	[fk_ItemGroups] [int] NULL,
 CONSTRAINT [PK_ItemGroups] PRIMARY KEY CLUSTERED 
(
	[iid] ASC
)WITH (PAD_INDEX  = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Then, I'm going to add some sample data to this table:


iid     Title                   Description                     fk_ItemGroups
1       Root                    This is the root                NULL
2       Child 1                 This is a child of root         1
3       Child 2                 This is a child of root         1
4       Grandchild 1            This is a child of Child 1      2
5       Grandchild 2            This is a child of Child 2      3
6       Great Grandchild 1      This is a child of Grandchild 1 4

If we draw this out as a "tree", it would look something like this (note the modern looking ASCII art...)

- Root
  - Child 1
    - Grandchild 1
       - Great Grandchild 1
  - Child 2
    - Grandchild 2

OK, great - let's suppose we want to walk up or down this tree from a known starting point. How might we use a CTE to do that?

It might help to understand the basic format of a CTE:

WITH SomeTableName (List of resulting fields)
(
   SELECT -- Starting point or anchor of the query
    UNION ALL
   SELECT -- Recursive portion of the query
)
SELECT -- Final select from SomeTableName

We have the "WITH" portion which describes what our CTE cursor would look like (we can reference this in the recursive portion of the query and in the final SELECT). Then we have the first SELECT which selects the starting record(s) for the recursive portion of our query. It's basically the starting point. The second SELECT pulls in matching records which are children or parents of the record in the anchor portion.

Let's see what that would look like against our table, assuming we want to walk down the hierarchy - in this code, we're going to be starting with the root node.

DECLARE @startNode int
SET @startNode = 1; -- Note the semicolon - it's required for the command
                    -- immediately before the CTE

WITH Items (iid, Title, Description, fk_ItemGroups) AS
( -- This is the 'Anchor' or starting point of the recursive query
  SELECT ig.iid,
         ig.Title, 
         ig.Description,
         ig.fk_ItemGroups
    FROM ItemGroups ig
   WHERE ig.iid = @startNode
   UNION ALL -- This is the recursive portion of the query
  SELECT ig.iid,
         ig.Title, 
         ig.Description,
         ig.fk_ItemGroups
    FROM ItemGroups ig
   INNER JOIN Items -- Note the reference to CTE table name
      ON ig.fk_ItemGroups = Items.iid
)
SELECT *
  FROM Items

If we run this, here's our results (notice that the query automatically stops recursing when no more matches are found).


iid     Title                   Description                     fk_ItemGroups
1       Root                    This is the root                NULL
2       Child 1                 This is a child of root         1
3       Child 2                 This is a child of root         1
5       Grandchild 2            This is a child of Child 2      3
4       Grandchild 1            This is a child of Child 1      2
6       Great Grandchild 1      This is a child of Grandchild 1 4

If we change the starting node to 2 and rerun this, you'll see we only get Child 1 and it's children:

2       Child 1                 This is a child of root         1
4       Grandchild 1            This is a child of Child 1      2
6       Great Grandchild 1      This is a child of Grandchild 1 4

And if we change it to start at Grandchild 1, we get:

4       Grandchild 1            This is a child of Child 1      2
6       Great Grandchild 1      This is a child of Grandchild 1 4

What if we'd like to walk "up" the hierarchy instead? That's just as easy. In the recursive portion of the query, we need to change our join condition. The first query will return the record we want to start on (aliased as 'Item' in this example). To walk up the chain, our fk_ItemGroups will match our parents iid. So change the ON to: " Items.fk_ItemGroups = ig.iid".

Let's rerun the last query:

4       Grandchild 1    This is a child of Child 1      2
2       Child 1         This is a child of root         1
1       Root            This is the root                NULL

It might be useful to know how many levels deep of recursion were required to retrieve a row. We can modify our query to include this info by adding a new column, "Level". In our root query we set it to start at 0, and we increment it in the recursive portion of the query:

SET @startNode = 4; -- Note the semicolon - it's required for the command
                    -- immediately before the CTE

WITH Items (iid, Title, Description, fk_ItemGroups, [Level]) AS
( -- This is the 'Anchor' or starting point of the recursive query
  SELECT ig.iid,
         ig.Title, 
         ig.Description,
         ig.fk_ItemGroups,
         0 AS Level
    FROM ItemGroups ig
   WHERE ig.iid = @startNode
   UNION ALL -- This is the recursive portion of the query
  SELECT ig.iid,
         ig.Title, 
         ig.Description,
         ig.fk_ItemGroups,
         Items.Level + 1 
    FROM ItemGroups ig
   INNER JOIN Items -- Note the reference to CTE table name
      ON Items.fk_ItemGroups = ig.iid
)
SELECT *
  FROM Items

iid     Title           Description                     fk_ItemGroups   Level
4       Grandchild 1    This is a child of Child 1      2               0
2       Child 1         This is a child of root         1               1
1       Root            This is the root                NULL            2

I've mostly ignored the final SELECT * FROM Items, but in a "real" query you tend to use this portion of it to pull in all your detail from various supporting tables.

In a few cases I've found that I've actually needed to walk up and down a hierarchy from a given starting point. I've ended up just creating two CTEs - one to walk up and one to walk down the hierarchy. I insert the results of each of them into a temp. variable, then pull the final results.

DECLARE @curItems TABLE (iid int);

-- Walks up the hierarchy
WITH Items (iid]) AS
( -- This is the 'Anchor' or starting point of the recursive query
  SELECT ig.iid
    FROM ItemGroups ig
   WHERE ig.iid = @startNode
   UNION ALL -- This is the recursive portion of the query
  SELECT ig.iid
    FROM ItemGroups ig
   INNER JOIN Items -- Note the reference to CTE table name
      ON Items.fk_ItemGroups = ig.iid
)
INSERT INTO @curItems (iid) (SELECT iid FROM Items);

-- Walks down the hierarchy
WITH Items (iid]) AS
( -- This is the 'Anchor' or starting point of the recursive query
  SELECT ig.iid
    FROM ItemGroups ig
   WHERE ig.iid = @startNode
   UNION ALL -- This is the recursive portion of the query
  SELECT ig.iid
    FROM ItemGroups ig
   INNER JOIN Items -- Note the reference to CTE table name
      ON ig.fk_ItemGroups = Items.iid
)
INSERT INTO @curItems (iid) (SELECT iid FROM Items)

-- Code which does final select here

DECLARE @curItems TABLE (iid int);

-- Walks up the hierarchy
WITH Items (iid]) AS
( -- This is the 'Anchor' or starting point of the recursive query
  SELECT ig.iid
    FROM ItemGroups ig
   WHERE ig.iid = @startNode
   UNION ALL -- This is the recursive portion of the query
  SELECT ig.iid
    FROM ItemGroups ig
   INNER JOIN Items -- Note the reference to CTE table name
      ON Items.fk_ItemGroups = ig.iid
)
INSERT INTO @curItems (iid) (SELECT iid FROM Items);

-- Walks down the hierarchy
WITH Items (iid]) AS
( -- This is the 'Anchor' or starting point of the recursive query
  SELECT ig.iid
    FROM ItemGroups ig
   WHERE ig.iid = @startNode
   UNION ALL -- This is the recursive portion of the query
  SELECT ig.iid
    FROM ItemGroups ig
   INNER JOIN Items -- Note the reference to CTE table name
      ON ig.fk_ItemGroups = Items.iid
)
INSERT INTO @curItems (iid) (SELECT iid FROM Items)

-- Code which does final select here

As you can see, it's pretty simple to use CTE's. The syntax looks a little weird at first but once you've written one or two queries it's pretty straightforward.


 
Saturday, September 27, 2008 1:22:49 PM (Eastern Standard Time, UTC-05:00)
Very nice explanation of recursive CTE. It may be worth mentioning that by default SQL server restricts recursive CTEs to 100 levels. To support more than 100 levels, "OPTION (MAXRECURSION)" can be used.

I agree that recursive CTEs are great. Stored procedures and functions can support only 32 levels of recursion. I have written a post that presents an approach to perform recursive updates for more than 32 levels. http://www.sqlserverandxml.com/2008/09/performing-recursive-update-for-more.html

May be it is a good idea to write a follow-up article using the HIERARCHYID data type of SQL Server 2008, that made traversing hierarchies (up or down) much easier.

Recursive CTEs are difficult to explain. You have done it well!

Saturday, September 27, 2008 1:41:53 PM (Eastern Standard Time, UTC-05:00)
Thanks, I'm glad you liked it. I haven't had any time to install and play with SQL Server 2008 yet so I don't know too much about it's new features.
Name
E-mail
(will show your gravatar icon)
Home page

Comment (Some html is allowed: a@href@title, b, i, strike) where the @ means "attribute." For example, you can use <a href="" title=""> or <blockquote cite="Scott">.  

Enter the code shown (prevents robots):

Live Comment Preview