On this page

hierarchy implementation in SQL server 2000

Ads

Navigation

Search

Categories

Clouds

Sql Server (5) .Net (16) .Net 2.0 (2) C# (3) @ff Topic (5) Architectural solutions (9) ASP (1) BDD (5) Blog related (8) database (2) Development process (8) Facebook (1) job interviews (1) Lessons (5) Life (12) Microsoft (5) IIS 6 (2) SPS (sharepoint server) (3) Drivers (1) Internet Explorer (2) Windows 2003 server (1) NightDuck (2) Performance (5) Security (9) Sql Server 2000 (4) Study (2) TDD (1) Threading (3) Under the hood (1) Web (1) Web services (1) XSS (6)

Archive

Blogroll

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

RSS 2.0 | Atom 1.0 | CDF

Send mail to the author(s) E-mail

Total Posts: 63
This Year: 7
This Month: 0
This Week: 0
Comments: 33

Sign In
Pick a theme:

 Wednesday, August 16, 2006
Wednesday, August 16, 2006 10:47:18 PM (GMT Standard Time, UTC+00:00) ( Sql Server  | Architectural solutions | database | Sql Server 2000 )

remember this neat syntax that exists in oracle database for hierarchy selects ?
actually , its pretty simple :

SELECT last_name, employee_id, manager_id, LEVEL
FROM employees
START WITH employee_id = 100
CONNECT BY PRIOR employee_id = manager_id;


ever tried to do the same in sql server ?
well, these kind of syntax just does not exist,
we need to work very hard to create such a feature in our database.

so, what do we have:

  • a table that contains entities
  • each entity connected to some father entity
  • each connection describes a "father - son" relation between the two entities

lets not forget the things we need to relay on , when implementing :

  • what will happen when we will delete the father of some sub tree ?
  • what should we do when we update/add a record ?
  • how will we select the data ?
  • algorithm efficiency is crucial, if we will need to wait 5 minutes for the data, it's not worth it

the first (but apparently the worst) idea that came to my mind is recursion
lets look at this table :

 

EmployeeID Name BossID

1

shimon

NULL

2

yossi

1

3

Gaby

1

4

koby

3

5

jack

3

 

we have we that the employee shimon is the "big boss" (because there is no other boss above him),
under shimon we have the employees Gaby and yossi,
and under Gaby, we have another 2 employees : Koby and jack

the recursive solution is to write some stored procedure that will receive the employeeID and return as a data-Table the results
i will not add the code for this solution and surely will not recommend it because it was many problems :

  • for each record we received as a descendant ,
    we need to run with the function and get her descendants,
    and so on, until there are no descendants for the node
  • we are limited to 32 levels of hierarchy
  • the runtime will depend on the row count that is in the table (we will need to run on each of the rows one at the time)
  • the run on the node will look like this :
    heirarchy.jpg

by the way, the most common way that I've seen to select hierarchical structure,
is simply by setting a join between the levels in the select query.
for example :

SELECT TopBoss.Name TopBoss, Boss.Name Boss, Employees.Name Employee
FROM Employees
INNER JOIN Employees AS Boss ON Employees.BossID=Boss.EmployeeID
INNER JOIN Employees TopBoss ON Boss.BossID=TopBoss.EmployeeID

this apply to the selection of three levels

For each level, you'd need to join the table to itself...not an attractive option if you have 5 or more levels ,
you don't know how many levels you will have to select, there is no way can control it!
It would be great if it could join itself as many times as needed. This is called a recursive join, and though some database products support it (Oracle has the CONNECT BY syntax) SQL Server is not one of them.

the other way is based on a thread that i read here about hierarchies,

lets create a table :

CREATE TABLE Tree (
Node int NOT NULL IDENTITY(100, 1),
ParentNode int,
EmployeeID int NOT NULL,
Depth tinyint,
Lineage varchar(255) )

the extra fields that has been added are the "lineage" and the "depth"

  • Depth - for saving the current depth of the record in the hierarchy
  • Lineage - for saving all the ancestors of the record as a concatenated string

after filling the needed data for relations, the table looks like this :

Node ParentNode EmployeeID Depth Lineage
100 NULL 1001 NULL NULL
101 100 1002 NULL NULL
102 101 1003 NULL NULL
103 102 1004 NULL NULL
104 102 1005 NULL NULL
105 102 1006 NULL NULL

The next part is to find the root node of the tree, also known as the top-level, etc.
That's the node that has no parent (Null), so we will start there and set the Lineage column as the root:

UPDATE Tree SET Lineage='/', Depth=0 WHERE ParentNode Is Null

Once we did that,
we can then update the rows who are the descendant of the root node:

WHILE EXISTS (SELECT * FROM Tree WHERE Depth Is Null
   UPDATE T SET T.depth = P.Depth + 1, 
   T.Lineage = P.Lineage + Ltrim(Str(T.ParentNode,6,0)) + '/' 
   FROM Tree AS
   INNER JOIN Tree AS P ON (T.ParentNode=P.Node) 
   WHERE P.Depth>=0 
   AND P.Lineage Is Not Null 
   AND T.Depth Is Null

 

this loop will run once for each level of the hierarchy (not for each node as the recursion method.)
so, with data representation of 10,000 records with 8 levels of hierarchy,
this code will run only 8 times to populate the needed data of the "lineage" field and the "depth" field, and this "heavy" procedure will happen only once at the setup.
the table should look like this after the given operation :



Node ParentNode EmployeeID Depth Lineage
100 NULL 1001 0 /
101 100 1002 1 /100/
102 101 1003 2 /100/101/
103 102 1004 3 /100/101/102/
104 102 1005 3 /100/101/102/
105 102 1006 3 /100/101/102/

 

You'll notice that for each node, the entire lineage back to the root is stored. This means that finding someone's boss, or their boss' boss, doesn't require any self-joins or recursion to create an indented list. In fact, it can be accomplished with a single SELECT.

SELECT Space(T.Depth*2) + E.Name AS Name
FROM Employees E
INNER JOIN Tree T ON E.EmployeeID=T.EmployeeID
ORDER BY T.Lineage + Ltrim(Str(T.Node,6,0))

 

maintaining the table is really not a big deal if we will use triggers.
think about the new inserted record as the row that has not been filled in the setup process.
so the insert trigger should be :

UPDATE T SET T.depth = P.Depth + 1,
T.Lineage = P.Lineage + Ltrim(Str(T.ParentNode,6,0)) + '/'
FROM Tree AS T
INNER JOIN Tree AS P ON (T.ParentNode=P.Node)
WHERE P.Depth>=0
AND P.Lineage Is Not Null
AND T.Depth Is Null

 

the update trigger should do pretty much the same : building the 2 extra field all over again.

suggestions and request will be repplied :)