For SQL Hierarchical Data, How To Show Only Tree Branches Including Certain Type Of Entities And Discard Others
Posted by 1/13/2014 02:20:00 AM with No commentson
and discard the branches which don't include this type.
For example, let's have a look on the tree above and assume that we need to show only the branches which include the "J" type of entities. For this to happen, then we need to transfer the above tree into the one below.
How to do this in SQL?
To be able to answer this question, we first need to know how our mind does it. Our mind does a lot of processing and computing in an efficient way which makes us think that it is so easy, but is it that easy?
Assume that we have these types of entities in our tree.
And the real entities instances in our database are as in the image below.
Which means that our entities are in the tree form as in the image below.
To do this in a more systematic approach, we can first locate our leaf entities. Then, we go up one level for each one of the leaf entities, then another one level up and so on till we reach the top of the tree. This could be illustrated as in the table below.
As you can see, we first started with the leaf entities and it took us five iterations to reach the top most of the tree. But why do we need to do that?
We did that because we already know the sequence of the system entities types, for example we know that if the leaf entity of a branch is "T", then this branch will not have a "J" as "J" comes after "T". So, we needed to trace back each branch to decide which branch to keep and which to discard.
Does that mean that every branch doesn't end with "J" should be completely deleted up to the top of the tree? No, as you can see in our tree above, there is a branch which ends with "T4", we are sure that this branch will not include a "J" but if we delete this branch up to the "C" entity, we will lose "P2" which has valid "J" children.
So, to get it right you can think of it as a voting system. Each entity in the tree should vote on whether its parent should be kept or no. Finally, we sum the voting on each parent and know if at least one child needs this parent, if not, we can discard this parent without any problems or loosing any valid branches.
So, each entity of the leaf entities should already know if its parent should be kept or not, so in our example, if the leaf entity is a "J", then its parent should be voted a "1". But if the leaf entity is a "S" or "T" or "P" or "C" then the parent should be voted a "0".
So, this leads us to the result in the image below.
Now, we need to sum the votes of each entity to know which entities to keep and which to discard. After doing the summation, we will get the result as in the image below.
Are these the entities we can delete from the tree to make sure every branch includes a "J" entity? Yes, those are the ones and by deleting them we get the final tree as in the image below.
So, now we are sure that the logic we applied is valid but can we apply this complex logic in SQL?
Applying the same logic in SQL
--Creating table structure DECLARE @AllEntities AS Table ( Id INT , NAME NVARCHAR(100) , ParentId INT , [Type] NVARCHAR(10) ) --Inserting sample data as in the example supplied INSERT INTO @AllEntities ( Id , NAME , ParentId , [Type] ) VALUES (1, 'C', NULL, 'C') , (2, 'P1', 1, 'P') , (3, 'P2', 1, 'P') , (4, 'T1', 2, 'T') , (5, 'T2', 2, 'T') , (6, 'T3', 2, 'T') , (7, 'T4', 3, 'T') , (8, 'T5', 3, 'T') , (9, 'S1', 4, 'S') , (10, 'S2', 4, 'S') , (11, 'S3', 5, 'S') , (12, 'S4', 6, 'S') , (13, 'S5', 6, 'S') , (14, 'S6', 8, 'S') , (15, 'S7', 8, 'S') , (16, 'J1', 10, 'J') , (17, 'J2', 10, 'J') , (18, 'J3', 11, 'J') , (19, 'J4', 11, 'J') , (20, 'J5', 14, 'J') , (21, 'J6', 14, 'J') --Declaring a variable to hold the required ensured entity type --Each branch in the tree should include this entity type, otherwise, --the whole branch will be excluded from the final result DECLARE @EnsuredEntityType AS NVARCHAR(10) SET @EnsuredEntityType = 'J' ;WITH EnsuredEntityTree(Id, ParentId, Voting) AS ( --Strating with the leaf entities on the tree as the seed SELECT parent.ObjectID , parent.ParentObjectID , CASE WHEN ( (@EnsuredEntityType = 'J' AND (parent.Type = 'J')) OR (@EnsuredEntityType = 'S' AND (parent.Type = 'S' OR parent.Type = 'J')) OR (@EnsuredEntityType = 'T' AND (parent.Type = 'T' OR parent.Type = 'S' OR parent.Type = 'J')) OR (@EnsuredEntityType = 'P' AND (parent.Type = 'P' OR parent.Type = 'T' OR parent.Type = 'S' OR parent.Type = 'J')) OR (@EnsuredEntityType = 'C' AND (parent.Type = 'C')) ) THEN 1 ELSE 0 END AS Voting FROM @AllEntities as parent LEFT OUTER JOIN @AllEntities as children ON children.ParentId = parent.Id WHERE children.Id IS NULL UNION ALL SELECT parent.Id , parent.ParentId , et.Voting --the same voting from the original seed object FROM @AllEntities AS parent INNER JOIN EnsuredEntityTree AS et ON et.ParentId = parent.Id ) SELECT DISTINCT Id, ParentId FROM EnsuredEntityTree GROUP BY Id, ParentId HAVING SUM(Voting) > 0
Finally, I hope you find this useful someday.