MySQL Closure Table

Sometimes we are given a task where we need to model a hierarchy of different complexity levels and not really sure how to do that properly in the most efficient, reliable, and flexible way. Let’s review one of the data modeling patterns that give us some answers for that.

Consider we have a hierarchy with a ragged variable depth, like on a picture below.

The requirements are that we need to:

  • Query a subtree, preferably up to a certain level of depth, like direct subordinate as well as descendants up to a certain level
  • Query all ancestors of the node
  • Add/Remove a node to a hierarchy
  • Change the layout of the hierarchy — move a subtree from one place to another (from one parent to another one)
  • Manage efficiently the forest —a large number of trees, where each tree is not necessarily big in terms of number of nodes, and the maximum depth, but the number of trees could grow up to hundreds of thousands or even millions
  • Manage nodes of the trees that enter short-term relationships. So there is a need to track the time period for which the actual relationships took place.

This is where a closure table comes into the picture.

A closure table gives you the ability to work with tree structure hierarchies. It involves storing all paths through a tree, not just those with a direct parent-child relationship. For example: things like family tree(Parent – Kids – etc), company hierarchy(Owner – CEO – Manager – Employee), or file sytem(/users/folauk/Documents/test.mp4). There are other ways such as Path Enumeration and Nested Sets to keep track of this kind of information but using a closure is the simplest and most efficient way of doing it.

The main building block of the pattern is an additional structure(table) or a mapping table that keeps the tree relationships — a set of node pairs for each path from the ancestor to a descendant or parent to child, including a path to itself.

 

Our example here is a social media site where a user can put up a post. Other users can comment on that post and can comment on comments from the same post. The tree here is comments under comments. We have to keep track of which comments go under which comments and their parents.

CREATE TABLE `user` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `date_of_birth` date DEFAULT NULL,
  `email` varchar(255) DEFAULT NULL,
  `first_name` varchar(255) DEFAULT NULL,
  `gender` varchar(255) DEFAULT NULL,
  `last_name` varchar(255) DEFAULT NULL,
  `phone_number` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=21 DEFAULT CHARSET=latin1;

CREATE TABLE `post` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `content` longtext,
  `created_at` datetime(6) DEFAULT NULL,
  `last_updated_at` datetime(6) DEFAULT NULL,
  `uuid` varchar(255) NOT NULL,
  `author_id` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `UK_b1a5vs6t5vi32stck5kjw0tgf` (`uuid`),
  KEY `FK12njtf8e0jmyb45lqfpt6ad89` (`author_id`),
  CONSTRAINT `FK12njtf8e0jmyb45lqfpt6ad89` FOREIGN KEY (`author_id`) REFERENCES `user` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=41 DEFAULT CHARSET=latin1;

CREATE TABLE `comment` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `content` longtext,
  `created_at` datetime(6) DEFAULT NULL,
  `last_updated_at` datetime(6) DEFAULT NULL,
  `author_id` bigint(20) DEFAULT NULL,
  `post_id` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `FKh1gtv412u19wcbx22177xbkjp` (`author_id`),
  KEY `FKs1slvnkuemjsq2kj4h3vhx7i1` (`post_id`),
  CONSTRAINT `FKh1gtv412u19wcbx22177xbkjp` FOREIGN KEY (`author_id`) REFERENCES `user` (`id`),
  CONSTRAINT `FKs1slvnkuemjsq2kj4h3vhx7i1` FOREIGN KEY (`post_id`) REFERENCES `post` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=latin1;

CREATE TABLE `comment_tree` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `depth` int(11) DEFAULT NULL,
  `child_id` bigint(20) DEFAULT NULL,
  `parent_id` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `FKhosfyxi6cpykkhyrccfxq2k2x` (`child_id`),
  KEY `FKq2283664y5ywd961iojgjk98b` (`parent_id`),
  CONSTRAINT `FKhosfyxi6cpykkhyrccfxq2k2x` FOREIGN KEY (`child_id`) REFERENCES `comment` (`id`),
  CONSTRAINT `FKq2283664y5ywd961iojgjk98b` FOREIGN KEY (`parent_id`) REFERENCES `comment` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1201 DEFAULT CHARSET=latin1;

Here we have a comment table that represents our comment, author of the comment, and the post it belongs to.

Insert comments(rows) into the comment_tree(closure) table

INSERT INTO comment_tree(parent_id, child_id, depth) 
SELECT p.parent_id, c.child_id, p.depth+c.depth+1 
FROM comment_tree as p, comment_tree as c 
WHERE p.child_id = :parentId and c.parent_id = :childId;

Delete comments(rows) in the comment_tree(closure) table

DELETE link
FROM comment_tree as parent, comment_tree as link, comment_tree as child
WHERE parent.parent_id = link.parent_id AND child.child_id = link.child_id
-- AND parent.child_id = {parentId} and child.parent_id = {childId}
AND parent.child_id = ? AND child.parent_id = ?

Retrieve comments under a comment.

Get all comments under comment with id 4.

SELECT com.*
FROM comment AS com
JOIN comment_tree AS comTree ON com.id = comTree.child_id
WHERE comTree.parent_id = 4;

Retrieve parent comments of a comment

Get all parent comments of comment with id 4.

-- retrieve parent comments of comment with id 4
SELECT com.*, comTree.*
FROM comment AS com
JOIN comment_tree AS comTree ON com.id = comTree.parent_id
WHERE comTree.child_id = 4;

Source code on Github




Subscribe To Our Newsletter
You will receive our latest post and tutorial.
Thank you for subscribing!

required
required


Leave a Reply

Your email address will not be published. Required fields are marked *