Recently I happen to see someone posts that the most beautiful data structure in his mind is ** Dancing Links**. The name bursts my interest to understand one more idea from Dr. Knuth. (again let me paste the margin comment of

**here):**

*Concrete Mathematics*

Actually Gauss is often called the greatest mathematician of all time. So it’s nice to be able to understand at least one of his discoveries.

Follow the curiosity and keep youth. I am writing this blog to re-state the algorithm to confirm my understanding.

## The fundamental idea: doubly link list that easy to undo

Following is simple C code that defining a double linklist. Remove Operation is simple in doubly link list because it can accept the data to remove as function parameter. The idea is that if we do not clean up or GC the removed node, it is super easy to add it back (means UNDO the previous remove). Dr. Knuth points out this property can be very powerful when implementing some searching algorithm involving **backtrack**. This is

```
typedef struct Node Node;
struct Node
{
Node *Llink;
Node *Rlink;
/* void *data; */
int data;
};
typedef struct DoublyLink DoublyLink;
struct DoublyLink
{
Node *head; /* dummy head node */
};
#define LLINK(x) ((x)->Llink)
#define RLINK(x) ((x)->Rlink)
extern void remove_node(Node *x);
extern void undo_remove(Node *x);
/*
* Given the pointer to the Node, we can
* remove the Node with the advantage of
* double link list.
* NOTE: caller need to take care that
* not to remove the dummpy head.
*/
void
remove_node(Node *x)
{
Node *l = LLINK(x);
Node *r = RLINK(x);
RLINK(l) = r;
LLINK(r) = l;
}
/*
* The core idea of Dancing Links: we can
* undo a remove-operation super easily.
*/
void
undo_remove(Node *x)
{
Node *l = LLINK(x);
Node *r = RLINK(x);
RLINK(l) = x;
LLINK(r) = x;
}
```

## Exact Covering Problem

\begin{pmatrix} 0 & 0 & 1 & 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 1 & 0 & 0 & 1\\ 0 & 1 & 1 & 0 & 0 & 1 & 0\\ 1 & 0 & 0 & 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 1 & 1 & 0\\ \end{pmatrix}

Given a binary matrix, find some rows that summing those row vectors will get a vector with all ones (no zeros or other number larger than one). This problem is called ** Exact Covering Problem**. Take the above matrix as an example, you can choose the first row, the fourth row and the fifth row to finish a exact covering. The problem is NP-hard.

Introducing two concepts will help model the problem:

**Items**: those need to be covered (columns in the matrix example)**Options**: those we can choose to cover (rows in the matrix example)

Thus the problem is to choose some options that exact covering all items.

We use the following pseudo-code to show the searching algorithm (Knuth’s ** Algorithm X**):

```
transform input matrix into knowledge database;
while True:
if there are no items left to cover:
report a solution;
terminate the program;
else:
pick an item to cover;
cover this item:
if there are not options can be choosen to cover it:
fail; # no solution need backtrack
else:
for each option involving this item:
for each item in this option:
if item is different:
cover the new item; #then get a smaller problem
recursively solving the small problem
```

To build a data structure: **1)** encode the matrix efficiently; **2) **benefit the algorithm X, Knuth’s skills are shown in below picture (some skills look very tricky to me):

Several points need to be noted are:

- the first row of doubly linklist, with elements named
`a,b,c,...g`

are actually two rows: one is the item set; and the other is head for each vertical doubly linklist. Those for item doubly linklist contain a name field for each cell. Those as headers contain a length field that recording the number of elements in vertical list - For other nodes, one row (in the picture linked by dashed line) is an option, for example, the first row is the option
`"c,e"`

. The skill here is to allocate them in order so that we do not need extra pointer fileds to link them horizontally. But to distinguish each option, some specical nodes called**Spacer**are inserted between each line. **Spacer**also stores information of the position of last operation and next operation, this is tricky.- Other normal nodes as elements of options, they contain a pointer to the vertical link’s head.
- Vertical linklist is used for finding all options involving an item

I implement this code in https://github.com/kainwen/misc/tree/main/dancing_links. (Some parts of the code can be refined by abstraction).

The nice idea to me is:

- allocate nodes in a pool, and the nodes will just be there not recycled, even removed temporarily the content still remember all the information
- dancing links to modify the nodes connection just to recursively solve a smaller problem
- dancing links so easy to recover to the previous state

For detailed time complexity analysis, please refer to Reference [1].

## References

- [1]
**The Art of Computer Programming, Volume 4, Fascicle 5: Mathematical Preliminaries Redux; Introduction to Backtracking; Dancing Links** - [2]
**Stanford Lecture: Don Knuth—”Dancing Links” (2018)** - [3] Hitotumatu, Hirosi. “A technique for implementing backtrack algorithms and its application.” (1979).
- [4] Dahl, Ole-Johan, Edsger Wybe Dijkstra, and Charles Antony Richard Hoare, eds.
*Structured programming*. Academic Press Ltd., 1972.

Pingback: 👻幽灵捕手——Dancing Links | Thoughts on Computing