In my previous post, I whined about how it didn't seem to me that a pure-functional union-find data structure should be so hard to make as computationally efficient as the well-known imperative algorithms. After some wonkulation on the subject, I convinced myself that— yes, bunky, it's true— functional purity really does cost extra, and sometimes it costs enough that it's worth sacrificing purity for efficiency.
There are a lot of techniques for hiding imperative code behind functional interfaces, and I know them fairly well at this point, so I thought I would try my hand at designing an [impure] persistent union-find data structure with optimal complexity characteristics.
I'm not by any count the first person to try to do this. Sylvain Filliâtre and Jean-Christophe Conchon
did it a few years ago, and
they were thorough enough to provide a formal proof in Coq alongside it. They did a great job, but there are some nitpicky things about their code that I wish were done differently.
- FIND returns the representative element of the set, and I'd rather the code allowed for the representative of the equivalence class to be of a type different than the set elements. This is because I'm planning to use the data structure in a constraint solver for type inference.
- Their code requires the elements to be integers suitable for indexing persistent arrays. This is unacceptable for me. I want an algorithm suitable for any totally ordered set of elements.
So, I wrote my own. Instead of using persistent arrays, it uses the persistent set and priority queue data structures from the
Cf module in my
OCaml Network Application Environment library. Here's the [untested] code:
module type T = sig
type k and +'a q and +'a t
val nil: 'a t
val start: k -> 'a -> 'a t
val find: k -> 'a t -> 'a q option
val extract: 'a q -> 'a
val disjoint: 'a t -> 'a t -> 'a t
val union: 'a q -> 'a q -> 'a t
end
module Create(K: Cf_ordered.Total_T) : (T with type k = K.t) = struct
module U = Cf_rbtree.Set(K)
module H = Cf_sbheap.PQueue(Cf_ordered.Int_order)
type k = K.t
type 'a q = Q of U.t * int * 'a * 'a t
and 'a t = T of int * (k -> 'a q option)
let void_ _ = None
let nil = T (0, void_)
let start k v =
let q = Some (Q (U.singleton k, 1, v, nil)) in
let f k' = if K.compare k k' <> 0 then None else q in
T (1, f)
let find k (T (_, f)) = f k
let extract (Q (_, _, v, _)) = v
type 'a compressor = {
mutable z: int;
mutable h: 'a q H.t;
mutable u: U.t;
mutable f: k -> 'a q option;
}
let compress_ =
let rec loop c k h =
if U.member k c.u then
None
else
match H.pop h with
| Some (hd, tl) ->
let _, q = hd in
let Q (u, _, _, _) = q in
if U.member k u then Some q else loop c k tl
| None ->
if c.z = 0 then
None
else
match c.f k with
| None ->
c.u <- U.put k c.u;
None
| Some q as q1 ->
let Q (_, n, _, _) = q in
c.h <- H.put (n, q) c.h;
let z = c.z - n in
assert (z >= 0);
c.z <- z;
if z == 0 then begin
c.f <- void_;
c.u <- U.nil
end;
q1
in
fun c k ->
loop c k c.h
let disjoint a b =
if a == b then
a
else begin
let T (na, fa) = a and T (nb, fb) = b in
let n = na + nb in
let f =
if n > 1 then begin
let f1, f2 = if na > nb then fa, fb else fb, fa in
fun k -> match f1 k with Some _ as q -> q | None -> f2 k
end
else if n > 0 then begin
if na > 0 then fa else fb
end
else
void_
in
let c = { z = n; h = H.nil; u = U.nil; f = f } in
T (na + nb, compress_ c)
end
let union a b =
let Q (ua, na, v, ta) = a and Q (ub, nb, _, tb) = b in
if ta == tb && ua == ub then
ta
else begin
let qn = na + nb in
let T (n, f) as t = disjoint ta tb in
let z = n - qn in
assert (z >= 0);
let rec q = Q (U.union ua ub, qn, v, t)
and c = lazy { z = n; h = H.put (n, q) H.nil; u = U.nil; f = f } in
let c = Lazy.force c in
T (n, compress_ c)
end
end
As I mentioned above, this is totally untested code. If you use it without testing the hell out of it yourself, then it will be your own damned fault if it bites your leg off. I'll be getting around to testing it someday soon, and if it works well, then I'll be adding it to the next release of the Cf library. No promises on when.
How does this code work? It's easy, really.
In the module signature of the functor, type k is the set element type ('k' is for 'key' here), 'a q is the type of equivalence classes where each key in the class is equivalent to a value of type 'a, and 'a t is the type of disjoint sets partitioned by equivalence class.
The empty disjoint set is nil. It contains no equivalence classes.
To create a new disjoint set with a single key k and its representative value v, use start k v. Use find k t to search the disjoint set t for an equivalence class containing key k. Use extract q to extract the representative value from the equivalence class q. The find function has a variable cost discussed below, but start and extract both have constant cost.
It is an invariant that the optional value q returned for positive results of a search with find k t is always the same for every equivalent key k in the disjoint set t. Each q returned by find k t contains a reference to t to facilitate path compression, which is discussed below. The cost of find is a the cost of searching a memorized queue of previously found equivalence classes, prioritized by size from largest to smallest, plus the cost of searching any predecessors for remaining equivalence classes, plus the cost when there are still predecessors of searching a cache of negative results.
To combine two existing disjoint sets a and b into a new disjoint set comprising equivalence classes from each, use disjoint a b . If a and b are identical values, then this is a trivial operation. Otherwise, a fresh path compressor is used to memorize equivalence classes when searching the two predecessors. Applying disjoint to a pair of predecessors that contain overlapping equivalence classes is not an error, but the results might seem arbitrary: the predecessors are only searched by find k t when no previous search of t has returned an equivalence class containing k; when the predecessors are searched, the larger predecessor p1 is searched first with find k p1, then the smaller predecessor p2 is searched with find k p2. The first positive result is memorized by the path compressor. A pair of negative results is also memorized. Once all the predecessor equivalence classes are memorized, the associated search function in the compressor is destroyed, which allows the predecessors to be collected.
Finally, use union a b to create a new disjoint set comprising a fresh path compressor constructed to search the predecessors encapsulated in a and b, and with memory pre-initialized to the equivalence class produced by merging the key sets and choosing the value represented by a as representative of the new class. As discussed above with regard to the likewise disjunct function, if a and b were found in overlapping disjoint sets, then the choice of which equivalence classes are memorized in the path compressor is arbitrary.
Worth noting: it's not clear to me that the negative cache is worth it. If you always search for keys that you know are present in the disjoint set, then the cache won't hurt much. However, negative results from searches in uncompressed sets are costly. A negative result can only be obtained by iteratively descending the predecessor graph to make sure none of them have an equivalence class that needs to be memorized. I'm not sure there's much to be gained by caching such negative results until the path compressor determines that the equivalence classes in the predecessors are exhausted. That will certainly be one of the things I will set about to testing when I next get time for this project.