One-way sync of two hierarchies

Spread the love

Question Description

I’m hoping to write an algorithm to synchronize two hierarchical structures. These structures could be object graphs, data stored in relational database tables, etc (even two different structures, so long as they have comparable keys). The synchronization will be one-way, i.e., one structure will be the prototype, and the other will be modified to match.

Let’s say we have a sync function. It would need to accept the following:

  1. objA — the prototype
  2. objB — the object to be modified
  3. keyA — key generating function for objA
  4. keyB — key generating function for objB
  5. addB — function to create an objB (returns id of new objB)
  6. setB — function to update objB
  7. remB — function to delete an objB
  8. parB — id of objB‘s parent — this is passed to addB for context

So we have this:

let sync (objA:'a) (objB:'b) (keyA:'a -> 'k) (keyB:'b -> 'k)
         (addB:'p * 'a -> 'p) (setB:'a * 'b -> unit) (remB:'b -> unit) 
         (parB:'p) = ...

Now here’s where I’m having trouble. 'a and 'b are hierarchical, so the function needs to know which properties of 'a and 'b it should traverse (once it compares their keys and decides they match thus far and should be further traversed). For these “child” properties, it needs all the same arguments passed to sync, but for their respective types.

This is when it became apparent this is a data structure problem. How can I chain together this information such that the root object can be passed to sync and it can traverse the graphs downward? My initial thought was to incorporate all of the arguments into a class, which would have a children property (a ResizeArray of the same type). But with various properties having different types, I couldn’t figure out a way to make it work, short of throwing types out the window and making most or all of the type arguments obj.

So here are my questions:

  1. Is there a well-established method for doing this already (I haven’t been able to find anything)
  2. What data structure might I use to encapsulate the data necessary to make this work?

I’ve tried my best to explain this thoroughly, but if anything remains unclear, please ask, and I’ll try to provide better information.

Practice As Follows

I’m sure this is oversimplifying it but here’s my idea.

If this is a DAG you could do a breadth-first traversal of objA. When you enqueue a node from objA include objB and any other information you need (tuple). Then when you dequeue you fix up objB.

You could use a discriminated union to handle different child types in your enqueueing.

Leave a Comment