François Pinard's site

Gitification of Tomboy notes

I was previously using Dropbox to spread Tomboy synchronisation directories between the machines I access, yet I sometimes forgot to synchronise Tomboy before leaving home or work, and felt a bit miserable afterwards, as I now depend on Tomboy for many aspects of my duties. However, as I never forget Git synchronisation, the idea came to me that I should use Git to synchronise Tomboy notes, just like for most of my other things.

Tomboy is a wonderful and very useful tool. Yet its internal file and directory formats are under-documented, and some guesswork is needed here and there when I want to handle my Tomboy notes through various scripts. There is a D-Bus interface that I could use, and this is how I originally started my tboy. But I found out that I do not master that interface so well, so tboy was sometimes using D-Bus and other times avoiding it. As tboy progressively grew to accommodate my various needs, I finally decided it was easier to make it uniform by reading directories and files directly, with the practical result that I almost never use D-Bus now. Another tiny advantage is that tboy may work even when Tomboy is not running.

The guesswork may have strange consequences. I got the persistent impression that the Tomboy synchronisation directories were designed so as to preserve the history of successive synchronisation calls (each being called a revision in Tomboy terminology), yet I had some difficulty deciphering every detail of this. Sandy Armstrong, the current maintainer of Tomboy, is especially friendly and approachable, so I dared to ask him for some help. He explained to me that maintaining the synchronisation history has never been an intent in Tomboy, and that if I ever find two versions of the same note, then I am uncovering a Tomboy bug. Surprised by this statement, I made a more thorough examination of my whole Tomboy synchronisation directory, and found a lot of duplication.
So there is a bug in Tomboy sync in which the cleanup does not work properly. I'm not fully sure, but the cleanup apparently works only in a few cases, for notes being handled in adjacent revisions. All the resulting clutter holds a lot of recoverable history, so that particular bug was quite productive in my case ☺.

Each revision has an associated number, counting chronologically from zero. The synchronisation directory has one subdirectory per revision, named after the revision number. To be precise, a revision directory is two levels down from the top synchronisation directory, as directories are grouped one hundred at a time (the grouping directory merely uses the hundreds digit of the revision number). Each revision directory holds a manifest.xml file listing which notes still existed at that revision and, for each note, the revision at which it was last modified. The revision directory also holds the full contents of the notes that were introduced or modified at that revision.

The gitification works in two passes. The first pass establishes, for each note, all the revision numbers whose directory holds the full contents of that note. It also attributes a timestamp to each revision (the manifest.xml modification time seems a good estimator). The second pass checks, for each revision and each note it references, whether we still have the note's contents at the last modified revision. Most of the time, because of the cleanup bug, we do. If we do not, then we pick a copy at the closest higher revision where the full text of the note exists. If there is no such higher revision, then most likely the note has been deleted for good but still exists in the ~/.tomboy/Backup/ directory, so we pick that backed-up note instead.

Now that we have a set of existing notes at each revision, it becomes a trivial matter to restore a previous state one revision at a time, and to generate Git commands for staging and committing that state.
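
The two passes can be sketched roughly as follows. This is only a minimal illustration in Python, not the actual transformation script. The function names (`revision_dir`, `index_contents`, `resolve`) are made up for the example, and two things are assumptions on my part: that each note is stored as a file named after its identifier with a `.note` suffix, and that the grouping directory is the revision number divided by one hundred. Parsing manifest.xml is left out; `resolve` simply takes the manifest as a mapping from note identifier to last modified revision.

```python
from bisect import bisect_left
from pathlib import Path


def revision_dir(sync_root, rev):
    """Path of a revision directory: grouped by hundreds, then the number.

    Assumption: the grouping directory is rev // 100.
    """
    return Path(sync_root) / str(rev // 100) / str(rev)


def index_contents(sync_root, revisions):
    """Pass 1: map each note id to the sorted revisions holding its full text.

    Assumption: a note is stored as <note id>.note in the revision directory.
    """
    holders = {}
    for rev in sorted(revisions):
        for path in revision_dir(sync_root, rev).glob("*.note"):
            holders.setdefault(path.stem, []).append(rev)
    return holders


def resolve(manifest, holders, backup_ids):
    """Pass 2: choose where to fetch each note listed in one manifest.

    manifest maps note id -> revision at which the note was last modified.
    Returns note id -> ("rev", n), ("backup", None) or ("missing", None).
    """
    sources = {}
    for note_id, last_rev in manifest.items():
        revs = holders.get(note_id, [])
        # First revision >= last_rev that still holds the note: the exact
        # one if it survived the cleanup, else the closest higher one.
        i = bisect_left(revs, last_rev)
        if i < len(revs):
            sources[note_id] = ("rev", revs[i])
        elif note_id in backup_ids:
            # No higher copy: fall back to the Backup/ directory.
            sources[note_id] = ("backup", None)
        else:
            sources[note_id] = ("missing", None)
    return sources
```

Calling `resolve` once per revision, in increasing order, yields the set of note files to check out for that revision. Keeping the per-note revision lists sorted is what makes the "closest higher revision" lookup a simple binary search.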
After the execution of the transformation script, I manually copied back the few other administrative directories from the original ~/.tomboy/, that is: addins/, addin-db-001/, sync_temp/ and Backup/. With Tomboy stopped, the final step was to replace ~/.tomboy/ with the new one.

When I gitified my Tomboy notes as described above, 375 notes existed after the 187th revision, and I was ideally expecting 187 Git commits. Wherever the Tomboy sync cleanup code had worked correctly, some historical information was lost, to the point that some commits were discarded as being empty. I ended up with 176 commits, which is a rather satisfying result.

The space savings are interesting as well. The Tomboy synchronisation directory was taking 22M; the resulting Git pack uses 1.5M, while the checkout itself uses 1.8M.

While Tomboy's built-in synchronisation works live, proper care is needed to stop Tomboy before Git synchronisation, and to start it afterwards. If one plays it straight and never takes chances about this, there is no reason to ever have a synchronisation problem. As I use many scripts and tools for moving files between machines while I wander between home and work, it seems easy to slip a few more commands into the scripts to make sure Tomboy does not run when it should not be running.

As the cleanup bug will likely be corrected in Tomboy some day, the whole trickery above, however fun it may be, might not work for long ☺. However, I'll continue using Git to store history, do synchronisation, and even knowingly let conflicts get in, given the powerful machinery I now have to resolve them.