Insight: Implementation Thoughts
So, I’ve been reading the book about BFS (the file system of BeOS), which shares some ideas with mine, although it still adheres to the idea of a hierarchical filesystem. It goes into some depth, and although I’ve only reached partway through chapter 6, I have found it to be a very useful resource, particularly with a brief overview of the heuristics of search optimisations.
Anyway, the key thing recently is that I’ve started to think about how to implement the data structures for DBFS that deal with storing the attributes, and I feel that I have come across a rather neat solution. Whether it is efficient or fast is another matter entirely, but that is definitely not the point of this project; I just want to prove it is possible. If performance isn’t too bad, then so much the better.
Incidentally, I was thinking that I might implement this as a userspace program first, and then write it into the filesystem if possible. Anyway, I digress. Back to the data structures:
Part of this came from the idea that grouping and ordering are distributive, that is:
a/[b c d]
is equivalent to:
[a/b a/c a/d]
which seems to be sensible. Of course, the reverse is also true, allowing the inverse operation. Given this, data nodes should store:
- inodes – a pointer to a linked list of inodes with this attribute
- subitems – a pointer to the root of a B+ tree containing subvalues of this attribute
Note that if inodes is null, then you cannot use the attribute for a query by itself; it exists only to give semantic structure to other attributes. If subitems is null, then this attribute has no further values. If both inodes and subitems are null, then this attribute should be deleted as it serves no useful purpose (i.e. it is a value with no further semantic meaning which represents no files).
This then provides a multi-level tree, with the top-level root referenced by a field in the superblock.
Note that this structure provides an efficient way to find all the values of an ordering. This can also be isomorphic to key/value pairs, when the ordering tree has only two levels. It can also represent multilevel structured keys with values:
a/b == a="b"
a/b/c == a/b ="c" or a.b="c"
This appears to be a very neat solution indeed. I can’t wait to implement it!
Edit (2007-12-02): Added diagram
Leave a Reply