Insight: Finished reading about BFS
Well, I’ve recently finished reading the BeOS File System (BFS) book, and it was a very interesting read. It contained a lot of useful information and really helped me think about the system I’m writing. I’ll put some of the ideas that came to mind in this post.
Firstly, it helped me settle on using B+ trees, as they perform very well on block-based storage, where reading or writing a whole block at a time is cheap but random access is expensive (unlike RAM, where random access is fast). It was slightly surprising to discover that it can be quicker to read straight through a run of contiguous, unneeded data lying between two wanted blocks than to seek to the exact position of the next block required. It seems that, in the world of file system performance, locality of reference is king.
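To make the seek-versus-read-through trade-off concrete, here is a rough back-of-the-envelope sketch. The figures (an 8 ms seek and 100 MB/s sequential throughput) are assumptions picked for illustration, not numbers from the BFS book, and the function names are made up for this post.

```python
# Assumed, illustrative disk characteristics (not taken from the BFS book).
ASSUMED_SEEK_MS = 8.0            # time to seek to a new position
ASSUMED_THROUGHPUT_MB_S = 100.0  # sequential transfer rate, 1 MB = 1,000,000 bytes


def transfer_time_ms(num_bytes, throughput_mb_s=ASSUMED_THROUGHPUT_MB_S):
    """Milliseconds needed to stream num_bytes sequentially."""
    # 1 MB/s is 1000 bytes per millisecond.
    return num_bytes / (throughput_mb_s * 1000.0)


def break_even_gap_bytes(seek_ms=ASSUMED_SEEK_MS,
                         throughput_mb_s=ASSUMED_THROUGHPUT_MB_S):
    """Largest gap of unneeded data that is still no slower to read than to skip."""
    return int(seek_ms * throughput_mb_s * 1000.0)


def read_through_is_faster(gap_bytes, seek_ms=ASSUMED_SEEK_MS,
                           throughput_mb_s=ASSUMED_THROUGHPUT_MB_S):
    """True if streaming over the unneeded gap beats seeking past it."""
    return transfer_time_ms(gap_bytes, throughput_mb_s) < seek_ms


if __name__ == "__main__":
    print(break_even_gap_bytes())             # break-even gap in bytes
    print(read_through_is_faster(64 * 1024))  # a small gap between two blocks
    print(read_through_is_faster(4_000_000))  # a much larger gap
```

Under these assumed numbers, a gap of a few hundred kilobytes is cheaper to read than to skip, which is why keeping related blocks close together pays off.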
Secondly, it helped me refine my ideas about how I am going to work with queries. I think that I will allow applications to create query syntax trees themselves (in a standard way), as well as providing routines that parse strings in a known format and return the query syntax trees that represent them. This will allow both the syntax described in Reiser’s whitepaper (used elsewhere in this blog) and the possibly more natural syntax of Boolean expressions, e.g.:
-
type/music artist/Queen album/["A Day at the Races" "A Night at the Opera"] "air guitar"
-
type="music" artist="Queen" album=["A Day at the Races" "A Night at the Opera"] "air guitar"
-
(type="music" AND artist="Queen" AND (album = "A Day at the Races" OR album = "A Night at the Opera") AND TAG("air guitar"))
Interesting how the Boolean query looks longer, yet these are all equivalent. Note that in comparing Reiser to Boolean queries,
equals
equals
(with the last form probably the only one available to Boolean queries).
This will allow applications to create the query syntax trees in their own way (perhaps even graphically) without having to internally create the strings in the required format. This should make interoperability easier, at the potential expense of creating a proliferation of query languages. There will have to be tight controls as to exactly how this should be used.
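As a minimal sketch of what building a query syntax tree directly might look like, the example below constructs the Queen query from above without going through any string syntax. The node types (`Eq`, `Tag`, `And`, `Or`), the `Item` class and the `matches` function are all hypothetical names invented for this illustration; they are not an existing API, and a real system would evaluate queries against indexes rather than one item at a time.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Eq:
    """attribute = value"""
    attr: str
    value: str


@dataclass(frozen=True)
class Tag:
    """Item carries the given free-form tag."""
    name: str


@dataclass(frozen=True)
class And:
    children: Tuple[object, ...]


@dataclass(frozen=True)
class Or:
    children: Tuple[object, ...]


@dataclass
class Item:
    """Hypothetical stored object with named attributes and a set of tags."""
    attrs: Dict[str, str] = field(default_factory=dict)
    tags: Set[str] = field(default_factory=set)


def matches(node, item):
    """Evaluate a query syntax tree against a single item."""
    if isinstance(node, Eq):
        return item.attrs.get(node.attr) == node.value
    if isinstance(node, Tag):
        return node.name in item.tags
    if isinstance(node, And):
        return all(matches(child, item) for child in node.children)
    if isinstance(node, Or):
        return any(matches(child, item) for child in node.children)
    raise TypeError(f"unknown query node: {node!r}")


# The Boolean query from the post, built as a tree instead of a string.
query = And((
    Eq("type", "music"),
    Eq("artist", "Queen"),
    Or((Eq("album", "A Day at the Races"),
        Eq("album", "A Night at the Opera"))),
    Tag("air guitar"),
))

track = Item({"type": "music", "artist": "Queen",
              "album": "A Night at the Opera"}, {"air guitar"})
other = Item({"type": "music", "artist": "Queen",
              "album": "Jazz"}, {"air guitar"})

if __name__ == "__main__":
    print(matches(query, track))
    print(matches(query, other))
```

A parser for either string syntax would then only need to produce the same node types, so every front end ends up sharing one evaluator.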
Other useful things that I’ve taken away from reading the BFS book include the fact that reading a list of attributes is kind of like reading a directory. Creating an attribute is like creating a file in a directory, and so on. This could lead to some interesting and elegant formulations that should be independent of the underlying representation.
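One way to picture the directory/attribute parallel is a single "named entries" interface that serves both roles. This is only a sketch under my own assumptions: `EntryContainer` and `File` are hypothetical names, everything lives in memory, and nothing here reflects how BFS actually lays things out on disk.

```python
class EntryContainer:
    """Anything that holds named entries: a directory, or a file's attribute list."""

    def __init__(self):
        self._entries = {}

    def create(self, name, value):
        """Add a new entry, refusing to overwrite an existing one."""
        if name in self._entries:
            raise FileExistsError(name)
        self._entries[name] = value

    def names(self):
        """List entry names, like reading a directory."""
        return sorted(self._entries)

    def open(self, name):
        """Return the entry stored under name."""
        return self._entries[name]


class File:
    """A file whose attributes are exposed through the same container interface."""

    def __init__(self, data=b""):
        self.data = data
        self.attributes = EntryContainer()


root = EntryContainer()               # plays the role of a directory
song = File(b"...")                   # placeholder contents
root.create("song.mp3", song)         # creating a file in a directory
song.attributes.create("artist", "Queen")  # creating an attribute on a file
song.attributes.create("album", "A Night at the Opera")

if __name__ == "__main__":
    print(root.names())
    print(song.attributes.names())
```

Because callers only see `create`, `names` and `open`, the code that lists a directory and the code that lists attributes can be the same, whatever the underlying representation turns out to be.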
Update [2007-12-19]: Finished reading a load of articles with some more interesting information, including an interview with the BeOS guys from The Register.