So, I’ve been reading the book about BFS (the file system of BeOS), which shares some ideas with mine, although it still adheres to the idea of a hierarchical filesystem. It goes into some depth, and although I’m only partway through chapter 6, I have found it to be a very useful resource, particularly its brief overview of the heuristics of search optimisations.
Anyway, the key thing recently is that I’ve started to think about how to implement the data structures for DBFS that deal with storing the attributes, and I feel that I have come across a rather neat solution. Whether it is efficient or fast is another matter entirely, but that is definitely not the point of this project; I just want to prove it is possible. If performance isn’t too bad, then so much the better.
Apologies to anyone this confuses, but this post is just a collection of random thoughts about DBFS (now called Insight); what it might do, how it might be implemented… all sorts of things. Some of these will be taken from The Book of the Project, and some have developed as I do more research.
After some messing around, I thought I’d share a fix I’ve found for a problem I’ve been having with gvim on my home Linux machine.
Basically, the issue was that gvim would be missing the menu bar. My guioptions had “m” (display menubar) enabled, and $VIMRUNTIME/menu.vim was being sourced. No matter what I tried, the menu bar stayed hidden – it had vanished, and I couldn’t tell why. After some searching with various combinations of “gvim menu vanished” and similar, I finally ran across the solution…
So I’ve begun my research with Hans Reiser’s The Naming System Venture whitepaper, which is now no longer available online. Unfortunately the Wayback Machine doesn’t provide the diagrams, so unless I can find anyone with a mirror of it (so far unsuccessful) then I shall just have to do without them. Incidentally, I am also writing it up again myself, so I will make a PDF available here.
The more I read and re-read this paper, the clearer some of my own ideas become, and my initial intuition that I would have to write my own database system seems more valid. The only thing is that I had been thinking in terms of a relational database, as that is all that I have really had experience with. I have plans for multiple indexes (or indices, whichever plural you prefer) linking into the data, with particular optimisations for finding distinct items.
I have also found an interesting project by a student called Onne Gorter at the University of Twente, in the Netherlands, written in August 2004. He aimed to create a database filesystem in O’Caml, integrated fairly tightly with KDE. He took quite a different approach to me, which is refreshing and reassuring, but he also took my planned name. Now I need a better one… but he has a useful bibliography, which may indicate some further items to read.
Finally, I have also briefly investigated Google’s BigTable system, and that also has some interesting ideas. I shall have to consider this further. Maybe later I will write a post explaining some of the things in my DBFS journal, a.k.a. The Book Of The Project.
Well, the initial individual project allocations have been published, and it looks like I’ve definitely had my project confirmed. As far as I know, it was going to happen anyway, but it’s nice to have official confirmation now, so I can start working on it.
What is my project? Well, it was my own proposal, and the title is Towards a Database Filesystem, which may give you a clue. The aim of this project is to build a database filesystem that can store various elements of metadata about a file, allowing you to categorise and organise files differently depending on how you’re feeling or how you want to find your data.
So, I’m now back at University – it feels good to be back! I’ve moved back into the same house as last year, with the same guys as before, which is great. They’re a wonderful bunch, and we live more like a family than a collection of students.
Today was the Freshers Fair, and it was great to see a lot of people again after six months away. I saw a few people from the department, as well as some other very good friends and my dance partner. It’s amazing to realise just how much I’ve missed everybody, and how glad I am to be there with them again. I’m really going to miss this place when I leave this year.
There has been some recent controversy over Facebook’s Terms and Conditions: specifically, it seems that these Terms imply that they own any Content that you post to the site. When I first heard about this on 1st October 2007, I thought that it couldn’t be right. A company couldn’t just bury that in their Terms and Conditions… could they?
I decided to go through it, and having assuaged my own doubts, I felt it was worth explaining in detail to anyone who is interested. Hi to anyone visiting from a group set up about this.
This is something I’ve been meaning to write for a while, and finally got round to doing. Basically, it’s a simple Thunderbird extension that makes it easier for you to check other folders for new messages. I’ve named it FolderCheck* (rather unimaginatively), and it’s available from the Mozilla Add-ons site sandbox.
The problem is that, at the moment, Thunderbird will only check either the inbox or every folder for messages (for all accounts). This is pretty useless for me, as I only get new email in my inbox for some accounts, and get my email server-sorted into different folders for other accounts. This means that if I want to tell Thunderbird to check a specific folder for messages, I have to right-click the folder, choose “Properties”, check “Check this folder for new messages”, then click OK. And do it all over again for the next folder. And the next. This quickly gets irritating!
My extension adds a new “Check for new items” item to the context menu for each folder (except Inbox and some other special folders) that allows you to quickly see whether a folder is checked for new messages, and to quickly and easily toggle that setting. Now I just have to right-click each folder and hit “h” to toggle the setting.
Coming soon (when I get round to it): a dedicated window to make multiple folder selection easy. My current thought is a list of folders with a filter box and “(De)Select All Visible Folders” buttons.
* Note: currently requires a free Mozilla Developer account, as it’s not yet a public extension.
So, I got to thinking about xkcd’s famous (or infamous) map of the IPv4 address space. And, naturally, I wondered if I could write a program to draw it. After looking around on their forums, I found an algorithm or two which seem to do this, and all it needs is the input list of integers. So the algorithm (in fact, a plethora of them) exists and is straightforward… probably even fast. All that needs to happen now is to hook it up to GD or something similar and watch it draw.
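For reference, one well-known way to turn a position along a Hilbert curve into grid coordinates is the iterative index-to-(x, y) conversion. The sketch below is a generic version of that conversion, not necessarily any of the algorithms from the forums, and the orientation of the curve it produces may well differ from the one on the xkcd map. The function name `hilbert_d2xy` is my own invention.

```python
def hilbert_d2xy(side, d):
    """Map distance d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two, and 0 <= d < side * side.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        # Pick out the two bits of d that select the quadrant at this scale.
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the sub-square built so far so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        # Move into the selected quadrant.
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


if __name__ == "__main__":
    # Walk a tiny 4x4 curve to see the order in which cells are visited.
    print([hilbert_d2xy(4, d) for d in range(16)])
```

For the full IPv4 space, each address would be treated as an integer d with a side of 2^16, which is exactly where the size worries start.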
Then I started worrying.
IPv4 is relatively small, but it’s actually very big, isn’t it? How much space would this image take?
Well, there are 2^32 possible addresses in IPv4. The Hilbert curve ends up with a square, so that’s 2^16 pixels per side. But wait; it’s a space-filling curve. That means we need to double the dimensions to have a 1px gap between each part. And add an extra pixel for a nice 1px border all round. That’s 2^17+1 pixels per side. Now, let’s decide to use four colours in our diagram. There’s the background colour (e.g. white), the line colour (grey), the allocated address colour (red) and the unallocated address colour (green). So, at 4 bits per pixel, we want to store (2^17+1)^2 pixels, or ((2^17+1)^2)/2 bytes. That works out as being 8,590,065,665 bytes or so. Ignoring headers. That’s almost exactly 8GiB.
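The sums above are easy to check with a few lines of Python. This is only a sketch of the arithmetic in this post: the variable names are mine, and it ignores file headers and any per-row padding that a real bitmap format might add.

```python
# Back-of-the-envelope size of the IPv4 Hilbert map, as described above.
ADDRESS_BITS = 32
cells_per_side = 2 ** (ADDRESS_BITS // 2)   # 2^16 addresses along each side
pixels_per_side = 2 * cells_per_side + 1    # doubled for 1px gaps, plus 1px border
bits_per_pixel = 4                          # the 4 bits per pixel chosen above

total_pixels = pixels_per_side ** 2
# Round up to a whole number of bytes (the pixel count is odd).
total_bytes = (total_pixels * bits_per_pixel + 7) // 8
total_gib = total_bytes / 2 ** 30

print(f"{pixels_per_side} px per side")
print(f"{total_bytes:,} bytes, about {total_gib:.4f} GiB")
```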
So, say, after generating this huge 8GiB bitmap, we decide we want to print it. At, let’s say, 300dpi (which is a fairly high resolution for printers, or at least it used to be). It works out that the image would therefore be just under 437 inches each way, or just under 36.5 feet, or about 12 yards (or approx 11.1m if you’re feeling metric). That’s a pretty big picture. Even at twice the resolution, it’s 6 yards (5.55m) on a side. Chances are, at that fine a dot size, you wouldn’t be able to make out the individual addresses anyway!
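The print-size figures can be checked the same way. Again, this is just a sketch of the unit conversions used above; `print_size` is a made-up helper name, and 131073 is the 2^17+1 pixels per side from earlier.

```python
def print_size(pixels, dpi):
    """Return the printed length of `pixels` at `dpi` in several units."""
    inches = pixels / dpi
    return {
        "inches": inches,
        "feet": inches / 12,
        "yards": inches / 36,
        "metres": inches * 0.0254,  # 1 inch is exactly 2.54 cm
    }


if __name__ == "__main__":
    side_pixels = 2 ** 17 + 1
    for dpi in (300, 600):
        size = print_size(side_pixels, dpi)
        print(dpi, "dpi:", {unit: round(value, 2) for unit, value in size.items()})
```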
OK, another personal post: my girlfriend told me to do this! Basically, you have to put your music player on random, with your entire collection, and (being honest), choose each random track as it comes and put them next to these headings in order:
Opening Credits: I Just Can’t Wait To Be King (The Lion King)
Waking Up: This Time Next Year (Sunset Blvd)
First Day At School: Orinoco Flow (Celtic Chillout)
Falling In Love: When You Are Old And Grey (Tom Lehrer)
Fight Song: From Russia With Love (Ballroom Instrumental)
Breaking Up: Could We Start Again Please (JCS)
Prom: A Whiter Shade of Pale (Procol Harum)
Life: Be Lucky (Show of Hands)
Making babies: Land of the 1000 Dances (Ballroom Jive)
Mental Breakdown: I Hold Your Hand In Mine (Tom Lehrer)
Driving: The Music of the Night (Phantom)
Flashback: Memory (Evita)
Getting back together: No Questions Asked (Fleetwood Mac)
Wedding: Read ‘Em and Weep (Meat Loaf)
Birth of Child: Por Una Cabeza (Carlos Gardel)
Final Battle: I Can Hear Music (Beach Boys)
Death Scene: Hushabye Mountain (Chitty Chitty Bang Bang)
Funeral Song: Santiago (Show of Hands)
End Credits: All I Ask of You (Phantom)
… I think my music player knows things that I don’t. What’s more worrying is that I didn’t cheat: that was the actual list. Whether you believe me or not, I know I’m not lying. I’m worried.