A content-based file manager
I’ve been thinking recently about Insight again, and I’ve been considering part of the problem with naming and uniqueness.
Names in a traditional file system are made unique based on a full path to the file, but most people think of a file name as just the final component. This would then cause a problem with the move to Insight, as a file could appear in multiple directories, and its only distinguishing feature would be the final component of its path. This is counter-intuitive and can cause all sorts of problems.
Consider makefiles, for example. They rely on a file with a standard name (Makefile) appearing at various levels in the hierarchy in order to work. Obviously, you would want different makefiles at different levels and in different projects, but Insight as it stands has no way to handle this.
I then started thinking about what makes a file unique. In the end, I came up with two things: name and content. This covers the makefile case (same name, different content) as well as the backup case (same content, different name). It then occurred to me that, in the general case, all you need to distinguish a file is its content, and then actually finding it can all be left up to metadata.
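To make that concrete, here is a minimal sketch in Python. It assumes the identifier is derived by hashing the content with SHA-256, which is just one way of doing it and not something Insight does today; the `ContentStore` class and its method names are made up for illustration. Names are kept purely as metadata attached to the content.

```python
import hashlib
from collections import defaultdict


def content_id(data: bytes) -> str:
    """Derive an identifier from the file's bytes alone (assumption: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Hypothetical store: content is keyed by its digest, names are metadata."""

    def __init__(self):
        self.blobs = {}                # content id -> bytes
        self.names = defaultdict(set)  # content id -> names attached to it

    def add(self, name: str, data: bytes) -> str:
        cid = content_id(data)
        self.blobs[cid] = data
        self.names[cid].add(name)
        return cid

    def find_by_name(self, name: str) -> list:
        """Return every content id that carries the given name."""
        return sorted(cid for cid, names in self.names.items() if name in names)


store = ContentStore()

# The makefile case: same name, different content, so two distinct files.
app_makefile = store.add("Makefile", b"all: app\n")
lib_makefile = store.add("Makefile", b"all: lib\n")

# The backup case: same content, different name, so one file with two names.
report = store.add("report.txt", b"quarterly numbers\n")
backup = store.add("report.bak", b"quarterly numbers\n")
```

In this sketch a lookup for "Makefile" returns two identifiers, while the report and its backup collapse into a single piece of content that happens to carry two names.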
If files are then thought of as containers for data that happen to have a unique internal identifier (which never needs to be exposed to the user, although it can be accessed as, say, the file’s inode number) then the idea of a content-based file manager comes into play. These examples work best with visual media, particularly images, but there is no reason in principle that this could not be extended.
Imagine searching for a file. You know it’s a photo, but you have a large collection of them. With a digital camera and a large-capacity memory card, who ever needs to delete a photo? We’ll assume that you’ve diligently tagged the photos with metadata as you’ve imported them from the camera, through some easy batch process.
On the tagging point: a lot can be taken from the metadata stored by the camera (date/time, resolution, orientation, black and white/colour, perhaps GPS co-ordinates) and with the right tools, more can be inferred (auto-tagging faces, buildings, perhaps recognising common events like football matches, converting GPS co-ordinates to places, …). As time goes on, people will need to do less and less manual tagging.
Anyway, back to the file manager. You know you are after a picture or a set of pictures. Normal thought processes will probably follow a path similar to: “Yeah, I wanted to show dad those photos from that holiday in Paris that we had two months ago. I think he’d particularly like the ones we got of the Louvre, as well as the ones with me in, of course.” The key words in there (photos, holiday, Paris, two months ago, the Louvre, me) can be translated directly to metadata searches. Notice how these all involve a narrowing down of the query.
To convert these to filters, we then have:
- type: photo
- holiday
- location: Paris
- date: two months ago
- at least one of:
  - location: Louvre
  - person: Me
This could also be represented by a query:
type:photo AND holiday AND location:Paris AND date:-2m
AND (location:Louvre OR person:me)
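As a rough sketch of how such a query might be evaluated, here it is written out as a plain Python predicate. The record layout (`type`, `tags`, `locations`, `people`, `date`) and the file names are invented for the example, and I am assuming that `date:-2m` means "taken in the calendar month two months before today", which is only one possible reading.

```python
from datetime import date

# Hypothetical photo records; the field names are illustrative only.
photos = [
    {"name": "louvre.jpg", "type": "photo", "tags": {"holiday"},
     "locations": {"Paris", "Louvre"}, "people": set(), "date": date(2024, 4, 12)},
    {"name": "selfie.jpg", "type": "photo", "tags": {"holiday"},
     "locations": {"Paris"}, "people": {"me"}, "date": date(2024, 4, 13)},
    {"name": "street.jpg", "type": "photo", "tags": {"holiday"},
     "locations": {"Paris"}, "people": set(), "date": date(2024, 4, 13)},
    {"name": "rome.jpg", "type": "photo", "tags": {"holiday"},
     "locations": {"Rome"}, "people": {"me"}, "date": date(2024, 4, 20)},
    {"name": "old_paris.jpg", "type": "photo", "tags": {"holiday"},
     "locations": {"Paris", "Louvre"}, "people": {"me"}, "date": date(2023, 8, 1)},
    {"name": "notes.txt", "type": "document", "tags": set(),
     "locations": set(), "people": set(), "date": date(2024, 4, 12)},
]


def months_ago(today: date, months: int) -> date:
    """First day of the calendar month `months` before `today`."""
    index = today.year * 12 + (today.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def matches(photo: dict, today: date) -> bool:
    """type:photo AND holiday AND location:Paris AND date:-2m
    AND (location:Louvre OR person:me)"""
    target = months_ago(today, 2)
    return (
        photo["type"] == "photo"
        and "holiday" in photo["tags"]
        and "Paris" in photo["locations"]
        and (photo["date"].year, photo["date"].month) == (target.year, target.month)
        and ("Louvre" in photo["locations"] or "me" in photo["people"])
    )


today = date(2024, 6, 15)
result = [p["name"] for p in photos if matches(p, today)]
print(result)  # ['louvre.jpg', 'selfie.jpg']
```

Each AND clause only ever removes candidates, which is the narrowing down mentioned earlier; the OR group is the one place where either condition is enough.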
Breaking it down in this way feels fairly technical and wordy, however. I’d much prefer a visual view.
Imagine a black field, speckled with points of light representing your photos.
You filter by “holiday”, and (because it learns based on previous searches) it then groups by location. The ones which have been filtered out fade into nothing, and the photos group into labelled blobs and enlarge slightly.
You filter by date, and as you drag the slider, irrelevant items fade away and relevant ones enlarge.
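Behind the visuals, those two steps could be as simple as the following Python sketch. The tuple layout, the function names and the sample photos are all assumptions for illustration, and the "learning" that picks location as the grouping key is left out entirely.

```python
from collections import defaultdict
from datetime import date

# Hypothetical photos as (name, tags, location, date taken).
photos = [
    ("beach.jpg", {"holiday"}, "Nice", date(2024, 2, 3)),
    ("louvre.jpg", {"holiday"}, "Paris", date(2024, 4, 12)),
    ("eiffel.jpg", {"holiday"}, "Paris", date(2024, 4, 13)),
    ("desk.jpg", {"work"}, "London", date(2024, 4, 14)),
]


def filter_and_group(photos, tag):
    """Keep photos carrying `tag` and gather them into one blob per location."""
    groups = defaultdict(list)
    for name, tags, location, taken in photos:
        if tag in tags:
            groups[location].append((name, taken))
    return dict(groups)


def in_window(groups, start, end):
    """Keep only photos inside the slider's date window; empty blobs vanish."""
    visible = {}
    for location, items in groups.items():
        names = [name for name, taken in items if start <= taken <= end]
        if names:
            visible[location] = names
    return visible


blobs = filter_and_group(photos, "holiday")
visible = in_window(blobs, date(2024, 4, 1), date(2024, 4, 30))
print(sorted(blobs))  # ['Nice', 'Paris']
print(visible)        # {'Paris': ['louvre.jpg', 'eiffel.jpg']}
```

Dragging the slider would just mean calling `in_window` again with a new range; whatever drops out of the result is what fades away on screen.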
Then you add the final filters and set the photos up for viewing, perhaps as a slideshow… and you’re done!
Pretty neat, I think.