Overflowing with ideas
I’ve found myself overflowing with ideas as I try to focus on revision, so I thought I might as well scribble some of them down, if only to get them out of my head! To name the prominent few:
- More on Partis and its future
- A web service integration API proposal
- A PHP system to make it easy to plug together a load of code modules for rapid development
- Not to mention some others which are too fleeting or too vague to pin down yet
Partis
Partis is the name of a project I had originally planned to put together many years ago when I started university (back in 2004!).
The basic idea was to create a P2P system that worked internally on networks using IP multicast for peer discovery, which also allowed browsing and cross-network searches. It would provide bandwidth limitations, both in terms of speed and amount, as well as dynamic limits (e.g. 500KB/s for the first 300MB downloaded, then 50KB/s thereafter).
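A limit like that is trivial to express; as a throwaway sketch in PHP (the thresholds are just the example figures above, and the function name is made up):
// Illustrative only: pick the allowed download rate (KB/s) based on how
// much has already been downloaded in the current 24-hour window.
function allowed_rate_kbps($mb_downloaded_today)
{
    if ($mb_downloaded_today < 300) {
        return 500;   // full speed for the first 300MB
    }
    return 50;        // throttled thereafter
}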
While I think this is still a good idea for places like the halls networks (where we were limited to 5GB of bandwidth per 24 hours), I have decided on a better direction for the project, which is something almost completely different: distributed backup.
I don’t want to talk about Partis too much in this post, as I want to discuss other things too, but I will note down some of my ideas and things that have come out of discussion with others.
Changing Partis to a distributed backup solution came from a fairly selfish beginning, as have many of my ideas. If I want to put a system together because I’ll find it very useful (and perhaps indispensable), then perhaps other people will think the same. When I was getting ready to begin my final year project, I wanted to be paranoid about my work. I wanted to guard against losing it in any conceivable way:
- Hard drive or computer dying or becoming corrupted or destroyed
- Losing a USB stick
- Accidental deletion/overwriting
- Malicious tampering
- Theft
- External server failure (e.g. Department of Computing or my own server), although highly unlikely
So after considering all these options, I thought it would be great to create a distributed P2P-style backup solution. Sadly I haven’t had the time to do this, but the basic idea for myself was to use the fact that many of the computers in the Department have an empty partition that is about 20GB in size. If I could use that space to store my data and redundantly distribute it across multiple machines (to account for failures/reformats/reinstalls, the computer being off, booted into Windows, or otherwise inaccessible, etc.) then I would feel a bit safer. And of course I could use some spare space on my housemates’ computers, and maybe my home computer, and my server, and some other friends around the country or even the world…
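To sketch the sort of thing I mean (the chunk size, the replication factor and the send_chunk_to_host() helper are all hypothetical):
// Rough sketch: split a file into fixed-size chunks, checksum each one,
// and hand every chunk to several different hosts so that losing one or
// two machines doesn't lose the data.
function distribute_file($path, array $hosts, $replicas = 3, $chunk_size = 1048576)
{
    $handle = fopen($path, 'rb');
    $index = 0;
    while (!feof($handle)) {
        $chunk = fread($handle, $chunk_size);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $checksum = sha1($chunk);   // guards against corruption or tampering
        // Round-robin placement here; a real version would weight by host
        // reliability and avoid putting replicas on the same machine/site.
        for ($r = 0; $r < $replicas; $r++) {
            $host = $hosts[($index + $r) % count($hosts)];
            send_chunk_to_host($host, $path, $index, $chunk, $checksum);   // hypothetical helper
        }
        $index++;
    }
    fclose($handle);
}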
The data would have to be encrypted, of course, as I wouldn’t want just anyone reading it, and there would have to be checksums to guard against modification or corruption. I quickly realised that this project was something that I wouldn’t have the time to undertake in parallel to my Masters project, and that it might have made a viable alternative. Then of course, there’s the possibility of storing versioned copies of files, and the question of how to distribute everything, and so forth.
Then it hit me that other people would find this useful. Surely most students would want to have a gigabyte or more of redundant, safe storage? This simultaneously creates and solves the problem of finding enough space. As a potential business model, for example, people could have 1GB of free storage by offering 3-5GB of their own disk for use by the network.
There would be different storage tiers based on host reliability (uptime/online time/connection speed/etc.) and so on. It would perhaps be possible to charge a premium for upper-tier storage (i.e. things you’d need to be able to recover ASAP) as opposed to lower-tier storage (data that you want safe but don’t need immediate access to, e.g. family photos), with varying prices depending on the service you offer to others. The better the service you offer (the more you put in), the better the service you get (the more you get out). Of course, those who don’t want to contribute their resources to the network (or can’t do so) would be able to pay for some centralised storage.
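As a very rough illustration of the put-in/get-out idea (the 4:1 ratio and the tier thresholds are plucked out of the air):
// Illustrative only: how much redundant storage a user "earns" from the
// space and reliability they contribute. These numbers are placeholders,
// not a worked-out policy.
function earned_quota_gb($offered_gb, $uptime_fraction)
{
    $base = $offered_gb / 4;            // e.g. offer 4GB, earn 1GB
    return $base * $uptime_fraction;    // unreliable hosts earn less
}

function storage_tier($uptime_fraction, $connection_kbps)
{
    if ($uptime_fraction > 0.95 && $connection_kbps >= 1024) {
        return 'upper';   // fast recovery, premium price
    }
    return 'lower';       // archive-style storage (e.g. family photos)
}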
It’s not just individuals who could use this system, however. Think about companies with large numbers of desktop machines in geographically distinct locations. Bump up the redundancy factor and it could be licensed as an alternative off-site backup solution. Archived data could be distributed across employees’ desktop computers, using a percentage of the disk space that would otherwise just be sitting idle. For a fee.
Of course, there are a large number of things to consider and tweak, such as how to deal with encryption, and the inevitability of people losing their keys. My first thoughts on this are to have a two-layer system, although I’m not sure this is possible. The user has a passphrase-protected key that can decrypt a secondary key which is actually centrally stored and used to encrypt/decrypt their files. Then if users lose their passphrase, their data isn’t lost – another key held in central reserve that requires a combination of personal details to unlock can be used to decrypt the secondary key and create a replacement personal keypair.
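In rough PHP terms, and with the caveat that I haven’t checked whether this is cryptographically sound, the two layers might be wired together something like this (a sketch only, using the standard OpenSSL functions; the recovery keypair is generated inline purely for illustration):
// Layer 1: a personal keypair whose private half is protected by the user's
// passphrase. Layer 2: a random symmetric "secondary" key that actually
// encrypts the files; it is stored centrally only in wrapped form.
$passphrase = 'correct horse battery staple';             // user's passphrase (example)

$keypair = openssl_pkey_new(array('private_key_bits' => 2048));
openssl_pkey_export($keypair, $private_pem, $passphrase); // passphrase-protected private key
$public_pem = openssl_pkey_get_details($keypair)['key'];

$secondary_key = random_bytes(32);                        // the key the files are encrypted with

// Wrap the secondary key with the user's public key for central storage.
openssl_public_encrypt($secondary_key, $wrapped_for_user, $public_pem);

// Also wrap it with a centrally-held recovery key (released only after a
// personal-details check), so a lost passphrase doesn't mean lost data.
$recovery_pair = openssl_pkey_new(array('private_key_bits' => 2048));
$recovery_public_pem = openssl_pkey_get_details($recovery_pair)['key'];
openssl_public_encrypt($secondary_key, $wrapped_for_recovery, $recovery_public_pem);

// Normal use: unlock the private key with the passphrase, recover the
// secondary key, then encrypt/decrypt file data with it.
$private_key = openssl_pkey_get_private($private_pem, $passphrase);
openssl_private_decrypt($wrapped_for_user, $secondary_key, $private_key);

$file_data = 'contents of a file to back up';             // placeholder data
$iv = random_bytes(16);
$ciphertext = openssl_encrypt($file_data, 'aes-256-cbc', $secondary_key, OPENSSL_RAW_DATA, $iv);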
That’s probably enough about Partis for now!
Update (2008-04-29 18:25): Dan Lester (who uses the same WP theme – what are the chances?) has drawn my attention to Zoogmo, which is similar on the surface, but is Windows-only and based around the idea of sharing with friends rather than the Partis network. Also, they don’t yet appear to have different tiers. They do have a suitable “Web 2.0” name though. Something to look into and keep an eye on. Are any ideas original any more?
Web service integration
Something that came to my attention the other day when reading Charles Miller’s blog was the fragmentation of data across the web. There are few central authentication schemes (such as OpenID) and little support for them, and there is no standard for integrating the services offered by the multitude of different sites. This isn’t helped by the lack of incentive for sites to bring their offerings together, but I believe such integration could be really useful. I then read various blog posts on the subject and decided that I would like to hammer out the beginnings of a web interaction/integration API. With help from other people, of course!
More on this another time. All I will say for now is a potential name: OpenIntegrate.
Pluggable PHP system
While in the shower the other day (an excellent place for ideas to appear, it seems), I decided that what I really want to create is a library of PHP code that I can just plug together for various tasks. It would have to know a bit about database layout, so that I didn’t really have to think about that in advance. The more I thought about it, the more developed the idea became. The system would be a set of PHP files and a preprocessor, so that modules could have optional dependencies (and hence conditional code generation/inclusion) without runtime cost, and so that the database could be created/updated automatically to match.
For example, I was recently thinking about how persistent logins work, and how nice it would be to write a component once that I could just add to a generic user authentication component. A user authentication component requires (at minimum) a username and password field in a users table of a database. When coupled with a persistent login cookie system, however, either the users table has to be updated to include a persistent_token field (and maybe a series_id, see here) or a new table is required if you want to allow the login to be remembered from multiple places.
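To sketch what the persistent-login check itself might end up doing (loosely following the series/token idea mentioned above; the cookie format and the db_query() wrapper are hypothetical):
// Sketch only: validate a persistent-login cookie of the form "series:token"
// against persistent_token/series_id columns on the users table.
function check_persistent_login($cookie_value)
{
    $parts = explode(':', $cookie_value, 2);
    if (count($parts) !== 2) {
        return false;
    }
    list($series_id, $token) = $parts;

    // db_query() is a stand-in for whatever DB layer the module uses.
    $row = db_query('SELECT username, persistent_token FROM users WHERE series_id = ?',
                    array($series_id));
    if (!$row) {
        return false;                     // unknown series: fall back to a normal login
    }
    if ($row['persistent_token'] !== $token) {
        // Right series but wrong token: the cookie may have been stolen,
        // so this series should be invalidated rather than just ignored.
        return false;
    }
    // Matched: log the user in and issue a fresh token for this series.
    return $row['username'];
}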
It would be really nice if the PHP could take care of these database modifications by itself, and automatically integrate with the user login module, so that I don’t have to worry about it. That way, I can just put a load of files together, run something similar to rake/bake on them, and end up with an SQL file containing the required database schema, with bits of the PHP code included/excluded (or commented/uncommented) based on the available modules. This would avoid a number of runtime checks for optionally available functionality. Each module would then contain a comment header, perhaps something like:
/*# PROVIDES: userauth
 *# REQUIRES:
 *# SUPPORTS: persistlogin
 *# DEFAULT_DEF: TBL_USER="users"
 *# DB_REQUIRE: [ TBL_USER, [ username: varchar50, password: char32 ] ]
 */
and then:
/*# PROVIDES: persistlogin
 *# REQUIRES: userauth
 *# SUPPORTS: persistlogin
 *# DB_REQUIRE: [ TBL_USER, [ persistent_token: char32, series_id: char32 ] ]
 */
… and then make a sensible minimum required database structure out of them. Of course, much more than this would be required, and the format described above is very nasty, but these are just first thoughts. You could then have conditional compilation:
/*# IF HAS_MODULE("persistlogin")
// code to check for persistent login cookie
if (isset($_COOKIE['persist_login'])) { }
elseif ($valid_session_in_db) { // placeholder: result of a DB lookup
*# ELSEIF #*/
if ($valid_session_in_db) { // placeholder: result of a DB lookup
/*# ENDIF #*/
    // log them in
} else {
    // ...
}
or:
/*# IF HAS_MODULE("persistlogin") #*/
// code to check for persistent login cookie
if (isset($_COOKIE['persist_login'])) { }
elseif ($valid_session_in_db) { // placeholder: result of a DB lookup
/*# ELSEIF #*
if ($valid_session_in_db) { // placeholder: result of a DB lookup
*# ENDIF #*/
    // log them in
} else {
    // ...
}
depending on whether the “persistlogin” module is available at the time the preprocessor is run (the first form when it is not available, with the persistent-login branch commented out; the second when it is).
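And to sketch the other half of the idea, here’s a deliberately naive pass at turning those DB_REQUIRE headers into a starting schema (the type names are the made-up ones from the headers above):
// Naive sketch: merge DB_REQUIRE declarations from the available modules
// into one column list per table and emit CREATE TABLE statements.
$type_map = array('varchar50' => 'VARCHAR(50)', 'char32' => 'CHAR(32)');

// DB_REQUIRE entries gathered from the available modules' headers.
$db_requires = array(
    array('TBL_USER', array('username' => 'varchar50', 'password' => 'char32')),
    array('TBL_USER', array('persistent_token' => 'char32', 'series_id' => 'char32')),
);
$defaults = array('TBL_USER' => 'users');   // from DEFAULT_DEF

// Merge the column requirements per table...
$tables = array();
foreach ($db_requires as $req) {
    list($table_const, $columns) = $req;
    $table = $defaults[$table_const];
    foreach ($columns as $name => $type) {
        $tables[$table][$name] = $type_map[$type];
    }
}

// ...and emit a minimal CREATE TABLE for each.
foreach ($tables as $table => $columns) {
    $defs = array();
    foreach ($columns as $name => $sql_type) {
        $defs[] = $name . ' ' . $sql_type;
    }
    echo 'CREATE TABLE ' . $table . ' (' . implode(', ', $defs) . ");\n";
}
which would produce a single users table containing all four columns.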
I thought about calling it GHOTI (pronounced “fish”) because I like quirky words, but couldn’t immediately come up with a backronym. G- H- Offering Tight Integration? Maybe I’ll come up with something from Latin instead. Or call it RADPHP or PHP-RAD. Who knows?
Anyway, that’s probably enough of my random ideas for now. More to come after exams, I don’t doubt! First one is tomorrow at 14:30. Sigh.