The Man Also Known As "I'm Batman!"

Hello Citizen! Would you like to be my sidekick? Would you like to ride with... Batman?

Thursday, June 23, 2005

We've moved! Check out all these great articles and others at http://akaimbatman.intelligentblogger.com!

The Linux Desktop Distribution of the Future Part 3

Category: Conceptual Design

This article is part of a four part series intended to provide some thought into how a future Linux Desktop might work. It is not intended to be a comprehensive essay, although all the concepts presented here are considered "doable" by the author.

Part 1: Linux and the Desktop Today
Part 2: Applications
Part 3: File Management
Part 4: The Desktop Interface

Part 3: File Management

As computer systems grow in complexity, they tend to drag that complexity into the user's filesystem, blurring the line between the files that matter to the user and the files that exist only to hold application data. The user is left unsure where his files should be placed. In more extreme cases, he can even lose track of where his documents are! In this episode we'll look at ways in which a Linux system can further reduce complexity in these areas.

Database File Systems

If you haven't read my previous article on Database File Systems, I highly recommend that you do so now.

The key to detangling the complexity of modern file systems is to separate system files from documents. Existing systems have attempted to solve this by encouraging users to place documents in their home directory, but this can create more problems as the user interface hooks into special directories. Under existing Linux interfaces, users may not even be able to access files on their desktop without using the graphical interface!

In the previous episode we reduced the complexity of applications by effectively making them into documents. But that still leaves the user with some confusion as to how to organize their documents and applications vs. their system files. To solve this, we need to split the file system into two areas:

A partition for the core system libraries, root files, and /usr files.
A DBFS partition for the user's applications and documents.

Core System Libraries

One of the most important things that a desktop user needs is to get out of the business of maintaining system files. Such maintenance was always problematic for Windows users and even gained the title of "DLL Hell". Microsoft eventually solved this issue through several OS changes, not the least of which was encouraging application writers to keep their DLLs to their own program directories. Most Windows programs today install into a folder in the "Program Files" directory with no extraneous files in the System32 directory.

Linux, on the other hand, took the approach of package management. No real standard existed for the base system; instead, the core libraries could be updated at will. This led to the situation where a given distro version may mean a different set of available APIs under each installation. Standards such as LSB have been proposed to help alleviate this situation, but such efforts have mostly focused on providing a minimum of low level APIs. What is necessary in a Desktop focused distro is that the developer be able to count on a specific set of APIs for a given system level.

For example, let's create a mythical desktop Linux OS called DeskLin. For version 5.0, DeskLin publishes APIs for GTK 2.1 and QT 3.2. Thus all programs based on GTK 2.1 or lower and QT 3.2 or lower should work. If a developer wants to create a program that uses the FLTK toolkit instead, he'll need to package the FLTK shared objects in the lib folder of his application bundle. Now let's say that DeskLin Inc. notices a large number of applications using the FLTK APIs. To reduce waste, they may choose to include FLTK as a core API in DeskLin 6.0. This then allows application developers to take advantage of these APIs as long as they target version 6.0 and up of the DeskLin distribution. The application developer, however, can still target version 5.0 by including the FLTK libraries.
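The DeskLin example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the API tables mirror the mythical versions above, the FLTK version number is made up for the example, and `libraries_to_bundle` is a hypothetical helper rather than part of any real packaging tool.

```python
# Hypothetical API tables for the mythical DeskLin distro described above.
# Versions are tuples so they compare correctly: (2, 1) < (2, 10).
CORE_APIS = {
    (5, 0): {"gtk": (2, 1), "qt": (3, 2)},
    # Assumption: DeskLin 6.0 adds FLTK; the (1, 1) version is invented.
    (6, 0): {"gtk": (2, 1), "qt": (3, 2), "fltk": (1, 1)},
}


def libraries_to_bundle(target_os, needs):
    """Return the toolkits an app must ship in its own lib folder.

    needs maps a toolkit name to the newest version the app relies on.
    """
    provided = CORE_APIS[target_os]
    bundle = []
    for toolkit, wanted in sorted(needs.items()):
        available = provided.get(toolkit)
        # Bundle when the core system lacks the toolkit or ships an older one.
        if available is None or wanted > available:
            bundle.append(toolkit)
    return bundle
```

An FLTK program that targets DeskLin 5.0 has to carry FLTK itself, while the same program targeting 6.0 can rely on the core system.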

The end result of this process is that control of the system APIs is taken away from the user. While many Linux purists would argue against such a step, it's important to note that I am not advocating taking this step for all distributions. In workstation and server environments it can be critically important that the user maintain complete control over his system. Only Desktop-oriented distributions that are looking to target home users should break off and take these steps.

Of course, security updates will still be an issue. As a result, it makes sense for a Desktop distribution to carry an installer mechanism such as Autopackage. This installer mechanism would allow for patches to be applied to the system quickly and easily. If possible, such patches should be automated. Beyond that, the core libraries should be hidden from the user and made read-only.

Documents

Unlike system libraries, documents are something users DO want to manage, and they are always happy when given more power to do so. The best solution for users is to move their documents and only their documents into a database file system. In many other OSes this might be a problem, as there only exists one filesystem tree. In Linux we can get away with much more.

Under Linux we can create a new VFS module that handles a DBFS partition independent from the system files partition. The DBFS can then be seamlessly integrated by mounting it onto a standardized subdirectory such as /Documents. To a terminal and classic Linux programs, the DBFS Labels would look like normal directories. Files outside of the /Documents folder would be non-writable for normal users. Queries to the file system can be performed through a special /proc or /dev interface, allowing the Desktop to quickly search for the exact files the user is looking for.

This arrangement does pose a few problems, however. In a database file system, it is possible for the same file name to occur more than once, possibly even under the same label. As a result, it is very important for programs to start using the INode number for the file instead of the abstract path. Two solutions immediately come to mind for dealing with "Classical" software programs:

Allow for the final component of the path to be the INode number. This could be achieved with special file names such as "#1234" or "@1234". Such names are not normally used in an everyday Unix system, and are thus unlikely to be invoked by accident. These names can be passed to existing software programs to ensure that the correct file is selected. There is still an issue if the program displays the filename, however.
Take a page from the Windows VFAT scheme and display duplicates under the same label with special filename extensions. For example, two files named "Bill.doc" could become "Bill #1.doc" and "Bill #2.doc". The only downside is that the user may become confused by the numbering scheme, especially if he attempts to rename a file to include the special addition.
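Both options are easy to prototype. The sketch below is an assumption-laden illustration, not a description of any existing file system: `display_names` and `inode_path` are hypothetical helpers, and the "#" prefix and " #N" suffix are simply the examples used above.

```python
import os
from collections import Counter


def display_names(entries):
    """Map inode numbers to unique display names within one label.

    entries is a list of (inode, filename) pairs. Names that occur once
    are left alone; duplicates get " #N" inserted before the extension.
    """
    totals = Counter(name for _, name in entries)
    seen = Counter()
    result = {}
    # Sorting by inode keeps the numbering stable between listings.
    for inode, name in sorted(entries):
        if totals[name] == 1:
            result[inode] = name
            continue
        seen[name] += 1
        stem, ext = os.path.splitext(name)
        result[inode] = f"{stem} #{seen[name]}{ext}"
    return result


def inode_path(label_dir, inode):
    """Build the first option's path, whose last component is the inode."""
    return f"{label_dir}/#{inode}"
```

Note that the sketch does nothing about a real file that happens to be called "Bill #1.doc", which is exactly the renaming confusion described above.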

Configuration Files

The one type of file that I haven't yet addressed is program configuration files. The reason for this is that I currently have no "good" place for them to go. The Windows solution of using a central registry is certainly not a bad one (although definitely not a good implementation), but adds one more complex abstraction for the user to deal with. In addition, it is far less feasible to force Linux programs to make the switch to a registry solution than it was for Windows programs. (Central control does have its advantages.)

The best solution I can come up with at the moment is to hook into the Unix "standard" for configuration files in order to create a virtual registry in the file system. The idea is as follows:

When the DBFS detects a file or directory created with the "." prefix (a naming convention that hides a file in Unix), it immediately traces the creating program back to its on-disk image. Instead of creating the file, the DBFS makes it a binary meta-data attachment to the program image. The resulting pseudo-file is then indexed by the DBFS so that a regedit-like management program can quickly retrieve a list of all programs with such configuration files.
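As a rough model of that behaviour, here is an in-memory sketch. Everything in it is hypothetical: `ConfigRegistry` stands in for the DBFS index, and the caller passes in the program image directly, whereas a real implementation would have to discover it from the creating process.

```python
def is_config_path(path):
    """True when any component of the path starts with a dot (Unix hidden)."""
    parts = [p for p in path.split("/") if p not in ("", ".", "..")]
    return any(p.startswith(".") for p in parts)


class ConfigRegistry:
    """In-memory stand-in for the DBFS index of attached settings."""

    def __init__(self):
        self.attachments = {}  # program image -> {original path: bytes}

    def create(self, program_image, path, data=b""):
        """Return True if the file was captured as meta-data."""
        if not is_config_path(path):
            return False  # an ordinary document, stored normally
        self.attachments.setdefault(program_image, {})[path] = data
        return True

    def programs_with_settings(self):
        """What a regedit-like tool would list."""
        return sorted(self.attachments)
```

A regedit-like tool would then only need `programs_with_settings()` to enumerate every application that carries configuration data.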

The downsides to this scheme are as follows:

The application settings are permanently attached to the program. If the software application disk image is deleted, the settings will go with it. Many users would see this as an advantage, but it does depend on the program. Some larger programs are removed and later reinstalled by users due to disk space considerations. Gaming users in particular might not be happy if their Doom III saved game was lost. (Although the obvious solution to that situation is to encourage the vendor to change the saved game to be a regular document that is associated with the application.)
Inter-program communication through configuration files is made more difficult. For example, Opera would have to support the necessary DBFS APIs to directly access the meta-data under which pre-existing bookmarks are stored in Firefox. In a traditional Unix system, Opera could just open the "."-prefixed sub-directory in the user's home folder.

DBFS Structure

While the technical details of how the DBFS stores files and meta-data are for the most part irrelevant, I'm going to quickly go over the structure to eliminate questions concerning what a DBFS can and can't do.

A DBFS as envisioned in this article would forgo the traditional storage of INodes and Directories. Instead, the data about the filesystem would look more like an SQL database. An entry would exist in the database for each file on disk, with keyed linkages to meta-data. The meta-data list would be like a table with three columns: The name of the meta-data, the type of the meta-data, and a link to the value in another table. Tables would exist for String, Integer (64 bit?), and Data Block types.

The Data Block type would be a table that would hold a list of file system blocks used by the specific piece of meta-data. All files would have a piece of meta-data of this type that would identify the file contents. (Likely named something witty like "data".) Strictly speaking though, Data Block meta-data other than the file contents may be attached. For example, the application icon may be stored in a meta-data value called "icon", or a thumbnail of a photograph may be stored under "thumbnail". In fact, the potential for storing pre-calculated binary meta-data is limited only by disk space and your imagination. Even plain old text can easily be extracted from meta-data already in a file, and stored in the file system for easy access.

Each of the "tables" in the DBFS would be appropriately indexed to provide for the fastest lookup times possible. It's likely that such indexes might also be created for the contents of a file. These indexes would be what would make fast file searches/queries possible.
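Since the structure is described as looking like an SQL database, it can be mocked up with SQLite. This is a sketch only: the table and column names are my own, an in-memory database stands in for the on-disk format, and the block list is stored as plain text for simplicity. SQLite's INTEGER type happens to be a 64 bit signed value, which matches the Integer table suggested above.

```python
import sqlite3

# Hypothetical schema: one row per file, a three-column meta-data list
# (name, type, link to value), and one value table per type.
SCHEMA = """
CREATE TABLE files (id INTEGER PRIMARY KEY);
CREATE TABLE metadata (
    file_id  INTEGER NOT NULL REFERENCES files(id),
    name     TEXT NOT NULL,
    type     TEXT NOT NULL CHECK (type IN ('string', 'integer', 'block')),
    value_id INTEGER NOT NULL
);
CREATE TABLE string_values  (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE integer_values (id INTEGER PRIMARY KEY, value INTEGER);
CREATE TABLE block_values   (id INTEGER PRIMARY KEY, blocks TEXT);
CREATE INDEX metadata_by_name ON metadata (name, type);
CREATE INDEX strings_by_value ON string_values (value);
"""


def open_dbfs():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def new_file(db):
    return db.execute("INSERT INTO files DEFAULT VALUES").lastrowid


def add_string(db, file_id, name, value):
    cur = db.execute("INSERT INTO string_values (value) VALUES (?)", (value,))
    db.execute(
        "INSERT INTO metadata (file_id, name, type, value_id) "
        "VALUES (?, ?, 'string', ?)",
        (file_id, name, cur.lastrowid),
    )


def find_by_string(db, name, value):
    rows = db.execute(
        "SELECT m.file_id FROM metadata m "
        "JOIN string_values s ON s.id = m.value_id "
        "WHERE m.type = 'string' AND m.name = ? AND s.value = ? "
        "ORDER BY m.file_id",
        (name, value),
    )
    return [row[0] for row in rows]
```

The two indexes are what keep `find_by_string` from scanning every row, which is the role the indexes play in the description above.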

Tune in next week for the final installment, where I tie all of these features together into an easy to use interface!

Part 4: The Desktop Interface

Links:
LSB
Autopackage

WARNING: Comments have been temporarily disabled during maintenance. Comments will reappear soon.

Wednesday, June 15, 2005

The Linux Desktop Distribution of the Future Part 2

The best solution yet to emerge for application installation is the NeXT/OS X AppFolder concept. Put simply, the entire application is packaged into a folder with a special extension. When the file browser sees a folder with this extension, it treats it as if it were a special file instead of a directory. As a result, the user sees a single icon for an application, and is free to move or copy this icon to wherever he chooses - even a remote computer. Installation is as simple as extracting the folder and moving it to your favorite location, un-installation is as simple as deleting the folder, and tons of meta-info (such as associations) can automatically be pulled from the directory package.

These features make AppFolders an excellent concept for all desktop systems, Linux included. Unfortunately, there are some issues that make it less than useful in existing Linux distros:

1. Copying one directory over another is currently treated at the individual file level instead of the directory level. This means that if you replace an App with a newer version, the resulting files would be a combination of the original and new files instead of a clean copy of the new directory structure.

2. Linux already has a standardized directory structure (i.e. $prefix/bin, $prefix/lib, $prefix/man, $prefix/share, etc.) that would be difficult to change for a large number of programs.

3. Linux lacks a good method for notifying programs of changes to files and directories.

The Solution

The proposed solution is this. Programs should be compiled normally, but with the "--prefix" setting set to a new folder. For example, /usr/src/build/$APPNAME would be an excellent choice. A standard build will usually produce the /bin subdirectory at a minimum, and optionally /lib, /man, and /share. From this folder, it is possible to execute the program by modifying the PATH and LD_LIBRARY_PATH. If we add a standardized shell script executable (say /execute), we have a method for executing something inside of a given directory.

The job of the execute script would be to investigate its environment, then set up any environment variables that would be needed by the program. For example, most programs have their own variables for identifying where the information stored in /share is kept on the file system. Once the execute script has configured the environment, it will then launch the primary executable. Note that this method quite handily allows for non-native applications such as WINE and Java apps to use the same system.
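The article proposes a shell script for this job; the same logic is sketched here in Python purely for illustration. The function names and the APP_SHARE_DIR variable are assumptions, since each real program uses its own variable for locating its /share data.

```python
import os
import posixpath


def build_environment(app_root, base_env):
    """Return a copy of base_env with the bundle's bin and lib searched first."""
    env = dict(base_env)
    bin_dir = posixpath.join(app_root, "bin")
    lib_dir = posixpath.join(app_root, "lib")
    # Prepend the bundle's directories, skipping empty entries.
    env["PATH"] = ":".join(p for p in (bin_dir, env.get("PATH", "")) if p)
    env["LD_LIBRARY_PATH"] = ":".join(
        p for p in (lib_dir, env.get("LD_LIBRARY_PATH", "")) if p
    )
    # Hypothetical variable: real programs each use their own name here.
    env["APP_SHARE_DIR"] = posixpath.join(app_root, "share")
    return env


def execute(app_root, program, args=()):
    """Replace the current process with the bundle's primary executable."""
    binary = posixpath.join(app_root, "bin", program)
    os.execve(binary, [binary, *args], build_environment(app_root, os.environ))
```

Keeping the environment set-up separate from the launch makes it easy to check what a bundle would run with before anything is actually executed.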

We now have a highly portable method for executing applications.

Single File vs. Directory

Of course, we have not yet finished solving all of the issues listed above. While point 2 has been effectively addressed, points one and three still loom large. Surprisingly, point one can be quite easily addressed if we ditch the use of folders and move directly to using a single file instead.

If you're scratching your head at that last statement, don't worry. I haven't lost my mind yet. There is a method under which we can have a single file for an application without sacrificing the AppFolder concept we just discussed. Consider for a moment if we created a disk image of the application folder. We'd now have a single, portable file that could be moved to any Linux system. The target system can mount the image, and run the application by calling the /execute script.

The execute script becomes even more important for cross-platform binaries. The script can attempt to detect the architecture and execute the proper binary. The advantage of this method over using ELF support is that cross-platform compilers are not necessary. The binaries can be compiled on each system, then combined into a single AppFolder/Disk image.
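Architecture detection is a small step. The sketch below assumes a hypothetical layout of bin/<machine>/<program> inside the bundle and a made-up alias table; neither comes from an existing AppFolder format.

```python
import platform
import posixpath

# Assumption: all 32 bit x86 variants share one build directory.
ALIASES = {"i486": "i386", "i586": "i386", "i686": "i386"}


def binary_for(app_root, program, machine=None):
    """Return the path of the build matching the CPU architecture.

    machine defaults to what the running system reports, which is the
    same string that `uname -m` prints on Linux.
    """
    machine = machine or platform.machine()
    machine = ALIASES.get(machine, machine)
    return posixpath.join(app_root, "bin", machine, program)
```

Each platform's build is simply dropped into its own directory, so no cross compiler is involved at any point.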

Things aren't all roses, though. If we're going to move to an AppFolder design, then we need to find a method for loading information about the file. For example, what icon should be shown for the disk image? The obvious solution is to detect when the file first appears on the system. Whether it be through the graphical desktop or through a more advanced apparatus such as dnotify (Linux directory notification support), the file should be detected, mounted, the meta-information read, then unmounted. More in the next article about where that meta-information might be stored.

Additional Features

Assuming that a routine is used to detect the type of disk image prior to mounting, a large number of file system features can be supported. CramFS could be used for a read-only compressed disk image, while ext2fs could be used for a writable image (perfect for time limited software). It's even conceivable that an image format could be developed with support for growing the image/file system, password protecting an encrypted disk, and other features only available in a format such as this. Even the existing abilities to freeze the software (no user damage!) and carry information around with the application are very powerful features not seen on other systems.
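Such a detection routine could look at the magic numbers that the two file systems write into their images. The sketch below assumes little-endian images and only checks the two formats mentioned above; the function name is hypothetical.

```python
import struct

CRAMFS_MAGIC = 0x28CD3D45      # first four bytes of a cramfs image
EXT2_MAGIC = 0xEF53            # s_magic field of the ext2 superblock
EXT2_MAGIC_OFFSET = 1024 + 56  # superblock starts at 1024; s_magic is 56 bytes in


def detect_image_type(header):
    """Guess the file system of a disk image from its first bytes."""
    if len(header) >= 4:
        (magic,) = struct.unpack_from("<I", header, 0)
        if magic == CRAMFS_MAGIC:
            return "cramfs"
    if len(header) >= EXT2_MAGIC_OFFSET + 2:
        (magic,) = struct.unpack_from("<H", header, EXT2_MAGIC_OFFSET)
        if magic == EXT2_MAGIC:
            return "ext2"
    return None
```

Reading the first 2 KB of the file is enough for both checks. Two caveats: ext3 uses the same magic number as ext2, and a cramfs image built on a big-endian machine would store its magic in the opposite byte order, so a real routine would need more care than this.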

Existing Work

Unsurprisingly, some work has already been done in this area. Rox Filer has incorporated AppFolders for some time now, though the application selection is still quite small. Klik is a bit newer and has introduced CMG files - an application packed inside a CramFS filesystem. Last but not least, GoboLinux has taken on the challenge of building a stable Linux system that separates all binaries into per-program folders. Hopefully this article will help bring some visibility to these projects, as well as provide useful ideas for their future direction.

Next up, Part 3: File Management

Links:

Rox Filer
Klik
CramFS
GoboLinux


The Linux Desktop Distribution of the Future Part 1

Installing Applications is complicated
Directory structures can be confusing to navigate
Interface is confusing and inconsistent
Steep learning curve required to understand system functions

The first point is a complex issue, but mostly stems from the Linux use of package managers. Package management is one of those concepts that seems great at the outset, but fails in practice. The issue is that each package has a complex chain of dependencies unique to itself. In order to be certain that a package is compatible with all installations, all combinations of installed packages must be tested! As it is unlikely that anyone would go through so much trouble, the incompatibilities between packages accumulate, and before long the packaging system is rejecting new installs. And that's assuming that a graphical installer exists!
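To put a rough number on "all combinations": if each of n optional packages can be either installed or absent, there are up to 2^n possible install states. This is an upper bound under a simplifying assumption, since dependencies rule some combinations out and multiple versions of a package add more.

```python
def install_states(optional_packages):
    """Upper bound on distinct installed/not-installed combinations."""
    return 2 ** optional_packages
```

Ten optional packages already allow 1,024 states, and thirty allow more than a billion, which is why nobody tests them all.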

If a graphical installer does not exist, then life becomes even more difficult for the end user. Instead of launching a GUI and selecting the applications he wants, the user must open a terminal and begin typing cryptic commands for which he has no training.

Many proponents of packaging systems downplay these issues by stating that packaging errors don't exist on system XYZ (despite proof to the contrary), and that if the user is running Linux he should be "smart enough" to know how to use the command line. Such statements are just silly. Users want the computer to make their lives easier. Any barrier thrown in their way will only drive them to a different platform. Unfortunately, package managers still drive most Linux desktop distributions.

The second point is caused by the spread-out arrangement of Linux system files. This arrangement is intended to ease the multi-user aspect of Unix systems. Unfortunately, it greatly complicates the user view of the system. Existing users are accustomed to having a root folder with a couple of system folders to worry about. For new users, even one system folder can be a massive issue. (I'm sure we've all heard about or seen the guy who deleted his Windows folder and then expected everything to work properly.) Expecting these users to understand the cryptically named "usr", "bin", and "etc" folders is probably a bit much. (OS X Finder actually hides them.)

Points three and four are caused by some rather interesting decisions in the OSS community, constant arguments about interface design, and beta quality software being bundled in distro releases. (Red Hat is particularly guilty of the latter.) For example, there's a single, non-hidden folder on Windows where Start Menu shortcuts are stored. This folder can be accessed by right clicking on the Start Menu and hitting "Show Folder". While the Windows design is a confused interface, at least it's consistent. In GNOME and KDE, the menu folder has moved several times, switched back and forth between a GUI Tree, File Explorer, and Drag/Drop-on-the-menu interface for modifications, and they had a split personality on whether or not user-specific menu items could be created. Most of these issues have been smoothed out to some degree, but not without leaving users utterly confused.

With these issues in mind, let's put some thought into how we might fix them.

Part 2: Applications

Links:

Making desktop Linux software installation easy
The Year of the Linux Desktop: 2003
2004: The Year of the Linux Desktop?
2004 Won't Be the Year of the Linux Desktop
2005 Will Be the Year of the Linux Desktop
Desktop Linux News, Articles, and Forums

Thursday, June 09, 2005

Explanation of Database File Systems

Category: Technology Explained

One of the hottest topics on the market today is Database File Systems. Between Gnome Storage, WinFS, and now Apple Spotlight, it seems like everyone is making a big deal out of these features. But what is a database file system and why is it so important? I'll attempt to answer those questions and provide a good explanation of how a modern Database File System can be structured.

Defining the Problem

In every day life we tend to keep track of things by association. For example, I know that the bill I received from the credit card company is both a bill and is from my CC company. So what if I wanted to find that bill again, but didn't remember what I did with it? Well, I'd probably go and check in my bills first, then perhaps in a folder where I kept old paperwork, then on my desk, then perhaps somewhere that I might have left it by accident. Note the associations here:

All Bills
Archived Paperwork
Recent Desktop Documents
Search Common Areas

These associations are a form of "meta-data" or "data about data". Usually we don't consider this information anywhere near as important as the data itself, but without it we couldn't even find the data!

Now let's assume for a moment that our mythical credit card bill was delivered to us electronically. If we further assume that we kept it as a file on disk, then the question that comes to bear is: Where do we put it so that it's easy to find?

Options include:

In a "Bills" folder. It's probably a good idea to keep all the bills together. That way we know where to find one that we might need.
In an "Archived Documents" folder. The bill is already taken care of and paid. Why not just archive it and keep it out of the way?
On the Desktop. After all, it's an important file, right? So this important file should go somewhere noticeable.

As you may have noticed, this is very similar to the issues faced in real life. The difference is that a computer should be able to do much more to help organize the information than just placing it somewhere you hope is obvious.

To a certain degree a computer can help. If I lose track of what I did with the bill, then I can attempt to search for it. The computer will then trace through every file in the system attempting to find what I lost. The problem with such a search, however, is that the search can only work as well as the name I've given the file. If I got lazy and called the file "CC Bill", then it won't be returned in a search for "Credit Card".

Meta-Data

What if we could improve upon the situation presented above? For example, what if we could place the bill in the Bills folder, Archived Documents folder, and the Desktop simultaneously? How about if we could search inside the document for the information it contains? What if the computer could intelligently score files it finds in a search and sort them based on the files it thinks are the best match? With the meta-data from a database file system, all of this becomes possible.

Generally speaking, there are three types of meta-data that a system might support:

Inherent meta-data
User applied meta-data
Organizational meta-data

Inherent meta-data is data that can be derived from the very existence of the file. At the simplest level, this includes things like the date it was created and the size of the data. More complex meta-data schemes may actually dive into the binary stream of the file data and extract additional meta information assigned to the file. This could be as simple as the text of a word processing document or as complex as the name of a movie or the artist of a song. It should go without saying that the better the system is at understanding files, the more useful information it can extract from a file.

User applied meta-data is data that the computer user explicitly adds to a file. For example, I might type a note in a meta-data field stating that I need to remember that this bill is due by the 28th of February instead of the usual 30th of the month. This information can often be very useful, but it can be difficult to convince users to take the time to apply it.

Organizational meta-data is something of a cross between the two previous types of meta-data. It's the type of data that a user might add to a file for the purposes of better organization, and thus becomes an inherent attribute that can be queried on. For example, let's say that I wanted to categorize bills under a meta-data tag called "Bills". And let's say that I then had a tag for each credit card company so that I could easily find correspondence with them. Now I have added meta-data to the file that tells the system that the document is a Bill from Credit Card Company A!
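The difference between a name search and a meta-data search can be shown with a toy model. The record layout and function names below are assumptions made for the example; the point is only that the three kinds of meta-data give a search far more to match against than the file name alone.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    name: str
    inherent: dict = field(default_factory=dict)  # derived by the system
    notes: list = field(default_factory=list)     # typed in by the user
    tags: set = field(default_factory=set)        # organizational


def search_names(files, query):
    """The traditional search: only the file name is consulted."""
    q = query.lower()
    return [f.name for f in files if q in f.name.lower()]


def search_metadata(files, query):
    """Match the query against the name, inherent values, notes, and tags."""
    q = query.lower()
    hits = []
    for f in files:
        haystack = [f.name, *map(str, f.inherent.values()), *f.notes, *f.tags]
        if any(q in text.lower() for text in haystack):
            hits.append(f.name)
    return hits
```

With a file named "CC Bill" that is tagged "Bills" and "Credit Card Company A", a name search for "Credit Card" finds nothing, while the meta-data search returns the bill.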

Out with the Old, In with the New

The key to a database file system is that the meta-data attached to files provides us with a better method for searching for things. The theory is that if searches get good enough, the traditional methods of accessing files can go away altogether. Instead of browsing through folders, the user can just type "Bills" and get a list of all the bills stored on the system. Or the user can narrow it down and ask for "Bills from Credit Card Company A" and receive a more specific list of results. The key is that the search will always come back with what the user needs, but in a much quicker fashion than if the user had attempted to manually find the files.

So does that mean that Folders will go away altogether? The answer is both yes and no. Yes, traditional folders won't be as useful. But at the same time, it is occasionally nice to be able to browse the information contained in your system. The replacement solution is twofold. The first part of the solution is to allow the user to save search queries as a pseudo-folder. This provides a user with easy and automatic organization of his files. For example, he could create a saved query that searches for all movies. Then whenever he wishes to know which movies exist on his system, he can just open the pseudo-folder and see the results of the search!
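A saved query is little more than a stored search that is re-run whenever the pseudo-folder is opened. Here is a minimal sketch, with a hypothetical `SavedQuery` class and a plain dictionary standing in for the file system:

```python
class SavedQuery:
    """A pseudo-folder whose contents are recomputed each time it is opened."""

    def __init__(self, title, predicate):
        self.title = title
        self.predicate = predicate  # function taking a file's meta-data dict

    def open(self, files):
        """files maps a file name to its meta-data dictionary."""
        return sorted(name for name, meta in files.items() if self.predicate(meta))
```

Because nothing is cached, a movie added to the system later shows up the next time the "Movies" pseudo-folder is opened, with no filing required from the user.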

The only issue with using saved queries is that without regular folders your files will be lost until you can craft a query to find them. As a result, another solution is necessary.

The second part is a concept known as Labels. Labels are a type of meta-data that falls under the Organizational category. The idea is that you can create as many Labels in your system as you'd like, then apply them to individual files. As files are tagged with Labels, they automatically appear in pseudo-folders that display the name of the Label. Files lacking a label will appear in an area that displays unlinked files. This list not only ensures that the user never loses his files, but also encourages him to properly Label them. And since a user can apply and remove any number of Labels at will, there's much less of a need to ensure organizational correctness up front.

The concept can even be extended into common metaphors in use today. What if the Desktop stopped being a folder for files, and was just a standard system label? Files could easily be moved to the desktop and removed just by applying and removing the label! Files could be trashed with nothing more than a Label called "Trash". Hundreds of uses could spring up to ease the user interface just because a better method of organization exists! That is why database file systems are considered the future of computer file systems.
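Labels, the unlinked-files area, and the Desktop and Trash ideas all fit in one small model. `LabelStore` is a hypothetical name and the storage is a plain dictionary; a real DBFS would keep this in its indexed meta-data tables.

```python
class LabelStore:
    """Toy model of Labels as organizational meta-data."""

    def __init__(self):
        self.labels = {}  # file name -> set of labels

    def add_file(self, name):
        self.labels.setdefault(name, set())

    def apply(self, name, label):
        self.labels.setdefault(name, set()).add(label)

    def remove(self, name, label):
        self.labels.get(name, set()).discard(label)

    def folder(self, label):
        """The pseudo-folder for one Label."""
        return sorted(n for n, ls in self.labels.items() if label in ls)

    def unlabeled(self):
        """The area that shows files with no Label at all."""
        return sorted(n for n, ls in self.labels.items() if not ls)
```

Moving a file to the desktop is `apply(name, "Desktop")`, and trashing it is `apply(name, "Trash")`. The file itself never moves, and it can sit under "Bills" and "Desktop" at the same time.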

Links:

Wikipedia: File Systems
Practical File System Design with the Be File System (1999)
Spotlight Technology Brief
DBFS for KDE
Microsoft WinFS

Questions? Comments? Use the "comments" feature on this blog to leave feedback. The more people who I know are listening, the more of these articles I will do. So sound off and let your voice be heard!

Thursday, June 02, 2005

Have you been waiting for Star Wreck?

Category: Entertainment

After 5 years(!) of development, the legacy of Star Wreck 6 is finally coming to a head. No, they have not given up on the project. Something far, far better has happened. According to the front page, Star Wreck 6 will hit DVD distribution on August 20, 2005! So get your twinkler beams ready folks, because we're in for a Star Trek parody of far greater proportions than any Star Trek movie ever released. Move over Kirk, you're in Captain Pirk's seat!

For those of you unfamiliar with Star Wreck, here's a quick history: Back in 1992 a young Finnish boy by the name of Samuli Torssonen decided to do a video of a starship battle. Utilizing a copy of Deluxe Paint for frame by frame animation, he managed to put together a four minute clip of various Star Trek ships doing battle. Just to have fun, he also added a comedic aspect to the film. Boy, if this young fellow had any idea what he was getting into...

The success of the first video spurred young Samuli, and by 1994 he'd managed to pull off a new Star Wreck video. But this time Mr. Torssonen did the ships in 3D, and used cel animations for the characters. The result was one of the funniest Star Trek parodies ever to hit video!

The laughs and running time only increased in Star Wrecks III and IV, jumping all the way to 31 and 47 minutes, respectively. Star Wreck was becoming a feature film, and Samuli only wanted to push it farther. Thus Star Wreck V: Lost Contact made a huge leap and became the first live action Star Wreck episode. Only one problem....

Our heroes were trapped in the past....

That was 8 years ago. Plans for a sequel had been immediately put into action, but the project was progressing slowly. The decision was made to make the next episode into a feature film, but the team was unsure if they could pull it off. As a result, Samuli created a secret project known as Star Wreck 4 1/2. This project tested new camera techniques and 3D rendering packages to prove that they could make a professional quality movie. The result was short and not as funny as the previous installments, but was an amazing sight to behold. The quality of the 3D was as good as (if not better than!) the quality of The Next Generation TV show! Not to mention the compositing done for the bridge scenes! For the first time, the live action bridge was truly open, and appeared to be much more than just an image pasted behind Captain James B. Pirk.

The triumph of the 4 1/2 project was both a blessing and a curse for the sixth Star Wreck installment. Most of the film that was shot to date was scrapped and reshot. New actors and actresses were brought on board, the music and 3D were redone, and the greatest comedy adventure of all time was on its way to becoming a reality.

So load the light balls, and get ready for an action packed adventure! Coming August 20, 2005 to a DVD player near you is:

Star Wreck 6: In the Pirkinning

Fukkoooooooovvvvvvv!!!

Links:

Star Wreck 6
Movie Trailer
Previous Star Wreck Movies
Star Wreck: Asskicker (A spin off)
