Today I’ll be talking about G-ICING, an Installable File System for Windows that interfaces with a Grid backend. As this photo shows, G-ICING is meant to be a layer on top of the Grid that makes the Grid tastier ;-)
For example, biomedical researchers want to integrate patient history, demographics, and trial records to better study diseases and treatments.
The explosion of information systems has led to vast amounts of data stored in widely varying formats and locations, and subject to numerous access control and privacy policies. Numerous users, projects, and organizations spanning research, industry, and government have recognized the enormous potential of integrating this data. However, the complexity of integrating data from various sources presents a large obstacle to this goal. Data exists across organizational boundaries and in varying formats. Finding and naming these resources presents another problem. In essence, the energy barrier is too high.
Grids have had little uptake. They tend to be hard to use, so users face a steep learning curve. When we talk about difficulty, we also tend to talk about transparency to the user and to the application. Grids tend to have neither … users must learn something new, and applications either have to be modified and recompiled or written specifically for the Grid. Most Grids have one security model. Organizations must switch completely to that model or they cannot securely use the Grid. Grids don’t play well with others. Ian Foster pushed for standards, and many people believe standards are essential to Grids, but there are various such “standards” in play and no single set that all should follow. THE MAJORITY OF USERS WHO COULD BENEFIT FROM GRID COMPUTING HAVE NEVER USED OR EVEN HEARD OF IT.
A solution that uses Grid computing should meet these four criteria. First, it should be simple and familiar … it seems that most Grids have ignored this or treated it as a second-level goal. Existing solutions are either user-transparent or application-transparent, but not both. Some are neither.
Talk briefly about each point. The numbers differ depending on your source, but the majority of people still use Windows. At the same time, however, there have been few solutions for Windows that attempt to meet the four criteria. It is clear that if we want more uptake of the Grid, and want to solve this problem for a lot of people, we really need a solution on Windows.
G-ICING is a filesystem for Windows that maps a Grid platform into the Windows file namespace. You can access G-ICING as you would any network drive … e.g. you could mount it to G: and access the Grid namespace from there.
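To make the namespace mapping concrete, here is a minimal sketch of how a path on the mounted drive might translate to a path in the Grid namespace. The function name `windows_to_grid_path`, the default drive `G:`, and the `grid_root` parameter are all assumptions for illustration; the real G-ICING mapping happens inside the filesystem driver and may differ.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_grid_path(win_path: str, drive: str = "G:", grid_root: str = "/") -> str:
    """Translate a path on the mounted drive to a (hypothetical) Grid namespace path."""
    path = PureWindowsPath(win_path)
    # Only paths on the mounted drive belong to the Grid namespace.
    if path.drive.upper() != drive.upper():
        raise ValueError(f"{win_path!r} is not on the mounted drive {drive}")
    # parts[0] is the drive anchor (e.g. 'G:\\'); the rest are path components.
    components = path.parts[1:]
    return str(PurePosixPath(grid_root, *components))


print(windows_to_grid_path(r"G:\projects\trial\data.csv"))
```

From the user's point of view nothing changes: they open `G:\projects\trial\data.csv` in any Windows application, and the translation stays hidden.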
Here is the overall design of G-ICING. You can read this diagram from left to right. G-ICING is divided into three main components: the Kernel Management Service, the User Forwarding Service, and the Grid Interface Service. In the following slides I will discuss each of these modules in turn, and how G-ICING communicates with the Grid backend. For clarity, I will highlight each component in yellow before I talk about it. I’m going to try to gloss over some of these details, so please feel free to stop me for any clarifications you may want.
What are these I/O requests, and where do they come from?
This was by far the most difficult aspect of G-ICING to design, and it is the main reason why most people haven’t built filesystems for Windows. A particularly difficult issue with writing a kernel driver was communicating with user-mode processes. Why do I need to do this? Because the Grid uses XML-based standards, and you are very limited in the libraries you can use in the kernel. Therefore, in terms of development time, it’s much easier to build this part in user mode.
Inverted Call Model. It takes advantage of the I/O mechanisms in WinNT. A user-level program makes a special I/O request (“Hello, I’m waiting for an operation”) and the kernel-mode driver stores it. The kernel then forwards actual I/O requests to user mode by responding to that stored request with the forwarded I/O call. One problem is the limited buffer size between processes.
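The inverted call model described above can be sketched as a small user-space simulation. This is not driver code: threads and queues stand in for the kernel driver and the user-mode service, and every name here (`SimulatedDriver`, `WaitRequest`, `park_wait`, `forward_io`, `user_service`) is hypothetical. The buffer-size limitation is not modelled.

```python
import queue
import threading
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class WaitRequest:
    """Stands in for the special 'I'm waiting for an operation' I/O request."""
    done: threading.Event = field(default_factory=threading.Event)
    forwarded: Optional[dict] = None  # filled in when the request is completed


class SimulatedDriver:
    """Stands in for the kernel-mode driver (an assumption, not real driver code)."""

    def __init__(self) -> None:
        self._parked: "queue.Queue[WaitRequest]" = queue.Queue()

    def park_wait(self) -> WaitRequest:
        # User mode posts a wait request; the driver stores it for later.
        wait = WaitRequest()
        self._parked.put(wait)
        return wait

    def forward_io(self, operation: str, path: str) -> Any:
        # A real I/O request arrives: complete a stored wait request with it.
        reply: "queue.Queue[Any]" = queue.Queue(maxsize=1)
        wait = self._parked.get()  # blocks until user mode has parked a wait
        wait.forwarded = {"op": operation, "path": path, "reply": reply}
        wait.done.set()
        return reply.get()  # block until user mode sends back the result


def user_service(driver: SimulatedDriver, handled: list) -> None:
    """Stands in for the user-mode service that does the actual work."""
    while True:
        wait = driver.park_wait()
        wait.done.wait()  # sleeps until the driver completes the wait request
        request = wait.forwarded
        if request["op"] == "stop":
            request["reply"].put(None)
            break
        handled.append((request["op"], request["path"]))
        request["reply"].put(f"{request['op']} ok: {request['path']}")


driver = SimulatedDriver()
handled: list = []
worker = threading.Thread(target=user_service, args=(driver, handled))
worker.start()
result = driver.forward_io("read", "/projects/data.csv")
driver.forward_io("stop", "")
worker.join()
print(result)
```

The point of the inversion is that the kernel side never calls into user mode directly. It only completes a request that user mode issued earlier, which is the normal direction for Windows I/O.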
Security: within the host, G-ICING relies upon Windows security to protect information between users of the same machine. Host-to-Grid security is more complex. G-ICING acts as a proxy for users. The GIS prompts the user for credentials, and the user is given a choice of which of their certificates to use. The GIS then gets a signed delegated credential. It uses the WS-Security* family of specifications and profiles.
People are familiar with Windows, because the filesystem paradigm is a familiar one (this proves our point).