Tuesday, April 25, 2006

A step closer to the fully dynamic DQP

The guys from EPCC mentioned that they are trying to create a lighter version of OGSA-DAI which would be readily deployable within a container as a WAR file. This is good news for me. So far, deploying OGSA-DAI has been a rather complicated process: first you had to deploy OGSA-DAI, then add a data service, then create a data service resource, and finally add the resource to the service. As a result, I wasn't able to deploy OGSA-DAI on the fly on a node where the data resides. Once this version of OGSA-DAI is available (soon), I will move one step closer to the concept of "moving code to the data", because I already have a version of DQP which can deploy my evaluator services on the fly, and I am now adding the code to deploy the analysis services on the fly as well.

The code is mostly ready, but I can't test it because, for some reason, some machines in the giga-cluster have been down or unreachable since yesterday :-(

Old, but still gold

It's about ten years old, but still worth reading: an essay by Neal Stephenson (author of Snow Crash), "In the Beginning... Was the Command Line".

PS: Whatever Neal writes in the essay, and however well he writes it, I still love my PowerBook G4. Apple has come a long way since the days when improper memory handling caused its machines to crash... Looks-wise, PowerBooks, iBooks and iMacs are sleek, actually resembling luxury cars; performance-wise, they seem to be superior to the Windows machines I use. But yes, you need to know how to drive before you start driving a Jaguar, don't you?

Wednesday, April 12, 2006

DynaSOAr 2.0 is ready

Finally, I have been able to finish off the DynaSOAr 2.0 prototype.

This new prototype extends the previous one with the concept of a broker, and I have also added the DynaSOAr registries (using GRIMOIRES). In the new architecture, DynaSOAr maintains an internal registry (for its own use) which is updated every time a new service is deployed or added to the repository. For example, when a service is added to the repository for the first time, an entry is made in the registry, with the endpoint of the repository as a CategoryBag for the service. Initially, there are no AccessPoints (service endpoints) because the service is not yet deployed anywhere. When the service is deployed on some node, the registry is updated and a new AccessPoint is added to the service entry.
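
To make the registry lifecycle a bit more concrete, here is a minimal in-memory sketch in Java. It does not use the GRIMOIRES API at all; the class names (`ServiceEntry`, `InternalRegistrySketch`), the method names and the example URLs are all made up for illustration, and the repository endpoint is simply stored as a string where the real entry would carry it in a CategoryBag.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** One registry entry: where the service code lives and where it is deployed. */
final class ServiceEntry {
    final String serviceName;
    // Hypothetical stand-in for the CategoryBag value holding the repository endpoint.
    final String repositoryEndpoint;
    // AccessPoints (service endpoints); empty until the service is first deployed.
    final List<String> accessPoints = new ArrayList<>();

    ServiceEntry(String serviceName, String repositoryEndpoint) {
        this.serviceName = serviceName;
        this.repositoryEndpoint = repositoryEndpoint;
    }
}

/** In-memory illustration only; not the GRIMOIRES-backed registry itself. */
final class InternalRegistrySketch {
    private final Map<String, ServiceEntry> entries = new LinkedHashMap<>();

    /** Called when a service is added to the repository for the first time. */
    void serviceAddedToRepository(String serviceName, String repositoryEndpoint) {
        entries.putIfAbsent(serviceName, new ServiceEntry(serviceName, repositoryEndpoint));
    }

    /** Called when the service has been deployed on some node. */
    void serviceDeployed(String serviceName, String accessPoint) {
        ServiceEntry entry = entries.get(serviceName);
        if (entry == null) {
            throw new IllegalStateException("unknown service: " + serviceName);
        }
        if (!entry.accessPoints.contains(accessPoint)) {
            entry.accessPoints.add(accessPoint);
        }
    }

    ServiceEntry lookup(String serviceName) {
        return entries.get(serviceName);
    }

    public static void main(String[] args) {
        InternalRegistrySketch registry = new InternalRegistrySketch();
        registry.serviceAddedToRepository("ServiceA", "http://repo.example.org/repository");
        System.out.println("before deployment: " + registry.lookup("ServiceA").accessPoints);
        registry.serviceDeployed("ServiceA", "http://node1.example.org/services/ServiceA");
        System.out.println("after deployment:  " + registry.lookup("ServiceA").accessPoints);
    }
}
```

The point of the sketch is just the two-step lifecycle: an entry exists as soon as the code is in the repository, and AccessPoints only appear once a deployment has actually happened.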

In the new architecture, the DynaSOAr Service Provider receives a service request (say, meant for Service A) from a consumer. It simply adds the abstract name of the service and the endpoint of the internal registry to the message header and passes the request on to the bound entity, which can be a Broker or a HostProvider.

The Broker and the HostProvider have similar interfaces, but they function differently. A HostProvider either manages a cluster of nodes, in which case it is a ROOT, or it is one of the nodes in the cluster, in which case it is a LEAF. So, the Service Provider will always have an entity locally bound to it. This entity can be a Broker, connecting to other Brokers or HostProviders, or it can be a ROOT or a LEAF HostProvider. Services are always deployed on a LEAF HostProvider.

A Broker decides where to forward the request if it is connected to other Brokers or HostProviders. Right now, I have a very simple random scheduler, but there is a group at Newcastle who are looking into proper scheduling algorithms for this. The HostProvider, once it receives the request, checks whether it has the service already deployed within its domain, in which case the request is passed on to that node. If the service hasn't yet been deployed, the service code (currently only WAR files) is fetched from the repository and deployed on a target node...

(Right now, I am not considering hybrid nodes, where a node may be responsible for managing other nodes and also deploy services on itself.)
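
The request flow described above can be sketched roughly as follows. This is only an illustration under my own assumptions: the interface and class names (`RequestTarget`, `ServiceRequest` and so on) are hypothetical, deployment is reduced to adding a name to a set instead of fetching a WAR file from the repository, and the choice of target node for a fresh deployment is random here simply because the text above does not pin it down.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** What the Service Provider forwards: the abstract service name plus the registry endpoint. */
final class ServiceRequest {
    final String serviceName;
    final String registryEndpoint;

    ServiceRequest(String serviceName, String registryEndpoint) {
        this.serviceName = serviceName;
        this.registryEndpoint = registryEndpoint;
    }
}

/** Hypothetical common interface shared by Brokers and HostProviders. */
interface RequestTarget {
    String handle(ServiceRequest request);
}

/** LEAF: a single node on which services are actually deployed. */
final class LeafHostProvider implements RequestTarget {
    private final String nodeName;
    private final Set<String> deployed = new HashSet<>();

    LeafHostProvider(String nodeName) {
        this.nodeName = nodeName;
    }

    boolean hasService(String serviceName) {
        return deployed.contains(serviceName);
    }

    public String handle(ServiceRequest request) {
        if (!hasService(request.serviceName)) {
            // Stand-in for fetching the WAR file from the repository and deploying it.
            deployed.add(request.serviceName);
        }
        return request.serviceName + " handled on " + nodeName;
    }
}

/** ROOT: manages a cluster of LEAF nodes. */
final class RootHostProvider implements RequestTarget {
    private final List<LeafHostProvider> leaves;
    private final Random random;

    RootHostProvider(List<LeafHostProvider> leaves, Random random) {
        this.leaves = new ArrayList<>(leaves);
        this.random = random;
    }

    public String handle(ServiceRequest request) {
        // Reuse an existing deployment within this domain if there is one.
        for (LeafHostProvider leaf : leaves) {
            if (leaf.hasService(request.serviceName)) {
                return leaf.handle(request);
            }
        }
        // Otherwise pick a target node (randomly, as an assumption) and let it deploy.
        return leaves.get(random.nextInt(leaves.size())).handle(request);
    }
}

/** Broker with the very simple random scheduler. */
final class Broker implements RequestTarget {
    private final List<RequestTarget> downstream;
    private final Random random;

    Broker(List<? extends RequestTarget> downstream, Random random) {
        this.downstream = new ArrayList<>(downstream);
        this.random = random;
    }

    public String handle(ServiceRequest request) {
        return downstream.get(random.nextInt(downstream.size())).handle(request);
    }

    public static void main(String[] args) {
        List<LeafHostProvider> nodes = new ArrayList<>();
        nodes.add(new LeafHostProvider("node-a"));
        nodes.add(new LeafHostProvider("node-b"));
        List<RequestTarget> roots = new ArrayList<>();
        roots.add(new RootHostProvider(nodes, new Random()));
        Broker broker = new Broker(roots, new Random());
        ServiceRequest request = new ServiceRequest("ServiceA", "http://registry.example.org/internal");
        System.out.println(broker.handle(request)); // first call triggers a deployment
        System.out.println(broker.handle(request)); // second call reuses the same node
    }
}
```

Because Broker and HostProvider share one interface in this sketch, the Service Provider does not need to know which kind of entity it is bound to, which matches the idea that the two have similar interfaces but behave differently.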

I have tried some new things in this version, for example using Castor instead of Axis-generated stubs. I am using the recommended signature for message-oriented services (because these services need to deal with message headers). I created a schema that defines all the messages and used Castor to generate the Java bindings. Within the services, outgoing messages are created using these Java bindings and then marshalled into an XML document by Castor; conversely, an incoming XML message is unmarshalled into Java objects by Castor. I am not totally sure whether someone has tried this approach before, or whether it is the proper approach for message orientation. But it seems to fit the concept of message-oriented services, where you first define your schema, and there is only one method in the service interface, which decides what to do based on the message received...
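
As a rough illustration of the "one method, dispatch on the message" idea, here is a small Java sketch. To keep it runnable without extra libraries it uses the JDK's DOM parser in place of Castor-generated bindings, and it is not an Axis service; the message names (`DeployRequest`, `InvokeRequest`) and the `service` attribute are invented for the example and are not the actual DynaSOAr schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class MessageServiceSketch {

    /** Turns an XML string into a DOM document (the role unmarshalling plays with Castor). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed message", e);
        }
    }

    /** The single service method: it decides what to do from the message it receives. */
    static Document process(Document request) {
        Document response;
        try {
            response = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = request.getDocumentElement();
        String messageType = root.getTagName();
        Element reply;
        switch (messageType) {
            case "DeployRequest": // hypothetical message type
                reply = response.createElement("DeployResponse");
                break;
            case "InvokeRequest": // hypothetical message type
                reply = response.createElement("InvokeResponse");
                break;
            default:
                reply = response.createElement("Fault");
                reply.setAttribute("reason", "unknown message type: " + messageType);
        }
        reply.setAttribute("service", root.getAttribute("service"));
        response.appendChild(reply);
        return response;
    }

    public static void main(String[] args) {
        Document deploy = process(parse("<DeployRequest service=\"ServiceA\"/>"));
        Document other = process(parse("<Ping service=\"ServiceA\"/>"));
        System.out.println(deploy.getDocumentElement().getTagName());
        System.out.println(other.getDocumentElement().getTagName());
    }
}
```

With Castor, the `switch` on the element name would instead be a dispatch on the class of the unmarshalled object, but the shape of the service interface stays the same: one entry point, with the schema deciding what messages exist.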

Further, I have been able to use this framework with OGSA-DQP, where the evaluator services can now be dynamically deployed on the nodes which contain the data resources. So, this is one step towards achieving what we call "moving the code to the data".