Jurassic Park - where DynaSOArs play: 2006

Wednesday, May 03, 2006

Contract first

I had an excellent e-mail exchange with Jim and Savas about proper message orientation, and now I feel I have a much clearer concept. The way I have implemented DynaSOAr 2.0 is message-oriented and loosely coupled, but there is one drawback - with the current WS tools, the consumer will not be able to generate any metadata about the service and the messages it consumes. In a proper message-oriented service, you need to define your messages first and interact with the service by sending those messages, rather than business objects, which is the common tendency in the current WS programming style. Locally these messages are "java objects" - you create them and set their properties - but that is not the same as interacting with the service by sending your internal business objects.

What do I mean by this? Let's take a couple of examples:

(1) Sending business objects:

public class PurchaseOrder { }

public class OrderedItem { }

public OrderedItem processPO (PurchaseOrder myOrder) { }

This is the common style where you send your business objects (like PurchaseOrder) while interacting with your service. Effectively, you are exposing the internal details of your service, and you also need several operations (like deletePO, updatePO etc.) to perform the different kinds of processing.

(2) Sending messages:

public class POMessage { }

public class ResponseMessage { }

public ResponseMessage processMessage (POMessage myMessage) { }

Here, you are explicitly sending messages, and not exposing your business objects. Ideally, you can have different types of messages, and the service should be able to deal with them differently based on the type of each message - which means you do not have to expose a CORBA-style OO interface.
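
A minimal sketch of what "deal with them differently based on the type of each message" could look like in Java. The type field, the handler table and the three message types ("create", "update", "delete") are assumptions made for illustration; they are not taken from DynaSOAr itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical message classes: the fields are illustrative assumptions.
class POMessage {
    final String type;     // which kind of processing is requested
    final String orderId;  // payload carried by the message

    POMessage(String type, String orderId) {
        this.type = type;
        this.orderId = orderId;
    }
}

class ResponseMessage {
    final String status;

    ResponseMessage(String status) {
        this.status = status;
    }
}

// The service exposes one operation; the message type selects the behaviour,
// so no separate createPO/updatePO/deletePO operations are published.
class POService {
    private final Map<String, Function<POMessage, ResponseMessage>> handlers = new HashMap<>();

    POService() {
        handlers.put("create", m -> new ResponseMessage("created " + m.orderId));
        handlers.put("update", m -> new ResponseMessage("updated " + m.orderId));
        handlers.put("delete", m -> new ResponseMessage("deleted " + m.orderId));
    }

    ResponseMessage processMessage(POMessage message) {
        Function<POMessage, ResponseMessage> handler = handlers.get(message.type);
        if (handler == null) {
            // Unknown message types are answered with a message, not an exception.
            return new ResponseMessage("unsupported message type: " + message.type);
        }
        return handler.apply(message);
    }
}
```

Adding a new kind of processing then means adding a message type and a handler, while the published operation stays the same.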

So, this is the style I will be adopting for a revision of DynaSOAr 2.0 - "contract first", as it is termed in the WS-world these days.

And there is a "contract first" issue elsewhere too - my contract runs out on June 15th. Unless it is extended soon, I will not be able to finish what I started. Yes, I can always go back home (to India) and get another job there - but, in that case I won't be able to complete my PhD here - which will mean that I will have wasted a great deal of time...

I am a little tense about this - hoping that something will come up soon.

Tuesday, April 25, 2006

A step closer to the fully dynamic DQP

The guys from EPCC mentioned that they are trying to create a lighter version of OGSA-DAI which would be readily deployable within a container - as a WAR file. This is good news for me. So far, deploying OGSA-DAI has been a rather complicated process - first you had to deploy OGSA-DAI, then add a data service, then create a data service resource and finally add the resource to the service. As a result, I wasn't able to deploy OGSA-DAI on the fly on a node where the data resides. Once this version of OGSA-DAI is available (soon), I will move one step closer to the concept of "moving code to the data" - because I already have a version of DQP which can deploy my evaluator services on the fly, and I am now adding the code to deploy the analysis services on the fly as well.

The code is mostly ready, but I can't test it because, for some reason, some machines in the giga-cluster have been down or unreachable since yesterday :-(

Old, but still gold

It's about ten years old, but still worth reading - an essay by Neal Stephenson (author of Snow Crash) - "In the Beginning... Was the Command Line".

PS: Whatever Neal writes in the essay, and however well he writes it, I still love my Powerbook G4. Apple has come a long way since the days when improper memory handling caused its machines to crash... Looks-wise, Powerbooks, iBooks and iMacs are sleek, actually resembling luxury cars, and performance-wise they seem to be superior to the Windows machines I use. But yes, you need to know how to drive before you start driving a Jaguar, don't you?

Wednesday, April 12, 2006

DynaSOAr 2.0 is ready

Finally, I have been able to finish off the DynaSOAr 2.0 prototype.

This new prototype extends the previous one with the concept of a broker, and I have also added the DynaSOAr registries (using GRIMOIRES). In the new architecture, DynaSOAr maintains an internal registry (for its own use) which is updated every time a new service is deployed or added to the repository. For example, when a service is added to the repository for the first time, an entry is made in the registry, with the endpoint of the repository stored in the service's categoryBag. Initially, there are no AccessPoints (service endpoints) because the service is not deployed anywhere. When the service is deployed on some node, the registry is updated and a new AccessPoint is added to the service entry.
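
The registry life cycle described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class and method names are hypothetical, and it does not use the GRIMOIRES or UDDI APIs at all.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for one registry entry.
class ServiceEntry {
    final String name;
    final String repositoryEndpoint;                       // where the service code lives
    final List<String> accessPoints = new ArrayList<>();   // empty until deployed

    ServiceEntry(String name, String repositoryEndpoint) {
        this.name = name;
        this.repositoryEndpoint = repositoryEndpoint;
    }
}

// Hypothetical stand-in for the internal registry.
class InternalRegistry {
    private final Map<String, ServiceEntry> entries = new HashMap<>();

    // Called when a service is added to the repository for the first time.
    void addToRepository(String serviceName, String repositoryEndpoint) {
        entries.putIfAbsent(serviceName, new ServiceEntry(serviceName, repositoryEndpoint));
    }

    // Called after a successful deployment on some node.
    void recordDeployment(String serviceName, String accessPoint) {
        require(serviceName).accessPoints.add(accessPoint);
    }

    List<String> accessPointsOf(String serviceName) {
        return Collections.unmodifiableList(require(serviceName).accessPoints);
    }

    String repositoryOf(String serviceName) {
        return require(serviceName).repositoryEndpoint;
    }

    private ServiceEntry require(String serviceName) {
        ServiceEntry entry = entries.get(serviceName);
        if (entry == null) {
            throw new IllegalArgumentException("unknown service: " + serviceName);
        }
        return entry;
    }
}
```

An empty access point list is the signal that the service is known but not yet deployed anywhere.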

In the new architecture, the DynaSOAr Service Provider receives a service request (say, meant for Service A) from a consumer. It simply adds the abstract name of the service and the endpoint of the internal registry to the message header and passes the message on to the entity bound to it, which can be a Broker or a HostProvider. The Broker and the HostProvider have similar interfaces, but they function differently. A HostProvider either manages a cluster of nodes, in which case it is a ROOT, or is itself one of the nodes in the cluster, in which case it is a LEAF. So the Service Provider always has an entity locally bound to it: a Broker connecting to other Brokers or HostProviders, a ROOT HostProvider, or a LEAF HostProvider. Services are always deployed on a LEAF HostProvider. A Broker decides where to forward the request if it is connected to other Brokers or HostProviders. Right now I have a very simple random scheduler, but there is a group at Newcastle looking into proper scheduling algorithms for this. Once a HostProvider receives the request, it checks whether the service is already deployed within its domain - in which case the request is passed on to that node. If the service hasn't yet been deployed, the service code (currently only WAR files) is fetched from the repository and deployed on a target node...

(Right now, I am not considering hybrid nodes - where a node may be responsible for managing other nodes and also deploy services on itself.)
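
The routing between Broker, ROOT and LEAF can be sketched roughly as follows. All names here are hypothetical, deployment is reduced to recording the service name on the leaf, and picking the target node at random is an assumption: the post only says the Broker's scheduler is random, not how a HostProvider chooses a target node.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Shared interface, reflecting that Broker and HostProvider look alike to callers.
interface RequestHandler {
    // Returns the name of the LEAF node that ends up serving the request.
    String handle(String serviceName);
}

class LeafHostProvider implements RequestHandler {
    final String nodeName;
    final Set<String> deployed = new HashSet<>();

    LeafHostProvider(String nodeName) {
        this.nodeName = nodeName;
    }

    public String handle(String serviceName) {
        // Stand-in for fetching the WAR file from the repository and deploying it;
        // adding an already deployed service is a no-op.
        deployed.add(serviceName);
        return nodeName;
    }
}

class RootHostProvider implements RequestHandler {
    private final List<LeafHostProvider> leaves;
    private final Random random;

    RootHostProvider(List<LeafHostProvider> leaves, Random random) {
        this.leaves = leaves;
        this.random = random;
    }

    public String handle(String serviceName) {
        // Prefer a node in this domain that already hosts the service.
        for (LeafHostProvider leaf : leaves) {
            if (leaf.deployed.contains(serviceName)) {
                return leaf.handle(serviceName);
            }
        }
        // Otherwise deploy on a target node (chosen at random: an assumption).
        return leaves.get(random.nextInt(leaves.size())).handle(serviceName);
    }
}

class Broker implements RequestHandler {
    private final List<RequestHandler> targets;  // other Brokers or HostProviders
    private final Random random;

    Broker(List<RequestHandler> targets, Random random) {
        this.targets = targets;
        this.random = random;
    }

    public String handle(String serviceName) {
        // Very simple random scheduler.
        return targets.get(random.nextInt(targets.size())).handle(serviceName);
    }
}
```

Because a Broker only holds RequestHandler references, Brokers can be chained to other Brokers or to HostProviders without the caller knowing which it is talking to.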

I have tried some new things in this version - for example, using Castor instead of Axis-generated stubs. I am using the recommended signature for message-oriented services (because these services need to deal with message headers). I defined all the messages in a schema and used Castor to generate the Java bindings. Within the services, messages are created using these bindings and then marshalled into an XML document by Castor; in the other direction, an incoming XML message is unmarshalled into Java objects by Castor. I am not sure whether anyone has tried this approach before, or whether it is the proper approach for message orientation. But it seems to fit the concept of message-oriented services, where you first define your schema, and there is only one method in the service interface, which decides what to do based on the message received...

Further, I have been able to use this framework with OGSA-DQP, where the evaluator services can now be dynamically deployed on the nodes which contain the data resources. So, this is one step in the direction of achieving what we call "moving the code to the data".

Friday, January 13, 2006

GRIMOIRES doing funny things

As I mentioned in my past two posts, I have been trying to use GRIMOIRES as the registry in my dynamic deployment work. The idea is that when a service is deployed on a node, it will be registered with the GRIMOIRES registry, which is the basic concept of UDDI. What we are trying to add is a dynamic deployment feature, where the Service Provider will advertise a set of services, which may or may not be deployed somewhere. The flow is as follows:

A consumer contacts the Service Provider (SP) and finds the services that are supported by the SP. The consumer then decides which service is to be invoked, and sends a request message (SOAP) to the SP. The SP looks up the registry and finds out on which nodes this service is already deployed. If such a node is found, the request is forwarded to that node and, on completion, the response is sent back to the consumer. But if there are no nodes on which the service is already deployed, the SP sends a message to a suitable host to deploy this service dynamically. In this message, the SP provides the service name/ID and the location where the deployment code can be found. In the DynaSOAr work, we have already developed this infrastructure (except the registry), where the deployable code can be fetched from a code-store (which is again a service). I am now trying to develop a registry, and that is where GRIMOIRES comes in. I could have developed a simple utility that stores all the required information in a MySQL backend, but that would defeat the purpose of UDDI.

So, I need to add "businessEntities" and "businessServices" to the GRIMOIRES registry. The services would have more than one bindingTemplate - because a service can be deployed on more than one node. Each service should also have a reference to the Code-Store URL, from where the service code can be fetched during hot-deployment. The UDDI specs allow more than one bindingTemplate, and there is a concept of tModels, which can be used for reference purposes - which exactly suits the requirement for the reference to the Code-Store. This is what I tried to do. Adding a businessEntity and a businessService did not prove to be difficult at all. But the problem cropped up when I added bindingTemplates, more specifically more than one of them. It seems GRIMOIRES creates duplicate entries while doing this via the UDDIBrowser. So, once I add a bindingTemplate to a service, the registry is updated, and when the businessEntity (under which the businessService is created) is expanded, multiple copies of the same entry are displayed (using UDDIBrowser). But, funnily, if a query is sent to the registry via the inquiry interface of the same UDDIBrowser, it returns the correct number of services... I suspect it sends a "select distinct"-like query to the database.

Other than this, I think I am comfortable with the tModel concept for CodeStore. So, each of the services registered with the registry will have multiple bindingTemplates, and one tModel reference. I have created a tModel as follows:

<tModel tModelKey="some tModel key - uuid">
<name>CodeStoreLocation</name>
<description xml:lang="en">some description</description>
<overviewDoc>
 <overviewURL>http://codestore.url</overviewURL>
</overviewDoc>
</tModel>

And I am using this as a reference within the businessServices - as categoryBag entries.

<businessService
businessKey="544d3b47-c908-449a-9f2d-8f5c4f69fa9e"
serviceKey="cdb0c903-839b-4a74-ae1d-c3981e55e27a">
<name>QueryEvaluationService</name>
<bindingTemplates>
 <bindingTemplate
   bindingKey="c8cdc9a5-f070-4e16-9e09-f78b3842f9c3"
   serviceKey="cdb0c903-839b-4a74-ae1d-c3981e55e27a">
   <accessPoint URLType="http">serviceURL</accessPoint>
   <tModelInstanceDetails>
     <tModelInstanceInfo tModelKey="some key"/>
   </tModelInstanceDetails>
 </bindingTemplate>
 <bindingTemplate
   bindingKey="ca8d1c0e-9312-4b92-9d80-edb7c247a813"
   serviceKey="cdb0c903-839b-4a74-ae1d-c3981e55e27a">
   <accessPoint URLType="http">service URL</accessPoint>
   <tModelInstanceDetails>
     <tModelInstanceInfo tModelKey="some key"/>
   </tModelInstanceDetails>
 </bindingTemplate>
</bindingTemplates>
<categoryBag>
 <keyedReference
   keyName="CodeStoreLocation"
   keyValue="http://codestore.is.somewhere"
   tModelKey="dqp:uk.org.ogsadai:codestore:location"/>
</categoryBag>
</businessService>

I guess this should work...
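
To show how a consumer of such an entry might read it back, here is a rough sketch that pulls the access points and the CodeStoreLocation reference out of a businessService document using only the JDK's DOM parser. The ServiceDescription class is hypothetical, and the sketch assumes the XML has already been obtained as a string with no namespace prefixes; it does not talk to GRIMOIRES or any UDDI inquiry API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical holder for the parts of a businessService used for deployment.
class ServiceDescription {
    final List<String> accessPoints = new ArrayList<>();
    String codeStoreLocation;  // stays null if no CodeStoreLocation reference exists

    static ServiceDescription parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed businessService document", e);
        }

        ServiceDescription description = new ServiceDescription();

        // One accessPoint per bindingTemplate, i.e. one per node hosting the service.
        NodeList points = doc.getElementsByTagName("accessPoint");
        for (int i = 0; i < points.getLength(); i++) {
            description.accessPoints.add(points.item(i).getTextContent().trim());
        }

        // The code store is identified by the keyName of the keyedReference.
        NodeList refs = doc.getElementsByTagName("keyedReference");
        for (int i = 0; i < refs.getLength(); i++) {
            Element ref = (Element) refs.item(i);
            if ("CodeStoreLocation".equals(ref.getAttribute("keyName"))) {
                description.codeStoreLocation = ref.getAttribute("keyValue");
            }
        }
        return description;
    }
}
```

A service with no bindingTemplates yields an empty access point list, which is the case where the code has to be fetched from codeStoreLocation and deployed.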

Thursday, January 12, 2006

UDDI Registry And tModels

For the past few days I have been scratching my head over how to describe the services that I will put in the DynaSOAr registry for the dynamic distributed query processing work. I have been exploring GRIMOIRES as a possible option. It provides a GShell interface to interact with the registry; alternatively, the UDDIBrowser can be used to browse the contents and publish/query entries in the registry. Unfortunately, UDDIBrowser comes with minimal (or rather, no) documentation, which led to several problems in publishing new entries, especially tModels. A Google search led me to this article on tModels - I found it quite useful. I now have a clearer idea of how to describe the services to be exposed by the registry - especially how to identify the code store repository from where the service code can be fetched in case of dynamic deployment, and also information about the service or the virtual machine image...

I still have problems with GRIMOIRES, which I suspect come from a bug in its UDDI server - because the same steps against jUDDI were successful.
