Scientific Workflow Management by Database Management
In many working environments, production involves repeated executions of certain procedures. A workflow describes the individual tasks performed in these procedures and their interrelationships. Current Workflow Management Systems (WFMSs) use a Database Management System (DBMS) only to store task descriptions, and implement all workflow functionality in modules that run on top of the DBMS. Motivated by scientific workflows, we propose a far more DBMS-centric architecture, in which conventional database technology provides much of the desired scientific WFMS functionality. A key element of our approach is viewing the workflow as a web of data objects interconnected with active links that carry process descriptions. The workflow is fully defined as a database schema, and its execution is the gradual buildup of an instance of this schema through the active object links. For our work, we use the modeling and querying tools of Horse, the object-oriented DBMS that we have developed in the context of the Zoo Desktop Experiment Management Environment.
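
To make the central idea concrete, the following is a minimal illustrative sketch in Python, not Horse's actual data model or query interface. It assumes a workflow schema can be approximated as a set of class names plus "active links", each of which names its source classes, its target class, and the process that derives the target. Every name in it (ActiveLink, WorkflowSchema, execute, and the example classes RawReadings, CalibratedReadings, Summary) is hypothetical and chosen only for illustration. Execution is modeled as repeatedly firing any link whose sources are present in the instance and whose target is still missing, so the instance of the schema builds up gradually.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ActiveLink:
    """Hypothetical schema-level link that carries a process description."""
    sources: Tuple[str, ...]        # classes whose objects feed the process
    target: str                     # class of the object the process produces
    process: Callable[..., object]  # the task itself


@dataclass
class WorkflowSchema:
    """Hypothetical workflow definition: data classes plus active links."""
    classes: List[str]
    links: List[ActiveLink]


def execute(schema: WorkflowSchema, initial: Dict[str, object]) -> Dict[str, object]:
    """Build up an instance of the schema by firing links until none can fire."""
    instance = dict(initial)
    progress = True
    while progress:
        progress = False
        for link in schema.links:
            ready = all(s in instance for s in link.sources)
            if ready and link.target not in instance:
                inputs = [instance[s] for s in link.sources]
                instance[link.target] = link.process(*inputs)
                progress = True
    return instance


def calibrate(raw):
    # Assumed example task: subtract a fixed offset from each reading.
    return [x - 0.5 for x in raw]


def summarize(calibrated):
    # Assumed example task: mean of the calibrated readings.
    return sum(calibrated) / len(calibrated)


schema = WorkflowSchema(
    classes=["RawReadings", "CalibratedReadings", "Summary"],
    links=[
        # Listed out of order on purpose: firing is driven by data availability.
        ActiveLink(("CalibratedReadings",), "Summary", summarize),
        ActiveLink(("RawReadings",), "CalibratedReadings", calibrate),
    ],
)

instance = execute(schema, {"RawReadings": [1.0, 2.0, 3.0]})
```

In this sketch the order in which tasks run is not written down anywhere; it follows from which objects already exist in the instance, which is one way to read the claim that execution is the gradual buildup of a schema instance through its active links.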