Workflow Scheduler

Table of Contents
System Overview
Scheduling a Workflow - Internal
Using the Quartz Scheduler
Building and Installing Workflow Scheduler
Access Control
Workflow Scheduler Client
System Overview

The Workflow Scheduler (WFS) was originally implemented as part of the TPC (Threshold for Potential Concern) project. It is one piece in an architecture that currently includes Kepler, Metacat and the Workflow Run Engine.

The WFS is a standalone Java servlet web application. It handles requests to schedule, unschedule, reschedule, delete and serve up workflows. It manages the schedules by wrapping the Quartz Scheduler utility.

The WFS also maintains a database of scheduled workflow information so the schedules can be restored in the event of server restart.

A typical flow to create a workflow can be seen here:

In this case, the steps to create, schedule and run a workflow are:

  1. The user creates the workflow on their instance of Kepler.
  2. The user chooses to save the workflow to an instance of Metacat.
  3. The workflow is saved on a pre-configured instance of Metacat.
  4. The user chooses to schedule the workflow from within Metacat.
  5. Metacat sends a schedule request to the Workflow Scheduler.
  6. When the schedule criteria are met, the WFS requests that the Workflow Run Engine (WRE) run the workflow.
  7. The WRE gets the workflow information from Metacat.
  8. The WRE instantiates a local instance of Kepler and requests that it run the workflow.
  9. Kepler runs the workflow.
  10. Kepler sends the workflow run results to Metacat.
  11. Metacat saves the workflow run results.

Note that this diagram omits several steps in the process, most notably the authentication and authorization steps between the WFS and Metacat. These are covered in the next section.

Scheduling a Workflow - Internal

Here we will look in a little more detail at what happens in the WFS when a workflow is scheduled.

  1. The user logs into Metacat. This is important because the user must have a logged-in session in order to schedule the workflow.
  2. The user can then go to the workflow page to search and view workflows. At the time of this writing, the workflow page was only implemented for the SANParks skin.
  3. The user can choose a workflow to schedule from the workflow page. A schedule form is displayed, and the user can enter a start time, an end time and an interval in hours, days, weeks or months.
  4. Once the user chooses to schedule, Metacat will place a request to the WFS to schedule the workflow.
  5. The WFS will make two calls to Metacat Axis API services to make sure that 1) the session is logged in and 2) the user associated with the session is authorized to schedule the workflow.
  6. The WFS will register the workflow schedule with the Quartz scheduler utility and save the schedule information in the database.
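
The order of operations in steps 5 and 6 can be sketched in plain Java. This is only an illustration under stated assumptions: MetacatChecks and ScheduleRequestSketch are hypothetical names, not classes from the WFS source, and in-memory lists stand in for the Quartz scheduler and the database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the two Metacat checks; not the real Axis API. */
interface MetacatChecks {
    boolean isSessionLoggedIn(String sessionId);
    boolean isAuthorized(String sessionId, String workflowId);
}

/** Sketch of the order of operations when a schedule request arrives. */
class ScheduleRequestSketch {
    private final MetacatChecks metacat;
    // In-memory lists stand in for the Quartz scheduler and the WFS database.
    final List<String> registeredWithScheduler = new ArrayList<>();
    final List<String> savedToDatabase = new ArrayList<>();

    ScheduleRequestSketch(MetacatChecks metacat) {
        this.metacat = metacat;
    }

    /** Returns true only if both checks pass and the schedule was recorded. */
    boolean schedule(String sessionId, String workflowId) {
        if (!metacat.isSessionLoggedIn(sessionId)) {
            return false; // step 5, first check: no logged-in session
        }
        if (!metacat.isAuthorized(sessionId, workflowId)) {
            return false; // step 5, second check: user may not schedule this workflow
        }
        registeredWithScheduler.add(workflowId); // step 6: register the schedule
        savedToDatabase.add(workflowId);         // step 6: persist it for restarts
        return true;
    }
}
```

The point of the sketch is the ordering: nothing is registered or saved unless both Metacat checks succeed.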

Here is a sequence diagram including the class names for the process just described:

Using the Quartz Scheduler

This section will briefly cover the WFS's use of the Quartz Scheduling functionality. For a more detailed overview of Quartz, see the Quartz documentation.

The WFS contains a service (SchedulerService.java) which wraps the Quartz scheduling functionality. It maintains a static instance of the Quartz scheduler (org.quartz.Scheduler). When the SchedulerService is instantiated, it creates an instance of the Quartz scheduler factory (org.quartz.impl.StdSchedulerFactory) and uses this to get the Quartz scheduler.

When a request is made to schedule a new job, the SchedulerService creates a Quartz job detail object (org.quartz.JobDetail) which holds the information needed to create the new schedule. This information includes the job name, the job group name, and the job class name.

The job is then scheduled using the Quartz trigger class (org.quartz.Trigger) and Quartz trigger utilities class (org.quartz.TriggerUtils). These are used to create triggers that fire on a secondly, minutely, hourly, daily, weekly or monthly basis. Note that the SchedulerService contains individual methods to handle each of these intervals, but for the most part, they do the same things and exist to facilitate ease of use.
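
To make the idea of an interval trigger concrete, the sketch below lists the times at which a schedule with a given start time, end time and interval would fire. It is a plain-Java illustration using java.time, not Quartz code; the class and method names (IntervalFireTimes, fireTimes) are hypothetical, and the inclusive end time is an assumption of the sketch.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of interval firing; this is not Quartz code. */
class IntervalFireTimes {
    /**
     * Lists every fire time from start to end (inclusive), stepping by
     * interval units. Each time is computed from start rather than from the
     * previous fire time, so a monthly schedule starting on the 31st does
     * not drift to an earlier day after passing through February.
     */
    static List<LocalDateTime> fireTimes(LocalDateTime start, LocalDateTime end,
                                         long interval, ChronoUnit unit) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        List<LocalDateTime> times = new ArrayList<>();
        for (long n = 0; ; n++) {
            LocalDateTime next = start.plus(n * interval, unit);
            if (next.isAfter(end)) {
                break; // past the end of the schedule window
            }
            times.add(next);
        }
        return times;
    }
}
```

For example, calling fireTimes with ChronoUnit.HOURS and an interval of 2 yields one entry every two hours between the start and end times.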

The job is then registered with the Quartz scheduler class.

Once the criteria for a trigger are met, the Quartz scheduler uses reflection to instantiate the handler class that needs to run (see the job class name in the Quartz job detail object). In the WFS, there is currently only one handler class (WorkflowJob.java). This class implements the Quartz interruptible job interface (org.quartz.InterruptableJob), and Quartz calls its execute() method when the trigger fires.
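
The reflection-based dispatch can be illustrated without Quartz on the classpath. In the sketch below, SketchJob is a hypothetical stand-in for the Quartz job interface, RecordingJob is a hypothetical handler, and ReflectiveDispatch.fire() shows the general technique of turning a stored class name into a running handler. It is not the Quartz implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the Quartz job interface. */
interface SketchJob {
    void execute(String jobName);
}

/** Hypothetical handler that only records which jobs it was asked to run. */
class RecordingJob implements SketchJob {
    static final List<String> RUNS = new ArrayList<>();

    @Override
    public void execute(String jobName) {
        RUNS.add(jobName);
    }
}

class ReflectiveDispatch {
    /** Loads the handler class by name, instantiates it, and runs it. */
    static void fire(String jobClassName, String jobName) {
        try {
            Class<?> handlerClass = Class.forName(jobClassName);
            // The handler needs a no-argument constructor for this to work.
            SketchJob job = (SketchJob) handlerClass.getDeclaredConstructor().newInstance();
            job.execute(jobName);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot run job class " + jobClassName, e);
        }
    }
}
```

A call such as ReflectiveDispatch.fire(RecordingJob.class.getName(), "nightly-workflow") runs the handler once. Because only the class name is stored with the schedule, new handler classes can be added without changing the dispatch code.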

The WorkflowJob class handles the communication with the WRE axis client.

Building and Installing Workflow Scheduler

These are the steps to install the Workflow Scheduler. The instructions assume that you already have a working installation of Apache and Tomcat.


Build the workflow scheduler from the source:

Or download the binary release.

Update Apache configuration:

Restart:

Now the scheduler will be running with the default configuration. The following instructions give more details about how to modify the default configuration.


Configure the user who can execute the workflow run engine:

The workflow scheduler often schedules jobs to run the workflow run engine, which is secured by the Axis Rampart component. The workflow run engine allows only a configured set of users to run it. The default user name is uid=kepler,o=unaffiliated,dc=ecoinformatics,dc=org. You may change it by:

The default database for the scheduler is HSQL. You don't need to modify anything if you would like to use it. However, you may change it to PostgreSQL:

You may configure a Metacat instance to access the workflow scheduler as a client:

Access Control

By default, the Workflow Scheduler allows any KNB user to schedule a workflow on which that user has READ permission. However, the administrator can edit workflowscheduler.properties, located in your-webapps/workflowscheduler/WEB-INF, to enable access control so that only a specified set of users can use the Workflow Scheduler to schedule a workflow. A user who schedules a workflow must still have READ permission on it.


accessControl.isAccessControlOn=true
accessControl.allowedUsers=uid=kepler,o=unaffiliated,dc=ecoinformatics,dc=org:cn=knb-dev,o=NCEAS,dc=ecoinformatics,dc=org


The above setting will allow the user kepler and the users in the knb-dev group to use the scheduler. Any other users will be rejected.
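
As a rough illustration of how these two properties could be interpreted, the sketch below parses them with java.util.Properties and checks a user against the allowed list. It rests on several assumptions: AccessControlSketch is a hypothetical class, not the WFS implementation; entries are taken to be separated by ":" as the sample value suggests; names are compared as exact strings; and the caller supplies the user's group names, since resolving group membership (for example through LDAP) is outside the sketch.

```java
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;

/** Hypothetical sketch of the access-control check; not the WFS source. */
class AccessControlSketch {
    private final boolean enabled;
    private final Set<String> allowed = new LinkedHashSet<>();

    AccessControlSketch(Properties props) {
        // Assumption: access control is off when the property is absent.
        enabled = Boolean.parseBoolean(
                props.getProperty("accessControl.isAccessControlOn", "false"));
        String raw = props.getProperty("accessControl.allowedUsers", "");
        // Assumption: ":" separates entries, as in the sample configuration.
        for (String entry : raw.split(":")) {
            String name = entry.trim();
            if (!name.isEmpty()) {
                allowed.add(name);
            }
        }
    }

    /** True if the user, or any group the user belongs to, is on the list. */
    boolean isAllowed(String userDn, Set<String> userGroups) {
        if (!enabled) {
            return true; // default behaviour: no restriction by user list
        }
        if (allowed.contains(userDn)) {
            return true;
        }
        for (String group : userGroups) {
            if (allowed.contains(group)) {
                return true;
            }
        }
        return false;
    }
}
```

This check only decides who may use the scheduler at all. The READ permission on the individual workflow is a separate check that still applies.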

Workflow Scheduler Client

The Workflow Scheduler Client provides a Java library for accessing the Workflow Scheduler server from Java applications.


Build the client from the source:

Or download the binary client:

Java APIs: