How would you implement a Workflow system?


1

I need to implement a Workflow system.

For example, to export some data, I need to:

  1. Use an XSLT processor to transform an XML file
  2. Convert the result of the transformation into an arbitrary data structure
  3. Use the result (file or data) to generate an archive
  4. Move the archive into a given folder.

I started by creating two kinds of classes: Workflow, which is responsible for adding new Step objects and running them, and Step.

Each Step implements a StepInterface.

My main concern is that every step depends on the previous one (except the first), and I'm wondering what the best way to handle this would be.

I thought of looping over the steps and passing each step the result of the previous one (if any), but I'm not really happy with that.

Another idea would be to allow a "previous" Step to be set on a Step, like:

$previous = new Step();
$s = new Step();
$s->setPreviousStep($previous); // setPreviousStep(Step $step)

But then the Workflow class loses its purpose.

Any ideas or advice?

By the way, I'm also concerned about the success or failure of the whole workflow: if any step fails, I need to roll back or clean up the data produced by the previous steps.

2012-04-05 16:41
by Trent


3

I implemented a similar workflow engine last year (closed source though, so no code that I can share). Here are a few ideas based on that experience:

  1. StepInterface - can do what you're doing right now - abstract a single step.
  2. Additionally, provide a rollback capability, but I think a step should know when it fails and clean up after itself before the workflow proceeds any further. An abstract step can handle this for you (template method).
  3. You might want to consider branching based on the StepResult - so you could do a StepMatcher that takes a stepResult object and a conditional - its sub-steps are executed only if the conditional returns true.
  4. You could also do a StepException to handle exceptional flows if a step errors out. Ideally, this is something that you can define either at a workflow level (do this if any step fails) and/or at a step level.
  5. I took the approach that a step returns a well-defined structure (StepResult) that's available to the next step. If there's bulky data (say, a large file), then the URI/locator of the resource is passed in the StepResult.
  6. Your workflow is going to need a context to work with - in the example you quote, this would be the name of the file, the location of the archive and so on - so think of a WorkflowContext
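
A minimal sketch of points 1, 2, 5 and 6, in PHP to match the question. Every name here (WorkflowContext, StepResult, AbstractStep, execute(), cleanUp(), ArchiveNameStep, the archive_dir key) is hypothetical and only illustrates the shape; it is not code from the engine mentioned above.

```php
<?php
// Shared, read-only settings for one workflow run (file names, target folder, ...).
final class WorkflowContext
{
    private array $values;

    public function __construct(array $values = [])
    {
        $this->values = $values;
    }

    public function get(string $key, $default = null)
    {
        return array_key_exists($key, $this->values) ? $this->values[$key] : $default;
    }
}

// Well-defined structure handed from one step to the next.
final class StepResult
{
    public bool $success;
    public $data;           // small payload, or a URI/locator for bulky data
    public ?string $error;

    public function __construct(bool $success, $data = null, ?string $error = null)
    {
        $this->success = $success;
        $this->data = $data;
        $this->error = $error;
    }
}

interface StepInterface
{
    public function run(WorkflowContext $context, ?StepResult $previous = null): StepResult;
}

// Template method: run() fixes the failure handling, subclasses fill in the work.
abstract class AbstractStep implements StepInterface
{
    final public function run(WorkflowContext $context, ?StepResult $previous = null): StepResult
    {
        try {
            return new StepResult(true, $this->execute($context, $previous));
        } catch (Exception $e) {
            $this->cleanUp($context); // the step cleans up after itself
            return new StepResult(false, null, $e->getMessage());
        }
    }

    abstract protected function execute(WorkflowContext $context, ?StepResult $previous);

    protected function cleanUp(WorkflowContext $context): void
    {
        // Nothing to clean up by default.
    }
}

// Example step: derives the archive path from the previous step's file locator.
final class ArchiveNameStep extends AbstractStep
{
    protected function execute(WorkflowContext $context, ?StepResult $previous)
    {
        if ($previous === null) {
            throw new RuntimeException('No input file to archive.');
        }
        return $context->get('archive_dir', '/tmp') . '/' . basename($previous->data) . '.zip';
    }
}

$context = new WorkflowContext(['archive_dir' => '/var/exports']);
$result = (new ArchiveNameStep())->run($context, new StepResult(true, '/tmp/export.xml'));
// $result->data is '/var/exports/export.xml.zip'
```

Because run() is final, every step reports failure the same way, and the workflow only has to look at StepResult::$success to decide whether to continue.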

Additional thoughts

You might want to consider the following too - if this is something that you're planning to implement as a large scale service/server:

  1. Steps could live in libraries that are loaded dynamically.
  2. Workflow definitions in an XML/JSON file, again dynamically reloaded when edited.
  3. Remote invocation and callback: submit a job to a remote service with a callback API. When the remote service calls back, workflow execution resumes at the next step in the flow.
  4. Parallel execution where possible, etc.
  5. Stateless design.

2012-04-05 19:25
by Raghu


1

Rolling back fits into this structure easily: each Step implements its own rollback() method, which the workflow can call (preferably in reverse order) if any of the steps fails.
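
As an illustration only (in PHP, to match the question), reverse-order rollback could look like the sketch below. RollbackableStep, Workflow and LoggingStep are hypothetical names. The sketch also assumes that a step signals failure by throwing and that the failing step cleans up after itself, so only the steps that completed are rolled back.

```php
<?php
// Hypothetical contract: run() throws on failure, rollback() undoes a completed run().
interface RollbackableStep
{
    public function run($input);
    public function rollback(): void;
}

final class Workflow
{
    /** @var RollbackableStep[] */
    private array $steps = [];

    public function add(RollbackableStep $step): self
    {
        $this->steps[] = $step;
        return $this;
    }

    public function run($input = null)
    {
        $completed = [];
        try {
            foreach ($this->steps as $step) {
                $input = $step->run($input); // each step receives the previous output
                $completed[] = $step;
            }
        } catch (Exception $e) {
            // Undo only the steps that finished, most recent first.
            foreach (array_reverse($completed) as $step) {
                $step->rollback();
            }
            throw $e;
        }
        return $input;
    }
}

// Demo step that records what happens to it.
final class LoggingStep implements RollbackableStep
{
    public static array $log = [];
    private string $name;
    private bool $fail;

    public function __construct(string $name, bool $fail = false)
    {
        $this->name = $name;
        $this->fail = $fail;
    }

    public function run($input)
    {
        if ($this->fail) {
            throw new RuntimeException($this->name . ' failed');
        }
        self::$log[] = 'run ' . $this->name;
        return $input;
    }

    public function rollback(): void
    {
        self::$log[] = 'rollback ' . $this->name;
    }
}

$workflow = (new Workflow())
    ->add(new LoggingStep('transform'))
    ->add(new LoggingStep('archive'))
    ->add(new LoggingStep('move', true)); // the last step fails

try {
    $workflow->run();
} catch (RuntimeException $e) {
    // LoggingStep::$log is now:
    // ['run transform', 'run archive', 'rollback archive', 'rollback transform']
}
```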

As for the main question, it really depends on how sophisticated you want to get. On a basic level, you can define a StepResult interface, which is returned by each step and passed on to the next one. The obvious problem with this approach is that each step has to "know" which implementation of StepResult to expect. For small systems this may be acceptable; for larger systems you'd probably need some kind of configurable mapping framework that can be told how to convert the result of the previous step into the input of the next one. So Workflow calls Step, Step returns StepResult, Workflow then calls StepResultConverter (which is your configurable mapping thingy), StepResultConverter returns a StepInput, Workflow then calls the next Step with that StepInput, and so on.
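
A rough PHP sketch of that call chain follows. The class names come from the paragraph above, but their shape is an assumption: here the "configuration" is just a key map (result key => input key) handed to the converter, and TransformStep and ArchiveStep are made-up example steps.

```php
<?php
final class StepInput
{
    public array $values;

    public function __construct(array $values)
    {
        $this->values = $values;
    }
}

final class StepResult
{
    public array $values;

    public function __construct(array $values)
    {
        $this->values = $values;
    }
}

interface Step
{
    public function run(StepInput $input): StepResult;
}

// Configurable mapping: renames result keys to the keys the next step expects.
final class StepResultConverter
{
    private array $keyMap;

    public function __construct(array $keyMap)
    {
        $this->keyMap = $keyMap;
    }

    public function convert(StepResult $result): StepInput
    {
        $values = [];
        foreach ($this->keyMap as $from => $to) {
            if (array_key_exists($from, $result->values)) {
                $values[$to] = $result->values[$from];
            }
        }
        return new StepInput($values);
    }
}

final class Workflow
{
    private array $stages = [];

    // The converter prepares the input of $step from the previous result.
    public function add(Step $step, ?StepResultConverter $converter = null): self
    {
        $this->stages[] = [$step, $converter];
        return $this;
    }

    public function run(StepInput $input): StepResult
    {
        $result = new StepResult($input->values);
        foreach ($this->stages as [$step, $converter]) {
            // Without a converter, values are passed through unchanged.
            $input = $converter ? $converter->convert($result) : new StepInput($result->values);
            $result = $step->run($input);
        }
        return $result;
    }
}

final class TransformStep implements Step
{
    public function run(StepInput $input): StepResult
    {
        return new StepResult(['outputFile' => $input->values['xml'] . '.out']);
    }
}

final class ArchiveStep implements Step
{
    public function run(StepInput $input): StepResult
    {
        return new StepResult(['archive' => $input->values['source'] . '.zip']);
    }
}

$workflow = (new Workflow())
    ->add(new TransformStep())
    ->add(new ArchiveStep(), new StepResultConverter(['outputFile' => 'source']));

$result = $workflow->run(new StepInput(['xml' => 'data.xml']));
// $result->values is ['archive' => 'data.xml.out.zip']
```

Neither step knows about the other's key names; only the converter configuration ties them together.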

2012-04-05 18:00
by biziclop
Hi, thank you for your valuable answer. I did something similar (at least the first step, http://pastebin.com/KF2bp0Wy). I also thought of a StepResult, but not of a StepInput. Concerning the "configuration": as I'm working with Symfony2, I was thinking of using a ConfigurationBuilder to validate the result - Trent 2012-04-05 18:19
I'm afraid I can't give you any implementation help, unless you're doing it in Java. :) Note though that the "natural" way to do rollback is to call the rollbacks in reverse order, so that each step can know that it's in exactly the same state as it was after it ran - biziclop 2012-04-05 18:32


1

I've had great success implementing workflows using a finite state machine. It can be as simple or as complicated as you like, with multiple workflows linking to each other. Generally an FSM can be implemented as a simple transition table. The current state of a given object is tracked in a history table: keep a journal of the transitions on the object and simply retrieve the last entry. So a transition would be of the form:

nextState = TransLookup(currState, Event, [Condition])

If you are implementing a front end you can use this transition information to construct a list of the events available to a given object in its current state.
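
A small PHP sketch of the idea, with in-memory arrays standing in for the transition and history tables. The StateMachine class, its method names and the example states and events are all hypothetical, and the optional Condition argument is left out to keep it short.

```php
<?php
final class StateMachine
{
    private string $initialState;
    private array $transitions;   // [current state][event] => next state
    private array $journal = [];  // history: one row per applied transition

    public function __construct(string $initialState, array $transitions)
    {
        $this->initialState = $initialState;
        $this->transitions = $transitions;
    }

    // The current state is simply the target of the last journal entry.
    public function currentState(): string
    {
        if ($this->journal === []) {
            return $this->initialState;
        }
        return $this->journal[count($this->journal) - 1]['to'];
    }

    // nextState = TransLookup(currState, Event)
    public function apply(string $event): string
    {
        $from = $this->currentState();
        if (!isset($this->transitions[$from][$event])) {
            throw new LogicException("Event '$event' is not allowed in state '$from'.");
        }
        $to = $this->transitions[$from][$event];
        $this->journal[] = ['from' => $from, 'event' => $event, 'to' => $to];
        return $to;
    }

    // Events a front end could offer for the object in its current state.
    public function availableEvents(): array
    {
        return array_keys($this->transitions[$this->currentState()] ?? []);
    }
}

$export = new StateMachine('new', [
    'new'         => ['transform' => 'transformed'],
    'transformed' => ['archive' => 'archived', 'fail' => 'failed'],
    'archived'    => ['move' => 'done', 'fail' => 'failed'],
]);

$export->apply('transform');
// $export->currentState() is 'transformed'
// $export->availableEvents() is ['archive', 'fail']
```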

2015-09-29 19:30
by cyberman