Currently, schedulers only support cron-based triggers. That means a workflow runs at fixed times, whether or not its input data is ready. The disadvantages of this are:
- wasted resources in case no data is available
- added latency in case the data is available sooner
It would be better if a schedule could be triggered by event notifications about:
- state change of another program (for example, workflow finished successfully)
- data availability (for example, a partition was added to a PFS)
Event-based schedules may need to be combined with time schedules, for example: "run this workflow as soon as a new partition is available, but no more than twice an hour".
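To make the combined example concrete, here is a minimal sketch of one possible reading of it: an event marks the workflow as pending, and the workflow starts only if fewer than the allowed number of runs fall within the rolling time window. The class name `ThrottledEventTrigger` and its methods are hypothetical and are not part of any existing scheduler API; timestamps are plain seconds supplied by the caller.

```python
from collections import deque


class ThrottledEventTrigger:
    """Hypothetical sketch: start on an event, at most max_runs per rolling window."""

    def __init__(self, max_runs, window_seconds):
        self.max_runs = max_runs
        self.window_seconds = window_seconds
        self._recent_runs = deque()  # start times of runs still inside the window
        self.pending = False         # True if an event arrived but has not been served

    def _expire(self, now):
        # Forget runs that started a full window (or more) ago.
        while self._recent_runs and now - self._recent_runs[0] >= self.window_seconds:
            self._recent_runs.popleft()

    def on_event(self, now):
        """Record an event (e.g. partition added); return True if a run should start now."""
        self.pending = True
        return self.poll(now)

    def poll(self, now):
        """Return True if a deferred event can be served at time `now`."""
        self._expire(now)
        if self.pending and len(self._recent_runs) < self.max_runs:
            self._recent_runs.append(now)
            self.pending = False
            return True
        return False


# Usage: "as soon as a new partition is available, but no more than twice an hour".
trigger = ThrottledEventTrigger(max_runs=2, window_seconds=3600)
first = trigger.on_event(0)       # run starts immediately
second = trigger.on_event(600)    # second run within the hour is still allowed
third = trigger.on_event(1200)    # limit reached, event is deferred
deferred = trigger.poll(3600)     # first run has left the window, deferred event is served
```

In this sketch, several events that arrive while the limit is reached collapse into a single deferred run. Whether deferred events should be coalesced, queued individually, or dropped is one of the questions the detailed design would need to settle.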
This needs a more detailed design. A work-in-progress design is here: https://wiki.cask.co/pages/viewpage.action?pageId=4363504