Description
Contact Details
What happened?
When running the orchestrator in Celery mode, starting and stopping tasks is not idempotent. In some cases tasks get stuck in a "resumed" state and are never picked up. Furthermore, the running_processes counter in the database often does not reflect the actual number of running processes.
Steps to reproduce
- Submit a large number of workflows onto a queue.
- Kill Celery.
- A large number of tasks will remain in the "created" state.
The same can happen when retrying a large number of workflows: they then become stuck in the "resumed" state. It can also be triggered by killing Celery while it is executing long-running tasks and then starting or stopping the engine.
Suggested solution
Now that #832 is implemented, we can make the pick-up of tasks idempotent. Please design and implement a method so that Celery re-fills its queue correctly at startup.
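To make the request more concrete, here is a rough sketch of what an idempotent re-fill at startup could look like. It is not the orchestrator's actual code: the `processes` table, its columns, the "running" status, and the `refill_queue` and `recount_running` helpers are all hypothetical, and a plain list stands in for the Celery queue. The idea is only that calling the re-fill twice must not enqueue anything twice, and that the running counter is derived from the rows rather than trusted.

```python
import sqlite3

# Hypothetical statuses meaning "waiting to be picked up by a worker".
PENDING_STATUSES = ("created", "resumed")


def refill_queue(conn, queue, queued_ids):
    """Enqueue every pending process at most once; safe to call repeatedly."""
    rows = conn.execute(
        "SELECT process_id FROM processes "
        "WHERE status IN (?, ?) ORDER BY process_id",
        PENDING_STATUSES,
    ).fetchall()
    added = 0
    for (process_id,) in rows:
        if process_id in queued_ids:
            continue  # already queued by an earlier call, skip it
        queue.append(process_id)
        queued_ids.add(process_id)
        added += 1
    return added


def recount_running(conn):
    """Derive the running counter from the rows instead of a stored value."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM processes WHERE status = 'running'"
    ).fetchone()
    return count


# Hypothetical state left behind after Celery was killed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processes (process_id INTEGER PRIMARY KEY, status TEXT)"
)
conn.executemany(
    "INSERT INTO processes (status) VALUES (?)",
    [("created",), ("resumed",), ("running",), ("completed",), ("created",)],
)

queue, queued_ids = [], set()
first = refill_queue(conn, queue, queued_ids)   # picks up the pending rows
second = refill_queue(conn, queue, queued_ids)  # second call adds nothing
running = recount_running(conn)
```

In a real deployment the in-memory `queued_ids` set would have to be replaced by something durable that survives a restart, since this sketch only shows the intended behaviour and not how to achieve it with Celery itself.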
Version
3.1.2rc4 and lower
What python version are you seeing the problem on?
All