I'll work around this myself, so this is just a suggestion to help fellow AI users, whose workloads often involve long-running tasks and heavy hardware/memory requirements.
- [x] I have checked the issues list for similar or identical feature requests.
- [x] I have checked the pull requests list for existing proposed implementations of this feature.
- [x] I have checked the commit log to find out if the same feature was already implemented in the main branch.
- [x] I have included all related issues and possible duplicate issues in this issue (If there are none, check this box anyway).
Related Issues and Possible Duplicates
None
Related Issues
None
Possible Duplicates
None
Brief Summary
To keep (AI/LLM) models loaded in (GPU) memory while still freeing that memory when it is no longer needed, an option similar to worker_max_tasks_per_child that kills a worker after a period of idle time would be helpful. This naturally implements caching: a model stays resident while tasks keep arriving and is reloaded on demand after the worker is recycled.
Design
Architectural Considerations
I couldn't quite grasp the Celery consumer loop, but I'm going to mimic this behavior by spinning off a thread that checks the idle time and releases memory when the worker has been idle for too long.
However, my best guess is that this is the consumer loop. My next guess is that it waits on some sort of condition variable for new jobs. I'd suggest adding a timeout there to periodically check worker_max_idle.
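The workaround described above can be sketched as a small watchdog thread. Everything here is hypothetical (the class name, the `touch()` hook, and the choice of SIGTERM as the recycle mechanism are my own assumptions, not Celery API); task code or Celery's task_prerun/task_postrun signal handlers would need to call `touch()` to reset the timer:

```python
import os
import signal
import threading
import time

class IdleWatchdog:
    """Hypothetical workaround: recycle the worker process after it has
    been idle for max_idle seconds. touch() must be called on task
    activity, e.g. from task_prerun/task_postrun signal handlers."""

    def __init__(self, max_idle=120.0, check_interval=5.0, on_idle=None):
        self.max_idle = max_idle
        self.check_interval = check_interval
        # Default action: ask the worker process to terminate so the
        # pool or supervisor restarts it with its memory freed.
        self.on_idle = on_idle or (lambda: os.kill(os.getpid(), signal.SIGTERM))
        self._last_activity = time.monotonic()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def touch(self):
        # Record task activity; resets the idle timer.
        self._last_activity = time.monotonic()

    def idle_for(self):
        return time.monotonic() - self._last_activity

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()

    def _run(self):
        # Wake up every check_interval seconds; fire once if too idle.
        while not self._stop.wait(self.check_interval):
            if self.idle_for() >= self.max_idle:
                self.on_idle()
                return
```

This avoids touching Celery internals at all, at the cost of needing explicit `touch()` calls; a real worker_max_idle setting implemented via a timeout on the consumer loop's wait would not need that hook.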
Proposed Behavior
After a worker has been idle for at least worker_max_idle, the worker is either killed or restarted.
Proposed UI/UX
worker_max_idle=120
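For contrast, a configuration sketch: worker_max_idle does not exist in Celery today and is purely the proposed knob, shown here next to the real worker_max_tasks_per_child setting it is modeled on:

```python
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

# Real setting: recycle a pool worker after it has executed N tasks.
app.conf.worker_max_tasks_per_child = 100

# Proposed (hypothetical) setting: recycle a pool worker after it has
# been idle for 120 seconds, freeing any (GPU) memory it holds.
app.conf.worker_max_idle = 120
```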