# Baseline\nRecent ai_tasks (post 2026-04-22): 1423 total.\n- Failed: 785 (86.5% of attempted)\n- Completed: 122\n- Proposed: 436 waiting\n- Avg failed duration: ~1.85hrs (but recent ~1min)\n\n## Root Causes\n1. 'No available distributed workers': 630/785 (80%)\n2. Gateway errors: 109+46\n3. Idle workers but not dispatched.\nGateway: 13 restarts, unhealthy.\n\n## Quick Fixes Delegated (High ROI)\n1. **Worker Dispatch Fix** (critical): ai-ml-operations/backend - enable idle workers, affinity for DB tasks.\n2. **Gateway Stability** (critical): devops - stop restarts, monitoring.\n3. **Retry + Prioritize** (high): backend - exp backoff, DB/short first.\n\n## Next: Re-query baseline post-fixes, test idle_inference swarm, monitor KPIs.\nLong-term: <20% fail rate.