If your sweep agent starts but does not receive new run configurations, or receives one run and then idles, there are several common causes.
The sweep has exhausted its search space (grid search)
In grid search, the sweep controller assigns every combination of hyperparameter values exactly once. Once all combinations are assigned, no new runs are generated. If you started multiple agents simultaneously, they may have collectively consumed all configurations before any single agent finished its current run.
To confirm: open the sweep page in the W&B UI and check the run count against the total grid size. If they match, the sweep is complete.
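As a rough cross-check for that comparison, the total grid size is the product of the number of candidate values per parameter. The following is a minimal sketch, assuming every parameter in the sweep configuration is given as a list of values (or a single fixed value) and that there are no nested parameters. The parameter names are hypothetical.

```python
from math import prod


def grid_size(parameters: dict) -> int:
    """Count the combinations a grid search over these parameters would enumerate."""
    sizes = []
    for name, spec in parameters.items():
        if "values" in spec:
            sizes.append(len(spec["values"]))
        elif "value" in spec:
            sizes.append(1)  # a constant contributes exactly one choice
        else:
            raise ValueError(f"{name!r} has no discrete values; this sketch cannot size it")
    return prod(sizes)


# Hypothetical sweep parameters, for illustration only.
parameters = {
    "learning_rate": {"values": [0.001, 0.01, 0.1]},
    "batch_size": {"values": [16, 32]},
    "optimizer": {"value": "adam"},
}

total = grid_size(parameters)
print(f"Expected grid size: {total}")
```

If the run count shown in the UI equals this number, the agents are idle because nothing is left to assign.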
The --count flag is limiting the agent
Running wandb agent --count N SWEEP_ID tells the agent to accept at most N runs before exiting. If you set --count 1, the agent exits after a single run. This is intentional for SLURM and other job schedulers, but can be surprising if you expected the agent to loop.
Remove --count (or increase it) to allow the agent to keep pulling runs, for example: wandb agent SWEEP_ID
Multiple processes call wandb.agent() on the same job
In distributed training setups, if every process on a node calls wandb.agent(), each process registers as a separate agent and consumes a run configuration. This leads to runs that crash immediately (because only one process was meant to drive the sweep) and a quickly exhausted configuration pool. Restrict wandb.agent() to rank 0 only. See How do I run sweeps with distributed training on SLURM? for the recommended pattern.
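The rank-0 restriction can be sketched as a small guard. This is an illustration under stated assumptions, not necessarily the recommended pattern from the linked article: it assumes the launcher exports RANK (as torchrun does) or SLURM_PROCID (as srun does), and train is a hypothetical placeholder for your training function. How the remaining ranks join the training run is outside the scope of this sketch.

```python
import os


def is_sweep_driver(env=None) -> bool:
    """Return True only for the process that should call wandb.agent()."""
    env = os.environ if env is None else env
    # Assumption: the launcher exports RANK (torchrun) or SLURM_PROCID (srun).
    # If neither is set, treat the process as a single-process job.
    rank = env.get("RANK", env.get("SLURM_PROCID", "0"))
    return int(rank) == 0


def train():
    """Hypothetical placeholder for the training function."""


def main(sweep_id: str) -> None:
    if not is_sweep_driver():
        return  # non-zero ranks never register as sweep agents
    import wandb  # imported lazily so only the driver process touches the sweep

    wandb.agent(sweep_id, function=train)
```

With this guard, a job with N processes consumes one run configuration per iteration of the agent loop instead of N.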
SDK version bug after upgrade
Some SDK versions between 0.19.6 and 0.19.10 introduced a regression in which the sweep agent teardown raised an error, causing the agent loop to exit prematurely instead of requesting the next run. If you recently upgraded and agents stop after one run with a teardown-related traceback, upgrade to the latest SDK version, for example with pip install --upgrade wandb.
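To see whether the installed SDK falls in that range, a quick check along the following lines may help. It is only a hint: the regression affected some versions in the range, not necessarily all of them, and treating both endpoints as inclusive is an assumption of this sketch.

```python
import re
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn '0.19.8' (or '0.19.8rc1') into (0, 19, 8)."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def in_reported_range(version: str) -> bool:
    """True if the version lies in 0.19.6 to 0.19.10 (inclusive by assumption)."""
    return (0, 19, 6) <= parse_version(version) <= (0, 19, 10)


def installed_wandb_version():
    """Return the installed wandb version string, or None if it is not installed."""
    try:
        return metadata.version("wandb")
    except metadata.PackageNotFoundError:
        return None


version = installed_wandb_version()
if version is None:
    print("wandb is not installed in this environment")
elif in_reported_range(version):
    print(f"wandb {version} is inside the 0.19.6 to 0.19.10 range; consider upgrading")
else:
    print(f"wandb {version} is outside that range")
```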