AI search agents don't fail at searching, they fail at asking the right questions when queries get ambiguous
AI search agents struggle with ambiguity because they prioritize repeated searching over asking clarifying questions.
The DiscoBench benchmark reveals that models often fail to identify ambiguous queries, leading to poor research outcomes. Agents that attempt to search repeatedly perform worse than those that guess, with top models achieving only 43 percent accuracy. The findings suggest that agentic workflows need better mechanisms for detecting ambiguity and prompting users for input.