SPEAR: Code-Augmented Agentic Prompt Optimization
SPEAR is an agentic prompt optimizer that uses a Python sandbox to perform structural error analysis and iterative prompt refinement.
SPEAR (Sandboxed Prompt Engineer with Active Roll-back) replaces the fixed pipelines of automatic prompt engineering (APE) with an agentic approach. The agent optimizes prompts autonomously using four tools: evaluate, python, set_prompt, and finish. The python tool lets the agent execute code on evaluation data, enabling error analysis such as confusion matrices and error clustering to inform its next iteration.
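
To make the error-analysis step concrete, the following is a minimal sketch of the kind of code an agent might run through the python tool on a classification task. It is not taken from SPEAR itself: the record format (dicts with "input", "gold", and "pred" keys), the sample data, and the helper names confusion_matrix, cluster_errors, and top_confusions are all assumptions made for illustration. Error clustering is approximated here by grouping misclassified inputs by their (gold, predicted) label pair; a real agent could use a richer method.

```python
from collections import Counter, defaultdict

# Hypothetical evaluation records. The actual data format that the
# evaluate tool returns is not specified in the source.
records = [
    {"input": "I was charged twice", "gold": "billing", "pred": "billing"},
    {"input": "Charged for shipping twice", "gold": "billing", "pred": "shipping"},
    {"input": "Where is my package?", "gold": "shipping", "pred": "shipping"},
    {"input": "Refund the delivery fee", "gold": "billing", "pred": "shipping"},
    {"input": "Update my card on file", "gold": "account", "pred": "billing"},
]


def confusion_matrix(rows):
    """Count how often each (gold, pred) label pair occurs."""
    return Counter((r["gold"], r["pred"]) for r in rows)


def cluster_errors(rows):
    """Group misclassified inputs by their (gold, pred) label pair."""
    clusters = defaultdict(list)
    for r in rows:
        if r["gold"] != r["pred"]:
            clusters[(r["gold"], r["pred"])].append(r["input"])
    return dict(clusters)


def top_confusions(rows, k=3):
    """Return the k most frequent off-diagonal (gold, pred) pairs."""
    errors = Counter(
        {pair: n for pair, n in confusion_matrix(rows).items() if pair[0] != pair[1]}
    )
    return errors.most_common(k)


if __name__ == "__main__":
    accuracy = sum(r["gold"] == r["pred"] for r in records) / len(records)
    print(f"accuracy: {accuracy:.2f}")
    for (gold, pred), count in top_confusions(records):
        print(f"{gold} -> {pred}: {count} error(s)")
        for text in cluster_errors(records)[(gold, pred)]:
            print(f"    {text}")
```

In this sketch, the most frequent confusion pair and its example inputs are what the agent would inspect before calling set_prompt with a revised prompt, for instance one that adds a rule separating billing questions about delivery fees from shipping questions.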