S1-DeepResearch-32B expands open‑source AI from keyword search to full‑scale research.
The authors release a unified trajectory construction pipeline that blends closed‑ended question answering with open‑ended exploration. They generate graph‑grounded tasks, roll out agentic trajectories, and verify them across multiple dimensions. The resulting dataset stresses evidence integration, planning, file handling and report writing, all areas that standard search datasets ignore. Trained on these trajectories, the 32B model ranks at the top among comparable open‑source systems on twenty benchmarks covering reasoning, instruction following, report generation, file understanding and skill usage, and it comes close to the performance of leading proprietary models on several deep‑research tests.
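To make the three stages concrete, here is a minimal toy sketch of a generate, roll out, verify loop. It is not the authors' pipeline: the tiny `GRAPH`, the names `generate_task`, `rollout` and `verify`, and the three checks are all hypothetical, chosen only to illustrate what "graph‑grounded" and "multi‑dimensional verification" could look like in the simplest case.

```python
from dataclasses import dataclass, field

# Hypothetical toy knowledge graph: (subject, relation) -> object.
GRAPH = {
    ("Marie Curie", "born_in"): "Warsaw",
    ("Warsaw", "capital_of"): "Poland",
    ("Poland", "currency"): "zloty",
}


@dataclass
class Task:
    question: str
    answer: str
    evidence: list  # graph edges the answer depends on


@dataclass
class Trajectory:
    steps: list = field(default_factory=list)  # (subject, relation, observation)
    final_answer: str = ""


def generate_task(start, relations):
    """Walk a relation path so the answer is grounded in explicit graph edges."""
    node, evidence = start, []
    for rel in relations:
        nxt = GRAPH[(node, rel)]
        evidence.append((node, rel, nxt))
        node = nxt
    question = f"Start at {start} and follow {' -> '.join(relations)}. Where do you end?"
    return Task(question, node, evidence)


def rollout(start, relations):
    """Stand-in for an agent: one lookup 'tool call' per hop, each recorded as a step."""
    traj = Trajectory()
    node = start
    for rel in relations:
        observation = GRAPH.get((node, rel))
        traj.steps.append((node, rel, observation))
        if observation is None:
            break
        node = observation
    traj.final_answer = node
    return traj


def verify(task, traj, max_steps=5):
    """Score a trajectory on several independent dimensions (assumed, illustrative)."""
    return {
        "answer_correct": traj.final_answer == task.answer,
        "evidence_covered": all(edge in traj.steps for edge in task.evidence),
        "within_budget": len(traj.steps) <= max_steps,
    }


task = generate_task("Marie Curie", ["born_in", "capital_of"])
complete = rollout("Marie Curie", ["born_in", "capital_of"])
partial = rollout("Marie Curie", ["born_in"])  # stops one hop early

# Keep only trajectories that pass every check.
kept = [t for t in (complete, partial) if all(verify(task, t).values())]
```

In this sketch only the complete rollout survives filtering; the partial one fails both the answer check and the evidence check. A real system would presumably replace the lookup stub with an LLM agent and the boolean checks with richer judges.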
This matters because most publicly available agents still excel only at surface‑level retrieval. By teaching models to stitch together evidence, synthesize knowledge and produce structured outputs, the work moves the field toward agents that can conduct end‑to‑end investigations without constant human steering. The approach also offers a scalable way to create high‑quality training data for tasks that were previously too costly to label, potentially accelerating research in domains that rely on multi‑step analysis.
If the community adopts this trajectory‑centric paradigm, we may see a wave of open‑source agents that can draft literature reviews, audit codebases or even write policy briefs, functions that today remain the domain of expensive, closed‑source services.