SlopCodeBench: Benchmarking How Coding Agents Degrade over Long-Horizon Tasks
(arxiv.org)
1 point | by FiberBundle | 5 hours ago | 1 comment
cestivan | 4 hours ago | [dead]