Claude 3.5 Sonnet can navigate user interfaces, move cursors, click buttons, and type text. Anthropic has unveiled a major update to its Claude AI models, including the new “Computer Use ...
Its SWE-bench Verified score increased from 33.4% to 49.0%, higher than any publicly available model at the time of the announcement. Its TAU-bench score increased from 62.6% to 69.2% in the retail domain and from 36.0% to 46.0% ...