| Claude Haiku 3.5 | Anthropic | low-latency generation, cost-sensitive generation, agent workflow, targeted classification | Anthropic Messages API | 2026-05-19 |
| DeepSeek V4 (Pro-Max / Flash-Max) | DeepSeek | coding, agent workflow, cost-sensitive generation | OpenAI-compatible API style | 2026-05-21 |
| DeepSeek-V3.2 | DeepSeek | coding, agent workflow, cost-sensitive generation, OpenAI-compatible integration | OpenAI-compatible API style | 2026-05-21 |
| Gemini 3.5 Flash | Google | low-latency generation, multimodal workflow, cost-sensitive generation, Google ecosystem | Gemini API | 2026-05-21 |
| gpt-oss-20b | OpenAI | open-weight deployment, local inference, cost-sensitive generation, low-latency generation | Open-weight model with OpenAI harmony format | 2026-05-19 |
| Grok 4.3 | xAI | agent workflow, tool use, reasoning, OpenAI-compatible integration | xAI Responses API and OpenAI-compatible API | 2026-05-18 |
| Llama 4 Maverick | Meta | multimodal workflow, document workflow, coding, cost-sensitive generation | Open-weight model card and Llama tooling | 2026-05-18 |
| Mistral Large 3 | Mistral AI | coding, agent workflow, cost-sensitive generation, multilingual, OpenAI-compatible integration | Mistral API and open-weight deployment with OpenAI-compatible serving | 2026-05-21 |
| Qwen3.6 | Alibaba Cloud | reasoning, coding, cost-sensitive generation, OpenAI-compatible integration | Open-weight model family with OpenAI-compatible serving through frameworks such as SGLang | 2026-05-21 |