Microsoft added MAI‑Code‑1‑Flash to its model suite on June 2, 2026.
The model card lists a 7 B‑parameter architecture with a 16 K token context window. Microsoft says inference runs about 15 % faster than the prior MAI‑Code‑1, while pass@1 on HumanEval improves by roughly 0.5 %. Pricing on Azure’s AI service drops to $0.0008 per 1 K tokens, down from $0.001 for the earlier version.
For developers, the speed bump means lower cloud bills and quicker iteration on code suggestions, especially in CI pipelines where latency adds up. The modest accuracy lift keeps the model competitive without demanding new hardware.
It’s an incremental upgrade, not a wholesale rewrite of Microsoft’s code‑generation strategy.