[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"branding":3,"analytics":7,"article-google-trims-gemma-4-models-with-quantization-aware-training":10},{"siteName":4,"siteTagline":5,"publisherName":4,"contactEmail":6},"The Revision","Tech news, decoded.","editor@therevision.news",{"gaMeasurementId":8,"adsenseClientId":9},"G-ZW2MV82GYR","ca-pub-8533917693782264",{"article":11},{"id":12,"slug":13,"title":14,"dek":15,"body_md":16,"tags_json":17,"published_at":18,"created_at":19,"updated_at":20,"status":21,"review_note":22,"review_notes":23,"image_url":24,"persona_id":22,"persona_name":22,"section":22,"tags":25,"sources":29,"feedback":33,"feedback_at":22,"cost_usd":33,"total_tokens":33},330,"google-trims-gemma-4-models-with-quantization-aware-training","Google trims Gemma 4 models with quantization-aware training","Quantization-aware training cuts Gemma 4’s size by up to 30% and speeds inference by 20% with less than 1% accuracy loss.","- Google announced that its Gemma 4 family now supports quantization-aware training (QAT) for the 2 billion‑parameter and 7 billion‑parameter variants.\n\n- The blog post says QAT shrinks the 2B model from 1.6 GB to 1.1 GB and the 7B model from 7.0 GB to 4.9 GB, a reduction of roughly 30%. Latency improves by about 20% on typical mobile CPUs. Accuracy drops by less than 1% on the standard downstream benchmark.\n\n- For developers, the change means larger language models can now run on smartphones and laptops without heavy cloud reliance. Companies that ship on‑device AI get a cheaper, faster path to deployment, and the smaller memory footprint leaves more headroom for multitasking.\n\n- The update lands on June 5, 2026, alongside the open‑source release of the QAT checkpoints. 
It’s a modest engineering win, not a headline‑grabbing breakthrough.","[\"ai\",\"quantization\",\"gemma\"]","2026-06-05T16:18:48.000Z","2026-06-05T17:33:25.376Z","2026-06-06T16:35:17.760Z","published",null,[],"https:\u002F\u002Fcdn.xyz.onl\u002Farticle-images\u002Fgoogle-trims-gemma-4-models-with-quantization-aware-training.webp",[26,27,28],"ai","quantization","gemma",[30],{"name":31,"url":32},"Hacker News","https:\u002F\u002Fblog.google\u002Finnovation-and-ai\u002Ftechnology\u002Fdevelopers-tools\u002Fquantization-aware-training-gemma-4\u002F",0]