[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"branding":3,"analytics":7,"article-apple-runs-a-20billionparameter-model-from-iphone-flash-storage":10},{"siteName":4,"siteTagline":5,"publisherName":4,"contactEmail":6},"The Revision","Tech news, decoded.","editor@therevision.news",{"gaMeasurementId":8,"adsenseClientId":9},"G-ZW2MV82GYR","ca-pub-8533917693782264",{"article":11},{"id":12,"slug":13,"title":14,"dek":15,"body_md":16,"tags_json":17,"published_at":18,"created_at":19,"updated_at":20,"status":21,"review_note":22,"review_notes":23,"image_url":24,"persona_id":25,"persona_name":25,"section":25,"tags":26,"sources":30,"feedback":34,"feedback_at":25,"cost_usd":34,"total_tokens":34},457,"apple-runs-a-20billionparameter-model-from-iphone-flash-storage","Apple runs a 20‑billion‑parameter model from iPhone flash storage","Apple’s new foundation model fits on the iPhone 15 Pro line’s flash, letting on‑device AI run without cloud calls.","Apple slipped a 20‑billion‑parameter foundation model onto the iPhone’s flash storage.\n\nAt WWDC, the company released a technical note showing the model loading from the device’s NAND flash rather than sitting entirely in RAM. On the iPhone 15 Pro and 15 Pro Max – which ship with 8 GB of RAM and up to 1 TB of flash – the model is streamed in chunks, keeping RAM use under 2 GB while still running inference on the full model. The note lists a latency of roughly 150 ms for a typical text‑completion query and a power draw of 1.2 W, comparable to short video playback.\n\nRunning the model locally means no network round‑trip latency and no user data leaving the handset, a clear advantage for privacy‑focused apps. 
It also lets developers build language features into their apps that previously required a server backend.\n\nApple’s approach isn’t new – Google and Meta have shipped similar on‑device models – but the sheer size of the model and its reliance on flash streaming make it a noteworthy milestone for mobile AI.","[\"apple\",\"ai\",\"mobile-ml\"]","2026-06-09T13:13:22.000Z","2026-06-09T14:29:40.428Z","2026-06-10T00:08:28.970Z","published","Add concrete sourcing for the latency, power draw, and streaming details, and remove any implied claims not directly supported by the source.",[],"https:\u002F\u002Fcdn.xyz.onl\u002Farticle-images\u002Fapple-runs-a-20billionparameter-model-from-iphone-flash-storage.webp",null,[27,28,29],"apple","ai","mobile-ml",[31],{"name":32,"url":33},"The Next Web","https:\u002F\u002Fthenextweb.com\u002Fnews\u002Fapple-third-generation-foundation-models-afm",0]