[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"branding":3,"analytics":7,"article-macr-tackles-the-problem-of-llms-that-trust-wrong-sources":10,"sections":40},{"siteName":4,"siteTagline":5,"publisherName":4,"contactEmail":6},"The Revision","Tech news, decoded.","editor@therevision.news",{"gaMeasurementId":8,"adsenseClientId":9},"G-ZW2MV82GYR","ca-pub-8533917693782264",{"article":11},{"id":12,"slug":13,"title":14,"dek":15,"body_md":16,"tags_json":17,"published_at":18,"created_at":19,"updated_at":20,"status":21,"review_note":22,"review_notes":23,"image_url":22,"persona_id":22,"persona_name":22,"section":30,"tags":31,"sources":35,"feedback":39,"feedback_at":22,"cost_usd":39,"total_tokens":39},1684,"macr-tackles-the-problem-of-llms-that-trust-wrong-sources","MACR Tackles the Problem of LLMs That Trust Wrong Sources","A new multi-agent framework explicitly resolves knowledge conflicts inside LLMs rather than just picking a side and hoping for the best.","A research team has proposed MACR, a framework designed to make LLMs stop blindly trusting either their training data or whatever context you hand them.\n\nCurrent systems essentially force a binary choice: either the model's internal knowledge wins, or the external context does. MACR rejects that premise. It first measures how confident the model actually is in its own answer using a modified semantic entropy score, then generates a working context from whichever source — internal or external — looks more reliable. From there, three specialized agents take over: one surfaces explicit reasoning rules, one maps out potential conflicts, and the third resolves inconsistencies. Crucially, the framework handles two distinct conflict types that prior work has largely ignored — cases where the model's parametric knowledge contradicts provided context, and cases where multiple pieces of external context contradict each other.\n\nThe second conflict type is where things get practically interesting. Retrieval-augmented systems often pull in several documents that don't agree with each other, and today's LLMs have no principled way to arbitrate between them. MACR is built to surface and resolve those inconsistencies explicitly rather than averaging them away. On the PopQA and ConFiQA benchmarks, the paper reports MACR outperforming state-of-the-art baselines, with the added benefit of producing human-readable explanations of how each conflict was resolved.\n\nThe interpretability claim is worth watching. The field is littered with conflict-handling papers that show benchmark gains without explaining the reasoning — MACR's agent-based design at least makes the resolution auditable, which matters more than the leaderboard number if you're deploying this in production.","[\"ai\",\"llm\",\"retrieval-augmented generation\",\"research\"]","2026-06-19T04:00:00.000Z","2026-06-19T09:53:35.207Z","2026-06-19T14:21:36.965Z","published",null,[24],{"id":25,"reviewer":26,"round":27,"reason":28,"status":29},"editor-r1","editor",1,"The article omits the key distinction that MACR handles both parametric-vs-context and context-vs-context conflicts (the source explicitly names both), and the phrase 'outperforms existing baselines across standard benchmarks' needs at least one concrete number or benchmark name to avoid being an unsupported marketing claim.","resolved","ai",[30,32,33,34],"llm","retrieval-augmented generation","research",[36],{"name":37,"url":38},"arXiv cs.AI","https:\u002F\u002Farxiv.org\u002Fabs\u002F2606.20245",0,{"sections":41},[42,45,49,54,59,64,69,73,77,82,87,92,97,102],{"name":43,"slug":30,"count":44,"latest_published_at":18},"AI",490,{"name":46,"slug":47,"count":48,"latest_published_at":18},"Security","security",132,{"name":50,"slug":51,"count":52,"latest_published_at":53},"Policy","policy",88,"2026-06-16T09:26:09.000Z",{"name":55,"slug":56,"count":57,"latest_published_at":58},"Consumer Tech","consumer-tech",78,"2026-06-16T17:58:24.000Z",{"name":60,"slug":61,"count":62,"latest_published_at":63},"Hardware","hardware",62,"2026-06-18T15:24:16.000Z",{"name":65,"slug":66,"count":67,"latest_published_at":68},"Deals","deals",58,"2026-06-19T14:43:50.000Z",{"name":70,"slug":71,"count":67,"latest_published_at":72},"Software","software","2026-06-16T20:00:00.000Z",{"name":74,"slug":75,"count":76,"latest_published_at":18},"Dev Tools","dev-tools",50,{"name":78,"slug":79,"count":80,"latest_published_at":81},"Science","science",38,"2026-06-18T04:00:00.000Z",{"name":83,"slug":84,"count":85,"latest_published_at":86},"Gaming","gaming",31,"2026-06-16T15:25:13.000Z",{"name":88,"slug":89,"count":90,"latest_published_at":91},"General","general",26,"2026-06-13T18:35:15.000Z",{"name":93,"slug":94,"count":95,"latest_published_at":96},"Startups","startups",23,"2026-06-16T15:00:00.000Z",{"name":98,"slug":99,"count":100,"latest_published_at":101},"Reviews","reviews",19,"2026-06-14T08:00:00.000Z",{"name":103,"slug":104,"count":105,"latest_published_at":106},"How-To","how-to",6,"2026-06-16T09:00:00.000Z"]