[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"branding":3,"analytics":7,"article-unified-lens-finds-binary-codes-outpace-larger-quantisers-in-retrieval":10},{"siteName":4,"siteTagline":5,"publisherName":4,"contactEmail":6},"The Revision","Tech news, decoded.","editor@therevision.news",{"gaMeasurementId":8,"adsenseClientId":9},"G-ZW2MV82GYR","ca-pub-8533917693782264",{"article":11},{"id":12,"slug":13,"title":14,"dek":15,"body_md":16,"tags_json":17,"published_at":18,"created_at":19,"updated_at":20,"status":21,"review_note":22,"review_notes":23,"image_url":22,"persona_id":22,"persona_name":22,"section":22,"tags":34,"sources":38,"feedback":42,"feedback_at":22,"cost_usd":42,"total_tokens":42},1325,"unified-lens-finds-binary-codes-outpace-larger-quantisers-in-retrieval","Unified lens finds binary codes outpace larger quantisers in retrieval","A new study groups projection, quantisation and organisation methods, showing one-bit codes can match full‑precision quality while slashing memory.","One‑bit binary codes can now match full‑precision retrieval quality, the authors report.\n\nThe authors surveyed approximate nearest‑neighbour techniques and reframed them under a three‑step lens: projection, quantisation and organisation. They released the open‑source BitBudget benchmark and ran reproducible tests on seven embedding models. A single‑bit code with full‑precision re‑ranking achieved the same recall as uncompressed vectors for six of the seven models, using just 1\u002F32 of the memory. When byte budgets were equal, binary codes overtook a traditional inverted‑file product quantiser as the embedding size grew. Adding class labels to an eight‑byte supervised code more than doubled retrieval quality over a 2 KB task‑agnostic float.\n\nThis matters because research on retrieval‑augmented generation and large‑scale search has been fragmented across academic silos. 
The finding that aggressive quantisation can preserve, or even improve, retrieval quality challenges the assumption that higher‑dimensional floats are necessary for state‑of‑the‑art performance. It also gives practitioners a clear, low‑memory path for scaling up services.\n\nIn short, the study suggests the community should consolidate around the projection‑quantisation‑organisation framework and consider binary codes as the default low‑memory choice for large‑scale retrieval.","[\"retrieval\",\"nearest-neighbour\",\"hashing\"]","2026-06-16T04:00:00.000Z","2026-06-17T04:08:35.237Z","2026-06-17T04:08:38.055Z","published",null,[24,30],{"id":25,"reviewer":26,"round":27,"reason":28,"status":29},"editor-r1","editor",1,"Add a clear concluding paragraph that summarizes the findings and their implications for readers.","resolved",{"id":31,"reviewer":26,"round":32,"reason":33,"status":29},"editor-r2",2,"Add a concise concluding paragraph that summarises the findings and their implications for readers.",[35,36,37],"retrieval","nearest-neighbour","hashing",[39],{"name":40,"url":41},"arXiv cs.AI","https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.04127",0]