Developers on Hacker News are asking if anyone has replaced cloud‑based coding assistants like Claude or GPT with a self‑hosted model for day‑to‑day work. The post solicits details on setups, hardware, and performance metrics such as tokens per second.
For those who have made the switch, the central question is whether local inference can match the responsiveness and accuracy of the hosted services. The discussion could surface cost savings, privacy benefits, or technical hurdles that vendor marketing does not make obvious.
The thread is still young, but the fact that experienced developers are debating a full migration suggests the market for on‑premise LLMs has reached the point where the models are being tested on real‑world work, not just in toy experiments.
