Startup Ideas Bank
A half-baked idea overshadowed by existential doubts.
AI roast score: 55/100 (D)
The idea
Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?
523 points by cloudking 7 hours ago | 260 comments
Has anyone here fully swapped Claude/GPT for a local model as their main coding tool, not just for side experiments? If so, please share your setup and performance (e.g. tok/s).
Greenpants 3 hours ago
I have! I care about data privacy and LLMs being free. I'm using the Pi coding harness but containerized and sandboxed, to make sure it's running completely offline. On my Mac Studio with 128GB RAM (or MacBook with 36GB RAM) I'm using Qwen3.6 35b, with only 3b active parameters so that it runs really fast.

I've done a complete redesign for my website's homepage and blog with Django + Wagtail. The latter is interesting, because Wagtail is a bit less well-known, so the agent, without giving it internet access, doesn't always know how to develop for Wagtail. I've used Qwen3.5 122b for when things get more complex. At 10b active parameters, it's significantly slower though.

I've noticed a few things compared to large models like Claude. For starters, you really need to know what you're asking, and be precise; it doesn't do much thinking for you. Any assumptions left open, and it'll take the easiest route to reach the goal (e.g. CSS in HTML), often not the best in terms of architecture. It gets into loops quite often, and surprisingly often gets the edit tool call wrong, after which it will spend lots of thinking tokens and re-read files instead of retrying (despite the system prompt suggesting so).

Comparing agentic Qwen3.6 35b to Claude Opus is like a junior with knowledge across the board, that you really need to guide, versus a senior that thinks with you on architecture. If Opus gives a 15x speedup, local and fully offline Qwen gives a 5x speedup. Which, given that it's completely free, is still mind-boggling to me :)
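To put the quoted speedups in concrete terms, here is a minimal Python sketch of the arithmetic. The 15x and 5x figures are the commenter's own anecdotal estimates, not measurements; the 30-hour baseline and the helper name `hours_with_speedup` are hypothetical, chosen only for illustration.

```python
def hours_with_speedup(baseline_hours: float, speedup: float) -> float:
    """Return how long a task takes if work proceeds `speedup` times faster than unaided."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


# Speedup figures quoted in the comment above (anecdotal, not measured).
OPUS_SPEEDUP = 15.0
LOCAL_QWEN_SPEEDUP = 5.0

# Hypothetical unaided task length in hours (an assumption for illustration).
baseline = 30.0

opus_hours = hours_with_speedup(baseline, OPUS_SPEEDUP)
local_hours = hours_with_speedup(baseline, LOCAL_QWEN_SPEEDUP)

# How many times longer the local model takes than Opus on the same task.
relative_slowdown = local_hours / opus_hours

print(f"Opus: {opus_hours:.1f} h, local Qwen: {local_hours:.1f} h, "
      f"local takes {relative_slowdown:.1f}x as long")
```

Under these assumed numbers, the local model would take three times as long as Opus on the same task while still finishing in a fifth of the unaided time. The ratio depends only on the two speedups, not on the baseline.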
lambda 3 hours ago
This is very similar to my setup. Pi in a container (I do let it have network access, just no access to creds or anything, only the one directory that I'm working on at the time and my ~/.pi directory), talking to llama.cpp in another container. I'm on a Strix Halo 128 GiB unified memory laptop. I've never used the frontier models in earnest, I don't believe in using proprietary tools for my programming, so I can't really compare. And I'm still an AI skeptic, so I'm doing more testing and kicking the tires than I am actually using it. That means I spend a lot of time tryi
The roast
This isn't a startup; it's a glorified Reddit thread. You're asking if replacing Claude with a local model works, but your execution is rooted in theory instead of practical application. Claiming a '5x speedup' without solid data is just wishful thinking. Your insights come from anecdotes, not from real market feedback, which screams hobby project instead of a scalable business. You need to get out of the echo chamber and validate this beyond the confines of Hacker News.
Red flags
- No clear product-market fit articulated.
- Execution relies too heavily on anecdotal evidence.
- Lack of funding and support stifles growth potential.
Verdict
Pivot your focus from speculation to solid product development and validation.