Open Source Coding LLMs (30B–70B Range)
Best locally runnable coding models as of mid-2026, focused on roughly the 30B–70B parameter range: the practical sweet spot between quality and hardware requirements.
Why / When to Use
Use when selecting a local LLM for Claude Code alternatives, agentic workflows (n8n, OpenACP), or offline coding assistance. These models run in Ollama or llama.cpp on consumer/prosumer GPUs.
Core Concept / Commands
Running in Ollama
# Pull and run a model
ollama pull qwen2.5-coder:32b
ollama run qwen2.5-coder:32b
# List downloaded models
ollama list
# List models currently loaded in memory
ollama ps
# Pull a specific quantization
ollama pull llama3.3:70b-instruct-q4_K_M
Key Options / Variants
| Model | Size | Focus | VRAM (Q4) | License |
|---|---|---|---|---|
| Qwen2.5-Coder 32B | 32B | Code-specialized | ~20GB | Apache 2.0 |
| Llama 3.3 70B | 70B | General + code | ~40GB | Meta |
| DeepSeek R1 70B distill | 70B | Reasoning + debug | ~40GB | MIT |
| Kimi-Dev-72B | 72B | SWE / repo-level bugs | ~45GB | Modified MIT |
| Nemotron Nano 30B | 30B | Fast inference | ~20GB | NVIDIA |
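The VRAM column can be sanity-checked with simple arithmetic: weight size is roughly parameter count times bits per weight, divided by 8. The sketch below assumes about 4.8 effective bits per weight for a Q4-class quantization; that figure and the function name `estimate_weight_gb` are illustrative assumptions, not values from the sources. It covers weights only and leaves out KV cache and runtime overhead.

```python
# Rough weight-only VRAM estimate for a quantized model.
# ASSUMPTION: ~4.8 effective bits per weight for a Q4-class quantization.
# Real model files vary, and KV cache / runtime overhead is not included.

def estimate_weight_gb(params_billions: float, bits_per_weight: float = 4.8) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    # params_billions * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    for name, size_b in [
        ("Qwen2.5-Coder 32B", 32),
        ("Llama 3.3 70B", 70),
        ("Kimi-Dev-72B", 72),
    ]:
        print(f"{name}: ~{estimate_weight_gb(size_b):.1f} GB of weights")
```

Under these assumptions the estimates land close to the table's figures; longer context windows add KV cache on top of the weights.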
Recommended by Use Case
- Pure coding tasks → Qwen2.5-Coder 32B (Apache 2.0, fits a 24GB GPU at Q4, best coding benchmark results in the 30B range)
- General coding + reasoning → Llama 3.3 70B (~40GB VRAM at Q4, Meta license)
- Debugging / math / chain-of-thought → DeepSeek R1 70B distill (MIT, reasoning built in)
- Repo-level bug fixing (SWE-bench) → Kimi-Dev-72B (~45GB VRAM at Q4, best SWE-bench score at this size)
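The recommendations above can be encoded as a small lookup that also checks whether the pick fits the available VRAM. This is only a sketch: the use-case keys (`"coding"`, `"general"`, `"reasoning"`, `"repo-bugs"`) and the helper `pick_model` are hypothetical names, and the VRAM numbers are the approximate Q4 figures from the table.

```python
# Recommendation lookup built from the list above.
# ASSUMPTION: the use-case keys are made-up labels for illustration;
# the VRAM values are the approximate Q4 figures from the table.
MODELS = [
    # (model name, use-case key, approx. VRAM in GB at Q4)
    ("Qwen2.5-Coder 32B", "coding", 20),
    ("Llama 3.3 70B", "general", 40),
    ("DeepSeek R1 70B distill", "reasoning", 40),
    ("Kimi-Dev-72B", "repo-bugs", 45),
]


def pick_model(use_case: str, vram_gb: float):
    """Return the recommended model name if it fits in vram_gb, else None."""
    for name, key, needed_gb in MODELS:
        if key == use_case:
            return name if needed_gb <= vram_gb else None
    raise ValueError(f"unknown use case: {use_case!r}")
```

For example, `pick_model("coding", 24)` returns the Qwen model, while `pick_model("general", 24)` returns `None` because the 70B model does not fit at Q4.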
For Local + Agentic Workflows (Claude Code, n8n)
Practical sweet spot: Qwen2.5-Coder 32B
- Fits on single 24GB GPU at Q4
- Apache 2.0 — clean commercial use
- Best coding benchmark results in the 30B range
Quality ceiling if hardware allows: Kimi-Dev-72B or Llama 3.3 70B
- Needs 40–45GB VRAM at Q4, or more aggressive quantization on smaller GPUs
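Agentic tools usually reach a local model over HTTP rather than through the CLI. A minimal sketch of calling Ollama's local `/api/generate` endpoint from Python follows, assuming Ollama is running on its default port 11434 and `qwen2.5-coder:32b` has already been pulled. The helper names `build_request` and `generate` and the 300-second timeout are illustrative choices.

```python
import json
import urllib.request

# ASSUMPTION: Ollama is serving on its default local address and port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "qwen2.5-coder:32b") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "qwen2.5-coder:32b", timeout: float = 300.0) -> str:
    """Send the prompt and return the model's text response."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Write a function that reverses a string.")` needs a running Ollama server; swapping the `model` argument for another pulled tag switches models without other changes.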
Gotchas
- Llama 3.3 70B at Q4_K_M needs ~40GB VRAM, so it won’t fit on a 24GB GPU without heavier quantization or offloading some layers to CPU
- DeepSeek R1 distill variants (8B, 14B, 32B) exist if 70B is too large
- “Modified MIT” on Kimi-Dev-72B — check terms for commercial use before deploying
- The Llama 3.3 Community License allows most commercial use, but products with more than 700 million monthly active users need a separate license from Meta
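To make the first gotcha concrete, the arithmetic can be inverted: given a VRAM budget, how many bits per weight can a model afford? The sketch below is a rough estimate only. The `reserve_gb` headroom for KV cache and runtime overhead is an assumed placeholder, and `max_bits_per_weight` is a hypothetical helper name.

```python
# Largest average bits per weight that fits a given VRAM budget.
# ASSUMPTION: reserve_gb (headroom for KV cache and runtime overhead)
# is a placeholder value, not a measured figure.

def max_bits_per_weight(vram_gb: float, params_billions: float, reserve_gb: float = 2.0) -> float:
    """Return the highest average bits per weight whose weights fit in VRAM."""
    usable_gb = vram_gb - reserve_gb
    if usable_gb <= 0:
        raise ValueError("reserve_gb leaves no room for weights")
    # usable GB -> gigabits, spread across params_billions billion weights
    return usable_gb * 8 / params_billions


if __name__ == "__main__":
    print(f"70B on 24GB: {max_bits_per_weight(24, 70):.2f} bits/weight")
    print(f"32B on 24GB: {max_bits_per_weight(24, 32):.2f} bits/weight")
```

Under these assumptions a 70B model on a 24GB card is left with well under 4 bits per weight, while a 32B model has room for a Q4-class quantization.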
Source
Conversation: “Best open source LLM for coding”, 2026-06-02
Web sources: Whatllm, Hugging Face, Siliconflow