Sometimes you can't send code to OpenAI.
Maybe it's proprietary code under NDA. Maybe it's a security-conscious client. Maybe you just prefer keeping your work private.
Local AI models have reached the point where they're genuinely useful for development. Here's how to set up a private AI workflow.
Why Local?
Cloud AI services are convenient, but they have trade-offs:
Data leaves your machine. Your prompts, your code, your context—all sent to external servers.
Terms of service matter. Most providers say they don't train on your data, but terms change, breaches happen, and you're relying on their word.
Compliance requirements. Some industries and clients prohibit external data processing.
Internet dependency. No connection, no AI. Local runs anywhere.
For sensitive work, local models solve real problems.
The Current State
Local models have improved dramatically:
Good enough for coding. Models like CodeLlama and DeepSeek Coder handle most programming tasks competently.
Reasonable hardware requirements. A decent laptop can run useful models. You don't need a server farm.
Easy setup. Tools like Ollama make running models trivial.
They're not as capable as Claude or GPT-4, but they're often capable enough.
Ollama: The Easy Path
Ollama is the simplest way to run local models.
Installation: One command on Mac or Linux. Download and run on Windows.
# Mac/Linux
curl -fsSL https://ollama.com/install.sh | sh
# Then run a model
ollama run codellama
That's it. You're now running a local AI.
Model options:
- codellama — Meta's code-focused model
- deepseek-coder — Strong coding performance
- mistral — Good general-purpose model
- llama3 — Latest Llama, versatile
Try different models for your use case. They have different strengths.
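Switching models is cheap, so pull two or three and compare them on the same prompt. Something like this (model names as listed above):

# Download models without starting a chat
ollama pull deepseek-coder
ollama pull mistral

# See what's installed and how much disk each model takes
ollama list

# Run whichever one you want to test
ollama run deepseek-coder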
LM Studio: The GUI Option
If you prefer a visual interface:
LM Studio provides a ChatGPT-like interface for local models. Download models from within the app. Chat naturally. No command line required.
Good for:
- Exploring what's available
- Quick experimentation
- Non-technical team members
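LM Studio can also expose loaded models over a local OpenAI-compatible server, started from within the app. A minimal check, assuming the server is running on its default port (1234 in the versions I've used) with a model loaded:

# Ask the loaded model a question via the OpenAI-style endpoint
# ("local-model" is a placeholder; LM Studio serves whatever model is loaded)
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "local-model",
    "messages": [{"role": "user", "content": "Write a Python function to parse JSON"}]
  }'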
IDE Integration
Local models can plug into your editor:
Continue (VS Code extension): Connect to Ollama or other local providers. Get Copilot-like completions from local models.
Ollama + API: Ollama exposes an OpenAI-compatible API. Many tools that work with OpenAI can point to your local instance instead.
# Ollama API runs on localhost:11434
curl http://localhost:11434/api/generate -d '{
"model": "codellama",
"prompt": "Write a Python function to parse JSON"
}'
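The same server also speaks the OpenAI chat-completions format, which is what lets OpenAI-oriented tools treat it as a drop-in backend. A sketch, assuming a recent Ollama version with the /v1 compatibility layer:

# OpenAI-compatible endpoint on the same local server
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "codellama",
    "messages": [{"role": "user", "content": "Write a Python function to parse JSON"}]
  }'

Tools that accept a custom OpenAI base URL can usually be pointed at http://localhost:11434/v1. Ollama ignores the API key, but most clients insist on one, so any placeholder works.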
Hardware Reality
What you need:
Minimum: 8GB RAM, any recent CPU. Can run smaller models (7B parameters).
Comfortable: 16GB RAM, modern CPU. Runs 13B models smoothly.
Ideal: 32GB+ RAM or a GPU with 8GB+ VRAM. Runs larger models at reasonable speed.
M1/M2/M3 Macs are particularly good—the unified memory architecture handles larger models well.
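Most models on the Ollama registry ship in multiple sizes as tags, so you can match the model to your machine rather than the other way around. For example (available tags change over time, so check the registry):

# Pick a size that fits your RAM tier
ollama run codellama:7b     # minimum-spec machines
ollama run codellama:13b    # comfortable on 16GB RAM
ollama run codellama:34b    # 32GB+ RAM or a capable GPU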
Performance Trade-offs
Be realistic:
Slower than cloud. Local inference takes time. Expect seconds, not milliseconds.
Less capable. The best local models are behind cloud frontier models. Complex reasoning and very long contexts suffer.
Resource-intensive. Your laptop will work hard. Battery life suffers. Fans spin.
For quick completions and straightforward tasks, local works great. For complex architectural discussions, I still reach for Claude.
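If you want numbers rather than vibes, Ollama can report generation speed directly. In the versions I've used, the --verbose flag prints timing stats, including tokens per second, after each response:

# Print token counts and eval rate after each response
ollama run codellama --verbose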
A Hybrid Workflow
My approach:
Sensitive code: Local models only. Ollama with CodeLlama.
General development: Cloud AI. Claude, ChatGPT.
Quick completions: Copilot or local via Continue.
Match the tool to the sensitivity level. Not everything needs the same protection.
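One practical detail: many tools built on the OpenAI SDK read their endpoint from environment variables, which makes switching between cloud and local per project mostly a shell-config change. The variable names below assume the standard OpenAI client; check your tool's docs:

# Point OpenAI-SDK-based tools at the local Ollama server for sensitive work
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_API_KEY=ollama    # the key is ignored locally, but clients require one

Unset the variables (or scope them per directory with a tool like direnv) to fall back to your cloud defaults.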
Getting Started
- Install Ollama: Takes two minutes
- Download a model: ollama pull codellama
- Try it: ollama run codellama
- Integrate with your editor: Install the Continue extension
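Once Ollama is installed, a quick sanity check from the terminal confirms the server is up before you wire in your editor:

# The server answers on its default port when running
curl http://localhost:11434
# Expected response: "Ollama is running"

# List the models you've pulled so far
curl http://localhost:11434/api/tags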
Start with CodeLlama for coding tasks. Experiment from there.
Related Reading
- Free AI Tools Worth Your Time - Local models provide completely free AI without subscription costs or usage limits
- When to Use Claude vs ChatGPT vs Copilot - Understand when local models fit versus cloud AI services
- The AI Tools I Stopped Using - Why I chose local models over some specialized cloud tools
- My AI Tool Stack in 2025 - How local models integrate with my complete AI workflow
- Security for Solo Founders - Privacy and data protection strategies including local AI
Official Resources
- Ollama - The easiest way to run local AI models on Mac, Linux, and Windows
- LM Studio - GUI application for running local models with a chat interface
- Hugging Face Models - Repository of open-source AI models for local use