
How to Use a Local AI Model in Visual Studio Code

Most tutorials on setting up AI code editors focus on cloud-based LLMs like OpenAI’s Codex or Anthropic’s Claude. These tools are convenient, but they’re not always ideal.

This guide takes a different approach: it’s about using local LLMs to power your code editor, offering full control, privacy, and customization.

Why Go Local? 

Traditionally, tools like GitHub Copilot rely on cloud-based services (OpenAI, Anthropic, etc.). While powerful, these services can raise privacy concerns or limit customizability. Local LLMs offer an offline, more private alternative, bringing AI coding assistance directly to your machine.

Benefits of Using a Local LLM 

  • Privacy: Your code never leaves your computer.
  • Customization: You control the model’s behavior, system usage, and update schedule.
  • Performance: Run lighter models locally or leverage high-performance setups.

Trade-offs: 

  • Hardware requirements: Running large models locally requires decent RAM, storage, and ideally a GPU.
  • Setup complexity: It takes a bit more effort to configure than signing into a cloud service.
  • Model limitations: Some local models aren’t as polished or advanced as the latest cloud-based LLMs, but they’re getting better and better each day.

What Is an AI Code Editor? 

AI code editors are intelligent development environments that leverage machine learning models, especially LLMs, to assist with writing, understanding, and managing code. These editors can auto-complete functions, generate documentation, refactor code, and even write tests.

Traditionally, these tools connect to cloud-based models (e.g., OpenAI Codex, GitHub Copilot), but local LLMs offer a private, offline alternative.

Use Cases: 

  • Code Understanding: Summarize large files, explain functions, or navigate complex logic.
  • Code Suggestions: Autocomplete entire lines or blocks based on context.
  • Test Generation: Write unit/integration tests from function signatures or code blocks.
  • Documentation Help: Create function/method/class documentation.
  • Refactoring: Rename variables, split functions, improve code structure.
  • Tooling Glue: Generate shell commands, SQL queries, configuration files, etc.

And with a local setup, you can do all this without sending a single byte to the cloud.
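
Once the local setup described below is running, each of these use cases boils down to prompting the model. As an illustration of the "Tooling Glue" case from the terminal (the model name and prompt are just examples, and the command is guarded so it does nothing on machines without Ollama):

```shell
# Illustrative: use a locally served model as tooling glue from the terminal.
# Guarded: does nothing on machines where Ollama is not installed.
if command -v ollama >/dev/null; then
  ollama run qwen2.5-coder "Write a SQL query that lists the 10 newest users."
else
  echo "Install Ollama first (see the Getting Started steps below)."
fi
```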

What You’ll Need 

System Requirements:

  • 16GB RAM minimum (32GB+ recommended for larger models)
  • SSD storage (some models are 4–10GB+)
  • GPU for better performance (optional but beneficial)
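
Before downloading anything, it is worth sanity-checking the disk requirement. A minimal Linux/macOS sketch (the 10 GB figure is the assumed upper bound from the list above):

```shell
# Check free disk space against the largest expected model size.
NEED_GB=10
FREE_KB=$(df -Pk . | awk 'NR==2 {print $4}')
FREE_GB=$((FREE_KB / 1024 / 1024))
echo "Free disk: ${FREE_GB} GB (want at least ~${NEED_GB} GB)"

# Optional GPU check; nvidia-smi exists only on NVIDIA setups.
command -v nvidia-smi >/dev/null \
  && nvidia-smi --query-gpu=name,memory.total --format=csv,noheader \
  || echo "No NVIDIA GPU detected (CPU-only works, just slower)"
```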

Software:

Tool             Role               Example
AI Code Editor   Code interface     VS Code
LLM Server       Hosts the model    Ollama
LLM Model        Powers the AI      Qwen2.5-Coder, DeepSeek-Coder

Getting Started

1. Install VS Code, if you haven’t already.

  • Make sure Agent mode is enabled in Visual Studio Code's user settings.

Note: Agent mode is included starting from VS Code version 1.99.


2. Add the GitHub Copilot extension from the VS Code Marketplace.

Open the settings for Copilot and uncheck “Allow GitHub to use my data for product improvements” to maximize privacy.


3. Install an LLM backend like Ollama on your development machine.

4. Download your model of choice (e.g., ollama pull qwen2.5-coder).

    • After downloading, you can select your model in GitHub Copilot:
      Go to Copilot Models → Manage Models → Ollama, check the box next to your selected model, and click OK.
    • Now, the model will be available for selection in Copilot Models.

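
The download in step 4 can also be verified from the terminal before switching back to VS Code. A minimal sketch (guarded so it is a no-op where Ollama is not installed):

```shell
# Pull the model and confirm it appears in the local library.
# Guarded: does nothing on machines where Ollama is not installed.
MODEL="qwen2.5-coder"
if command -v ollama >/dev/null; then
  ollama pull "$MODEL"
  ollama list | grep "$MODEL" && echo "Model ready."
else
  echo "ollama not found; install it first (step 3)."
fi
```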

5. Start coding with AI assistance, offline and fully under your control.
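
Once everything is in place, a quick way to confirm the model really answers locally is to query Ollama's REST API directly (it listens on port 11434 by default; the model name matches step 4, and the request is skipped if the server is not running):

```shell
# Query the locally served model over Ollama's HTTP API.
MODEL="qwen2.5-coder"
PAYLOAD=$(printf '{"model":"%s","prompt":"Say hello in one word.","stream":false}' "$MODEL")

# Only send the request if the server is actually reachable.
if curl -s --max-time 2 http://localhost:11434/ >/dev/null; then
  curl -s http://localhost:11434/api/generate -d "$PAYLOAD"
else
  echo "Ollama server not reachable on localhost:11434"
fi
```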

Final Thoughts 

Using a local LLM for your code editor is a great step toward privacy, customization, and independence from cloud-based models. Whether you’re an indie developer or part of a privacy-conscious team, this setup gives you full control of your coding AI.

Do you use a low-power laptop and struggle to work with large language models? Don’t miss our next article, where we will explore how to set up a remote machine to host your large language model.

Author: Viktor Vörös