The beginner's guide to local AI

What it is, how to set it up, and who it's for

Hey friend,

I was working on a project - nothing crazy, just needed help drafting some emails and brainstorming ideas. Standard ChatGPT stuff.

Halfway through, I got that annoying message. "You've reached your limit. Please wait or upgrade."

I was already paying for the Plus subscription.

So I did what any reasonable person would do. I complained to a friend.

His response: "Why are you still paying for that? I run everything locally now."

I laughed. Running AI on your own computer sounded like something for Linux nerds with server racks in their basements.

Then he showed me his setup.

A few minutes later, I was running a model on my laptop that felt almost identical to what I'd been paying for.

No subscription. No usage limits. No sending my data to anyone's servers.

I've been down this rabbit hole for a few weeks now, and I think more people need to know about it.

Wait, What Do You Mean "Run AI Locally"?

Let me back up.

When you use ChatGPT or Claude, here's what actually happens:

You type a question. That question travels across the internet to a massive data centre. Thousands of specialised computers process your request. The answer travels back to you.

Your prompts, your documents, your weird 2 am questions - all of it passes through someone else's servers.

Local AI flips this completely.

The AI model lives on your computer. When you ask a question, your machine does all the processing. Nothing leaves your device. Ever.

It's like the difference between streaming a movie and having the DVD. One requires a connection and permission. The other is just... yours.

The wild part is that the "DVDs" have gotten really good.

How Is This Even Possible Now?

A year ago, running serious AI locally required specialised hardware - the kind most people don't have sitting at home.

The models were too big. The software was too complicated. The results were mediocre at best.

Three things changed:

First, the models got way better.

Meta released Llama. Then Llama 2. Then Llama 3. Each version closed the gap with commercial offerings.

Then, Chinese labs like DeepSeek and Alibaba started releasing models that genuinely compete with the big names. And they're giving them away for free.

DeepSeek-R1 matches frontier models on reasoning tasks. People are running it on gaming PCs.

Second, someone figured out compression.

There's this technique called quantisation that shrinks models dramatically. It stores the model's weights with fewer bits - 4 instead of 16, say - so the same model takes a quarter of the memory. A model that would normally need a supercomputer can be compressed to run on a laptop.

You lose a little quality - maybe 5-8% - but for most tasks, you genuinely can't tell the difference.
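To make the shrinkage concrete, here's the back-of-envelope arithmetic. This is a sketch - real model files add some overhead for metadata, and you need extra memory for the conversation context - but the core maths is just parameters times bits per weight:

```python
def model_size_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of a model's weights.

    A model stores one number per parameter, so size is roughly
    parameters x bytes-per-weight. Billions of bytes ~= gigabytes.
    """
    bytes_per_weight = bits_per_weight / 8
    return params_billions * bytes_per_weight

# An 8-billion-parameter model (roughly the Llama 3 8B class):
full = model_size_gb(8, 16)       # 16-bit "full" weights -> ~16 GB
quantised = model_size_gb(8, 4)   # 4-bit quantised -> ~4 GB

print(f"16-bit: ~{full:.0f} GB, 4-bit: ~{quantised:.0f} GB")
```

That 4x shrink is the whole trick: an 8B model that would swamp most laptops at full precision fits comfortably after quantisation.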

Third, the tools got stupid simple.

This is the big one.

There's a program called Ollama. Installation takes one command. Running a model takes one command.

That's it. No Python environments. No dependency hell. No debugging cryptic errors.

If you can copy and paste, you can run local AI.

So What's It Actually Like To Use?

Honestly? Pretty normal.

I open my terminal, type a command, and start chatting. The responses come back in a couple of seconds. I can ask follow-up questions, paste in documents, and do all the usual stuff.

For everyday tasks - writing help, brainstorming, explaining concepts, basic coding questions - I genuinely cannot tell the difference between local and cloud.

The conversations feel the same. The quality feels the same. The speed is comparable.

What's different is everything around it.

No login. No subscription management. No "you've hit your limit." No wondering if my conversation history is being used to train the next model.

I asked it to help with a sensitive client situation last week. Didn't have to sanitise the details or speak in vague hypotheticals. The data literally cannot leave my machine.

That felt like a small thing until I realised how much I'd been self-censoring before.

The Part Where I Have To Be Honest

This isn't magic. There are real tradeoffs.

Complex reasoning is still better in the cloud.

For most everyday tasks, local models are great. But when I need serious analytical thinking - breaking down a complicated problem, working through intricate logic - the cloud models still have an edge.

I tested this with some technical questions where I knew the right answer. The cloud models got it right more often on the hard stuff.

Multimodal is limited.

Uploading images, processing documents, voice interactions - the cloud is smoother here. Local options exist, but they're not as polished.

You need decent hardware.

This is the real barrier for some people.

The sweet spot is a graphics card with at least 8GB of video memory. That runs the smaller models comfortably.

For the really impressive stuff - the models that feel GPT-4 level - you want 24GB or more.

If you have a newer Mac, you're actually in good shape. Apple's chips handle local AI surprisingly well.

If you have an older laptop with no dedicated graphics... it'll work, but it'll be slow. Like, unusably slow for interactive chat.
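If you want a rough sanity check against the numbers above, the rule of thumb is parameters × bits ÷ 8, plus some headroom for the context window and buffers. Here's a hypothetical helper - the 1.2 overhead factor is my own assumption, not an official figure:

```python
def fits_in_vram(vram_gb: float, params_billions: float,
                 bits_per_weight: int = 4, overhead: float = 1.2) -> bool:
    """Rough check: do the quantised weights, plus ~20% headroom
    for context and buffers, fit in the card's video memory?"""
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb * overhead <= vram_gb

print(fits_in_vram(8, 8))    # 8 GB card, 8B model at 4-bit -> True
print(fits_in_vram(8, 70))   # 8 GB card, 70B model -> False
```

It's only a heuristic - tools like Ollama can split a model between GPU and regular RAM, so a "doesn't fit" answer usually means slow rather than impossible.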

There's a learning curve.

Not a huge one. But it's not "open browser, start typing" simple either.

You'll spend an afternoon figuring things out. Reading a guide. Trying different models. Getting comfortable with the workflow.

For some people, that's fun. For others, it's a barrier.

Who Should Actually Try This?

I've been thinking about this, and it comes down to a few profiles:

You handle sensitive information.

Lawyers, doctors, consultants, anyone working with confidential data. The privacy guarantee alone might be worth the setup time.

You stop asking "is it safe to put this in ChatGPT?" because the question becomes irrelevant.

You're tired of subscriptions.

The math is simple. A one-time hardware investment versus monthly fees forever.

If you're paying for multiple AI tools, the break-even point comes faster than you'd think.
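As an illustration, with made-up numbers (the prices below are placeholders, not quotes): if you'd otherwise pay for two $20/month tools, a one-time hardware upgrade pays for itself surprisingly fast.

```python
def break_even_months(hardware_cost: float, monthly_fees: float) -> float:
    """Months until a one-time hardware purchase costs less than
    continuing to pay subscription fees."""
    return hardware_cost / monthly_fees

# Hypothetical: a $600 used GPU vs two $20/month AI subscriptions.
months = break_even_months(600, 2 * 20)
print(f"Break-even after {months:.0f} months")
```

And unlike a subscription, the hardware still works for everything else you do after the break-even point.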

You're curious about how this stuff works.

Running AI locally teaches you things. You start understanding what models are, how they differ, and why some tasks are harder than others.

It's like the difference between ordering food delivery and learning to cook. Both get you fed, but one teaches you something.

You want control.

No terms of service changes. No features getting paywalled. No "we're discontinuing this model."

Your setup works until you decide to change it.

If You Want To Try It

The simplest path:

Step 1: Go to ollama.ai and download the installer. Works on Mac, Windows, Linux.

Step 2: Open your terminal and type: ollama run llama3.2

Step 3: Wait a few minutes for the model to download.

Step 4: Start typing questions.

That's genuinely it for a basic setup.

If you want something more visual, download LM Studio instead. It has a nice interface that looks like ChatGPT. Point and click, no terminal required.

The first model you try probably won't blow your mind. The smaller ones are capable but not exceptional.

If you have hardware that can handle it, try ollama run llama3.3:70b - that's when things get interesting.

What I'm Actually Running Now

Since I know people will ask:

My main model is Llama 3.3 for general stuff. It handles writing, brainstorming, explanations, and everyday questions.

For coding, I use DeepSeek Coder. It's specifically trained for programming, and it shows.

I still use Claude for complex analytical work where I need the best possible reasoning. Maybe a few times a week.

The split feels natural now. Quick stuff stays local. Hard stuff goes to the cloud.

Most days, I don't touch the cloud at all.

The Thing That Surprised Me Most

I expected the privacy and the cost savings.

What I didn't expect was how it changed my relationship with AI tools.

When you're paying for a subscription, there's this subtle pressure to use it. To justify the cost. To get your money's worth.

When it's just... there, on your computer, available whenever - that pressure disappears.

I use AI more naturally now. For smaller things. Quick questions I wouldn't have bothered with before.

And I'm more honest with it. More willing to share real context. More willing to ask dumb questions.

Turns out, the best AI tool is the one you actually feel comfortable using.

The local AI thing isn't for everyone. But it's also not just for tech nerds anymore.

The tools are ready. The models are genuinely good. The setup takes an afternoon, not a weekend.

If you've been curious, now's a pretty good time to try it.

And if you do, let me know how it goes. I'm always curious what people end up using this stuff for.

That's it for today. If this was useful, share it with someone who's been complaining about their ChatGPT subscription.

See you next time.
