# Running AI Locally with Ollama
## Why Local AI?
Cloud AI is convenient, but for personal data analysis, privacy matters. When you're extracting beliefs, tracking contradictions, and generating narratives from your most personal writing, you want that processing to happen on your machine.
Ollama makes this practical. It runs open-source LLMs locally with GPU acceleration — no internet required.
## Setup (2 Minutes)
- Download Ollama from ollama.com
- Install and run it (it starts a local server at localhost:11434)
- Pull two models:
```
# Embedding model (required for search, 274 MB)
ollama pull nomic-embed-text

# LLM model (pick by your VRAM)
ollama pull qwen2.5:14b-instruct-q4_K_M
```
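Once both pulls finish, it's worth a quick sanity check that the local server is reachable and the models are registered. The commands below assume a default install on port 11434; exact output can differ between Ollama versions.

```
# The API server answers on the default port
curl http://localhost:11434/
# -> "Ollama is running"

# Confirm both models were pulled
ollama list
```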
## Which Model For Your Hardware?
| VRAM | Model | Quality | Speed |
|---|---|---|---|
| 4 GB | llama3.2:3b | Good for basic extraction | ~40 tok/s |
| 8 GB | llama3.1:8b | Solid all-around | ~35 tok/s |
| 12 GB | qwen2.5:14b-instruct-q4_K_M | Best quality/speed balance | ~25 tok/s |
| 16 GB+ | qwen2.5:32b-instruct-q4_K_M | Near-cloud quality | ~15 tok/s |
Recommendation: If you have 12 GB VRAM (e.g., RTX 4070 or 5070), the 14B Qwen model is the sweet spot. It's significantly better than 8B models at structured JSON extraction and nuanced belief analysis.
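As a rough way to confirm that your chosen model handles structured output, you can send one generate request with Ollama's JSON mode. This is only a sketch: the prompt is illustrative, and the model name should match whatever you pulled for your VRAM tier.

```
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5:14b-instruct-q4_K_M",
  "prompt": "Extract the core belief from this journal line as JSON with keys belief and confidence: I keep saying I will write daily, but I never do.",
  "format": "json",
  "stream": false
}'
```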
## Embedding Model: Always Use nomic-embed-text
For search to work, you need an embedding model. nomic-embed-text is:
- Only 274 MB
- 768 dimensions (efficient)
- Fast enough for real-time search
- Runs on any hardware
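Generating an embedding is a single request to the local API. A minimal sketch (this uses the long-standing /api/embeddings endpoint; newer Ollama releases also expose /api/embed, which takes an input field instead of prompt):

```
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "I want to spend more evenings writing and fewer scrolling."
}'
# Response: {"embedding": [0.012, -0.084, ...]}  -- a 768-dimensional vector
```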
## Hybrid Setup: Cloud LLM + Local Embeddings
The best of both worlds: use a free cloud LLM (Gemini Flash, Groq) for analysis + local Ollama for embeddings. This way:

- Analysis uses a powerful cloud model (free)
- Your search index stays fully private (local)
- No VRAM needed for the large LLM
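As a sketch of how the split looks in practice: the analysis request goes out to a cloud provider, while the embedding request stays on localhost. Assumptions here: Groq's OpenAI-compatible endpoint and the model name are illustrative (check their current docs for exact model IDs), and GROQ_API_KEY is an environment variable you set yourself.

```
# Analysis: cloud LLM (endpoint and model name are illustrative)
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-8b-instant",
       "messages": [{"role": "user", "content": "Summarize the recurring themes in this entry: ..."}]}'

# Embeddings: never leave your machine
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "journal entry text"}'
```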
In MemryLab: Settings > Embedding Provider > Select "Ollama (local, private)"
## Cost Comparison
| Setup | Monthly Cost | Privacy | Speed |
|---|---|---|---|
| Full local (Ollama) | $0 + electricity | Full | 15-40 tok/s |
| Gemini Flash (cloud) | ~$0 (free tier) | Partial (analysis sent to cloud) | 100+ tok/s |
| GPT-4o (cloud) | ~$5-20 | None | 60+ tok/s |
For MemryLab's typical usage (~100 LLM calls per analysis), even cloud costs are negligible. But local gives you zero dependency on external services.