Data never leaves the house
Model runs on your hardware. Prompts and responses stay on the machine.
Run a local LLM with Ollama, point an MCP-capable frontend at our open-source MCP server, and you can ask your loved one's check-in data conversationally without a byte leaving your machine. Privacy-maximalist Coordinators only.
Coordinator-only. Included in the £4.99 plan. Same day to set up.
Model runs on your hardware. Prompts and responses stay on the machine.
Open-source models, free to use. Pay only the electricity bill.
Same Getwello MCP server as Claude Desktop. We don't care which AI calls it.
'Has Mum checked in?' is well within a 7B model's capability when given the tool to look it up.
$ ollama run llama3.1
>>> How is Mum doing this week?
[Calling get_circle_stats...]
[Calling list_gap_days...]
Adherence 86% over 7 days (6 of 7 on time).
Mood mostly fine, one low day on Thursday.
Two upcoming visits, gap on Wednesday.
The Thursday dip lines up with the GP
appointment, worth checking in about that.
All data stayed local.Through an MCP-capable Ollama frontend, the model uses our MCP server like any other. The terminal-style transcript shows tool calls being made, data flowing in, the model interpreting it. Most importantly: every line of that conversation stays on your machine.
Download Ollama from ollama.com. Pull a model: ollama pull llama3.1. Confirm it runs: ollama run llama3.1.
Open WebUI is the most polished option. Run via Docker: docker run -d -p 3000:8080 ghcr.io/open-webui/open-webui:main.
Most frontends accept the same npx invocation as Claude Desktop. Add the snippet, paste your API key, save.
Type 'How is Mum doing this week?' The model calls our tools, summarises the result, hands it back. All without leaving your hardware.
Ollama itself is a model runner. MCP support comes through frontends that speak MCP: Open WebUI, LM Studio, Continue and a handful of others. The Getwello MCP server stays the same, you point an MCP-capable frontend at both.
For tool-calling, a model trained for function calling is essential. Llama 3.1 8B and above, Qwen 2.5 7B, or Mistral Nemo all work well. Smaller models tend to hallucinate tool calls.
8GB+ of unified memory on an M1/M2/M3 Mac runs the 7-8B parameter models comfortably. On Windows or Linux, a recent GPU helps but is not strictly required for the smaller models.
No. The local model handles the conversation, then calls our MCP tools (which call our REST API) only for the specific data the model decides to fetch. We see only the tool calls themselves, not your prompts.
Privacy. The model never sees your data outside your machine, our API responses never enter any cloud-AI's context. Worth the quality trade-off for some families.
Some families wire two or three at the same time. The webhook fans out, the API keys stay the same.
See every integration or jump to the Developer API overview.
Mint your first API key in two minutes. Coordinator-only. Included in the £4.99 family plan.