A browser extension that turns any page, local document, or entire site into a conversation. AI runs on your GPU via WebGPU. No cloud, no API keys, no accounts; no data leaves your machine.
No configuration, no sign-ups, no API keys. Just install and ask.
Add it from the Chrome Web Store or load from source. One click.
First launch downloads ~600 MB to your local cache. After that, it's instant forever.
Click the floating widget on any page. Ask questions, get AI-powered answers with citations.
Switch to Site mode to crawl, index, and search entire websites as a knowledge base.
Switch to Vault mode to index and search your local documents as a knowledge base.
Three powerful modes, one seamless experience.
Ask questions about any webpage. Smart content extraction strips noise (nav, ads, scripts) and feeds clean markdown to the AI. Streaming responses appear in real time.
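The noise-stripping step can be sketched roughly as follows. This is a simplified illustration, not the extension's actual extractor: the `NOISE_TAGS` list and the regex-based approach are assumptions (a real implementation would walk the DOM rather than use regexes on HTML).

```typescript
// Strip noisy elements before handing page text to the model.
// NOISE_TAGS is an illustrative list, not the extension's actual set.
const NOISE_TAGS = ["nav", "aside", "script", "style", "footer", "iframe"];

function stripNoise(html: string): string {
  let cleaned = html;
  for (const tag of NOISE_TAGS) {
    // Remove each noisy element and its contents (non-nested, for brevity).
    const re = new RegExp(`<${tag}[^>]*>[\\s\\S]*?</${tag}>`, "gi");
    cleaned = cleaned.replace(re, "");
  }
  // Collapse the remaining markup to plain text.
  return cleaned.replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim();
}
```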
Auto-discovers sitemaps, crawls up to 150 pages, chunks and embeds everything on your GPU. Ask questions and get answers with source citations.
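The chunking step before embedding might look roughly like this; the chunk size and overlap values are illustrative assumptions, not the extension's real defaults:

```typescript
// Split extracted page text into overlapping chunks for embedding.
// chunkSize/overlap are illustrative values, not the real defaults.
function chunkText(text: string, chunkSize = 400, overlap = 50): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    // Step forward, keeping some overlap so no sentence is cut off from context.
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Overlap between adjacent chunks keeps context that straddles a boundary retrievable from either side.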
From 280 MB ultra-light to 2.3 GB high-quality. The extension benchmarks your GPU and recommends the best model for your hardware.
WebGPU (default), Ollama, built-in Gemini Nano, or Claude API if you have no GPU. Switch with one click — the entire experience stays the same.
Search across all indexed sites by meaning, not just keywords. Find what you're looking for even if you don't remember the exact words.
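Meaning-based search boils down to comparing embedding vectors. A minimal sketch, assuming cosine similarity over stored chunk embeddings (the `topK` helper and its shape are illustrative, not the extension's API):

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k chunks whose embeddings are closest in meaning to the query.
function topK(query: number[], docs: { id: string; vec: number[] }[], k = 3) {
  return docs
    .map(d => ({ id: d.id, score: cosine(query, d.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```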
Export your entire knowledge base as a ZIP — embeddings, conversations, settings. Import on another machine to pick up where you left off.
Every other "chat with a page" tool sends your content to external servers.
| Capability | This Extension | Everyone Else |
|---|---|---|
| AI inference | Your GPU via WebGPU | OpenAI / Anthropic servers |
| Your data | Never leaves your browser | Uploaded to third-party clouds |
| After install | Works offline, forever | Breaks without API key |
| Cost | Free core, $9 one-time | $10–20/month subscriptions |
| Setup | Click install. Done. | API keys, accounts, config |
| Models | 16 local + 3 remote backends | Whatever the vendor gives you |
The extension profiles your GPU on first run and recommends the best fit. Models that won't fit are automatically flagged.
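The fit check can be sketched as a simple memory-budget filter. The headroom factor and model list below are illustrative assumptions, not the extension's actual heuristics:

```typescript
// Flag models whose size exceeds an estimated GPU memory budget.
// The 1.5x headroom factor is an assumed value, not the real heuristic.
interface ModelInfo { name: string; sizeMB: number }

function recommend(models: ModelInfo[], gpuMemMB: number) {
  const budget = gpuMemMB / 1.5; // leave headroom for activations and KV cache
  const fits = models.filter(m => m.sizeMB <= budget);
  const flagged = models.filter(m => m.sizeMB > budget);
  // Recommend the largest model that still fits.
  const best = fits.sort((a, b) => b.sizeMB - a.sizeMB)[0] ?? null;
  return { best, flagged };
}
```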
The core experience is completely free. Premium unlocks data portability with a one-time payment.
Clean architecture, modern stack, zero runtime overhead.
ONNX Runtime Web + @huggingface/transformers. Models run as native GPU compute shaders — no WASM overhead for inference.
Compiles to vanilla JS at build time. Zero runtime, zero virtual DOM. The popup and widget add virtually nothing to page weight.
Every subsystem (LLM, embeddings, vector store, crawler) is defined by a TypeScript interface. Swap any implementation with one line.
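The interface-driven design might look like this sketch; the interface name, method signature, and both backend classes are illustrative stand-ins, not the extension's actual code:

```typescript
// Each subsystem is defined by an interface; implementations are swappable.
interface EmbeddingBackend {
  embed(texts: string[]): Promise<number[][]>;
}

class WebGpuEmbedder implements EmbeddingBackend {
  async embed(texts: string[]): Promise<number[][]> {
    // A real version would run an ONNX model on the GPU; stubbed here.
    return texts.map(t => [t.length]);
  }
}

class OllamaEmbedder implements EmbeddingBackend {
  async embed(texts: string[]): Promise<number[][]> {
    // A real version would call a local Ollama server; stubbed here.
    return texts.map(() => [0]);
  }
}

// Swapping implementations is a one-line change:
const backend: EmbeddingBackend = new WebGpuEmbedder();
```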
LLM inference, embedding generation, and site crawling each run in dedicated workers. The UI stays at 60fps no matter what.
The default model is ~600 MB. It downloads once from Hugging Face's CDN and caches permanently. Smaller models start at ~280 MB.
Yes. After the initial model download, all AI inference runs locally. Querying already-indexed sites works fully offline. Site crawling requires network access.
No. Page content, questions, and AI responses stay in your browser. The only network calls: model download (once), optional license refresh (weekly, paid users), and optional anonymous telemetry (opt-in).
No. One-time payment of $9. No recurring charges, no expiration. The license key works forever.
Chrome 113+ with WebGPU. Edge (Chromium-based) works with sideloading. Without WebGPU, it falls back to CPU inference, or you can connect to Ollama / Claude API.
Try a smaller model in Settings. The extension benchmarks your GPU and recommends a model, but integrated GPUs may struggle with larger models. You can also connect to a local Ollama server for better performance.