[![Main Repo](https://img.shields.io/badge/Main%20Repo-gits.tysstech.com-blue?logo=gitea)](https://git.tysstech.com/tyler/ollama-ai-answers-searxng) [![Mirror Repo](https://img.shields.io/badge/Mirror%20Repo-github.com-blue?logo=github)](https://github.com/TySP-Dev/ollama-ai-answers-searxng)
# Ollama AI Answers Plugin for SearXNG **Based on [ai-answers-searxng](https://github.com/cra88y/ai-answers-searxng) by [cra88y](https://github.com/cra88y)** A SearXNG plugin that generates local AI overviews powered by Ollama, using search results as RAG context. Features: - Token-by-token UI streaming - Clickable inline citations - Interactive mode: continue summary, ask follow-ups, copy, or regenerate - Simple response mode with no extras - Internally called low-latency RAG for follow-ups (bypasses HTTP loopback) - Native network integration via `searx.network` (respects proxy/SSL settings) - Stateless conversation persistence/shareability via URL hash - Model selector in the AI overview widget - Does not slow down result loading - One file install ## Installation Place `ollama_answers.py` into the `searx/plugins` directory of your SearXNG instance (or mount it in a container) and enable it in `settings.yml`: ```yaml plugins: searx.plugins.ollama_answers.SXNGPlugin: active: true ``` ## Configuration Configure via environment variables. ### Required | Variable | Description | Default | |---|---|---| | `LLM_URL` | Ollama chat completions endpoint | `http://ollama:11434/v1/chat/completions` | | `LLM_MODEL` | Model name as listed in Ollama | `qwen3.5:9b` | ### Optional | Variable | Description | Default | |---|---|---| | `LLM_SYSTEM_PROMPT` | Overrides the default system prompt | `You are a direct, citation-accurate search synthesis engine.` | | `LLM_MAX_TOKENS` | Max tokens in the AI response | `200` | | `LLM_TEMPERATURE` | Sampling temperature | `0.2` | | `LLM_CONTEXT_DEEP_COUNT` | Results used with full snippets | `5` | | `LLM_CONTEXT_SHALLOW_COUNT` | Results with headlines only (breadth) | `15` | | `LLM_TABS` | Comma-delimited tab whitelist | `general,science,it,news` | | `LLM_INTERACTIVE` | Interactive UI mode (copy, regenerate, follow-up) | `true` | | `LLM_QUESTION_MARK_REQUIRED` | Only trigger on queries containing `?` | `false` | ## How It Works 1. User performs a search 2. Results return server-side 3. `post_search` plugin hook fires 4. Token-optimized context is extracted from results 5. UI/logic shell injected into the standard answers object 6. Client-side script calls a signed endpoint (`/ai-stream`) 7. Ollama streams a response token-by-token in the UI ## Docker Compose Example ```yaml services: searxng: environment: - LLM_URL=http://ollama:11434/v1/chat/completions - LLM_MODEL=qwen3.5:9b volumes: - ./ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py ollama: image: ollama/ollama volumes: - ollama_data:/root/.ollama volumes: ollama_data: ``` ## Remote Ollama If your Ollama instance is remote or behind a reverse proxy, set `LLM_URL` to the full endpoint and provide an API key if required. The plugin supports Bearer token auth and follows HTTP redirects. ```yaml environment: - LLM_URL=https://ollama.example.com/v1/chat/completions - LLM_API_KEY=your-bearer-token ``` ## Development — Dev Server A standalone Flask dev server is included in `tests/dev.py`. It mocks the SearXNG plugin environment so you can test the full UI without a running SearXNG instance. ### Setup ```bash pip install flask flask-babel certifi ``` ### Run ```bash python tests/dev.py ``` Then open [http://127.0.0.1:5000/](http://127.0.0.1:5000/) in your browser. > **Note:** Use `127.0.0.1:5000`, not `localhost:5000` — macOS AirPlay Receiver can occupy the IPv6 loopback on port 5000. ### Usage - Type a query in the search bar and hit **Search** to trigger an AI overview. - Expand **Ollama Configuration** at the top to change the endpoint URL or Bearer token for the current session. Click **Apply** to save and re-run the current query. - The model selector in the AI overview widget (loaded from `/ai-models`) shows all models available on the configured Ollama server and persists your choice in the session URL. ### Environment Variables (dev) The dev reads the same variables as the plugin: ```bash LLM_URL=http://localhost:11434/v1/chat/completions \ LLM_MODEL=qwen3.5:9b \ python tests/dev.py ``` Or export them before running. Any values set in the config panel at runtime take priority for that session.