Files
ollama-ai-answers-searxng/README.md
T

5.0 KiB

ollama-ai-answers-searxng

Local AI search overviews for SearXNG, powered by Ollama.

Python License SearXNG

Mirror Repo

One-line Install

bash <(curl -fsSL https://raw.githubusercontent.com/TySP-Dev/ollama-ai-answers-searxng/master/install.sh)

Features

  • AI Overview box at the top of every search result page
  • Powered entirely by your local Ollama instance — no external API calls
  • Page content fetching — enriches context beyond SearXNG snippets
  • Model selector dropdown — switch models per-search without restarting
  • Inline citations with clickable source links
  • Citation footer listing all referenced sources
  • Follow-up questions with conversation history
  • Copy and Regenerate buttons
  • Typewriter animation (granian-compatible buffered response)
  • Ollama-only — no OpenAI, Gemini, or other provider bloat

Requirements

  • SearXNG installed via Docker Compose
  • Ollama running and accessible from the SearXNG container
  • Python 3.8+ (for build.py and install.sh)
  • Docker + Docker Compose

Install

bash <(curl -fsSL https://raw.githubusercontent.com/TySP-Dev/ollama-ai-answers-searxng/master/install.sh)

The script will clone the repo, build the plugin, detect your SearXNG Docker Compose installation, copy the plugin, update docker-compose.yml and settings.yml, and optionally restart SearXNG.

Manual

git clone https://github.com/TySP-Dev/ollama-ai-answers-searxng
cd ollama-ai-answers-searxng
python3 build.py
bash install.sh

Or manually copy the built plugin and update your config:

# docker-compose.yml — searxng service
environment:
  - LLM_URL=http://ollama:11434/v1/chat/completions
  - LLM_MODEL=qwen3.5:9b
volumes:
  - ./plugins/ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py:Z
# settings.yml
plugins:
  searx.plugins.ollama_answers.SXNGPlugin:
    active: true

Configuration

All configuration is done via environment variables on the SearXNG container.

Variable Default Description
LLM_URL http://ollama:11434/v1/chat/completions Ollama endpoint
LLM_MODEL qwen3.5:9b Default model
LLM_MAX_TOKENS 200 Max response tokens
LLM_TEMPERATURE 0.2 Response temperature
LLM_TABS general,science,it,news Search tabs to show AI overview on
LLM_QUESTION_MARK_REQUIRED false Only trigger on queries ending with ?
LLM_INTERACTIVE true Show copy/regenerate/follow-up UI
LLM_SYSTEM_PROMPT (built-in) Override the system prompt
LLM_CONTEXT_DEEP_COUNT 5 Results fetched for full page content
LLM_CONTEXT_SHALLOW_COUNT 15 Results used as headline-only context

Project Structure

ollama-ai-answers-searxng/
├── ollama_answers.py      # Source plugin — reads UI from assets/
├── build.py               # Assembles dist/ollama_answers.py (self-contained)
├── install.sh             # Full automated Docker Compose installer
├── assets/
│   ├── ui.css             # Interactive widget styles
│   ├── ui.html            # Interactive widget HTML (copy/regen/follow-up bar)
│   └── ui.js              # Frontend JS (typewriter, citations, streaming)
├── dist/                  # Output of build.py — gitignored
│   └── ollama_answers.py  # Self-contained, ready to deploy
├── dev/
│   └── dev.py             # Local Flask dev server (no SearXNG required)
└── README.md

Development

# Edit source files
vim ollama_answers.py
vim assets/ui.css

# Build dist file for deployment
python3 build.py

# Deploy to server
cp dist/ollama_answers.py ~/searxng/plugins/ollama_answers.py
cd ~/searxng && docker compose up -d --force-recreate core

# Run local dev server
PYTHONPATH=. python3 dev/dev.py

The dev server mocks the SearXNG plugin environment so you can test the full UI without a running SearXNG instance. Open http://127.0.0.1:5000/ after starting it.

Note: Use 127.0.0.1:5000, not localhost:5000 — macOS AirPlay Receiver can occupy the IPv6 loopback on port 5000.

How It Works

  1. User searches on SearXNG
  2. post_search hook fires after results are fetched
  3. Top result URLs are fetched in parallel for full page content
  4. Context is assembled from page content + snippets + infoboxes
  5. A signed token is generated and injected into the page
  6. The browser POSTs to /ai-stream with the token and context
  7. The server calls Ollama with the enriched context
  8. The response is returned as JSON and animated with a typewriter effect
  9. Citations are rendered inline and collected in a footer

License

MIT License