T

TySS-Dev 15cfe47181 Merge pull request 'Removing return arrow' (#1 ) from master into testing

Reviewed-on: tyler/ollama-ai-answers-searxng#1

2026-05-17 02:46:49 -04:00

tests

Updated file name, and updated call to main program

2026-05-15 15:50:07 -04:00

.gitignore

feats: native searxng networking, code composition, ux polish, follow up querying via internals, config var clarity, readme

2026-01-20 21:35:43 -06:00

ollama_answers.py

Removing return arrow

2026-05-17 02:43:34 -04:00

README.md

Updated file names for ollama_anwsers.py and dev.py

2026-05-15 15:51:44 -04:00

requirements.txt

Updated the demo.py to work with the changes in ai_answers.py

2026-05-15 15:25:37 -04:00

README.md

Ollama AI Answers Plugin for SearXNG

Based on ai-answers-searxng by cra88y

A SearXNG plugin that generates local AI overviews powered by Ollama, using search results as RAG context.

Features:

Token-by-token UI streaming
Clickable inline citations
Interactive mode: continue summary, ask follow-ups, copy, or regenerate
Simple response mode with no extras
Internally called low-latency RAG for follow-ups (bypasses HTTP loopback)
Native network integration via searx.network (respects proxy/SSL settings)
Stateless conversation persistence/shareability via URL hash
Model selector in the AI overview widget
Does not slow down result loading
One file install

Installation

Place ollama_answers.py into the searx/plugins directory of your SearXNG instance (or mount it in a container) and enable it in settings.yml:

plugins:
  searx.plugins.ollama_answers.SXNGPlugin:
    active: true

Configuration

Configure via environment variables.

Required

Variable	Description	Default
`LLM_URL`	Ollama chat completions endpoint	`http://ollama:11434/v1/chat/completions`
`LLM_MODEL`	Model name as listed in Ollama	`qwen3.5:9b`

Optional

Variable	Description	Default
`LLM_SYSTEM_PROMPT`	Overrides the default system prompt	`You are a direct, citation-accurate search synthesis engine.`
`LLM_MAX_TOKENS`	Max tokens in the AI response	`200`
`LLM_TEMPERATURE`	Sampling temperature	`0.2`
`LLM_CONTEXT_DEEP_COUNT`	Results used with full snippets	`5`
`LLM_CONTEXT_SHALLOW_COUNT`	Results with headlines only (breadth)	`15`
`LLM_TABS`	Comma-delimited tab whitelist	`general,science,it,news`
`LLM_INTERACTIVE`	Interactive UI mode (copy, regenerate, follow-up)	`true`
`LLM_QUESTION_MARK_REQUIRED`	Only trigger on queries containing `?`	`false`

How It Works

User performs a search
Results return server-side
post_search plugin hook fires
Token-optimized context is extracted from results
UI/logic shell injected into the standard answers object
Client-side script calls a signed endpoint (/ai-stream)
Ollama streams a response token-by-token in the UI

Docker Compose Example

services:
  searxng:
    environment:
      - LLM_URL=http://ollama:11434/v1/chat/completions
      - LLM_MODEL=qwen3.5:9b
    volumes:
      - ./ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py

  ollama:
    image: ollama/ollama
    volumes:
      - ollama_data:/root/.ollama

volumes:
  ollama_data:

Remote Ollama

If your Ollama instance is remote or behind a reverse proxy, set LLM_URL to the full endpoint and provide an API key if required. The plugin supports Bearer token auth and follows HTTP redirects.

environment:
  - LLM_URL=https://ollama.example.com/v1/chat/completions
  - LLM_API_KEY=your-bearer-token

Development — Dev Server

A standalone Flask dev server is included in tests/dev.py. It mocks the SearXNG plugin environment so you can test the full UI without a running SearXNG instance.

Setup

pip install flask flask-babel certifi

Run

python tests/dev.py

Then open http://127.0.0.1:5000/ in your browser.

Note: Use 127.0.0.1:5000, not localhost:5000 — macOS AirPlay Receiver can occupy the IPv6 loopback on port 5000.

Usage

Type a query in the search bar and hit Search to trigger an AI overview.
Expand Ollama Configuration at the top to change the endpoint URL or Bearer token for the current session. Click Apply to save and re-run the current query.
The model selector in the AI overview widget (loaded from /ai-models) shows all models available on the configured Ollama server and persists your choice in the session URL.

Environment Variables (dev)

The dev reads the same variables as the plugin:

LLM_URL=http://localhost:11434/v1/chat/completions \
LLM_MODEL=qwen3.5:9b \
python tests/dev.py

Or export them before running. Any values set in the config panel at runtime take priority for that session.