TySS-Dev/ollama-ai-answers-searxng

Fork 0

T

Tyler b3093f5918

CI Test Guard / validate-code (push) Has been cancelled

Details

UI color changes

2026-05-14 23:05:40 -04:00

.github/workflows

variety of changes: custom system prompt, some bugs, dev qol

2026-05-02 21:01:37 -05:00

tests

variety of changes: custom system prompt, some bugs, dev qol

2026-05-02 21:01:37 -05:00

.gitignore

feats: native searxng networking, code composition, ux polish, follow up querying via internals, config var clarity, readme

2026-01-20 21:35:43 -06:00

ai_answers.py

UI color changes

2026-05-14 23:05:40 -04:00

README.md

variety of changes: custom system prompt, some bugs, dev qol

2026-05-02 21:01:37 -05:00

requirements.txt

feat: implement RAG context awareness and non-blocking streaming

2026-01-10 18:25:34 -06:00

README.md

AI Answers Plugin for SearXNG

Single file install
Does not block result loading time

A SearXNG plugin that generates AI answers using search results as RAG context. Supports 8+ LLM providers.

Features:

token-by-token UI streaming
clickable inline citations
interactive mode to continue summary, ask follow ups, copy, or regenerate
simple response mode with no extras
internally called low-latency RAG for follow ups (bypasses http loopback)
native network integration via searx.network (respects proxy/SSL settings)
stateless conversation persistence/sharability via URL
provider detection based on URL

Installation

Place ai_answers.py into the searx/plugins directory of your instance (or mount it in a container) and enable it in settings.yml:

plugins:
  searx.plugins.ai_answers.SXNGPlugin:  
    active: true

Configuration

Configure via the environment variables:

Required

LLM_PROVIDER: openrouter, openai, ollama, localai, lmstudio, gemini, azure, or huggingface
LLM_KEY: Provider API key (optional for local providers: ollama, localai, lmstudio)

Optional

LLM_MODEL: Model identifier. Defaults vary. Recommended: 10-30B dense or 5-15B MoE activated.
LLM_URL: Overrides endpoint URL for any provider preset.
LLM_SYSTEM_PROMPT: Overrides some of the system prompt. Default You are a direct, citation-accurate search synthesis engine..
LLM_MAX_TOKENS: Default 500.
LLM_TEMPERATURE: Default 0.2.
LLM_CONTEXT_DEEP_COUNT: results as context with full snippets. Default 5.
LLM_CONTEXT_SHALLOW_COUNT: Results with headlines only (additional breadth). Default 15.
LLM_TABS: Tab whitelist, comma delimiter. Default general,science,it,news.
LLM_INTERACTIVE: UI mode. Default is true (interactive: copy, regenerate, follow up). Set to false for simple response only mode.
LLM_QUESTION_MARK_REQUIRED: Only trigger AI answers when the query contains ?. Default false.
LLM_OLLAMA_UNLOAD_AFTER: Unload Ollama model after each response. Default false.

How It Works

1 user initial search 2 results return server side 3 post_search plugin hook entry 4 token optimized context extracted 5 inject the ui/logic "shell" into standard results answer object 6 client side script calls custom endpoint with signed token 7 LLM response streams back token by token

Examples

OpenRouter

LLM_PROVIDER=openrouter
LLM_KEY=sk-or-xxx
LLM_MODEL=google/gemma-3-27b-it:free

Ollama (Local)

LLM_PROVIDER=ollama
LLM_KEY=ollama
LLM_MODEL=llama3.2

LocalAI

LLM_PROVIDER=localai
LLM_KEY=your-key
LLM_MODEL=gpt-4
LLM_URL=http://localai.lan:8080/v1/chat/completions

Gemini

LLM_PROVIDER=gemini
LLM_KEY=AIzaSy...
LLM_MODEL=gemma-3-27b-it

Azure

LLM_PROVIDER=azure
LLM_KEY=your-api-key
LLM_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment/chat/completions?api-version=2024-02-01

Hugging Face

LLM_PROVIDER=huggingface
LLM_KEY=hf_xxx
LLM_MODEL=meta-llama/Meta-Llama-3-8B-Instruct

Development

pip install flask flask-babel
python tests/demo.py   # UI demo at localhost:5000