b3093f5918a182784defd5fd6cfe1fc3638da1e7
AI Answers Plugin for SearXNG
Single file install
Does not block result loading time
A SearXNG plugin that generates AI answers using search results as RAG context. Supports 8+ LLM providers.
Features:
- token-by-token UI streaming
- clickable inline citations
- interactive mode to continue summary, ask follow ups, copy, or regenerate
- simple response mode with no extras
- internally called low-latency RAG for follow ups (bypasses http loopback)
- native network integration via
searx.network(respects proxy/SSL settings) - stateless conversation persistence/sharability via URL
- provider detection based on URL
Installation
Place ai_answers.py into the searx/plugins directory of your instance (or mount it in a container) and enable it in settings.yml:
plugins:
searx.plugins.ai_answers.SXNGPlugin:
active: true
Configuration
Configure via the environment variables:
Required
LLM_PROVIDER: openrouter, openai, ollama, localai, lmstudio, gemini, azure, or huggingfaceLLM_KEY: Provider API key (optional for local providers: ollama, localai, lmstudio)
Optional
LLM_MODEL: Model identifier. Defaults vary. Recommended: 10-30B dense or 5-15B MoE activated.LLM_URL: Overrides endpoint URL for any provider preset.LLM_SYSTEM_PROMPT: Overrides some of the system prompt. DefaultYou are a direct, citation-accurate search synthesis engine..LLM_MAX_TOKENS: Default500.LLM_TEMPERATURE: Default0.2.LLM_CONTEXT_DEEP_COUNT: results as context with full snippets. Default5.LLM_CONTEXT_SHALLOW_COUNT: Results with headlines only (additional breadth). Default15.LLM_TABS: Tab whitelist, comma delimiter. Defaultgeneral,science,it,news.LLM_INTERACTIVE: UI mode. Default istrue(interactive: copy, regenerate, follow up). Set tofalsefor simple response only mode.LLM_QUESTION_MARK_REQUIRED: Only trigger AI answers when the query contains?. Defaultfalse.LLM_OLLAMA_UNLOAD_AFTER: Unload Ollama model after each response. Defaultfalse.
How It Works
1 user initial search
2 results return server side
3 post_search plugin hook entry
4 token optimized context extracted
5 inject the ui/logic "shell" into standard results answer object
6 client side script calls custom endpoint with signed token
7 LLM response streams back token by token
Examples
OpenRouter
LLM_PROVIDER=openrouter
LLM_KEY=sk-or-xxx
LLM_MODEL=google/gemma-3-27b-it:free
Ollama (Local)
LLM_PROVIDER=ollama
LLM_KEY=ollama
LLM_MODEL=llama3.2
LocalAI
LLM_PROVIDER=localai
LLM_KEY=your-key
LLM_MODEL=gpt-4
LLM_URL=http://localai.lan:8080/v1/chat/completions
Gemini
LLM_PROVIDER=gemini
LLM_KEY=AIzaSy...
LLM_MODEL=gemma-3-27b-it
Azure
LLM_PROVIDER=azure
LLM_KEY=your-api-key
LLM_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment/chat/completions?api-version=2024-02-01
Hugging Face
LLM_PROVIDER=huggingface
LLM_KEY=hf_xxx
LLM_MODEL=meta-llama/Meta-Llama-3-8B-Instruct
Development
pip install flask flask-babel
python tests/demo.py # UI demo at localhost:5000
Description
A SearXNG plugin that generates local AI overviews powered by Ollama, using search results as RAG context.
Languages
Python
100%