Updated README

Tyler
2026-05-15 14:36:29 -04:00
parent 0d44023bb8
commit 4f970c537a
+27 -60
@@ -1,8 +1,9 @@
# AI Answers Plugin for SearXNG # Ollama AI Answers Plugin for SearXNG
**Single file install** **Single file install**
**Does not block result loading time** **Does not block result loading time**
**Based on [ai-answers-searxng](https://github.com/cra88y/ai-answers-searxng) by [cra88y](https://github.com/cra88y)**
A SearXNG plugin that generates AI answers using search results as RAG context. Supports 8+ LLM providers. A SearXNG plugin that generates local AI overviews powered by Ollama, using search results as RAG context.
Features: Features:
- token-by-token UI streaming - token-by-token UI streaming
@@ -12,7 +13,6 @@ Features:
- internally called low-latency RAG for follow ups (bypasses http loopback) - internally called low-latency RAG for follow ups (bypasses http loopback)
- native network integration via `searx.network` (respects proxy/SSL settings) - native network integration via `searx.network` (respects proxy/SSL settings)
- stateless conversation persistence/shareability via URL - stateless conversation persistence/shareability via URL
- provider detection based on URL
## Installation ## Installation
@@ -27,81 +27,48 @@ plugins:
## Configuration ## Configuration
Configure via the environment variables: Configure via environment variables:
### Required ### Required
- `LLM_PROVIDER`: openrouter, openai, ollama, localai, lmstudio, gemini, azure, or huggingface - `LLM_URL`: Ollama chat completions endpoint. Default: `http://ollama:11434/v1/chat/completions`
- `LLM_KEY`: Provider API key (optional for local providers: ollama, localai, lmstudio) - `LLM_MODEL`: Model name as listed in Ollama. Default: `llama3.2`
### Optional ### Optional
- `LLM_MODEL`: Model identifier. Defaults vary. Recommended: 10-30B dense or 5-15B MoE activated. - `LLM_SYSTEM_PROMPT`: Overrides the system prompt. Default: `You are a direct, citation-accurate search synthesis engine.`
- `LLM_URL`: Overrides endpoint URL for any provider preset. - `LLM_MAX_TOKENS`: Default `200`.
- `LLM_SYSTEM_PROMPT`: Overrides some of the system prompt. Default `You are a direct, citation-accurate search synthesis engine.`.
- `LLM_MAX_TOKENS`: Default `500`.
- `LLM_TEMPERATURE`: Default `0.2`. - `LLM_TEMPERATURE`: Default `0.2`.
- `LLM_CONTEXT_DEEP_COUNT`: results as context with full snippets. Default `5`. - `LLM_CONTEXT_DEEP_COUNT`: Results used as context with full snippets. Default `5`.
- `LLM_CONTEXT_SHALLOW_COUNT`: Results with headlines only (additional breadth). Default `15`. - `LLM_CONTEXT_SHALLOW_COUNT`: Results with headlines only (additional breadth). Default `15`.
- `LLM_TABS`: Tab whitelist, comma delimiter. Default `general,science,it,news`. - `LLM_TABS`: Tab whitelist, comma delimited. Default `general,science,it,news`.
- `LLM_INTERACTIVE`: UI mode. Default is `true` (interactive: copy, regenerate, follow up). Set to `false` for simple response only mode. - `LLM_INTERACTIVE`: UI mode. Default `true` (interactive: copy, regenerate, follow up). Set to `false` for simple response only.
- `LLM_QUESTION_MARK_REQUIRED`: Only trigger AI answers when the query contains `?`. Default `false`. - `LLM_QUESTION_MARK_REQUIRED`: Only trigger AI answers when the query contains `?`. Default `false`.
- `LLM_OLLAMA_UNLOAD_AFTER`: Unload Ollama model after each response. Default `false`.
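
As a rough illustration of how the variables in the updated README could be read, here is a minimal sketch. The defaults are the ones listed on the new side of this diff. The `LLMConfig` class, the `load_config` and `_env_bool` helpers, and the accepted boolean spellings are hypothetical and not taken from the plugin source.

```python
import os
from dataclasses import dataclass


def _env_bool(env, name, default):
    """Assumed boolean parsing: common truthy strings, default when unset."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


@dataclass(frozen=True)
class LLMConfig:
    url: str
    model: str
    system_prompt: str
    max_tokens: int
    temperature: float
    deep_count: int
    shallow_count: int
    tabs: tuple
    interactive: bool
    question_mark_required: bool
    ollama_unload_after: bool


def load_config(env=None):
    """Build an LLMConfig from a mapping (os.environ when none is given)."""
    env = os.environ if env is None else env
    raw_tabs = env.get("LLM_TABS", "general,science,it,news")
    return LLMConfig(
        url=env.get("LLM_URL", "http://ollama:11434/v1/chat/completions"),
        model=env.get("LLM_MODEL", "llama3.2"),
        system_prompt=env.get(
            "LLM_SYSTEM_PROMPT",
            "You are a direct, citation-accurate search synthesis engine.",
        ),
        max_tokens=int(env.get("LLM_MAX_TOKENS", "200")),
        temperature=float(env.get("LLM_TEMPERATURE", "0.2")),
        deep_count=int(env.get("LLM_CONTEXT_DEEP_COUNT", "5")),
        shallow_count=int(env.get("LLM_CONTEXT_SHALLOW_COUNT", "15")),
        # Comma-delimited whitelist; surrounding whitespace is dropped.
        tabs=tuple(t.strip() for t in raw_tabs.split(",") if t.strip()),
        interactive=_env_bool(env, "LLM_INTERACTIVE", True),
        question_mark_required=_env_bool(env, "LLM_QUESTION_MARK_REQUIRED", False),
        ollama_unload_after=_env_bool(env, "LLM_OLLAMA_UNLOAD_AFTER", False),
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the sketch easy to exercise, for example `load_config({"LLM_MODEL": "llama3.2", "LLM_INTERACTIVE": "false"})`.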
## How It Works ## How It Works
1 user initial search 1. User performs initial search
2 results return server side 2. Results return server side
3 `post_search` plugin hook entry 3. `post_search` plugin hook fires
4 token optimized context extracted 4. Token-optimized context extracted from results
5 inject the ui/logic "shell" into standard results answer object 5. UI/logic shell injected into the standard results answer object
6 client side script calls custom endpoint with signed token 6. Client-side script calls custom endpoint with a signed token
7 LLM response streams back token by token 7. Ollama response renders token by token in the UI
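
Steps 4 and 7 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's actual code: `build_context` and `iter_tokens` are hypothetical names, the `[n] title: snippet` layout is invented for the example, and the stream is assumed to use the OpenAI-style `data: {...}` lines ending in `data: [DONE]` that a `/v1/chat/completions` endpoint emits when streaming.

```python
import json


def build_context(results, deep_count=5, shallow_count=15):
    """Step 4 (sketch): full snippets for the top results, titles only after that."""
    lines = []
    for i, result in enumerate(results[: deep_count + shallow_count], start=1):
        title = result.get("title", "")
        if i <= deep_count:
            # "Deep" results carry their snippet text.
            lines.append(f"[{i}] {title}: {result.get('content', '')}")
        else:
            # "Shallow" results add breadth with headlines only.
            lines.append(f"[{i}] {title}")
    return "\n".join(lines)


def iter_tokens(sse_lines):
    """Step 7 (sketch): yield text deltas from OpenAI-style stream lines."""
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and anything else
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = (choices[0].get("delta") or {}).get("content")
        if delta:
            yield delta
```

Because both helpers work on plain lists and strings, they can be tried without a running Ollama instance, for example by feeding `iter_tokens` a list of captured stream lines.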
## Examples ## Example
### OpenRouter ### Docker Compose
``` ```yaml
LLM_PROVIDER=openrouter environment:
LLM_KEY=sk-or-xxx - LLM_URL=http://ollama:11434/v1/chat/completions
LLM_MODEL=google/gemma-3-27b-it:free - LLM_MODEL=llama3.2
``` ```
### Ollama (Local) ### Environment variables
``` ```
LLM_PROVIDER=ollama LLM_URL=http://ollama:11434/v1/chat/completions
LLM_KEY=ollama
LLM_MODEL=llama3.2 LLM_MODEL=llama3.2
``` ```
### LocalAI
```
LLM_PROVIDER=localai
LLM_KEY=your-key
LLM_MODEL=gpt-4
LLM_URL=http://localai.lan:8080/v1/chat/completions
```
### Gemini
```
LLM_PROVIDER=gemini
LLM_KEY=AIzaSy...
LLM_MODEL=gemma-3-27b-it
```
### Azure
```
LLM_PROVIDER=azure
LLM_KEY=your-api-key
LLM_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment/chat/completions?api-version=2024-02-01
```
### Hugging Face
```
LLM_PROVIDER=huggingface
LLM_KEY=hf_xxx
LLM_MODEL=meta-llama/Meta-Llama-3-8B-Instruct
```
## Development ## Development
```bash ```bash