Compare commits
8 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 904cf945a2 | | |
| | b3dc603b94 | | |
| | 4e2f9d97d7 | | |
| | 1f7d54590f | | |
| | 2ed6a0aae9 | | |
| | 9d6d4ec160 | | |
| | e4880a7a51 | | |
| | 332834a126 | | |
@@ -21,40 +21,70 @@ Features:
|
|||||||
- Model selector in the AI overview widget
|
- Model selector in the AI overview widget
|
||||||
- Does not slow down result loading
|
- Does not slow down result loading
|
||||||
- One file install
|
- One file install
|
||||||
|
- Real-time streaming via Valkey — responses stream token by token using a background thread + Valkey job queue, working around granian's broken generator support for true streaming feel
|
||||||
|
- TF-IDF result reranking — fetched page content is scored against the query using BM25-style TF-IDF before being sent to Ollama, surfacing the most relevant sources first
|
||||||
|
- Smart chunking — pages are split into 512-token overlapping segments and the highest-scoring chunk per page is selected for context
|
||||||
|
- Intent detection — queries are automatically classified into 8 intent types (factual, howto, technical, comparison, opinion, current, local, general) with tailored system prompts per type
|
||||||
|
- Conversation memory — 30-minute cross-search conversation history stored in Valkey, so follow-up questions work even after navigating to a new search
|
||||||
|
- Markdown rendering — AI responses render bold, italic, lists, headers, and inline code natively in the result box
|
||||||
|
- Intent emoji badge — a small emoji appears next to "AI Overview" indicating the detected query type
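The reranking and chunking features can be illustrated with a minimal sketch. It assumes whitespace-style word tokens instead of model tokens, plain TF-IDF rather than BM25, and a 64-token overlap (the README does not state the overlap). The names `tokenize`, `chunk_words`, `tfidf_scores`, and `best_chunk` are hypothetical and are not the plugin's own functions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; a stand-in for a real model tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_words(words, size=512, overlap=64):
    # Split into overlapping windows. `overlap` is an assumed value.
    step = size - overlap
    chunks = []
    for start in range(0, max(len(words), 1), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break
    return chunks


def tfidf_scores(query, chunks):
    # Score each chunk as the sum of tf * idf over the distinct query terms.
    n = len(chunks)
    doc_freq = Counter()
    for chunk in chunks:
        doc_freq.update(set(chunk))
    query_terms = set(tokenize(query))
    scores = []
    for chunk in chunks:
        counts = Counter(chunk)
        total = len(chunk) or 1
        score = 0.0
        for term in query_terms:
            tf = counts[term] / total
            idf = math.log((1 + n) / (1 + doc_freq[term])) + 1.0
            score += tf * idf
        scores.append(score)
    return scores


def best_chunk(query, page_text, size=512, overlap=64):
    # Return the highest-scoring chunk of one page together with its score.
    chunks = chunk_words(tokenize(page_text), size, overlap)
    scores = tfidf_scores(query, chunks)
    best = max(range(len(chunks)), key=scores.__getitem__)
    return " ".join(chunks[best]), scores[best]
```

Sorting pages by the score that `best_chunk` returns would give a ranking of the kind the feature list describes, with the selected chunk serving as that page's context.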
|
||||||
|
|
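Intent detection of this kind is often done with ordered keyword rules. The sketch below is one possible approach, not the plugin's actual classifier: the regular expressions, the prompt texts, and the names `INTENT_RULES`, `detect_intent`, and `system_prompt_for` are all assumptions. Only the eight intent labels come from the feature list.

```python
import re

# Hypothetical rules, checked in order; the first match wins.
INTENT_RULES = [
    ("comparison", r"\b(vs|versus|compared to|difference between)\b"),
    ("howto", r"\b(how (do|to|can)|tutorial|step by step)\b"),
    ("technical", r"\b(error|exception|install|config|api|docker)\b"),
    ("local", r"\b(near me|nearby|open now)\b"),
    ("current", r"\b(today|latest|news|this week)\b"),
    ("opinion", r"\b(best|worth it|should i|review)\b"),
    ("factual", r"^(who|what|when|where|why|which)\b"),
]

# Illustrative prompts only; intents without an entry use the general prompt.
SYSTEM_PROMPTS = {
    "howto": "Answer with short numbered steps.",
    "comparison": "Contrast the options point by point.",
    "general": "Answer concisely and cite the sources you used.",
}


def detect_intent(query):
    # Return one of the eight intent labels, defaulting to "general".
    q = query.lower().strip()
    for intent, pattern in INTENT_RULES:
        if re.search(pattern, q):
            return intent
    return "general"


def system_prompt_for(query):
    # Pick the prompt for the detected intent, falling back to "general".
    return SYSTEM_PROMPTS.get(detect_intent(query), SYSTEM_PROMPTS["general"])
```

Because the rules are ordered, a query such as "how to install docker" is classified as `howto` before the `technical` rule is tried.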
||||||
## Installation
|
## Install
|
||||||
|
|
||||||
Place `ollama_answers.py` into the `searx/plugins` directory of your SearXNG instance (or mount it in a container) and enable it in `settings.yml`:
|
1. Download the plugin:
|
||||||
|
```bash
|
||||||
|
curl -o ollama_answers.py https://raw.githubusercontent.com/TySP-Dev/ollama-ai-answers-searxng/master/ollama_answers.py
|
||||||
|
```
|
||||||
|
|
||||||
```yaml
|
2. Copy to your SearXNG plugins directory:
|
||||||
plugins:
|
```bash
|
||||||
|
cp ollama_answers.py ~/searxng/plugins/ollama_answers.py
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Add the volume mount to your `docker-compose.yml` under the searxng service:
|
||||||
|
```yaml
|
||||||
|
volumes:
|
||||||
|
- ./plugins/ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py:Z
|
||||||
|
```
|
||||||
|
|
||||||
|
4. Add environment variables to `docker-compose.yml`:
|
||||||
|
```yaml
|
||||||
|
environment:
|
||||||
|
- LLM_URL=http://ollama:11434/v1/chat/completions
|
||||||
|
- LLM_MODEL=qwen3.5:9b
|
||||||
|
- VALKEY_HOST=searxng-valkey
|
||||||
|
```
|
||||||
|
|
||||||
|
5. Add to `settings.yml` plugins section:
|
||||||
|
```yaml
|
||||||
|
plugins:
|
||||||
searx.plugins.ollama_answers.SXNGPlugin:
|
searx.plugins.ollama_answers.SXNGPlugin:
|
||||||
active: true
|
active: true
|
||||||
```
|
```
|
||||||
|
|
||||||
|
6. Restart SearXNG:
|
||||||
|
```bash
|
||||||
|
docker compose up -d --force-recreate core
|
||||||
|
```
|
||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
|
|
||||||
Configure via environment variables.
|
Configure via environment variables.
|
||||||
|
|
||||||
### Required
|
| Variable | Default | Description |
|
||||||
|
|
||||||
| Variable | Description | Default |
|
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| `LLM_URL` | Ollama chat completions endpoint | `http://ollama:11434/v1/chat/completions` |
|
| `LLM_URL` | `http://ollama:11434/v1/chat/completions` | Ollama endpoint |
|
||||||
| `LLM_MODEL` | Model name as listed in Ollama | `qwen3.5:9b` |
|
| `LLM_MODEL` | `qwen3.5:9b` | Default model |
|
||||||
|
| `LLM_MAX_TOKENS` | `200` | Max response tokens |
|
||||||
### Optional
|
| `LLM_TEMPERATURE` | `0.2` | Response temperature |
|
||||||
|
| `LLM_TABS` | `general,science,it,news` | Tabs to show AI overview on |
|
||||||
| Variable | Description | Default |
|
| `LLM_QUESTION_MARK_REQUIRED` | `false` | Only trigger on queries with `?` |
|
||||||
|---|---|---|
|
| `LLM_INTERACTIVE` | `true` | Show copy/regen/follow-up UI |
|
||||||
| `LLM_SYSTEM_PROMPT` | Overrides the default system prompt | `You are a direct, citation-accurate search synthesis engine.` |
|
| `LLM_SYSTEM_PROMPT` | *(built-in)* | Override the system prompt |
|
||||||
| `LLM_MAX_TOKENS` | Max tokens in the AI response | `200` |
|
| `LLM_CONTEXT_DEEP_COUNT` | `5` | Full-content results to fetch |
|
||||||
| `LLM_TEMPERATURE` | Sampling temperature | `0.2` |
|
| `LLM_CONTEXT_SHALLOW_COUNT` | `15` | Headline-only results |
|
||||||
| `LLM_CONTEXT_DEEP_COUNT` | Results used with full snippets | `5` |
|
| `VALKEY_HOST` | `searxng-valkey` | Valkey container hostname |
|
||||||
| `LLM_CONTEXT_SHALLOW_COUNT` | Results with headlines only (breadth) | `15` |
|
| `VALKEY_PORT` | `6379` | Valkey port |
|
||||||
| `LLM_TABS` | Comma-delimited tab whitelist | `general,science,it,news` |
|
|
||||||
| `LLM_INTERACTIVE` | Interactive UI mode (copy, regenerate, follow-up) | `true` |
|
|
||||||
| `LLM_QUESTION_MARK_REQUIRED` | Only trigger on queries containing `?` | `false` |
|
|
||||||
|
|
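As a rough illustration of how these variables and their documented defaults could be read, here is a small sketch. The function `load_config`, the result keys, and the accepted boolean spellings are assumptions; the variable names and default values are taken from the table above.

```python
import os

# Defaults as documented in the configuration table.
DEFAULTS = {
    "LLM_URL": "http://ollama:11434/v1/chat/completions",
    "LLM_MODEL": "qwen3.5:9b",
    "LLM_MAX_TOKENS": "200",
    "LLM_TEMPERATURE": "0.2",
    "LLM_TABS": "general,science,it,news",
    "LLM_QUESTION_MARK_REQUIRED": "false",
    "LLM_INTERACTIVE": "true",
    "LLM_CONTEXT_DEEP_COUNT": "5",
    "LLM_CONTEXT_SHALLOW_COUNT": "15",
    "VALKEY_HOST": "searxng-valkey",
    "VALKEY_PORT": "6379",
}


def as_bool(raw):
    # Assumed truthy spellings; the plugin may accept a different set.
    return raw.strip().lower() in ("1", "true", "yes", "on")


def load_config(env=None):
    # Read settings from `env` (os.environ by default) with table defaults.
    env = os.environ if env is None else env

    def get(name):
        return env.get(name, DEFAULTS[name])

    return {
        "url": get("LLM_URL"),
        "model": get("LLM_MODEL"),
        "max_tokens": int(get("LLM_MAX_TOKENS")),
        "temperature": float(get("LLM_TEMPERATURE")),
        "tabs": [t.strip() for t in get("LLM_TABS").split(",") if t.strip()],
        "question_mark_required": as_bool(get("LLM_QUESTION_MARK_REQUIRED")),
        "interactive": as_bool(get("LLM_INTERACTIVE")),
        "deep_count": int(get("LLM_CONTEXT_DEEP_COUNT")),
        "shallow_count": int(get("LLM_CONTEXT_SHALLOW_COUNT")),
        "valkey_host": get("VALKEY_HOST"),
        "valkey_port": int(get("VALKEY_PORT")),
    }
```

Passing a plain dictionary instead of `os.environ` makes the parsing easy to check in isolation, for example `load_config({"LLM_TABS": "general, it"})`.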
||||||
## How It Works
|
## How It Works
|
||||||
|
|
||||||
@@ -66,6 +96,50 @@ Configure via environment variables.
|
|||||||
6. Client-side script calls a signed endpoint (`/ai-stream`)
|
6. Client-side script calls a signed endpoint (`/ai-stream`)
|
||||||
7. Ollama streams a response token-by-token in the UI
|
7. Ollama streams a response token-by-token in the UI
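The README does not spell out how the `/ai-stream` endpoint is signed. One common way to do it is an HMAC over the query plus an expiry time, sketched below under that assumption. `SECRET`, `make_token`, `verify_token`, the token layout, and the 60-second lifetime are all hypothetical.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical; a real deployment would use a server-side secret


def sign(query, expires, secret=SECRET):
    # HMAC-SHA256 over "<expires>:<query>", hex encoded.
    msg = f"{expires}:{query}".encode("utf-8")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def make_token(query, ttl=60, now=None, secret=SECRET):
    # Token format (assumed): "<expiry unix time>.<hex signature>".
    now = time.time() if now is None else now
    expires = int(now) + ttl
    return f"{expires}.{sign(query, expires, secret)}"


def verify_token(token, query, now=None, secret=SECRET):
    # Reject malformed, expired, or tampered tokens.
    now = time.time() if now is None else now
    try:
        expires_str, signature = token.split(".", 1)
        expires = int(expires_str)
    except ValueError:
        return False
    if expires < now:
        return False
    expected = sign(query, expires, secret)
    # Constant-time comparison on bytes avoids timing leaks.
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
```

In this scheme the server would embed the token in the injected page and check it when the client-side script calls the endpoint, so the endpoint only answers queries that the server itself rendered.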
|
||||||
|
|
||||||
|
## Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────┐
|
||||||
|
│ Browser │
|
||||||
|
│ POST /ai-stream → GET /ai-status/{id} (poll 150ms) │
|
||||||
|
└────────────────┬────────────────────────────────────┘
|
||||||
|
│
|
||||||
|
┌────────────────▼────────────────────────────────────┐
|
||||||
|
│ SearXNG + Plugin │
|
||||||
|
│ │
|
||||||
|
│ post_search() │
|
||||||
|
│ → _enrich_results() ← ThreadPoolExecutor │
|
||||||
|
│ → _fetch_page_text() × 5 parallel │
|
||||||
|
│ → _chunk_text() + _tfidf_score() │
|
||||||
|
│ → rerank by score │
|
||||||
|
│ → _assemble_context() │
|
||||||
|
│ → inject AI Overview HTML + JS │
|
||||||
|
│ │
|
||||||
|
│ /ai-stream │
|
||||||
|
│ → validate token │
|
||||||
|
│ → _detect_intent() → select system prompt │
|
||||||
|
│ → _load_conversation() from Valkey │
|
||||||
|
│ → launch stream_to_valkey() thread │
|
||||||
|
│ → return {job_id} immediately │
|
||||||
|
│ │
|
||||||
|
│ stream_to_valkey() [background thread] │
|
||||||
|
│ → Ollama stream=True │
|
||||||
|
│ → RPUSH tokens to Valkey │
|
||||||
|
│ → RPUSH __DONE__ when complete │
|
||||||
|
│ │
|
||||||
|
│ /ai-status/{job_id} │
|
||||||
|
│ → LRANGE chunks from offset │
|
||||||
|
│ → return {chunks, done} │
|
||||||
|
└────────────────┬────────────────────────────────────┘
|
||||||
|
│
|
||||||
|
┌────────────────▼────────────────────────────────────┐
|
||||||
|
│ Valkey │
|
||||||
|
│ ai:job:{id}:chunks (list, TTL 120s) │
|
||||||
|
│ ai:job:{id}:status (string, TTL 120s) │
|
||||||
|
│ ai:conv:{session} (JSON, TTL 1800s) │
|
||||||
|
└─────────────────────────────────────────────────────┘
|
||||||
|
```
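The job-queue flow in the diagram can be sketched as a producer that appends tokens to a list and a status reader that returns everything past a client-supplied offset. To stay runnable without a server, the sketch uses `FakeValkey`, a hypothetical in-memory stand-in that mimics only `rpush` and `lrange` (with `stop` limited to `-1` or a non-negative index). The functions `push_tokens` and `read_status` and the `next_offset` field are assumptions; only the key layout and the `__DONE__` sentinel come from the diagram.

```python
import threading

DONE = "__DONE__"


class FakeValkey:
    """In-memory stand-in for the two list commands used below."""

    def __init__(self):
        self._lists = {}
        self._lock = threading.Lock()

    def rpush(self, key, *values):
        # Append values and return the new list length, like RPUSH.
        with self._lock:
            items = self._lists.setdefault(key, [])
            items.extend(values)
            return len(items)

    def lrange(self, key, start, stop):
        # Inclusive range like LRANGE; only stop == -1 or stop >= 0 is handled.
        with self._lock:
            items = self._lists.get(key, [])
            end = len(items) if stop == -1 else stop + 1
            return items[start:end]


def push_tokens(client, job_id, tokens):
    # Producer side: append each token, then the sentinel.
    key = f"ai:job:{job_id}:chunks"
    for token in tokens:
        client.rpush(key, token)
    client.rpush(key, DONE)


def read_status(client, job_id, offset):
    # Poller side: return tokens past `offset` and whether the job finished.
    key = f"ai:job:{job_id}:chunks"
    items = client.lrange(key, offset, -1)
    done = bool(items) and items[-1] == DONE
    chunks = [item for item in items if item != DONE]
    return {"chunks": chunks, "done": done, "next_offset": offset + len(items)}


if __name__ == "__main__":
    client = FakeValkey()
    worker = threading.Thread(target=push_tokens, args=(client, "demo", ["Hel", "lo"]))
    worker.start()
    worker.join()
    print(read_status(client, "demo", 0))
```

With a real Valkey client the same two calls would apply, though responses may arrive as bytes unless decoding is enabled, and the keys would also need the expiry shown in the diagram.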
|
||||||
|
|
||||||
## Docker Compose Example
|
## Docker Compose Example
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
@@ -74,6 +148,7 @@ services:
|
|||||||
environment:
|
environment:
|
||||||
- LLM_URL=http://ollama:11434/v1/chat/completions
|
- LLM_URL=http://ollama:11434/v1/chat/completions
|
||||||
- LLM_MODEL=qwen3.5:9b
|
- LLM_MODEL=qwen3.5:9b
|
||||||
|
- VALKEY_HOST=searxng-valkey
|
||||||
volumes:
|
volumes:
|
||||||
- ./ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py
|
- ./ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py
|
||||||
|
|
||||||
@@ -96,6 +171,17 @@ environment:
|
|||||||
- LLM_API_KEY=your-bearer-token
|
- LLM_API_KEY=your-bearer-token
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## Project Structure
|
||||||
|
|
||||||
|
```
|
||||||
|
ollama-ai-answers-searxng/
|
||||||
|
├── ollama_answers.py # single plugin file — all logic here
|
||||||
|
├── README.md
|
||||||
|
├── requirements.txt # flask, flask-babel (for local dev only)
|
||||||
|
└── tests/
|
||||||
|
└── dev.py # local dev server
|
||||||
|
```
|
||||||
|
|
||||||
## Development — Dev Server
|
## Development — Dev Server
|
||||||
|
|
||||||
A standalone Flask dev server is included in `tests/dev.py`. It mocks the SearXNG plugin environment so you can test the full UI without a running SearXNG instance.
|
A standalone Flask dev server is included in `tests/dev.py`. It mocks the SearXNG plugin environment so you can test the full UI without a running SearXNG instance.
|
||||||
@@ -124,7 +210,7 @@ Then open [http://127.0.0.1:5000/](http://127.0.0.1:5000/) in your browser.
|
|||||||
|
|
||||||
### Environment Variables (dev)
|
### Environment Variables (dev)
|
||||||
|
|
||||||
The dev reads the same variables as the plugin:
|
The dev server reads the same variables as the plugin:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
LLM_URL=http://localhost:11434/v1/chat/completions \
|
LLM_URL=http://localhost:11434/v1/chat/completions \
|
||||||
|
|||||||
+752
-158
File diff suppressed because it is too large