diff --git a/README.md b/README.md
index b01a6e0..043272b 100644
--- a/README.md
+++ b/README.md
@@ -21,40 +21,70 @@ Features:
 - Model selector in the AI overview widget
 - Does not slow down result loading
 - One file install
+- Real-time streaming via Valkey — responses stream token by token using a background thread + Valkey job queue, working around granian's broken generator support for true streaming feel
+- TF-IDF result reranking — fetched page content is scored against the query using BM25-style TF-IDF before being sent to Ollama, surfacing the most relevant sources first
+- Smart chunking — pages are split into 512-token overlapping segments and the highest-scoring chunk per page is selected for context
+- Intent detection — queries are automatically classified into 8 intent types (factual, howto, technical, comparison, opinion, current, local, general) with tailored system prompts per type
+- Conversation memory — 30-minute cross-search conversation history stored in Valkey, so follow-up questions work even after navigating to a new search
+- Markdown rendering — AI responses render bold, italic, lists, headers, and inline code natively in the result box
+- Intent emoji badge — a small emoji appears next to "AI Overview" indicating the detected query type
 
-## Installation
+## Install
 
-Place `ollama_answers.py` into the `searx/plugins` directory of your SearXNG instance (or mount it in a container) and enable it in `settings.yml`:
+1. Download the plugin:
+   ```bash
+   curl -o ollama_answers.py https://raw.githubusercontent.com/TySP-Dev/ollama-ai-answers-searxng/master/ollama_answers.py
+   ```
 
-```yaml
-plugins:
-  searx.plugins.ollama_answers.SXNGPlugin:
-    active: true
-```
+2. Copy to your SearXNG plugins directory:
+   ```bash
+   cp ollama_answers.py ~/searxng/plugins/ollama_answers.py
+   ```
+
+3. Add the volume mount to your `docker-compose.yml` under the searxng service:
+   ```yaml
+   volumes:
+     - ./plugins/ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py:Z
+   ```
+
+4. Add environment variables to `docker-compose.yml`:
+   ```yaml
+   environment:
+     - LLM_URL=http://ollama:11434/v1/chat/completions
+     - LLM_MODEL=qwen3.5:9b
+     - VALKEY_HOST=searxng-valkey
+   ```
+
+5. Add to `settings.yml` plugins section:
+   ```yaml
+   plugins:
+     searx.plugins.ollama_answers.SXNGPlugin:
+       active: true
+   ```
+
+6. Restart SearXNG:
+   ```bash
+   docker compose up -d --force-recreate core
+   ```
 
 ## Configuration
 
 Configure via environment variables.
 
-### Required
-
-| Variable | Description | Default |
+| Variable | Default | Description |
 |---|---|---|
-| `LLM_URL` | Ollama chat completions endpoint | `http://ollama:11434/v1/chat/completions` |
-| `LLM_MODEL` | Model name as listed in Ollama | `qwen3.5:9b` |
-
-### Optional
-
-| Variable | Description | Default |
-|---|---|---|
-| `LLM_SYSTEM_PROMPT` | Overrides the default system prompt | `You are a direct, citation-accurate search synthesis engine.` |
-| `LLM_MAX_TOKENS` | Max tokens in the AI response | `200` |
-| `LLM_TEMPERATURE` | Sampling temperature | `0.2` |
-| `LLM_CONTEXT_DEEP_COUNT` | Results used with full snippets | `5` |
-| `LLM_CONTEXT_SHALLOW_COUNT` | Results with headlines only (breadth) | `15` |
-| `LLM_TABS` | Comma-delimited tab whitelist | `general,science,it,news` |
-| `LLM_INTERACTIVE` | Interactive UI mode (copy, regenerate, follow-up) | `true` |
-| `LLM_QUESTION_MARK_REQUIRED` | Only trigger on queries containing `?` | `false` |
+| `LLM_URL` | `http://ollama:11434/v1/chat/completions` | Ollama endpoint |
+| `LLM_MODEL` | `qwen3.5:9b` | Default model |
+| `LLM_MAX_TOKENS` | `200` | Max response tokens |
+| `LLM_TEMPERATURE` | `0.2` | Response temperature |
+| `LLM_TABS` | `general,science,it,news` | Tabs to show AI overview on |
+| `LLM_QUESTION_MARK_REQUIRED` | `false` | Only trigger on queries with `?` |
+| `LLM_INTERACTIVE` | `true` | Show copy/regen/follow-up UI |
+| `LLM_SYSTEM_PROMPT` | *(built-in)* | Override the system prompt |
+| `LLM_CONTEXT_DEEP_COUNT` | `5` | Full-content results to fetch |
+| `LLM_CONTEXT_SHALLOW_COUNT` | `15` | Headline-only results |
+| `VALKEY_HOST` | `searxng-valkey` | Valkey container hostname |
+| `VALKEY_PORT` | `6379` | Valkey port |
 
 ## How It Works
 
@@ -66,6 +96,50 @@ Configure via environment variables.
 6. Client-side script calls a signed endpoint (`/ai-stream`)
 7. Ollama streams a response token-by-token in the UI
 
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────┐
+│  Browser                                            │
+│  POST /ai-stream → GET /ai-status/{id} (poll 150ms) │
+└────────────────┬────────────────────────────────────┘
+                 │
+┌────────────────▼────────────────────────────────────┐
+│  SearXNG + Plugin                                   │
+│                                                     │
+│  post_search()                                      │
+│    → _enrich_results()  ← ThreadPoolExecutor        │
+│    → _fetch_page_text() × 5 parallel                │
+│    → _chunk_text() + _tfidf_score()                 │
+│    → rerank by score                                │
+│    → _assemble_context()                            │
+│    → inject AI Overview HTML + JS                   │
+│                                                     │
+│  /ai-stream                                         │
+│    → validate token                                 │
+│    → _detect_intent() → select system prompt        │
+│    → _load_conversation() from Valkey               │
+│    → launch stream_to_valkey() thread               │
+│    → return {job_id} immediately                    │
+│                                                     │
+│  stream_to_valkey() [background thread]             │
+│    → Ollama stream=True                             │
+│    → RPUSH tokens to Valkey                         │
+│    → RPUSH __DONE__ when complete                   │
+│                                                     │
+│  /ai-status/{job_id}                                │
+│    → LRANGE chunks from offset                      │
+│    → return {chunks, done}                          │
+└────────────────┬────────────────────────────────────┘
+                 │
+┌────────────────▼────────────────────────────────────┐
+│  Valkey                                             │
+│  ai:job:{id}:chunks  (list, TTL 120s)               │
+│  ai:job:{id}:status  (string, TTL 120s)             │
+│  ai:conv:{session}   (JSON, TTL 1800s)              │
+└─────────────────────────────────────────────────────┘
+```
+
 ## Docker Compose Example
 
 ```yaml
@@ -74,6 +148,7 @@ services:
     environment:
       - LLM_URL=http://ollama:11434/v1/chat/completions
       - LLM_MODEL=qwen3.5:9b
+      - VALKEY_HOST=searxng-valkey
     volumes:
       - ./ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py
 
@@ -96,6 +171,17 @@ environment:
   - LLM_API_KEY=your-bearer-token
 ```
 
+## Project Structure
+
+```
+ollama-ai-answers-searxng/
+├── ollama_answers.py     # single plugin file — all logic here
+├── README.md
+├── requirements.txt      # flask, flask-babel (for local dev only)
+└── tests/
+    └── dev.py            # local dev server
+```
+
 ## Development — Dev Server
 
 A standalone Flask dev server is included in `tests/dev.py`. It mocks the SearXNG plugin environment so you can test the full UI without a running SearXNG instance.
@@ -124,7 +210,7 @@ Then open [http://127.0.0.1:5000/](http://127.0.0.1:5000/) in your browser.
 
 ### Environment Variables (dev)
 
-The dev reads the same variables as the plugin:
+The dev server reads the same variables as the plugin:
 
 ```bash
 LLM_URL=http://localhost:11434/v1/chat/completions \