2026-05-17 20:55:00 -04:00
1 changed files with 52 additions and 34 deletions
@@ -10,44 +10,50 @@
 A SearXNG plugin that generates local AI overviews powered by Ollama, using search results as RAG context.
-Features:
+## Features:
- Token-by-token UI streaming
+
- Clickable inline citations
+- Inline numbered citations
- Interactive mode: continue summary, ask follow-ups, copy, or regenerate
+- Interactive mode - Continue summary, ask follow-ups, copy, or regenerate
- Simple response mode with no extras
+- Overview of ranked results with prompts based on detected query intent:
- Internally called low-latency RAG for follow-ups (bypasses HTTP loopback)
+  - `How To` `Technical` `Factual` `Comparison` `Opinion` `Current` `Local` `General`
- Native network integration via `searx.network` (respects proxy/SSL settings)
+- Internally called RAG for follow-ups
- Stateless conversation persistence/shareability via URL hash
+- Native network integration via `searx.network`
- Model selector in the AI overview widget
+- Stateless conversation persistence/shareability via URL hash
- Does not slow down result loading
+- Ollama model selector
- One file install
+- Feeds fetched results to Ollama without slowing down SearXNG results
- Real-time streaming via Valkey — responses stream token by token using a background thread + Valkey job queue, working around granian's broken generator support for true streaming feel
+- Real-time streaming via Valkey (no waiting for a completed response)
- TF-IDF result reranking — fetched page content is scored against the query using BM25-style TF-IDF before being sent to Ollama, surfacing the most relevant sources first
+- TF-IDF result ranking before being sent to Ollama
- Smart chunking — pages are split into 512-token overlapping segments and the highest-scoring chunk per page is selected for context
+- Smart chunking - Pages are split into 512-token segments and the highest-scoring chunk per page is used for context
- Intent detection — queries are automatically classified into 8 intent types (factual, howto, technical, comparison, opinion, current, local, general) with tailored system prompts per type
+- Conversation memory - 30-minute cross-search conversation history via Valkey for follow-up questions
- Conversation memory — 30-minute cross-search conversation history stored in Valkey, so follow-up questions work even after navigating to a new search
+- Markdown support
- Markdown rendering — AI responses render bold, italic, lists, headers, and inline code natively in the result box
+- Intent emoji badge showing which intent prompt was used
 - Intent emoji badge — a small emoji appears next to "AI Overview" indicating the detected query type
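
The chunking and TF-IDF ranking items above can be illustrated with a minimal sketch. This is not the plugin's code: the function names (`tokenize`, `chunk_tokens`, `tfidf_score`, `best_chunk`), the whitespace-style tokenizer, the 64-token overlap, and the smoothed IDF formula are all assumptions made for illustration, and it uses plain TF-IDF rather than a BM25 variant.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (a crude stand-in for a real tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into windows of `size` tokens that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the page
    return chunks


def tfidf_score(query_tokens, chunk, all_chunks):
    """Sum TF * IDF over the query terms; IDF is computed across `all_chunks`."""
    if not chunk:
        return 0.0
    counts = Counter(chunk)
    n = len(all_chunks)
    score = 0.0
    for term in set(query_tokens):
        df = sum(1 for c in all_chunks if term in c)
        idf = math.log((n + 1) / (df + 1)) + 1.0  # smoothed, always positive
        score += (counts[term] / len(chunk)) * idf
    return score


def best_chunk(query, page_text, size=512, overlap=64):
    """Return (score, chunk_text) for the highest-scoring chunk of one page."""
    query_tokens = tokenize(query)
    chunks = chunk_tokens(tokenize(page_text), size, overlap)
    if not chunks:
        return 0.0, ""
    scored = [(tfidf_score(query_tokens, c, chunks), " ".join(c)) for c in chunks]
    return max(scored, key=lambda pair: pair[0])
```

Calling `best_chunk(query, page_text)` once per fetched page and sorting the pages by the returned score would give a ranking in the spirit of the feature list; how the plugin actually combines scores is not shown here.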
 ## Install
 1. Download the plugin:
   ### Main repo (Gitea)
   ```bash
-   curl -o ollama_answers.py https://raw.githubusercontent.com/TySP-Dev/ollama-ai-answers-searxng/master/ollama_answers.py
+   curl -o ollama_answers.py https://git.tysstech.com/TySS-Dev/ollama-ai-answers-searxng/raw/branch/main/ollama_answers.py
   ```
-2. Copy to your SearXNG plugins directory:
+   ### Mirror repo (GitHub):
   ```bash
-   cp ollama_answers.py ~/searxng/plugins/ollama_answers.py
+   curl -o ollama_answers.py https://raw.githubusercontent.com/TySP-Dev/ollama-ai-answers-searxng/main/ollama_answers.py
   ```
-3. Add the volume mount to your `docker-compose.yml` under the searxng service:
+3. Copy to your SearXNG plugins directory:
   ```bash
   cp ollama_answers.py path_to/searxng/plugins/ollama_answers.py
   ```
 4. Add the volume mount to your `docker-compose.yml` under the searxng service:
   ```yaml
   volumes:
     - ./plugins/ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py:Z
   ```
-4. Add environment variables to `docker-compose.yml`:
+5. Add environment variables to `docker-compose.yml`:
   ```yaml
   environment:
     - LLM_URL=http://ollama:11434/v1/chat/completions
@@ -55,14 +61,14 @@ Features:
     - VALKEY_HOST=searxng-valkey
   ```
-5. Add to `settings.yml` plugins section:
+6. Add to `settings.yml` plugins section:
   ```yaml
   plugins:
     searx.plugins.ollama_answers.SXNGPlugin:
       active: true
   ```
-6. Restart SearXNG:
+7. Restart SearXNG:
   ```bash
   docker compose up -d --force-recreate core
   ```
@@ -96,6 +102,18 @@ Configure via environment variables.
 6. Client-side script calls a signed endpoint (`/ai-stream`)
 7. Ollama streams a response token-by-token in the UI
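
The streaming steps above rely on a background producer and a client that polls for new tokens. The sketch below shows that pattern under stated assumptions: an in-memory Python list stands in for the Valkey list, and the names `start_job`, `poll`, and `_jobs` are hypothetical rather than taken from the plugin. Only the `__DONE__` sentinel and the offset-based read mirror the README's description.

```python
import threading
import uuid

DONE = "__DONE__"   # sentinel pushed after the last token
_jobs = {}          # job_id -> list of chunks (stand-in for a Valkey list)
_lock = threading.Lock()


def start_job(token_source):
    """Start a background producer and return its job id without waiting."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = []

    def produce():
        for token in token_source:
            with _lock:
                _jobs[job_id].append(token)  # RPUSH in a real list store
        with _lock:
            _jobs[job_id].append(DONE)

    thread = threading.Thread(target=produce, daemon=True)
    thread.start()
    return job_id, thread


def poll(job_id, offset):
    """Return the chunks added since `offset` and whether the job has finished."""
    with _lock:
        items = list(_jobs.get(job_id, [])[offset:])  # like LRANGE offset -1
    done = DONE in items
    chunks = [item for item in items if item != DONE]
    return {"chunks": chunks, "done": done}
```

A client would call `poll` repeatedly, advance its offset by the number of chunks received, and stop once `done` is true. In a real deployment the list would live in Valkey with a TTL so abandoned jobs expire.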
 ## Known Issues
 - [ ] When asking a follow-up question, the previous output disappears
 - [ ] Parts of the UI are not theme-aware, resulting in an unpolished look when not using a dark theme
 - [ ] When SearXNG provides an info blob for a search (e.g. `Wikipedia` or `Linux`), it appears on top of the overview
 For any issues not listed here, please create an issue ticket on [Gitea](https://git.tysstech.com/TySS-Dev/ollama-ai-answers-searxng/issues) or [GitHub](https://github.com/TySP-Dev/ollama-ai-answers-searxng/issues) and add the `bug` tag.
 ## Roadmap
 - [ ] Working on feature plans
 ## Architecture
 ```
@@ -105,35 +123,35 @@ Configure via environment variables.
 └────────────────┬────────────────────────────────────┘
                 │
 ┌────────────────▼────────────────────────────────────┐
-│              SearXNG + Plugin                        │
+│              SearXNG + Plugin                       │
-│                                                      │
+│                                                     │
-│  post_search()                                       │
+│  post_search()                                      │
 │    → _enrich_results()  ← ThreadPoolExecutor        │
 │      → _fetch_page_text() × 5 parallel              │
 │      → _chunk_text() + _tfidf_score()               │
 │      → rerank by score                              │
 │    → _assemble_context()                            │
 │    → inject AI Overview HTML + JS                   │
-│                                                      │
+│                                                     │
-│  /ai-stream                                          │
+│  /ai-stream                                         │
-│    → validate token                                  │
+│    → validate token                                 │
 │    → _detect_intent() → select system prompt        │
 │    → _load_conversation() from Valkey               │
 │    → launch stream_to_valkey() thread               │
 │    → return {job_id} immediately                    │
-│                                                      │
+│                                                     │
 │  stream_to_valkey() [background thread]             │
 │    → Ollama stream=True                             │
 │    → RPUSH tokens to Valkey                         │
 │    → RPUSH __DONE__ when complete                   │
-│                                                      │
+│                                                     │
 │  /ai-status/{job_id}                                │
 │    → LRANGE chunks from offset                      │
 │    → return {chunks, done}                          │
 └────────────────┬────────────────────────────────────┘
                 │
 ┌────────────────▼────────────────────────────────────┐
-│                  Valkey                              │
+│                  Valkey                             │
 │  ai:job:{id}:chunks  (list, TTL 120s)               │
 │  ai:job:{id}:status  (string, TTL 120s)             │
 │  ai:conv:{session}   (JSON, TTL 1800s)              │