Compare commits

8 Commits

Author SHA1 Message Date
TySS-Dev be3caee615 Merge pull request 'Updated README' (#2) from main into testing (Reviewed-on: #2) 2026-05-17 20:55:00 -04:00
TySS-Dev eeac7fcd88 Added more known issues 2026-05-17 20:13:19 -04:00
TySS-Dev 1c3824b7a4 Fixed typo 2026-05-17 20:01:43 -04:00
TySS-Dev a7c031d27b Fixed check boxes 2026-05-17 20:00:56 -04:00
TySS-Dev 5e2b2a246f Added known issues and roadmap 2026-05-17 19:59:55 -04:00
TySS-Dev ffad0de8ae Fixed flow diagram 2026-05-17 19:51:19 -04:00
TySS-Dev 3dffeb384b Fixed a typo in README 2026-05-17 19:46:04 -04:00
TySS-Dev 85d1481bd9 Updated README 2026-05-17 19:45:37 -04:00
+52 -34
@@ -10,44 +10,50 @@
A SearXNG plugin that generates local AI overviews powered by Ollama, using search results as RAG context. A SearXNG plugin that generates local AI overviews powered by Ollama, using search results as RAG context.
Features: ## Features:
- Token-by-token UI streaming
- Clickable inline citations - Inline numbered citations
- Interactive mode: continue summary, ask follow-ups, copy, or regenerate - Interactive mode - Continue summary, ask follow-ups, copy, or regenerate
- Simple response mode with no extras - Overview of ranked results with prompts based on detected query intent:
- Internally called low-latency RAG for follow-ups (bypasses HTTP loopback) - `How To` `Technical` `Factual` `Comparison` `Opinion` `Current` `Local` `General`
- Native network integration via `searx.network` (respects proxy/SSL settings) - Internally called RAG for follow-ups
- Stateless conversation persistence/shareability via URL hash - Native network integration via `searx.network`
- Model selector in the AI overview widget - Stateless conversation persistence/shareability via URL hash
- Does not slow down result loading - Ollama model selector
- One file install - Feeds fetched results to Ollama without slowing down SearXNG results
- Real-time streaming via Valkey — responses stream token by token using a background thread + Valkey job queue, working around granian's broken generator support for true streaming feel - Real-time streaming via Valkey (No waiting for a completed response)
- TF-IDF result reranking — fetched page content is scored against the query using BM25-style TF-IDF before being sent to Ollama, surfacing the most relevant sources first - TF-IDF result ranking before being sent to Ollama
- Smart chunking — pages are split into 512-token overlapping segments and the highest-scoring chunk per page is selected for context - Smart chunking - Pages are split into 512-token segments and highest-scoring chunk per page used for context
- Intent detection — queries are automatically classified into 8 intent types (factual, howto, technical, comparison, opinion, current, local, general) with tailored system prompts per type - Conversation memory - 30-minute cross-search conversation history via Valkey for follow-up questions
- Conversation memory — 30-minute cross-search conversation history stored in Valkey, so follow-up questions work even after navigating to a new search - Markdown support
- Markdown rendering — AI responses render bold, italic, lists, headers, and inline code natively in the result box - Intent emoji badge showing what intent prompt was used
- Intent emoji badge — a small emoji appears next to "AI Overview" indicating the detected query type
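
The chunking and TF-IDF ranking features above can be illustrated with a minimal sketch. This is not the plugin's code: the function names (`tokenize`, `chunk_tokens`, `tfidf_scores`, `rank_pages`), the whitespace-style word tokenizer standing in for model tokens, the 64-token overlap, and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase word tokens stand in for real model tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_tokens(tokens, size=512, overlap=64):
    # Split a token list into segments of `size` that overlap by `overlap`.
    if not tokens:
        return []
    step = size - overlap
    last_start = max(len(tokens) - overlap, 1)
    return [tokens[i:i + size] for i in range(0, last_start, step)]


def tfidf_scores(query, chunks):
    # Score each chunk by summing TF * IDF over the query terms.
    counts = [Counter(chunk) for chunk in chunks]
    n = len(chunks)
    terms = set(tokenize(query))
    scores = []
    for chunk, count in zip(chunks, counts):
        score = 0.0
        for term in terms:
            df = sum(1 for other in counts if term in other)
            if df:
                idf = math.log((1 + n) / (1 + df)) + 1.0  # smoothed IDF (assumed)
                score += (count[term] / len(chunk)) * idf
        scores.append(score)
    return scores


def rank_pages(query, pages, size=512, overlap=64):
    # Keep the highest-scoring chunk per page, then sort pages by that score.
    owners, chunks = [], []
    for url, text in pages.items():
        for chunk in chunk_tokens(tokenize(text), size, overlap):
            owners.append(url)
            chunks.append(chunk)
    best = {}
    for url, chunk, score in zip(owners, chunks, tfidf_scores(query, chunks)):
        if url not in best or score > best[url][0]:
            best[url] = (score, " ".join(chunk))
    return sorted(
        ((url, score, text) for url, (score, text) in best.items()),
        key=lambda row: row[1],
        reverse=True,
    )
```

Calling `rank_pages("valkey job queue", {url: page_text, ...})` returns `(url, score, best_chunk)` tuples with the most relevant page first, which is the order the context would be assembled in under these assumptions.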
## Install ## Install
1. Download the plugin: 1. Download the plugin:
### Main repo (Gitea)
```bash ```bash
curl -o ollama_answers.py https://raw.githubusercontent.com/TySP-Dev/ollama-ai-answers-searxng/master/ollama_answers.py curl -o ollama_answers.py https://git.tysstech.com/TySS-Dev/ollama-ai-answers-searxng/raw/branch/main/ollama_answers.py
``` ```
2. Copy to your SearXNG plugins directory: ### Mirror repo (GitHub):
```bash ```bash
cp ollama_answers.py ~/searxng/plugins/ollama_answers.py curl -o ollama_answers.py https://raw.githubusercontent.com/TySP-Dev/ollama-ai-answers-searxng/main/ollama_answers.py
``` ```
3. Add the volume mount to your `docker-compose.yml` under the searxng service: 3. Copy to your SearXNG plugins directory:
```bash
cp ollama_answers.py path_to/searxng/plugins/ollama_answers.py
```
4. Add the volume mount to your `docker-compose.yml` under the searxng service:
```yaml ```yaml
volumes: volumes:
- ./plugins/ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py:Z - ./plugins/ollama_answers.py:/usr/local/searxng/searx/plugins/ollama_answers.py:Z
``` ```
4. Add environment variables to `docker-compose.yml`: 5. Add environment variables to `docker-compose.yml`:
```yaml ```yaml
environment: environment:
- LLM_URL=http://ollama:11434/v1/chat/completions - LLM_URL=http://ollama:11434/v1/chat/completions
@@ -55,14 +61,14 @@ Features:
- VALKEY_HOST=searxng-valkey - VALKEY_HOST=searxng-valkey
``` ```
5. Add to `settings.yml` plugins section: 6. Add to `settings.yml` plugins section:
```yaml ```yaml
plugins: plugins:
searx.plugins.ollama_answers.SXNGPlugin: searx.plugins.ollama_answers.SXNGPlugin:
active: true active: true
``` ```
6. Restart SearXNG: 7. Restart SearXNG:
```bash ```bash
docker compose up -d --force-recreate core docker compose up -d --force-recreate core
``` ```
@@ -96,6 +102,18 @@ Configure via environment variables.
6. Client-side script calls a signed endpoint (`/ai-stream`) 6. Client-side script calls a signed endpoint (`/ai-stream`)
7. Ollama streams a response token-by-token in the UI 7. Ollama streams a response token-by-token in the UI
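
The streaming steps above follow a producer/poller pattern: a background thread appends tokens to a list and the client repeatedly asks for everything past its current offset. The sketch below imitates that flow with an in-memory stand-in for Valkey so it runs without a server. The `InMemoryLists` class, `start_job`, and `poll_status` are hypothetical names, and using the `__DONE__` sentinel alone to signal completion is a simplification; only the `ai:job:{id}:chunks` key and the `RPUSH`/`LRANGE` operations come from the README.

```python
import threading
import uuid

DONE = "__DONE__"


class InMemoryLists:
    # Hypothetical stand-in for a Valkey client; mimics RPUSH and LRANGE only.
    def __init__(self):
        self._lists = {}
        self._lock = threading.Lock()

    def rpush(self, key, value):
        with self._lock:
            self._lists.setdefault(key, []).append(value)

    def lrange(self, key, start, end):
        # Like LRANGE, `end` is inclusive and -1 means "to the end of the list".
        with self._lock:
            items = self._lists.get(key, [])
            stop = len(items) if end == -1 else end + 1
            return items[start:stop]


def start_job(store, token_source):
    # Launch a background producer and return the job id immediately.
    job_id = uuid.uuid4().hex
    key = f"ai:job:{job_id}:chunks"

    def worker():
        for token in token_source:
            store.rpush(key, token)
        store.rpush(key, DONE)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return job_id, thread


def poll_status(store, job_id, offset):
    # Return every chunk pushed since `offset`, plus a completion flag.
    items = store.lrange(f"ai:job:{job_id}:chunks", offset, -1)
    chunks = [item for item in items if item != DONE]
    return {"chunks": chunks, "done": DONE in items}
```

A client would call `poll_status` in a loop, add `len(result["chunks"])` to its offset after each call, and stop once `done` is true.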
## Known Issues
- [ ] When asking a follow-up question, the previous output disappears
- [ ] Parts of the UI are not theme-aware, resulting in an unpolished look when not using a dark theme
- [ ] When SearXNG provides an info blob for a search, it appears on top of the overview, e.g. for `Wikipedia` or `Linux`
For any issues not listed here, please create an issue ticket on [Gitea](https://git.tysstech.com/TySS-Dev/ollama-ai-answers-searxng/issues) or [GitHub](https://github.com/TySP-Dev/ollama-ai-answers-searxng/issues) and add the `bug` tag.
## Roadmap
- [ ] Working on feature plans
## Architecture ## Architecture
``` ```
@@ -105,35 +123,35 @@ Configure via environment variables.
└────────────────┬────────────────────────────────────┘ └────────────────┬────────────────────────────────────┘
┌────────────────▼────────────────────────────────────┐ ┌────────────────▼────────────────────────────────────┐
│ SearXNG + Plugin │ SearXNG + Plugin │
│ │
│ post_search() │ post_search() │
│ → _enrich_results() ← ThreadPoolExecutor │ │ → _enrich_results() ← ThreadPoolExecutor │
│ → _fetch_page_text() × 5 parallel │ │ → _fetch_page_text() × 5 parallel │
│ → _chunk_text() + _tfidf_score() │ │ → _chunk_text() + _tfidf_score() │
│ → rerank by score │ │ → rerank by score │
│ → _assemble_context() │ │ → _assemble_context() │
│ → inject AI Overview HTML + JS │ │ → inject AI Overview HTML + JS │
│ │
│ /ai-stream │ /ai-stream │
│ → validate token │ → validate token │
│ → _detect_intent() → select system prompt │ │ → _detect_intent() → select system prompt │
│ → _load_conversation() from Valkey │ │ → _load_conversation() from Valkey │
│ → launch stream_to_valkey() thread │ │ → launch stream_to_valkey() thread │
│ → return {job_id} immediately │ │ → return {job_id} immediately │
│ │
│ stream_to_valkey() [background thread] │ │ stream_to_valkey() [background thread] │
│ → Ollama stream=True │ │ → Ollama stream=True │
│ → RPUSH tokens to Valkey │ │ → RPUSH tokens to Valkey │
│ → RPUSH __DONE__ when complete │ │ → RPUSH __DONE__ when complete │
│ │
│ /ai-status/{job_id} │ │ /ai-status/{job_id} │
│ → LRANGE chunks from offset │ │ → LRANGE chunks from offset │
│ → return {chunks, done} │ │ → return {chunks, done} │
└────────────────┬────────────────────────────────────┘ └────────────────┬────────────────────────────────────┘
┌────────────────▼────────────────────────────────────┐ ┌────────────────▼────────────────────────────────────┐
│ Valkey │ Valkey │
│ ai:job:{id}:chunks (list, TTL 120s) │ │ ai:job:{id}:chunks (list, TTL 120s) │
│ ai:job:{id}:status (string, TTL 120s) │ │ ai:job:{id}:status (string, TTL 120s) │
│ ai:conv:{session} (JSON, TTL 1800s) │ │ ai:conv:{session} (JSON, TTL 1800s) │