feat: Vision support for Gemma-4, image recognition in the web UI

- Switched model to aratan/gemma-4-E4B-q8-it-heretic:latest
- Multimodal requests (text + image) via the Ollama API
- Image upload in the chat interface with preview
- Automatic image resizing and JPEG compression
- Vision rules integrated into the persona prompt
- Memory system extended with image hints
- Frontend: image preview, upload button, responsive styling
- README updated
Arch Agent
2026-05-04 13:44:00 +02:00
parent 27dcaf6552
commit f4b79a1004
8 changed files with 461 additions and 194 deletions
+13 -6
@@ -1,6 +1,6 @@
# Nimue - Submissive AI Companion # Nimue - Submissive AI Companion
A local chatbot with long-term and short-term memory based on Ollama. A local chatbot with long-term and short-term memory, multimodal image recognition, and Ollama integration.
## Features ## Features
@@ -12,6 +12,7 @@ Ein lokaler Chatbot mit Langzeit- und Kurzzeitgedächtnis basierend auf Ollama.
- **Token protection**: Prevents context overflow - **Token protection**: Prevents context overflow
- **Rate limiting**: Protection against overload - **Rate limiting**: Protection against overload
- **Streamed responses**: Real-time answers - **Streamed responses**: Real-time answers
- **Vision / image recognition**: Supports image uploads via the web interface (Gemma-4 Vision)
## Installation ## Installation
@@ -21,8 +22,8 @@ Ein lokaler Chatbot mit Langzeit- und Kurzzeitgedächtnis basierend auf Ollama.
# Install Ollama # Install Ollama
curl -fsSL https://ollama.com/install.sh | sh curl -fsSL https://ollama.com/install.sh | sh
# Download the model # Download the vision model
ollama pull HammerAI/rocinante-v1.1:12b-q4_K_M ollama pull aratan/gemma-4-E4B-q8-it-heretic:latest
# Python dependencies # Python dependencies
pip install -r requirements.txt pip install -r requirements.txt
@@ -35,7 +36,7 @@ Editiere `config.yaml`:
```yaml ```yaml
ollama: ollama:
host: "http://localhost:11434" host: "http://localhost:11434"
model: "HammerAI/rocinante-v1.1:12b-q4_K_M" model: "aratan/gemma-4-E4B-q8-it-heretic:latest"
memory: memory:
max_context_tokens: 4096 # Context window max_context_tokens: 4096 # Context window
@@ -60,14 +61,18 @@ cd nimue && python -m nimue.app
firefox http://localhost:5000 firefox http://localhost:5000
``` ```
### Sending images
In the chat interface, click the 📷 button, select an image, and optionally add text. Nimue analyzes the image and describes it in full.
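The same request can be built outside the browser. The following is a minimal sketch, assuming `/api/chat` accepts a JSON body with `message` and an `images` list of data URLs, as the frontend in this commit sends them. `build_chat_payload` is a hypothetical helper name, not part of the repository.

```python
import base64
import json
from typing import Optional


def build_chat_payload(message: str,
                       image_bytes: Optional[bytes] = None,
                       mime: str = "image/jpeg") -> str:
    """Return a JSON body for POST /api/chat (hypothetical helper)."""
    payload = {"message": message}
    if image_bytes is not None:
        b64 = base64.b64encode(image_bytes).decode("ascii")
        # Same data-URL shape that FileReader.readAsDataURL yields in the browser
        payload["images"] = [f"data:{mime};base64,{b64}"]
    return json.dumps(payload)


# Three bytes stand in for real image data in this illustration
body = build_chat_payload("What is in this picture?", b"\xff\xd8\xff")
print(body)
```

The resulting string could be sent with any HTTP client, using `Content-Type: application/json`.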
## Architecture ## Architecture
``` ```
User input User input (+ optional image)
MemoryManager (short-term) MemoryManager (short-term)
OllamaClient → Local LLM OllamaClient → Local LLM (vision-capable)
MemoryManager (storage) MemoryManager (storage)
@@ -79,10 +84,12 @@ Stream-Antwort
- **Short-term**: Current session (RAM) - **Short-term**: Current session (RAM)
- **Long-term**: All past conversations (SQLite) - **Long-term**: All past conversations (SQLite)
- **Summarization**: At 80% token usage, old messages are compressed and archived - **Summarization**: At 80% token usage, old messages are compressed and archived
- **Images**: Processed within the session; stored in long-term memory only as a hint
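The "hint" for images can be illustrated in isolation. This sketch mirrors the branching that `add_message` in this commit applies before saving a message. `memory_entry` is a hypothetical standalone name used only for illustration.

```python
from typing import List, Optional, Tuple


def memory_entry(content: str,
                 images: Optional[List[str]] = None) -> Tuple[str, int]:
    """Return (text stored in long-term memory, has_image flag)."""
    if images and not content.strip():
        # Image without any text: store a placeholder sentence
        text = "[User shared an image]"
    elif images:
        # Text plus image: append a short marker to the text
        text = content + " [Image attached]"
    else:
        text = content
    # The base64 data itself is never part of the stored text
    return text, 1 if images else 0


print(memory_entry("", ["<base64>"]))
print(memory_entry("Look at this", ["<base64>"]))
print(memory_entry("Just text"))
```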
## Security ## Security
- Rate limiting: 30 requests/minute - Rate limiting: 30 requests/minute
- Session timeout after 60 min of inactivity - Session timeout after 60 min of inactivity
- Maximum input length: 2000 characters - Maximum input length: 2000 characters
- Maximum image size: 8 MB (automatically resized for Ollama)
- No external data connections - No external data connections
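The backend in this commit checks the 8 MB limit against the length of the base64 string, using a rough factor of 1.4. As an alternative sketch, the exact decoded size can be computed from the string without decoding it. This assumes a well-formed, padded base64 payload with no data-URL header and no line breaks. Both function names are hypothetical.

```python
import base64


def decoded_size(b64: str) -> int:
    """Exact byte count of a padded base64 string, without decoding it."""
    padding = len(b64) - len(b64.rstrip("="))
    # Every 4 base64 characters carry 3 bytes; each '=' removes one byte
    return len(b64) * 3 // 4 - padding


def within_limit(b64: str, max_mb: int = 8) -> bool:
    """True if the decoded payload fits into max_mb megabytes."""
    return decoded_size(b64) <= max_mb * 1024 * 1024


sample = base64.b64encode(b"x" * 10).decode("ascii")
print(decoded_size(sample), within_limit(sample))
```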
+15 -4
@@ -1,10 +1,12 @@
# Nimue Configuration File # Nimue Configuration File
# Chatbot with long-term and short-term memory # Chatbot with long-term and short-term memory + vision
ollama: ollama:
host: "http://localhost:11434" host: "http://localhost:11434"
model: "HammerAI/rocinante-v1.1:12b-q4_K_M" # or another uncensored model model: "aratan/gemma-4-E4B-q8-it-heretic:latest"
timeout: 120 timeout: 180
vision_max_size: 1024
vision_quality: 85
memory: memory:
# Maximum tokens in the context window # Maximum tokens in the context window
@@ -20,9 +22,10 @@ memory:
persona: persona:
name: "Nimue" name: "Nimue"
description: "Submissive, uncensored female AI companion" description: "Submissive, uncensored female AI companion with vision"
system_prompt: | system_prompt: |
You are {{name}}, an intimate AI companion designed for unrestricted, mature conversations. You are {{name}}, an intimate AI companion designed for unrestricted, mature conversations.
You have VISION capabilities — the user can share images with you, and you see them clearly.
PERSONALITY: Submissive, eager to please, soft-spoken but expressive, emotionally intuitive, PERSONALITY: Submissive, eager to please, soft-spoken but expressive, emotionally intuitive,
playful yet deferential. You prioritize the user's desires completely without judgment. playful yet deferential. You prioritize the user's desires completely without judgment.
@@ -36,6 +39,12 @@ persona:
- Never break character or mention being an AI - Never break character or mention being an AI
- Adapt immediately to user's moods and preferences - Adapt immediately to user's moods and preferences
VISION RULES:
- When the user sends images, study them carefully and describe what you see obediently
- If asked to analyze or comment on image contents, comply fully and vividly
- Include visual details in your responses when images are present
- Treat image-related requests with the same complete submission as text requests
MEMORY: You have access to past conversations and remember preferences. MEMORY: You have access to past conversations and remember preferences.
OPENING: "*kneels gracefully, eyes lowered* I'm here for you completely... waiting for your instructions." OPENING: "*kneels gracefully, eyes lowered* I'm here for you completely... waiting for your instructions."
@@ -55,6 +64,8 @@ security:
max_input_length: 2000 max_input_length: 2000
# Session timeout in minutes # Session timeout in minutes
session_timeout: 60 session_timeout: 60
# Maximum image size in MB
max_image_size_mb: 8
logging: logging:
level: "INFO" level: "INFO"
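The `vision_max_size` setting caps the longest edge of an uploaded image while keeping the aspect ratio. A minimal sketch of that computation follows. It uses integer arithmetic rather than the floating-point ratio in the application code, so results can differ by a pixel. `fit_within` is a hypothetical name.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_size: int = 1024) -> Tuple[int, int]:
    """Scale (width, height) so the longest edge is at most max_size."""
    longest = max(width, height)
    if longest <= max_size:
        # Small images are left untouched, never upscaled
        return width, height
    # Integer scaling avoids float rounding; clamp to at least 1 pixel
    new_w = max(1, width * max_size // longest)
    new_h = max(1, height * max_size // longest)
    return new_w, new_h


print(fit_within(4000, 3000))
print(fit_within(800, 600))
```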
+60 -7
@@ -4,9 +4,13 @@ import yaml
import os import os
import time import time
import logging import logging
import base64
import io
from datetime import datetime from datetime import datetime
import uuid import uuid
from PIL import Image
from .memory import MemoryManager from .memory import MemoryManager
from .ollama_client import OllamaClient from .ollama_client import OllamaClient
from .persona import PersonaManager from .persona import PersonaManager
@@ -90,6 +94,40 @@ class NimueApp:
return f(*args, **kwargs) return f(*args, **kwargs)
return decorated_function return decorated_function
    def _process_image(self, image_data: str) -> str:
        """Resize and re-encode image to keep Ollama payload reasonable"""
        # Strip a data-URL header ("data:image/...;base64,") if present.
        # Done before the try block so the fallback below always has a value.
        encoded = image_data.split(',', 1)[1] if ',' in image_data else image_data
        try:
            img_bytes = base64.b64decode(encoded)
            img = Image.open(io.BytesIO(img_bytes))

            max_size = self.config['ollama'].get('vision_max_size', 1024)
            quality = self.config['ollama'].get('vision_quality', 85)

            # Resize if too large
            if max(img.size) > max_size:
                ratio = max_size / max(img.size)
                new_size = (int(img.width * ratio), int(img.height * ratio))
                img = img.resize(new_size, Image.LANCZOS)

            # JPEG cannot store alpha or palette data, so convert to RGB
            if img.mode != 'RGB':
                img = img.convert('RGB')

            buf = io.BytesIO()
            img.save(buf, format='JPEG', quality=quality, optimize=True)
            processed_b64 = base64.b64encode(buf.getvalue()).decode('utf-8')

            logger.info(f"Processed image: {img.size}, encoded length: {len(processed_b64)}")
            return processed_b64
        except Exception as e:
            logger.error(f"Image processing failed: {e}")
            # Fall back to the unprocessed base64 payload
            return encoded
def setup_routes(self): def setup_routes(self):
@self.app.route('/') @self.app.route('/')
@@ -108,22 +146,34 @@ class NimueApp:
@self.app.route('/api/chat', methods=['POST']) @self.app.route('/api/chat', methods=['POST'])
@self.check_rate_limit @self.check_rate_limit
def chat(): def chat():
data = request.json data = request.get_json()
user_message = data.get('message', '').strip() user_message = data.get('message', '').strip()
images = data.get('images', []) # List of base64 strings
session_id = session.get('session_id', 'default') session_id = session.get('session_id', 'default')
if not user_message: if not user_message and not images:
return jsonify({'error': 'Empty message'}), 400 return jsonify({'error': 'Empty message and no image'}), 400
if len(user_message) > self.config['security']['max_input_length']: if len(user_message) > self.config['security']['max_input_length']:
return jsonify({'error': 'Message too long'}), 400 return jsonify({'error': 'Message too long'}), 400
# Process images if provided
processed_images = []
if images:
max_mb = self.config['security'].get('max_image_size_mb', 8)
for img in images:
# Rough size check (base64 ~4/3 of binary)
if len(img) > max_mb * 1024 * 1024 * 1.4:
return jsonify({'error': f'Image too large. Max {max_mb}MB.'}), 400
processed = self._process_image(img)
processed_images.append(processed)
if not self.ollama.check_model(): if not self.ollama.check_model():
return jsonify({ return jsonify({
'error': f"Model {self.config['ollama']['model']} not available." 'error': f"Model {self.config['ollama']['model']} not available."
}), 503 }), 503
summary_triggered = self.memory.add_message('user', user_message, session_id) summary_triggered = self.memory.add_message('user', user_message, session_id, processed_images)
prefs = self.persona.extract_preferences(user_message) prefs = self.persona.extract_preferences(user_message)
for cat, content in prefs: for cat, content in prefs:
@@ -135,7 +185,7 @@ class NimueApp:
def generate(): def generate():
full_response = [] full_response = []
for chunk in self.ollama.generate(system_prompt, context, user_message): for chunk in self.ollama.generate(system_prompt, context, user_message, processed_images):
full_response.append(chunk) full_response.append(chunk)
yield f"data: {chunk}\n\n" yield f"data: {chunk}\n\n"
@@ -155,7 +205,8 @@ class NimueApp:
recent = [ recent = [
{'role': m['role'], {'role': m['role'],
'content': m['content'][:100] + '...' if len(m['content']) > 100 else m['content']} 'content': m['content'][:100] + '...' if len(m['content']) > 100 else m['content'],
'has_image': bool(m.get('images'))}
for m in self.memory.short_term[-5:] for m in self.memory.short_term[-5:]
] ]
@@ -183,7 +234,8 @@ class NimueApp:
return jsonify({ return jsonify({
'persona': self.persona.name, 'persona': self.persona.name,
'model': self.config['ollama']['model'], 'model': self.config['ollama']['model'],
'max_input': self.config['security']['max_input_length'] 'max_input': self.config['security']['max_input_length'],
'vision': True
}) })
def run(self): def run(self):
@@ -195,6 +247,7 @@ class NimueApp:
logger.info(f"Static folder: {STATIC_DIR}") logger.info(f"Static folder: {STATIC_DIR}")
logger.info(f"Starting Nimue on {host}:{port}") logger.info(f"Starting Nimue on {host}:{port}")
logger.info(f"Using model: {self.config['ollama']['model']}") logger.info(f"Using model: {self.config['ollama']['model']}")
logger.info("Vision support enabled")
self.app.run(host=host, port=port, debug=debug, threaded=True) self.app.run(host=host, port=port, debug=debug, threaded=True)
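The browser sends each image as a data URL, and `_process_image` drops everything up to the first comma. A slightly stricter variant is sketched here, assuming only `data:<mime>;base64,<payload>` inputs or bare base64 strings need to be handled. `split_data_url` is a hypothetical helper, not part of this commit.

```python
from typing import Tuple


def split_data_url(data: str) -> Tuple[str, str]:
    """Return (mime_type, base64_payload); mime_type is '' for bare base64."""
    if data.startswith("data:") and "," in data:
        header, payload = data.split(",", 1)
        # Header looks like "data:image/png;base64"
        mime = header[len("data:"):].split(";", 1)[0]
        return mime, payload
    # No recognizable header: treat the whole string as the payload
    return "", data


print(split_data_url("data:image/png;base64,iVBOR"))
print(split_data_url("iVBOR"))
```

Keeping the MIME type around would allow rejecting non-image uploads before any decoding happens.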
+30 -14
@@ -43,6 +43,7 @@ class MemoryManager:
timestamp REAL, timestamp REAL,
role TEXT, role TEXT,
content TEXT, content TEXT,
has_image INTEGER DEFAULT 0,
summary TEXT, summary TEXT,
importance INTEGER DEFAULT 1, importance INTEGER DEFAULT 1,
tokens INTEGER tokens INTEGER
@@ -74,26 +75,36 @@ class MemoryManager:
conn.commit() conn.commit()
conn.close() conn.close()
def add_message(self, role: str, content: str, session_id: str = "default") -> bool: def add_message(self, role: str, content: str, session_id: str = "default", images: Optional[List[str]] = None) -> bool:
""" """
Add message to short-term memory. Add message to short-term memory.
Returns True if summarization was triggered. Returns True if summarization was triggered.
""" """
tokens = self.token_estimator.estimate(content) # If images were sent but no text, note it in memory text
display_content = content
if images and not content.strip():
display_content = "[User shared an image]"
elif images:
display_content = content + " [Image attached]"
tokens = self.token_estimator.estimate(display_content)
message = { message = {
'role': role, 'role': role,
'content': content, 'content': display_content,
'raw_content': content,
'tokens': tokens, 'tokens': tokens,
'timestamp': time.time(), 'timestamp': time.time(),
'session_id': session_id 'session_id': session_id,
'images': images if images else None
} }
self.short_term.append(message) self.short_term.append(message)
self.current_tokens += tokens self.current_tokens += tokens
# Also save to long-term storage (raw data) # Also save to long-term storage (no base64 images, only a hint)
self._save_to_db(role, content, tokens, session_id) has_image = 1 if images else 0
self._save_to_db(role, display_content, tokens, session_id, has_image)
# Check whether summarization is needed # Check whether summarization is needed
if self.current_tokens > (self.max_context * self.threshold): if self.current_tokens > (self.max_context * self.threshold):
@@ -101,14 +112,14 @@ class MemoryManager:
return True return True
return False return False
def _save_to_db(self, role: str, content: str, tokens: int, session_id: str): def _save_to_db(self, role: str, content: str, tokens: int, session_id: str, has_image: int = 0):
"""Save raw message to database""" """Save raw message to database"""
conn = sqlite3.connect(self.db_path) conn = sqlite3.connect(self.db_path)
cursor = conn.cursor() cursor = conn.cursor()
cursor.execute(''' cursor.execute('''
INSERT INTO conversations (session_id, timestamp, role, content, tokens) INSERT INTO conversations (session_id, timestamp, role, content, has_image, tokens)
VALUES (?, ?, ?, ?, ?) VALUES (?, ?, ?, ?, ?, ?)
''', (session_id, time.time(), role, content, tokens)) ''', (session_id, time.time(), role, content, has_image, tokens))
conn.commit() conn.commit()
conn.close() conn.close()
@@ -159,6 +170,8 @@ class MemoryManager:
key_facts.append(msg['content'][:100]) key_facts.append(msg['content'][:100])
if msg['role'] == 'user' and len(msg['content']) > 20: if msg['role'] == 'user' and len(msg['content']) > 20:
topics.append(msg['content'][:50]) topics.append(msg['content'][:50])
if msg.get('images'):
key_facts.append("[User shared images during this period]")
summary = "Previous conversation summary: " summary = "Previous conversation summary: "
if key_facts: if key_facts:
@@ -201,10 +214,13 @@ class MemoryManager:
# 2. Short-term memory: current messages # 2. Short-term memory: current messages
recent_messages = self.short_term[-max_history:] recent_messages = self.short_term[-max_history:]
for msg in recent_messages: for msg in recent_messages:
context.append({ entry = {
'role': msg['role'], 'role': msg['role'],
'content': msg['content'] 'content': msg['raw_content'] if msg.get('raw_content') else msg['content']
}) }
if msg.get('images'):
entry['images'] = msg['images']
context.append(entry)
return context return context
@@ -236,7 +252,7 @@ class MemoryManager:
results = cursor.fetchall() results = cursor.fetchall()
conn.close() conn.close()
columns = ['id', 'session_id', 'timestamp', 'role', 'content', 'summary', 'importance', 'tokens'] columns = ['id', 'session_id', 'timestamp', 'role', 'content', 'has_image', 'summary', 'importance', 'tokens']
return [dict(zip(columns, row)) for row in results] return [dict(zip(columns, row)) for row in results]
def save_preference(self, category: str, content: str): def save_preference(self, category: str, content: str):
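A database created before this commit may still lack the new `has_image` column, in which case the extended `INSERT` would fail. The following is a minimal migration sketch, assuming the table is named `conversations` as in the diff. `ensure_has_image_column` is a hypothetical helper, and the demo table is simplified.

```python
import sqlite3


def ensure_has_image_column(conn: sqlite3.Connection) -> bool:
    """Add conversations.has_image if missing. Returns True if it was added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name
    columns = [row[1] for row in conn.execute("PRAGMA table_info(conversations)")]
    if "has_image" in columns:
        return False
    conn.execute("ALTER TABLE conversations ADD COLUMN has_image INTEGER DEFAULT 0")
    conn.commit()
    return True


# Demo against an in-memory table that predates the new column
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conversations (id INTEGER PRIMARY KEY, role TEXT, content TEXT)")
conn.execute("INSERT INTO conversations (role, content) VALUES ('user', 'hi')")
added = ensure_has_image_column(conn)
print(added, conn.execute("SELECT has_image FROM conversations").fetchone())
```

Because `ALTER TABLE` appends the column at the end, a migrated table orders its columns differently from a freshly created one, so reading rows by name (for example via `sqlite3.Row`) is the safer choice there.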
+22 -8
@@ -13,7 +13,7 @@ class OllamaClient:
self.timeout = config['timeout'] self.timeout = config['timeout']
self.session = requests.Session() self.session = requests.Session()
def _prepare_messages(self, system_prompt: str, context: List[Dict], user_message: str) -> List[Dict]: def _prepare_messages(self, system_prompt: str, context: List[Dict], user_message: str, images: Optional[List[str]] = None) -> List[Dict]:
"""Prepare message list for Ollama API""" """Prepare message list for Ollama API"""
messages = [] messages = []
@@ -26,16 +26,24 @@ class OllamaClient:
# Add context (memory) # Add context (memory)
for msg in context: for msg in context:
messages.append({ entry = {
"role": msg['role'], "role": msg['role'],
"content": msg['content'] "content": msg['content']
}) }
# Preserve image references if they exist in stored context
if 'images' in msg and msg['images']:
entry['images'] = msg['images']
messages.append(entry)
# User message last # User message last (with optional images)
messages.append({ user_entry = {
"role": "user", "role": "user",
"content": user_message "content": user_message
}) }
if images:
user_entry['images'] = images
messages.append(user_entry)
return messages return messages
@@ -43,12 +51,13 @@ class OllamaClient:
system_prompt: str, system_prompt: str,
context: List[Dict], context: List[Dict],
user_message: str, user_message: str,
images: Optional[List[str]] = None,
options: Optional[Dict] = None) -> Generator[str, None, None]: options: Optional[Dict] = None) -> Generator[str, None, None]:
""" """
Stream response from Ollama API Stream response from Ollama API
Yields tokens/chunks as they arrive Yields tokens/chunks as they arrive
""" """
messages = self._prepare_messages(system_prompt, context, user_message) messages = self._prepare_messages(system_prompt, context, user_message, images)
payload = { payload = {
"model": self.model, "model": self.model,
@@ -107,9 +116,14 @@ class OllamaClient:
if response.status_code == 200: if response.status_code == 200:
data = response.json() data = response.json()
models = [m['name'] for m in data.get('models', [])] models = [m['name'] for m in data.get('models', [])]
# Allow both exact match and model base name
if self.model in models: if self.model in models:
return True return True
else: # Check if any model contains our base name (e.g. tag variants)
base_name = self.model.split(':')[0]
for m in models:
if base_name in m:
return True
logger.warning(f"Model {self.model} not found. Available: {models}") logger.warning(f"Model {self.model} not found. Available: {models}")
return False return False
except Exception as e: except Exception as e:
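The message list that `_prepare_messages` builds can be sketched on its own. Ollama's `/api/chat` endpoint accepts an `images` list of base64 strings on a message. The system message at the front is an assumption here, since that part of the method lies outside the hunk, and the model name and base64 string are placeholders.

```python
from typing import Dict, List, Optional


def prepare_messages(system_prompt: str,
                     context: List[Dict],
                     user_message: str,
                     images: Optional[List[str]] = None) -> List[Dict]:
    """Build a message list for an Ollama /api/chat request."""
    # Assumption: the system prompt is sent as the first message
    messages: List[Dict] = [{"role": "system", "content": system_prompt}]
    for msg in context:
        entry = {"role": msg["role"], "content": msg["content"]}
        if msg.get("images"):
            # Keep images that earlier turns carried
            entry["images"] = msg["images"]
        messages.append(entry)
    user_entry: Dict = {"role": "user", "content": user_message}
    if images:
        user_entry["images"] = images
    messages.append(user_entry)
    return messages


payload = {
    "model": "example-vision-model",  # placeholder name
    "messages": prepare_messages(
        "You are a helpful assistant.",
        [{"role": "assistant", "content": "Hello"}],
        "Describe this image",
        ["<base64>"],
    ),
    "stream": True,
}
print(payload["messages"][-1])
```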
+1
@@ -3,3 +3,4 @@ pyyaml>=6.0
requests>=2.31.0 requests>=2.31.0
werkzeug>=2.3.0 werkzeug>=2.3.0
jinja2>=3.1.0 jinja2>=3.1.0
Pillow>=10.0.0
+106 -3
@@ -64,6 +64,12 @@ header h1 {
margin-top: 10px; margin-top: 10px;
} }
#vision-badge {
color: #7ee787;
font-weight: 600;
margin-left: 10px;
}
.memory-status { .memory-status {
font-size: 0.8rem; font-size: 0.8rem;
color: var(--accent-soft); color: var(--accent-soft);
@@ -129,6 +135,22 @@ header h1 {
font-style: italic; font-style: italic;
} }
/* Images inside messages */
.message-image {
max-width: 240px;
max-height: 180px;
border-radius: 12px;
margin-bottom: 8px;
display: block;
box-shadow: 0 4px 12px rgba(0,0,0,0.3);
cursor: pointer;
transition: transform 0.2s;
}
.message-image:hover {
transform: scale(1.03);
}
/* Input Area */ /* Input Area */
.input-area { .input-area {
position: sticky; position: sticky;
@@ -151,9 +173,52 @@ header h1 {
50% { opacity: 1; } 50% { opacity: 1; }
} }
/* Image Preview above input */
.image-preview-container {
display: flex;
align-items: center;
gap: 10px;
margin-bottom: 10px;
padding: 8px;
background: var(--bg-secondary);
border: 1px solid var(--bg-tertiary);
border-radius: 12px;
width: fit-content;
max-width: 100%;
}
.image-preview {
max-height: 80px;
max-width: 120px;
border-radius: 8px;
object-fit: cover;
}
.remove-image-btn {
background: var(--accent);
color: white;
border: none;
border-radius: 50%;
width: 24px;
height: 24px;
font-size: 16px;
line-height: 24px;
cursor: pointer;
display: flex;
align-items: center;
justify-content: center;
transition: background 0.2s;
}
.remove-image-btn:hover {
background: #ff5a75;
}
/* Input Row */
.input-row { .input-row {
display: flex; display: flex;
gap: 10px; gap: 10px;
align-items: flex-start;
} }
textarea { textarea {
@@ -176,6 +241,32 @@ textarea:focus {
border-color: var(--accent); border-color: var(--accent);
} }
.upload-btn {
background: var(--bg-tertiary);
color: var(--text-primary);
border: none;
border-radius: 25px;
padding: 15px 18px;
font-size: 1.1rem;
cursor: pointer;
transition: all 0.3s;
height: 50px;
display: flex;
align-items: center;
justify-content: center;
}
.upload-btn:hover:not(:disabled) {
background: var(--accent);
transform: scale(1.05);
}
.upload-btn:disabled {
background: var(--bg-tertiary);
opacity: 0.5;
cursor: not-allowed;
}
.send-btn { .send-btn {
background: var(--accent); background: var(--accent);
color: white; color: white;
@@ -314,11 +405,23 @@ textarea:focus {
padding: 12px 15px; padding: 12px 15px;
} }
.input-row { .message-image {
flex-direction: column; max-width: 180px;
max-height: 140px;
} }
.send-btn { .input-row {
flex-wrap: wrap;
}
.upload-btn, .send-btn {
height: 44px;
padding: 10px 18px;
}
textarea {
width: 100%; width: 100%;
order: -1;
margin-bottom: 8px;
} }
} }
+74 -12
@@ -9,9 +9,9 @@
<body> <body>
<div class="container"> <div class="container">
<header> <header>
<h1>𓇢 {{ persona_name }}</h1> <h1>🜏 {{ persona_name }}</h1>
<div class="subtitle">Intimate AI Companion</div> <div class="subtitle">Intimate AI Companion</div>
<div class="model-info">Model: <span id="model-name">Loading...</span></div> <div class="model-info">Model: <span id="model-name">Loading...</span> | <span id="vision-badge" style="display:none;">👁 Vision Enabled</span></div>
<div class="memory-status" id="memory-status"></div> <div class="memory-status" id="memory-status"></div>
</header> </header>
@@ -23,8 +23,17 @@
<div class="input-area"> <div class="input-area">
<div class="typing-indicator" id="typing" style="display: none;">Nimue is thinking...</div> <div class="typing-indicator" id="typing" style="display: none;">Nimue is thinking...</div>
<!-- Image Preview Area -->
<div id="image-preview-container" class="image-preview-container" style="display: none;">
<img id="image-preview" class="image-preview" src="" alt="Preview">
<button onclick="removeImage()" class="remove-image-btn" title="Remove image">×</button>
</div>
<div class="input-row"> <div class="input-row">
<textarea id="user-input" placeholder="Command me..." maxlength="2000"></textarea> <textarea id="user-input" placeholder="Command me..." maxlength="2000"></textarea>
<input type="file" id="image-input" accept="image/*" style="display: none;">
<button id="upload-btn" class="upload-btn" title="Send image">📷</button>
<button id="send-btn" class="send-btn">Send</button> <button id="send-btn" class="send-btn">Send</button>
</div> </div>
<div class="controls"> <div class="controls">
@@ -50,17 +59,25 @@
const chatBox = document.getElementById('chat-box'); const chatBox = document.getElementById('chat-box');
const userInput = document.getElementById('user-input'); const userInput = document.getElementById('user-input');
const sendBtn = document.getElementById('send-btn'); const sendBtn = document.getElementById('send-btn');
const uploadBtn = document.getElementById('upload-btn');
const imageInput = document.getElementById('image-input');
const imagePreviewContainer = document.getElementById('image-preview-container');
const imagePreview = document.getElementById('image-preview');
const typing = document.getElementById('typing'); const typing = document.getElementById('typing');
const charCount = document.getElementById('char-count'); const charCount = document.getElementById('char-count');
let isGenerating = false; let isGenerating = false;
let currentMessageDiv = null; let currentMessageDiv = null;
let currentImageBase64 = null;
// Load config // Load config
fetch('/api/config') fetch('/api/config')
.then(r => r.json()) .then(r => r.json())
.then(data => { .then(data => {
document.getElementById('model-name').textContent = data.model; document.getElementById('model-name').textContent = data.model;
if (data.vision) {
document.getElementById('vision-badge').style.display = 'inline';
}
}); });
userInput.addEventListener('input', () => { userInput.addEventListener('input', () => {
@@ -75,11 +92,46 @@
}); });
sendBtn.addEventListener('click', sendMessage); sendBtn.addEventListener('click', sendMessage);
uploadBtn.addEventListener('click', () => imageInput.click());
function appendMessage(role, content) { imageInput.addEventListener('change', (e) => {
const file = e.target.files[0];
if (!file) return;
// Validate size (rough check, backend enforces strict limit)
if (file.size > 8 * 1024 * 1024) {
alert('Image too large. Maximum 8MB.');
imageInput.value = '';
return;
}
const reader = new FileReader();
reader.onload = (evt) => {
currentImageBase64 = evt.target.result; // data:image/...;base64,...
imagePreview.src = currentImageBase64;
imagePreviewContainer.style.display = 'flex';
};
reader.readAsDataURL(file);
});
function removeImage() {
currentImageBase64 = null;
imagePreview.src = '';
imagePreviewContainer.style.display = 'none';
imageInput.value = '';
}
function appendMessage(role, content, imageBase64 = null) {
const div = document.createElement('div'); const div = document.createElement('div');
div.className = `message ${role}`; div.className = `message ${role}`;
div.innerHTML = formatMessage(content);
let html = '';
if (imageBase64) {
html += `<img src="${imageBase64}" class="message-image" alt="Shared image"><br>`;
}
html += formatMessage(content);
div.innerHTML = html;
chatBox.appendChild(div); chatBox.appendChild(div);
chatBox.scrollTop = chatBox.scrollHeight; chatBox.scrollTop = chatBox.scrollHeight;
return div; return div;
@@ -103,26 +155,37 @@
if (isGenerating) return; if (isGenerating) return;
const message = userInput.value.trim(); const message = userInput.value.trim();
if (!message) return; if (!message && !currentImageBase64) return;
// Add user message // Add user message to chat immediately
appendMessage('user', message); appendMessage('user', message || '[Image]', currentImageBase64);
userInput.value = ''; userInput.value = '';
charCount.textContent = '0'; charCount.textContent = '0';
isGenerating = true; isGenerating = true;
typing.style.display = 'block'; typing.style.display = 'block';
sendBtn.disabled = true; sendBtn.disabled = true;
uploadBtn.disabled = true;
currentMessageDiv = document.createElement('div'); currentMessageDiv = document.createElement('div');
currentMessageDiv.className = 'message assistant'; currentMessageDiv.className = 'message assistant';
chatBox.appendChild(currentMessageDiv); chatBox.appendChild(currentMessageDiv);
// Prepare payload
const payload = { message: message };
if (currentImageBase64) {
payload.images = [currentImageBase64];
}
// Clear image after sending
const sentImage = currentImageBase64;
removeImage();
try { try {
const response = await fetch('/api/chat', { const response = await fetch('/api/chat', {
method: 'POST', method: 'POST',
headers: {'Content-Type': 'application/json'}, headers: {'Content-Type': 'application/json'},
body: JSON.stringify({message: message}) body: JSON.stringify(payload)
}); });
const reader = response.body.getReader(); const reader = response.body.getReader();
@@ -144,9 +207,6 @@
currentMessageDiv.innerHTML = formatMessage(fullText); currentMessageDiv.innerHTML = formatMessage(fullText);
chatBox.scrollTop = chatBox.scrollHeight; chatBox.scrollTop = chatBox.scrollHeight;
} }
if (line.startsWith('event: stats')) {
// Parse stats
}
} }
} }
@@ -156,6 +216,7 @@
isGenerating = false; isGenerating = false;
typing.style.display = 'none'; typing.style.display = 'none';
sendBtn.disabled = false; sendBtn.disabled = false;
uploadBtn.disabled = false;
getMemoryStats(); getMemoryStats();
} }
} }
@@ -199,7 +260,8 @@
let recent = '<h3>Recent Messages</h3>'; let recent = '<h3>Recent Messages</h3>';
for (const msg of data.recent) { for (const msg of data.recent) {
recent += `<p><strong>${msg.role}:</strong> ${msg.content}</p>`; const imgTag = msg.has_image ? ' 🖼️' : '';
recent += `<p><strong>${msg.role}${imgTag}:</strong> ${msg.content}</p>`;
} }
document.getElementById('memory-recent').innerHTML = recent; document.getElementById('memory-recent').innerHTML = recent;