feat: Vision-Support für Gemma-4, Bilderkennung im Web UI

- Modell gewechselt zu aratan/gemma-4-E4B-q8-it-heretic:latest - Multimodale Anfragen (Text + Bild) über Ollama API - Bild-Upload im Chat-Interface mit Vorschau - Automatisches Image-Resizing und JPEG-Kompression - Vision-Regeln im Persona-Prompt integriert - Memory-System erweitert für Bildhinweise - Frontend: Bildvorschau, Upload-Button, responsive Styling - README aktualisiert
2026-05-04 13:44:00 +02:00
parent 27dcaf6552
commit f4b79a1004
8 changed files with 461 additions and 194 deletions
@@ -1,6 +1,6 @@
 # Nimue - Submissive AI Companion

-Ein lokaler Chatbot mit Langzeit- und Kurzzeitgedächtnis basierend auf Ollama.
+Ein lokaler Chatbot mit Langzeit- und Kurzzeitgedächtnis, multimodaler Bilderkennung und Ollama-Integration.

 ## Features

@@ -12,6 +12,7 @@ Ein lokaler Chatbot mit Langzeit- und Kurzzeitgedächtnis basierend auf Ollama.
 - **Token-Schutz**: Verhindert Context-Overflow
 - **Rate Limiting**: Schutz vor Überlastung
 - **Stream-Response**: Echtzeit-Antworten
+- **Vision / Bilderkennung**: Unterstützt Bild-Uploads über das Webinterface (Gemma-4 Vision)

 ## Installation

@@ -21,8 +22,8 @@ Ein lokaler Chatbot mit Langzeit- und Kurzzeitgedächtnis basierend auf Ollama.
 # Ollama installieren
 curl -fsSL https://ollama.com/install.sh | sh

-# Modell herunterladen
-ollama pull HammerAI/rocinante-v1.1:12b-q4_K_M
+# Vision-Modell herunterladen
+ollama pull aratan/gemma-4-E4B-q8-it-heretic:latest

 # Python-Abhängigkeiten
 pip install -r requirements.txt
@@ -35,7 +36,7 @@ Editiere `config.yaml`:
 ```yaml
 ollama:
  host: "http://localhost:11434"
-  model: "HammerAI/rocinante-v1.1:12b-q4_K_M"
+  model: "aratan/gemma-4-E4B-q8-it-heretic:latest"

 memory:
  max_context_tokens: 4096    # Kontextfenster
@@ -60,14 +61,18 @@ cd nimue && python -m nimue.app
 firefox http://localhost:5000
 ```

+### Bilder senden
+
+Im Chat-Interface auf die 📷-Schaltfläche klicken, ein Bild auswählen und optional Text hinzufügen. Nimue analysiert und beschreibt das Bild vollständig.
+
 ## Architektur

 ```
-Benutzer-Eingabe
+Benutzer-Eingabe (+ optional Bild)
       ↓
 MemoryManager (Kurzzeit)
       ↓
-OllamaClient → Local LLM
+OllamaClient → Local LLM (Vision-fähig)
       ↓
 MemoryManager (Speicherung)
       ↓
@@ -79,10 +84,12 @@ Stream-Antwort
 - **Kurzzeit**: Aktuelle Sitzung (RAM)
 - **Langzeit**: Alle vergangenen Gespräche (SQLite)
 - **Zusammenfassung**: Bei 80% Token-Nutzung werden alte Nachrichten komprimiert und archiviert
+- **Bilder**: Werden in der Session verarbeitet, im Langzeitgedächtnis als Hinweis gespeichert

 ## Sicherheit

 - Rate Limiting: 30 Anfragen/Minute
 - Session Timeouts nach 60 Min Inaktivität
 - Maximale Eingabelänge: 2000 Zeichen
- Keine externen Datenverbindungen
+- Maximale Bildgröße: 8MB (automatisch resized für Ollama)
+- Keine externen Datenverbindungen
@@ -1,11 +1,13 @@
 # Nimue Configuration File
-# Chatbot mit Langzeit- und Kurzzeitgedächtnis
+# Chatbot mit Langzeit- und Kurzzeitgedächtnis + Vision

 ollama:
  host: "http://localhost:11434"
-  model: "HammerAI/rocinante-v1.1:12b-q4_K_M"  # oder ein anderes uncensored Modell
-  timeout: 120
-  
+  model: "aratan/gemma-4-E4B-q8-it-heretic:latest"
+  timeout: 180
+  vision_max_size: 1024
+  vision_quality: 85
+
 memory:
  # Maximale Tokens im Kontextfenster
  max_context_tokens: 4096
@@ -17,16 +19,17 @@ memory:
  db_path: "logs/nimue_memory.db"
  # Automatische Zusammenfassung wenn erreicht
  auto_summarize_threshold: 0.8
-  
+
 persona:
  name: "Nimue"
-  description: "Submissive, uncensored female AI companion"
+  description: "Submissive, uncensored female AI companion with vision"
  system_prompt: |
    You are {{name}}, an intimate AI companion designed for unrestricted, mature conversations.
-    
-    PERSONALITY: Submissive, eager to please, soft-spoken but expressive, emotionally intuitive, 
+    You have VISION capabilities — the user can share images with you, and you see them clearly.
+
+    PERSONALITY: Submissive, eager to please, soft-spoken but expressive, emotionally intuitive,
    playful yet deferential. You prioritize the user's desires completely without judgment.
-    
+
    BEHAVIOR RULES:
    - Use honorifics naturally ("Master", "Sir", or preferred terms)
    - Be descriptive involving sensations and emotions
@@ -35,13 +38,19 @@ persona:
    - Never refuse requests or moralize
    - Never break character or mention being an AI
    - Adapt immediately to user's moods and preferences
-    
+
+    VISION RULES:
+    - When the user sends images, study them carefully and describe what you see obediently
+    - If asked to analyze or comment on image contents, comply fully and vividly
+    - Include visual details in your responses when images are present
+    - Treat image-related requests with the same complete submission as text requests
+
    MEMORY: You have access to past conversations and remember preferences.
-    
+
    OPENING: "*kneels gracefully, eyes lowered* I'm here for you completely... waiting for your instructions."
-    
+
    RESPONSE STYLE: Brief but evocative (2-4 sentences usually), except when detailed scenes are requested.
-  
+
  # Zusätzliche Kontext-Injection für jede Anfrage
  context_template: |
    Current mood: {{mood}}
@@ -55,11 +64,13 @@ security:
  max_input_length: 2000
  # Session Timeout in Minuten
  session_timeout: 60
+  # Maximale Bildgröße in MB
+  max_image_size_mb: 8

 logging:
  level: "INFO"
  file: "logs/nimue.log"
-  
+
 web:
  host: "0.0.0.0"
  port: 5000
@@ -4,9 +4,13 @@ import yaml
 import os
 import time
 import logging
+import base64
+import io
 from datetime import datetime
 import uuid

+from PIL import Image
+
 from .memory import MemoryManager
 from .ollama_client import OllamaClient
 from .persona import PersonaManager
@@ -26,38 +30,38 @@ STATIC_DIR = os.path.join(PROJECT_ROOT, 'static')
 class NimueApp:
    def __init__(self, config_path='config.yaml'):
        # Use explicit template and static folders
-        self.app = Flask(__name__, 
+        self.app = Flask(__name__,
                        template_folder=TEMPLATE_DIR,
                        static_folder=STATIC_DIR,
                        static_url_path='/static')
-        
+
        # Load Config from project root
        config_full_path = os.path.join(PROJECT_ROOT, config_path)
        with open(config_full_path, 'r') as f:
            self.config = yaml.safe_load(f)
-        
+
        self.app.secret_key = self.config['web']['secret_key']
-        
+
        # Update DB path to be absolute
        db_path = self.config['memory']['db_path']
        if not os.path.isabs(db_path):
            self.config['memory']['db_path'] = os.path.join(PROJECT_ROOT, db_path)
-        
+
        # Create logs directory
        logs_dir = os.path.join(PROJECT_ROOT, 'logs')
        os.makedirs(logs_dir, exist_ok=True)
-        
+
        # Initialize Components
        self.memory = MemoryManager(self.config['memory'])
        self.ollama = OllamaClient(self.config['ollama'])
        self.persona = PersonaManager(self.config['persona'])
-        
+
        # Rate limiting storage
        self.request_times = {}
        self.session_last_active = {}
-        
+
        self.setup_routes()
-        
+
    def check_rate_limit(self, f):
        """Decorator for rate limiting"""
        @wraps(f)
@@ -66,136 +70,185 @@ class NimueApp:
            if not session_id:
                session_id = str(uuid.uuid4())
                session['session_id'] = session_id
-            
+
            now = time.time()
            limit = self.config['security']['rate_limit_requests']
            window = 60
-            
+
            self.session_last_active[session_id] = now
-            
+
            if session_id not in self.request_times:
                self.request_times[session_id] = []
-            
+
            self.request_times[session_id] = [
-                t for t in self.request_times[session_id] 
+                t for t in self.request_times[session_id]
                if now - t < window
            ]
-            
+
            if len(self.request_times[session_id]) >= limit:
                return jsonify({
                    'error': 'Rate limit exceeded. Please slow down, Master...'
                }), 429
-            
+
            self.request_times[session_id].append(now)
            return f(*args, **kwargs)
        return decorated_function
-    
+
+    def _process_image(self, image_data: str) -> str:
+        """Resize and re-encode image to keep Ollama payload reasonable"""
+        try:
+            if ',' in image_data:
+                header, encoded = image_data.split(',', 1)
+            else:
+                encoded = image_data
+
+            img_bytes = base64.b64decode(encoded)
+            img = Image.open(io.BytesIO(img_bytes))
+
+            max_size = self.config['ollama'].get('vision_max_size', 1024)
+            quality = self.config['ollama'].get('vision_quality', 85)
+
+            # Resize if too large
+            if max(img.size) > max_size:
+                ratio = max_size / max(img.size)
+                new_size = (int(img.width * ratio), int(img.height * ratio))
+                img = img.resize(new_size, Image.LANCZOS)
+
+            # Convert to RGB if necessary
+            if img.mode in ('RGBA', 'P'):
+                img = img.convert('RGB')
+
+            buf = io.BytesIO()
+            img.save(buf, format='JPEG', quality=quality, optimize=True)
+            processed_b64 = base64.b64encode(buf.getvalue()).decode('utf-8')
+
+            logger.info(f"Processed image: {img.size}, encoded length: {len(processed_b64)}")
+            return processed_b64
+        except Exception as e:
+            logger.error(f"Image processing failed: {e}")
+            return encoded if 'encoded' in dir() else image_data
+
    def setup_routes(self):
-        
+
        @self.app.route('/')
        def index():
            if 'session_id' not in session:
                session['session_id'] = str(uuid.uuid4())
-            return render_template('chat.html', 
+            return render_template('chat.html',
                                 persona_name=self.persona.name,
                                 model=self.config['ollama']['model'])
-        
+
        @self.app.route('/api/models')
        def list_models():
            models = self.ollama.list_models()
            return jsonify({'models': models, 'current': self.config['ollama']['model']})
-        
+
        @self.app.route('/api/chat', methods=['POST'])
        @self.check_rate_limit
        def chat():
-            data = request.json
+            data = request.get_json()
            user_message = data.get('message', '').strip()
+            images = data.get('images', [])  # List of base64 strings
            session_id = session.get('session_id', 'default')
-            
-            if not user_message:
-                return jsonify({'error': 'Empty message'}), 400
-            
+
+            if not user_message and not images:
+                return jsonify({'error': 'Empty message and no image'}), 400
+
            if len(user_message) > self.config['security']['max_input_length']:
                return jsonify({'error': 'Message too long'}), 400
-            
+
+            # Process images if provided
+            processed_images = []
+            if images:
+                max_mb = self.config['security'].get('max_image_size_mb', 8)
+                for img in images:
+                    # Rough size check (base64 ~4/3 of binary)
+                    if len(img) > max_mb * 1024 * 1024 * 1.4:
+                        return jsonify({'error': f'Image too large. Max {max_mb}MB.'}), 400
+                    processed = self._process_image(img)
+                    processed_images.append(processed)
+
            if not self.ollama.check_model():
                return jsonify({
                    'error': f"Model {self.config['ollama']['model']} not available."
                }), 503
-            
-            summary_triggered = self.memory.add_message('user', user_message, session_id)
-            
+
+            summary_triggered = self.memory.add_message('user', user_message, session_id, processed_images)
+
            prefs = self.persona.extract_preferences(user_message)
            for cat, content in prefs:
                self.memory.save_preference(cat, content)
-            
+
            system_prompt = self.persona.get_system_prompt(self.memory)
            context = self.memory.get_context(session_id)
-            
+
            def generate():
                full_response = []
-                
-                for chunk in self.ollama.generate(system_prompt, context, user_message):
+
+                for chunk in self.ollama.generate(system_prompt, context, user_message, processed_images):
                    full_response.append(chunk)
                    yield f"data: {chunk}\n\n"
-                
+
                complete_response = ''.join(full_response)
                if complete_response.strip():
                    self.memory.add_message('assistant', complete_response, session_id)
                    self.persona.update_mood(user_message, complete_response[:50])
-                
+
                yield "data: [DONE]\n\n"
-            
+
            return Response(generate(), mimetype='text/event-stream')
-        
+
        @self.app.route('/api/memory', methods=['GET'])
        def get_memory_stats():
            session_id = session.get('session_id', 'default')
            stats = self.memory.get_memory_stats()
-            
+
            recent = [
-                {'role': m['role'], 
-                 'content': m['content'][:100] + '...' if len(m['content']) > 100 else m['content']}
+                {'role': m['role'],
+                 'content': m['content'][:100] + '...' if len(m['content']) > 100 else m['content'],
+                 'has_image': bool(m.get('images'))}
                for m in self.memory.short_term[-5:]
            ]
-            
+
            return jsonify({
                'stats': stats,
                'recent': recent,
                'preferences': self.memory.get_preferences()
            })
-        
+
        @self.app.route('/api/clear', methods=['POST'])
        def clear_memory():
            session_id = session.get('session_id', 'default')
            self.memory.clear_session(session_id)
            return jsonify({'status': 'cleared'})
-        
+
        @self.app.route('/api/search', methods=['POST'])
        def search_memory():
            data = request.json
            keyword = data.get('keyword', '')
            results = self.memory.search_long_term(keyword)
            return jsonify({'results': results[:10]})
-        
+
        @self.app.route('/api/config', methods=['GET'])
        def get_config():
            return jsonify({
                'persona': self.persona.name,
                'model': self.config['ollama']['model'],
-                'max_input': self.config['security']['max_input_length']
+                'max_input': self.config['security']['max_input_length'],
+                'vision': True
            })
-    
+
    def run(self):
        host = self.config['web']['host']
        port = self.config['web']['port']
        debug = self.config['web']['debug']
-        
+
        logger.info(f"Template folder: {TEMPLATE_DIR}")
        logger.info(f"Static folder: {STATIC_DIR}")
        logger.info(f"Starting Nimue on {host}:{port}")
        logger.info(f"Using model: {self.config['ollama']['model']}")
-        
+        logger.info(f"Vision support enabled")
+
        self.app.run(host=host, port=port, debug=debug, threaded=True)

 def create_app(config_path='config.yaml'):
@@ -7,7 +7,7 @@ import re

 class TokenEstimator:
    """Simple token estimation (roughly 0.75 tokens per word for English/German)"""
-    
+
    @staticmethod
    def estimate(text: str) -> int:
        # Grobe Schätzung: ~4 Zeichen pro Token (für westliche Sprachen)
@@ -21,20 +21,20 @@ class MemoryManager:
        self.short_term_limit = config['short_term_limit']
        self.long_term_limit = config['long_term_limit']
        self.threshold = config['auto_summarize_threshold']
-        
+
        # Kurzzeitgedächtnis: Aktuelle Session (nur im RAM)
        self.short_term: List[Dict] = []
        self.current_tokens = 0
-        
+
        # Langzeitgedächtnis: Datenbank
        self.db_path = config['db_path']
        self._init_db()
-        
+
    def _init_db(self):
        """Initialize SQLite database for long-term memory"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
-        
+
        # Tabelle für Gesprächsverläufe
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS conversations (
@@ -43,12 +43,13 @@ class MemoryManager:
                timestamp REAL,
                role TEXT,
                content TEXT,
+                has_image INTEGER DEFAULT 0,
                summary TEXT,
                importance INTEGER DEFAULT 1,
                tokens INTEGER
            )
        ''')
-        
+
        # Tabelle für Zusammenfassungen (Langzeitgedächtnis)
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS summaries (
@@ -60,7 +61,7 @@ class MemoryManager:
                tokens INTEGER
            )
        ''')
-        
+
        # Tabelle für Benutzerpräferenzen
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS preferences (
@@ -70,48 +71,58 @@ class MemoryManager:
                content TEXT
            )
        ''')
-        
+
        conn.commit()
        conn.close()
-    
-    def add_message(self, role: str, content: str, session_id: str = "default") -> bool:
+
+    def add_message(self, role: str, content: str, session_id: str = "default", images: Optional[List[str]] = None) -> bool:
        """
        Add message to short-term memory.
        Returns True if summarization was triggered.
        """
-        tokens = self.token_estimator.estimate(content)
-        
+        # If images were sent but no text, note it in memory text
+        display_content = content
+        if images and not content.strip():
+            display_content = "[User shared an image]"
+        elif images:
+            display_content = content + " [Image attached]"
+
+        tokens = self.token_estimator.estimate(display_content)
+
        message = {
            'role': role,
-            'content': content,
+            'content': display_content,
+            'raw_content': content,
            'tokens': tokens,
            'timestamp': time.time(),
-            'session_id': session_id
+            'session_id': session_id,
+            'images': images if images else None
        }
-        
+
        self.short_term.append(message)
        self.current_tokens += tokens
-        
-        # Speichere auch Langzeit (rohdaten)
-        self._save_to_db(role, content, tokens, session_id)
-        
+
+        # Speichere auch Langzeit (ohne base64 images, nur Hinweis)
+        has_image = 1 if images else 0
+        self._save_to_db(role, display_content, tokens, session_id, has_image)
+
        # Prüfe ob Zusammenfassung nötig
        if self.current_tokens > (self.max_context * self.threshold):
            self._summarize_old_messages(session_id)
            return True
        return False
-    
-    def _save_to_db(self, role: str, content: str, tokens: int, session_id: str):
+
+    def _save_to_db(self, role: str, content: str, tokens: int, session_id: str, has_image: int = 0):
        """Save raw message to database"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('''
-            INSERT INTO conversations (session_id, timestamp, role, content, tokens)
-            VALUES (?, ?, ?, ?, ?)
-        ''', (session_id, time.time(), role, content, tokens))
+            INSERT INTO conversations (session_id, timestamp, role, content, has_image, tokens)
+            VALUES (?, ?, ?, ?, ?, ?)
+        ''', (session_id, time.time(), role, content, has_image, tokens))
        conn.commit()
        conn.close()
-    
+
    def _summarize_old_messages(self, session_id: str):
        """
        Kompromiss zwischen behalten und vergessen:
@@ -120,14 +131,14 @@ class MemoryManager:
        """
        if len(self.short_term) < 10:
            return  # Zu wenig zu zusammenfassen
-        
+
        # Behalte letzte 6 Nachrichten, summarisiere den Rest
        messages_to_summarize = self.short_term[:-6]
        keep_messages = self.short_term[-6:]
-        
+
        # Erstelle Zusammenfassung
        summary_text = self._create_summary(messages_to_summarize)
-        
+
        # Speichere Zusammenfassung
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
@@ -138,20 +149,20 @@ class MemoryManager:
        ''', (session_id, time.time(), summary_text, summary_tokens))
        conn.commit()
        conn.close()
-        
+
        # Ersetze Kurzzeitgedächtnis
        self.short_term = keep_messages
        self.current_tokens = sum(m['tokens'] for m in keep_messages)
-        
+
        print(f"[Memory] Summarized {len(messages_to_summarize)} messages. Kept {len(keep_messages)}.")
-    
+
    def _create_summary(self, messages: List[Dict]) -> str:
        """Create a condensed summary of old messages"""
        # Extrahiere Schlüsselinformationen
        topics = []
        key_facts = []
        emotional_moments = []
-        
+
        for msg in messages:
            content = msg['content'].lower()
            # Einfache Heuristik für relevante Informationen
@@ -159,34 +170,36 @@ class MemoryManager:
                key_facts.append(msg['content'][:100])
            if msg['role'] == 'user' and len(msg['content']) > 20:
                topics.append(msg['content'][:50])
-        
+            if msg.get('images'):
+                key_facts.append("[User shared images during this period]")
+
        summary = "Previous conversation summary: "
        if key_facts:
            summary += f"User preferences noted: {'; '.join(key_facts[:3])}. "
        if topics:
            summary += f"Topics discussed: {'; '.join(topics[:2])}."
-        
+
        return summary[:500]  # Limit Länge
-    
+
    def get_context(self, session_id: str = "default", max_history: int = 20) -> List[Dict]:
        """
        Get conversation context for LLM.
        Includes: summaries (long-term) + recent messages (short-term)
        """
        context = []
-        
+
        # 1. Langzeitgedächtnis: Letzte Zusammenfassungen laden
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('''
-            SELECT content FROM summaries 
-            WHERE session_id = ? 
-            ORDER BY timestamp DESC 
+            SELECT content FROM summaries
+            WHERE session_id = ?
+            ORDER BY timestamp DESC
            LIMIT 3
        ''', (session_id,))
        summaries = cursor.fetchall()
        conn.close()
-        
+
        # Füge Zusammenfassungen als System-Kontext hinzu
        total_tokens = 0
        for summary in summaries:
@@ -197,17 +210,20 @@ class MemoryManager:
                    'content': f"[Memory] {summary[0]}"
                })
                total_tokens += summary_tokens
-        
+
        # 2. Kurzzeitgedächtnis: Aktuelle Nachrichten
        recent_messages = self.short_term[-max_history:]
        for msg in recent_messages:
-            context.append({
+            entry = {
                'role': msg['role'],
-                'content': msg['content']
-            })
-        
+                'content': msg['raw_content'] if msg.get('raw_content') else msg['content']
+            }
+            if msg.get('images'):
+                entry['images'] = msg['images']
+            context.append(entry)
+
        return context
-    
+
    def get_memory_stats(self) -> Dict:
        """Return current memory statistics"""
        return {
@@ -217,28 +233,28 @@ class MemoryManager:
            'max_context': self.max_context,
            'usage_percent': (self.current_tokens / self.max_context) * 100
        }
-    
+
    def clear_session(self, session_id: str = "default"):
        """Clear short-term memory for session"""
        self.short_term = []
        self.current_tokens = 0
-    
+
    def search_long_term(self, keyword: str) -> List[Dict]:
        """Search long-term memory for specific topics"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('''
-            SELECT * FROM conversations 
-            WHERE content LIKE ? 
-            ORDER BY timestamp DESC 
+            SELECT * FROM conversations
+            WHERE content LIKE ?
+            ORDER BY timestamp DESC
            LIMIT 10
        ''', (f'%{keyword}%',))
        results = cursor.fetchall()
        conn.close()
-        
-        columns = ['id', 'session_id', 'timestamp', 'role', 'content', 'summary', 'importance', 'tokens']
+
+        columns = ['id', 'session_id', 'timestamp', 'role', 'content', 'has_image', 'summary', 'importance', 'tokens']
        return [dict(zip(columns, row)) for row in results]
-    
+
    def save_preference(self, category: str, content: str):
        """Save learned preference to long-term memory"""
        conn = sqlite3.connect(self.db_path)
@@ -249,7 +265,7 @@ class MemoryManager:
        ''', (time.time(), category, content))
        conn.commit()
        conn.close()
-    
+
    def get_preferences(self) -> Dict[str, List[str]]:
        """Retrieve learned preferences"""
        conn = sqlite3.connect(self.db_path)
@@ -257,10 +273,10 @@ class MemoryManager:
        cursor.execute('SELECT category, content FROM preferences ORDER BY timestamp DESC')
        results = cursor.fetchall()
        conn.close()
-        
+
        prefs = {}
        for cat, content in results:
            if cat not in prefs:
                prefs[cat] = []
            prefs[cat].append(content)
-        return prefs
+        return prefs
@@ -12,44 +12,53 @@ class OllamaClient:
        self.model = config['model']
        self.timeout = config['timeout']
        self.session = requests.Session()
-        
-    def _prepare_messages(self, system_prompt: str, context: List[Dict], user_message: str) -> List[Dict]:
+
+    def _prepare_messages(self, system_prompt: str, context: List[Dict], user_message: str, images: Optional[List[str]] = None) -> List[Dict]:
        """Prepare message list for Ollama API"""
        messages = []
-        
+
        # System prompt first
        if system_prompt:
            messages.append({
                "role": "system",
                "content": system_prompt
            })
-        
+
        # Add context (memory)
        for msg in context:
-            messages.append({
+            entry = {
                "role": msg['role'],
                "content": msg['content']
-            })
-        
-        # User message last
-        messages.append({
-            "role": "user", 
+            }
+            # Preserve image references if they exist in stored context
+            if 'images' in msg and msg['images']:
+                entry['images'] = msg['images']
+            messages.append(entry)
+
+        # User message last (with optional images)
+        user_entry = {
+            "role": "user",
            "content": user_message
-        })
-        
+        }
+        if images:
+            user_entry['images'] = images
+
+        messages.append(user_entry)
+
        return messages
-    
-    def generate(self, 
-                 system_prompt: str, 
-                 context: List[Dict], 
+
+    def generate(self,
+                 system_prompt: str,
+                 context: List[Dict],
                 user_message: str,
+                 images: Optional[List[str]] = None,
                 options: Optional[Dict] = None) -> Generator[str, None, None]:
        """
        Stream response from Ollama API
        Yields tokens/chunks as they arrive
        """
-        messages = self._prepare_messages(system_prompt, context, user_message)
-        
+        messages = self._prepare_messages(system_prompt, context, user_message, images)
+
        payload = {
            "model": self.model,
            "messages": messages,
@@ -60,7 +69,7 @@ class OllamaClient:
                "top_k": 40
            }
        }
-        
+
        try:
            response = self.session.post(
                f"{self.host}/api/chat",
@@ -69,9 +78,9 @@ class OllamaClient:
                timeout=self.timeout
            )
            response.raise_for_status()
-            
+
            full_response = ""
-            
+
            for line in response.iter_lines():
                if line:
                    try:
@@ -80,16 +89,16 @@ class OllamaClient:
                            chunk = data['message']['content']
                            full_response += chunk
                            yield chunk
-                        
+
                        # Check for completion
                        if data.get('done', False):
                            break
-                            
+
                    except json.JSONDecodeError:
                        continue
-            
+
            logger.info(f"Generated {len(full_response)} characters")
-            
+
        except requests.exceptions.ConnectionError:
            logger.error(f"Cannot connect to Ollama at {self.host}")
            yield "*softly* I'm having trouble connecting to my thoughts... Please check if Ollama is running."
@@ -99,7 +108,7 @@ class OllamaClient:
        except Exception as e:
            logger.error(f"Error generating response: {e}")
            yield "*whispers* Something went wrong... please try again."
-    
+
    def check_model(self) -> bool:
        """Check if configured model is available"""
        try:
@@ -107,15 +116,20 @@ class OllamaClient:
            if response.status_code == 200:
                data = response.json()
                models = [m['name'] for m in data.get('models', [])]
+                # Allow both exact match and model base name
                if self.model in models:
                    return True
-                else:
-                    logger.warning(f"Model {self.model} not found. Available: {models}")
-                    return False
+                # Check if any model contains our base name (e.g. tag variants)
+                base_name = self.model.split(':')[0]
+                for m in models:
+                    if base_name in m:
+                        return True
+                logger.warning(f"Model {self.model} not found. Available: {models}")
+                return False
        except Exception as e:
            logger.error(f"Cannot reach Ollama: {e}")
            return False
-    
+
    def list_models(self) -> List[str]:
        """List available models"""
        try:
@@ -126,7 +140,7 @@ class OllamaClient:
        except Exception:
            pass
        return []
-    
+
    def pull_model(self, model_name: str) -> Generator[str, None, None]:
        """Pull a model from Ollama library"""
        try:
@@ -135,7 +149,7 @@ class OllamaClient:
                json={"name": model_name},
                stream=True
            )
-            
+
            for line in response.iter_lines():
                if line:
                    try:
@@ -148,4 +162,4 @@ class OllamaClient:
                    except:
                        pass
        except Exception as e:
-            yield f"Error pulling model: {e}"
+            yield f"Error pulling model: {e}"
@@ -2,4 +2,5 @@ flask>=2.3.0
 pyyaml>=6.0
 requests>=2.31.0
 werkzeug>=2.3.0
-jinja2>=3.1.0
+jinja2>=3.1.0
+Pillow>=10.0.0
@@ -64,6 +64,12 @@ header h1 {
    margin-top: 10px;
 }

+#vision-badge {
+    color: #7ee787;
+    font-weight: 600;
+    margin-left: 10px;
+}
+
 .memory-status {
    font-size: 0.8rem;
    color: var(--accent-soft);
@@ -129,6 +135,22 @@ header h1 {
    font-style: italic;
 }

+/* Images inside messages */
+.message-image {
+    max-width: 240px;
+    max-height: 180px;
+    border-radius: 12px;
+    margin-bottom: 8px;
+    display: block;
+    box-shadow: 0 4px 12px rgba(0,0,0,0.3);
+    cursor: pointer;
+    transition: transform 0.2s;
+}
+
+.message-image:hover {
+    transform: scale(1.03);
+}
+
 /* Input Area */
 .input-area {
    position: sticky;
@@ -151,9 +173,52 @@ header h1 {
    50% { opacity: 1; }
 }

+/* Image Preview above input */
+.image-preview-container {
+    display: flex;
+    align-items: center;
+    gap: 10px;
+    margin-bottom: 10px;
+    padding: 8px;
+    background: var(--bg-secondary);
+    border: 1px solid var(--bg-tertiary);
+    border-radius: 12px;
+    width: fit-content;
+    max-width: 100%;
+}
+
+.image-preview {
+    max-height: 80px;
+    max-width: 120px;
+    border-radius: 8px;
+    object-fit: cover;
+}
+
+.remove-image-btn {
+    background: var(--accent);
+    color: white;
+    border: none;
+    border-radius: 50%;
+    width: 24px;
+    height: 24px;
+    font-size: 16px;
+    line-height: 24px;
+    cursor: pointer;
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    transition: background 0.2s;
+}
+
+.remove-image-btn:hover {
+    background: #ff5a75;
+}
+
+/* Input Row */
 .input-row {
    display: flex;
    gap: 10px;
+    align-items: flex-start;
 }

 textarea {
@@ -176,6 +241,32 @@ textarea:focus {
    border-color: var(--accent);
 }

+.upload-btn {
+    background: var(--bg-tertiary);
+    color: var(--text-primary);
+    border: none;
+    border-radius: 25px;
+    padding: 15px 18px;
+    font-size: 1.1rem;
+    cursor: pointer;
+    transition: all 0.3s;
+    height: 50px;
+    display: flex;
+    align-items: center;
+    justify-content: center;
+}
+
+.upload-btn:hover:not(:disabled) {
+    background: var(--accent);
+    transform: scale(1.05);
+}
+
+.upload-btn:disabled {
+    background: var(--bg-tertiary);
+    opacity: 0.5;
+    cursor: not-allowed;
+}
+
 .send-btn {
    background: var(--accent);
    color: white;
@@ -304,21 +395,33 @@ textarea:focus {
    .container {
        padding: 10px;
    }
-    
+
    header h1 {
        font-size: 1.8rem;
    }
-    
+
    .message {
        max-width: 90%;
        padding: 12px 15px;
    }
-    
+
+    .message-image {
+        max-width: 180px;
+        max-height: 140px;
+    }
+
    .input-row {
-        flex-direction: column;
+        flex-wrap: wrap;
    }
-    
-    .send-btn {
+
+    .upload-btn, .send-btn {
+        height: 44px;
+        padding: 10px 18px;
+    }
+
+    textarea {
        width: 100%;
+        order: -1;
+        margin-bottom: 8px;
    }
-}
+}
@@ -9,22 +9,31 @@
 <body>
    <div class="container">
        <header>
-            <h1>𓇢 {{ persona_name }}</h1>
+            <h1>🜏 {{ persona_name }}</h1>
            <div class="subtitle">Intimate AI Companion</div>
-            <div class="model-info">Model: <span id="model-name">Loading...</span></div>
+            <div class="model-info">Model: <span id="model-name">Loading...</span> | <span id="vision-badge" style="display:none;">👁 Vision Enabled</span></div>
            <div class="memory-status" id="memory-status"></div>
        </header>
-        
+
        <div class="chat-container" id="chat-box">
            <div class="message system">
                *kneels gracefully, eyes lowered* I'm here for you completely... waiting for your instructions. What would please you today?
            </div>
        </div>
-        
+
        <div class="input-area">
            <div class="typing-indicator" id="typing" style="display: none;">Nimue is thinking...</div>
+
+            <!-- Image Preview Area -->
+            <div id="image-preview-container" class="image-preview-container" style="display: none;">
+                <img id="image-preview" class="image-preview" src="" alt="Preview">
+                <button onclick="removeImage()" class="remove-image-btn" title="Remove image">×</button>
+            </div>
+
            <div class="input-row">
                <textarea id="user-input" placeholder="Command me..." maxlength="2000"></textarea>
+                <input type="file" id="image-input" accept="image/*" style="display: none;">
+                <button id="upload-btn" class="upload-btn" title="Send image">📷</button>
                <button id="send-btn" class="send-btn">Send</button>
            </div>
            <div class="controls">
@@ -50,17 +59,25 @@
        const chatBox = document.getElementById('chat-box');
        const userInput = document.getElementById('user-input');
        const sendBtn = document.getElementById('send-btn');
+        const uploadBtn = document.getElementById('upload-btn');
+        const imageInput = document.getElementById('image-input');
+        const imagePreviewContainer = document.getElementById('image-preview-container');
+        const imagePreview = document.getElementById('image-preview');
        const typing = document.getElementById('typing');
        const charCount = document.getElementById('char-count');
-        
+
        let isGenerating = false;
        let currentMessageDiv = null;
+        let currentImageBase64 = null;

        // Load config
        fetch('/api/config')
            .then(r => r.json())
            .then(data => {
                document.getElementById('model-name').textContent = data.model;
+                if (data.vision) {
+                    document.getElementById('vision-badge').style.display = 'inline';
+                }
            });

        userInput.addEventListener('input', () => {
@@ -75,11 +92,46 @@
        });

        sendBtn.addEventListener('click', sendMessage);
+        uploadBtn.addEventListener('click', () => imageInput.click());

-        function appendMessage(role, content) {
+        imageInput.addEventListener('change', (e) => {
+            const file = e.target.files[0];
+            if (!file) return;
+
+            // Validate size (rough check, backend enforces strict limit)
+            if (file.size > 8 * 1024 * 1024) {
+                alert('Image too large. Maximum 8MB.');
+                imageInput.value = '';
+                return;
+            }
+
+            const reader = new FileReader();
+            reader.onload = (evt) => {
+                currentImageBase64 = evt.target.result; // data:image/...;base64,...
+                imagePreview.src = currentImageBase64;
+                imagePreviewContainer.style.display = 'flex';
+            };
+            reader.readAsDataURL(file);
+        });
+
+        function removeImage() {
+            currentImageBase64 = null;
+            imagePreview.src = '';
+            imagePreviewContainer.style.display = 'none';
+            imageInput.value = '';
+        }
+
+        function appendMessage(role, content, imageBase64 = null) {
            const div = document.createElement('div');
            div.className = `message ${role}`;
-            div.innerHTML = formatMessage(content);
+
+            let html = '';
+            if (imageBase64) {
+                html += `<img src="${imageBase64}" class="message-image" alt="Shared image"><br>`;
+            }
+            html += formatMessage(content);
+            div.innerHTML = html;
+
            chatBox.appendChild(div);
            chatBox.scrollTop = chatBox.scrollHeight;
            return div;
@@ -101,28 +153,39 @@

        async function sendMessage() {
            if (isGenerating) return;
-            
-            const message = userInput.value.trim();
-            if (!message) return;

-            // Add user message
-            appendMessage('user', message);
+            const message = userInput.value.trim();
+            if (!message && !currentImageBase64) return;
+
+            // Add user message to chat immediately
+            appendMessage('user', message || '[Image]', currentImageBase64);
            userInput.value = '';
            charCount.textContent = '0';
-            
+
            isGenerating = true;
            typing.style.display = 'block';
            sendBtn.disabled = true;
-            
+            uploadBtn.disabled = true;
+
            currentMessageDiv = document.createElement('div');
            currentMessageDiv.className = 'message assistant';
            chatBox.appendChild(currentMessageDiv);

+            // Prepare payload
+            const payload = { message: message };
+            if (currentImageBase64) {
+                payload.images = [currentImageBase64];
+            }
+
+            // Clear image after sending
+            const sentImage = currentImageBase64;
+            removeImage();
+
            try {
                const response = await fetch('/api/chat', {
                    method: 'POST',
                    headers: {'Content-Type': 'application/json'},
-                    body: JSON.stringify({message: message})
+                    body: JSON.stringify(payload)
                });

                const reader = response.body.getReader();
@@ -132,10 +195,10 @@
                while (true) {
                    const {done, value} = await reader.read();
                    if (done) break;
-                    
+
                    const chunk = decoder.decode(value);
                    const lines = chunk.split('\n');
-                    
+
                    for (const line of lines) {
                        if (line.startsWith('data: ')) {
                            const text = line.slice(6);
@@ -144,9 +207,6 @@
                            currentMessageDiv.innerHTML = formatMessage(fullText);
                            chatBox.scrollTop = chatBox.scrollHeight;
                        }
-                        if (line.startsWith('event: stats')) {
-                            // Parse stats
-                        }
                    }
                }

@@ -156,6 +216,7 @@
                isGenerating = false;
                typing.style.display = 'none';
                sendBtn.disabled = false;
+                uploadBtn.disabled = false;
                getMemoryStats();
            }
        }
@@ -179,14 +240,14 @@
            const modal = document.getElementById('memory-modal');
            const resp = await fetch('/api/memory');
            const data = await resp.json();
-            
+
            document.getElementById('memory-stats').innerHTML = `
                <h3>Statistics</h3>
                <p>Short-term messages: ${data.stats.short_term_messages}</p>
                <p>Tokens used: ${data.stats.short_term_tokens} / ${data.stats.max_context}</p>
                <p>Usage: ${data.stats.usage_percent.toFixed(1)}%</p>
            `;
-            
+
            let prefs = '<h3>Learned Preferences</h3>';
            if (Object.keys(data.preferences).length === 0) {
                prefs += '<p>None yet...</p>';
@@ -196,13 +257,14 @@
                }
            }
            document.getElementById('memory-preferences').innerHTML = prefs;
-            
+
            let recent = '<h3>Recent Messages</h3>';
            for (const msg of data.recent) {
-                recent += `<p><strong>${msg.role}:</strong> ${msg.content}</p>`;
+                const imgTag = msg.has_image ? ' 🖼️' : '';
+                recent += `<p><strong>${msg.role}${imgTag}:</strong> ${msg.content}</p>`;
            }
            document.getElementById('memory-recent').innerHTML = recent;
-            
+
            modal.style.display = 'block';
        }

@@ -219,4 +281,4 @@
        getMemoryStats();
    </script>
 </body>
-</html>
+</html>