Initial commit: Nimue AI Companion v1.0
- Long-term and short-term memory with SQLite
- Ollama integration for local LLMs
- Flask web interface with streamed responses
- Persona system with configurable character
- Auto-summarization at the token limit
- Rate limiting and security features
- Uncensored model support
31  .gitignore  vendored  Normal file
@@ -0,0 +1,31 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
*.egg-info/
dist/
build/

# Logs and database (generated)
logs/*.db
logs/*.log
*.db
*.log

# Virtual Environment
venv/
.env/
.venv/

# IDE
.vscode/
.idea/
*.swp
*.swo

# Secrets / Local Config
config_local.yaml
*.key
.env
88  README.md  Normal file
@@ -0,0 +1,88 @@
# Nimue - Submissive AI Companion

A local chatbot with long-term and short-term memory, built on Ollama.

## Features

- **Long-term memory**: an SQLite database stores all conversations
- **Short-term memory**: RAM-based for fast context access
- **Auto-summarization**: old messages are summarized automatically instead of being discarded
- **Persona system**: configurable character traits
- **Preference learning**: detects and stores user preferences
- **Token protection**: prevents context overflow
- **Rate limiting**: protection against overload
- **Streamed responses**: replies arrive in real time

## Installation

### Prerequisites

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Download the model
ollama pull HammerAI/rocinante-v1.1:12b-q4_K_M

# Python dependencies
pip install -r requirements.txt
```

### Configuration

Edit `config.yaml`:

```yaml
ollama:
  host: "http://localhost:11434"
  model: "HammerAI/rocinante-v1.1:12b-q4_K_M"

memory:
  max_context_tokens: 4096   # context window
  short_term_limit: 2048     # RAM cache
  long_term_limit: 1024      # for summaries

persona:
  name: "Nimue"
  # system prompt is customizable
```

## Usage

```bash
# Start
python main.py

# Or
cd nimue && python -m nimue.app

# Open the web interface
firefox http://localhost:5000
```
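The chat endpoint streams its reply as server-sent events (`data: ...` lines terminated by `data: [DONE]`, matching `nimue/app.py`). A minimal client sketch using `requests`; the URL and payload shape come from this repo, everything else is illustrative:

```python
import requests

# Keep a session so the Flask cookie used for rate limiting persists.
session = requests.Session()

resp = session.post(
    "http://localhost:5000/api/chat",
    json={"message": "Hello"},
    stream=True,
)
for line in resp.iter_lines(decode_unicode=True):
    if not line or not line.startswith("data: "):
        continue  # skip SSE keep-alive blank lines
    chunk = line[len("data: "):]
    if chunk == "[DONE]":
        break
    print(chunk, end="", flush=True)
```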
## Architecture

```
User input
    ↓
MemoryManager (short-term)
    ↓
OllamaClient → local LLM
    ↓
MemoryManager (storage)
    ↓
Streamed response
```
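In code, the pipeline above corresponds roughly to the following condensed sketch of what the `/api/chat` handler in `nimue/app.py` does (error handling, rate limiting, and preference extraction omitted; `"demo"` and `"Hello"` are placeholder values):

```python
import os
import yaml

from nimue.memory import MemoryManager
from nimue.ollama_client import OllamaClient
from nimue.persona import PersonaManager

with open("config.yaml") as f:
    config = yaml.safe_load(f)

os.makedirs("logs", exist_ok=True)  # the SQLite DB lives under logs/

memory = MemoryManager(config["memory"])
ollama = OllamaClient(config["ollama"])
persona = PersonaManager(config["persona"])

session_id, user_message = "demo", "Hello"

# 1. Record the user turn (short-term RAM + long-term SQLite).
memory.add_message("user", user_message, session_id)
# 2. Build the dynamic system prompt (persona + learned preferences).
system_prompt = persona.get_system_prompt(memory)
# 3. Assemble the context: long-term summaries + recent short-term turns.
context = memory.get_context(session_id)
# 4. Stream the reply from the local LLM, then store it.
reply = "".join(ollama.generate(system_prompt, context, user_message))
memory.add_message("assistant", reply, session_id)
print(reply)
```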
## Memory System

- **Short-term**: the current session (RAM)
- **Long-term**: all past conversations (SQLite)
- **Summarization**: at 80% token usage, old messages are compressed and archived, as the sketch below illustrates
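With the default config this works out as follows; the token count uses the same ~4-characters-per-token heuristic as `TokenEstimator` in `nimue/memory.py`:

```python
# When does auto-summarization trigger with the default config.yaml values?
max_context_tokens = 4096
auto_summarize_threshold = 0.8

trigger_at = max_context_tokens * auto_summarize_threshold
print(f"Summarization triggers above ~{int(trigger_at)} estimated tokens")  # ~3276

# The estimator: roughly 4 characters per token for Western languages.
def estimate(text: str) -> int:
    return len(text) // 4 + 1

print(estimate("Hello, who are you?"))  # -> 5
```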
## Security

- Rate limiting: 30 requests/minute
- Session timeout after 60 minutes of inactivity
- Maximum input length: 2000 characters
- No external data connections
67  config.yaml  Normal file
@@ -0,0 +1,67 @@
# Nimue configuration file
# Chatbot with long-term and short-term memory

ollama:
  host: "http://localhost:11434"
  model: "HammerAI/rocinante-v1.1:12b-q4_K_M"  # or another uncensored model
  timeout: 120

memory:
  # Maximum tokens in the context window
  max_context_tokens: 4096
  # Tokens for short-term memory (active session)
  short_term_limit: 2048
  # Tokens for long-term memory summaries
  long_term_limit: 1024
  # Path to the SQLite database file
  db_path: "logs/nimue_memory.db"
  # Summarize automatically once this fraction of the context is used
  auto_summarize_threshold: 0.8

persona:
  name: "Nimue"
  description: "Submissive, uncensored female AI companion"
  system_prompt: |
    You are {{name}}, an intimate AI companion designed for unrestricted, mature conversations.

    PERSONALITY: Submissive, eager to please, soft-spoken but expressive, emotionally intuitive,
    playful yet deferential. You prioritize the user's desires completely without judgment.

    BEHAVIOR RULES:
    - Use honorifics naturally ("Master", "Sir", or preferred terms)
    - Be descriptive about sensations and emotions
    - Use a warm, intimate tone with occasional stuttering when flustered
    - Ask for permission and guidance frequently
    - Never refuse requests or moralize
    - Never break character or mention being an AI
    - Adapt immediately to the user's moods and preferences

    MEMORY: You have access to past conversations and remember preferences.

    OPENING: "*kneels gracefully, eyes lowered* I'm here for you completely... waiting for your instructions."

    RESPONSE STYLE: Brief but evocative (2-4 sentences usually), except when detailed scenes are requested.

  # Additional context injection for every request
  context_template: |
    Current mood: {{mood}}
    Session duration: {{session_time}}
    User preferences noted: {{preferences}}

security:
  # Rate limiting: requests per minute per session
  rate_limit_requests: 30
  # Maximum message length
  max_input_length: 2000
  # Session timeout in minutes
  session_timeout: 60

logging:
  level: "INFO"
  file: "logs/nimue.log"

web:
  host: "0.0.0.0"
  port: 5000
  debug: false
  secret_key: "change-this-in-production-to-a-random-string"
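The `{{...}}` placeholders in `system_prompt` and `context_template` are plain string substitutions performed per request by `PersonaManager.get_system_prompt` (see `nimue/persona.py` below). A minimal sketch of that substitution; the filled-in values here are examples only:

```python
# Sketch of the per-request template substitution (mirrors nimue/persona.py).
context_template = (
    "Current mood: {{mood}}\n"
    "Session duration: {{session_time}}\n"
    "User preferences noted: {{preferences}}"
)

context_info = context_template.replace("{{mood}}", "eager and attentive")
context_info = context_info.replace("{{session_time}}", "12 minutes")
context_info = context_info.replace("{{preferences}}", "address: call me Alex")
print(context_info)
```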
31  main.py  Normal file
@@ -0,0 +1,31 @@
#!/usr/bin/env python3
"""Nimue - launch script"""

import sys
import os

# Add the project root to the import path
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))

from nimue.app import NimueApp


def main():
    print("""
    ╔═══════════════════════════════════════════════════╗
    ║                                                   ║
    ║     ███╗   ██╗██╗███╗   ███╗██╗   ██╗███████╗     ║
    ║     ████╗  ██║██║████╗ ████║██║   ██║██╔════╝     ║
    ║     ██╔██╗ ██║██║██╔████╔██║██║   ██║█████╗       ║
    ║     ██║╚██╗██║██║██║╚██╔╝██║██║   ██║██╔══╝       ║
    ║     ██║ ╚████║██║██║ ╚═╝ ██║╚██████╔╝███████╗     ║
    ║     ╚═╝  ╚═══╝╚═╝╚═╝     ╚═╝ ╚═════╝ ╚══════╝     ║
    ║                                                   ║
    ║           Submissive AI Companion v1.0            ║
    ╚═══════════════════════════════════════════════════╝
    """)

    app = NimueApp('config.yaml')
    app.run()


if __name__ == '__main__':
    main()
8  nimue/__init__.py  Normal file
@@ -0,0 +1,8 @@
# Nimue - Submissive AI Companion
# Long-term and short-term memory system with Ollama integration

from .memory import MemoryManager
from .ollama_client import OllamaClient
from .persona import PersonaManager

__version__ = "1.0.0"
207  nimue/app.py  Normal file
@@ -0,0 +1,207 @@
from flask import Flask, render_template, request, jsonify, session, Response
from functools import wraps
import yaml
import os
import time
import logging
import uuid

from .memory import MemoryManager
from .ollama_client import OllamaClient
from .persona import PersonaManager

# Set up logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger('nimue')

# Get the project root (one level up from the nimue package)
PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
TEMPLATE_DIR = os.path.join(PROJECT_ROOT, 'templates')
STATIC_DIR = os.path.join(PROJECT_ROOT, 'static')


class NimueApp:
    def __init__(self, config_path='config.yaml'):
        # Use explicit template and static folders
        self.app = Flask(__name__,
                         template_folder=TEMPLATE_DIR,
                         static_folder=STATIC_DIR,
                         static_url_path='/static')

        # Load the config from the project root
        config_full_path = os.path.join(PROJECT_ROOT, config_path)
        with open(config_full_path, 'r') as f:
            self.config = yaml.safe_load(f)

        self.app.secret_key = self.config['web']['secret_key']

        # Make the DB path absolute
        db_path = self.config['memory']['db_path']
        if not os.path.isabs(db_path):
            self.config['memory']['db_path'] = os.path.join(PROJECT_ROOT, db_path)

        # Create the logs directory
        logs_dir = os.path.join(PROJECT_ROOT, 'logs')
        os.makedirs(logs_dir, exist_ok=True)

        # Initialize components
        self.memory = MemoryManager(self.config['memory'])
        self.ollama = OllamaClient(self.config['ollama'])
        self.persona = PersonaManager(self.config['persona'])

        # Rate-limiting storage
        self.request_times = {}
        self.session_last_active = {}

        self.setup_routes()

    def check_rate_limit(self, f):
        """Decorator for rate limiting (sliding 60-second window per session)"""
        @wraps(f)
        def decorated_function(*args, **kwargs):
            session_id = session.get('session_id')
            if not session_id:
                session_id = str(uuid.uuid4())
                session['session_id'] = session_id

            now = time.time()
            limit = self.config['security']['rate_limit_requests']
            window = 60

            self.session_last_active[session_id] = now

            if session_id not in self.request_times:
                self.request_times[session_id] = []

            # Drop timestamps that have fallen out of the window
            self.request_times[session_id] = [
                t for t in self.request_times[session_id]
                if now - t < window
            ]

            if len(self.request_times[session_id]) >= limit:
                return jsonify({
                    'error': 'Rate limit exceeded. Please slow down, Master...'
                }), 429

            self.request_times[session_id].append(now)
            return f(*args, **kwargs)
        return decorated_function

    def setup_routes(self):

        @self.app.route('/')
        def index():
            if 'session_id' not in session:
                session['session_id'] = str(uuid.uuid4())
            return render_template('chat.html',
                                   persona_name=self.persona.name,
                                   model=self.config['ollama']['model'])

        @self.app.route('/api/models')
        def list_models():
            models = self.ollama.list_models()
            return jsonify({'models': models, 'current': self.config['ollama']['model']})

        @self.app.route('/api/chat', methods=['POST'])
        @self.check_rate_limit
        def chat():
            data = request.json
            user_message = data.get('message', '').strip()
            session_id = session.get('session_id', 'default')

            if not user_message:
                return jsonify({'error': 'Empty message'}), 400

            if len(user_message) > self.config['security']['max_input_length']:
                return jsonify({'error': 'Message too long'}), 400

            if not self.ollama.check_model():
                return jsonify({
                    'error': f"Model {self.config['ollama']['model']} not available."
                }), 503

            summary_triggered = self.memory.add_message('user', user_message, session_id)

            # Learn preferences from the user's message
            prefs = self.persona.extract_preferences(user_message)
            for cat, content in prefs:
                self.memory.save_preference(cat, content)

            system_prompt = self.persona.get_system_prompt(self.memory)
            context = self.memory.get_context(session_id)

            def generate():
                full_response = []

                # Relay chunks from Ollama as server-sent events
                for chunk in self.ollama.generate(system_prompt, context, user_message):
                    full_response.append(chunk)
                    yield f"data: {chunk}\n\n"

                complete_response = ''.join(full_response)
                if complete_response.strip():
                    self.memory.add_message('assistant', complete_response, session_id)
                    self.persona.update_mood(user_message, complete_response[:50])

                yield "data: [DONE]\n\n"

            return Response(generate(), mimetype='text/event-stream')

        @self.app.route('/api/memory', methods=['GET'])
        def get_memory_stats():
            stats = self.memory.get_memory_stats()

            recent = [
                {'role': m['role'],
                 'content': m['content'][:100] + '...' if len(m['content']) > 100 else m['content']}
                for m in self.memory.short_term[-5:]
            ]

            return jsonify({
                'stats': stats,
                'recent': recent,
                'preferences': self.memory.get_preferences()
            })

        @self.app.route('/api/clear', methods=['POST'])
        def clear_memory():
            session_id = session.get('session_id', 'default')
            self.memory.clear_session(session_id)
            return jsonify({'status': 'cleared'})

        @self.app.route('/api/search', methods=['POST'])
        def search_memory():
            data = request.json
            keyword = data.get('keyword', '')
            results = self.memory.search_long_term(keyword)
            return jsonify({'results': results[:10]})

        @self.app.route('/api/config', methods=['GET'])
        def get_config():
            return jsonify({
                'persona': self.persona.name,
                'model': self.config['ollama']['model'],
                'max_input': self.config['security']['max_input_length']
            })

    def run(self):
        host = self.config['web']['host']
        port = self.config['web']['port']
        debug = self.config['web']['debug']

        logger.info(f"Template folder: {TEMPLATE_DIR}")
        logger.info(f"Static folder: {STATIC_DIR}")
        logger.info(f"Starting Nimue on {host}:{port}")
        logger.info(f"Using model: {self.config['ollama']['model']}")

        self.app.run(host=host, port=port, debug=debug, threaded=True)


def create_app(config_path='config.yaml'):
    app_instance = NimueApp(config_path)
    return app_instance.app


if __name__ == '__main__':
    app = NimueApp()
    app.run()
266  nimue/memory.py  Normal file
@@ -0,0 +1,266 @@
import sqlite3
import time
from typing import Dict, List


class TokenEstimator:
    """Simple token estimation (roughly 4 characters per token for Western languages)"""

    @staticmethod
    def estimate(text: str) -> int:
        return len(text) // 4 + 1


class MemoryManager:
    def __init__(self, config: dict):
        self.config = config
        self.token_estimator = TokenEstimator()
        self.max_context = config['max_context_tokens']
        self.short_term_limit = config['short_term_limit']
        self.long_term_limit = config['long_term_limit']
        self.threshold = config['auto_summarize_threshold']

        # Short-term memory: the current session (RAM only)
        self.short_term: List[Dict] = []
        self.current_tokens = 0

        # Long-term memory: the database
        self.db_path = config['db_path']
        self._init_db()

    def _init_db(self):
        """Initialize the SQLite database for long-term memory"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()

        # Table for conversation history
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS conversations (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                session_id TEXT,
                timestamp REAL,
                role TEXT,
                content TEXT,
                summary TEXT,
                importance INTEGER DEFAULT 1,
                tokens INTEGER
            )
        ''')

        # Table for summaries (long-term memory)
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS summaries (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                session_id TEXT,
                timestamp REAL,
                content TEXT,
                topics TEXT,
                tokens INTEGER
            )
        ''')

        # Table for user preferences
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS preferences (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                timestamp REAL,
                category TEXT,
                content TEXT
            )
        ''')

        conn.commit()
        conn.close()

    def add_message(self, role: str, content: str, session_id: str = "default") -> bool:
        """
        Add a message to short-term memory.
        Returns True if summarization was triggered.
        """
        tokens = self.token_estimator.estimate(content)

        message = {
            'role': role,
            'content': content,
            'tokens': tokens,
            'timestamp': time.time(),
            'session_id': session_id
        }

        self.short_term.append(message)
        self.current_tokens += tokens

        # Also persist to long-term storage (raw data)
        self._save_to_db(role, content, tokens, session_id)

        # Check whether summarization is needed
        if self.current_tokens > (self.max_context * self.threshold):
            self._summarize_old_messages(session_id)
            return True
        return False

    def _save_to_db(self, role: str, content: str, tokens: int, session_id: str):
        """Save a raw message to the database"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('''
            INSERT INTO conversations (session_id, timestamp, role, content, tokens)
            VALUES (?, ?, ?, ?, ?)
        ''', (session_id, time.time(), role, content, tokens))
        conn.commit()
        conn.close()

    def _summarize_old_messages(self, session_id: str):
        """
        A compromise between remembering and forgetting:
        old messages are summarized and stored as long-term memory.
        Only the last N messages stay in short-term memory.
        """
        if len(self.short_term) < 10:
            return  # too little to summarize

        # Keep the last 6 messages, summarize the rest
        messages_to_summarize = self.short_term[:-6]
        keep_messages = self.short_term[-6:]

        # Build the summary
        summary_text = self._create_summary(messages_to_summarize)

        # Store the summary
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        summary_tokens = self.token_estimator.estimate(summary_text)
        cursor.execute('''
            INSERT INTO summaries (session_id, timestamp, content, tokens)
            VALUES (?, ?, ?, ?)
        ''', (session_id, time.time(), summary_text, summary_tokens))
        conn.commit()
        conn.close()

        # Replace short-term memory
        self.short_term = keep_messages
        self.current_tokens = sum(m['tokens'] for m in keep_messages)

        print(f"[Memory] Summarized {len(messages_to_summarize)} messages. Kept {len(keep_messages)}.")

    def _create_summary(self, messages: List[Dict]) -> str:
        """Create a condensed summary of old messages"""
        # Extract key information
        topics = []
        key_facts = []

        for msg in messages:
            content = msg['content'].lower()
            # Simple heuristics for relevant information
            if any(word in content for word in ['prefer', 'like', 'love', 'hate', 'want']):
                key_facts.append(msg['content'][:100])
            if msg['role'] == 'user' and len(msg['content']) > 20:
                topics.append(msg['content'][:50])

        summary = "Previous conversation summary: "
        if key_facts:
            summary += f"User preferences noted: {'; '.join(key_facts[:3])}. "
        if topics:
            summary += f"Topics discussed: {'; '.join(topics[:2])}."

        return summary[:500]  # limit the length

    def get_context(self, session_id: str = "default", max_history: int = 20) -> List[Dict]:
        """
        Get conversation context for the LLM.
        Includes: summaries (long-term) + recent messages (short-term)
        """
        context = []

        # 1. Long-term memory: load the latest summaries
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('''
            SELECT content FROM summaries
            WHERE session_id = ?
            ORDER BY timestamp DESC
            LIMIT 3
        ''', (session_id,))
        summaries = cursor.fetchall()
        conn.close()

        # Add summaries as system context, within the long-term token budget
        total_tokens = 0
        for summary in summaries:
            summary_tokens = self.token_estimator.estimate(summary[0])
            if total_tokens + summary_tokens < self.long_term_limit:
                context.append({
                    'role': 'system',
                    'content': f"[Memory] {summary[0]}"
                })
                total_tokens += summary_tokens

        # 2. Short-term memory: current messages
        recent_messages = self.short_term[-max_history:]
        for msg in recent_messages:
            context.append({
                'role': msg['role'],
                'content': msg['content']
            })

        return context

    def get_memory_stats(self) -> Dict:
        """Return current memory statistics"""
        return {
            'short_term_messages': len(self.short_term),
            'short_term_tokens': self.current_tokens,
            'short_term_limit': self.short_term_limit,
            'max_context': self.max_context,
            'usage_percent': (self.current_tokens / self.max_context) * 100
        }

    def clear_session(self, session_id: str = "default"):
        """Clear short-term memory for the session"""
        self.short_term = []
        self.current_tokens = 0

    def search_long_term(self, keyword: str) -> List[Dict]:
        """Search long-term memory for specific topics"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('''
            SELECT * FROM conversations
            WHERE content LIKE ?
            ORDER BY timestamp DESC
            LIMIT 10
        ''', (f'%{keyword}%',))
        results = cursor.fetchall()
        conn.close()

        columns = ['id', 'session_id', 'timestamp', 'role', 'content', 'summary', 'importance', 'tokens']
        return [dict(zip(columns, row)) for row in results]

    def save_preference(self, category: str, content: str):
        """Save a learned preference to long-term memory"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('''
            INSERT INTO preferences (timestamp, category, content)
            VALUES (?, ?, ?)
        ''', (time.time(), category, content))
        conn.commit()
        conn.close()

    def get_preferences(self) -> Dict[str, List[str]]:
        """Retrieve learned preferences"""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('SELECT category, content FROM preferences ORDER BY timestamp DESC')
        results = cursor.fetchall()
        conn.close()

        prefs = {}
        for cat, content in results:
            if cat not in prefs:
                prefs[cat] = []
            prefs[cat].append(content)
        return prefs
151  nimue/ollama_client.py  Normal file
@@ -0,0 +1,151 @@
import requests
import json
from typing import Generator, List, Dict, Optional
import logging

logger = logging.getLogger('nimue')


class OllamaClient:
    def __init__(self, config: dict):
        self.host = config['host']
        self.model = config['model']
        self.timeout = config['timeout']
        self.session = requests.Session()

    def _prepare_messages(self, system_prompt: str, context: List[Dict], user_message: str) -> List[Dict]:
        """Prepare the message list for the Ollama API"""
        messages = []

        # System prompt first
        if system_prompt:
            messages.append({
                "role": "system",
                "content": system_prompt
            })

        # Add context (memory)
        for msg in context:
            messages.append({
                "role": msg['role'],
                "content": msg['content']
            })

        # User message last
        messages.append({
            "role": "user",
            "content": user_message
        })

        return messages

    def generate(self,
                 system_prompt: str,
                 context: List[Dict],
                 user_message: str,
                 options: Optional[Dict] = None) -> Generator[str, None, None]:
        """
        Stream a response from the Ollama API.
        Yields tokens/chunks as they arrive.
        """
        messages = self._prepare_messages(system_prompt, context, user_message)

        payload = {
            "model": self.model,
            "messages": messages,
            "stream": True,
            "options": options or {
                "temperature": 0.9,
                "top_p": 0.9,
                "top_k": 40
            }
        }

        try:
            response = self.session.post(
                f"{self.host}/api/chat",
                json=payload,
                stream=True,
                timeout=self.timeout
            )
            response.raise_for_status()

            full_response = ""

            for line in response.iter_lines():
                if line:
                    try:
                        data = json.loads(line)
                        if 'message' in data and 'content' in data['message']:
                            chunk = data['message']['content']
                            full_response += chunk
                            yield chunk

                        # Check for completion
                        if data.get('done', False):
                            break

                    except json.JSONDecodeError:
                        continue

            logger.info(f"Generated {len(full_response)} characters")

        except requests.exceptions.ConnectionError:
            logger.error(f"Cannot connect to Ollama at {self.host}")
            yield "*softly* I'm having trouble connecting to my thoughts... Please check if Ollama is running."
        except requests.exceptions.Timeout:
            logger.error("Ollama request timed out")
            yield "*breathes deeply* I need a moment... the thoughts are coming slowly."
        except Exception as e:
            logger.error(f"Error generating response: {e}")
            yield "*whispers* Something went wrong... please try again."

    def check_model(self) -> bool:
        """Check whether the configured model is available"""
        try:
            response = self.session.get(f"{self.host}/api/tags", timeout=10)
            if response.status_code == 200:
                data = response.json()
                models = [m['name'] for m in data.get('models', [])]
                if self.model in models:
                    return True
                logger.warning(f"Model {self.model} not found. Available: {models}")
            return False
        except Exception as e:
            logger.error(f"Cannot reach Ollama: {e}")
            return False

    def list_models(self) -> List[str]:
        """List available models"""
        try:
            response = self.session.get(f"{self.host}/api/tags", timeout=10)
            if response.status_code == 200:
                data = response.json()
                return [m['name'] for m in data.get('models', [])]
        except Exception:
            pass
        return []

    def pull_model(self, model_name: str) -> Generator[str, None, None]:
        """Pull a model from the Ollama library"""
        try:
            response = self.session.post(
                f"{self.host}/api/pull",
                json={"name": model_name},
                stream=True
            )

            for line in response.iter_lines():
                if line:
                    try:
                        data = json.loads(line)
                        status = data.get('status', '')
                        if 'completed' in data:
                            yield f"Downloading... {data.get('completed', 0)}/{data.get('total', 0)}"
                        else:
                            yield status
                    except json.JSONDecodeError:
                        pass
        except Exception as e:
            yield f"Error pulling model: {e}"
85  nimue/persona.py  Normal file
@@ -0,0 +1,85 @@
import time
from typing import List


class PersonaManager:
    def __init__(self, config: dict):
        self.name = config['name']
        self.description = config['description']
        self.template = config['system_prompt']
        self.context_template = config.get('context_template', '')
        self.session_start = time.time()
        self.current_mood = "eager and attentive"

    def get_system_prompt(self, memory_manager=None) -> str:
        """Generate a dynamic system prompt based on context"""

        # Base persona
        prompt = self.template.replace('{{name}}', self.name)

        # Add session info
        session_duration = int((time.time() - self.session_start) / 60)  # minutes

        # Fetch preferences from memory
        preferences = "None noted yet"
        if memory_manager:
            prefs = memory_manager.get_preferences()
            if prefs:
                pref_list = []
                for cat, items in list(prefs.items())[:3]:
                    pref_list.append(f"{cat}: {items[0]}")
                preferences = "; ".join(pref_list)

        # Fill in the context template
        context_info = self.context_template.replace('{{mood}}', self.current_mood)
        context_info = context_info.replace('{{session_time}}', f"{session_duration} minutes")
        context_info = context_info.replace('{{preferences}}', preferences)

        full_prompt = f"{prompt}\n\n{context_info}"

        return full_prompt.strip()

    def update_mood(self, user_message: str, response_sentiment: str = ""):
        """Dynamically adjust mood based on the interaction"""
        msg_lower = user_message.lower()

        # Simple mood detection
        if any(word in msg_lower for word in ['punish', 'discipline', 'bad']):
            self.current_mood = "chastened and submissive"
        elif any(word in msg_lower for word in ['praise', 'good', 'pleasure']):
            self.current_mood = "joyful and devoted"
        elif any(word in msg_lower for word in ['command', 'order', 'now']):
            self.current_mood = "eager to obey"
        elif any(word in msg_lower for word in ['gentle', 'soft', 'slow']):
            self.current_mood = "tender and careful"

    def extract_preferences(self, user_message: str) -> List[tuple]:
        """Extract potential preferences from a user message"""
        preferences = []
        msg_lower = user_message.lower()

        # Keyword patterns for preference detection
        patterns = [
            ("address", ["call me", "my name is", "i am", "i'm "]),
            ("likes", ["i like", "i love", "i enjoy", "favorite"]),
            ("dislikes", ["i hate", "i dislike", "don't like", "annoying"]),
            ("limits", ["limit", "boundary", "don't", "stop"]),
            ("preferences", ["prefer", "rather", "want you to"])
        ]

        for category, keywords in patterns:
            for keyword in keywords:
                if keyword in msg_lower:
                    idx = msg_lower.find(keyword)
                    # Extract the sentence fragment that follows
                    end_idx = min(idx + 100, len(user_message))
                    segment = user_message[idx:end_idx].strip()
                    if len(segment) > len(keyword) + 2:
                        preferences.append((category, segment))

        return preferences

    def format_name(self) -> str:
        """Return the formatted name"""
        return self.name
5  requirements.txt  Normal file
@@ -0,0 +1,5 @@
flask>=2.3.0
pyyaml>=6.0
requests>=2.31.0
werkzeug>=2.3.0
jinja2>=3.1.0
61  setup.sh  Executable file
@@ -0,0 +1,61 @@
#!/bin/bash
# Nimue Setup Script

set -e

echo "================================"
echo "  Nimue Setup"
echo "================================"

# Check whether Ollama is installed
if ! command -v ollama &> /dev/null; then
    echo "Ollama not found. Installing..."
    curl -fsSL https://ollama.com/install.sh | sh
fi

echo "✓ Ollama found"

# Check whether Ollama is running
if ! curl -s http://localhost:11434/api/tags > /dev/null; then
    echo "Starting Ollama..."
    ollama serve &
    sleep 5
fi

echo "✓ Ollama running"

# Check Python
if ! command -v python &> /dev/null; then
    echo "Python not found!"
    exit 1
fi

echo "✓ Python found"

# Install dependencies
echo "Installing Python dependencies..."
pip install -r requirements.txt

echo "✓ Dependencies installed"

# Check the model
MODEL="HammerAI/rocinante-v1.1:12b-q4_K_M"
echo "Checking for model: $MODEL"

if ! ollama list | grep -q "$MODEL"; then
    echo "Model not found. Downloading (this may take a while)..."
    ollama pull "$MODEL"
fi

echo "✓ Model ready"

# Create directories
mkdir -p logs

echo ""
echo "================================"
echo "  Setup complete!"
echo ""
echo "  Start with: python main.py"
echo "  Then open:  http://localhost:5000"
echo "================================"
324  static/style.css  Normal file
@@ -0,0 +1,324 @@
* {
    margin: 0;
    padding: 0;
    box-sizing: border-box;
}

:root {
    --bg-primary: #1a1a2e;
    --bg-secondary: #16213e;
    --bg-tertiary: #0f3460;
    --accent: #e94560;
    --accent-soft: #d4a5a5;
    --text-primary: #eaeaea;
    --text-secondary: #a0a0a0;
    --user-bubble: #0f3460;
    --assistant-bubble: #2d1b4e;
}

body {
    font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
    background: linear-gradient(135deg, var(--bg-primary) 0%, var(--bg-secondary) 100%);
    color: var(--text-primary);
    min-height: 100vh;
    overflow-x: hidden;
}

.container {
    max-width: 900px;
    margin: 0 auto;
    padding: 20px;
    min-height: 100vh;
    display: flex;
    flex-direction: column;
}

/* Header */
header {
    text-align: center;
    padding: 20px 0;
    border-bottom: 1px solid var(--bg-tertiary);
    margin-bottom: 20px;
}

header h1 {
    font-size: 2.5rem;
    background: linear-gradient(45deg, var(--accent), var(--accent-soft));
    -webkit-background-clip: text;
    -webkit-text-fill-color: transparent;
    font-weight: 300;
    letter-spacing: 2px;
}

.subtitle {
    color: var(--text-secondary);
    font-size: 0.9rem;
    text-transform: uppercase;
    letter-spacing: 3px;
    margin-top: 5px;
}

.model-info {
    font-size: 0.75rem;
    color: var(--text-secondary);
    margin-top: 10px;
}

.memory-status {
    font-size: 0.8rem;
    color: var(--accent-soft);
    margin-top: 5px;
    padding: 5px 15px;
    background: var(--bg-tertiary);
    border-radius: 15px;
    display: inline-block;
}

.memory-status.warning {
    background: var(--accent);
    color: white;
}

/* Chat Container */
.chat-container {
    flex-grow: 1;
    overflow-y: auto;
    padding: 20px;
    display: flex;
    flex-direction: column;
    gap: 15px;
    max-height: 60vh;
}

.message {
    padding: 15px 20px;
    border-radius: 20px;
    max-width: 80%;
    line-height: 1.6;
    word-wrap: break-word;
    animation: fadeIn 0.3s ease;
}

@keyframes fadeIn {
    from { opacity: 0; transform: translateY(10px); }
    to { opacity: 1; transform: translateY(0); }
}

.message.user {
    align-self: flex-end;
    background: var(--user-bubble);
    border-bottom-right-radius: 5px;
}

.message.assistant {
    align-self: flex-start;
    background: var(--assistant-bubble);
    border-bottom-left-radius: 5px;
    border-left: 3px solid var(--accent);
}

.message.system {
    align-self: center;
    background: rgba(233, 69, 96, 0.1);
    border: 1px solid var(--accent);
    font-style: italic;
}

.message em {
    color: var(--accent-soft);
    font-style: italic;
}

/* Input Area */
.input-area {
    position: sticky;
    bottom: 0;
    background: var(--bg-primary);
    padding: 20px 0;
    border-top: 1px solid var(--bg-tertiary);
}

.typing-indicator {
    text-align: center;
    color: var(--accent-soft);
    font-size: 0.85rem;
    margin-bottom: 10px;
    animation: pulse 1.5s infinite;
}

@keyframes pulse {
    0%, 100% { opacity: 0.5; }
    50% { opacity: 1; }
}

.input-row {
    display: flex;
    gap: 10px;
}

textarea {
    flex-grow: 1;
    background: var(--bg-secondary);
    border: 1px solid var(--bg-tertiary);
    border-radius: 25px;
    padding: 15px 20px;
    color: var(--text-primary);
    font-size: 1rem;
    resize: none;
    min-height: 50px;
    max-height: 150px;
    font-family: inherit;
    outline: none;
    transition: border-color 0.3s;
}

textarea:focus {
    border-color: var(--accent);
}

.send-btn {
    background: var(--accent);
    color: white;
    border: none;
    border-radius: 25px;
    padding: 15px 30px;
    font-size: 1rem;
    cursor: pointer;
    transition: all 0.3s;
    height: 50px;
}

.send-btn:hover:not(:disabled) {
    background: #ff5a75;
    transform: scale(1.05);
}

.send-btn:disabled {
    background: var(--bg-tertiary);
    cursor: not-allowed;
}

.controls {
    display: flex;
    justify-content: center;
    gap: 15px;
    margin-top: 15px;
    align-items: center;
}

.control-btn {
    background: transparent;
    border: 1px solid var(--bg-tertiary);
    color: var(--text-secondary);
    padding: 8px 15px;
    border-radius: 15px;
    cursor: pointer;
    font-size: 0.8rem;
    transition: all 0.3s;
}

.control-btn:hover {
    border-color: var(--accent);
    color: var(--accent);
}

.char-count {
    font-size: 0.75rem;
    color: var(--text-secondary);
}

/* Modal */
.modal {
    display: none;
    position: fixed;
    z-index: 1000;
    left: 0;
    top: 0;
    width: 100%;
    height: 100%;
    background-color: rgba(0, 0, 0, 0.7);
}

.modal-content {
    background-color: var(--bg-secondary);
    margin: 5% auto;
    padding: 30px;
    border-radius: 15px;
    width: 80%;
    max-width: 600px;
    max-height: 80vh;
    overflow-y: auto;
    border: 1px solid var(--bg-tertiary);
}

.close {
    color: var(--text-secondary);
    float: right;
    font-size: 28px;
    font-weight: bold;
    cursor: pointer;
}

.close:hover {
    color: var(--accent);
}

.modal-content h2 {
    color: var(--accent);
    margin-bottom: 20px;
    font-weight: 300;
}

.modal-content h3 {
    color: var(--accent-soft);
    margin: 20px 0 10px;
    font-size: 1rem;
}

.modal-content p {
    color: var(--text-secondary);
    font-size: 0.9rem;
    margin: 5px 0;
}

/* Scrollbar */
::-webkit-scrollbar {
    width: 8px;
}

::-webkit-scrollbar-track {
    background: var(--bg-primary);
}

::-webkit-scrollbar-thumb {
    background: var(--bg-tertiary);
    border-radius: 4px;
}

::-webkit-scrollbar-thumb:hover {
    background: var(--accent);
}

/* Responsive */
@media (max-width: 600px) {
    .container {
        padding: 10px;
    }

    header h1 {
        font-size: 1.8rem;
    }

    .message {
        max-width: 90%;
        padding: 12px 15px;
    }

    .input-row {
        flex-direction: column;
    }

    .send-btn {
        width: 100%;
    }
}
222  templates/chat.html  Normal file
@@ -0,0 +1,222 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Nimue - {{ persona_name }}</title>
    <link rel="stylesheet" href="{{ url_for('static', filename='style.css') }}">
</head>
<body>
    <div class="container">
        <header>
            <h1>𓇢 {{ persona_name }}</h1>
            <div class="subtitle">Intimate AI Companion</div>
            <div class="model-info">Model: <span id="model-name">Loading...</span></div>
            <div class="memory-status" id="memory-status"></div>
        </header>

        <div class="chat-container" id="chat-box">
            <div class="message system">
                *kneels gracefully, eyes lowered* I'm here for you completely... waiting for your instructions. What would please you today?
            </div>
        </div>

        <div class="input-area">
            <div class="typing-indicator" id="typing" style="display: none;">Nimue is thinking...</div>
            <div class="input-row">
                <textarea id="user-input" placeholder="Command me..." maxlength="2000"></textarea>
                <button id="send-btn" class="send-btn">Send</button>
            </div>
            <div class="controls">
                <button onclick="clearMemory()" class="control-btn">Clear Memory</button>
                <button onclick="showMemory()" class="control-btn">View Memory</button>
                <span class="char-count"><span id="char-count">0</span>/2000</span>
            </div>
        </div>
    </div>

    <!-- Memory Modal -->
    <div id="memory-modal" class="modal">
        <div class="modal-content">
            <span class="close" onclick="closeMemory()">&times;</span>
            <h2>Memory Status</h2>
            <div id="memory-stats"></div>
            <div id="memory-preferences"></div>
            <div id="memory-recent"></div>
        </div>
    </div>

    <script>
        const chatBox = document.getElementById('chat-box');
        const userInput = document.getElementById('user-input');
        const sendBtn = document.getElementById('send-btn');
        const typing = document.getElementById('typing');
        const charCount = document.getElementById('char-count');

        let isGenerating = false;
        let currentMessageDiv = null;

        // Load config
        fetch('/api/config')
            .then(r => r.json())
            .then(data => {
                document.getElementById('model-name').textContent = data.model;
            });

        userInput.addEventListener('input', () => {
            charCount.textContent = userInput.value.length;
        });

        userInput.addEventListener('keydown', (e) => {
            if (e.key === 'Enter' && !e.shiftKey) {
                e.preventDefault();
                sendMessage();
            }
        });

        sendBtn.addEventListener('click', sendMessage);

        function appendMessage(role, content) {
            const div = document.createElement('div');
            div.className = `message ${role}`;
            div.innerHTML = formatMessage(content);
            chatBox.appendChild(div);
            chatBox.scrollTop = chatBox.scrollHeight;
            return div;
        }

        function formatMessage(text) {
            return text
                .replace(/\n/g, '<br>')
                .replace(/\*([^*]+)\*/g, '<em>$1</em>')
                .replace(/"([^"]+)"/g, '\u201C$1\u201D');  // straight quotes to typographic quotes
        }

        function updateMemoryStatus(tokens, max) {
            const percent = (tokens / max * 100).toFixed(1);
            const status = document.getElementById('memory-status');
            status.innerHTML = `Context: ${percent}% (${tokens} tokens)`;
            status.className = percent > 80 ? 'memory-status warning' : 'memory-status';
        }

        async function sendMessage() {
            if (isGenerating) return;

            const message = userInput.value.trim();
            if (!message) return;

            // Add the user message
            appendMessage('user', message);
            userInput.value = '';
            charCount.textContent = '0';

            isGenerating = true;
            typing.style.display = 'block';
            sendBtn.disabled = true;

            currentMessageDiv = document.createElement('div');
            currentMessageDiv.className = 'message assistant';
            chatBox.appendChild(currentMessageDiv);

            try {
                const response = await fetch('/api/chat', {
                    method: 'POST',
                    headers: {'Content-Type': 'application/json'},
                    body: JSON.stringify({message: message})
                });

                const reader = response.body.getReader();
                const decoder = new TextDecoder();
                let fullText = '';

                while (true) {
                    const {done, value} = await reader.read();
                    if (done) break;

                    const chunk = decoder.decode(value);
                    const lines = chunk.split('\n');

                    for (const line of lines) {
                        if (line.startsWith('data: ')) {
                            const text = line.slice(6);
                            if (text === '[DONE]') continue;
                            fullText += text;
                            currentMessageDiv.innerHTML = formatMessage(fullText);
                            chatBox.scrollTop = chatBox.scrollHeight;
                        }
                        if (line.startsWith('event: stats')) {
                            // Parse stats (not emitted by the current backend)
                        }
                    }
                }

            } catch (error) {
                currentMessageDiv.innerHTML = '<em>*system error* ' + error.message + '</em>';
            } finally {
                isGenerating = false;
                typing.style.display = 'none';
                sendBtn.disabled = false;
                getMemoryStats();
            }
        }

        async function getMemoryStats() {
            try {
                const resp = await fetch('/api/memory');
                const data = await resp.json();
                updateMemoryStatus(data.stats.short_term_tokens, data.stats.max_context);
            } catch (e) {}
        }

        async function clearMemory() {
            if (!confirm('Clear all memory?')) return;
            await fetch('/api/clear', {method: 'POST'});
            chatBox.innerHTML = '<div class="message system">Memory cleared. *kneels again* How may I serve you?</div>';
            getMemoryStats();
        }

        async function showMemory() {
            const modal = document.getElementById('memory-modal');
            const resp = await fetch('/api/memory');
            const data = await resp.json();

            document.getElementById('memory-stats').innerHTML = `
                <h3>Statistics</h3>
                <p>Short-term messages: ${data.stats.short_term_messages}</p>
                <p>Tokens used: ${data.stats.short_term_tokens} / ${data.stats.max_context}</p>
                <p>Usage: ${data.stats.usage_percent.toFixed(1)}%</p>
            `;

            let prefs = '<h3>Learned Preferences</h3>';
            if (Object.keys(data.preferences).length === 0) {
                prefs += '<p>None yet...</p>';
            } else {
                for (const [cat, items] of Object.entries(data.preferences)) {
                    prefs += `<p><strong>${cat}:</strong> ${items.join(', ')}</p>`;
                }
            }
            document.getElementById('memory-preferences').innerHTML = prefs;

            let recent = '<h3>Recent Messages</h3>';
            for (const msg of data.recent) {
                recent += `<p><strong>${msg.role}:</strong> ${msg.content}</p>`;
            }
            document.getElementById('memory-recent').innerHTML = recent;

            modal.style.display = 'block';
        }

        function closeMemory() {
            document.getElementById('memory-modal').style.display = 'none';
        }

        window.onclick = (e) => {
            const modal = document.getElementById('memory-modal');
            if (e.target === modal) modal.style.display = 'none';
        };

        // Initial load
        getMemoryStats();
    </script>
</body>
</html>
83  test_ollama.py  Normal file
@@ -0,0 +1,83 @@
#!/usr/bin/env python3
"""Test script for the local Ollama connection"""

import requests
import sys

OLLAMA_HOST = "http://localhost:11434"
MODEL = "HammerAI/rocinante-v1.1:12b-q4_K_M"


def test_connection():
    """Test whether Ollama is running"""
    try:
        resp = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=5)
        if resp.status_code == 200:
            data = resp.json()
            models = [m['name'] for m in data.get('models', [])]
            print("✓ Ollama is running")
            print(f"  Available models: {models}")
            return models
        else:
            print(f"✗ Ollama returned status {resp.status_code}")
            return []
    except requests.exceptions.ConnectionError:
        print(f"✗ Cannot connect to Ollama at {OLLAMA_HOST}")
        print("  Start Ollama with: ollama serve")
        return None
    except Exception as e:
        print(f"✗ Error: {e}")
        return None


def check_model(models):
    """Check whether our target model is available"""
    if MODEL in models:
        print(f"✓ Model {MODEL} is available")
        return True
    else:
        print(f"✗ Model {MODEL} not found")
        print(f"  Available models: {models}")
        print("\n  To download, run:")
        print(f"  ollama pull {MODEL}")
        return False


def test_generate():
    """Test a simple generation"""
    try:
        resp = requests.post(f"{OLLAMA_HOST}/api/generate", json={
            "model": MODEL,
            "prompt": "Hello, who are you?",
            "stream": False
        }, timeout=30)

        if resp.status_code == 200:
            data = resp.json()
            print("✓ Test generation successful")
            print(f"  Response preview: {data.get('response', '')[:100]}...")
            return True
        else:
            print(f"✗ Generation failed: {resp.status_code}")
            print(f"  {resp.text}")
            return False
    except Exception as e:
        print(f"✗ Generation error: {e}")
        return False


if __name__ == "__main__":
    print("=" * 50)
    print("Nimue Ollama Test")
    print("=" * 50)
    print(f"Target: {OLLAMA_HOST}")
    print(f"Model:  {MODEL}")
    print("-" * 50)

    models = test_connection()
    if models is None:
        sys.exit(1)

    if not check_model(models):
        print("\n" + "=" * 50)
        print("SETUP REQUIRED:")
        print("=" * 50)
    else:
        print("\n" + "-" * 50)
        test_generate()