fix: improve long model download handling and first-time startup experience

randomm · randomm · commit 483704e3b208 · 2025-03-24T11:05:42.000+02:00
- Increase health check timeouts to accommodate large model downloads (~300-500MB)
- Add informative messages about expected first-time download delays
- Add FAST_STARTUP environment variable option for quicker initialization
- Update documentation with startup time expectations and solutions
- Ensure project_initializer respects EMBEDDING_MODEL env variable
- Improve troubleshooting information for timeout issues
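
The model selection order this commit introduces in `select_embedding_model` can be sketched as a standalone function: `EMBEDDING_MODEL` from the environment wins, then the config file, then `FAST_STARTUP`, then the project-type default. This is a minimal sketch, not the project's code. The dictionary contents, the `config_model` parameter (standing in for whatever the config file yields), and the fallback to a `"default"` entry are assumptions made for illustration.

```python
from typing import Any, Dict, Mapping, Optional, Tuple

LIGHTWEIGHT_MODEL = "sentence-transformers/all-MiniLM-L6-v2"

# Placeholder tables; the real DEFAULT_EMBEDDING_MODELS / DEFAULT_MODEL_CONFIGS
# live in src/project_initializer.py and their contents are assumed here.
DEFAULT_EMBEDDING_MODELS: Dict[str, str] = {
    "default": "jinaai/jina-embeddings-v2-base-code",
}
DEFAULT_MODEL_CONFIGS: Dict[str, Dict[str, Any]] = {
    "default": {"device": "cpu"},
}


def select_embedding_model(
    env: Mapping[str, str],
    project_type: str,
    config_model: Optional[str] = None,  # hypothetical: model name read from the config file
) -> Tuple[str, Dict[str, Any]]:
    """Return (model_name, model_config) following the precedence in the diff."""
    type_config = DEFAULT_MODEL_CONFIGS.get(project_type, DEFAULT_MODEL_CONFIGS["default"])

    # 1. An explicit EMBEDDING_MODEL always wins.
    env_model = env.get("EMBEDDING_MODEL")
    if env_model:
        return env_model, dict(type_config)

    # 2. Otherwise a config file, if present, decides.
    if config_model is not None:
        return config_model, dict(type_config)

    # 3. FAST_STARTUP only applies when neither of the above is set.
    if env.get("FAST_STARTUP", "false").lower() == "true":
        return LIGHTWEIGHT_MODEL, dict(DEFAULT_MODEL_CONFIGS["default"])

    # 4. Fall back to the project-type default (assumed "default" entry otherwise).
    model = DEFAULT_EMBEDDING_MODELS.get(project_type, DEFAULT_EMBEDDING_MODELS["default"])
    return model, dict(type_config)
```

One consequence of this order: `FAST_STARTUP=true` has no effect whenever `EMBEDDING_MODEL` is set, which includes the `docker-compose.yml` in this commit, since it always passes a value for `EMBEDDING_MODEL`.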
diff --git a/Dockerfile b/Dockerfile
@@ -25,7 +25,9 @@ ENV PYTHONUNBUFFERED=1 \
     PYTHONDONTWRITEBYTECODE=1
 
 # Health check
-HEALTHCHECK --interval=30s --timeout=30s --start-period=30s --retries=3 \
+# Note: First run will download large embedding models which can take several minutes
+# The run.sh script handles this with a longer timeout, but Docker health checks need to be configured too
+HEALTHCHECK --interval=30s --timeout=30s --start-period=600s --retries=10 \
     CMD curl -f http://localhost:8000/health || exit 1
 
 # Command to run when container starts
diff --git a/README.md b/README.md
@@ -50,11 +50,27 @@ The service will:
 
 Files-DB-MCP works without configuration, but you can customize it with environment variables:
 
-- `EMBEDDING_MODEL` - Change the embedding model (default: 'sentence-transformers/all-MiniLM-L6-v2')
+- `EMBEDDING_MODEL` - Change the embedding model (default: 'jinaai/jina-embeddings-v2-base-code' or project-specific model)
+- `FAST_STARTUP` - Set to 'true' to use a smaller model for faster startup (default: 'false')
 - `QUANTIZATION` - Enable/disable quantization (default: 'true')
 - `BINARY_EMBEDDINGS` - Enable/disable binary embeddings (default: 'false')
 - `IGNORE_PATTERNS` - Comma-separated list of files/dirs to ignore
 
+### First-Time Startup
+
+On first run, Files-DB-MCP downloads embedding models, which may take several minutes depending on:
+- The size of the selected model (300-500MB for high-quality models)
+- Your internet connection speed
+
+Subsequent startups will be much faster as models are cached. For faster initial startup, you can:
+```bash
+# Use a smaller, faster model (90MB)
+EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2 files-db-mcp
+
+# Or enable fast startup mode
+FAST_STARTUP=true files-db-mcp
+```
+
 ## Claude Code Integration
 
 Add to your Claude Code configuration:
diff --git a/docker-compose.yml b/docker-compose.yml
@@ -21,7 +21,10 @@ services:
     environment:
       - VECTOR_DB_HOST=vector-db
       - VECTOR_DB_PORT=6333
-      - EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2  # Default code embedding model
+      - EMBEDDING_MODEL=${EMBEDDING_MODEL:-sentence-transformers/all-MiniLM-L6-v2}  # Default code embedding model
+      # The default above is already the small (~90MB) model. To use a larger one, export EMBEDDING_MODEL first.
+      # Example: export EMBEDDING_MODEL=jinaai/jina-embeddings-v2-base-code
+      - FAST_STARTUP=${FAST_STARTUP:-false}                     # Set to 'true' to prioritize faster startup over quality 
       - QUANTIZATION=true                                       # Enable quantization by default
       - BINARY_EMBEDDINGS=false                                 # Disable binary embeddings by default
       - DEBUG=true                                              # Enable debug mode
@@ -33,9 +36,9 @@ services:
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
       interval: 30s
-      timeout: 20s
-      retries: 5
-      start_period: 180s  # Give more time for model downloads
+      timeout: 30s
+      retries: 10
+      start_period: 600s  # Give much more time for large model downloads
 
 volumes:
   qdrant_data:
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
@@ -42,24 +42,37 @@ This guide helps you diagnose and solve common issues with Files-DB-MCP.
 - Ensure vector database is running and accessible
 - Check environment variables are properly set
 
-### Health Check Fails
+### Health Check Fails or Long Startup Time
 
 **Symptoms:**
-- Container shows "unhealthy" status
-- Service starts but health check fails
+- Container shows "unhealthy" status or "starting" for a long time
+- Timeout message: "Timeout waiting for MCP service to become healthy"
+- First run takes much longer than expected
+
+**Causes:**
+- First startup requires downloading large embedding models (300-500MB)
+- Default timeout may be too short for large model downloads
+- Slow internet connection can extend download time
 
 **Check:**
-1. **Health Check Response:**
+1. **Container Logs for Download Progress:**
+   ```bash
+   docker logs files-db-mcp-files-db-mcp-1
+   ```
+   Look for "Downloading model.safetensors" progress messages.
+
+2. **Health Check Response:**
    ```bash
    curl http://localhost:3000/health
    ```
 
-2. **Health Check Configuration:** Check the health check timeout in `docker-compose.yml`
+3. **Health Check Configuration:** Check the health check timeout in `docker-compose.yml`
 
 **Solutions:**
-- Increase health check timeout for model loading
-- Check if the vector database is properly connected
-- Verify network settings between containers
+- Be patient during first startup; subsequent starts will be much faster
+- Increase the health check `start_period` in `docker-compose.yml` to 600s or more
+- Increase `timeout` in `scripts/run.sh` to 600 seconds or more
+- Switch to a smaller embedding model by setting the `EMBEDDING_MODEL` environment variable
 
 ## Docker Compose Issues
 
diff --git a/scripts/run.sh b/scripts/run.sh
@@ -71,16 +71,18 @@ COMPOSE_HTTP_TIMEOUT=300 docker compose -f "$BASE_DIR/docker-compose.yml" up --b
 echo 
 echo "Files-DB-MCP is starting up..."
 echo "Waiting for services to initialize..."
+echo "Note: First run requires downloading embedding models (~300-500MB) which may take several minutes."
+echo "      Future startups will be much faster as models are cached."
 
 # Get the actual container names as they might be different
 MCP_CONTAINER=$(docker compose -f "$BASE_DIR/docker-compose.yml" ps -q files-db-mcp)
 VECTOR_DB_CONTAINER=$(docker compose -f "$BASE_DIR/docker-compose.yml" ps -q vector-db)
 
 echo "Container IDs: MCP=$MCP_CONTAINER, Vector DB=$VECTOR_DB_CONTAINER"
 
-# Wait up to 2 minutes for MCP to become healthy
-timeout=120
-interval=5
+# Wait up to 10 minutes for MCP to become healthy (especially for first-time model downloads)
+timeout=600
+interval=10
 elapsed=0
 
 echo "Waiting for MCP service to become healthy..."
diff --git a/src/project_initializer.py b/src/project_initializer.py
@@ -312,8 +312,22 @@ def select_embedding_model(self) -> Tuple[str, Dict[str, Any]]:
         Returns:
             Tuple of (model_name, model_config)
         """
-        # Check if we have a custom config
-        if self.config_file.exists():
+        # Read the FAST_STARTUP flag; it is applied below only if no model comes from the environment or a config file
+        fast_startup = os.environ.get("FAST_STARTUP", "false").lower() == "true"
+        
+        # Check if embedding model is explicitly set in environment
+        env_model = os.environ.get("EMBEDDING_MODEL")
+        if env_model:
+            logger.info(f"Using embedding model from environment: {env_model}")
+            self.embedding_model = env_model
+            # Use the default config for the project type; the config file is not consulted in this branch
+            if self.primary_project_type in DEFAULT_MODEL_CONFIGS:
+                self.model_config = DEFAULT_MODEL_CONFIGS[self.primary_project_type].copy()
+            else:
+                self.model_config = DEFAULT_MODEL_CONFIGS["default"].copy()
+                
+        # Check if we have a custom config file
+        elif self.config_file.exists():
             try:
                 with open(self.config_file, "r") as f:
                     config = json.load(f)
@@ -327,8 +341,14 @@ def select_embedding_model(self) -> Tuple[str, Dict[str, Any]]:
             except Exception as e:
                 logger.warning(f"Error reading config file: {e!s}")
         
-        # Use default model for the primary project type
-        if self.primary_project_type in DEFAULT_EMBEDDING_MODELS:
+        # If fast startup is requested, use the smallest model
+        elif fast_startup:
+            logger.info("FAST_STARTUP is enabled, using lightweight embedding model")
+            self.embedding_model = "sentence-transformers/all-MiniLM-L6-v2"  # Smallest model ~90MB
+            self.model_config = DEFAULT_MODEL_CONFIGS["default"].copy()
+            
+        # Otherwise use default model for the primary project type
+        elif self.primary_project_type in DEFAULT_EMBEDDING_MODELS:
             self.embedding_model = DEFAULT_EMBEDDING_MODELS[self.primary_project_type]
             self.model_config = DEFAULT_MODEL_CONFIGS[self.primary_project_type].copy()
         else: