Generic JupyterHub Integration Blueprint

Calliope Integration: This component is integrated into the Calliope AI platform. Some features and configurations may differ from the upstream project.

Overview

This blueprint shows how to transform any standalone containerized service into a JupyterHub-compatible service, based on the pattern implemented for WAIIDE (WAIIDE Server → JupyterHub integration).

Core Pattern: Dual-Mode Architecture

Transform a single-purpose service container into a dual-mode container:

Standalone Mode: Original service + API proxy
JupyterHub Mode: jupyterhub-singleuser + jupyter-server-proxy + original service

Implementation Recipe

Step 1: Environment Detection & Mode Selection

Create an entrypoint script that detects JupyterHub environment:

#!/bin/bash
# entrypoint-jupyterhub.sh

# Detect JupyterHub environment
if [ -n "$JUPYTERHUB_SERVICE_PREFIX" ] || [ -n "$JUPYTERHUB_USER" ] || [ -n "$JUPYTERHUB_API_TOKEN" ]; then
    MODE="jupyterhub"
else
    MODE="standalone"
fi

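# Route to appropriate startup mode
# (start_jupyterhub_mode and start_standalone_mode are shell functions that
# must be defined earlier in this script; Step 3 shows what each one starts)
if [ "$MODE" = "jupyterhub" ]; then
    start_jupyterhub_mode
else
    start_standalone_mode
fi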

Key Environment Variables to Check:

JUPYTERHUB_SERVICE_PREFIX - URL prefix (e.g., /user/alice/myservice/)
JUPYTERHUB_USER - Username
JUPYTERHUB_API_TOKEN - API token issued by the Hub to this server
JUPYTERHUB_SERVER_NAME - Named server identifier (empty for the default server)
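
The same detection rule can be mirrored in Python, which makes it easier to unit-test separately from the shell script. This is a minimal sketch, not part of WAIIDE or JupyterHub; the names detect_mode and JUPYTERHUB_MARKERS are hypothetical.

```python
import os

# Variables whose presence signals a JupyterHub-spawned container
# (the same three that the entrypoint script checks).
JUPYTERHUB_MARKERS = (
    "JUPYTERHUB_SERVICE_PREFIX",
    "JUPYTERHUB_USER",
    "JUPYTERHUB_API_TOKEN",
)


def detect_mode(env=None):
    """Return "jupyterhub" if any marker variable is set and non-empty."""
    env = os.environ if env is None else env
    if any(env.get(name) for name in JUPYTERHUB_MARKERS):
        return "jupyterhub"
    return "standalone"
```

Passing the environment as an argument keeps the function pure, so tests can supply a plain dictionary instead of mutating os.environ.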

Step 2: Permission Handling

Handle Docker user permissions properly:

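# Start as root, fix permissions, then drop to target user
# USER_HOME must be set before this point; abort early if it is not
: "${USER_HOME:?USER_HOME must be set}"

if [ "$(id -u)" = "0" ]; then
    echo "🔧 Running as root - fixing permissions..."

    # Create user directories
    mkdir -p "$USER_HOME/workspace"
    mkdir -p "$USER_HOME/.local/share/jupyter/runtime"

    # Fix ownership (UID 1000, GID 100 - standard for Jupyter containers)
    chown -R 1000:100 "$USER_HOME"

    # Drop to non-root user and re-run this script with its original
    # arguments; printf %q keeps arguments containing spaces intact
    TARGET_USER="$(getent passwd 1000 | cut -d: -f1)"
    exec su -s /bin/bash -c "exec $(printf '%q ' "$0" "$@")" "$TARGET_USER"
fi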

Step 3: Dual-Mode Service Architecture

JupyterHub Mode (Port Strategy)

┌─────────────────────────────────────────────────────────────┐
│                    Container (Port 8080)                    │
├─────────────────────────────────────────────────────────────┤
│  jupyterhub-singleuser (0.0.0.0:8080)                      │
│           ↓                                                 │
│  jupyter-server-proxy                                       │
│           ↓                                                 │
│  Original Service (127.0.0.1:8081)                         │
│                                                             │
│  URL: /user/{username}/proxy/8081/ → localhost:8081        │
└─────────────────────────────────────────────────────────────┘

Standalone Mode (Port Strategy)

┌─────────────────────────────────────────────────────────────┐
│                    Container (Port 8080)                    │
├─────────────────────────────────────────────────────────────┤
│  API Server (0.0.0.0:8080)                                 │
│           ↓                                                 │
│  Proxy to Original Service (127.0.0.1:8081)                │
│                                                             │
│  URL: /api → API endpoints                                  │
│  URL: /* → Original Service (proxied)                      │
└─────────────────────────────────────────────────────────────┘
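
In standalone mode, the diagram above reduces to a single routing decision made on port 8080. The sketch below illustrates that decision only; the function name route is hypothetical, and the host and port simply repeat the values from the diagram.

```python
SERVICE_HOST = "127.0.0.1"
SERVICE_PORT = 8081  # internal port of the original service


def route(path):
    """Classify a request path as an API call or a proxied service call.

    Returns ("api", path) for /api and anything below it, otherwise
    ("service", upstream_url) pointing at the internal service port.
    """
    bare = path.split("?", 1)[0]  # ignore the query string when matching
    if bare == "/api" or bare.startswith("/api/"):
        return ("api", path)
    return ("service", f"http://{SERVICE_HOST}:{SERVICE_PORT}{path}")
```

Matching on "/api" exactly or on the "/api/" prefix avoids capturing unrelated paths such as /apiary that merely begin with the same letters.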

Step 4: Create API Server with URL Rewriting

#!/usr/bin/env python3
"""
Generic API server that provides JupyterHub-compatible API endpoints
and proxies requests to the original service with URL path rewriting.
"""

import os
import json
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.request import urlopen, Request
from urllib.parse import urlparse

class ServiceAPIHandler(BaseHTTPRequestHandler):
    def __init__(self, *args, **kwargs):
        self.service_host = '127.0.0.1'
        self.service_port = 8081  # Original service internal port
        
        # Get JupyterHub service prefix for URL rewriting
        self.service_prefix = os.environ.get('JUPYTERHUB_SERVICE_PREFIX', '')
        if self.service_prefix and not self.service_prefix.endswith('/'):
            self.service_prefix += '/'
        super().__init__(*args, **kwargs)
    
    def strip_prefix(self, path):
        """Strip JupyterHub service prefix from path"""
        if self.service_prefix and path.startswith(self.service_prefix):
            stripped = path[len(self.service_prefix)-1:]
            if not stripped:
                stripped = '/'
            return stripped
        return path
    
    def rewrite_content(self, content, content_type):
        """Rewrite URLs in content to include JupyterHub prefix"""
        # Implement URL rewriting for HTML/CSS/JS content
        # Pattern: Replace absolute paths with prefixed paths
        pass
    
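    def send_api_response(self, payload, status=200):
        """Serialize payload as JSON and send it to the client"""
        body = json.dumps(payload).encode('utf-8')
        self.send_response(status)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    
    def handle_api_endpoints(self, path):
        """Handle JupyterHub-compatible API endpoints"""
        if path == '/api' or path == '/api/':
            self.send_api_response({
                "status": "running",
                "user": os.environ.get('JUPYTERHUB_USER', 'unknown'),
                "server": "your-service-name",
                "version": "1.0.0",
                "mode": "jupyterhub" if self.service_prefix else "standalone",
                "service_prefix": self.service_prefix,
                "endpoints": {
                    "api": f"{self.service_prefix}api",
                    "service": f"{self.service_prefix}"
                }
            })
            return True
        return False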
    
    def proxy_to_service(self, stripped_path):
        """Proxy request to original service with URL rewriting"""
        # Implementation similar to WAIIDE's proxy_to_vscode method
        pass
    
    def do_GET(self):
        stripped_path = self.strip_prefix(self.path)
        
        # Handle API endpoints
        if self.handle_api_endpoints(stripped_path):
            return
        
        # Proxy to original service
        self.proxy_to_service(stripped_path)
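
The rewrite_content and proxy_to_service stubs above are left empty. The following is one possible way to fill them in, shown as standalone functions so they can be tested in isolation. It is a sketch under stated assumptions, not the WAIIDE implementation: the names rewrite_absolute_paths and fetch_upstream are hypothetical, only GET is handled, and only the href, src, and action attributes are rewritten.

```python
import re
from urllib.request import Request, urlopen

# Matches href="/...", src="/..." and action="/..." but skips
# protocol-relative URLs such as src="//cdn.example.com/x.js".
_ABS_ATTR = re.compile(r'''\b(href|src|action)=(["'])/(?!/)''')


def rewrite_absolute_paths(html, prefix):
    """Prefix root-relative URLs in HTML attributes with the service prefix."""
    if not prefix:
        return html  # standalone mode: nothing to rewrite
    base = prefix if prefix.endswith("/") else prefix + "/"
    return _ABS_ATTR.sub(lambda m: f"{m.group(1)}={m.group(2)}{base}", html)


def fetch_upstream(stripped_path, host="127.0.0.1", port=8081, timeout=10):
    """GET a path from the original service; return (status, headers, body).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a real
    handler would catch that and relay the upstream status to the client.
    """
    req = Request(f"http://{host}:{port}{stripped_path}", method="GET")
    with urlopen(req, timeout=timeout) as resp:
        return resp.status, dict(resp.headers), resp.read()
```

This regex-based rewrite does not detect paths that already carry the prefix, so it should be applied only once per response. Content that builds URLs in JavaScript or CSS needs separate handling.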

Step 5: Jupyter Server Configuration

Create jupyter_server_config.py:

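"""
Jupyter Server configuration for JupyterHub integration
"""
from jupyter_server.auth.identity import IdentityProvider

c = get_config()  # noqa: F821 - injected by Jupyter when it loads this file

# Configure jupyter-server-proxy
c.ServerProxy.servers = {
    'your-service': {
        # Placeholder command: the entrypoint script starts the real service
        'command': ['echo', 'Service started elsewhere'],
        'port': 8081,
        'timeout': 60,
        'absolute_url': False,
        # 'rewrite_response' takes a callable (or a list of callables), not a
        # boolean; add one here only if responses need rewriting
    }
}

# Permissive authentication for JupyterHub.
# IdentityProvider lives in jupyter_server.auth.identity, and the trait that
# selects it is ServerApp.identity_provider_class. jupyterhub-singleuser
# typically installs its own provider, so verify this override is needed.
c.ServerApp.identity_provider_class = IdentityProvider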

Step 6: Docker Configuration

Update your Dockerfile:

# Install JupyterHub dependencies
# (quote the extras spec so the shell cannot treat [...] as a glob pattern)
RUN pip3 install --break-system-packages --no-cache-dir \
    "jupyter-server-proxy[standalone]" \
    jupyterhub

# Set standard Jupyter container UID/GID (1000:100)
RUN groupadd -g 100 users 2>/dev/null || true && \
    useradd -m -u 1000 -g 100 -s /bin/bash myuser && \
    usermod -aG sudo myuser

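# Copy integration scripts
COPY --chmod=755 scripts/entrypoint-jupyterhub.sh /usr/local/bin/
COPY --chmod=755 scripts/api_server.py /usr/local/bin/
# Jupyter does not search /usr/local/bin for config files: pass
# --config=/usr/local/bin/jupyter_server_config.py at startup, or copy the
# file to /etc/jupyter/ instead
COPY --chmod=755 scripts/jupyter_server_config.py /usr/local/bin/

# Expose the public port (8081 is bound to 127.0.0.1 and stays internal)
EXPOSE 8080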

# Use bash entrypoint for flexibility
ENTRYPOINT ["/bin/bash", "-c"]
CMD ["exec /usr/local/bin/entrypoint-jupyterhub.sh"]

Step 7: JupyterHub Spawner Configuration

Configure your JupyterHub spawner:

# jupyterhub_config.py
c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
c.DockerSpawner.image = 'your-org/your-service:latest'
c.DockerSpawner.network_name = 'jupyterhub-network'
c.DockerSpawner.volumes = {
    'jupyterhub-user-{username}': '/home/{username}'
}
c.DockerSpawner.extra_create_kwargs = {'user': 'root'}  # For permission fixing
c.DockerSpawner.cmd = ''  # Use container's entrypoint
c.Spawner.default_url = '/proxy/8081/'  # Direct to your service

Testing Strategy

Unit Tests

Test environment detection logic
Test API endpoint responses
Test URL rewriting functions
Test proxy functionality
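
As an illustration of the URL rewriting tests, the sketch below exercises the prefix-stripping rule from Step 4 with the standard unittest module. The standalone strip_prefix function is a hypothetical copy of the handler method, extracted so that it can be tested without starting an HTTP server.

```python
import unittest


def strip_prefix(path, service_prefix):
    """Standalone copy of the prefix-stripping rule, for testing."""
    if service_prefix and not service_prefix.endswith("/"):
        service_prefix += "/"
    if service_prefix and path.startswith(service_prefix):
        # Keep the prefix's trailing slash as the new leading slash.
        return path[len(service_prefix) - 1:] or "/"
    return path


class StripPrefixTests(unittest.TestCase):
    def test_strips_user_prefix(self):
        self.assertEqual(strip_prefix("/user/alice/api", "/user/alice/"), "/api")

    def test_prefix_root_maps_to_slash(self):
        self.assertEqual(strip_prefix("/user/alice/", "/user/alice"), "/")

    def test_standalone_path_is_unchanged(self):
        self.assertEqual(strip_prefix("/api", ""), "/api")

    def test_foreign_prefix_is_unchanged(self):
        self.assertEqual(strip_prefix("/user/bob/api", "/user/alice/"), "/user/bob/api")
```

Saved as a test module (for example the test_url_rewriting.py file listed later in this blueprint), it can be run with python -m unittest.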

Integration Tests

Test standalone mode startup
Test JupyterHub mode startup
Test API compatibility with JupyterHub
Test URL path handling

End-to-End Tests

Test spawning from JupyterHub
Test service accessibility
Test OAuth flows
Test WebSocket connections (if applicable)

Common OAuth Fixes

Named Server OAuth Issues

# oauth_named_server_fix.py
"""Fix OAuth redirect URLs for named servers"""
import re

# Matches /user/<username>/hub/ and /user/<username>/<servername>/hub/
_USER_HUB_PREFIX = re.compile(r'/user/[^/]+/(?:[^/]+/)?hub/')

def fix_oauth_redirect_url(url):
    # Remove service prefix from hub OAuth URLs
    if '/user/' in url and '/hub/api/oauth2' in url:
        return _USER_HUB_PREFIX.sub('/hub/', url, count=1)
    return url

Scope Fixes

# jupyter_scope_fix.py
"""Fix OAuth scopes for named servers"""
def patch_oauth_scopes():
    # Add proper scopes for named server access
    pass

Key Implementation Files

Based on the WAIIDE implementation, you’ll need:

Core Files (~1000 lines):
- entrypoint-jupyterhub.sh - Main orchestration script
- api_server.py - API server with proxy functionality
- jupyter_server_config.py - Jupyter server configuration
OAuth Fixes (~200 lines):
- oauth_named_server_fix.py - Fix OAuth redirects
- jupyter_scope_fix.py - Fix OAuth scopes
Testing (~1000 lines):
- test_api.py - API endpoint tests
- test_entrypoint.py - Startup logic tests
- test_url_rewriting.py - URL rewriting tests
- run_tests.py - Test runner
Documentation (~3000 lines):
- Configuration guides
- Troubleshooting guides
- Architecture documentation

Service-Specific Adaptations

For Web Services

Focus on URL rewriting for HTML/CSS/JS content
Handle WebSocket upgrades if needed
Implement proper CORS headers

For API Services

Ensure API endpoints don’t conflict with JupyterHub paths
Handle authentication properly
Consider API versioning

For Desktop Applications (via web interface)

May need VNC/X11 forwarding
Consider noVNC integration
Handle clipboard/file transfer

Success Metrics

✅ Service starts in both modes
✅ API endpoints respond correctly
✅ JupyterHub can health-check the service
✅ URL rewriting works correctly
✅ OAuth authentication works
✅ Service is accessible through JupyterHub
✅ WebSocket connections work (if applicable)

Troubleshooting Checklist

Environment Detection: Check if JupyterHub variables are detected
Permissions: Verify container can create user directories
Port Configuration: Ensure no port conflicts
URL Rewriting: Test with/without service prefix
OAuth: Check for named server OAuth issues
Proxy: Verify requests reach the original service

Advanced Features

Service Discovery

Implement /api/services endpoint for service discovery:

{
  "services": {
    "your-service": {
      "port": 8081,
      "status": "running",
      "description": "Your Service Description"
    }
  }
}
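
One way to produce this payload is to probe each registered port when the endpoint is called. The sketch below assumes a hypothetical registry that maps service names to (port, description) pairs. The "stopped" status value is also an assumption, since the example above shows only "running".

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def build_services_payload(services, probe=port_is_open):
    """Build the /api/services body from {name: (port, description)}."""
    return {
        "services": {
            name: {
                "port": port,
                # "stopped" is an assumed label for an unreachable service
                "status": "running" if probe("127.0.0.1", port) else "stopped",
                "description": description,
            }
            for name, (port, description) in services.items()
        }
    }
```

The probe parameter exists so that tests can substitute a fake check instead of opening real sockets.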

Health Monitoring

Add health check endpoints:

import socket

def check_service_health(self):
    """Check if original service is responding"""
    try:
        with socket.create_connection((self.service_host, self.service_port), timeout=2):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False
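
The same TCP probe can also gate startup, so that the proxy layer begins accepting traffic only once the original service is listening. The sketch below is a hypothetical helper, not part of WAIIDE. Its 60-second default mirrors the timeout value used in the jupyter-server-proxy configuration from Step 5.

```python
import socket
import time


def wait_for_service(host, port, timeout=60.0, interval=0.5):
    """Poll host:port until it accepts TCP connections or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet: give up once the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

time.monotonic is used rather than time.time so that system clock adjustments during container startup cannot shorten or extend the wait.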

Custom URL Patterns

Support custom URL patterns beyond the standard /proxy/ pattern:

# Handle custom paths like /user/{username}/myservice/
# ('myservice' must match the key used in c.ServerProxy.servers)
c.DockerSpawner.default_url = '/myservice/'

Performance Considerations

Memory: Add ~500MB for JupyterHub components
CPU: Minimal overhead for proxy operations
Network: <5ms latency for proxy requests
Startup: Add 10-15 seconds for dual-mode initialization

Security Notes

Always start as root and drop privileges
Use standard Jupyter container UID/GID (1000:100)
Validate all URL rewrites to prevent injection
Implement proper CORS headers for API endpoints
Use secure WebSocket connections when possible
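
To illustrate the note on validating URL rewrites, the sketch below checks a service prefix and a redirect target before either is inserted into a response. The function names and the allow-listed character set are assumptions chosen for this example. A production deployment may need a stricter or looser policy.

```python
import re
from urllib.parse import urlsplit

# Conservative allow-list for a JupyterHub-style prefix such as /user/alice/.
_SAFE_PREFIX = re.compile(r"/(?:[A-Za-z0-9._~@%-]+/)*")


def is_safe_prefix(prefix):
    """Accept only slash-delimited prefixes built from URL-safe characters."""
    if not _SAFE_PREFIX.fullmatch(prefix):
        return False
    return ".." not in prefix.split("/")  # reject path traversal segments


def is_local_path(url):
    """Accept only root-relative paths, rejecting absolute or //host URLs."""
    parts = urlsplit(url)
    return not parts.scheme and not parts.netloc and url.startswith("/")
```

Rejecting anything outside the allow-list means a prefix that contains quotes or angle brackets is never copied into rewritten HTML.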

This blueprint provides a pattern for transforming any containerized service into a JupyterHub-compatible service. The total implementation typically requires ~2000 lines of code across 10-15 files, excluding documentation, and in return provides dual-mode operation with full JupyterHub integration.
