---
title: PansGPT Qwen3 Embedding API
emoji: 🚀
colorFrom: blue
colorTo: green
sdk: docker
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
app_port: 7860
short_description: Embedding model
---

# PansGPT Qwen3 Embedding API

A stable, Docker-based API for generating text embeddings with the Qwen3-Embedding-0.6B model. This Space provides a reliable embedding service for the PansGPT application.

## Features

- **Single Text Embedding**: Generate an embedding for an individual text
- **Batch Processing**: Process multiple texts in one request
- **Similarity Calculation**: Compute cosine similarity between embeddings
- **Docker-based**: Stable, containerized deployment
- **Health Monitoring**: Built-in health check endpoint
- **Fallback Support**: Automatic fallback to sentence-transformers if needed

## API Endpoints

### 1. Single Text Embedding

```bash
POST /api/predict
Content-Type: application/json

{
  "data": ["Your text here"]
}
```

### 2. Batch Text Embedding

```bash
POST /api/predict
Content-Type: application/json

{
  "data": [["Text 1", "Text 2", "Text 3"]]
}
```

### 3. Health Check

```bash
GET /health
```

## Usage Examples

### Python

```python
import requests

# Single text embedding
response = requests.post(
    "https://ojochegbeng-pansgpt.hf.space/api/predict",
    json={"data": ["Hello, world!"]}
)
embedding = response.json()["data"][0]

# Batch embedding
response = requests.post(
    "https://ojochegbeng-pansgpt.hf.space/api/predict",
    json={"data": [["Text 1", "Text 2", "Text 3"]]}
)
embeddings = response.json()["data"][0]
```

### JavaScript

```javascript
// Single text embedding
const singleResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ data: ["Hello, world!"] })
});
const embedding = (await singleResponse.json()).data[0];

// Batch embedding (a distinct variable name avoids redeclaring a const)
const batchResponse = await fetch("https://ojochegbeng-pansgpt.hf.space/api/predict", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ data: [["Text 1", "Text 2", "Text 3"]] })
});
const embeddings = (await batchResponse.json()).data[0];
```

## Model Information

- **Base Model**: Qwen3-Embedding-0.6B
- **Embedding Dimension**: 1024 (Qwen3) or 384 (fallback)
- **Max Input Length**: 512 tokens
- **Device**: Auto-detects CUDA/CPU

## Docker Configuration

This Space uses Docker for stable deployment:

- **Base Image**: python:3.11-slim
- **Port**: 7860
- **Health Check**: Built-in monitoring
- **Non-root User**: Runs as a non-root user, following security best practice

## Performance

- **Single Text**: ~100-500 ms (depending on hardware)
- **Batch Processing**: Optimized for multiple texts
- **Memory Usage**: ~2-4 GB RAM
- **Concurrent Requests**: Supports multiple simultaneous requests

## Integration with PansGPT

This API is designed specifically for the PansGPT application:

1. **Stable Connection**: Docker-based deployment eliminates connection issues
2. **Consistent Performance**: Reliable response times
3. **Error Handling**: Comprehensive error handling and fallbacks
4. **Monitoring**: Built-in health checks for monitoring

## Support

For issues or questions:

- Check the health endpoint first: `/health`
- Review the logs for error details
- Ensure your input format matches the expected structure

---

**Note**: This Space is optimized for stability and reliability. The Docker-based deployment ensures consistent performance for the PansGPT application.
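
Since an input-format mismatch is a common client error, a small helper that wraps a text or list of texts in the documented `{"data": [...]}` envelope can help avoid one. This is a hypothetical client-side convenience (`build_payload` is an illustrative name, not part of this API):

```python
def build_payload(texts):
    """Wrap input in the request envelope this API expects.

    A single string becomes {"data": [text]} (single-text request);
    a list of strings becomes {"data": [[t1, t2, ...]]} (batch request).
    """
    if isinstance(texts, str):
        return {"data": [texts]}
    return {"data": [list(texts)]}
```

The result can be passed directly as the `json=` argument to `requests.post`, as in the Python usage example.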
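
The similarity feature is listed under Features but not demonstrated in the usage examples. A minimal, dependency-free client-side sketch of cosine similarity over two returned embedding vectors could look like:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 2-D vectors; real embeddings from this API are 1024- or 384-dimensional.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # parallel vectors -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors -> 0.0
```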
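
Because the fallback model produces 384-dimensional vectors while Qwen3 produces 1024-dimensional ones (see Model Information), a client can infer which backend served a response from the embedding length alone. A sketch, with `embedding_source` as a hypothetical helper rather than part of the API:

```python
def embedding_source(embedding):
    """Infer which model produced an embedding from its dimensionality."""
    sources = {1024: "Qwen3-Embedding-0.6B", 384: "sentence-transformers fallback"}
    return sources.get(len(embedding), "unknown")

print(embedding_source([0.0] * 1024))  # -> Qwen3-Embedding-0.6B
```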