AI Project
A FastAPI-based AI application providing chat, file, image, and web data services with advanced streaming and tool-based capabilities.
Features
- Real-time chat streaming with AI (supports context from uploaded files)
- File upload, processing, and vector storage (PDF, DOCX, images, etc.)
- Vector store for semantic search and context retrieval
- Image generation using Stable Diffusion
- Web search and web content reading via integrated tools
- Tool-based AI interactions (extensible via utils/tools)
- CORS-enabled API endpoints
- Static file serving for generated outputs
Project Structure
src/
├── constants/ # System prompts and configuration
│ ├── system_prompts.py
│ └── file_type.py
├── models/ # Request/Response models
│ ├── requests/
│ └── responses/
├── routes/ # API route definitions
│ ├── chat_routes.py
│ ├── process_file_routes.py
│ └── vector_store_routes.py
├── services/ # Business logic
│ ├── chat_service.py
│ ├── image_service.py
│ ├── process_file_service.py
│ ├── web_data_service.py
│ └── vector_store_service.py
├── utils/ # Utility functions
│ ├── tools/ # Tool implementations (image, web search, etc.)
│ ├── client.py
│ ├── exception.py
│ ├── image_pipeline.py
│ └── timing.py
└── main.py # Application entry point
API Endpoints
Chat API
POST /api/v1/chat/stream: Stream chat responses in real time (supports tool-based and context-aware interactions)
POST /api/v1/chat: Non-streaming chat (batched response)
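A minimal client sketch for the two chat endpoints above, using only the Python standard library. The `message` payload field and the base URL are assumptions made for illustration; the actual request schema lives in src/models/requests/. The sketch also assumes the stream is plain UTF-8 text. If the server emits Server-Sent Events or JSON lines instead, the decoded text would need further parsing.

```python
import codecs
import json
import urllib.request
from typing import Iterable, Iterator

# Assumption: the server is reachable on the port used in the Docker example.
BASE_URL = "http://localhost:7860"


def build_chat_request(message: str, stream: bool = True) -> urllib.request.Request:
    """Build a POST request for the streaming or non-streaming chat endpoint."""
    path = "/api/v1/chat/stream" if stream else "/api/v1/chat"
    # Hypothetical payload shape; check the project's request models for the real fields.
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_chunks(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode byte chunks incrementally so multi-byte characters split across chunks survive."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush anything still buffered; raises if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


def stream_chat(message: str) -> Iterator[str]:
    """Send the request and yield text as it arrives (requires a running server)."""
    with urllib.request.urlopen(build_chat_request(message)) as response:
        # read1 returns as soon as some bytes are available rather than filling a buffer.
        yield from decode_chunks(iter(lambda: response.read1(4096), b""))
```

With a server running, `for piece in stream_chat("Hello"): print(piece, end="", flush=True)` would print the reply as it is generated.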
File Processing API
POST /api/v1/process-file/upload-and-process-file: Upload and process files (PDF, DOCX, images, etc.), store in vector DB
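The upload endpoint above presumably accepts a multipart form upload, as is usual for FastAPI file endpoints. The sketch below builds such a request with the standard library only. The form field name `file` and the base URL are assumptions, not confirmed by the project; adjust them to match the route definition in routes/process_file_routes.py.

```python
import uuid
import urllib.request

# Assumption: the server is reachable on the port used in the Docker example.
BASE_URL = "http://localhost:7860"
UPLOAD_PATH = "/api/v1/process-file/upload-and-process-file"


def build_upload_request(
    filename: str,
    content: bytes,
    content_type: str = "application/octet-stream",
    field: str = "file",  # hypothetical form field name
) -> urllib.request.Request:
    """Encode one file as multipart/form-data and wrap it in a POST request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    # The closing boundary carries two extra trailing dashes.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        BASE_URL + UPLOAD_PATH,
        data=head + content + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the file; the shape of the response is not documented here, so inspect it before relying on specific fields.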
Vector Store API
GET /api/v1/vector-store/get-all-ids: List all vector store collection IDs
POST /api/v1/vector-store/inspect-collection: Inspect a specific vector store collection
Tool-based AI
- Image Generation: Generate images from prompts using Stable Diffusion
- Web Search: Search the web using Brave Search API
- Web Content Reading: Fetch and summarize web pages using Jina API
Error Handling
- Custom exceptions and global error handling
- Standardized JSON error responses
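One way the two points above could fit together is sketched below: a custom exception carries a status code and an error code, and a single function maps any exception to a uniform JSON body. The class name `AppError`, the function `to_error_response`, and the response fields are all hypothetical; the project's actual exceptions are defined in utils/exception.py and may look different.

```python
class AppError(Exception):
    """Hypothetical application error carrying an HTTP status and a short error code."""

    def __init__(self, message: str, status_code: int = 400, code: str = "bad_request"):
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.code = code


def to_error_response(exc: Exception) -> tuple[int, dict]:
    """Map any exception to (status_code, JSON-serializable body) in one fixed shape."""
    if isinstance(exc, AppError):
        return exc.status_code, {"error": {"code": exc.code, "message": exc.message}}
    # Unknown errors get a generic body so internal details are not leaked to clients.
    return 500, {"error": {"code": "internal_error", "message": "Internal server error"}}
```

In a FastAPI application, a mapping like this is typically registered with `@app.exception_handler(...)` and returned as a `JSONResponse`.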
Static Files
- Serves generated outputs (e.g., images) from the /outputs directory at the /outputs endpoint
Getting Started
Prerequisites
- Python 3.11
- CUDA 12.9.0 (for GPU acceleration)
- FastAPI 0.114.0
- Uvicorn 0.34.2
Installation
- Clone the repository
- Create a virtual environment:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
Running the Application
Local Development
uvicorn main:app --reload --port 8080
Docker Deployment
# Build the Docker image
docker build -t ai-assistance-server .
# Run the container
docker run -p 7860:7860 --gpus all ai-assistance-server
The application will be available at:
- Local: http://localhost:7860
- Server: http://0.0.0.0:7860 or https://leonguyen101120-ai-assistance.hf.space
Development
Key Dependencies
AI/ML:
- diffusers 0.33.1
- transformers 4.52.4
- torch 2.7.0
- accelerate 1.6.0
File Processing (Optional):
- beautifulsoup4 4.13.4
- langchain_chroma 0.2.2
- langchain_huggingface 0.1.2
- langchain_community 0.3.19
- chromadb 0.6.3
- pymupdf 1.25.1
Environment Variables
The following environment variables are required for specific features:
- Brave Search API key (for web search)
- Jina API key (for web content reading)
- HuggingFace API key (for model access)
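A startup check along the following lines can report which features will be unavailable when a key is missing. The variable names below (`BRAVE_API_KEY`, `JINA_API_KEY`, `HUGGINGFACE_API_KEY`) are placeholders chosen for illustration; this README does not state the exact names, so confirm them in the application code before use.

```python
import os
from typing import Mapping

# Hypothetical variable names; replace them with the ones the application actually reads.
FEATURE_ENV_VARS = {
    "web search": "BRAVE_API_KEY",
    "web content reading": "JINA_API_KEY",
    "model access": "HUGGINGFACE_API_KEY",
}


def missing_features(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return {feature: variable name} for every feature whose key is unset or empty."""
    return {
        feature: var
        for feature, var in FEATURE_ENV_VARS.items()
        if not environ.get(var)
    }
```

Calling `missing_features()` at startup and logging the result makes a missing key visible before the first request that needs it.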
License
[Add your license information here]