jdopensource/JoyAI-Image-Edit-Diffusers

@@ -27,20 +27,28 @@ JoyAI-Image-Edit is a multimodal foundation model specialized in instruction-gui
 **Requirements**: Python >= 3.10, CUDA-capable GPU
-#### Core Dependencies
 | Package | Version | Purpose |
 |---------|---------|---------|
 | `torch` | >= 2.8 | PyTorch |
 | `transformers` | >= 4.57.0, < 4.58.0 | Text encoder |
-#### Install the [Pull Request](https://github.com/huggingface/diffusers/pull/13444) of JoyAI-Image-Edit of diffusers
 ```bash
-pip install git+https://github.com/huggingface/diffusers.git@refs/pull/13444
 ```
-#### Running with Diffusers
 ```python
 import torch
 from PIL import Image
@@ -54,7 +62,7 @@ pipeline.set_progress_bar_config(disable=None)
 print("pipeline loaded")
 img_path = "./test_images/input.png"
-prompt = "Remove the construction structure from the top of the crane."
 image = Image.open(img_path).convert("RGB")
 prompts = [f"<|im_start|>user\n<image>\n{prompt}<|im_end|>\n"]
@@ -103,8 +111,12 @@ Move the <object> into the red box and finally remove the red box.
 **Example:**
 ```text
-Move the apple into the red box and finally remove the red box.
 ```
 #### 2. Object Rotation
@@ -136,9 +148,13 @@ Rotate the <object> to show the <view> side view.
 **Examples:**
 ```text
-Rotate the chair to show the front side view.
-Rotate the car to show the rear left side view.
 ```
 #### 3. Camera Control
@@ -175,10 +191,14 @@ Move the camera.
 ```text
 Move the camera.
-- Camera rotation: Yaw -90°, Pitch 20°.
 - Camera zoom: unchanged.
 - Keep the 3D scene static; only change the viewpoint.
 ```
 ## License Agreement
@@ -186,5 +206,5 @@ JoyAI-Image is licensed under Apache 2.0.
 ## ☎️  We're Hiring!
-We are actively hiring Research Scientists, Engineers, and Interns to join us in building next-generation generative foundation models and bringing them into real-world applications. If you’re interested, please send your resume to: [huanghaoyang.ocean@jd.com](mailto:huanghaoyang.ocean@jd.com)

 **Requirements**: Python >= 3.10, CUDA-capable GPU
+### Core Dependencies
+The `transformers` version must be **>= 4.57.0 and < 4.58.0**; other versions may produce incorrect results.
 | Package | Version | Purpose |
 |---------|---------|---------|
 | `torch` | >= 2.8 | PyTorch |
 | `transformers` | >= 4.57.0, < 4.58.0 | Text encoder |
+| `torchvision` | - | Image processing |
+| `einops` | - | Tensor manipulation |
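The `transformers` window in the table can be checked before loading the pipeline. The following is a minimal sketch, not part of the repository: the helper names `transformers_version_ok` and `installed_transformers_ok` are hypothetical, and the parser reads only the leading `major.minor.patch` digits, so a pre-release such as `4.57.0.dev0` is treated like `4.57.0`.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_major_minor_patch(version: str) -> Tuple[int, int, int]:
    """Read the leading 'major.minor[.patch]' digits of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.group(1), match.group(2), match.group(3) or "0"
    return int(major), int(minor), int(patch)


def transformers_version_ok(version: str) -> bool:
    """True if the version satisfies >= 4.57.0 and < 4.58.0 (from the table)."""
    return (4, 57, 0) <= parse_major_minor_patch(version) < (4, 58, 0)


def installed_transformers_ok() -> Optional[bool]:
    """Check the installed transformers package; None if it is not installed."""
    try:
        return transformers_version_ok(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_transformers_ok()` at the top of a script and stopping when it returns `False` would surface a version mismatch before any inference is run.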
+### Install the JoyAI-Image-Edit [pull request](https://github.com/huggingface/diffusers/pull/13444) for diffusers
+```bash
+pip install git+https://github.com/huggingface/diffusers.git@refs/pull/13444/head
+```
+### Or install from the `joyimage_edit` branch of Moran232/diffusers (the PR is expected to be merged into the diffusers main branch soon)
 ```bash
+pip install torch==2.8 transformers==4.57.6 torchvision einops
+pip install git+https://github.com/Moran232/diffusers.git@joyimage_edit
 ```
+### Running with Diffusers
 ```python
 import torch
 from PIL import Image
 print("pipeline loaded")
 img_path = "./test_images/input.png"
+prompt = "Move the board into the red box and finally remove the red box."
 image = Image.open(img_path).convert("RGB")
 prompts = [f"<|im_start|>user\n<image>\n{prompt}<|im_end|>\n"]
 **Example:**
 ```text
+Move the board into the red box and finally remove the red box.
 ```
+<p align="center">
+  <img src="test_images/input1.png" width="40%" />
+  <img src="test_images/output1_predicted.png" width="40%" />
+</p>
 #### 2. Object Rotation
 **Examples:**
 ```text
+Rotate the dog to show the left side view.
 ```
+<p align="center">
+  <img src="test_images/input2.png" width="40%" />
+  <img src="test_images/output2_predicted.png" width="40%" />
+</p>
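The object-move and object-rotation prompts follow fixed templates, so they can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: the function names are hypothetical, the template strings are copied from the examples in this README, and the wrapper reproduces the `<|im_start|>user ... <|im_end|>` format used in the "Running with Diffusers" snippet. Which `view` values the model accepts beyond those shown in the examples is not stated here.

```python
def move_prompt(obj: str) -> str:
    """Fill the object-move template with an object name."""
    return f"Move the {obj} into the red box and finally remove the red box."


def rotation_prompt(obj: str, view: str) -> str:
    """Fill the object-rotation template, e.g. view='left' or 'rear left'."""
    return f"Rotate the {obj} to show the {view} side view."


def wrap_for_pipeline(prompt: str) -> str:
    """Wrap an instruction in the chat format used in the usage snippet."""
    return f"<|im_start|>user\n<image>\n{prompt}<|im_end|>\n"


# Build the list of prompts in the same shape as the usage snippet.
prompts = [wrap_for_pipeline(rotation_prompt("dog", "left"))]
print(prompts[0])
```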
 #### 3. Camera Control
 ```text
 Move the camera.
+- Camera rotation: Yaw 0.0°, Pitch -15.0°.
 - Camera zoom: unchanged.
 - Keep the 3D scene static; only change the viewpoint.
 ```
+<p align="center">
+  <img src="test_images/input3.png" width="40%" />
+  <img src="test_images/output3_predicted.png" width="40%" />
+</p>
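The camera-control prompt is a small multi-line template in which only the yaw and pitch change. A minimal sketch of a builder follows; the function name `camera_prompt` is hypothetical, angles are formatted with one decimal place to match the example above, and the zoom line is kept fixed at "unchanged". The sign conventions and valid angle ranges are not specified in this excerpt, so the helper does not validate them.

```python
def camera_prompt(yaw_deg: float, pitch_deg: float) -> str:
    """Build the camera-control prompt for a given yaw and pitch in degrees."""
    lines = [
        "Move the camera.",
        f"- Camera rotation: Yaw {yaw_deg:.1f}°, Pitch {pitch_deg:.1f}°.",
        "- Camera zoom: unchanged.",
        "- Keep the 3D scene static; only change the viewpoint.",
    ]
    return "\n".join(lines)


# Reproduces the example prompt shown above.
print(camera_prompt(0.0, -15.0))
```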
 ## License Agreement
 ## ☎️  We're Hiring!
+We are actively hiring Research Scientists, AI Infra Engineers, and Interns to join us in building next-generation generative foundation models and bringing them into real-world applications. If you’re interested, please send your resume to: [huanghaoyang.ocean@jd.com](mailto:huanghaoyang.ocean@jd.com)