PosePerfect is a Python project that uses generative AI for object pose editing. It detects and segments objects using YOLOv5 and SAM, then modifies their pose with Stable Diffusion inpainting. The project outputs segmented or inpainted images and includes simple setup and usage scripts.

saarinii/PosePerfect

PosePerfect

PosePerfect is a Python project that allows you to edit the pose of an object within a scene using cutting-edge generative AI techniques. This project focuses on two key tasks:

Object Segmentation:

Files: PosePerfectT1.ipynb (a notebook for a quick view of the code and its output) and PosePerfectTask1/run.py (the script that runs the code).

What it does: Isolates the target object from the background image.

How it works: Takes an input image and a text prompt specifying the object class (e.g., chair, table).

Output: A segmented image with a red mask highlighting the object's boundaries.

This code utilizes YOLOv5 for object detection and the Segment Anything Model (SAM) for segmentation to detect, segment, and visually highlight specified objects in images.

Key Components

Library Imports:

OpenCV (cv2): For image processing (reading, writing).

NumPy (np): For numerical operations on image data.

Torch: To load and run the YOLOv5 and SAM models.

Matplotlib (plt): For displaying images.

Configuration:

The model path (CHECKPOINT_PATH) and type (MODEL_TYPE) are set, and the device is configured to use a GPU if available.

Models:

SAM is loaded using its checkpoint and set to the specified device. An instance of SamAutomaticMaskGenerator is created for segmentation. The YOLOv5 model is loaded from Ultralytics’ repository as a pre-trained model.
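The configuration and model loading described above could look roughly like the sketch below. `CHECKPOINT_PATH` and `MODEL_TYPE` are the names used in this README; the checkpoint filename, the `yolov5s` variant, and the helper names `pick_device` and `load_models` are assumptions made for illustration, not necessarily what the notebook uses.

```python
# Assumed values: adjust to the checkpoint you actually downloaded.
CHECKPOINT_PATH = "sam_vit_h_4b8939.pth"  # assumption: the public ViT-H SAM checkpoint
MODEL_TYPE = "vit_h"


def pick_device():
    """Return "cuda" when a GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch nothing can run on a GPU anyway.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def load_models(checkpoint_path=CHECKPOINT_PATH, model_type=MODEL_TYPE, device=None):
    """Load SAM, its automatic mask generator, and a pre-trained YOLOv5 model."""
    # Imported here so the module can be inspected without the heavy dependencies.
    import torch
    from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

    device = device or pick_device()

    # SAM: build the model from its checkpoint and move it to the device.
    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    sam.to(device=device)
    mask_generator = SamAutomaticMaskGenerator(sam)

    # YOLOv5: pre-trained weights from the Ultralytics hub repository.
    # "yolov5s" is an assumed variant; larger ones (yolov5m, yolov5l) also exist.
    yolo = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

    return sam, mask_generator, yolo
```

Calling `load_models()` downloads the YOLOv5 weights on first use and requires the SAM checkpoint file to be present at `CHECKPOINT_PATH`.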

Image Display Function:

The display_image function reads an image, converts it from BGR to RGB, and displays it using Matplotlib.

Detection and Segmentation Function:

The function detects objects in the image and checks for the target object. If detected, it extracts the region of interest (ROI), segments it using SAM, and applies a red highlight before saving and displaying the final image.
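The red highlight step can be sketched with plain NumPy. This is a minimal illustration, assuming the SAM mask is a boolean array covering only the ROI and that the image is in OpenCV's BGR channel order; the function names and the `alpha` blending parameter are hypothetical, and the notebook may paint the mask differently.

```python
import numpy as np


def highlight_mask(image_bgr, mask, color=(0, 0, 255), alpha=0.5):
    """Blend `color` (BGR, red by default) into the pixels where `mask` is True."""
    out = image_bgr.astype(np.float32)
    color_arr = np.array(color, dtype=np.float32)
    out[mask] = (1.0 - alpha) * out[mask] + alpha * color_arr
    return out.astype(np.uint8)


def highlight_roi(image_bgr, roi_mask, box, color=(0, 0, 255), alpha=0.5):
    """Place an ROI-sized mask back into full-image coordinates, then highlight it.

    `box` is (x1, y1, x2, y2), the detection box the ROI was cropped from.
    """
    x1, y1, x2, y2 = box
    full_mask = np.zeros(image_bgr.shape[:2], dtype=bool)
    full_mask[y1:y2, x1:x2] = roi_mask
    return highlight_mask(image_bgr, full_mask, color=color, alpha=alpha)
```

With `alpha=1.0` the masked pixels become solid red; lower values keep the object visible underneath the tint.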

[Example images: chair, highlighted_image1, highlighted_chair_red]

Setup

  1. Open VSCode.
  2. Clone the repository:
       git clone https://github.com/saarinii/PosePerfect.git
  3. Navigate to the project directory:
       cd PosePerfect/PosePerfectTask1
  4. Install the required packages:
       pip install -r requirements.txt
  5. Run the script:
       python run.py --image ./example.jpg --class "chair" --output ./generated.png

Replace ./example.jpg with the path to your input image, "chair" with the object class you want to segment, and ./generated.png with the desired output path.
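For readers curious how such a command line might be parsed, here is a minimal `argparse` sketch. It is an assumption about how `run.py` could be structured, not a copy of it; the `build_parser` helper, the `target_class` attribute name, and the default output path are illustrative. Because `class` is a reserved word in Python, the `--class` flag is stored under a different attribute name.

```python
import argparse


def build_parser():
    """Build a parser for the three flags shown in the command above."""
    parser = argparse.ArgumentParser(
        description="Segment an object of a given class in an image."
    )
    parser.add_argument("--image", required=True, help="path to the input image")
    # `args.class` would be a syntax error, so store the value as `target_class`.
    parser.add_argument(
        "--class",
        dest="target_class",
        required=True,
        help='object class to segment, e.g. "chair"',
    )
    parser.add_argument(
        "--output",
        default="./generated.png",  # assumed default, for illustration only
        help="path for the output image",
    )
    return parser


# Parse an explicit argument list instead of sys.argv to keep the example self-contained.
args = build_parser().parse_args(
    ["--image", "./example.jpg", "--class", "chair", "--output", "./generated.png"]
)
print(args.image, args.target_class, args.output)
```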

Pose Editing:

Files: PosePerfectT2.ipynb (a notebook for a quick view of the code and its output) and PosePerfectTask2/run.py (the script that runs the code).

What it does: Detects a target object in an image, segments it, and then uses inpainting to modify the image based on the mask of the detected object.

How it works: The script accepts an input image and a text prompt specifying the object class (e.g., chair, table). It uses YOLOv5 to detect the specified object, SAM to generate a mask around the object, and Stable Diffusion inpainting to modify the image based on the mask.

Output: An inpainted image where the specified object is detected, segmented, and modified (e.g., filled in or removed) using Stable Diffusion. The output is saved as a new image file.

Key Components

Library Imports:

OpenCV (cv2): For image processing (reading, writing, and displaying).

NumPy (np): For numerical operations on image data.

Torch: For loading and running the YOLOv5, SAM, and Stable Diffusion models.

PIL (Pillow): For handling images in the inpainting process.

Diffusers: To handle the Stable Diffusion inpainting pipeline.

Configuration:

The script checks if CUDA is available to run models on a GPU. It specifies a pre-trained YOLOv5 model from Ultralytics for object detection. The Segment Anything Model (SAM) is loaded using a checkpoint, with its type (e.g., vit_h) specified.

Models:

YOLOv5: Loaded from Ultralytics, this pre-trained model detects objects in the input image and extracts the region of interest (ROI) for further segmentation.

SAM (Segment Anything Model): Generates a mask around the detected object, allowing for precise segmentation.

Stable Diffusion Inpainting: Uses the diffusers library to apply inpainting on the segmented area, modifying the image based on the generated mask.

Detection and Segmentation Function:

Detecting Objects: The script reads an image and detects objects using YOLOv5, identifying the specified object class by its label (e.g., chair). The detected object's coordinates are used to extract the ROI.
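A sketch of the class-matching step, assuming the detections have been converted to a list of dicts with the column names that YOLOv5's `results.pandas().xyxy[0]` table provides (`xmin`, `ymin`, `xmax`, `ymax`, `confidence`, `name`). The function name `find_target_box` and the `min_conf` threshold are hypothetical.

```python
def find_target_box(detections, target_class, min_conf=0.25):
    """Return (x1, y1, x2, y2) for the most confident detection of `target_class`.

    Returns None when the class was not detected above `min_conf`.
    """
    matches = [
        d for d in detections
        if d["name"] == target_class and d["confidence"] >= min_conf
    ]
    if not matches:
        return None
    best = max(matches, key=lambda d: d["confidence"])
    # Pixel indices must be integers before they can be used for slicing.
    return tuple(int(round(best[k])) for k in ("xmin", "ymin", "xmax", "ymax"))


# Example records, shaped like results.pandas().xyxy[0].to_dict("records").
detections = [
    {"xmin": 10.2, "ymin": 20.7, "xmax": 110.9, "ymax": 220.1,
     "confidence": 0.91, "name": "chair"},
    {"xmin": 300.0, "ymin": 40.0, "xmax": 380.0, "ymax": 200.0,
     "confidence": 0.55, "name": "chair"},
    {"xmin": 5.0, "ymin": 5.0, "xmax": 50.0, "ymax": 50.0,
     "confidence": 0.80, "name": "table"},
]
box = find_target_box(detections, "chair")
print(box)
```

Given a box, the ROI is a plain array slice of the image: `roi = image[y1:y2, x1:x2]`.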

Segmentation: SAM is employed to generate a mask for the object within the ROI, and the best mask (highest confidence) is selected for further processing.
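One way to pick the "best" mask is sketched below. `SamAutomaticMaskGenerator.generate` returns a list of dicts that include `segmentation`, `area`, `predicted_iou`, and `stability_score`; treating `predicted_iou` as the confidence is an assumption here, since the README does not say which score the script uses. The function name is hypothetical.

```python
def select_best_mask(masks, score_key="predicted_iou"):
    """Return the mask record with the highest score, or None if the list is empty."""
    if not masks:
        return None
    return max(masks, key=lambda m: m[score_key])


# Stand-in records; real ones hold a boolean array under "segmentation".
masks = [
    {"segmentation": "mask_a", "area": 1200, "predicted_iou": 0.88},
    {"segmentation": "mask_b", "area": 5400, "predicted_iou": 0.97},
    {"segmentation": "mask_c", "area": 300, "predicted_iou": 0.71},
]
best = select_best_mask(masks)
print(best["segmentation"])
```

Passing `score_key="area"` instead would select the largest mask, which can be a better fit when the object fills most of the ROI.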

Inpainting: The mask is applied to the image, and Stable Diffusion is used to perform inpainting on the segmented region. This modifies the image based on the prompt and mask (e.g., removing or replacing the object).
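The inpainting call itself could be as short as the sketch below, which uses the `StableDiffusionInpaintPipeline` class from the diffusers library. The model ID, the function name `inpaint_region`, and the way the pipeline is configured are assumptions; the script may use different weights or extra arguments. The pipeline expects PIL images, with white pixels in the mask marking the region to repaint.

```python
def inpaint_region(image, mask, prompt,
                   model_id="stabilityai/stable-diffusion-2-inpainting",
                   device="cpu"):
    """Repaint the masked region of `image` according to `prompt`.

    `image` and `mask` are PIL images of the same size.
    `model_id` is an assumed checkpoint, not necessarily the project's choice.
    """
    # Imported lazily: diffusers is a heavy dependency and downloads weights on first use.
    from diffusers import StableDiffusionInpaintPipeline

    pipe = StableDiffusionInpaintPipeline.from_pretrained(model_id)
    pipe = pipe.to(device)
    result = pipe(prompt=prompt, image=image, mask_image=mask)
    return result.images[0]
```

A typical call would look like `inpaint_region(img, mask, "an empty room", device="cuda")`, where the prompt describes what should appear in place of the masked object.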

Image Saving and Output:

The script saves the segmented mask and the final inpainted image as separate output files, ensuring you get both the segmented and modified versions of the input image.

[Example images: chair, isolated_chair, rotated_chair, highlighted_chair_white, replaced_chair]

Setup

  1. Open VSCode.
  2. Clone the repository:
       git clone https://github.com/saarinii/PosePerfect.git
  3. Navigate to the project directory:
       cd PosePerfect/PosePerfectTask2
  4. Install the required packages:
       pip install -r requirements.txt
  5. Run the script:
       python run.py --image ./example.jpg --class "chair" --azimuth +72 --polar +0 --output ./generated.png

Replace ./example.jpg with the path to your input image, "chair" with the object class you want to edit, ./generated.png with the desired output path, and the values after --azimuth and --polar (here +72 and +0) with the angles you want to rotate the object by.

My trials

To see more of my learning journey, including the experiments, trial and error, and mistakes that went into this project, head to:

AvatarAssignmentAllMyTrys.ipynb
