Generated Images
- Python 3.9
- PyTorch 1.9
- At least 1x Tesla V100 32GB GPU (for training)
- CPU only (for inference)
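To quickly check that your environment matches these requirements, you can print the Python and PyTorch versions and whether a GPU is visible (an optional sanity check, not part of the original setup):

```bash
# optional sanity check: confirm Python/PyTorch versions and GPU visibility
python --version
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```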
EfficientCLIP-GAN is a small, fast, and efficient generative model that can generate multiple images per second even on a CPU, in contrast to diffusion models.
Clone this repo.
```bash
git clone https://github.com/VinayHajare/EfficientCLIP-GAN
pip install -r requirements.txt
```
Install CLIP
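CLIP is commonly installed directly from OpenAI's repository; for example (check `requirements.txt` in case a specific version is pinned there instead):

```bash
# install OpenAI's CLIP package from its GitHub repository
pip install git+https://github.com/openai/CLIP.git
```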
- Download the preprocessed metadata for birds and extract it to `data/`
- Download the birds image data and extract it to `data/birds/`

OR

- Download the preprocessed metadata and the CUB dataset as a single zip file and extract it to `data/`
```bash
cd EfficientCLIP-GAN/code/
```
- For bird dataset:
```bash
bash scripts/train.sh ./cfg/bird.yml
```
If your training process is interrupted unexpectedly, set `state_epoch`, `log_dir`, and `pretrained_model_path` in `train.sh` to the appropriate values to resume training, as sketched below.
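For example, the relevant entries in `train.sh` might look like the following; the values and the exact variable format are placeholders and depend on your interrupted run and on how the script passes its arguments:

```bash
# illustrative placeholders for resuming an interrupted run (adjust to your setup)
state_epoch=100                                                # last completed epoch before the interruption
log_dir=./logs/bird/train                                      # log directory of the interrupted run
pretrained_model_path=./saved_models/bird/state_epoch_100.pth  # checkpoint saved at that epoch (hypothetical file name)
```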
Our code supports automatic FID evaluation during training; the results are stored in TensorBoard files under `./logs`. You can change the test interval by setting `test_interval` in the YAML file.
- For bird dataset:
```bash
tensorboard --logdir=./code/logs/bird/train --port 8166
```
- EfficientCLIP-GAN for Birds. Download and save it to `./code/saved_models/pretrained/`
```bash
cd EfficientCLIP-GAN/code/
```
Set `pretrained_model` in `test.sh` to the path of the downloaded model (see the sketch after the command below).
- For bird dataset:
```bash
bash scripts/test.sh ./cfg/bird.yml
```
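As a sketch, the `pretrained_model` entry in `test.sh` could simply point at the downloaded checkpoint; the file name below is a placeholder, and the exact variable format depends on the script:

```bash
# illustrative placeholder: point test.sh at the downloaded checkpoint
pretrained_model=./saved_models/pretrained/<checkpoint>.pth
```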
The released model achieves better performance than Latent Diffusion.
| Model | Birds-FID↓ | Birds-CS↑ |
|---|---|---|
| EfficientCLIP-GAN | 11.806 | 31.70 |
The Gradio demo is available as a hosted HuggingFace Space.
You can also run the app locally:
```bash
cd "EfficientCLIP-GAN/gradio app"
pip install -r requirements.txt
python app.py
```
Weights are available on the HuggingFace Hub.
- The `inference.ipynb` notebook can be used to sample images from the model.
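For example, the released weights can be fetched with the Hugging Face CLI; the repository id below is a placeholder, so substitute the actual model repo on the Hub:

```bash
# download the released weights from the HuggingFace Hub (replace <repo-id> with the actual model repository)
pip install huggingface_hub
huggingface-cli download <repo-id> --local-dir ./code/saved_models/pretrained/
```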
If you find this work useful in your research, please consider giving the repository a star.
The code is released for academic research use only. For commercial use, please contact Vinay Hajare.