Fine-tuning and inference of Semantic Segmentation task with Transformers

eborghi10/semantic_segmentation_transformers

Semantic Segmentation with Vision Transformers (ViT)

This repository uses the SegFormer model proposed in "SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers" by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo.

Prerequisites

Create a .env file in the root directory of this repository with your Hugging Face token:

HF_TOKEN=<your_token>
CUDA_LAUNCH_BLOCKING=1
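
As a rough sketch of how a script or notebook might pick up these values, the snippet below parses a .env file using only the Python standard library. The helper names `load_env_file` and `apply_env` are hypothetical and not part of this repository; the actual code may rely on the dev container or a package such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_env(values):
    """Export parsed values to os.environ without overwriting existing ones."""
    for key, value in values.items():
        os.environ.setdefault(key, value)
```

Calling `apply_env(load_env_file())` at the very top of a script makes `HF_TOKEN` available through `os.environ`. `CUDA_LAUNCH_BLOCKING=1` forces CUDA kernel launches to run synchronously, which makes errors easier to trace but slows training, so it needs to be set before CUDA is initialized and is probably best removed once debugging is done.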

Usage

Open this repository in a VS Code Dev Container, then open the following demos:

  • Dataset: dataset.py. Running this script pushes the dataset to the Hugging Face Hub.
  • Training: segformer-vineyard-train.ipynb.
  • Inference: segformer-vineyard-inference.ipynb.

Known issues

  • Training metrics are not being calculated.
  • Mask labels are not yet correct; the right labels need to be applied to the masks and the dataset updated.
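
For the missing training metrics, one common choice for semantic segmentation is mean intersection over union (mean IoU). The following is a minimal pure-Python sketch of that metric, not the repository's implementation: the function name `mean_iou`, its parameters, and the convention of skipping an `ignore_index` label are all assumptions, and predictions are assumed to lie in the range 0 to `num_classes - 1`.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Return (mean IoU, per-class IoU list) for flat sequences of class ids.

    Illustrative helper: classes that appear in neither pred nor target
    get an IoU of None and are excluded from the mean.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        # Pixels labelled with ignore_index do not count toward any class.
        if ignore_index is not None and t == ignore_index:
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    per_class = [i / u if u else None for i, u in zip(intersection, union)]
    present = [v for v in per_class if v is not None]
    mean = sum(present) / len(present) if present else 0.0
    return mean, per_class
```

Here `pred` and `target` would be the flattened predicted and ground-truth masks for one or more images. In practice a vectorized version or an existing library metric is likely a better fit for a training loop; this sketch only shows the arithmetic.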

To do

  • Set up an MLOps pipeline.
  • Compare hyperparameter settings using reliable evaluation metrics.
