This is a collaborative data science project under the Omdena São Paulo Chapter focused on Brazilian Sign Language (Libras) recognition. The project aims to develop machine learning models that can classify sign language videos into corresponding Portuguese words.
See STRUCTURE.md for detailed project organization.
- Python 3.11 or higher
- uv package manager (recommended installation via pip: pip install uv)
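A quick way to confirm both prerequisites before cloning is a small check script. The following is only a minimal sketch: the function name `check_prerequisites` is hypothetical and not part of this repository, and it assumes `uv` is considered installed when it can be found on your PATH.

```python
import shutil
import sys

# Minimum version taken from the prerequisites listed above.
MIN_PYTHON = (3, 11)


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a dict saying whether each prerequisite looks satisfied.

    Hypothetical helper for illustration; the parameters exist only so the
    check can be exercised without depending on the local machine.
    """
    return {
        "python": tuple(version_info[:2]) >= MIN_PYTHON,
        # Assumption: uv counts as installed if it is discoverable on PATH.
        "uv": which("uv") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```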
- Clone the repository:
  git clone https://github.com/OmdenaAI/SaoPauloBrazilChapter_BrazilianSignLanguage.git
  cd SaoPauloBrazilChapter_BrazilianSignLanguage
- Install core dependencies:
  uv sync
  For additional dependencies:
  uv sync --extra <group>   # Example: uv sync --extra data
  See pyproject.toml for available dependency groups (data, model, app).
- Using the environment:
  # Activate the environment (uv sync creates it in .venv)
  source .venv/bin/activate   # On Windows: .venv\Scripts\activate
  # Run your code
  python your_script.py
  jupyter notebook
  Or run commands directly without activation:
  uv run python your_script.py
  uv run jupyter notebook
- Adding new dependencies:
  uv add <package>                  # Add to core dependencies
  uv add --optional data <package>  # Add to data processing tools (the data group)
SaoPauloBrazilChapter_BrazilianSignLanguage/
├── data/                    # Data files
│   ├── raw/                 # Original data
│   │   ├── INES/            # INES dataset
│   │   │   └── videos/      # Video files (stored on Google Drive)
│   │   ├── SignBank/        # SignBank dataset
│   │   │   └── videos/      # Video files (stored on Google Drive)
│   │   ├── UFV/             # UFV dataset
│   │   │   └── videos/      # Video files (stored on Google Drive)
│   │   └── V-Librasil/      # V-Librasil dataset
│   │       └── videos/      # Video files (stored on Google Drive)
│   ├── interim/             # Intermediate processing
│   ├── processed/           # Final datasets
│   ├── external/            # Third party data
│   └── papers/              # Related research
├── code/                    # Source code
│   ├── data/                # Data processing
│   └── models/              # Model implementations
├── notebooks/               # Jupyter notebooks
└── tests/                   # Unit tests
See STRUCTURE.md for complete structure details.
- Large video files are stored on Google Drive
- Video directories in the repository structure are placeholders
- Download videos to your local videos/ directories as needed
- Small files like CSV files, labels, and metadata are tracked in Git
- Store processed data (features, embeddings) in processed/
- Document data formats in respective directories
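Because the video directories are placeholders until you download from Google Drive, it can help to check which datasets still have no local videos. The following is a minimal sketch, not a script shipped with the repository: the function name `datasets_missing_videos` is hypothetical, the dataset names and paths are taken from the directory tree above, and it assumes that hidden files (for example a `.gitkeep` placeholder) should not count as videos.

```python
from pathlib import Path

# Dataset folders as listed under data/raw/ in the project structure.
DATASETS = ("INES", "SignBank", "UFV", "V-Librasil")


def datasets_missing_videos(repo_root: Path) -> list[str]:
    """Return the datasets whose videos/ directory is absent or empty."""
    missing = []
    for name in DATASETS:
        videos_dir = repo_root / "data" / "raw" / name / "videos"
        # Assumption: dotfiles are placeholders, not downloaded videos.
        has_videos = videos_dir.is_dir() and any(
            p.is_file() and not p.name.startswith(".")
            for p in videos_dir.iterdir()
        )
        if not has_videos:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Run from the repository root.
    for name in datasets_missing_videos(Path.cwd()):
        print(f"No local videos for {name}: download them from Google Drive")
```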