image-captioning-ai

Here are 2 public repositories matching this topic...

An offline AI-powered video analysis tool with object detection (YOLO), image captioning (BLIP), speech transcription (Whisper), audio event detection (PANNs), and AI-generated summaries (LLMs via Ollama). It ensures privacy and offline use with a user-friendly GUI.

  • Updated Feb 23, 2025
  • Python

The dataset contains over 82,000 images, each with at least 5 different caption annotations. The notebook downloads and extracts the dataset automatically. Warning: the download is 1.3 GB and can take a long time.

  • Updated Nov 1, 2023
  • Jupyter Notebook
