Important Licensing Notice: This application depends on the handwriting_ocr package, parts of which may carry unclear or proprietary licensing terms. It is provided for research and educational purposes only. Consult the handwriting_ocr package documentation and confirm that you comply with all licensing requirements before using this software in any production environment.
A Python application for OCR processing of webcam input, with support for both physical and virtual cameras (like OBS Virtual Camera).
- Real-time webcam capture and display
- Support for physical and virtual cameras (including OBS Virtual Camera)
- OCR processing of captured frames
- GPU acceleration support
- Configurable content types and keywords for improved recognition
- Modern GUI with zoom and pan capabilities
- Memory-efficient processing
- Temporary file management
- Python 3.12.1 or later
- v4l2 and v4l-utils on Linux
- CUDA (optional, for GPU acceleration)
- OBS Studio (optional, for virtual camera)
# Install v4l2 utilities
sudo apt-get update
sudo apt-get install v4l-utils
# For virtual camera support
sudo apt-get install v4l2loopback-dkms
# Load v4l2loopback module
sudo modprobe v4l2loopback
# Add user to video group for camera access (log out and back in for the change to take effect)
sudo usermod -a -G video $USER
The recommended way to install Webcam OCR is using pipx, which installs the application in an isolated environment:
# Install pipx if you haven't already
python3 -m pip install --user pipx
python3 -m pipx ensurepath
# Install Webcam OCR
pipx install git+https://github.com/sjvrensburg/webcam-ocr.git
# Run the application
webcam-ocr
You can also install using pip directly:
# Install from GitHub
pip install git+https://github.com/sjvrensburg/webcam-ocr.git
# Run the application
webcam-ocr
If you want to develop or modify the application:
- Clone the repository:
git clone https://github.com/sjvrensburg/webcam-ocr.git
cd webcam-ocr
- Install using Poetry:
# Install Poetry if you haven't already
curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies
poetry install
# Run the application
poetry run webcam-ocr
Run the application:
webcam-ocr
List available cameras:
webcam-ocr --list-cameras
Use a specific camera by name:
webcam-ocr --camera-name "OBS Virtual Camera"
Use CPU instead of GPU:
webcam-ocr --device cpu
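To see which names a partial match such as `--camera-name "OBS"` could hit, it can help to inspect the V4L2 devices the kernel has registered. The sketch below reads the device names that Linux exposes under `/sys/class/video4linux`. It is only an illustration: the function name `list_v4l2_devices` is hypothetical, and the application's own `--list-cameras` output may be produced differently.

```python
from pathlib import Path
from typing import List, Tuple


def list_v4l2_devices(sysfs_root: str = "/sys/class/video4linux") -> List[Tuple[str, str]]:
    """Return (device path, name) pairs for V4L2 devices found under sysfs.

    Hypothetical helper for illustration; not part of webcam-ocr.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux system, or no V4L2 devices are registered.
        return []
    devices = []
    # Entries are sorted lexicographically, so video10 sorts before video2.
    for entry in sorted(root.glob("video*")):
        name_file = entry / "name"
        name = name_file.read_text().strip() if name_file.is_file() else "unknown"
        devices.append((f"/dev/{entry.name}", name))
    return devices


if __name__ == "__main__":
    for path, name in list_v4l2_devices():
        print(f"{path}: {name}")
```

A virtual camera created through v4l2loopback should appear in this listing once the module is loaded.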
--width INTEGER Initial window width (default: 1200)
--height INTEGER Initial window height (default: 800)
--camera INTEGER Camera device index (default: 0)
--camera-name TEXT Partial name of camera to use (e.g. 'OBS' for OBS Virtual Camera)
--list-cameras List available camera devices and exit
--device TEXT Device to run OCR on (choices: cpu, cuda; default: cuda)
--api-key TEXT Anthropic API key for Claude (optional)
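The options above can be mirrored in a small parser, which is useful when wrapping the tool in a script or checking defaults. This is a minimal sketch using the standard library's `argparse`; the application itself may use a different CLI library, and `build_parser` is a hypothetical name. Only the flags, types, and defaults are taken from the option list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="webcam-ocr")
    parser.add_argument("--width", type=int, default=1200,
                        help="Initial window width")
    parser.add_argument("--height", type=int, default=800,
                        help="Initial window height")
    parser.add_argument("--camera", type=int, default=0,
                        help="Camera device index")
    parser.add_argument("--camera-name", default=None,
                        help="Partial name of camera to use")
    parser.add_argument("--list-cameras", action="store_true",
                        help="List available camera devices and exit")
    parser.add_argument("--device", choices=["cpu", "cuda"], default="cuda",
                        help="Device to run OCR on")
    parser.add_argument("--api-key", default=None,
                        help="Anthropic API key for Claude (optional)")
    return parser


if __name__ == "__main__":
    # Equivalent to: webcam-ocr --camera-name OBS --device cpu
    args = build_parser().parse_args(["--camera-name", "OBS", "--device", "cpu"])
    print(args.camera_name, args.device, args.width, args.height)
```

Restricting `--device` with `choices` makes an unsupported value fail at parse time instead of later, when the OCR model is loaded.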
- Scroll: Zoom in/out
- Left mouse drag: Pan view
- C: Capture current frame
- ESC: Exit application
- Auto-detection of available cameras
- Support for both physical and virtual cameras
- Graceful fallback to default camera if selected camera fails
- Dynamic camera reconnection
- Real-time frame capture
- Configurable content types for improved recognition
- Keyword support for context-aware processing
- Memory-efficient processing with automatic cleanup
- Modern, responsive GUI
- Real-time camera preview
- Zoom and pan controls
- Progress indicators and status updates
- GPU memory monitoring
- Compatible with OBS Virtual Camera
- Reduced buffer size for minimal latency
- Automatic format negotiation
- Permission handling and user guidance
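The camera selection behaviour listed above (partial name matching, with a fallback to the default camera) can be sketched as a pure function. This is not the project's implementation: `select_camera`, the `(index, name)` tuple shape, and the case-insensitive matching are all assumptions made for illustration, and the sketch falls back when a camera is not found rather than when it fails to open.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical shape: (device index, human-readable name)
Camera = Tuple[int, str]


def select_camera(
    cameras: Sequence[Camera],
    name: Optional[str] = None,
    index: int = 0,
    default_index: int = 0,
) -> int:
    """Pick a camera index: name match first, then index, then the default."""
    # 1. Partial name match (assumed to be case-insensitive).
    if name:
        needle = name.lower()
        for cam_index, cam_name in cameras:
            if needle in cam_name.lower():
                return cam_index
    # 2. Use the requested index if such a device was detected.
    known = {cam_index for cam_index, _ in cameras}
    if index in known:
        return index
    # 3. Otherwise fall back to the default camera.
    return default_index


if __name__ == "__main__":
    detected = [(0, "Integrated Webcam"), (2, "OBS Virtual Camera")]
    print(select_camera(detected, name="obs"))
    print(select_camera(detected, name="missing", index=5))
```

Keeping the choice separate from the code that opens the device makes the fallback order easy to test without camera hardware.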
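To uninstall, use the same tool you installed with:
# If installed with pipx
pipx uninstall webcam-ocr
# If installed with pip
pip uninstall webcam-ocr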
- This application has only been tested on Linux
- Some dependencies may have unclear or proprietary licensing terms
- The OCR functionality requires significant computational resources
Contributions are welcome! Please feel free to submit a Pull Request. Note that:
- This is an experimental/research project
- Some components have unclear licensing terms
- Production use is not recommended without careful review of all dependencies
This software is provided as-is, without any warranty or guarantee of fitness for any particular purpose. The maintainers of this project make no claims about the licensing status of all components used in this application. Users are responsible for ensuring their use complies with all applicable licenses and terms of use.
This project makes use of:
- OpenCV for image processing
- v4l2 for camera management on Linux
- OBS Studio's virtual camera functionality
- Various OCR and machine learning components (see licensing notice)