WhisperCast is a command-line application that turns text into audio. It uses a language model to write scripts and a text-to-speech engine to narrate them, so you can produce podcasts, build audiobooks, or study a document interactively from a single shell.
- Generate high-quality podcast scripts using advanced language models.
- Convert scripts into audio files with natural-sounding voices.
- Customize topics and durations for tailored podcast episodes.
- Transform text files, PDFs, or URLs into immersive audiobooks.
- Supports multiple file formats, including `.txt`, `.pdf`, and `.docx`.
- Ensures a conversational and engaging tone for listeners.
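
A minimal sketch of how an input file might be routed by extension before narration. The names `SUPPORTED_EXTENSIONS` and `extract_text` are hypothetical and are not taken from WhisperCast's `utils/extractor.py`. Only the `.txt` branch is implemented, because the libraries the project uses to parse PDF and DOCX files are not stated here.

```python
from pathlib import Path

# Formats listed above; the constant name is hypothetical.
SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".docx"}


def extract_text(file_path: str) -> str:
    """Return the plain text of a supported file (illustrative only)."""
    path = Path(file_path)
    suffix = path.suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")
    # PDF and DOCX need a third-party parser; which one the project
    # uses is an assumption left open in this sketch.
    raise NotImplementedError(f"No parser wired up for {suffix} in this sketch")
```

Rejecting unknown extensions up front keeps the error close to the user's input instead of failing later inside the text-to-speech step.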
- Upload a file and let the AI teach you its content.
- Ask questions interactively, and get concise, accurate answers.
- Perfect for learning new topics or exploring complex documents.
- List all generated audio files with the `ls` command.
- Play audio files directly from the command line using the `play` command.
- Fetch topic-related data from multiple sources:
- Wikipedia summaries
- Google News articles
- DuckDuckGo insights
- Reddit discussions
- Hacker News articles
- Combine and summarize content for comprehensive insights.
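
The fetch-and-combine flow above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in `utils/fetcher.py`: the function names are hypothetical, only the Wikipedia source is shown (via Wikipedia's public REST summary endpoint), and the per-source character cap is an arbitrary choice.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict


def fetch_wikipedia_summary(topic: str, timeout: float = 10.0) -> str:
    """Fetch a short summary from Wikipedia's public REST API."""
    title = urllib.parse.quote(topic.replace(" ", "_"), safe="")
    url = f"https://en.wikipedia.org/api/rest_v1/page/summary/{title}"
    request = urllib.request.Request(
        url, headers={"User-Agent": "WhisperCast-example/0.1"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("extract", "")


def combine_sources(snippets: Dict[str, str], max_chars_per_source: int = 500) -> str:
    """Merge per-source text into one labeled block, skipping empty sources."""
    sections = []
    for source, text in snippets.items():
        cleaned = " ".join(text.split())  # collapse runs of whitespace
        if not cleaned:
            continue
        sections.append(f"[{source}]\n{cleaned[:max_chars_per_source]}")
    return "\n\n".join(sections)
```

Labeling each section with its source lets a later summarization step attribute claims, and the cap keeps one verbose source from crowding out the others.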
- Python 3.8 or higher
- `pip` (Python package manager)
- Clone the repository:

      git clone https://github.com/your-username/WhisperCast.git
      cd WhisperCast

- Install dependencies:

      pip install -r requirements.txt

- Set up environment variables:
  - Create a `.env` file in the root directory.
  - Add the following variables:

        GROQ_API_KEY=your_groq_api_key
        DEBUG=true
        ENVIRONMENT=development

- Run the application:

      python main.py
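
For illustration, here is one way the `.env` values from the setup steps could be read at startup using only the standard library. Whether WhisperCast uses a package such as python-dotenv for this is not stated, so `load_env_file` and `require_api_key` are hypothetical helpers, and the parser handles only plain `KEY=VALUE` lines.

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines and export them to os.environ.

    Returns the values as written in the file.
    """
    loaded = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("\"'")
        os.environ.setdefault(key, value)  # variables already set in the shell win
        loaded[key] = value
    return loaded


def require_api_key() -> str:
    """Fail early with a clear message if the Groq key is missing."""
    key = os.environ.get("GROQ_API_KEY", "")
    if not key or key == "your_groq_api_key":
        raise RuntimeError("GROQ_API_KEY is not set; add it to your .env file")
    return key
```

Checking for the placeholder value catches the common mistake of copying the template without substituting a real key.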
- `podcast <topic>`: Generate a podcast script and audio file for the given topic.
- `audiobook <file_path>`: Convert a file into an audiobook.
- `sensei <file_path>`: Learn interactively about a file's content and ask questions.
- `ls`: List all available audio files in the `output` directory.
- `play <file_number>`: Play an audio file by selecting its number from the `ls` command.
- `clear`: Clear the terminal screen.
- `bye`: Exit the application.
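
To show how the `ls`, `play`, and `bye` commands might fit together, here is a small sketch built on Python's standard `cmd` module. It is not the implementation in `cli/shell.py`: the class name `SketchShell`, the helper `list_audio_files`, and the assumed `.wav`/`.mp3` suffixes are all hypothetical, and actual playback is replaced by a printed message.

```python
import cmd
from pathlib import Path
from typing import List

AUDIO_SUFFIXES = {".wav", ".mp3"}  # assumed output formats


def list_audio_files(directory: str) -> List[Path]:
    """Return audio files in `directory`, sorted by name for stable numbering."""
    folder = Path(directory)
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in AUDIO_SUFFIXES)


class SketchShell(cmd.Cmd):
    prompt = "whispercast> "

    def __init__(self, output_dir: str = "output", **kwargs):
        super().__init__(**kwargs)
        self.output_dir = output_dir

    def do_ls(self, arg: str) -> None:
        """ls: list audio files with 1-based numbers."""
        files = list_audio_files(self.output_dir)
        if not files:
            self.stdout.write("No audio files found.\n")
        for number, path in enumerate(files, start=1):
            self.stdout.write(f"{number}. {path.name}\n")

    def do_play(self, arg: str) -> None:
        """play <file_number>: resolve a number from `ls` to a file."""
        files = list_audio_files(self.output_dir)
        choice = arg.strip()
        if not choice.isdecimal() or not 1 <= int(choice) <= len(files):
            self.stdout.write("Usage: play <file_number> (see ls)\n")
            return
        # Real playback is omitted; the audio backend is not specified here.
        self.stdout.write(f"Would play {files[int(choice) - 1].name}\n")

    def do_bye(self, arg: str) -> bool:
        """bye: exit the shell (returning True stops the command loop)."""
        return True
```

Calling `SketchShell().cmdloop()` would start the interactive prompt. Sorting the files before numbering them means the number shown by `ls` maps to the same file when passed to `play`.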
WhisperCast/
├── cli/
│ └── shell.py # Command-line interface implementation
├── utils/
│ ├── text_to_speech.py # Text-to-speech functionality
│ ├── llm.py # Language model interactions
│ ├── extractor.py # File and content extraction utilities
│ ├── fetcher.py # Fetch content from external sources
│ ├── finder.py # File management utilities
│ └── log_manager.py # Logging configuration
├── main.py # Entry point for the application
├── requirements.txt # Python dependencies
└── README.md # Project documentation
- Text-to-Speech: Powered by Coqui TTS for natural-sounding audio generation.
- Language Models: Uses Meta's Llama 3, served through the Groq API, for text processing and script generation.
- Web Scraping: Fetches content from Wikipedia, Google News, Reddit, and more using `BeautifulSoup` and `feedparser`.
- Logging: Comprehensive logging with `loguru` for debugging and monitoring.
We welcome contributions to WhisperCast! To contribute:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Submit a pull request with a detailed description of your changes.
This project is licensed under the Apache License. See the `LICENSE` file for details.
- Coqui TTS for their exceptional text-to-speech library.
- Groq for their powerful language models.
- The open-source community for providing invaluable tools and resources.
For questions, feedback, or support, please reach out to:
- Email: pseudopythonic@gmail.com
- GitHub: pythonicforge