Are you learning a new language and looking for a way to improve your pronunciation? Or maybe you are trying to reduce your accent in a foreign language? Look no further, this Jupyter notebook is just for you. The primary aim of this project is to provide an automated speech recognition tool that listens, understands, and provides feedback to your spoken English or German. This is achieved by processing sound files you create while reading books.
The notebook:
- Reads all sound files you crOeate, organized by their creation date.
- Translates your spoken words into text using the latest and best speech-to-text models.
- Highlights the words where the speech recognition model's confidence was the lowest.
- Lists words that were mentioned twice, indicative of self-correction during speech.
- Over time, tracks your improvement and creates a dictionary of words, their meanings, and how often they were used.
This project provides a unique, customizable, and interactive way to track your language learning progress, acting as a personalized language coach.
#SpeechRecognition
#VoiceAnalysis
#LanguageLearning
#SpeechToText
#PronunciationImprovement
#English
#German
- pydub: A simple and easy-to-use Python library for audio file manipulation. Pydub lets you do stuff to audio in a way that isn't stupid.
- SpeechRecognition: Library for performing speech recognition, with support for several engines and APIs, online and offline.
- wave: A Python interface to the WAV sound format. It provides support for operations such as reading from and writing to WAV files.
- os: The OS module in Python provides functions for interacting with the operating system. We can use this library to get the list of sound files and their creation dates.
- NLTK (Natural Language Toolkit): A leading platform for building Python programs to work with human language data.
- DeepSpeech: An open-source Speech-To-Text engine by Mozilla. It uses machine learning techniques and might be used as an alternative to other services.
- pandas: A library providing high-performance, easy-to-use data structures, and data analysis tools for Python.
- numpy: A Python library adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
- matplotlib: A Python plotting library which produces quality figures in a variety of hardcopy formats and interactive environments.
- python-Levenshtein: It provides the Levenshtein distance which can be used to calculate the similarity between words, useful for spell checking and correction.
- datetime: Basic date and time types in Python, useful for managing file dates.
- pyAudioAnalysis: An open-source Python library for audio signal analysis.
- gTTS (Google Text-to-Speech): A Python interface for Google's Text to Speech API.
- librosa: A Python package for music and audio analysis.
- gensim: A Python library for topic modelling, document indexing, and similarity retrieval with large corpora.
This project is beginner-friendly, and all contributions are welcome. Happy learning and happy coding!