Scan resumes in PDF format, group similar resumes, and identify common skill sets
Use case: Consider a large set of resumes that have been received for various positions at a firm but have not been classified by
position. NLP can be applied to identify the job role each resume targets and to classify the resumes accordingly.
Goals to achieve:
Parse resumes
Build entity relations
Classify resumes according to domains
Identify skill sets that are common per domain
Visualize using JavaScript
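
The "common skill sets per domain" goal could be sketched roughly as below. This is only an illustration, not the project's implementation: it assumes the resume text has already been extracted and each resume already carries a domain label, and the SKILLS vocabulary, the function names, and the min_share threshold are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical skill vocabulary; a real project would use a curated list.
SKILLS = {"python", "java", "sql", "excel", "javascript"}


def extract_skills(text, vocabulary=SKILLS):
    """Return the set of vocabulary skills mentioned in the resume text."""
    # Lowercase the text and split it into simple word-like tokens.
    tokens = set(re.findall(r"[a-z+#]+", text.lower()))
    return tokens & vocabulary


def common_skills_by_domain(resumes, min_share=0.5):
    """Map each domain to the skills found in at least min_share of its resumes.

    resumes is an iterable of (domain, text) pairs.
    """
    counts = defaultdict(Counter)  # domain -> skill -> number of resumes
    totals = Counter()             # domain -> number of resumes
    for domain, text in resumes:
        totals[domain] += 1
        counts[domain].update(extract_skills(text))
    return {
        domain: sorted(
            skill
            for skill, n in counts[domain].items()
            if n / totals[domain] >= min_share
        )
        for domain in totals
    }
```

Raising min_share towards 1.0 keeps only the skills that nearly every resume in a domain mentions, while a lower value gives a broader list.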
Setting up the environment from scratch:
conda create --name NLPResumeScanner # create a new environment for this project
source activate NLPResumeScanner # activate the environment
python -m ipykernel install --user --name NLPResumeScanner # register the environment as a Jupyter kernel (requires ipykernel)
Installing the dependencies:
pip install -U nltk # to install nltk module
python -m nltk.downloader all # to download the nltk data
pip install PyPDF2
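
Once the packages above are installed, parsing a resume might look roughly like the following. This is a minimal sketch, not the project's actual code: it assumes PyPDF2 2.x or later (which provides PdfReader) and that the NLTK data has been downloaded; the function names clean_text, pdf_to_text, and tokenize are hypothetical.

```python
import re


def clean_text(raw):
    """Collapse the runs of whitespace that PDF extraction tends to leave behind."""
    return re.sub(r"\s+", " ", raw).strip()


def pdf_to_text(path):
    """Extract and clean the text of every page in a PDF file."""
    # Imported here so clean_text stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader  # assumption: PyPDF2 2.x or later

    reader = PdfReader(path)
    # Scanned (image-only) pages may yield no text, so fall back to "".
    pages = [page.extract_text() or "" for page in reader.pages]
    return clean_text(" ".join(pages))


def tokenize(text):
    """Split resume text into lowercase alphabetic tokens using NLTK."""
    # Needs the NLTK tokenizer data installed by the downloader step.
    from nltk.tokenize import word_tokenize

    return [tok.lower() for tok in word_tokenize(text) if tok.isalpha()]
```

A call such as tokenize(pdf_to_text("resume.pdf")), where resume.pdf is a placeholder file name, would then give a token list that later classification steps can work with.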