Frontend: Azad Kumar
Backend and security: Vijeyanidhi Mayank
OCR: Nishee Sharma, Vikas Yadav, Amit Kumar Jena
Python libraries required:
- OpenCV
- Pytesseract
- Pandas
- Gensim
- Numpy
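A quick way to confirm that the libraries above are installed is to check whether each one can be imported. The sketch below is only an illustration: the helper name `missing_libraries` is hypothetical, and the import names (`cv2` for OpenCV, and so on) are the usual ones for these packages. Note that Pytesseract is a wrapper, so the Tesseract OCR engine itself must also be installed on the system.

```python
import importlib.util

# Display name -> import name. OpenCV is imported as "cv2".
REQUIRED = {
    "OpenCV": "cv2",
    "Pytesseract": "pytesseract",
    "Pandas": "pandas",
    "Gensim": "gensim",
    "Numpy": "numpy",
}


def missing_libraries(required=REQUIRED):
    """Return the display names of libraries that cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_libraries()
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are available.")
```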
Clone the CliNER repository and install it as described here: https://github.com/text-machine-lab/CliNER
CliNER is required to analyse the medical report and extract the problems, tests and treatments listed in it.
Extract the Lab Reports.zip file. It contains some sample reports as well as the required .csv files, which contain the lists of keywords that are searched for in the report using RegEx.
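The keyword search can be pictured roughly as follows. This is a minimal sketch, not the project's actual code: the CSV layout (a single `keyword` column), the sample keywords, the sample report text and the helper names `load_keywords` and `find_values` are all assumptions made for illustration. The real .csv files in Lab Reports.zip may be organised differently.

```python
import csv
import io
import re

# Hypothetical keyword file contents (assumed single "keyword" column).
SAMPLE_CSV = "keyword\nHaemoglobin\nPlatelet Count\nWBC Count\n"


def load_keywords(csv_text):
    """Read the keyword column from CSV text, skipping blank entries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["keyword"].strip() for row in reader if row["keyword"].strip()]


def find_values(text, keywords):
    """Find the first number that follows each keyword in the text."""
    results = {}
    for keyword in keywords:
        # Keyword, optional ':' or '-', then an integer or decimal value.
        pattern = re.compile(
            re.escape(keyword) + r"\s*[:\-]?\s*(\d+(?:\.\d+)?)",
            re.IGNORECASE,
        )
        match = pattern.search(text)
        if match:
            results[keyword] = float(match.group(1))
    return results


# Hypothetical OCR output from a blood report.
ocr_text = "Haemoglobin : 13.5 g/dL\nPlatelet count 250 10^3/uL\n"
values = find_values(ocr_text, load_keywords(SAMPLE_CSV))
print(values)
```

Keywords that do not occur in the text (here `WBC Count`) are simply left out of the result.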
The 'Information extraction' Python file and notebook perform the same operation. They extract the following details:
- dict_basic: Python dictionary which stores basic details such as name, date and age
- dict_blood (for blood reports): pandas DataFrame which stores the parameter values as written in the blood report
- dict_urine (for urine reports): pandas DataFrame which stores the parameter values as written in the urine report
- dict_liver (for liver reports): pandas DataFrame which stores the parameter values as written in the liver report
- dict_stool (for stool reports): pandas DataFrame which stores the parameter values as written in the stool report
- Comments_Report: comments written in the report after the keyword 'Comments' (if available)
- Summary: summary of the comments (if available)
- list_problem: list of terms/phrases in the report which CliNER categorises as 'Problem'
- list_treatment: list of terms/phrases in the report which CliNER categorises as 'Treatment'
- list_tests: list of terms/phrases in the report which CliNER categorises as 'Test'
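The three CliNER lists can be built by grouping its predictions by label. The sketch below assumes the predictions are in the i2b2 concept format (lines such as `c="..." 3:4 3:4||t="problem"`); the sample lines and the helper name `group_concepts` are hypothetical, and the project's own code may read the output differently.

```python
import re

# One i2b2-style concept line: c="text" line:token line:token||t="label"
CON_LINE = re.compile(
    r'^c="(?P<text>.*)" \d+:\d+ \d+:\d+\|\|t="(?P<label>[^"]+)"$'
)


def group_concepts(con_text):
    """Group concept phrases by their label (problem, treatment, test)."""
    groups = {"problem": [], "treatment": [], "test": []}
    for line in con_text.splitlines():
        match = CON_LINE.match(line.strip())
        if match and match.group("label") in groups:
            groups[match.group("label")].append(match.group("text"))
    return groups


# Hypothetical prediction output for one report.
sample = (
    'c="anemia" 3:4 3:4||t="problem"\n'
    'c="iron supplements" 5:2 5:3||t="treatment"\n'
    'c="complete blood count" 1:0 1:2||t="test"\n'
)

groups = group_concepts(sample)
list_problem = groups["problem"]
list_treatment = groups["treatment"]
list_tests = groups["test"]
print(list_problem, list_treatment, list_tests)
```

Lines that do not match the expected format, or that carry another label, are ignored rather than raising an error.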