BNLP 3.3.0
Bug Fix
- Removed wasabi text formatting, which caused build problems in the updated version on different operating systems and Python versions.
New Feature
Text Cleaning
We adapted several text-cleaning rules and code from clean-text and modified them for Bangla. You can now normalize and clean your text as follows.
from bnlp import CleanText

# Configure the cleaner once; it can then be called on any number of texts.
clean_text = CleanText(
    fix_unicode=True,
    unicode_norm=True,
    unicode_norm_form="NFKC",
    remove_url=False,
    remove_email=False,
    remove_emoji=False,
    remove_number=False,
    remove_digits=False,
    remove_punct=False,
    replace_with_url="<URL>",
    replace_with_email="<EMAIL>",
    replace_with_number="<NUMBER>",
    replace_with_digit="<DIGIT>",
    replace_with_punct="<PUNC>",
)

input_text = "আমার সোনার বাংলা।"
# Store the result under a new name so the cleaner itself stays reusable.
cleaned_text = clean_text(input_text)
print(cleaned_text)
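
To give a feel for what the unicode_norm_form="NFKC" setting and the replace_with_url placeholder above are meant to do, here is a minimal standard-library sketch. It is not BNLP's implementation: the function sketch_clean and the pattern URL_PATTERN are hypothetical names used only for illustration, and the assumption that remove_url=True swaps each URL for the placeholder is based on how the clean-text package behaves, not confirmed against BNLP's source.

```python
import re
import unicodedata

# Hypothetical, deliberately simple URL pattern (an assumption for this sketch;
# BNLP's own pattern is likely more thorough).
URL_PATTERN = re.compile(r"https?://\S+")


def sketch_clean(text, unicode_norm_form="NFKC",
                 remove_url=False, replace_with_url="<URL>"):
    """Illustrative stand-in for two of the CleanText options, not the real API."""
    # NFKC folds canonically and compatibility-equivalent sequences into one form,
    # so visually identical strings compare equal afterwards.
    text = unicodedata.normalize(unicode_norm_form, text)
    if remove_url:
        # Assumed behaviour: each URL is replaced by the placeholder token.
        text = URL_PATTERN.sub(replace_with_url, text)
    return text


# The Bangla vowel sign O can be typed as one code point (U+09CB) or as two
# (U+09C7 followed by U+09BE); NFKC maps both spellings to the single code point.
two_part = "\u0995\u09c7\u09be"
one_part = "\u0995\u09cb"
print(sketch_clean(two_part) == one_part)

# URL replacement only happens when the flag is switched on.
print(sketch_clean("visit https://example.com now", remove_url=True))
```

Normalizing first matters because text that looks the same on screen can otherwise fail string comparisons, dictionary lookups, and tokenizer matches.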