Built a Python pipeline to preprocess blog posts (lemmatization, coreference resolution, collocation identification, etc.) and trained an LDA topic model to flag irrelevant comments under those posts.
The problem statement, a discussion of the solution, and the pipeline design are available [here].
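
To make the idea concrete, here is a minimal, standard-library-only sketch of the approach described above. It is not the project's actual code: the stopword filter is a crude stand-in for lemmatization, coreference resolution and collocation detection are omitted, the small collapsed Gibbs sampler stands in for whatever LDA implementation the pipeline really uses, and the flagging rule (cosine similarity between the topic mixtures of a post and a comment, with a hypothetical threshold of 0.5) is an assumption. All function names are illustrative.

```python
import math
import random
from collections import Counter

# Tiny hypothetical stopword list; a real pipeline would lemmatize instead.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return [w for w in cleaned.split() if w not in STOPWORDS]


def _topic_weights(word, doc_counts, model, alpha):
    """Unnormalized P(topic | word, other assignments) for Gibbs sampling."""
    topic_word, topic_total, vocab_size, beta = model
    return [
        (doc_counts[k] + alpha) * (topic_word[k][word] + beta)
        / (topic_total[k] + vocab_size * beta)
        for k in range(len(topic_total))
    ]


def fit_lda(docs, num_topics, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA on tokenized docs with a collapsed Gibbs sampler."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    model = (topic_word, topic_total, vocab_size, beta)
    doc_counts = [[0] * num_topics for _ in docs]
    # Start from a random topic assignment for every token.
    assign = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, z in zip(doc, assign[d]):
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_counts[d][z] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, resample its topic, add it back.
                z = assign[d][i]
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_counts[d][z] -= 1
                weights = _topic_weights(w, doc_counts[d], model, alpha)
                z = rng.choices(range(num_topics), weights)[0]
                assign[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_counts[d][z] += 1
    return model


def infer_topics(tokens, model, iters=50, alpha=0.1, seed=0):
    """Estimate a topic mixture for a new text, keeping the model fixed."""
    rng = random.Random(seed)
    num_topics = len(model[1])
    assign = [rng.randrange(num_topics) for _ in tokens]
    counts = [0] * num_topics
    for z in assign:
        counts[z] += 1
    for _ in range(iters):
        for i, w in enumerate(tokens):
            counts[assign[i]] -= 1
            weights = _topic_weights(w, counts, model, alpha)
            assign[i] = rng.choices(range(num_topics), weights)[0]
            counts[assign[i]] += 1
    total = len(tokens) + num_topics * alpha
    return [(c + alpha) / total for c in counts]


def cosine(p, q):
    """Cosine similarity between two topic mixtures."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))


def is_irrelevant(post, comment, model, threshold=0.5):
    """Flag a comment whose topic mixture is far from the post's (assumed rule)."""
    post_mix = infer_topics(tokenize(post), model)
    comment_mix = infer_topics(tokenize(comment), model)
    return cosine(post_mix, comment_mix) < threshold


posts = [
    "Training a neural network with gradient descent and backpropagation",
    "Baking sourdough bread with a slow fermentation and a hot oven",
]
model = fit_lda([tokenize(p) for p in posts], num_topics=2)
comment = "Does gradient descent converge faster with momentum?"
flag = is_irrelevant(posts[0], comment, model)
```

With a corpus this small the learned topics are not meaningful, so the value of `flag` only demonstrates the data flow. On real data the number of topics and the threshold would need tuning against labeled comments.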