Pipeline for distributed Natural Language Processing, made in Python

andrebco/pypln.backend

This branch is 185 commits behind NAMD/pypln.backend:develop.

Latest commit: 4bb6640 by Álvaro Justen aka Turicas, Apr 12, 2013
PyPLN

PyPLN is a distributed pipeline for natural language processing, made in Python. We use NLTK and ZeroMQ as our foundations. The goal of the project is to create an easy way to use NLTK for processing big corpora, with a Web interface.

We don't have a production release yet, but one is scheduled for our next milestone.

PyPLN is sponsored by Fundação Getulio Vargas.

License

PyPLN is free software, released under the GPLv3 (https://gnu.org/licenses/gpl-3.0.html).

Documentation

Our documentation is hosted on GitHub Pages.

Requirements

You will need some Python packages, libmagic, and poppler-utils.

To install dependencies (on a Debian-like GNU/Linux distribution):

sudo apt-get install python-setuptools libmagic-dev poppler-utils
pip install virtualenv virtualenvwrapper
mkvirtualenv pypln.backend
pip install -r requirements/production.txt
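
After running the commands above, a quick check like the following may help confirm that the system-level dependencies are visible from Python. This is only a sketch: the function name `check_system_dependencies` is hypothetical, it assumes Python 3.3+ (for `shutil.which`), and it assumes that finding the `pdftotext` tool is a reasonable proxy for poppler-utils being installed. The README does not say which poppler tools PyPLN actually calls.

```python
import ctypes.util
import shutil


def check_system_dependencies():
    """Return a dict mapping each system dependency to True if it was found."""
    return {
        # Look up the libmagic shared library by its short name.
        "libmagic": ctypes.util.find_library("magic") is not None,
        # pdftotext is one of the command-line tools shipped in poppler-utils;
        # using it as the probe is an assumption, not something the README states.
        "pdftotext": shutil.which("pdftotext") is not None,
    }


if __name__ == "__main__":
    for name, found in check_system_dependencies().items():
        print("{}: {}".format(name, "found" if found else "MISSING"))
```

If either entry is reported as missing, re-running the `apt-get` line is the first thing to try.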

You will also need to install NLTK data. You can do so by following the NLTK documentation.
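
As a rough illustration of what installing NLTK data can look like from Python, the sketch below wraps `nltk.download`. The helper name `download_nltk_data` and the package list in `ASSUMED_PACKAGES` are assumptions made for this example; the README does not say which NLTK data packages PyPLN requires, so adjust the list to whatever the NLTK documentation and the project recommend.

```python
import importlib.util

# Hypothetical selection: the README does not list the data packages PyPLN needs.
ASSUMED_PACKAGES = ("punkt", "stopwords")


def download_nltk_data(packages=ASSUMED_PACKAGES, download_dir=None):
    """Download each NLTK data package and return {package_name: succeeded}."""
    if importlib.util.find_spec("nltk") is None:
        raise RuntimeError("NLTK is not installed in this environment")
    import nltk  # imported lazily so the check above can give a clear error

    results = {}
    for name in packages:
        # nltk.download returns True on success and False on failure.
        results[name] = bool(
            nltk.download(name, download_dir=download_dir, quiet=True)
        )
    return results
```

Calling `download_nltk_data()` inside the `pypln.backend` virtualenv would fetch the assumed packages into NLTK's default data directory; passing `download_dir` stores them elsewhere.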

Developing

To run tests:

workon pypln.backend
pip install -r requirements/development.txt
make test

See our code guidelines.
