This repository was archived by the owner on May 4, 2021. It is now read-only.
INSTALL.md (+23 -11)
@@ -1,8 +1,20 @@
-Start with fresh Ubuntu Server 14.03.03 LTS
+# Hardware requirements
 
-Check the raw version of this file for easy copy/paste
+* CPU
+  * Phase 1: 32 Core recommended (language identification in [Phase 1](metadata/metadata.md) takes about 100 days on a 4 Core CPU)
+  * Phase 2: 4 Core
+* RAM
+  * 32 GB RAM (see issues #8 and #18)
+* Drive space
+  * Phase 1: 3-4 TB per Common Crawl
+  * Phase 2: 300 GB per language direction
 
-# Install packages
+# Operating system requirements
+The system was tested with Ubuntu 14.04 LTS, but other Debian-based Linux distributions should work as well.
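
The hardware requirements listed in the hunk above could be checked before starting a run with a small script along these lines. This is only a sketch and not part of the repository: the names `REQUIREMENTS`, `check_requirements`, and `probe` are hypothetical, it assumes Python 3 on Linux (the project itself targets Python 2.7), it takes 3 TB as the lower bound of the "3-4 TB" figure, and it assumes the 32 GB RAM figure applies to both phases.

```python
import os
import shutil

# Thresholds copied from the requirements list above.
# Assumptions: 3000 GB is the low end of "3-4 TB"; 32 GB RAM applies to both phases.
REQUIREMENTS = {
    1: {"cores": 32, "ram_gb": 32, "disk_gb": 3000},
    2: {"cores": 4, "ram_gb": 32, "disk_gb": 300},
}


def check_requirements(phase, cores, ram_gb, disk_gb):
    """Return a list of shortfalls for the given phase (empty if all are met)."""
    need = REQUIREMENTS[phase]
    have = {"cores": cores, "ram_gb": ram_gb, "disk_gb": disk_gb}
    return [
        "%s: have %g, need %g" % (key, have[key], need[key])
        for key in ("cores", "ram_gb", "disk_gb")
        if have[key] < need[key]
    ]


def probe(path="/"):
    """Measure this machine: (cores, RAM in GB, free disk in GB under path).

    Linux only: physical RAM is read through os.sysconf.
    """
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_bytes = shutil.disk_usage(path).free
    return os.cpu_count(), ram_bytes / 1e9, free_bytes / 1e9
```

Calling `check_requirements(1, *probe("/data"))` would then list what the machine lacks for Phase 1, where `/data` stands in for whichever mount point holds the crawl. Sizes are compared in decimal gigabytes, so a machine sold as "32 GB" is not rejected because the kernel reserves part of its memory.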
When encountering issues with installing NLTK data, you might have to hand-edit `DEFAULT_URL` in `/usr/lib/python2.7/dist-packages/nltk/downloader.py` from `http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml` to `http://www.nltk.org/nltk_data/`.
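
The hand-edit described above could also be scripted. The following is a minimal sketch, not something the repository ships: `patch_default_url` and `patch_file` are hypothetical helper names, the path and both URLs are taken verbatim from the sentence above, and writing to that path normally requires root.

```python
# Values copied from the note above.
OLD_URL = "http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml"
NEW_URL = "http://www.nltk.org/nltk_data/"
DOWNLOADER = "/usr/lib/python2.7/dist-packages/nltk/downloader.py"


def patch_default_url(source):
    """Return the source text with the old index URL replaced by the new one."""
    return source.replace(OLD_URL, NEW_URL)


def patch_file(path=DOWNLOADER):
    """Patch the file in place, keeping a '.orig' backup; return True if it changed."""
    with open(path) as handle:
        original = handle.read()
    patched = patch_default_url(original)
    if patched == original:
        return False  # already patched, or this NLTK version uses a different URL
    with open(path + ".orig", "w") as handle:
        handle.write(original)
    with open(path, "w") as handle:
        handle.write(patched)
    return True
```

Running `patch_file()` once with sufficient privileges should have the same effect as the manual edit. Because only the exact old URL is replaced, running it a second time leaves the file untouched.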