This repository was archived by the owner on May 4, 2021. It is now read-only.

Commit 4be52d0

Adding hardware and OS requirements
1 parent 23fb44e commit 4be52d0

1 file changed: +23 -11 lines changed

INSTALL.md

+23-11
@@ -1,8 +1,20 @@
-Start with fresh Ubuntu Server 14.03.03 LTS
+# Hardware requirements
 
-Check the raw version of this file for easy copy/paste
+* CPU
+  * Phase 1: 32 Core recommended (language identification in [Phase 1](metadata/metadata.md) takes about 100 days on such a 4 Core CPU, 32 Core CPU recommended)
+  * Phase 2: 4 Core
+* RAM
+  * 32 GB RAM (see issues #8 and #18)
+* Drive space
+  * Phase 1: 3-4 TB per Common Crawl
+  * Phase 2: 300 GB per language direction
 
-# Install packages
+# Operating system requirements
+The system was tested with Ubuntu 14.04 LTS, but other Debian-based Linux distributions should work as well.
+
+# Software installation
+
+## Install packages
 ```
 sudo apt-get update
 sudo apt-get install build-essential git-core pkg-config
@@ -18,30 +30,30 @@ sudo apt-get install libgflags-dev libsnappy-dev libbz2-dev liblzma-dev zlib1g-d
 sudo apt-get install pigz
 ```
 
-# Make a directory for code
+## Make a directory for code
 ```
 cd
 mkdir -p net/build
 ```
 
-# Clone project from github (add ssh key before)
+## Clone project from github (add ssh key before)
 ```
 cd net/build/
 git clone git@github.com:ModernMT/DataCollection.git
 ```
 
-# Make new virtualenv
+## Make new virtualenv (optional)
 ```
 cd net/build/
 virtualenv crawl
 ```
 
-# Activate virtualenv
+## Activate virtualenv (optional)
 ```
 source ~/net/build/crawl/bin/activate
 ```
 
-# Install requirements
+## Install requirements
 ```
 sudo apt-get install libffi-dev
 sudo apt-get install libssl-dev
@@ -53,7 +65,7 @@ pip install -r requirements.txt
 ```
 When encountering issues with installing NLTK data, you might have to hand-edit `DEFAULT_URL` in `/usr/lib/python2.7/dist-packages/nltk/downloader.py` from `http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml` to `http://www.nltk.org/nltk_data/`.
 
-# Install moses
+## Install Moses
 ```
 sudo apt-get install build-essential git-core pkg-config automake libtool
 cd /home/build
@@ -62,9 +74,9 @@ cd moses
 make -f contrib/Makefiles/install-dependencies.gmake
 ```
 
-# Install Bitextor
+## Install Bitextor
 
-Like described at http://sourceforge.net/p/bitextor/wiki/Home/ (used 4.1.0-rc4 for baseline test, but newer versions should work)
+Like described at http://sourceforge.net/p/bitextor/wiki/Home/ (tested baseline with 4.1.0-rc4, but newer versions should work)
 
 Potentially needed option when configuring: `./configure --without-apertium`
 
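
Note on the hardware requirements added in this commit: the figures can be checked on a machine before installation starts. The following is a minimal sketch and not part of the repository. The `check_min` helper and the `DATA_DIR` variable are hypothetical names, and the thresholds are the Phase 2 values from the diff (4 cores, 32 GB RAM, 300 GB per language direction). It assumes a Linux system that provides `nproc`, `df` and `/proc/meminfo`.

```shell
#!/bin/sh
# Hypothetical preflight check; thresholds copied from the Phase 2
# figures in INSTALL.md. Raise them for Phase 1 (32 cores, 3-4 TB).
MIN_CORES=4
MIN_RAM_GB=32
MIN_DISK_GB=300

# check_min LABEL OBSERVED REQUIRED
# Prints an OK or WARN line; never aborts the script.
check_min() {
    if [ "$2" -ge "$3" ]; then
        echo "OK: $1 $2 >= $3"
    else
        echo "WARN: $1 $2 < $3"
    fi
}

# Number of processing units available to this process.
cores=$(nproc)

# MemTotal is reported in kB. Round up to whole GB, because the kernel
# reserves some memory and a 32 GB machine reports slightly less.
ram_gb=$(awk '/^MemTotal:/ { gb = $2 / 1048576; g = int(gb); if (gb > g) g++; print g }' /proc/meminfo)

# Free space (in GB, rounded down) on the filesystem that will hold the data.
# DATA_DIR is an assumed variable; it defaults to the current directory.
disk_gb=$(df -Pk "${DATA_DIR:-.}" | awk 'NR == 2 { print int($4 / 1048576) }')

check_min "CPU cores" "$cores" "$MIN_CORES"
check_min "RAM (GB)" "$ram_gb" "$MIN_RAM_GB"
check_min "free disk (GB)" "$disk_gb" "$MIN_DISK_GB"
```

The script only warns and does not fail, since the INSTALL.md figures are recommendations and smaller machines may still work, just more slowly.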