Commit

update
kermitt2 committed Nov 28, 2023
1 parent 7aceb0d commit 06b0df2
Showing 1 changed file with 38 additions and 2 deletions.
40 changes: 38 additions & 2 deletions doc/Install-DeLFT.md
@@ -31,7 +31,7 @@ To ensure the availability of GPU devices for the right version of tensorflow, C

## Loading resources locally

- Required resources to train models (static embeddings, pre-trained transformer models) will be downloaded automatically. However, if you wish to load these resources locally, you need to notify their local path in the resource registry file.
+ Required resources to train models (static embeddings, pre-trained transformer models) will be downloaded automatically, in particular via Hugging Face Hub using the model name identifier. However, if you wish to load these resources locally, you need to notify their local path in the resource registry file.

Edit the file `delft/resources-registry.json` and modify the value for `path` according to the path where you have saved the corresponding embeddings. The embedding files must be unzipped. For instance, for loading glove-840B embeddings from a local path:

@@ -47,8 +47,44 @@ Edit the file `delft/resources-registry.json` and modify the value for `path` ac
"item": "word"
},
...
- ]
+ ],
...
}

```

For pre-trained transformer models (for example, downloaded from Hugging Face), you can simply indicate the path to the model directory, as follows:


```json
{
    "transformers": [
        {
            "name": "scilons/scilons-bert-v0.1",
            "model_dir": "/media/lopez/T52/models/scilons/scilons-bert-v0.1/",
            "lang": "en"
        },
        ...
    ],
    ...
}
```
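
As an illustration of how such an entry could be consumed, here is a minimal Python sketch. It is not DeLFT's actual loading code: the function name `resolve_transformer` and the fallback behaviour (returning the model name so it can be fetched from Hugging Face Hub) are assumptions made for this example. Only the `transformers`, `name` and `model_dir` keys come from the registry shown above.

```python
import json
from pathlib import Path


def resolve_transformer(registry_path, name):
    """Hypothetical helper: return the local `model_dir` declared for `name`
    if it exists on disk, otherwise return `name` unchanged so that it can be
    used as a Hugging Face Hub identifier."""
    with open(registry_path, encoding="utf-8") as f:
        registry = json.load(f)

    for entry in registry.get("transformers", []):
        if entry.get("name") == name:
            model_dir = entry.get("model_dir")
            # Only trust the local path if the directory is really there.
            if model_dir and Path(model_dir).is_dir():
                return model_dir
    # No usable local entry: fall back to the model name identifier.
    return name
```

Note that the sketch expects a complete, valid JSON file; the `...` lines in the excerpt above are elisions and would not parse.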

For older transformer formats with only config, vocab, and checkpoint weights files, you can indicate the resources like this:

```json
{
    "transformers": [
        {
            "name": "dmis-lab/biobert-base-cased-v1.2",
            "path-config": "/media/lopez/T5/embeddings/biobert_v1.2_pubmed/bert_config.json",
            "path-weights": "/media/lopez/T5/embeddings/biobert_v1.2_pubmed/model.ckpt-1000000",
            "path-vocab": "/media/lopez/T5/embeddings/biobert_v1.2_pubmed/vocab.txt",
            "lang": "en"
        },
        ...
    ],
    ...
}
```
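
Before training, it can be useful to check that the three paths of such an entry actually resolve. The following Python sketch is one possible way to do so; the helper `missing_legacy_resources` is hypothetical and not part of DeLFT. It assumes that `path-weights` may be a TensorFlow checkpoint prefix (as in `model.ckpt-1000000` above), in which case the files on disk are typically named `<prefix>.index`, `<prefix>.data-...`, and so on, rather than the prefix itself.

```python
import glob
import os

# Keys used by the older-format entries in the registry excerpt above.
LEGACY_KEYS = ("path-config", "path-weights", "path-vocab")


def missing_legacy_resources(entry):
    """Hypothetical helper: return the legacy keys of `entry` whose path is
    absent from the entry or cannot be found on disk."""
    missing = []
    for key in LEGACY_KEYS:
        path = entry.get(key)
        if not path:
            missing.append(key)
        elif key == "path-weights":
            # Assumption: the value may be a checkpoint prefix shared by
            # several files, so accept either an exact match or <prefix>.*
            if not (os.path.exists(path) or glob.glob(glob.escape(path) + ".*")):
                missing.append(key)
        elif not os.path.isfile(path):
            missing.append(key)
    return missing
```

An empty list means every declared resource was found; otherwise the returned keys indicate which paths to correct in the registry.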
