README.md (+130 -15)
@@ -5,15 +5,16 @@
5
5
6
6

7
7
8
-
**paperless-gpt** is a tool designed to generate accurate and meaningful document titles for [paperless-ngx](https://github.com/paperless-ngx/paperless-ngx) using Large Language Models (LLMs). It supports multiple LLM providers, including **OpenAI** and **Ollama**. With paperless-gpt, you can streamline your document management by automatically suggesting appropriate titles and tags based on the content of your scanned documents.
8
+
**paperless-gpt** is a tool designed to generate accurate and meaningful document titles and tags for [paperless-ngx](https://github.com/paperless-ngx/paperless-ngx) using Large Language Models (LLMs). It supports multiple LLM providers, including **OpenAI** and **Ollama**. With paperless-gpt, you can streamline your document management by automatically suggesting appropriate titles and tags based on the content of your scanned documents.
9
9
10
10
[demo.gif](./demo.gif)
11
11
12
12
## Features
13
13
14
-
- **Multiple LLM Support**: Choose between OpenAI and Ollama for generating document titles.
14
+
- **Multiple LLM Support**: Choose between OpenAI and Ollama for generating document titles and tags.
15
+
- **Customizable Prompts**: Modify the prompt templates to suit your specific needs.
15
16
- **Easy Integration**: Works seamlessly with your existing paperless-ngx setup.
16
-
- **User-Friendly Interface**: Intuitive web interface for reviewing and applying suggested titles.
17
+
- **User-Friendly Interface**: Intuitive web interface for reviewing and applying suggested titles and tags.
17
18
- **Dockerized Deployment**: Simple setup using Docker and Docker Compose.
|`PAPERLESS_BASE_URL`| The base URL of your paperless-ngx instance (e.g., `http://paperless-ngx:8000`). | Yes |
114
-
|`PAPERLESS_API_TOKEN`| API token for accessing paperless-ngx. You can generate one in the paperless-ngx admin interface. | Yes |
115
-
|`LLM_PROVIDER`| The LLM provider to use (`openai` or `ollama`). | Yes |
116
-
|`LLM_MODEL`| The model name to use (e.g., `gpt-4`, `gpt-3.5-turbo`, `llama2`). | Yes |
117
-
|`OPENAI_API_KEY`| Your OpenAI API key. Required if using OpenAI as the LLM provider. | Cond. |
118
-
|`LLM_LANGUAGE`| The likely language of your documents (e.g., `English`, `German`). Default is `English`. | No |
119
-
|`OLLAMA_HOST`| The URL of the Ollama server (e.g., `http://host.docker.internal:11434`). Useful if using Ollama. Default is `http://127.0.0.1:11434`| No |
|`PAPERLESS_BASE_URL`| The base URL of your paperless-ngx instance (e.g., `http://paperless-ngx:8000`). | Yes |
129
+
|`PAPERLESS_API_TOKEN`| API token for accessing paperless-ngx. You can generate one in the paperless-ngx admin interface. | Yes |
130
+
|`LLM_PROVIDER`| The LLM provider to use (`openai` or `ollama`). | Yes |
131
+
|`LLM_MODEL`| The model name to use (e.g., `gpt-4o`, `gpt-3.5-turbo`, `llama2`). | Yes |
132
+
|`OPENAI_API_KEY`| Your OpenAI API key. Required if using OpenAI as the LLM provider. | Cond. |
133
+
|`LLM_LANGUAGE`| The likely language of your documents (e.g., `English`, `German`). Default is `English`. | No |
134
+
|`OLLAMA_HOST`| The URL of the Ollama server (e.g., `http://host.docker.internal:11434`). Useful if using Ollama. Default is `http://127.0.0.1:11434`. | No |
120
135
121
136
**Note:** When using Ollama, ensure that the Ollama server is running and accessible from the paperless-gpt container.
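
The rules in the table (required variables, the conditional `OPENAI_API_KEY`, and the two documented defaults) can be sketched in Go as follows. This is only an illustration under stated assumptions: the `Config` struct, `loadConfig`, and `orDefault` are hypothetical names, not paperless-gpt's actual code, and the sample values in `main` are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// Config mirrors the environment variables in the table above.
// The struct and its field names are hypothetical.
type Config struct {
	PaperlessBaseURL  string
	PaperlessAPIToken string
	LLMProvider       string
	LLMModel          string
	OpenAIAPIKey      string
	LLMLanguage       string
	OllamaHost        string
}

// orDefault returns fallback when value is empty.
func orDefault(value, fallback string) string {
	if value == "" {
		return fallback
	}
	return value
}

// loadConfig reads settings through getenv (os.Getenv in a real program),
// applies the documented defaults, and checks the required variables.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		PaperlessBaseURL:  getenv("PAPERLESS_BASE_URL"),
		PaperlessAPIToken: getenv("PAPERLESS_API_TOKEN"),
		LLMProvider:       getenv("LLM_PROVIDER"),
		LLMModel:          getenv("LLM_MODEL"),
		OpenAIAPIKey:      getenv("OPENAI_API_KEY"),
		LLMLanguage:       orDefault(getenv("LLM_LANGUAGE"), "English"),
		OllamaHost:        orDefault(getenv("OLLAMA_HOST"), "http://127.0.0.1:11434"),
	}
	required := []struct{ name, value string }{
		{"PAPERLESS_BASE_URL", cfg.PaperlessBaseURL},
		{"PAPERLESS_API_TOKEN", cfg.PaperlessAPIToken},
		{"LLM_PROVIDER", cfg.LLMProvider},
		{"LLM_MODEL", cfg.LLMModel},
	}
	for _, r := range required {
		if r.value == "" {
			return Config{}, fmt.Errorf("%s is required", r.name)
		}
	}
	switch cfg.LLMProvider {
	case "openai":
		// "Cond." in the table: the key is only required for OpenAI.
		if cfg.OpenAIAPIKey == "" {
			return Config{}, errors.New("OPENAI_API_KEY is required when LLM_PROVIDER is openai")
		}
	case "ollama":
		// OLLAMA_HOST already carries its default; nothing else is required.
	default:
		return Config{}, fmt.Errorf("unsupported LLM_PROVIDER %q", cfg.LLMProvider)
	}
	return cfg, nil
}

func main() {
	// Sample values stand in for the real environment.
	env := map[string]string{
		"PAPERLESS_BASE_URL":  "http://paperless-ngx:8000",
		"PAPERLESS_API_TOKEN": "example-token",
		"LLM_PROVIDER":        "ollama",
		"LLM_MODEL":           "llama2",
	}
	cfg, err := loadConfig(func(key string) string { return env[key] })
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Printf("provider=%s model=%s language=%s ollama=%s\n",
		cfg.LLMProvider, cfg.LLMModel, cfg.LLMLanguage, cfg.OllamaHost)
}
```

Passing the lookup function as a parameter keeps the validation independent of the process environment, which makes the rules easy to check in isolation.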
122
137
138
+
### Custom Prompt Templates
139
+
140
+
You can customize the prompt templates used by paperless-gpt to generate titles and tags. By default, the application uses built-in templates, but you can modify them by editing the template files.
141
+
142
+
#### Prompt Templates Directory
143
+
144
+
The prompt templates are stored in the `prompts` directory inside the application. The two main template files are:
145
+
146
+
- `title_prompt.tmpl`: Template used for generating document titles.
147
+
- `tag_prompt.tmpl`: Template used for generating document tags.
148
+
149
+
#### Mounting the Prompts Directory
150
+
151
+
To modify the prompt templates, mount a local `prompts` directory into the container at `/app/prompts`.
152
+
153
+
**Docker Compose Example:**
154
+
155
+
```yaml
services:
  paperless-gpt:
    image: icereed/paperless-gpt:latest
    # ... (other configurations)
    volumes:
      - ./prompts:/app/prompts # Mount the prompts directory
```
163
+
164
+
**Docker Run Command Example:**
165
+
166
+
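```bash
# Add your other options (environment variables and so on) next to -v.
# A comment line cannot sit between continued lines: it would end the command early.
docker run -d \
  -v "$(pwd)/prompts:/app/prompts" \
  icereed/paperless-gpt:latest  # same image as in the Compose example
```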
172
+
173
+
#### Editing the Prompt Templates
174
+
175
+
1. **Start the Container:**
176
+
177
+
When you first start the container with the `prompts` directory mounted, it will automatically create the default template files in your local `prompts` directory if they do not exist.
178
+
179
+
2. **Edit the Template Files:**
180
+
181
+
- Open `prompts/title_prompt.tmpl` and `prompts/tag_prompt.tmpl` with your favorite text editor.
182
+
- Modify the templates using Go's `text/template` syntax.
183
+
- Save the changes.
184
+
185
+
3. **Restart the Container (if necessary):**
186
+
187
+
The application loads the templates when it starts. If the container is already running, restart it to apply your changes.
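
The behaviour described in the steps above (write a default template only when the file is missing, then read the templates once at startup) can be sketched in Go. This is a minimal sketch, not the application's actual implementation: `ensureTemplate` and `loadTemplate` are hypothetical helper names, and the default text is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// ensureTemplate writes defaultText to dir/name only when the file is missing,
// so edits made on the host survive a restart. It reports whether it created the file.
func ensureTemplate(dir, name, defaultText string) (bool, error) {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return false, err
	}
	path := filepath.Join(dir, name)
	_, err := os.Stat(path)
	if err == nil {
		return false, nil // file already exists: keep the user's version
	}
	if !errors.Is(err, fs.ErrNotExist) {
		return false, err
	}
	if err := os.WriteFile(path, []byte(defaultText), 0o644); err != nil {
		return false, err
	}
	return true, nil
}

// loadTemplate reads the current template text, as a program might do once at startup.
func loadTemplate(dir, name string) (string, error) {
	data, err := os.ReadFile(filepath.Join(dir, name))
	return string(data), err
}

func main() {
	// A temporary directory stands in for the mounted prompts directory.
	dir, err := os.MkdirTemp("", "prompts")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	created, err := ensureTemplate(dir, "title_prompt.tmpl", "Default title prompt")
	fmt.Println(created, err) // first start: the default file is created
	created, err = ensureTemplate(dir, "title_prompt.tmpl", "Default title prompt")
	fmt.Println(created, err) // later starts: the existing file is left alone

	text, err := loadTemplate(dir, "title_prompt.tmpl")
	fmt.Println(text, err)
}
```

Because the file is read once, a change made while the program is running is not picked up until the next start, which is why a restart is needed after editing.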
188
+
189
+
#### Template Syntax and Variables
190
+
191
+
The templates use Go's `text/template` syntax and have access to the following variables:
192
+
193
+
- **For `title_prompt.tmpl`:**
194
+
195
+
  - `{{.Language}}`: The language specified in `LLM_LANGUAGE` (default is `English`).
196
+
  - `{{.Content}}`: The content of the document.
197
+
198
+
- **For `tag_prompt.tmpl`:**
199
+
200
+
  - `{{.Language}}`: The language specified in `LLM_LANGUAGE`.
201
+
  - `{{.AvailableTags}}`: A list (array) of available tags from paperless-ngx.
202
+
  - `{{.Title}}`: The suggested title for the document.
203
+
  - `{{.Content}}`: The content of the document.
204
+
205
+
**Example `title_prompt.tmpl`:**
206
+
207
+
```text
I will provide you with the content of a document that has been partially read by OCR (so it may contain errors).
Your task is to find a suitable document title that I can use as the title in the paperless-ngx program.
Respond only with the title, without any additional information. The content is likely in {{.Language}}.

Be sure to add one fitting emoji at the beginning of the title to make it more visually appealing.

Content:
{{.Content}}
```
217
+
218
+
**Example `tag_prompt.tmpl`:**
219
+
220
+
```text
I will provide you with the content and the title of a document. Your task is to select appropriate tags for the document from the list of available tags I will provide. Only select tags from the provided list. Respond only with the selected tags as a comma-separated list, without any additional information. The content is likely in {{.Language}}.

Available Tags:
{{.AvailableTags | join ","}}

Title:
{{.Title}}

Content:
{{.Content}}

Please concisely select the {{.Language}} tags from the list above that best describe the document.
Be very selective and only choose the most relevant tags since too many tags will make the document less discoverable.
```
235
+
236
+
**Note:** Advanced users can utilize additional functions from the [Sprig](http://masterminds.github.io/sprig/) template library, as it is included in the application.
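
To see how the variables and the `join` pipeline are expanded, the following Go sketch renders a shortened tag prompt with the standard `text/template` package. It rests on several assumptions: `promptData` and `renderPrompt` are hypothetical names, the sample values are made up, and a small local `join` function stands in for Sprig's so the example needs no third-party dependency. It takes the separator first, which is why `{{.AvailableTags | join ","}}` works: the piped value becomes the last argument.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// promptData carries the variables documented for tag_prompt.tmpl.
// The struct name is hypothetical.
type promptData struct {
	Language      string
	AvailableTags []string
	Title         string
	Content       string
}

// renderPrompt parses the template text and executes it against data.
func renderPrompt(name, text string, data promptData) (string, error) {
	funcs := template.FuncMap{
		// Local stand-in for Sprig's join: separator first, list last.
		"join": func(sep string, items []string) string {
			return strings.Join(items, sep)
		},
	}
	tmpl, err := template.New(name).Funcs(funcs).Parse(text)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// A shortened version of the tag prompt shown above.
	const tagPrompt = `Available Tags:
{{.AvailableTags | join ","}}

Title:
{{.Title}}

Content:
{{.Content}}

The content is likely in {{.Language}}.
`
	data := promptData{
		Language:      "English",
		AvailableTags: []string{"invoice", "insurance", "tax"},
		Title:         "Electricity Invoice March",
		Content:       "Sample OCR text of the document.",
	}
	prompt, err := renderPrompt("tag_prompt.tmpl", tagPrompt, data)
	if err != nil {
		fmt.Println("template error:", err)
		return
	}
	fmt.Print(prompt)
}
```

A template with a syntax error fails at the parse step, so running a quick render like this against an edited template is one way to catch mistakes before restarting the container.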