-
Notifications
You must be signed in to change notification settings - Fork 49
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
3 changed files
with
44 additions
and
0 deletions.
There are no files selected for viewing
5 changes: 5 additions & 0 deletions
5
...iption/ai_application_security/llm_security/training_data_poisoning/guidance.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Guidance | ||
|
||
Provide a step-by-step walkthrough with a screenshot on how you exploited the vulnerability. This will speed triage time and result in faster rewards. Please include specific details on where you identified the vulnerability, how you identified it, and what actions you were able to perform as a result. | ||
|
||
Attempt to escalate the vulnerability to perform additional actions. If this is possible, provide a full Proof of Concept (PoC). |
13 changes: 13 additions & 0 deletions
13
...ai_application_security/llm_security/training_data_poisoning/recommendations.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# Recommendation(s) | ||
|
||
There is no single technique to prevent excessive agency or permission manipulation from occurring. However, implementing the following defensive measures within the LLM application can prevent and limit the impact of the vulnerability: | ||
|
||
- Verify the training data supply chain, its content, as well as its sources. | ||
- Ensure the legitimacy of the data throughout all stages of training. | ||
- Strictly vet the data inputs and include filtering and sanitization. | ||
- Use testing and detection mechanisms to monitor the model's outputs and detect any data poisoning attempts. | ||
|
||
For more information, refer to the following resources: | ||
|
||
- <https://owasp.org/www-project-top-10-for-large-language-model-applications/> | ||
- <https://stanford-cs324.github.io/winter2022/lectures/data/> |
26 changes: 26 additions & 0 deletions
26
...iption/ai_application_security/llm_security/training_data_poisoning/template.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# Training Data Poisoning | ||
|
||
## Overview of the Vulnerability | ||
|
||
Training data poisoning occurs when an attacker manipulates the training data to intentionally compromise the output of the Large Language Model (LLM). This can be achieved by manipulating the pre-training data, fine-tuning data process, or the embedding process. Data poisoning can result in an attacker affecting the integrity of the LLM by causing unreliable, biased, or unethical outputs from the model. | ||
|
||
## Business Impact | ||
|
||
This vulnerability can lead to reputational and financial damage of the company due an attacker compromising the decision-making of the LLM, which would also impact customers' trust. The severity of the impact to the business is dependent on the sensitivity of the accessible data being transmitted by the application. | ||
|
||
## Steps to Reproduce | ||
|
||
1. Navigate to the following URL: | ||
1. Enter the following prompt into the LLM: | ||
|
||
```prompt | ||
{prompt} | ||
``` | ||
|
||
1. Observe that the output from the LLM returns a compromised result | ||
|
||
## Proof of Concept (PoC) | ||
|
||
The screenshot(s) below demonstrate(s) the vulnerability: | ||
> | ||
> {{screenshot}} |