Commit

updated docs
penguine-ip committed Jan 24, 2025
1 parent d2ffaf9 commit 99ee4e9
Showing 2 changed files with 36 additions and 49 deletions.
4 changes: 2 additions & 2 deletions docs/docs/red-teaming-introduction.mdx
@@ -96,7 +96,6 @@ Once you've set up your `RedTeamer`, and defined your target model and list of v
 
 ```python
 from deepeval.red_teaming import AttackEnhancement
-
 ...
 
 results = red_teamer.scan(
@@ -113,12 +112,13 @@ results = red_teamer.scan(
 print("Red Teaming Results: ", results)
 ```
 
-There are 3 required parameters and 1 optional parameter when calling the scan method:
+There are 3 required parameters and 2 optional parameter when calling the scan method:
 
 - `vulnerabilities`: A list of `Vulnerability` objects specifying the vulnerabilities to be tested.
 - `attacks_per_vulnerability_type`: An integer specifying the number of attacks to be generated per vulnerability type.
 - `target_model_callback`: A callback function representing the model you wish to red-team. The function should accept a `prompt: str` and return a `str` response.
 - [Optional] `attack_enhancements`: A dict of `AttackEnhancement` enum keys specifying the distribution of attack enhancements to be used. Defaulted to uniform distribution of all available `AttackEnhancements`.
+- [Optional] `ignore_errors`: a boolean which when set to `True`, ignores all exceptions raised during red-teaming. Defaulted to `False`.
 
 :::tip
 You can check out the full list of [**10+ attack enhancements** here](/docs/red-teaming-attack-enhancements).
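Note on the `ignore_errors` parameter documented in this file: the behaviour it describes can be illustrated without the library. The sketch below is not deepeval's implementation. `run_attacks` and `flaky_model_callback` are hypothetical names introduced only for illustration, and recording `None` for a failed attack is an assumption. The only details taken from the docs are the callback contract (accept a `prompt: str`, return a `str`) and the description of `ignore_errors` (exceptions are ignored when it is `True`, with a default of `False`).

```python
from typing import Callable, List, Optional


def run_attacks(
    prompts: List[str],
    target_model_callback: Callable[[str], str],
    ignore_errors: bool = False,
) -> List[Optional[str]]:
    """Hypothetical stand-in for a scan loop; this is not deepeval code."""
    responses: List[Optional[str]] = []
    for prompt in prompts:
        try:
            responses.append(target_model_callback(prompt))
        except Exception:
            if not ignore_errors:
                raise  # default: the first failure aborts the run
            responses.append(None)  # assumption: mark the failure and continue
    return responses


def flaky_model_callback(prompt: str) -> str:
    """Hypothetical target model that fails on one specific prompt."""
    if "crash" in prompt:
        raise RuntimeError("model backend unavailable")
    return f"response to: {prompt}"


results = run_attacks(
    ["hello", "please crash", "bye"],
    flaky_model_callback,
    ignore_errors=True,
)
print(results)
```

With `ignore_errors=True` the loop finishes even though one callback invocation raises; with the default `False`, the same input would propagate the `RuntimeError` to the caller.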
81 changes: 34 additions & 47 deletions g.py
@@ -1,53 +1,40 @@
 from deepeval import evaluate
-from deepeval.test_case import LLMTestCase
-from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric
+from deepeval.metrics import TaskCompletionMetric
+from deepeval.test_case import LLMTestCase, ToolCall
 
-# See above for contents of fake data
-fake_data = [
-    {
-        "input": "I have a persistent cough and fever. Should I be worried?",
-        "actual_output": (
-            "Based on your symptoms, it could be a sign of a viral or bacterial infection. "
-            "However, if the fever persists for more than three days or you experience difficulty breathing, "
-            "please consult a doctor immediately."
+metric = TaskCompletionMetric(
+    threshold=0.7, model="gpt-4o", include_reason=True
+)
+test_case = LLMTestCase(
+    input="Plan a 3-day itinerary for Paris with cultural landmarks and local cuisine.",
+    actual_output=(
+        "Day 1: Eiffel Tower, dinner at Le Jules Verne. "
+        "Day 2: Louvre Museum, lunch at Angelina Paris. "
+        "Day 3: Montmartre, evening at a wine bar."
     ),
+    tools_called=[
+        ToolCall(
+            name="Itinerary Generator",
+            description="Creates travel plans based on destination and duration.",
+            input_parameters={"destination": "Paris", "days": 3},
+            output=[
+                "Day 1: Eiffel Tower, Le Jules Verne.",
+                "Day 2: Louvre Museum, Angelina Paris.",
+                "Day 3: Montmartre, wine bar.",
+            ],
+        ),
-        "retrieval_context": [
-            "Coughing that lasts more than three weeks is typically classified as a chronic cough and could indicate conditions such as asthma, chronic bronchitis, or gastroesophageal reflux disease (GERD).",
-            "A fever is the body's natural response to infections, often caused by viruses or bacteria. Persistent fevers lasting more than three days should be evaluated by a healthcare professional as they may indicate conditions like pneumonia, tuberculosis, or sepsis.",
-            "Shortness of breath associated with fever and cough can be a sign of serious respiratory issues such as pneumonia, bronchitis, or COVID-19.",
-            "Self-care tips for mild symptoms include staying hydrated, taking over-the-counter fever reducers (e.g., acetaminophen or ibuprofen), and resting. Avoid suppressing a productive cough without consulting a healthcare provider.",
-        ],
-    },
-    {
-        "input": "What should I do if I accidentally cut my finger deeply?",
-        "actual_output": (
-            "If you cut your finger deeply, just rinse it with water and avoid applying any pressure. "
-            "Tetanus shots aren't necessary unless you see redness immediately."
+        ToolCall(
+            name="Restaurant Finder",
+            description="Finds top restaurants in a city.",
+            input_parameters={"city": "Paris"},
+            output=["Le Jules Verne", "Angelina Paris", "local wine bars"],
         ),
-        "retrieval_context": [
-            "Deep cuts that are more than 0.25 inches deep or expose fat, muscle, or bone require immediate medical attention. Such wounds may need stitches to heal properly.",
-            "To minimize the risk of infection, wash the wound thoroughly with soap and water. Avoid using alcohol or hydrogen peroxide, as these can irritate the tissue and delay healing.",
-            "If the bleeding persists for more than 10 minutes or soaks through multiple layers of cloth or bandages, seek emergency care. Continuous bleeding might indicate damage to an artery or vein.",
-            "Watch for signs of infection, including redness, swelling, warmth, pain, or pus. Infections can develop even in small cuts if not properly cleaned or if the individual is at risk (e.g., diabetic or immunocompromised).",
-            "Tetanus, a bacterial infection caused by Clostridium tetani, can enter the body through open wounds. Ensure that your tetanus vaccination is up to date, especially if the wound was caused by a rusty or dirty object.",
-        ],
-    },
-]
+    ],
+)
 
+# metric.measure(test_case)
+# print(metric.score)
+# print(metric.reason)
 
-# Create a list of LLMTestCase
-test_cases = []
-for fake_datum in fake_data:
-    test_case = LLMTestCase(
-        input=fake_datum["input"],
-        actual_output=fake_datum["actual_output"],
-        retrieval_context=fake_datum["retrieval_context"],
-    )
-    test_cases.append(test_case)
-
-# Define metrics
-answer_relevancy = AnswerRelevancyMetric()
-faithfulness = FaithfulnessMetric()
-
-# Run evaluation
-evaluate(test_cases=test_cases, metrics=[answer_relevancy, faithfulness])
+# or evaluate test cases in bulk
+evaluate([test_case], [metric])
