chore(llmobs): implement answer relevancy ragas metric #11915
Conversation
As a next step/PR I think I'd like llmobs eval tests to avoid mocking the submit_evaluation function and instead create a dummy eval writer and use eval events to assert finished evals (like here for spans). WYDT?
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
@Yun-Kim sounds good, will add this as a todo for cleaning up tests/ragas integration!
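A rough sketch of what that could look like; every name here (the dummy writer class, the event fields, the test body) is illustrative only and not an existing dd-trace-py test helper:

```python
# Hypothetical sketch: swap the real eval metric writer for a recording
# stand-in, then assert on the captured evaluation events instead of
# mocking submit_evaluation.

class DummyEvalWriter:
    """Records evaluation metric events instead of sending them."""

    def __init__(self):
        self.events = []

    def enqueue(self, event):
        self.events.append(event)


def test_answer_relevancy_eval_event():
    writer = DummyEvalWriter()
    # In a real test this writer would be swapped in for the LLMObs eval
    # metric writer before running the traced LLM call; here we enqueue a
    # sample event directly to show the assertion pattern.
    writer.enqueue({"label": "ragas_answer_relevancy", "score": 0.87})

    assert len(writer.events) == 1
    event = writer.events[0]
    assert event["label"] == "ragas_answer_relevancy"
    assert 0.0 <= event["score"] <= 1.0
```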
…e-py into evan.li/ragas-answer-rel
Implements the answer relevancy metric for the ragas integration.
About Answer Relevancy
The answer relevancy metric focuses on assessing how pertinent the generated answer is to the given prompt. A lower score is assigned to answers that are incomplete or contain redundant information, and higher scores indicate better relevancy. This metric is computed using the question, the retrieved contexts, and the answer.
Answer relevancy is defined as the mean cosine similarity between the original question and a number of artificial questions that were generated (reverse engineered) from the response.
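As an illustration, here is a minimal sketch of that computation; the function names and the idea of passing precomputed embeddings are assumptions made for the example, not the ragas implementation:

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def answer_relevancy(original_question_emb: np.ndarray,
                     generated_question_embs: list[np.ndarray]) -> float:
    # Mean cosine similarity between the original question and the
    # artificial questions reverse engineered from the response.
    return float(np.mean([cosine_similarity(original_question_emb, emb)
                          for emb in generated_question_embs]))
```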
Example trace
Checklist
Reviewer Checklist