Description
[x] I checked the documentation and related resources and couldn't find an answer to my question.
Your Question
I would like to use logprobs to assess the confidence of verdict predictions in ragas; the implementation would likely use callbacks. Before implementing it, I would like to know whether anyone has already experimented with this, and what your takeaways were.
My hope is to reduce the nondeterministic nature of the scores from one run to another, e.g. by reducing hallucinations in verdict predictions: if the confidence of a verdict prediction is very low, rerun the prediction.
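A minimal sketch of the rerun idea, independent of ragas itself. It assumes the LLM client can return the logprobs of the tokens that make up the verdict; `predict`, `verdict_confidence`, `predict_with_retry`, the 0.8 threshold, and the attempt limit are all hypothetical names and values for illustration, not part of the ragas API.

```python
import math
from typing import Callable, Sequence, Tuple

# Hypothetical shape: a predictor returns the verdict (0 or 1) together with
# the logprobs of the tokens that spell out that verdict.
Prediction = Tuple[int, Sequence[float]]


def verdict_confidence(token_logprobs: Sequence[float]) -> float:
    """Joint probability of the verdict tokens: exp(sum of their logprobs)."""
    if not token_logprobs:
        raise ValueError("no logprobs available for the verdict tokens")
    return math.exp(sum(token_logprobs))


def predict_with_retry(
    predict: Callable[[], Prediction],
    threshold: float = 0.8,   # assumed cutoff, to be tuned empirically
    max_attempts: int = 3,    # assumed retry budget
) -> Tuple[int, float]:
    """Rerun the prediction while confidence is below the threshold.

    Returns the (verdict, confidence) of the most confident attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    best_verdict, best_conf = None, -1.0
    for _ in range(max_attempts):
        verdict, logprobs = predict()
        conf = verdict_confidence(logprobs)
        if conf > best_conf:
            best_verdict, best_conf = verdict, conf
        if conf >= threshold:
            break  # confident enough, stop retrying
    return best_verdict, best_conf
```

Note that a rerun can only change the outcome if the model's sampling actually varies between calls (for example, a nonzero temperature); with fully greedy decoding the same low-confidence verdict would likely come back each time.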