Problems with SIMBA and using a verbal feedback for the metric #8278

Open
meditans opened this issue May 26, 2025 · 0 comments

Hi,
I was thinking that, in order to improve the prompt when a metric gives a low score, it would be useful for the optimizer to have access to a verbalization of the reason for the low score. I see that as a step towards making the DSPy program more modular: the explanation of each requirement could be colocated with its metric, so the requirements wouldn't need to be bundled into the initial prompt, and the optimization process could fold them into the prompt in whatever way it sees fit.

Looking through the issues I found #7938, in which @okhat and @ratzrattillo arrived at a very similar conclusion and the new SIMBA optimizer was suggested. Here is the example code that was posted there:

program = MyFancyDSPyModule()

def my_composite_metric(example, prediction, trace=None):
    score1 = metric1(example, prediction, trace)
    score2 = metric2(example, prediction, trace)
    score = (score1 + score2) / 2.0
    feedback = f"You scored {score1}/1.0 and {score2}/1.0 on metric1 and metric2, respectively"
    return dspy.Prediction(score=score, feedback=feedback)

optimized_program = dspy.SIMBA(metric=my_composite_metric).compile(program, trainset=[dspy.Example(...), ...])
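
To make the direction I'm describing more concrete, here is a minimal sketch of a metric whose requirement explanation lives next to the metric rather than in the prompt. Everything in it (the requirement text, the cited_answer_metric name, the toy citation check) is hypothetical; only the convention of returning dspy.Prediction(score=..., feedback=...) is taken from the snippet above:

import dspy

# Hypothetical requirement, colocated with the metric instead of being
# bundled into the initial prompt.
REQUIREMENT = "The answer must cite at least one source document in [brackets]."

def cited_answer_metric(example, prediction, trace=None):
    # Score the prediction and, on a low score, verbalize *why* it failed,
    # so the optimizer can fold the requirement into the prompt as it sees fit.
    has_citation = "[" in prediction.answer and "]" in prediction.answer  # toy check
    score = 1.0 if has_citation else 0.0
    feedback = (
        "Good: the answer contains a citation."
        if has_citation
        else f"Low score because this requirement was violated: {REQUIREMENT}"
    )
    return dspy.Prediction(score=score, feedback=feedback)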

What follows are a few problems I encountered when trying this idea, along with a couple of questions. Keep in mind that I tried to use this optimizer in a 0-shot setting: I was only interested in prompt optimization, since the task is quite token-intensive and I didn't want few-shot examples.

  1. Can SIMBA be used for prompt-only (0-shot) learning? Can it also be used to generate synthetic examples for few-shot learning? (See the sketch after this list for what I mean by the former.)
  2. I used the suggested parameters, but in the end the prompt was not optimized (the final program contained the same prompt I started with).
  3. I'm using MLflow to track the optimization process, but nothing was logged for that experiment (things do get logged when I use MIPROv2). This might be related to #8150, "Unable to access optimized prompt on SIMBA optimizer, train logs don't show any loss after specific steps and no initial prompt on lm.inspect_history".
  4. You suggest returning a Prediction with a feedback field, but I can't figure out where that feedback is actually used in the SIMBA optimizer code.
  5. I would like to better understand how this paradigm of bundling feedback with metrics (which I think is the right direction) overlaps with the mechanism of DSPy assertions. Or is it that assertions are test-time constructs, while richer metrics are compile-time constructs?
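
Regarding question 1, here is the kind of invocation I have in mind for a prompt-only run. This is only a sketch based on the constructor parameters I could see: the max_steps and max_demos arguments, and the idea that max_demos=0 skips few-shot demo construction, are my assumptions and not something confirmed by the documentation:

import dspy

# Assumed prompt-only (0-shot) configuration: max_demos=0 is my guess at how
# to disable the demo-appending strategy so that only instructions get optimized.
simba = dspy.SIMBA(
    metric=my_composite_metric,
    max_steps=8,    # number of optimization steps (assumed default)
    max_demos=0,    # assumption: 0 skips few-shot demo construction
)
optimized_program = simba.compile(program, trainset=[dspy.Example(...), ...])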

Thanks!
