Remove overridden prefill API declaration in llava_runner.h #5149


Merged · 5 commits · Nov 22, 2024
Changes from all commits
42 changes: 9 additions & 33 deletions examples/models/llava/runner/llava_runner.h
@@ -29,62 +29,38 @@ class ET_EXPERIMENTAL LlavaRunner
       const std::string& tokenizer_path,
       const float temperature = 0.8f)
       : MultimodalRunner(model_path, tokenizer_path, temperature){};
-  bool is_loaded();
-  ::executorch::runtime::Error load();
+
+  bool is_loaded() override;
+
+  ::executorch::runtime::Error load() override;
+
   ::executorch::runtime::Error generate(
       std::vector<::executorch::extension::llm::Image> images,
       const std::string& prompt,
       int32_t seq_len = 1024,
       std::function<void(const std::string&)> token_callback = {},
       std::function<void(const ::executorch::extension::llm::Stats&)>
           stats_callback = {},
-      bool echo = true);
+      bool echo = true) override;
 
-  /**
-   * Prefill an LLaVA Module with the given images input.
-   * @param images The image input to LLaVA.
-   * @param start_pos The starting position in KV cache of the input in the LLM.
-   * It's passed as reference and will be updated inside this function.
-   * @return The error status of prefilling images.
-   */
   ::executorch::runtime::Error prefill_images(
       std::vector<::executorch::extension::llm::Image>& images,
-      int64_t& start_pos);
+      int64_t& start_pos) override;
 
-  /**
-   * Prefill an LLaVA Module with the given text input.
-   * @param prompt The text prompt to LLaVA.
-   * @param start_pos The starting position in KV cache of the input in the LLM.
-   * It's passed as reference and will be updated inside this function.
-   * @param bos The number of BOS (begin of sequence) token.
-   * @param eos The number of EOS (end of sequence) token.
-   * @return The generated token of the LLaVA Module after prefill prompt.
-   */
   ::executorch::runtime::Result<uint64_t> prefill_prompt(
       const std::string& prompt,
       int64_t& start_pos,
       int8_t bos = 0,
-      int8_t eos = 0);
+      int8_t eos = 0) override;
 
-  /**
-   * Generate tokens from the given prompt, starting from the given position.
-   * @param prompt The text prompt to LLaVA.
-   * @param seq_len The total sequence length, including the prompt tokens and
-   * new tokens.
-   * @param start_pos The starting position in KV cache of the input in the LLM.
-   * @param token_callback What to do after a token is generated.
-   * @param stats_callback What to do with Stats.
-   * @param echo Whether to echo the input prompt or not.
-   * @return The error code.
-   */
   ::executorch::runtime::Error generate_from_pos(
       const std::string& prompt,
       int32_t seq_len = 1024,
       int64_t start_pos = 0,
       std::function<void(const std::string&)> token_callback = {},
       std::function<void(const ::executorch::extension::llm::Stats&)>
           stats_callback = {},
-      bool echo = true);
+      bool echo = true) override;
 
  private:
   inline static const std::string kPresetPrompt =
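
The doc comments deleted above describe the intended call flow of these APIs: prefill_images and prefill_prompt advance start_pos through the KV cache, and generate_from_pos then decodes from that position. Below is a minimal usage sketch of that flow; the include path, the example namespace, the artifact file names, and the image preprocessing are assumptions for illustration and are not part of this PR.

```cpp
#include <executorch/examples/models/llava/runner/llava_runner.h>  // assumed include path

#include <iostream>
#include <string>
#include <vector>

using ::executorch::extension::llm::Image;
using ::executorch::runtime::Error;

int main() {
  // Hypothetical model/tokenizer artifacts; the namespace is assumed here.
  example::LlavaRunner runner("llava.pte", "tokenizer.bin");
  if (runner.load() != Error::Ok) {
    return 1;
  }

  int64_t start_pos = 0;  // advanced in place by each prefill call

  // Prefill the image tokens into the KV cache.
  std::vector<Image> images;  // assume images were preprocessed elsewhere
  if (runner.prefill_images(images, start_pos) != Error::Ok) {
    return 1;
  }

  // Prefill the text prompt; on success the result holds the first generated token.
  const std::string prompt = "What is in the picture?";
  auto first_token = runner.prefill_prompt(prompt, start_pos);
  if (!first_token.ok()) {
    return 1;
  }

  // Decode the remaining tokens, starting from the current KV-cache position.
  runner.generate_from_pos(
      prompt,
      /*seq_len=*/1024,
      start_pos,
      [](const std::string& piece) { std::cout << piece; });
  return 0;
}
```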