-
Notifications
You must be signed in to change notification settings - Fork 607
MTK Android Llama Runner #6208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MTK Android Llama Runner #6208
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6208
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 6db8726 with merge base 2c32bf3 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
TODO: Name space changes (reflect latest on #5478) |
Also pointing to @kirklandsign on how he wants to handle the extension/android/CMakeLists.txt changes. |
examples/mediatek/executor_runner/llama_runner/llm_helper/include/llama_runner_values.h
Show resolved
Hide resolved
extension/android/CMakeLists.txt
Outdated
@@ -155,6 +155,26 @@ if(EXECUTORCH_BUILD_LLAMA_JNI) | |||
${EXECUTORCH_ROOT}/examples/models/llama2/runner | |||
${CMAKE_CURRENT_BINARY_DIR}/../../examples/models/llama2/runner | |||
) | |||
|
|||
target_sources( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should protect this under a flag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aligned with @kirklandsign that he will help make change to this as it is CMake
@@ -17,6 +17,7 @@ | |||
|
|||
#include <executorch/examples/models/llama2/runner/runner.h> | |||
#include <executorch/examples/models/llava/runner/llava_runner.h> | |||
#include <executorch/examples/mediatek/executor_runner/mtk_llama_runner.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need to put this under a preprocessor macro. We don't want to add MTK header library to our build always
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aligned with @kirklandsign that he will help make change to this as he would like to refactor jni a bit.
c097f9f
to
4e310cb
Compare
@kirklandsign as part of the build scripts and jni changes you'll make. Just an fyi that there is an lintrunner error on the mixed upper and lower case:
|
7020c1a
to
a167d66
Compare
82d4ff2
to
03c12c8
Compare
const std::string TOKENIZER_PATH = | ||
"/data/local/tmp/et-mtk/llama3/tokenizer.model"; | ||
const std::string TOKEN_EMBEDDING_PATH = | ||
"/data/local/tmp/et-mtk/llama3/embedding_llama3-8B-instruct_fp32.bin"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to fix those
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for right now, tokenizer path, token embedding path and model paths will be hardcoded in aar. We will then make changes to see if we want to have a different flow.
f56b488
to
b2bca6e
Compare
@neuropilot-captain please take a look at the media llama runner change |
@cmodi-meta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accept the same reason as #6304, @neuropilot-captain please let us know if you have any concern
@@ -37,6 +37,7 @@ build_android_native_library() { | |||
cmake . -DCMAKE_INSTALL_PREFIX="${CMAKE_OUT}" \ | |||
-DCMAKE_TOOLCHAIN_FILE="${ANDROID_NDK}/build/cmake/android.toolchain.cmake" \ | |||
-DANDROID_ABI="${ANDROID_ABI}" \ | |||
-DANDROID_PLATFORM=android-26 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we bump to more recent version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems that here 26 is needed? Thought 30 is stricter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a4ff680
to
6db8726
Compare
rebase |
@cmodi-meta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
1 similar comment
@cmodi-meta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Adding the new MediaTek Runner that will work with the Android Demo app.
Run script to generate aar like below:
.aar file will live in
examples/demo-apps/android/Llamademo/app/libs
as executorch-llama.aarNote: The new runner (
mtk_llama_runner.cpp
) is a fork of the existingmtk_llama_executor_runner.cpp
. Ifmtk_llama_executor_runner.cpp
is modified thenmtk_llama_runner.cpp
will need to as well. Another alternative is to adopt themtk_llama_runner.cpp
and it's flow as primary.