Skip to content

Make normalization optional to fix zsl #56

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 14, 2023
Merged

Make normalization optional to fix zsl #56

merged 10 commits into from
Sep 14, 2023

Conversation

monatis
Copy link
Owner

@monatis monatis commented Sep 12, 2023

closes #54

Since we are already changing function signatures, this might be a good opportunity to use C-compatible signatures everywhere and remove functions with _c suffix. WDYT?

@denis-ismailaj
Copy link
Collaborator

denis-ismailaj commented Sep 13, 2023

Since we are already changing function signatures, this might be a good opportunity to use C-compatible signatures everywhere and remove functions with _c suffix. WDYT?

I think it's better to keep C++ signatures so that it feels natural for users that would like to call the library from C++ code, however we can use preprocessor directives to only expose C-compatible functions when used in C so that we can rename them to have the same names in both cases.

Meaning we go from

#ifdef __cplusplus
bool clip_text_encode(const clip_ctx * ctx, int n_threads, const std::vector<clip_vocab::id> & tokens, float * vec);
#endif
bool clip_text_encode_c(const struct clip_ctx * ctx, int n_threads, const struct clip_tokens * tokens, float * vec);

to

#ifdef __cplusplus
bool clip_text_encode(const clip_ctx * ctx, int n_threads, const std::vector<clip_vocab::id> & tokens, float * vec);
#else
bool clip_text_encode(const struct clip_ctx * ctx, int n_threads, const struct clip_tokens * tokens, float * vec);
#endif

EDIT: A con of this is that it would require doing #ifdefs in the implementation as well, so maybe it's not worth it.

@monatis
Copy link
Owner Author

monatis commented Sep 13, 2023

we can use preprocessor directives to only expose C-compatible functions when used in C so that we can rename them to have the same names in both cases

I think functions with the same name with different signatures would make it too confusing. I'd prefer to either keep both versions or expose only C-compatible functions without _c suffix, simplifying the API.

@denis-ismailaj
Copy link
Collaborator

I think functions with the same name with different signatures would make it too confusing.

Yeah, that's a valid concern.

expose only C-compatible functions without _c suffix, simplifying the API

In that case, I would recommend making the entire public API in clip.h C-compatible to streamline it. This would mean converting some struct fields to simpler types and "hiding" the other structs which don't need to be public in clip.cpp.
This seems to be the approach llama.cpp has taken as well.

I can open a PR to take a stab at this if that's fine with you.

@monatis
Copy link
Owner Author

monatis commented Sep 13, 2023

I would recommend making the entire public API in clip.h C-compatible to streamline it.

Yeah, this is what I'm also thinking of.

I can open a PR to take a stab at this if that's fine with you.

That would be awesome. You can target this PR (or directly push to it) and then we can merge to the main branch at once.

Thanks for the colaboration!

* main.c: remove hardcoded vec_dim

* Make all public functions use C-compatible signatures

* Move private declarations to clip.cpp

* Merge with fix-zsl

* Update examples

* python: rm redundant bindings to CLIPLayer and models

* formatting: apply formatting script

* readme: add notices about the new API

* zsl: use const auto &

* tests: fix benchmark utility

* python: bump version

* Add constructor-like functions for batched structs

---------

Co-authored-by: M. Yusuf Sarıgöz <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve zero-shot labeling
2 participants