
add SAELens as a library ⚡ #826

Merged
merged 1 commit into from
Aug 1, 2024
6 changes: 6 additions & 0 deletions packages/tasks/src/model-libraries.ts
@@ -429,6 +429,12 @@ export const MODEL_LIBRARIES_UI_ELEMENTS = {
filter: false,
countDownloads: `path:"tokenizer.model"`,
},
saelens: {
prettyLabel: "SAELens",
repoName: "SAELens",
repoUrl: "https://github.com/jbloomAus/SAELens",
filter: false,
Member Author

QQ: @Wauplin - do you have any recommendations on the best way to track downloads in this scenario: https://huggingface.co/jbloom/GPT2-Small-SAEs-Reformatted/tree/main

Contributor

It depends on whether the general use case is to load only one block or all blocks at once.

  • if only one block, it is better to do as I suggested in add SAELens as a library ⚡ #826 (review)
  • if all blocks are generally instantiated at once, then it is better to count downloads only on the first block: path: "blocks.0.hook_resid_pre/cfg.json"
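
To make the trade-off concrete, here is a minimal sketch of the second option. It reuses the `countDownloads` string format that the neighbouring entry in this diff uses (`path:"tokenizer.model"`). The `LibraryUiElementSketch` interface, the `countMatches` helper, and the example block paths are hypothetical and exist only for illustration. In particular, `countMatches` is a toy model of "one download per request for the matching file", not how the Hub actually counts.

```typescript
// Hypothetical, trimmed-down shape of one entry. The real type in
// packages/tasks/src/model-libraries.ts may have more fields.
interface LibraryUiElementSketch {
	prettyLabel: string;
	repoName: string;
	repoUrl: string;
	filter: boolean;
	countDownloads?: string;
}

// Second option from the comment: count only the first block's config,
// so that loading every block at once is counted once.
const saelensFirstBlockOnly: LibraryUiElementSketch = {
	prettyLabel: "SAELens",
	repoName: "SAELens",
	repoUrl: "https://github.com/jbloomAus/SAELens",
	filter: false,
	countDownloads: `path:"blocks.0.hook_resid_pre/cfg.json"`,
};

// Toy model (an assumption): one download is counted for every requested
// file whose path is exactly the path named in the rule.
function countMatches(requestedPaths: string[], rulePath: string): number {
	return requestedPaths.filter((p) => p === rulePath).length;
}

const firstBlockCfg = "blocks.0.hook_resid_pre/cfg.json";

// Hypothetical request patterns.
const loadAllBlocks = [
	"blocks.0.hook_resid_pre/cfg.json",
	"blocks.1.hook_resid_pre/cfg.json",
	"blocks.2.hook_resid_pre/cfg.json",
];
const loadOneLaterBlock = ["blocks.2.hook_resid_pre/cfg.json"];

console.log(countMatches(loadAllBlocks, firstBlockCfg));
console.log(countMatches(loadOneLaterBlock, firstBlockCfg));
```

Under this toy model, loading all blocks is counted once, while loading only a later block is not counted at all. That is why the first-block rule only fits the "all blocks at once" use case.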

Member Author

I'm afraid it is a bit more complicated: newer repos (I cannot link them at the moment as they are not public yet) might just have an npz file. The usage is more layer-wise, so I think the former suggestion for npz, cfg, json makes more sense.

Member

Yes, I think each block provides a different lens, so they are used independently. But the structure does not appear to be standardized, so I would merge as is for now (without download counts) and clarify when we can.

Member Author

Okay, the download counting will need more time to standardise. Given that this unblocks a release happening soon, is it okay if we merge as is for now and revisit in a day when we have more clarity?

Contributor

Fine for me!

},
"sample-factory": {
prettyLabel: "sample-factory",
repoName: "sample-factory",