Skip to content
This repository was archived by the owner on Aug 7, 2024. It is now read-only.

Added use_activation_hooks: bool to swap #214

Closed
wants to merge 7 commits into from
Closed

Conversation

awgu
Copy link

@awgu awgu commented Feb 14, 2024

Stack from ghstack (oldest at bottom):

This PR exposes use_activation_hooks to swap_linear_with_float8_linear. I am not sure if we want this long-term. Are we going to consolidate to always using activation hooks in the future (because we need it for DTensor (?))?

Test Plan

./test/test_everything.sh

Differential Revision: D53780333

awgu pushed a commit that referenced this pull request Feb 14, 2024
ghstack-source-id: 65dd688
Pull Request resolved: #214
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 14, 2024
@@ -70,8 +70,10 @@ def _update_history_with_new_amax(new_amax, amax_history):
def swap_linear_with_float8_linear(
module: nn.Module,
module_cls: Type[nn.Module],
emulate: bool = False,
*,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that forcing these to be kwarg only would be less error prone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I like it!

@awgu awgu marked this pull request as ready for review February 14, 2024 21:31
@awgu
Copy link
Author

awgu commented Feb 14, 2024

@awgu has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Andrew Gu added 2 commits February 15, 2024 11:16
This PR exposes `use_activation_hooks` to `swap_linear_with_float8_linear`. I am not sure if we want this long-term. Are we going to consolidate to _always_ using activation hooks in the future (because we need it for DTensor (?))?

**Test Plan**
```
./test/test_everything.sh
```

Differential Revision: [D53780333](https://our.internmc.facebook.com/intern/diff/D53780333)

[ghstack-poisoned]
This PR exposes `use_activation_hooks` to `swap_linear_with_float8_linear`. I am not sure if we want this long-term. Are we going to consolidate to _always_ using activation hooks in the future (because we need it for DTensor (?))?

**Test Plan**
```
./test/test_everything.sh
```

Differential Revision: [D53780333](https://our.internmc.facebook.com/intern/diff/D53780333)

[ghstack-poisoned]
Andrew Gu added 3 commits February 16, 2024 07:36
This PR exposes `use_activation_hooks` to `swap_linear_with_float8_linear`. I am not sure if we want this long-term. Are we going to consolidate to _always_ using activation hooks in the future (because we need it for DTensor (?))?

**Test Plan**
```
./test/test_everything.sh
```

Differential Revision: [D53780333](https://our.internmc.facebook.com/intern/diff/D53780333)

[ghstack-poisoned]
This PR exposes `use_activation_hooks` to `swap_linear_with_float8_linear`. I am not sure if we want this long-term. Are we going to consolidate to _always_ using activation hooks in the future (because we need it for DTensor (?))?

**Test Plan**
```
./test/test_everything.sh
```

Differential Revision: [D53780333](https://our.internmc.facebook.com/intern/diff/D53780333)

[ghstack-poisoned]
This PR exposes `use_activation_hooks` to `swap_linear_with_float8_linear`. I am not sure if we want this long-term. Are we going to consolidate to _always_ using activation hooks in the future (because we need it for DTensor (?))?

**Test Plan**
```
./test/test_everything.sh
```

Differential Revision: [D53780333](https://our.internmc.facebook.com/intern/diff/D53780333)

[ghstack-poisoned]
@awgu awgu requested a review from drisspg February 16, 2024 16:06
This PR exposes `use_activation_hooks` to `swap_linear_with_float8_linear`. I am not sure if we want this long-term. Are we going to consolidate to _always_ using activation hooks in the future (because we need it for DTensor (?))?

**Test Plan**
```
./test/test_everything.sh
```

Differential Revision: [D53780333](https://our.internmc.facebook.com/intern/diff/D53780333)

[ghstack-poisoned]
awgu pushed a commit that referenced this pull request Feb 27, 2024
ghstack-source-id: 0029afb
Pull Request resolved: #214
@awgu
Copy link
Author

awgu commented Feb 27, 2024

@drisspg I am seeing:

[rank1]:   File "/data/users/andgu/float8_experimental/float8_experimental/float8_tensor.py", line 284, in __torch_dispatch__
[rank1]:     raise NotImplementedError(f"attempting to run {func}, this is not supported")

from test_dtensor.py with today's pytorch viable/strict (2ad66e6bf087a7fde1dd4fcfa14e99f9b7bc1440). It should be unrelated to this PR though.

@awgu
Copy link
Author

awgu commented Feb 27, 2024

@awgu has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@awgu merged this pull request in b9b37f8.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants