Skip to content

[vim] Improve iskeyword for LLVM IR #117905

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 4, 2024
Merged

Conversation

frasercrmck
Copy link
Contributor

This patch sets the 'iskeyword' variable to characters found in LLVM IR identifiers. Keywords are used in many places in vim, most notably being treated as word boundaries for commands like 'w' and '*'. The aim with this is to improve the navigability and editability of LLVM IR files as now one is able to: skip over entire identifiers with motions (e.g., w/e/b); yank/delete whole identifiers (e.g., diw); highlight/search for the identifier under the cursor (*), etc.

More complicated LLVM identifiers including quotation marks are not supported. The 'iskeyword' variable is just a list of characters, not a regex, and including quotation marks and all the characters permitted in quoted identifiers would expand the scope to almost everything and become less usable. These types of identifiers are rare by comparison.

Note that this does change how words are considered across the entire LLVM IR file, so including strings, comments, names, etc. Given that the majority of editing/navigating LLVM IR is working with and across values, this is arguably a worthwhile trade-off.

This patch sets the 'iskeyword' variable to characters found in LLVM IR
identifiers. Keywords are used in many places in vim, most notably being
treated as word boundaries for commands like 'w' and '*'. The aim with
this is to improve the navigability and editability of LLVM IR files as
now one is able to: skip over entire identifiers with motions (e.g.,
w/e/b); yank/delete whole identifiers (e.g., diw); highlight/search for
the identifier under the cursor (*), etc.

More complicated LLVM identifiers including quotation marks are not
supported. The 'iskeyword' variable is just a list of characters, not a
regex, and including quotation marks and all the characters permitted in
quoted identifiers would expand the scope to almost everything and
become less usable. These types of identifiers are rare by comparison.

Note that this does change how words are considered across the entire
LLVM IR file, so including strings, comments, names, etc. Given that the
majority of editing/navigating LLVM IR is working with and across
values, this is arguably a worthwhile trade-off.
@nickdesaulniers
Copy link
Member

Just so I understand the issue being addressed by this issue, is it that variable identifiers that begin with % or @ are treated currently as two words, such that motions don't act on the whole identifier?

@ldrumm
Copy link
Contributor

ldrumm commented Dec 3, 2024

I'm a big fan of this. Like you say: it's not perfect, but it makes following SSA values so much easier that I think it's definitely worth it despite not covering all complex identifiers.

I'll let someone with more vim chops chime in, but big +1 from me

@frasercrmck
Copy link
Contributor Author

Just so I understand the issue being addressed by this issue, is it that variable identifiers that begin with % or @ are treated currently as two words, such that motions don't act on the whole identifier?

Yes, partially. However, it's more than that; since . is treated as a word separator, I find IR values very cumbersome to work with. For example, word-based motions on a %foo.i.i.0 b currently take 9 ws to move the cursor from a to b. With this patch it'd be 2, as with any other "word" in any other language. You can also have your cursor anywhere on %foo.i.i.0 and * will take you to the next occurrence. This was my primary motivation for the patch. Yes, you have capital-w Word motions to combat some of what I'm saying, but operators like * can't work with those.

It's a balance, however, since it becomes harder to move through these sub-components of identifiers. You'd probably have to start using motions based on finding the next/previous . character. Ultimately, people can change this to suit their needs. This is about a sensible default. I could always leave in a comment explaining to comment this iskeyword out to have such-and-such behaviour.

All that said, I was thinking I should add the non-latin alphanumeric characters back to the list. That keeps strings just as easy to navigate through (we don't want something like hübsch to become more difficult to navigate through)

@frasercrmck frasercrmck merged commit 8c46413 into llvm:main Dec 4, 2024
8 checks passed
@frasercrmck frasercrmck deleted the vim-iskeyword branch December 4, 2024 10:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants