Skip to content

Extract used CSS variables from .css files #17433

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 28, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Fix negated `content` rules in legacy JavaScript configuration ([#17255](https://github.com/tailwindlabs/tailwindcss/pull/17255))
- Extract special `@("@")md:…` syntax in Razor files ([#17427](https://github.com/tailwindlabs/tailwindcss/pull/17427))
- Disallow arbitrary values with top-level braces and semicolons as well as unbalanced parentheses and brackets ([#17361](https://github.com/tailwindlabs/tailwindcss/pull/17361))
- Extract used CSS variables from `.css` files ([#17433](https://github.com/tailwindlabs/tailwindcss/pull/17433))

### Changed

Expand Down
36 changes: 36 additions & 0 deletions crates/oxide/src/extractor/mod.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
use crate::cursor;
use crate::extractor::machine::Span;
use bstr::ByteSlice;
use candidate_machine::CandidateMachine;
use css_variable_machine::CssVariableMachine;
use machine::{Machine, MachineState};
Expand Down Expand Up @@ -139,6 +140,41 @@ impl<'a> Extractor<'a> {

extracted
}

pub fn extract_variables_from_css(&mut self) -> Vec<Extracted<'a>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be done in a pre-processor where we already have logic to do that based on file extension?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of but not really, we do have the contents and the extension so we do have all the information we need, but we return a new potentially transformed input file.

pub fn pre_process_input(content: &[u8], extension: &str) -> Vec<u8> {}

But we don't emit extracted variables here. So we could update this logic to also emit Vec<Extracted>.

Another solution is that we could transform the CSS file and replace everything that is not a used CSS variable with whitespace. That way the normal extractor won't find anything, and the CSS extractor will only find used variables.

But both these solutions seem like a hack, and a bit of an abuse for this pre_process_input function 😅 unless you're thinking about something completely different?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think this does make sense, agree that abusing pre-process for this does feel wrong too now that I think about it more.

let mut extracted = Vec::with_capacity(100);

let len = self.cursor.input.len();

let cursor = &mut self.cursor.clone();
while cursor.pos < len {
if cursor.curr.is_ascii_whitespace() {
cursor.advance();
continue;
}

if let MachineState::Done(span) = self.css_variable_machine.next(cursor) {
// We are only interested in variables that are used, not defined. Therefore we
// need to ensure that the variable is prefixed with `var(`.
if span.start < 4 {
cursor.advance();
continue;
}

let slice_before = Span::new(span.start - 4, span.start - 1);
if !slice_before.slice(self.cursor.input).starts_with(b"var(") {
cursor.advance();
continue;
}

extracted.push(Extracted::CssVariable(span.slice(self.cursor.input)));
}

cursor.advance();
}

extracted
}
}

// Extract sub-candidates from a given range.
Expand Down
1 change: 0 additions & 1 deletion crates/oxide/src/scanner/fixtures/ignored-extensions.txt
Original file line number Diff line number Diff line change
@@ -1,4 +1,3 @@
css
less
lock
sass
Expand Down
64 changes: 62 additions & 2 deletions crates/oxide/src/scanner/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,9 @@ pub struct Scanner {
/// All found extensions
extensions: FxHashSet<String>,

/// All CSS files we want to scan for CSS variable usage
css_files: Vec<PathBuf>,

/// All files that we have to scan
files: Vec<PathBuf>,

Expand Down Expand Up @@ -212,11 +215,25 @@ impl Scanner {
fn extract_candidates(&mut self) -> Vec<String> {
let changed_content = self.changed_content.drain(..).collect::<Vec<_>>();

let candidates = parse_all_blobs(read_all_files(changed_content));
// Extract all candidates from the changed content
let mut new_candidates = parse_all_blobs(read_all_files(changed_content));

// Extract all CSS variables from the CSS files
let css_files = self.css_files.drain(..).collect::<Vec<_>>();
if !css_files.is_empty() {
let css_variables = extract_css_variables(read_all_files(
css_files
.into_iter()
.map(|file| ChangedContent::File(file, "css".into()))
.collect(),
));

new_candidates.extend(css_variables);
}

// Only compute the new candidates and ignore the ones we already have. This is for
// subsequent calls to prevent serializing the entire set of candidates every time.
let mut new_candidates = candidates
let mut new_candidates = new_candidates
.into_par_iter()
.filter(|candidate| !self.candidates.contains(candidate))
.collect::<Vec<_>>();
Expand Down Expand Up @@ -248,6 +265,12 @@ impl Scanner {
.and_then(|x| x.to_str())
.unwrap_or_default(); // In case the file has no extension

// Special handing for CSS files to extract CSS variables
if extension == "css" {
self.css_files.push(path);
continue;
}
Comment on lines +268 to +272
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? Wouldn't we literally want to emit this in the files list so we also ensure we have a watcher for it etc?

Copy link
Member Author

@RobinMalfait RobinMalfait Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you are right I think. My thinking was that changing a CSS file triggers a full rebuild anyway. But that statement doesn't hold true for .module.css files and they should be watched indeed.

We also have to be careful that we don't scan and watch the dist/out.css file 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind following up on that? Just so we don't forget

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future reference. Follow up PR: #17467


self.extensions.insert(extension.to_owned());
self.changed_content.push(ChangedContent::File(
path.to_path_buf(),
Expand Down Expand Up @@ -402,6 +425,43 @@ fn read_all_files(changed_content: Vec<ChangedContent>) -> Vec<Vec<u8>> {
.collect()
}

#[tracing::instrument(skip_all)]
fn extract_css_variables(blobs: Vec<Vec<u8>>) -> Vec<String> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started passing options to the other parse_all_blobs function but it looked a bit messy. So duplicated it instead (even though there is a good chunk of duplication going on).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the only difference really should be that we no-op the Candidate machine though? Can you elaborate what you meant with messy?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started passing through the contents and the extension and it leaked everywhere. Can see that in this commit: 7dc7878

Then I was thinking about an ExtractorOptions struct that we could pass in as well, but that also required checking the options in the extract() call at runtime. While that's not the end of the world, that's additional check we have to perform but it's a useless check for 99% of the files we scan. To make things a bit worse, the extract() function is called for every line in every file. So there could be a lot of unnecessary checks.

So instead of checking it in this hot path, I created a separate function instead.

let mut result: Vec<_> = blobs
.par_iter()
.flat_map(|blob| blob.par_split(|x| *x == b'\n'))
.filter_map(|blob| {
if blob.is_empty() {
return None;
}

let extracted = crate::extractor::Extractor::new(blob).extract_variables_from_css();
if extracted.is_empty() {
return None;
}

Some(FxHashSet::from_iter(extracted.into_iter().map(
|x| match x {
Extracted::CssVariable(bytes) => bytes,
_ => &[],
},
)))
})
.reduce(Default::default, |mut a, b| {
a.extend(b);
a
})
.into_iter()
.map(|s| unsafe { String::from_utf8_unchecked(s.to_vec()) })
.collect();

// SAFETY: Unstable sort is faster and in this scenario it's also safe because we are
// guaranteed to have unique candidates.
result.par_sort_unstable();

result
}

#[tracing::instrument(skip_all)]
fn parse_all_blobs(blobs: Vec<Vec<u8>>) -> Vec<String> {
let mut result: Vec<_> = blobs
Expand Down
35 changes: 35 additions & 0 deletions crates/oxide/tests/scanner.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1735,4 +1735,39 @@ mod scanner {

assert_eq!(candidates, vec!["content-['abcd/xyz.html']"]);
}

#[test]
fn test_extract_used_css_variables_from_css() {
let dir = tempdir().unwrap().into_path();
create_files_in(
&dir,
&[
(
"src/index.css",
r#"
@theme {
--color-red: #ff0000; /* Not used, so don't extract */
--color-green: #00ff00; /* Not used, so don't extract */
}

.button {
color: var(--color-red); /* Used, so extract */
}
"#,
),
("src/used-at-start.css", "var(--color-used-at-start)"),
// Here to verify that we don't crash when trying to find `var(` in front of the
// variable.
("src/defined-at-start.css", "--color-defined-at-start: red;"),
],
);

let mut scanner = Scanner::new(vec![public_source_entry_from_pattern(
dir.clone(),
"@source './'",
)]);
let candidates = scanner.scan();

assert_eq!(candidates, vec!["--color-red", "--color-used-at-start"]);
}
}