Skip to content

[clang][scan-build] Treat --use-cc and --use-c++ as shell commands #131932

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Mar 24, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
94 changes: 36 additions & 58 deletions clang/tools/scan-build/libexec/ccc-analyzer
Original file line number Diff line number Diff line change
Expand Up @@ -51,63 +51,40 @@ sub silent_system {
# Compiler command setup.
##===----------------------------------------------------------------------===##

# Search in the PATH if the compiler exists
sub SearchInPath {
my $file = shift;
foreach my $dir (split (':', $ENV{PATH})) {
if (-x "$dir/$file") {
return 1;
}
}
return 0;
}

my $Compiler;
my $Clang;
my $DefaultCCompiler;
my $DefaultCXXCompiler;
my $IsCXX;
my $AnalyzerTarget;

# If on OSX, use xcrun to determine the SDK root.
my $UseXCRUN = 0;

if (`uname -s` =~ m/Darwin/) {
$DefaultCCompiler = 'clang';
$DefaultCXXCompiler = 'clang++';
# Older versions of OSX do not have xcrun to
# query the SDK location.
if (-x "/usr/bin/xcrun") {
$UseXCRUN = 1;
}
} elsif (`uname -s` =~ m/(FreeBSD|OpenBSD)/) {
$DefaultCCompiler = 'cc';
$DefaultCXXCompiler = 'c++';
} else {
$DefaultCCompiler = 'gcc';
$DefaultCXXCompiler = 'g++';
}

if ($FindBin::Script =~ /c\+\+-analyzer/) {
$Compiler = $ENV{'CCC_CXX'};
if (!defined $Compiler || (! -x $Compiler && ! SearchInPath($Compiler))) { $Compiler = $DefaultCXXCompiler; }

$Clang = $ENV{'CLANG_CXX'};
if (!defined $Clang || ! -x $Clang) { $Clang = 'clang++'; }

$IsCXX = 1
{
my ($DefaultCCompiler, $DefaultCXXCompiler);

my $os = `uname -s`;
if ($os =~ m/Darwin/) {
$DefaultCCompiler = 'clang';
$DefaultCXXCompiler = 'clang++';
} elsif ($os =~ m/(FreeBSD|OpenBSD)/) {
$DefaultCCompiler = 'cc';
$DefaultCXXCompiler = 'c++';
} else {
$DefaultCCompiler = 'gcc';
$DefaultCXXCompiler = 'g++';
}

sub DetermineCompiler {
my ($is_cxx) = @_;
my $default = $is_cxx ? $DefaultCXXCompiler : $DefaultCCompiler;
my $opt = $ENV{$is_cxx ? 'CCC_CXX' : 'CCC_CC'};
return defined $opt ? shellwords($opt) : $default;
}
}
else {
$Compiler = $ENV{'CCC_CC'};
if (!defined $Compiler || (! -x $Compiler && ! SearchInPath($Compiler))) { $Compiler = $DefaultCCompiler; }

$Clang = $ENV{'CLANG'};
if (!defined $Clang || ! -x $Clang) { $Clang = 'clang'; }

$IsCXX = 0
sub DetermineClang {
my ($is_cxx) = @_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bikeshedding: Please use $UpperCamelCase within the new functions DetermineCompiler / DetermineClang because it's a bit jarring when $IsCXX / $is_cxx appears both in $UpperCamelCase and in $snake_case. I see that these styles both appear in the script, but in these parts $UpperCamelCase is more common, so I think it would be better if you used that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with you on the mix of different styles in this code being far from ideal. I did briefly consider cleaning those things up a bit, but decided to not extend the scope of this changeset further and to perhaps save those kinds of changes for a separate pull-request if there was any interest in that.

Here's how I arrived at the names currently used in this pull-request:

  • Functions
    Almost all existing functions use UpperCamelCase, so I stuck with that even though it's not at all typical for the majority of modern(-ish) Perl code

  • Variables
    These were also quite inconsistent, but I stuck with common modern(-ish) practices, where most regular lexical variables use $snake_case, constants use $ALL_CAPS, and "globals" use $UpperCamelCase as per https://metacpan.org/dist/perl/view/pod/perlstyle.pod, the old Perl Best Practices book, and established community conventions.

    I'd find it much more jarring if a function was to potentially shadow a variable which reads like a $Global from a surrounding scope.

I don't feel particularly strongly about it, and would appreciate your input on the following suggestions for how to best move forwards:

  1. Just call the parameter $IsCXX in the functions as suggested.
    This defies the expectations of many Perl programmers when it comes to naming, but so does most of the rest of the program.
  2. Keep things as they are.
    This is consistent with modern practices.
  3. Call the local $is_cxx variable something else (but not $IsCXX), or rename the "global" $IsCXX.
    To not potentially confuse people unfamiliar with Perl or its current common practices.
  4. Just use the "global" $IsCXX directly.
    Reducing some of the global state to make it slightly easier to reason about those individual functions seemed nice, but given how much of it there is anyway, maybe it doesn't even matter in this context.

Thanks for your review and your feedback on the above!

Copy link
Contributor

@NagyDonat NagyDonat Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for the clarifications. I didn't know about these Perl naming conventions (last time I touched Perl was 10+ years ago), but now that you explained them, I agree that it's reasonable to follow them and keep the code as it is currently. (The $is_cxx / $IsCXX difference is a bit strange for the untrained eye, but it won't cause confusion or misunderstandings.)

my $default = $is_cxx ? 'clang++' : 'clang';
my $opt = $ENV{$is_cxx ? 'CLANG_CXX' : 'CLANG'};
return defined $opt ? $opt : $default;
}

$AnalyzerTarget = $ENV{'CLANG_ANALYZER_TARGET'};
my $IsCXX = $FindBin::Script =~ /c\+\+-analyzer/;
my ($Compiler, @CompilerArgs) = DetermineCompiler($IsCXX);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rafl .
I'm not familiar with Perl. Just curious, how does it parse, for example, ccache gcc? Would $Compiler hold ccache and gcc becomes one of the @CompilerArgs?

Copy link
Contributor Author

@rafl rafl Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ziqingluo-90 that's exactly right. DetermineCompiler returns a list of values (potentially a singleton list), and the call-site unpacks that return value into the first list element $Compiler and the rest of the list @CompilerArgs (which might be empty).

The actual parsing is done using shellwords from Text::ParseWords.

If you'd like to experiment with how it parses certain inputs, you can do something like this, which prints out the parsed list, one item per line:

$ CCC_CC='foo "bar baz" \"quux' perl -MText::ParseWords -le'print for shellwords $ENV{CCC_CC}'
foo
bar baz
"quux

my $Clang = DetermineClang($IsCXX);
my $AnalyzerTarget = $ENV{'CLANG_ANALYZER_TARGET'};

##===----------------------------------------------------------------------===##
# Cleanup.
Expand Down Expand Up @@ -199,7 +176,7 @@ sub GetCCArgs {
die "could not find clang line\n" if (!defined $line);
# Strip leading and trailing whitespace characters.
$line =~ s/^\s+|\s+$//g;
my @items = quotewords('\s+', 0, $line);
my @items = shellwords($line);
my $cmd = shift @items;
die "cannot find 'clang' in 'clang' command\n" if (!($cmd =~ /clang/ || basename($cmd) =~ /llvm/));
# If this is the llvm-driver the internal command will look like "llvm clang ...".
Expand Down Expand Up @@ -462,9 +439,9 @@ my $Output;
my %Uniqued;

# Forward arguments to gcc.
my $Status = system($Compiler,@ARGV);
my $Status = system($Compiler,@CompilerArgs,@ARGV);
if (defined $ENV{'CCC_ANALYZER_LOG'}) {
print STDERR "$Compiler @ARGV\n";
print STDERR "$Compiler @CompilerArgs @ARGV\n";
}
if ($Status) { exit($Status >> 8); }

Expand Down Expand Up @@ -698,8 +675,9 @@ if ($ForceAnalyzeDebugCode) {

# If we are on OSX and have an installation where the
# default SDK is inferred by xcrun use xcrun to infer
# the SDK.
if (not $HasSDK and $UseXCRUN) {
# the SDK. Older versions of OSX do not have xcrun to
# query the SDK location.
if (not $HasSDK and -x '/usr/bin/xcrun') {
my $sdk = `/usr/bin/xcrun --show-sdk-path -sdk macosx`;
chomp $sdk;
push @CompileOpts, "-isysroot", $sdk;
Expand Down
Loading