Commit 0735c1c

Build Phar: add custom "strip whitespace and comments" function
This implements option 3 as discussed in the thread for PR 3442.

A custom `stripWhitespaceAndComments()` function has been created and added to the script used to build the PHAR files. It replaces the PHP native `php_strip_whitespace()` function, which cannot produce PHP cross-version compatible PHAR files if attributes are used anywhere in the code.

The problem with the PHP native `php_strip_whitespace()` function is this:
* When run on PHP <= 7.4, attributes are stripped from the code as they are seen as `#` comments. This undoes the deprecation silencing for methods for which no return type can be added yet (as needed for full PHP 8.1 compatibility).
* When run on PHP >= 8.0, attributes are recognized correctly and _not_ stripped. However, the function strips **all** new lines, effectively turning the file into one long line of code. This is problematic when the PHAR is subsequently run on PHP < 8.0, as any code after the first attribute is then seen as "commented out", so the PHAR fails to run with a parse error.

To solve this, the new `stripWhitespaceAndComments()` function emulates the PHP native function, with two important differences:
* New lines are very selectively left in the regenerated content of the files. By ensuring that there is always a new line after an attribute closer, we prevent code from being seen as commented out in PHP < 8.0.
* As the PHPCS native PHP tokenizer is used to interpret the file content, the token stream is the same PHP cross-version. Attributes are therefore recognized on all supported PHP versions, the script can be run again on any supported PHP version, and the generated PHAR files will be the same.

As an additional performance tweak, `xml` files will no longer be passed to the whitespace/comment stripping. This had no effect previously and, as XML files would tokenize as 100% `T_INLINE_HTML`, passing these to the new function would have no effect either (other than slowing down the script). Same goes for the `licence.txt` file.
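
The newline-after-attribute idea described above can be illustrated with a minimal, self-contained sketch. This is not the function added by the commit: it uses PHP's native `token_get_all()` instead of the PHPCS tokenizer, so it assumes PHP >= 8.0 (where `T_ATTRIBUTE` exists) and does not give the cross-version guarantee. The name `stripSketch()` and the sample input are hypothetical.

```php
<?php
// Illustrative sketch only. Assumes PHP >= 8.0, where token_get_all()
// emits T_ATTRIBUTE for "#[". The commit itself uses the PHPCS tokenizer.
function stripSketch(string $code): string
{
    $out   = '';
    $depth = 0; // Bracket depth inside an attribute; 0 means "not in an attribute".

    foreach (token_get_all($code) as $token) {
        $id   = is_array($token) ? $token[0] : null;
        $text = is_array($token) ? $token[1] : $token;

        if ($id === T_OPEN_TAG) {
            // Always keep a new line after the open tag.
            $out .= rtrim($text)."\n";
            continue;
        }

        if ($id === T_COMMENT || $id === T_DOC_COMMENT) {
            // Drop comments entirely.
            continue;
        }

        if ($id === T_WHITESPACE) {
            // Collapse any run of whitespace to a single space.
            $out .= ' ';
            continue;
        }

        if ($id === T_ATTRIBUTE) {
            $depth = 1;
        } else if ($depth > 0 && $text === '[') {
            // Nested bracket, e.g. a short array used as an attribute argument.
            $depth++;
        } else if ($depth > 0 && $text === ']') {
            $depth--;
            if ($depth === 0) {
                // Attribute closer: force a new line so that PHP < 8.0, which
                // reads "#[...]" as a "#" comment, does not swallow the code
                // that follows it.
                $out .= "]\n";
                continue;
            }
        }

        $out .= $text;
    }

    return $out;
}

// Hypothetical input: one comment and one attribute.
$sample = <<<'PHP'
<?php
// A comment that should disappear.
#[\ReturnTypeWillChange]
function foo() {}
PHP;

echo stripSketch($sample), PHP_EOL;
```

In this sketch the comment is removed and the function declaration still starts on its own line after the attribute, which is the property that keeps the stripped file parseable on PHP < 8.0.
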
1 parent a4b5217 commit 0735c1c

1 file changed

+68
-2
lines changed

scripts/build-phar.php

Lines changed: 68 additions & 2 deletions
@@ -14,13 +14,71 @@
  * @link http://pear.php.net/package/PHP_CodeSniffer
  */
 
+use PHP_CodeSniffer\Config;
+use PHP_CodeSniffer\Exceptions\RuntimeException;
+use PHP_CodeSniffer\Exceptions\TokenizerException;
+use PHP_CodeSniffer\Tokenizers\PHP;
+use PHP_CodeSniffer\Util\Tokens;
+
 error_reporting(E_ALL | E_STRICT);
 
 if (ini_get('phar.readonly') === '1') {
     echo 'Unable to build, phar.readonly in php.ini is set to read only.'.PHP_EOL;
     exit(1);
 }
 
+require_once dirname(__DIR__).'/autoload.php';
+require_once dirname(__DIR__).'/src/Util/Tokens.php';
+
+if (defined('PHP_CODESNIFFER_VERBOSITY') === false) {
+    define('PHP_CODESNIFFER_VERBOSITY', 0);
+}
+
+
+/**
+ * Replacement for the PHP native php_strip_whitespace() function,
+ * which doesn't handle attributes correctly for cross-version PHP.
+ *
+ * @param string $fullpath Path to file.
+ * @param \PHP_CodeSniffer\Config $config Perfunctory Config.
+ *
+ * @return string
+ *
+ * @throws \PHP_CodeSniffer\Exceptions\RuntimeException When tokenizer errors are encountered.
+ */
+function stripWhitespaceAndComments($fullpath, $config)
+{
+    $contents = file_get_contents($fullpath);
+
+    try {
+        $tokenizer = new PHP($contents, $config, "\n");
+        $tokens = $tokenizer->getTokens();
+    } catch (TokenizerException $e) {
+        throw new RuntimeException('Failed to tokenize file '.$fullpath);
+    }
+
+    $stripped = '';
+    foreach ($tokens as $token) {
+        if ($token['code'] === T_ATTRIBUTE_END || $token['code'] === T_OPEN_TAG) {
+            $stripped .= $token['content']."\n";
+            continue;
+        }
+
+        if (isset(Tokens::$emptyTokens[$token['code']]) === false) {
+            $stripped .= $token['content'];
+            continue;
+        }
+
+        if ($token['code'] === T_WHITESPACE) {
+            $stripped .= ' ';
+        }
+    }
+
+    return $stripped;
+
+}//end stripWhitespaceAndComments()
+
+
 $startTime = microtime(true);
 
 $scripts = [
@@ -53,7 +111,9 @@
     $rdi = new \RecursiveDirectoryIterator($srcDir, \RecursiveDirectoryIterator::FOLLOW_SYMLINKS);
     $di = new \RecursiveIteratorIterator($rdi, 0, \RecursiveIteratorIterator::CATCH_GET_CHILD);
 
+    $config = new Config();
     $fileCount = 0;
+
     foreach ($di as $file) {
         $filename = $file->getFilename();
 
@@ -68,13 +128,19 @@
         }
 
         $path = 'src'.substr($fullpath, $srcDirLen);
-        $phar->addFile($fullpath, $path);
+
+        if (substr($filename, -4) === '.xml') {
+            $phar->addFile($fullpath, $path);
+        } else {
+            // PHP file.
+            $phar->addFromString($path, stripWhitespaceAndComments($fullpath, $config));
+        }
 
         ++$fileCount;
     }//end foreach
 
     // Add autoloader.
-    $phar->addFile(realpath(__DIR__.'/../autoload.php'), 'autoload.php');
+    $phar->addFromString('autoload.php', stripWhitespaceAndComments(realpath(__DIR__.'/../autoload.php'), $config));
 
     // Add licence file.
     $phar->addFile(realpath(__DIR__.'/../licence.txt'), 'licence.txt');
