Commit 7c5b4e6

PHP 8.1: fix retokenization of "&" character
In PHP < 8.1, the ampersand was tokenized as a simple token, i.e. just the plain string "&". As of PHP 8.1, due to the introduction of intersection types, PHP has two new dedicated tokens for the ampersand: `T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG` and `T_AMPERSAND_NOT_FOLLOWED_BY_VAR_OR_VARARG`.

This PR "undoes" the new PHP 8.1 tokenization of the ampersand in favour of the pre-existing tokenization of the character as `T_BITWISE_AND`, which is how PHPCS has always tokenized it. It also takes the new tokens into account in the "next token after a function keyword token should be a `T_STRING`" logic.

This change is already covered extensively by the tests for the `File::isReference()` method. That method will still need updating for PHP 8.1 intersection types, as will the `File::getMethodParameters()` method.

This PR, in combination with PR 3400, fixes all current test failures on PHP 8.1.

We may want to consider adding an extra `'is_reference'` key to the token array for these tokens, which would allow the `File::isReference()` method to resolve tokens on PHP 8.1 more quickly and easily. We may also want to consider whether to move to the PHP 8.1 tokenization in PHPCS 4.x. Either way, this PR should not be held back by that decision: for now, the tokenization just needs to be fixed for PHPCS 3.x.
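To make the tokenization difference described above concrete, here is a minimal sketch that asks PHP's native tokenizer how it reports each "&" in a snippet. It is not part of this commit: the helper name `ampersandTokenNames()` and the sample snippet are hypothetical, and only the built-in `token_get_all()` and `token_name()` functions are assumed.

```php
<?php
/**
 * Hypothetical helper (not PHPCS code): describe how the native
 * tokenizer reports every "&" character in $code.
 */
function ampersandTokenNames(string $code): array
{
    $names = [];
    foreach (token_get_all($code) as $token) {
        if (is_array($token) === true && $token[1] === '&') {
            // PHP >= 8.1: array token carrying one of the T_AMPERSAND_* ids.
            $names[] = token_name($token[0]);
        } else if ($token === '&') {
            // PHP < 8.1: plain one-character string token.
            $names[] = 'plain string "&"';
        }
    }

    return $names;
}

// A by-reference function, an intersection type, a by-reference
// parameter and a bitwise "and": four ampersands in total.
$names = ampersandTokenNames('<?php function &foo(A&B $a, &$b) { return $a & $b; }');
echo implode(PHP_EOL, $names), PHP_EOL;
```

Which names are printed depends on the PHP version running the script, which is exactly the inconsistency the retokenization to `T_BITWISE_AND` is meant to hide from sniffs.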
1 parent 5be0b00 commit 7c5b4e6

2 files changed: +30 -1 lines changed


src/Tokenizers/PHP.php

Lines changed: 21 additions & 1 deletion
@@ -646,6 +646,25 @@ protected function tokenize($string)
                 }//end if
             }//end if
 
+            /*
+                PHP 8.1 introduced two dedicated tokens for the & character.
+                Retokenizing both of these to T_BITWISE_AND, which is the
+                token PHPCS already tokenized them as.
+            */
+
+            if ($tokenIsArray === true
+                && ($token[0] === T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG
+                || $token[0] === T_AMPERSAND_NOT_FOLLOWED_BY_VAR_OR_VARARG)
+            ) {
+                $finalTokens[$newStackPtr] = [
+                    'code' => T_BITWISE_AND,
+                    'type' => 'T_BITWISE_AND',
+                    'content' => $token[1],
+                ];
+                $newStackPtr++;
+                continue;
+            }
+
             /*
                 If this is a double quoted string, PHP will tokenize the whole
                 thing which causes problems with the scope map when braces are
@@ -1667,7 +1686,8 @@ protected function tokenize($string)
             if ($token[0] === T_FUNCTION) {
                 for ($x = ($stackPtr + 1); $x < $numTokens; $x++) {
                     if (is_array($tokens[$x]) === false
-                        || isset(Util\Tokens::$emptyTokens[$tokens[$x][0]]) === false
+                        || (isset(Util\Tokens::$emptyTokens[$tokens[$x][0]]) === false
+                        && $tokens[$x][1] !== '&')
                     ) {
                         // Non-empty content.
                         break;
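The lookahead change in the second hunk can be illustrated in isolation. The following is a rough sketch, not the PHPCS implementation: `nextSignificantAfterFunction()` is a hypothetical helper, the `$emptyTokens` array is a simplified stand-in for `Util\Tokens::$emptyTokens`, and, unlike the PHPCS loop, it also skips the pre-8.1 plain-string "&" so that it behaves the same on every PHP version.

```php
<?php
/**
 * Hypothetical helper (not PHPCS code): return the index of the first
 * token after the `function` keyword at $stackPtr that is neither
 * whitespace, a comment, nor an ampersand. Returns null if none exists.
 */
function nextSignificantAfterFunction(array $tokens, int $stackPtr): ?int
{
    // Simplified stand-in for Util\Tokens::$emptyTokens.
    $emptyTokens = [
        T_WHITESPACE  => true,
        T_COMMENT     => true,
        T_DOC_COMMENT => true,
    ];

    $numTokens = count($tokens);
    for ($x = ($stackPtr + 1); $x < $numTokens; $x++) {
        $content = $tokens[$x];
        if (is_array($tokens[$x]) === true) {
            $content = $tokens[$x][1];
        }

        if ($content === '&') {
            // Array token on PHP >= 8.1, plain string on PHP < 8.1.
            continue;
        }

        if (is_array($tokens[$x]) === false
            || isset($emptyTokens[$tokens[$x][0]]) === false
        ) {
            // Non-empty content.
            return $x;
        }
    }

    return null;
}

$tokens  = token_get_all('<?php function &getRef() {}');
$namePtr = null;
foreach ($tokens as $i => $token) {
    if (is_array($token) === true && $token[0] === T_FUNCTION) {
        $namePtr = nextSignificantAfterFunction($tokens, $i);
        break;
    }
}
```

Without the ampersand check, the scan would stop at the "&" on PHP 8.1, because there it is an array token that is not in the empty-token list, and the "&" rather than the function name would be treated as the token that should be a `T_STRING`.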

src/Util/Tokens.php

Lines changed: 9 additions & 0 deletions
@@ -154,6 +154,15 @@
     define('T_ATTRIBUTE', 'PHPCS_T_ATTRIBUTE');
 }
 
+// Some PHP 8.1 tokens, replicated for lower versions.
+if (defined('T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG') === false) {
+    define('T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG', 'PHPCS_T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG');
+}
+
+if (defined('T_AMPERSAND_NOT_FOLLOWED_BY_VAR_OR_VARARG') === false) {
+    define('T_AMPERSAND_NOT_FOLLOWED_BY_VAR_OR_VARARG', 'PHPCS_T_AMPERSAND_NOT_FOLLOWED_BY_VAR_OR_VARARG');
+}
+
 // Tokens used for parsing doc blocks.
 define('T_DOC_COMMENT_STAR', 'PHPCS_T_DOC_COMMENT_STAR');
 define('T_DOC_COMMENT_WHITESPACE', 'PHPCS_T_DOC_COMMENT_WHITESPACE');
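As an illustration of why replicating the constants is safe on older PHP versions, the sketch below repeats one of the definitions and counts strict matches against native token ids. The sample snippet and the variable names are assumptions made for the example. On PHP < 8.1 the constant is a string placeholder, so a strict comparison against the integer ids returned by `token_get_all()` should never match, and the retokenization branch is simply never entered.

```php
<?php
// Same pattern as in src/Util/Tokens.php: only define when PHP has not.
if (defined('T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG') === false) {
    define('T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG', 'PHPCS_T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG');
}

// Native token ids are integers; the replicated value is a string.
$isNative = is_int(T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG);

$matches = 0;
foreach (token_get_all('<?php $a = &$b;') as $token) {
    // Strict comparison: an integer id can never equal the string placeholder.
    if (is_array($token) === true
        && $token[0] === T_AMPERSAND_FOLLOWED_BY_VAR_OR_VARARG
    ) {
        $matches++;
    }
}

echo ($isNative === true ? 'native' : 'replicated'), ': ', $matches, ' match(es)', PHP_EOL;
```

The replicated constant exists only so that code referencing the token name does not trigger an undefined-constant error on PHP < 8.1.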
