Skip to content

bpo-32107 - Fix test_uuid #4494

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 78 commits into from
Closed

bpo-32107 - Fix test_uuid #4494

wants to merge 78 commits into from

Conversation

warsaw
Copy link
Member

@warsaw warsaw commented Nov 21, 2017

The original check was erroneous in two ways. First, the documentation said
that "47 bit will never be set in IEEE 802 addresses obtained from network
cards". I think this is referring to the universally vs. locally administered
MAC addresses. Network cards from hardware manufacturers will always be
universally administered in order to guarantee global uniqueness.

But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is
"distinguished by setting the second-least-significant bit of the first
octet of the address", where a 0 means universally administered and a 1 means
it's locally administered. This works out to the 42nd bit when counting the
LSB as bit 1, or 1<<41.

The second bug is that the original bitmask value isn't right for either
description:

% ./python.exe -c "from math import log2; print(log2(0x010000000000))"
40.0

This causes the test to fail on valid MAC addresses.

Fix this by improving the comment, with references, and using the correct
bitmask.

https://bugs.python.org/issue32107

The original check was erroneous in two ways.  First, the documentation said
that "47 bit will never be set in IEEE 802 addresses obtained from network
cards".  I think this is referring to the universally vs. locally administered
MAC addresses.  Network cards from hardware manufacturers will always be
universally administered in order to guarantee global uniqueness.

But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is
the "distinguished by setting the second-least-significant bit of the first
octet of the address", where a 0 means universally administered and a 1 means
it's locally administered.  This works out to the 42nd bit when counting the
LSB as bit 1, or 1<<41.

The second bug is that the original bitmask value isn't right for either
description:

% ./python.exe -c "from math import log2; print(log2(0x010000000000))"
40.0

This causes the test to fail on valid MAC addresses.

Fix this by improving the comment, with references, and using the correct
bitmask.
@warsaw
Copy link
Member Author

warsaw commented Nov 21, 2017

I also added a very minor simplification to the code, since it's always better to return an explicit value when there are other return paths in the code, rather than letting None be implicitly returned.

Lib/uuid.py Outdated
@@ -404,8 +404,7 @@ def _arp_getnode():
# This works on Linux, FreeBSD and NetBSD
mac = _find_mac('arp', '-an', [os.fsencode('(%s)' % ip_addr)],
lambda i: i+2)
if mac:
return mac
return mac
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should return None instead of 0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point. Let me fix that.

@warsaw warsaw self-assigned this Nov 21, 2017
Lib/uuid.py Outdated
if mac:
return mac
# Return None instead of 0.
return mac if mac else None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if pep8-izy the code, I suggest to add explicit return None in other functions.

For uniformity I would prefer

    if mac:
        return mac
    return None

in this function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the code @serhiy-storchaka proposed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vstinner Yep, agreed.

vstinner and others added 2 commits November 21, 2017 15:34
* Rename support._match_test() to support.match_test(): make it
  public
* Remove support.match_tests global variable. It is replaced with a
  new support.set_match_tests() function, so match_test() doesn't
  have to check each time if patterns were modified.
* Rewrite match_test(): use different code paths depending on the
  kind of patterns for best performances.

Co-Authored-By: Serhiy Storchaka <[email protected]>
Also, clean up some return sites.
@warsaw
Copy link
Member Author

warsaw commented Nov 22, 2017

@serhiy-storchaka @vstinner - I think this one's got it. The functions should never return a locally administered MAC address now.

ED: Ah darn maybe not. Too tired. I'll get back to this later.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that you want to exclude "local MAC". Ok. But what if a compute only has local MACs? uuid1() fails with a TypeError with a strange error message if getnode() returns None. At least, please modify uuid1() to raise a better error if no MAC can be found.

Or maybe fall back on a local MAC if no other address can be found?

Lib/uuid.py Outdated
@@ -355,7 +370,7 @@ def _find_mac(command, args, hw_identifiers, get_index):
try:
word = words[get_index(i)]
mac = int(word.replace(b':', b''), 16)
if mac:
if ~(mac & (1<<41)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the usage of the ~int operator.

What do you want to test? If a bit is unset? Do you mean "if not(mac & (1<<41)):" or "if (mac & (1<<41)) == 0:" ?

Please write a private subfunction for this test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that was completely botched. Now that I've had some sleep I've fixed this with a helper function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also with the latest commit, we keep track of the first locally administered MAC found. If no universally administered MAC is found, we return the first local MAC.

And the way the getters work, if there are no local or universal MACs found, then getnode() ultimately falls back to the RFC 4122 specified random MAC, with the multicast bit set. So it should never be the case that getnode() doesn't return an RFC-defined node address.

vstinner and others added 14 commits November 22, 2017 20:58
* Optimize warnings.filterwarnings(). Replace re.compile('') with
  None to avoid the cost of calling a regex.match() method, whereas
  it always matchs.
* Optimize get_warnings_attr(): replace PyObject_GetAttrString() with
  _PyObject_GetAttrId().

Cleanup also create_filter():

* Use _Py_IDENTIFIER() to allow to cleanup strings at Python
  finalization
* Replace Py_FatalError() with a regular exceptions
Changes:

* Py_Main() initializes _PyCoreConfig.module_search_path_env from
  the PYTHONPATH environment variable.
* PyInterpreterState_New() now initializes core_config and config
  fields
* Compute sys.path a little bit ealier in
  _Py_InitializeMainInterpreter() and new_interpreter()
* Add _Py_GetPathWithConfig() private function.
…GH-4464)

Adds a simpler and faster alternative to ExitStack for handling
single optional context managers without having to change the
lexical structure of your code.
Move _PyCoreConfig.module_search_path_env to _PyMainInterpreterConfig
structure.
* Py_Main() now reads the PYTHONHOME environment variable
* Add _Py_GetPythonHomeWithConfig() private function
* Add _PyWarnings_InitWithConfig()
* init_filters() doesn't get the current core configuration from the
  current interpreter or Python thread anymore. Pass explicitly the
  configuration to _PyWarnings_InitWithConfig().
* _Py_InitializeCore() now fails on _PyWarnings_InitWithConfig()
  failure.
* Pass configuration as constant
* Add a helper function to determine whether a MAC address is universally
  administered or not.
* In the various MAC getter functions, keep track of the first locally
  administered MAC, and if no universally administered MAC is found, return
  the first local MAC.
* In the ultimate fallback _random_getnode(), add a comment for the multicast
  bit being set, and use a better bitmask
* In getnode(), just fall back to _random_getnode() explicitly if no other MAC
  has been found.
@warsaw warsaw requested review from 1st1, rhettinger, terryjreedy and a team as code owners November 26, 2017 23:18
@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again to your contribution and we look forward to looking at it!

…torchaka/cpython into test_uuid

This merges the relevant bits of python#4572 and should be considered instead of
that PR.
@warsaw
Copy link
Member Author

warsaw commented Nov 26, 2017

Oh hells bells, the merge went badly.

warsaw and others added 12 commits November 26, 2017 18:31
The original check was erroneous in two ways.  First, the documentation said
that "47 bit will never be set in IEEE 802 addresses obtained from network
cards".  I think this is referring to the universally vs. locally administered
MAC addresses.  Network cards from hardware manufacturers will always be
universally administered in order to guarantee global uniqueness.

But from https://en.wikipedia.org/wiki/MAC_address it says that this flag is
the "distinguished by setting the second-least-significant bit of the first
octet of the address", where a 0 means universally administered and a 1 means
it's locally administered.  This works out to the 42nd bit when counting the
LSB as bit 1, or 1<<41.

The second bug is that the original bitmask value isn't right for either
description:

% ./python.exe -c "from math import log2; print(log2(0x010000000000))"
40.0

This causes the test to fail on valid MAC addresses.

Fix this by improving the comment, with references, and using the correct
bitmask.
Also, clean up some return sites.
* Add a helper function to determine whether a MAC address is universally
  administered or not.
* In the various MAC getter functions, keep track of the first locally
  administered MAC, and if no universally administered MAC is found, return
  the first local MAC.
* In the ultimate fallback _random_getnode(), add a comment for the multicast
  bit being set, and use a better bitmask
* In getnode(), just fall back to _random_getnode() explicitly if no other MAC
  has been found.
While it is required that the multicast bit (1<<40) should be set in
random generated addresses, there is no requirement that it should be
cleared in addresses obtained from network cards.
@warsaw
Copy link
Member Author

warsaw commented Nov 26, 2017

See #4576 which resolves the merge snafus.

@warsaw warsaw closed this Nov 26, 2017
warsaw added a commit that referenced this pull request Nov 27, 2017
Improve UUID1 MAC address calculation and related tests.

There are two bits in the MAC address that are relevant to UUID1.  The first is the locally administered vs. universally administered bit (second least significant of the first octet).   Physical network interfaces such as ethernet ports and wireless adapters will always be universally administered, but some interfaces --such as the interface that MacBook Pros communicate with their Touch Bars-- are locally administered.  The former are guaranteed to be globally unique, while the latter are demonstrably *not* globally unique and are in fact the same on every MBP with a Touch Bar.  With this bit is set, the MAC is locally administered; with it unset it is universally administered.

The other bit is the multicast bit (least significant bit of the first octet).  When no other MAC address can be found, RFC 4122 mandates that a random 48-bit number be generated.  This randomly generated number *must* have the multicast bit set.

The improvements in uuid.py include:

* Preferentially return a universally administered MAC address, falling back to a locally administered address if none of the former can be found.
* Improve several coding style issues, such as adding explicit returns of None, using a more readable bitmask pattern, and assuming that the ultimate fallback, random MAC generation will not fail (and propagating any exception there instead of swallowing them).

Improvements in test_uuid.py include:

* Always testing the calculated MAC for universal administration, unless explicitly disabled (i.e. for the random case), or implicitly disabled due to running in the Travis environment.  Travis test machines have *no* universally administered MAC address at the time of this writing.
@warsaw warsaw deleted the test_uuid branch November 27, 2017 19:41
vstinner added a commit that referenced this pull request Nov 27, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.