Skip to content

bpo-30485: Change the prefix for defining the default namespace in ElementPath from None to '' #12860

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 18, 2019

Conversation

scoder
Copy link
Contributor

@scoder scoder commented Apr 16, 2019

There is existing code that uses that and it's more convenient to have an all-string-keys dict (e.g. when sorting items etc.).

https://bugs.python.org/issue30485

…ementPath from None to '' since there is existing code that uses that and it's more convenient to have an all-string-keys dict (e.g. when sorting items etc.).
@scoder scoder force-pushed the bpo-30485_empty_string_def_ns branch from 44e0e90 to f4e7368 Compare April 16, 2019 21:17
@@ -71,7 +71,7 @@
)

def xpath_tokenizer(pattern, namespaces=None):
default_namespace = namespaces.get(None) if namespaces else None
default_namespace = namespaces.get('') if namespaces else None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lxml now supports both None and '' with commit lxml/lxml@aefded0. CPython with this PR only supports '' though initial PR for issue supported only None and ValueError for '' was incompatible. Is there a plan for lxml to drop support for None in future to unify with CPython or for CPython to support None and '' with lxml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was considering allowing both in ET as well (and implemented it along the way), but decided against it because a) the implementation is quite ugly, b) there are probably not that many users who go from lxml to ET, and c) ET's API generally has almost no exposure of prefixes whatsoever, so it doesn't matter there. Using prefixes in ElementPath expressions is not even the recommended way, because self-contained string expressions with explicit "{ns}tag" notation are much nicer to handle overall.

lxml, on the other hand, returns a dict with None as prefix when you let it collect the namespaces defined on an Element and its parents (nsmap), so not supporting both there would be unnatural, and changing this is essentially impossible. At least for now, I think it's ok to support only the simpler (and more convenient) empty string in ElementTree, and I also don't see lxml deprecate or remove support for None as prefix. Adapting the docs should fix it over time.

@serhiy-storchaka
Copy link
Member

This looks like a breaking change, so we should emit a deprecation warning before dropping the support of None and support both prefixes in meantime.

@brettcannon brettcannon added the type-feature A feature request or enhancement label Apr 17, 2019
@scoder
Copy link
Contributor Author

scoder commented Apr 18, 2019

@serhiy-storchaka, the feature itself was only added to ElementTree last weekend and never released even in an alpha. The only question is whether anyone sees a problem with '' as a prefix key (which was actually in the original contribution) versus the None which lxml used for the last two years (which I do not consider ideal any more, specifically not for the context of ElementTree). It's not ideal to have an ambiguity about the "one way to do it", but they both have their long standing user bases, and since this is a new feature for ET, I think it should have at least its own clean implementation.

For me, the plan forward is to support '' in both, and leave None only as the alternative in lxml where some use cases suggest it (because lxml's tree model is prefix aware, while ET is not).

@serhiy-storchaka
Copy link
Member

Ah, then all right!

@scoder scoder merged commit e8113f5 into python:master Apr 18, 2019
@bedevere-bot
Copy link

@scoder: Please replace # with GH- in the commit message next time. Thanks!

@scoder scoder deleted the bpo-30485_empty_string_def_ns branch April 18, 2019 17:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature A feature request or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants