Commit 3a4875e

Issue #6488: Explain the XPath support of xml.etree.ElementTree, with code
samples and a reference. Also fix the other nits mentioned in the issue. This also partially addresses issue #14006.
1 parent 70ea34d commit 3a4875e

1 file changed

+132
-30
lines changed

Doc/library/xml.etree.elementtree.rst

Lines changed: 132 additions & 30 deletions
@@ -45,10 +45,119 @@ docs.
4545
The :mod:`xml.etree.cElementTree` module is deprecated.
4646

4747

48+
.. _elementtree-xpath:
49+
50+
XPath support
51+
-------------
52+
53+
This module provides limited support for
54+
`XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a
55+
tree. The goal is to support a small subset of the abbreviated syntax; a full
56+
XPath engine is outside the scope of the module.
57+
58+
Example
59+
^^^^^^^
60+
61+
Here's an example that demonstrates some of the XPath capabilities of the
62+
module::
63+
64+
import xml.etree.ElementTree as ET
65+
66+
xml = r'''<?xml version="1.0"?>
67+
<data>
68+
<country name="Liechtenstein">
69+
<rank>1</rank>
70+
<year>2008</year>
71+
<gdppc>141100</gdppc>
72+
<neighbor name="Austria" direction="E"/>
73+
<neighbor name="Switzerland" direction="W"/>
74+
</country>
75+
<country name="Singapore">
76+
<rank>4</rank>
77+
<year>2011</year>
78+
<gdppc>59900</gdppc>
79+
<neighbor name="Malaysia" direction="N"/>
80+
</country>
81+
<country name="Panama">
82+
<rank>68</rank>
83+
<year>2011</year>
84+
<gdppc>13600</gdppc>
85+
<neighbor name="Costa Rica" direction="W"/>
86+
<neighbor name="Colombia" direction="E"/>
87+
</country>
88+
</data>
89+
'''
90+
91+
tree = ET.fromstring(xml)
92+
93+
# Top-level elements
94+
tree.findall(".")
95+
96+
# All 'neighbor' grand-children of 'country' children of the top-level
97+
# elements
98+
tree.findall("./country/neighbor")
99+
100+
# Nodes with name='Singapore' that have a 'year' child
101+
tree.findall(".//year/..[@name='Singapore']")
102+
103+
# 'year' nodes that are children of nodes with name='Singapore'
104+
tree.findall(".//*[@name='Singapore']/year")
105+
106+
# All 'neighbor' nodes that are the second child of their parent
107+
tree.findall(".//neighbor[2]")
108+
109+
Supported XPath syntax
110+
^^^^^^^^^^^^^^^^^^^^^^
111+
112+
+-----------------------+------------------------------------------------------+
113+
| Syntax | Meaning |
114+
+=======================+======================================================+
115+
| ``tag`` | Selects all child elements with the given tag. |
116+
| | For example, ``spam`` selects all child elements |
117+
| | named ``spam``, ``spam/egg`` selects all |
118+
| | grandchildren named ``egg`` in all children named |
119+
| | ``spam``. |
120+
+-----------------------+------------------------------------------------------+
121+
| ``*`` | Selects all child elements. For example, ``*/egg`` |
122+
| | selects all grandchildren named ``egg``. |
123+
+-----------------------+------------------------------------------------------+
124+
| ``.`` | Selects the current node. This is mostly useful |
125+
| | at the beginning of the path, to indicate that it's |
126+
| | a relative path. |
127+
+-----------------------+------------------------------------------------------+
128+
| ``//`` | Selects all subelements, on all levels beneath the |
129+
| | current element. For example, ``.//egg`` selects |
130+
| | all ``egg`` elements in the entire tree. |
131+
+-----------------------+------------------------------------------------------+
132+
| ``..`` | Selects the parent element. |
133+
+-----------------------+------------------------------------------------------+
134+
| ``[@attrib]`` | Selects all elements that have the given attribute. |
135+
+-----------------------+------------------------------------------------------+
136+
| ``[@attrib='value']`` | Selects all elements for which the given attribute |
137+
| | has the given value. The value cannot contain |
138+
| | quotes. |
139+
+-----------------------+------------------------------------------------------+
140+
| ``[tag]`` | Selects all elements that have a child named |
141+
| | ``tag``. Only immediate children are supported. |
142+
+-----------------------+------------------------------------------------------+
143+
| ``[position]`` | Selects all elements that are located at the given |
144+
| | position. The position can be either an integer |
145+
| | (1 is the first position), the expression ``last()`` |
146+
| | (for the last position), or a position relative to |
147+
| | the last position (e.g. ``last()-1``). |
148+
+-----------------------+------------------------------------------------------+
149+
150+
Predicates (expressions within square brackets) must be preceded by a tag
151+
name, an asterisk, or another predicate. ``position`` predicates must be
152+
preceded by a tag name.
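
As a rough illustration of the predicate forms in the table above, the sketch below runs each one against a small document. The ``menu``/``spam``/``egg``/``ham`` document and every variable name are invented for this example and are not part of the commit.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, made up purely for this illustration.
root = ET.fromstring(
    "<menu>"
    "<spam kind='fried'><egg/><egg/><egg/></spam>"
    "<spam><egg/></spam>"
    "<ham kind='baked'/>"
    "</menu>"
)

# [@attrib]: children of any tag that carry a 'kind' attribute.
with_kind = [e.tag for e in root.findall("*[@kind]")]

# [@attrib='value']: 'spam' children whose 'kind' is exactly 'fried'.
fried = root.findall("spam[@kind='fried']")

# [tag]: children that have an immediate child named 'egg'.
has_egg = [e.tag for e in root.findall("*[egg]")]

# [position]: the predicate follows a tag name, as required.
# Selects every 'egg' that is the second 'egg' under its parent.
second_eggs = root.findall(".//egg[2]")

# last(): the last 'spam' child of the root.
last_spam = root.findall("spam[last()]")

print(with_kind)               # ['spam', 'ham']
print(len(fried))              # 1
print(has_egg)                 # ['spam', 'spam']
print(len(second_eggs))        # 1
print(last_spam[0].get("kind"))  # None
```

Note that a position predicate counts among siblings with the same tag, so ``egg[2]`` matches nothing under a parent that has only one ``egg`` child.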
153+
154+
Reference
155+
---------
156+
48157
.. _elementtree-functions:
49158

50159
Functions
51-
---------
160+
^^^^^^^^^
52161

53162

54163
.. function:: Comment(text=None)
@@ -199,7 +308,7 @@ Functions
199308
.. _elementtree-element-objects:
200309

201310
Element Objects
202-
---------------
311+
^^^^^^^^^^^^^^^
203312

204313
.. class:: Element(tag, attrib={}, **extra)
205314

@@ -297,21 +406,24 @@ Element Objects
297406
.. method:: find(match)
298407

299408
Finds the first subelement matching *match*. *match* may be a tag name
300-
or path. Returns an element instance or ``None``.
409+
or a :ref:`path <elementtree-xpath>`. Returns an element instance
410+
or ``None``.
301411

302412

303413
.. method:: findall(match)
304414

305-
Finds all matching subelements, by tag name or path. Returns a list
306-
containing all matching elements in document order.
415+
Finds all matching subelements, by tag name or
416+
:ref:`path <elementtree-xpath>`. Returns a list containing all matching
417+
elements in document order.
307418

308419

309420
.. method:: findtext(match, default=None)
310421

311422
Finds text for the first subelement matching *match*. *match* may be
312-
a tag name or path. Returns the text content of the first matching
313-
element, or *default* if no element was found. Note that if the matching
314-
element has no text content an empty string is returned.
423+
a tag name or a :ref:`path <elementtree-xpath>`. Returns the text content
424+
of the first matching element, or *default* if no element was found.
425+
Note that if the matching element has no text content an empty string
426+
is returned.
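
The differences between the return values of ``find``, ``findall`` and ``findtext`` described above can be sketched as follows. The sample data and variable names are assumptions made for the illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, made up purely for this illustration.
root = ET.fromstring(
    "<data>"
    "<country name='Singapore'><rank>4</rank><note/></country>"
    "<country name='Panama'><rank>68</rank></country>"
    "</data>"
)

# find() returns the first matching element, or None when nothing matches.
first = root.find("country")
missing = root.find("city")

# findall() returns a list of all matches in document order.
ranks = [r.text for r in root.findall("country/rank")]

# findtext() returns the text of the first match.
rank_text = root.findtext("country/rank")

# A matching element without text content yields an empty string...
empty = root.findtext("country/note")

# ...while *default* is returned only when no element matches at all.
fallback = root.findtext("country/capital", default="n/a")

print(first.get("name"), missing, ranks, rank_text, repr(empty), fallback)
```

Because ``find`` can return ``None``, callers would normally check the result before accessing attributes on it.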
315427

316428

317429
.. method:: getchildren()
@@ -345,8 +457,9 @@ Element Objects
345457

346458
.. method:: iterfind(match)
347459

348-
Finds all matching subelements, by tag name or path. Returns an iterable
349-
yielding all matching elements in document order.
460+
Finds all matching subelements, by tag name or
461+
:ref:`path <elementtree-xpath>`. Returns an iterable yielding all
462+
matching elements in document order.
350463

351464
.. versionadded:: 3.2
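
A minimal sketch of consuming the result of ``iterfind`` one element at a time, in contrast to the list returned by ``findall``. The sample document is invented for the illustration. Since the documentation only promises an iterable, the result is wrapped in ``iter()`` before ``next()`` is called on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, made up purely for this illustration.
root = ET.fromstring("<data><item>a</item><item>b</item><item>c</item></data>")

# Wrap in iter() so that next() is valid for any iterable.
matches = iter(root.iterfind("item"))

# Take only the first match, then consume the remaining ones.
first_text = next(matches).text
rest = [e.text for e in matches]

print(first_text, rest)  # a ['b', 'c']
```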
352465

@@ -391,7 +504,7 @@ Element Objects
391504
.. _elementtree-elementtree-objects:
392505

393506
ElementTree Objects
394-
-------------------
507+
^^^^^^^^^^^^^^^^^^^
395508

396509

397510
.. class:: ElementTree(element=None, file=None)
@@ -413,26 +526,17 @@ ElementTree Objects
413526

414527
.. method:: find(match)
415528

416-
Finds the first toplevel element matching *match*. *match* may be a tag
417-
name or path. Same as getroot().find(match). Returns the first matching
418-
element, or ``None`` if no element was found.
529+
Same as :meth:`Element.find`, starting at the root of the tree.
419530

420531

421532
.. method:: findall(match)
422533

423-
Finds all matching subelements, by tag name or path. Same as
424-
getroot().findall(match). *match* may be a tag name or path. Returns a
425-
list containing all matching elements, in document order.
534+
Same as :meth:`Element.findall`, starting at the root of the tree.
426535

427536

428537
.. method:: findtext(match, default=None)
429538

430-
Finds the element text for the first toplevel element with given tag.
431-
Same as getroot().findtext(match). *match* may be a tag name or path.
432-
*default* is the value to return if the element was not found. Returns
433-
the text content of the first matching element, or the default value no
434-
element was found. Note that if the element is found, but has no text
435-
content, this method returns an empty string.
539+
Same as :meth:`Element.findtext`, starting at the root of the tree.
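
To illustrate "starting at the root of the tree", the following sketch compares the ``ElementTree`` methods with the same calls made on the root element. The in-memory document and names are assumptions for the example, and ``io.StringIO`` merely stands in for a real file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical in-memory file, made up purely for this illustration.
source = io.StringIO(
    "<data><country name='Panama'><rank>68</rank></country></data>"
)
tree = ET.parse(source)   # an ElementTree instance
root = tree.getroot()     # its root Element

# The tree-level methods return the same elements as the root-level ones.
tree_hits = tree.findall("country/rank")
root_hits = root.findall("country/rank")
same_elements = tree_hits == root_hits

tree_text = tree.findtext("country/rank")
tree_first = tree.find("country")

print(same_elements, tree_text, tree_first.get("name"))  # True 68 Panama
```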
436540

437541

438542
.. method:: getiterator(tag=None)
@@ -455,9 +559,7 @@ ElementTree Objects
455559

456560
.. method:: iterfind(match)
457561

458-
Finds all matching subelements, by tag name or path. Same as
459-
getroot().iterfind(match). Returns an iterable yielding all matching
460-
elements in document order.
562+
Same as :meth:`Element.iterfind`, starting at the root of the tree.
461563

462564
.. versionadded:: 3.2
463565

@@ -512,7 +614,7 @@ Example of changing the attribute "target" of every link in first paragraph::
512614
.. _elementtree-qname-objects:
513615

514616
QName Objects
515-
-------------
617+
^^^^^^^^^^^^^
516618

517619

518620
.. class:: QName(text_or_uri, tag=None)
@@ -528,7 +630,7 @@ QName Objects
528630
.. _elementtree-treebuilder-objects:
529631

530632
TreeBuilder Objects
531-
-------------------
633+
^^^^^^^^^^^^^^^^^^^
532634

533635

534636
.. class:: TreeBuilder(element_factory=None)
@@ -579,7 +681,7 @@ TreeBuilder Objects
579681
.. _elementtree-xmlparser-objects:
580682

581683
XMLParser Objects
582-
-----------------
684+
^^^^^^^^^^^^^^^^^
583685

584686

585687
.. class:: XMLParser(html=0, target=None, encoding=None)
@@ -648,7 +750,7 @@ This is an example of counting the maximum depth of an XML file::
648750
4
649751

650752
Exceptions
651-
----------
753+
^^^^^^^^^^
652754

653755
.. class:: ParseError
654756
