How to return the heading number #590

tfitzhardinge · 2018-12-31T05:08:15Z

Hi

I have managed to return the heading text in a given file (thanks to the great experts at Stack Overflow), but I cannot return the heading number. Refer to the code below. Is there a way to print/return the heading number (the list numbering value, not the heading level)?

Thanks

import docx

doc=docx.Document('filename.docx')

def iter_heading(paragraphs):
    for paragraph in paragraphs:
        if paragraph.style.name == 'Heading 1':
            yield paragraph

for heading in iter_heading(doc.paragraphs):
    print(heading.text)

lilyzhaochina · 2019-01-15T07:58:54Z

up

ppebay · 2019-04-23T00:17:45Z

This is loosely related but is the closest issue I could find to my question: how can the heading number(s) be made visible? Currently,
document.add_heading(heading_title, 1)
creates a section heading with the desired content, size, indentation, etc., but NO section number.

It is my understanding from the current documentation that making the numbers appear in the rendered document is not doable via the current API, but that a numbering style is envisioned for the future. Am I missing something? Thank you.

chdelfosse · 2019-05-28T11:29:17Z

A solution: call addHeaderNumbering just before saving the document.

import re

# --- from https://github.com/python-openxml/python-docx/issues/590,
# --- mods by CD
def iter_heading(paragraphs):
    for paragraph in paragraphs:
        isItHeading=re.match('Heading ([1-9])',paragraph.style.name)
        if isItHeading:
            yield int(isItHeading.groups()[0]),paragraph

def addHeaderNumbering(document):
    hNums=[0,0,0,0,0]
    for index,hx in iter_heading(document.paragraphs):
        # ---put zeroes below---
        for i in range(index+1,5):
            hNums[i]=0
        # ---increment this---
        hNums[index]+=1
        # ---prepare the string---
        hStr=""
        for i in range(1,index+1):
            hStr+="%d."%hNums[i]
        # ---add the numbering---
        hx.text=hStr+" "+hx.text

yaleLeeNGA · 2019-08-28T15:32:23Z

The solution above does work. Still, is this feature planned to be added to python-docx library?
@ppebay

jfthuong · 2020-01-17T08:22:25Z

@chdelfosse

Hi, 2 remarks:

You could replace isItHeading.groups()[0] with isItHeading.group(1); that would be more elegant...
You do not support the case with more than 5 levels of headings ;)
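
For reference, a quick check of how the match accessors behave on a heading style name. This is only an illustration; "Heading 2" is an example input, not something taken from a real document.

```python
import re

# "Heading 2" is an example style name used only for this illustration.
match = re.match(r"Heading ([1-9])", "Heading 2")

print(match.group(0))     # whole match: Heading 2
print(match.group(1))     # first capture group: 2
print(match.groups()[0])  # same capture group, taken from the tuple: 2

# The heading level is the captured digit, converted to an int.
level = int(match.group(1))
```

So group(1) and groups()[0] are interchangeable here, while group(0) returns the full matched text and cannot be passed to int().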

I have this function to help name the headings:

from typing import List

def get_heading_numbers(level: int, hierarchy: List[int]) -> str:
    """Return heading numbers crumbpath (level starts with '1', and not 0)"""
    # We need to fill-up indexes of 0 before the level, if needed
    # We clean-up elements after the level
    # Then join all elements with "." character
    index = level - 1
    for _ in range(len(hierarchy), index + 1):
        hierarchy.append(0)
    del hierarchy[index + 1 :]

    hierarchy[index] += 1
    return ".".join(str(e or 1) for e in hierarchy)

hierarchy: List[int] = list()
print(get_heading_numbers(1, hierarchy))
print(get_heading_numbers(1, hierarchy))
print(get_heading_numbers(2, hierarchy))
print(get_heading_numbers(2, hierarchy))
print(get_heading_numbers(3, hierarchy))
print(get_heading_numbers(1, hierarchy))
print(get_heading_numbers(2, hierarchy))
print(get_heading_numbers(5, hierarchy))
exit()

bushnerd · 2020-07-25T04:33:07Z

Great, it works. But it changes the content of the original docx.
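
If modifying the document is a concern, the same counting can be done without writing to paragraph.text. Below is a minimal sketch under some assumptions: the name iter_numbered_headings is hypothetical, the style names follow the "Heading N" pattern used earlier in this thread, and the stub objects only mimic the .style.name and .text attributes of python-docx paragraphs so the example runs without a .docx file.

```python
import re
from types import SimpleNamespace
from typing import Iterable, Iterator, List, Tuple

HEADING_RE = re.compile(r"Heading ([1-9])$")

def iter_numbered_headings(paragraphs: Iterable) -> Iterator[Tuple[str, object]]:
    """Yield (number, paragraph) for each heading; paragraph.text is never written."""
    counters: List[int] = []
    for paragraph in paragraphs:
        match = HEADING_RE.match(paragraph.style.name)
        if not match:
            continue
        level = int(match.group(1))
        # Grow the counter list when a deeper level appears for the first time.
        while len(counters) < level:
            counters.append(0)
        # Drop counters of deeper levels so they restart under the new heading.
        del counters[level:]
        counters[level - 1] += 1
        # Note: a skipped level (Heading 1 followed by Heading 3) shows up as 0.
        yield ".".join(str(n) for n in counters), paragraph

def _stub(style_name: str, text: str) -> SimpleNamespace:
    """Hypothetical stand-in for a python-docx paragraph."""
    return SimpleNamespace(style=SimpleNamespace(name=style_name), text=text)

paragraphs = [
    _stub("Heading 1", "Intro"),
    _stub("Normal", "Some body text"),
    _stub("Heading 2", "Scope"),
    _stub("Heading 2", "Terms"),
    _stub("Heading 1", "Design"),
]
result = [(number, p.text) for number, p in iter_numbered_headings(paragraphs)]
print(result)
# [('1', 'Intro'), ('1.1', 'Scope'), ('1.2', 'Terms'), ('2', 'Design')]
```

With a real document, passing document.paragraphs instead of the stub list should work the same way, since only .style.name is read. Like the code above, this counts headings itself and does not read the numbering Word would display.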

chdelfosse · 2020-07-28T17:11:52Z

so what? nobody said you had to use it

beyond2002 · 2020-11-24T01:51:37Z

Not found a solution yet. UP

de-adshot · 2021-05-15T17:18:34Z

any solutions for this?... been more than 2 yrs

chdelfosse · 2021-05-15T17:47:01Z

I published an ad-hoc solution; someone else reformatted and improved it. AFAIK there is no source change to address the matter. Regards, ch

disarticulate · 2022-10-14T14:50:34Z

so i've looked into this several times. The issue appears to be that the underlying xml is very convoluted in how apps like Word generate actual heading numbers.

Here's a primer: http://officeopenxml.com/WPnumbering.php

It seems to be implemented in JavaScript's docx here: https://docx.js.org/#/usage/numbering

Regardless, it's pretty difficult to parse since it refers to document level metadata, which tracks numbering not as shown but as calculated through whatever is happening in the document.

While this would be a great feature, I can see already someone would need to spend quite a bit of energy getting it to work.
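
To make the indirection concrete: a paragraph's w:numPr holds a numId and an ilvl, the numId points to a w:num, which points to a w:abstractNum, whose w:lvl for that ilvl carries the w:lvlText pattern. The sketch below follows that chain using only the standard library. The XML is a hand-written, heavily trimmed stand-in for word/numbering.xml, and lvl_text is a hypothetical helper name; real files also contain things like w:lvlOverride that this ignores.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Illustrative only: a trimmed, hand-written numbering part, not taken from a real file.
NUMBERING_XML = f"""
<w:numbering xmlns:w="{W}">
  <w:abstractNum w:abstractNumId="7">
    <w:lvl w:ilvl="0"><w:start w:val="1"/><w:numFmt w:val="decimal"/><w:lvlText w:val="%1."/></w:lvl>
    <w:lvl w:ilvl="1"><w:start w:val="1"/><w:numFmt w:val="decimal"/><w:lvlText w:val="%1.%2."/></w:lvl>
  </w:abstractNum>
  <w:num w:numId="3"><w:abstractNumId w:val="7"/></w:num>
</w:numbering>
"""

def lvl_text(numbering_root, num_id: int, ilvl: int):
    """Follow numId -> abstractNumId -> lvl and return the lvlText pattern, or None."""
    for num in numbering_root.findall("w:num", NS):
        if num.get(f"{{{W}}}numId") != str(num_id):
            continue
        abstract_id = num.find("w:abstractNumId", NS).get(f"{{{W}}}val")
        for abstract in numbering_root.findall("w:abstractNum", NS):
            if abstract.get(f"{{{W}}}abstractNumId") != abstract_id:
                continue
            for lvl in abstract.findall("w:lvl", NS):
                if lvl.get(f"{{{W}}}ilvl") == str(ilvl):
                    return lvl.find("w:lvlText", NS).get(f"{{{W}}}val")
    return None

root = ET.fromstring(NUMBERING_XML)
print(lvl_text(root, num_id=3, ilvl=1))  # %1.%2.
```

Note that the result is only a pattern such as "%1.%2."; the actual counter values are not stored anywhere and have to be computed by walking the document in order, which is the hard part described above.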

UchihaArk · 2024-01-12T09:46:47Z

I simply implemented the function, but it does not support numbers such as: 1.1, 1.1.1

from lxml import etree

def get_number_text(paragraph, num_id_map):
    numbering = paragraph.part.numbering_part.numbering_definitions._numbering
    root = etree.fromstring(numbering.xml)
    # Look for numbering properties on the style first, then on the paragraph itself;
    # pPr can be missing, so check it before reaching for numPr
    style_pPr = paragraph.style.paragraph_format.element.pPr
    para_pPr = paragraph.paragraph_format.element.pPr
    if style_pPr is not None and style_pPr.numPr is not None:
        num_id = style_pPr.numPr.numId.val
    elif para_pPr is not None and para_pPr.numPr is not None:
        num_id = para_pPr.numPr.numId.val
    else:
        return
    if num_id in num_id_map:
        val = num_id_map[num_id] + 1
        for key in num_id_map:
            if key > num_id:
                num_id_map[key] = 0
    else:
        val = 1

    num_id_map[num_id] = val
    abstractNumId = 0
    for num_data in numbering.num_lst:
        if num_id == num_data.numId:
            abstractNumId = num_data.abstractNumId.val
            break
    abstract_nums = root.xpath(f'.//w:abstractNum[@w:abstractNumId="{abstractNumId}"]',
                               namespaces=root.nsmap)
    num_text = None
    for abstract_num in abstract_nums:
        lvls_1 = abstract_num.xpath('.//w:lvl[@w:ilvl="0"]', namespaces=root.nsmap)
        if lvls_1:
            lvlText = lvls_1[0].xpath('.//w:lvlText/@w:val', namespaces=root.nsmap)
            if lvlText:
                num_text = lvlText[0].replace("%1", str(val))
    # Only rewrite the paragraph when a level-0 number format was found
    if num_text is not None:
        paragraph.text = num_text + " " + paragraph.text
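
One possible way to extend the "%1" replacement to multi-level patterns such as "%1.%2." is to substitute every placeholder from a list of per-level counters. This is a sketch only: render_lvl_text is a hypothetical helper, it assumes decimal numbering (w:numFmt is ignored), and it assumes the caller already tracks one counter per level.

```python
import re
from typing import List

def render_lvl_text(pattern: str, counters: List[int]) -> str:
    """Replace %1, %2, ... in a lvlText pattern with the matching level counter."""
    def substitute(match):
        position = int(match.group(1))       # %1 is the top level, %2 the next, ...
        return str(counters[position - 1])   # IndexError if that counter is missing
    return re.sub(r"%([1-9])", substitute, pattern)

print(render_lvl_text("%1.", [4]))            # 4.
print(render_lvl_text("%1.%2.", [2, 3]))      # 2.3.
print(render_lvl_text("%1.%2.%3", [1, 1, 5])) # 1.1.5
```

The counters themselves still have to be maintained while walking the document, for example by resetting deeper levels whenever a shallower one is incremented.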

chdelfosse · 2024-01-12T09:58:28Z

I published the original code (before others improved it), you may want to try it:

# the functions that enable the header numbering
# from #590, mods by CD
def iter_heading(paragraphs):
    for paragraph in paragraphs:
        isItHeading=re.match('Heading ([1-9])',paragraph.style.name)
        if isItHeading:
            yield int(isItHeading.groups()[0]),paragraph

def addHeaderNumbering(document):
    "rename the headers with a decimal notation"
    hNums=[0,-1,0,0,0]
    for index,hx in iter_heading(document.paragraphs):
        #put zeroes below
        for i in range(index+1,5):
            hNums[i]=0
        #increment this
        hNums[index]+=1
        #prepare the string
        hStr=""
        for i in range(1,index+1):
            hStr+="%d."%hNums[i]
        #add the numbering
        hx.text=hStr+" "+hx.text
    return

regards
ch

nguyendangson · 2024-03-25T17:46:24Z

It seems that python-docx has no function to extract heading numbers, so I wrote one. You are welcome to use it; see my GitHub: https://github.com/nguyendangson/extract_number_heading_python-docx

lilyzhaochina mentioned this issue Jan 15, 2019

How to get automatic caption numbering? #600

Open

PinakiChat1 mentioned this issue May 20, 2019

Getting the heading number of a Word section heading #677

Open

renatodamas mentioned this issue Dec 12, 2024

Correctly reading enumeration and list numbers/letters #1454

Open
