How to Insert new PDF Pages, Images and Text
Beginning with v1.11.0, PyMuPDF lets you insert new pages into existing or new PDFs. It works like this:
import fitz # PyMuPDF
doc = fitz.open("some.pdf") # or a new PDF by fitz.open()
doc.insertPage(n, text = "some text") # insert a new page in front of page n
doc.save(...) # save what we did
The insertion page number n is 0-based and means "insertion in front of this page". To insert at the end, use n = -1.
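The page-number convention can be modelled with a plain Python list. The sketch below is only an illustration: `insert_position` is a hypothetical helper, not a PyMuPDF function, and its range check is an assumption rather than documented behavior. Note that the convention differs from Python's own `list.insert(-1, x)`, which inserts in front of the last element.

```python
def insert_position(n, page_count):
    """Return the list index at which a new page would land.

    Hypothetical helper for illustration only (not part of PyMuPDF).
    n is 0-based and means "in front of page n"; n = -1 means "at the end".
    """
    if n == -1:
        return page_count      # append after the last page
    if not 0 <= n <= page_count:
        # assumption: anything else is treated as an error in this sketch
        raise ValueError("page number out of range")
    return n                   # new page gets index n, old page n moves back


pages = ["A", "B", "C"]
pages.insert(insert_position(0, len(pages)), "new-first")
pages.insert(insert_position(-1, len(pages)), "new-last")
print(pages)  # ['new-first', 'A', 'B', 'C', 'new-last']
```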
Several parameters and options are available: fontsize, color, standard Base14 fonts or other fonts from your system, page dimensions, etc. If text is a string containing line breaks (\n) or a Python sequence, then several text lines are generated.
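To make the two accepted forms of the text argument concrete, here is a minimal sketch of how a string with line breaks and a sequence can both be reduced to a list of lines. `as_lines` is a hypothetical helper written for this page; it mirrors the behavior described above and is not taken from PyMuPDF's source.

```python
def as_lines(text):
    """Normalize `text` into a list of lines (hypothetical helper)."""
    if isinstance(text, str):
        return text.split("\n")           # a string is split at its line breaks
    return [str(item) for item in text]   # a sequence yields one line per item


print(as_lines("first\nsecond"))      # ['first', 'second']
print(as_lines(("first", "second")))  # ['first', 'second']
```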
We have included a new demo program, `text2pdf.py`, that converts a text file to a new PDF using this feature. As usual with PyMuPDF, it is a very fast alternative to similar solutions.
New images can now be put on PDF pages. Use this new method like so:
doc = fitz.open("some.pdf") # some existing PDF
page = doc[n] # load page (0-based)
rect = fitz.Rect(0, 0, 100, 100) # where we want to put the image
pix = fitz.Pixmap("some.image") # any supported image file
page.insertImage(rect, pixmap = pix, overlay = True) # insert image
doc.save(...) # save our deeds
By default, the image overlays whatever is currently in the rectangle. Transparent images are supported and can therefore be used to "watermark" your PDF. With overlay = False, the image becomes background.
In order to put an identical thumbnail on each page, do this:
for page in doc:
    page.insertImage(rect, pixmap = pix)
With the possible exception of the first page, this is a very fast process. On my machine it took 6 seconds to stamp all 1,310 pages of Adobe's PDF Reference Manual with a (relatively small) image.
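The rectangle passed to the loop decides where the thumbnail lands. As a rough sketch, the coordinates of a square in the top-right corner can be computed from the page width. `corner_rect` is a hypothetical helper, and the A4 width of 595 points, the size and the margin are illustrative assumptions. The result uses the same (x0, y0, x1, y1) order as fitz.Rect, with y growing downward from the top of the page.

```python
def corner_rect(page_width, size=100, margin=20):
    """Return (x0, y0, x1, y1) for a square in the top-right page corner.

    Hypothetical helper; size and margin are in points.
    """
    x1 = page_width - margin   # right edge of the thumbnail
    y0 = margin                # top edge of the thumbnail
    return (x1 - size, y0, x1, y0 + size)


print(corner_rect(595))  # (475, 20, 575, 120)
```

The four values can then be unpacked into a rectangle, for example `fitz.Rect(*corner_rect(595))`.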
I am not (quite) serious here ... but using this technique you can overlay certain text pieces with images as well:
doc = fitz.open(...)
for page in doc:
    rl = page.searchFor("nasty word", hit_max = nnn)
    for r in rl:
        page.insertImage(r, "black.jpg")
doc.save("censored.pdf")
File "censored.pdf" will now have every occurrence (up to nnn per page) of "nasty word" overlaid with the picture "black.jpg".
But note: the overlaid text is still physically there and can be accessed, e.g. via page.getText().
You can also use this approach to emphasize text in a textmarker style:
Create a small image file that only contains pixels of one color to be used for textmarking, say "yellow.jpg". Then use it in a variation of the above:
pix = fitz.Pixmap("yellow.jpg") # arbitrary size
for page in doc:
    rl = page.searchFor("interesting stuff", hit_max = ...)
    for r in rl: # every rectangle containing this text
        page.insertImage(r, pixmap = pix, overlay = False)
doc.save(...)
All "interesting stuff" will now be textmarked yellow, i.e. shown with a yellow background.
You can also insert new text on existing pages. This works similarly to creating a new page together with text, but is more flexible: you can freely position your text pieces and choose different fonts, text sizes and colors for each piece.
page = doc[n]
text = "some text containing line breaks and\na prettier mono-spaced font."
fname = "F0"
ffile = "c:/windows/fonts/dejavusansmono.ttf"
where = fitz.Point(50, 100) # text starts here
# this inserts 2 lines of text using font `DejaVu Sans Mono`
page.insertText(where, text,
                fontname = fname, # arbitrary if fontfile given
                fontfile = ffile, # any file containing a font
                fontsize = 11, # default
                color = (0, 0, 1)) # this is blue
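The color in the call above is a triple of red, green and blue components between 0 and 1. If your colors are given as 8-bit values (0 to 255), a small conversion helps. `rgb255_to_pdf` is a hypothetical helper written for this page, not part of PyMuPDF.

```python
def rgb255_to_pdf(r, g, b):
    """Convert 8-bit RGB components to floats in the range 0..1.

    Hypothetical helper for illustration only.
    """
    for component in (r, g, b):
        if not 0 <= component <= 255:
            raise ValueError("RGB components must be in 0..255")
    return (r / 255, g / 255, b / 255)


print(rgb255_to_pdf(0, 0, 255))  # (0.0, 0.0, 1.0), i.e. blue
```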