Commit f189f07

DOC: Add outlines documentation and link it in User Guide (#3511)
Closes #3484.

---------

Co-authored-by: Stefan <96178532+stefan6419846@users.noreply.github.com>
1 parent a29e532 commit f189f07

3 files changed

Lines changed: 215 additions & 0 deletions

docs/index.rst

Lines changed: 2 additions & 0 deletions
@@ -38,12 +38,14 @@ You can contribute to `pypdf on GitHub <https://github.com/py-pdf/pypdf>`_.
 user/add-javascript
 user/viewer-preferences
 user/forms
+user/handling-outlines
 user/streaming-data
 user/file-size
 user/pdf-version-support
 user/pdfa-compliance
 
 
+
 .. toctree::
 :caption: API Reference
 :maxdepth: 1

docs/user/complete-outlines.png

64.3 KB

docs/user/handling-outlines.md

Lines changed: 213 additions & 0 deletions
@@ -0,0 +1,213 @@
# Handling Outlines

PDF outlines - also known as bookmarks - provide a structured navigation panel in PDF readers. `pypdf` allows you to read, create, and modify both simple and deeply nested outlines.

## Writing PDF Outlines

To add outlines, use the {meth}`~pypdf.PdfWriter.add_outline_item` method. This method returns a reference to the created outline item, which you can use as a parent to create nested (hierarchical) bookmarks.

### Adding a Simple Outline

The following example shows how to add a single top-level bookmark. We add an outline item pointing to the first page (index `0`) and save the result.

```{testsetup}
pypdf_test_setup("user/handling-outlines", {
    "crazyones.pdf": "../resources/crazyones.pdf",
})
```

```{testcode}
from pypdf import PdfWriter

writer = PdfWriter(clone_from="crazyones.pdf")

# Add a top-level bookmark
writer.add_outline_item(
    title="Introduction",
    page_number=0
)

writer.write("simple-example.pdf")
```


### Adding Nested Outlines

You can build hierarchies (like Chapter → Section) by passing the parent outline item to the `parent` parameter of a new item.

In the example below, we create a root item "Chapter 1" and nest two sections under it.

```{testcode}
from pypdf import PdfWriter

writer = PdfWriter(clone_from="crazyones.pdf")

# Add parent (chapter)
chapter1 = writer.add_outline_item(
    title="Chapter 1",
    page_number=0
)

# Add children (sections) nested under the chapter
writer.add_outline_item(
    title="Section 1.1",
    page_number=0,
    parent=chapter1
)

writer.add_outline_item(
    title="Section 1.2",
    page_number=0,
    parent=chapter1
)

writer.write("nested-example.pdf")
```
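
For deeper hierarchies, repeating the `parent=` pattern by hand becomes tedious. The following is a minimal sketch, not part of `pypdf`, that walks a nested table-of-contents structure and calls `add_outline_item` for each entry. The helper name `add_outline_tree` and the `(title, page_number, children)` tuple layout are assumptions made for illustration only.

```python
from typing import Any, Optional, Sequence, Tuple

# Hypothetical spec format: (title, zero-based page index, children)
OutlineSpec = Tuple[str, int, Sequence["OutlineSpec"]]


def add_outline_tree(
    writer: Any,
    entries: Sequence[OutlineSpec],
    parent: Optional[Any] = None,
) -> int:
    """Add entries and their children below parent; return how many items were added."""
    count = 0
    for title, page_number, children in entries:
        # The returned reference becomes the parent of the nested entries.
        item = writer.add_outline_item(
            title=title,
            page_number=page_number,
            parent=parent,
        )
        count += 1 + add_outline_tree(writer, children, parent=item)
    return count


# Usage with a PdfWriter such as the one created above:
# toc = [("Chapter 1", 0, [("Section 1.1", 0, []), ("Section 1.2", 0, [])])]
# add_outline_tree(writer, toc)
```

The helper only relies on the `title`, `page_number`, and `parent` parameters used in the example above, so it should produce the same structure as the hand-written calls.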


### Advanced Styling and View Modes (Fit Options)

You can customize the appearance and behavior of bookmarks using optional parameters, such as changing the text color or applying bold and italic styles.

For detailed information on all available parameters and their formats, please refer to the {meth}`~pypdf.PdfWriter.add_outline_item` API documentation.

The `fit` parameter determines how the page is displayed when the user clicks the bookmark. You can use the {class}`~pypdf.generic.Fit` helper to specify modes like {meth}`~pypdf.generic.Fit.fit`, {meth}`~pypdf.generic.Fit.fit_horizontally`, or {meth}`~pypdf.generic.Fit.xyz`.


```{testcode}
from pypdf import PdfWriter
from pypdf.generic import Fit

writer = PdfWriter(clone_from="crazyones.pdf")

# Top-level chapter (points to page 1, index 0)
chapter2 = writer.add_outline_item(
    title="Chapter 2",
    page_number=0,
    color=(0, 0, 1),
    bold=True,
    italic=False,
    is_open=True,
    fit=Fit.fit()
)

# Section under Chapter 2 (points to page 1, index 0)
section2_1 = writer.add_outline_item(
    title="Section 2.1",
    page_number=0,
    parent=chapter2,
    color=(0, 0.5, 0),
    bold=False,
    italic=True,
    is_open=False,
    fit=Fit.fit_horizontally(top=800)
)

# Section with custom zoom (points to page 1, index 0)
section2_2 = writer.add_outline_item(
    title="Section 2.2",
    page_number=0,
    parent=chapter2,
    color=(1, 0, 0),
    bold=True,
    italic=True,
    is_open=True,
    fit=Fit.xyz(left=0, top=800, zoom=1.25)
)

writer.write("advanced-example.pdf")
```

```{figure} complete-outlines.png
:alt: An annotated screenshot illustrating simple, nested, and advanced PDF bookmarks.

An annotated screenshot illustrating simple, nested, and advanced PDF bookmarks in a Table of Contents.
```
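
The `color` tuples in the advanced example, such as `(0, 0, 1)` and `(0, 0.5, 0)`, appear to be red, green, and blue components in the range 0 to 1. If your colors come from a design tool that uses 0 to 255 integers, a small conversion helper may be convenient. This is a sketch only; `rgb255_to_unit` is a hypothetical name and not part of `pypdf`.

```python
from typing import Tuple


def rgb255_to_unit(red: int, green: int, blue: int) -> Tuple[float, float, float]:
    """Convert 0-255 RGB integers to a tuple of floats between 0 and 1."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError(f"RGB component out of range: {value}")
    return (red / 255, green / 255, blue / 255)


# Pure blue, equivalent to the (0, 0, 1) tuple used for "Chapter 2"
print(rgb255_to_unit(0, 0, 255))
```

The resulting tuple can then be passed as `color=` in the same way as the literal tuples in the example above.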

## Reading PDF Outlines

`pypdf` represents outlines as a list of {class}`~pypdf.generic.Destination` objects. If an outline has children, they appear as a nested list directly following their parent.

To retrieve the page number a bookmark points to, use the {meth}`~pypdf.PdfReader.get_destination_page_number` method, which returns a zero-based page index.

### Reading Simple Outlines

To extract only the top-level bookmarks (ignoring nested sections), you can iterate over the {attr}`~pypdf.PdfReader.outline` property. Since nested children appear as lists within the outline structure, you must explicitly check for and skip them (`isinstance(outline, list)`) to avoid errors. The example below reads `simple-example.pdf`, the file created in the "Adding a Simple Outline" example.

```{testcode}
from pypdf import PdfReader

reader = PdfReader("simple-example.pdf")

print("Simple Outline (Top-Level Only):")
print("-" * 32)

for outline in reader.outline:
    # Check if the item is a list (which represents nested children)
    if isinstance(outline, list):
        continue  # Skip the nested parts completely

    page_number = reader.get_destination_page_number(outline)

    if page_number is None:
        print(f"{outline.title} -> No page destination")
    else:
        print(f"{outline.title} -> page {page_number + 1}")
```

```{testoutput}
Simple Outline (Top-Level Only):
--------------------------------
Introduction -> page 1
```

### Reading Nested Outlines

When dealing with hierarchical bookmarks, the {attr}`~pypdf.PdfReader.outline` property may contain lists inside lists. You should use a recursive function to traverse the tree.

The following example defines a `print_outline` function that handles indentation and nested lists to display the structure of the document we created earlier.

```{testcode}
from typing import List, Union

from pypdf import PdfReader
from pypdf.generic import Destination


def print_outline(
    outlines: List[Union[Destination, List[Destination]]],
    reader: PdfReader,
    level: int = 0
) -> None:
    """Recursively print all outline items with indentation."""
    for item in outlines:
        if isinstance(item, list):
            # Recursively handle the nested list of children
            print_outline(item, reader, level + 1)
        else:
            page_number = reader.get_destination_page_number(item)

            indent = "  " * level

            if page_number is None:
                print(f"{indent}- {item.title} (No page destination)")
            else:
                print(f"{indent}- {item.title} (Page {page_number + 1})")


reader = PdfReader("nested-example.pdf")

print("Nested Outline Hierarchy:")
print("-" * 25)

print_outline(reader.outline, reader)
```

```{testoutput}
Nested Outline Hierarchy:
-------------------------
- Chapter 1 (Page 1)
  - Section 1.1 (Page 1)
  - Section 1.2 (Page 1)
```
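
If you need the hierarchy as data rather than printed text, the same recursion can be written as a generator that yields each item together with its nesting depth. This is a minimal sketch under the assumption stated above that children appear as nested lists; `iter_outline` is a hypothetical helper name, not a `pypdf` function.

```python
from typing import Any, Iterator, List, Tuple


def iter_outline(items: List[Any], level: int = 0) -> Iterator[Tuple[int, Any]]:
    """Yield (depth, item) pairs for every non-list entry in a nested outline list."""
    for item in items:
        if isinstance(item, list):
            # A nested list holds the children of the preceding item.
            yield from iter_outline(item, level + 1)
        else:
            yield level, item


# Usage with a PdfReader such as the one created above:
# for level, item in iter_outline(reader.outline):
#     print("  " * level + item.title)
```

Because the generator does not touch the items themselves, it can be combined with {meth}`~pypdf.PdfReader.get_destination_page_number` or any other per-item processing.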

0 commit comments