**Only the `test__decorators.py`, `test_split_pdf_hook.py`,
`split_pdf_hook.py`, and `overlay_client.yaml` files were modified by
a human. The rest were auto-generated.**
To run the integration tests, first start `unstructured-api` on port 8000.
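Before launching the integration tests, it can help to confirm that something is actually listening on port 8000. The following is a minimal sketch using only the Python standard library. The helper name `is_port_open` and the `localhost` default are assumptions made for illustration and are not part of this PR or of `unstructured-api`.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Hypothetical helper: it only checks that *something* accepts connections
    on the port, not that the listener is really unstructured-api.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Port 8000 is reachable; the integration tests can be started.")
    else:
        print("Nothing is listening on port 8000; start unstructured-api first.")
```

A successful connection only shows that the port is in use, so a wrong service bound to 8000 would still pass this check.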
docs/models/shared/partitionparameters.md (1 addition, 0 deletions)
@@ -25,6 +25,7 @@
| `pdf_infer_table_structure` | *Optional[bool]* | :heavy_minus_sign: | Deprecated! Use skip_infer_table_types to opt out of table extraction for any file type. If False and strategy=hi_res, no Table Elements will be extracted from pdf files regardless of skip_infer_table_types contents. | |
| `skip_infer_table_types` | List[*str*] | :heavy_minus_sign: | The document types that you want to skip table extraction with. Default: [] | |
| `split_pdf_page` | *Optional[bool]* | :heavy_minus_sign: | Should the pdf file be split at client. Ignored on backend. | |
+| `starting_page_number` | *Optional[int]* | :heavy_minus_sign: | The real number of the first PDF page. | |
| `strategy` | *Optional[str]* | :heavy_minus_sign: | The strategy to use for partitioning PDF/image. Options are fast, hi_res, auto. Default: auto | hi_res |
| `unique_element_ids` | *Optional[bool]* | :heavy_minus_sign: | When True, assign UUIDs to element IDs, which guarantees their uniqueness (useful when using them as primary keys in database). Otherwise a SHA-256 of element text is used. Default: False | |
| `xml_keep_tags` | *Optional[bool]* | :heavy_minus_sign: | If True, will retain the XML tags in the output. Otherwise it will simply extract the text from within the tags. Only applies to partition_xml. | |
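
The new `starting_page_number` parameter is described as "the real number of the first PDF page". One plausible reading is that when a PDF is split on the client (`split_pdf_page`), each chunk needs to tell the backend where it sits in the original document so that page numbers in the results stay correct. The sketch below illustrates only that arithmetic. The function `page_requests` and the `page_index` key are hypothetical names, and how `split_pdf_hook.py` actually builds its requests is not shown in this diff.

```python
from typing import Dict, List


def page_requests(num_pages: int, first_page_number: int = 1) -> List[Dict[str, int]]:
    """Build one hypothetical parameter dict per single-page chunk.

    Assumption: each chunk holds exactly one page, so the chunk at
    zero-based index i starts at original page first_page_number + i.
    """
    if num_pages < 0:
        raise ValueError("num_pages must be non-negative")
    return [
        {"page_index": i, "starting_page_number": first_page_number + i}
        for i in range(num_pages)
    ]


if __name__ == "__main__":
    # A three-page PDF split into three single-page chunks.
    for params in page_requests(3):
        print(params)
```

With this scheme, the third chunk of a three-page document carries `starting_page_number` 3, so elements extracted from it can be reported against the original page rather than page 1 of the chunk.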