This documentation is in preview and is subject to change.

Convert PDF Document

The Convert PDF Document action converts a PDF file to another format and returns the resulting file content.

Parameters

Name | Type | Required | Description
Document | File content | Yes | PDF file content (e.g. the output of a Get file content action from SharePoint or OneDrive).
Output Format | String | Yes | Desired target format. One of: .doc, .docx, .pptx, .html, SVG.

Optional Parameters by Output Format

DOC

Parameter | Type | Description | Options
Add Return To Line End | Boolean | Append a line break at the end of each extracted paragraph. | Yes, No
Image Resolution X | Integer | Horizontal resolution for any images extracted from the PDF. | Any integer
Image Resolution Y | Integer | Vertical resolution for any images extracted from the PDF. | Any integer
Recognition Mode | Enum | OCR layout strategy for text recognition. | Textbox, Enhanced flow, Flow
Recognise Bullets | Boolean | Convert bullet characters into actual list items. | Yes, No

DOCX

Parameter | Type | Description | Options
Add Return To Line End | Boolean | Append a line break at the end of each extracted paragraph. | Yes, No
Image Resolution X | Integer | Horizontal resolution for any images extracted from the PDF. | Any integer
Image Resolution Y | Integer | Vertical resolution for any images extracted from the PDF. | Any integer
Recognition Mode | Enum | OCR layout strategy for text recognition. | Textbox, Enhanced flow, Flow
Recognise Bullets | Boolean | Convert bullet characters into actual list items. | Yes, No

HTML

Parameter | Type | Description | Options
Scale To Pixels | Boolean | Render vector elements at pixel-aligned dimensions. | Yes, No
Explicit List Of Saved Pages | Integer Array | One-based list of page numbers to convert; only these pages will be rendered. | e.g. [1,3,5]
Fixed Layout | Boolean | Preserve original PDF layout exactly, without any reflow. | Yes, No
Flow Layout Paragraph Full Width | Boolean | Allow paragraphs to span the full available width when reflowing text. | Yes, No
Image Resolution | Integer | Resolution for any images rendered within the HTML output. | Any integer
Minimal Line Width | Float | Minimum stroke width (in pixels) for lines. | Any positive number

PPTX

Parameter | Type | Description | Options
Image Resolution | Integer | Resolution for any images embedded in the generated slides. | Any integer
Optimise Text Boxes | Boolean | Merge and simplify text boxes for cleaner PowerPoint editing. | Yes, No

SVG

Parameter | Type | Description | Options
Scale to Pixels | Boolean | Render SVG elements with pixel-aligned dimensions. | Yes, No
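
The per-format tables above can be turned into a simple lookup that flags options which do not belong to the chosen output format. This is a minimal sketch, not the connector's own validation: the keys are the display names from the tables, and the names used in an actual request payload may differ. `OPTIONS_BY_FORMAT` and `unsupported_options` are hypothetical names introduced for illustration.

```python
# Optional parameters per output format, transcribed from the tables above.
# Assumption: display names are used as keys; real payload keys may differ.
_WORD_OPTIONS = {
    "Add Return To Line End",
    "Image Resolution X",
    "Image Resolution Y",
    "Recognition Mode",
    "Recognise Bullets",
}

OPTIONS_BY_FORMAT = {
    ".doc": _WORD_OPTIONS,
    ".docx": _WORD_OPTIONS,
    ".html": {
        "Scale To Pixels",
        "Explicit List Of Saved Pages",
        "Fixed Layout",
        "Flow Layout Paragraph Full Width",
        "Image Resolution",
        "Minimal Line Width",
    },
    ".pptx": {"Image Resolution", "Optimise Text Boxes"},
    "SVG": {"Scale to Pixels"},
}


def unsupported_options(output_format: str, options: dict) -> list[str]:
    """Return the option names that are not documented for output_format."""
    allowed = OPTIONS_BY_FORMAT[output_format]
    return sorted(name for name in options if name not in allowed)
```

For example, passing `Recognition Mode` together with `.pptx` would be reported, because that option is documented only for DOC and DOCX.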

Returns

Name | Type | Description
File content | String (base64‑encoded) | Base64‑encoded bytes of the converted file, suitable for downstream actions (e.g. “Create file”).

Troubleshooting


Document Missing, Truncated or Invalid

Cause:
PDF input is empty, corrupted, or not valid PDF bytes.

Fix:

  • Provide the full PDF base64 and verify it decodes to a valid PDF.
  • Test conversion with a sample PDF to confirm baseline behavior.
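
One way to verify the input before calling the action is to decode the base64 string locally and look for the PDF header and end-of-file marker. The sketch below assumes the payload is available as a Python string; `check_pdf_base64` is a hypothetical helper, and the 1024-byte search windows are an arbitrary choice.

```python
import base64
import binascii


def check_pdf_base64(payload: str) -> bytes:
    """Decode a base64 string and confirm it looks like a PDF.

    Raises ValueError with a short reason when a check fails.
    """
    try:
        data = base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not valid base64: {exc}") from exc
    if not data:
        raise ValueError("payload is empty")
    # PDF files begin with a "%PDF-" header near the start of the file.
    if b"%PDF-" not in data[:1024]:
        raise ValueError("missing %PDF- header")
    # The "%%EOF" marker sits near the end; its absence suggests truncation.
    if b"%%EOF" not in data[-1024:]:
        raise ValueError("missing %%EOF marker; file may be truncated")
    return data
```

These checks only catch gross problems such as empty, truncated, or non-PDF input; a file can pass them and still be damaged internally.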

Unsupported or Misspelled Output Format

Cause:
Output format string is incorrect or not supported.

Fix:

  • Verify the format value exactly matches one of the supported strings (.doc, .docx, .pptx, .html, SVG).
  • Correct typos and retry.
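
A small guard can catch casing or leading-dot mistakes before the action runs. This is a sketch under the assumption that the accepted strings are exactly those listed above; `resolve_output_format` is a hypothetical helper, not part of the connector.

```python
# Supported values as listed in this document.
SUPPORTED_FORMATS = (".doc", ".docx", ".pptx", ".html", "SVG")


def resolve_output_format(value: str) -> str:
    """Map user input to the exact supported string.

    Matching ignores case, surrounding whitespace, and a leading dot.
    """
    key = value.strip().lower().lstrip(".")
    for fmt in SUPPORTED_FORMATS:
        if fmt.lower().lstrip(".") == key:
            return fmt
    raise ValueError(
        f"unsupported output format {value!r}; expected one of {SUPPORTED_FORMATS}"
    )
```

With this helper, inputs such as `DOCX` or `.svg` resolve to `.docx` and `SVG`, while an unknown value such as `pdf` raises an error.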

Password-Protected or Encrypted PDF

Cause:
PDF is encrypted or requires a password to open.

Fix:

  • Supply an unencrypted copy.
  • Remove password protection prior to conversion.
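
Encrypted PDFs normally reference an `/Encrypt` dictionary in the file trailer, so a rough pre-check can look for that name in the raw bytes. This is only a heuristic sketch: `looks_encrypted` is a hypothetical helper, and it can report a false positive if the text `/Encrypt` happens to appear elsewhere in the file.

```python
def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic: return True if the raw PDF bytes mention an /Encrypt entry."""
    return b"/Encrypt" in pdf_bytes
```

A positive result is a prompt to open the file in a PDF viewer and confirm, not proof that the conversion will fail.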

OCR / Recognition Errors (Text Missing, Garbled, or Layout Incorrect)

Cause:
Source PDF is scanned or contains images of text and OCR settings are inappropriate (wrong recognition mode, low resolution).

Fix:

  • Select an OCR/recognition mode appropriate to the content (e.g., Enhanced flow for complex layouts).
  • Increase image resolution parameters if text is small or low-quality.
  • For consistently poor OCR, provide a higher-quality source or pre-OCR the PDF with a specialised OCR tool.

Bullets, Lists or Paragraph Breaks Lost or Malformed

Cause:
Recognition settings or paragraph handling options (e.g., “Add Return To Line End”) do not match the PDF’s layout semantics.

Fix:

  • Toggle paragraph/line-end options to preserve expected line breaks.
  • Enable “Recognise Bullets” if list detection is required.
  • Test small samples to confirm behavior.

Images Missing or Poor-Quality in Output

Cause:
Image extraction/resolution settings too low, or compression removed image fidelity.

Fix:

  • Increase the image resolution: Image Resolution X and Image Resolution Y for .doc and .docx, or Image Resolution for .html and .pptx.
  • Use lossless or less aggressive compression if detail is important.

Output Corrupted or Cannot Be Opened

Cause:
Returned payload truncated, incorrectly encoded, or not matching expected type.

Fix:

  • Verify the full base64 string is returned and decodes without error.
  • Re-run with a minimal PDF to confirm whether truncation occurs consistently.
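
To check that the returned payload matches the expected type, decode it and compare the first bytes against well-known file signatures. The sketch below assumes that .docx and .pptx outputs are ZIP containers, that .doc output is a legacy compound file, and that .html and SVG outputs are text containing their root tags; `output_matches_format` is a hypothetical helper.

```python
import base64

ZIP_MAGIC = b"PK\x03\x04"  # .docx and .pptx are ZIP containers
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .doc compound file


def output_matches_format(file_content_b64: str, output_format: str) -> bool:
    """Decode the returned base64 and check it looks like output_format."""
    data = base64.b64decode(file_content_b64, validate=True)
    if output_format in (".docx", ".pptx"):
        return data.startswith(ZIP_MAGIC)
    if output_format == ".doc":
        return data.startswith(OLE_MAGIC)
    head = data[:4096].lower()  # text formats: look for the root tag
    if output_format == ".html":
        return b"<html" in head or b"<!doctype html" in head
    if output_format == "SVG":
        return b"<svg" in head
    raise ValueError(f"unknown output format {output_format!r}")
```

A decoding error points to a truncated or incorrectly encoded payload, while a `False` result suggests the content does not match the requested format.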

Page Selection or Partial Conversion Wrong

Cause:
Options specifying page ranges, explicit pages, or page indices are invalid or out of range.

Fix:

  • Validate the page list (Explicit List Of Saved Pages uses one-based page numbers) and ensure every value is within the PDF’s page count.
  • Use default full-range behavior if unsure.
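
The page check can be sketched as a filter over the requested list. It assumes the page count is already known from another source, since this action does not return it; `invalid_pages` is a hypothetical helper.

```python
def invalid_pages(pages: list, page_count: int) -> list:
    """Return entries that are not one-based page numbers within the document."""
    return [
        p
        for p in pages
        if isinstance(p, bool) or not isinstance(p, int) or p < 1 or p > page_count
    ]
```

An empty result means every entry is a usable page number; for a four-page PDF, the list `[1, 3, 5]` would flag page 5.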

Performance, Timeouts or Resource Limits on Large PDFs

Cause:
Very large PDFs or extremely high DPI/image settings exceed processing limits.

Fix:

  • Reduce DPI/resolution or convert fewer pages at a time.
  • Test with representative smaller files to determine safe limits.
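
Converting fewer pages at a time can be sketched as splitting the page range into batches and passing each batch as Explicit List Of Saved Pages. That option is documented above only for HTML output, so for other formats the PDF would need to be split before conversion. `page_batches` is a hypothetical helper, and the batch size is something to tune against your own limits.

```python
def page_batches(page_count: int, batch_size: int) -> list[list[int]]:
    """Split pages 1..page_count into consecutive one-based batches."""
    if page_count < 0 or batch_size < 1:
        raise ValueError("page_count must be >= 0 and batch_size >= 1")
    pages = list(range(1, page_count + 1))
    return [pages[i:i + batch_size] for i in range(0, page_count, batch_size)]
```

Each inner list can then be supplied to a separate run of the action.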

Invalid Format-Specific Option Values

Cause:
Out-of-range integers, invalid enum values, or conflicting options for the chosen format.

Fix:

  • Validate numeric ranges and use documented enum values.
  • Remove conflicting parameters and retry.
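
Value checks can be run locally before the action is called. The sketch below uses the enum values documented for Recognition Mode and the "any positive number" rule for Minimal Line Width. Treating image resolutions as positive integers is an assumption, since the tables only say "Any integer". `option_errors` is a hypothetical helper keyed by display names.

```python
RECOGNITION_MODES = {"Textbox", "Enhanced flow", "Flow"}


def option_errors(options: dict) -> list[str]:
    """Return a human-readable message for each suspicious option value."""
    errors = []
    for name, value in options.items():
        if name.startswith("Image Resolution"):
            # Assumption: a usable resolution is a positive integer.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                errors.append(f"{name}: expected a positive integer, got {value!r}")
        elif name == "Recognition Mode":
            if not isinstance(value, str) or value not in RECOGNITION_MODES:
                errors.append(f"{name}: expected one of {sorted(RECOGNITION_MODES)}")
        elif name == "Minimal Line Width":
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or value <= 0:
                errors.append(f"{name}: expected a positive number, got {value!r}")
    return errors
```

An empty list means none of the checked options looked wrong; options not covered by these rules are passed through unchecked.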

Generic Runtime or Transient Failure

Cause:
Malformed inputs, intermittent conversion engine error, or unexpected internal state.

Fix:

  • Reproduce the failure with a minimal PDF and a minimal option set.
  • Validate inputs and retry to rule out transients.
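
Retrying to rule out transient failures can be sketched as a small wrapper with exponential backoff. `call_with_retries` is a hypothetical helper, and `convert` stands for whatever callable invokes the action in your environment; the attempt count and delays are arbitrary starting points.

```python
import time


def call_with_retries(convert, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call convert() up to `attempts` times, doubling the delay after each failure.

    The last failure is re-raised so the caller still sees the real error.
    """
    for attempt in range(attempts):
        try:
            return convert()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Retries help only with intermittent errors. If the same failure occurs on every attempt, the cause is more likely malformed input, which is better isolated with the minimal PDF and minimal option set described above.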

Quick Checklist

  • Document contains the complete PDF binary/base64 payload (not a path/URL).
  • Output Format is exactly one of the supported values and correctly spelled.
  • If PDF pages are targeted, page indices/ranges are valid and within the document’s page count.
  • OCR/recognition mode and image resolution are appropriate for scanned PDFs.
  • Image resolution and compression settings are tuned for required fidelity.
  • For editable outputs (.doc / .docx / .pptx), expect reflow and validate on a small sample.
  • If output is corrupted, verify base64 decodes correctly and reproduce with a minimal test file.
  • For persistent failures, reproduce with the exact PDF bytes (or a small redacted sample), the chosen Output Format, and the full set of format options. That input set is required to isolate the root cause.