modern-pdf-lib


Function: extractText()

extractText(operators, resources?, options?): string

Defined in: src/parser/textExtractor.ts:76

Extract plain text from a sequence of parsed content-stream operators.

The function concatenates the strings of all text-showing operators, inserting spaces between text objects (BT/ET blocks) and newlines at line-break operators (T*, Td, TD).
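
The concatenation rule described above can be illustrated with a small standalone sketch. This is not the library's implementation: the `SketchOperator` shape and the `sketchExtractText` name are hypothetical, and font encodings, ToUnicode CMaps, and extraction options are ignored entirely. It only assumes that text-showing operators are `Tj` (one string) and `TJ` (an array mixing strings and kerning numbers), as in the PDF content-stream syntax.

```ts
// Hypothetical operator shape for illustration only; the library's
// ContentStreamOperator type may look different.
interface SketchOperator {
  operator: string;
  operands: Array<string | number | Array<string | number>>;
}

// Simplified sketch: join shown strings, add a space between text
// objects and a newline at T*, Td, and TD.
function sketchExtractText(ops: SketchOperator[]): string {
  let out = "";
  for (const op of ops) {
    switch (op.operator) {
      case "BT":
        // A new text object starts: separate it from earlier text,
        // unless the output already ends in whitespace.
        if (out.length > 0 && !/\s$/.test(out)) out += " ";
        break;
      case "T*":
      case "Td":
      case "TD":
        // Treat line-positioning operators as line breaks. Skipping the
        // break on empty output is a choice made for this sketch only.
        if (out.length > 0) out += "\n";
        break;
      case "Tj": {
        // Tj shows a single string operand.
        const s = op.operands[0];
        if (typeof s === "string") out += s;
        break;
      }
      case "TJ": {
        // TJ shows an array of strings and kerning adjustments;
        // only the strings contribute text.
        const items = op.operands[0];
        if (Array.isArray(items)) {
          for (const item of items) {
            if (typeof item === "string") out += item;
          }
        }
        break;
      }
      default:
        // Every other operator (ET, Tf, graphics state, ...) is ignored.
        break;
    }
  }
  return out;
}

// Two text objects; the first one contains a line break.
const sampleOps: SketchOperator[] = [
  { operator: "BT", operands: [] },
  { operator: "Tj", operands: ["Hello"] },
  { operator: "T*", operands: [] },
  { operator: "TJ", operands: [["wor", -20, "ld"]] },
  { operator: "ET", operands: [] },
  { operator: "BT", operands: [] },
  { operator: "Tj", operands: ["Again"] },
  { operator: "ET", operands: [] },
];

const sampleText = sketchExtractText(sampleOps);
```

In this sketch `sampleText` holds "Hello", a newline, then "world Again": the `T*` produces the newline and the second `BT` produces the space. Real extraction also has to decode string bytes through the font's encoding, which is why `extractText` accepts the optional `resources` dictionary.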

Parameters

operators

ContentStreamOperator[]

Parsed content-stream operators.

resources?

PdfDict

Optional page /Resources dictionary (used to look up font encodings and ToUnicode CMaps).

options?

TextExtractionOptions

Extraction options.

Returns

string

The extracted text as a single string.

Released under the MIT License.