Capture & Parse PDF file

We are building a flow that needs to download a file from an SFTP site and encode it as a Base64 string before sending it to a REST API endpoint.

I can’t seem to find a suitable reader so I can then use the byte2base64 function. Do we need to create our own PDF reader?

That’s correct: we currently do not have a PDF reader that directly extracts text from PDFs. However, any file, including a .pdf file, can be transferred as bytes.

Here is how to do it:

  1. Set the URL to the .pdf file in the FlatFileReader and connect it to a Map component.
  2. Edit the metadata:
  • Double-click the edge on the output port of the FlatFileReader.
  • Set the Type of the output field to Byte.
  • Select the metadata.
  • In the properties, remove Record delimiter and Default delimiter, and set EOF as delimiter to true.
  3. In the Map component, apply the transformation using the byte2base64 function to convert the file to a Base64 string and send it to the output port.
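
As a minimal sketch of the transformation in the Map component, the CTL could look something like the following. The field names content (the Byte field on the input) and base64 (a String field on the output) are hypothetical; substitute the names from your own metadata.

```
//#CTL2

// Hypothetical field names: "content" is the Byte field filled by the
// FlatFileReader, "base64" is a String field on the Map output port.
function integer transform() {
	// Encode the whole file, read as a single record, into a Base64 string.
	$out.0.base64 = byte2base64($in.0.content);
	return ALL;
}
```

Because EOF is used as the delimiter, the whole file arrives as one record, so the transformation runs once per file.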

For better understanding, we are attaching an example graph to the answer.

Base64.grf (2.3 KB)