Splitting PDF Files in Microsoft Flow

In one of my previous blog posts, we created a Flow to merge multiple documents into a single PDF. In this post, we will perform the opposite action, i.e. split an existing PDF file into multiple files containing 10 pages each.

This is quite a common scenario for organisations that deal with massive documents and frequently split them into batches of, for example, 100 pages to keep the files manageable. If your document uses a format other than PDF, use the Muhimbi “Convert to PDF” action to convert it to PDF first.

Please make sure the following prerequisites are in place:

  • An Office 365 subscription with a SharePoint Online license.
  • A Muhimbi PDF Converter Services Online subscription: full, free or trial (Sign up).
  • Appropriate privileges to create Flows.
  • Working knowledge of SharePoint Online, Microsoft Flow and Microsoft Forms.

At a high level, our Flow has three steps: a SharePoint trigger, the Muhimbi “Split PDF” action and a SharePoint “Create file” action.


Step 1:

  • For this demo, we will use the SharePoint trigger “When a file is created or modified in a folder”.
  • In the trigger, specify the path to the SharePoint Online folder to monitor for new files.

Note: To trigger a Flow for files created anywhere in a Document Library, use the ‘When a file is created (properties only)’ trigger.

Step 2: This is where the real magic happens. Add the Muhimbi “Split PDF” action and fill in its fields:

  • Source file name: “File name”, an output of the “When a file is created or modified in a folder” SharePoint trigger.
  • Source file content: “File Content”, an output of the “When a file is created or modified in a folder” SharePoint trigger. Note: You may need to click on ‘see more’ to see this field.
  • Split by: Specify whether to split based on the number of pages or the level of the bookmark.
  • Split parameter: When splitting based on the number of pages, set this parameter to the maximum number of pages to include in each split file. When splitting based on the bookmark level, set it to the ‘depth’ at which to split, e.g. specify ‘1’ to split on top-level chapters (Chapter 1, Chapter 2, etc.) or a higher number to split at a deeper level (e.g. ‘2’ splits on Chapter 1, 1.1, 1.2, 2, 2.1, etc.).
  • File name template: This parameter determines the file name for the returned files. For this demo we are using “File name”-(utcnow()), i.e. the trigger’s “File name” output followed by the utcnow() expression.
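To make the two “Split by” modes concrete, here is a minimal Python sketch of the arithmetic described above. It is an illustration only, not Muhimbi’s implementation: the function names and the (level, title, page) bookmark tuples are hypothetical, and how the service treats pages before the first bookmark is not covered by this post.

```python
from typing import List, Tuple


def page_ranges(total_pages: int, max_pages: int) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (first, last) ranges of at most max_pages pages."""
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


def bookmark_split_pages(bookmarks: List[Tuple[int, str, int]], depth: int) -> List[int]:
    """Return the pages where a new file starts when splitting at `depth`.

    Each bookmark is a hypothetical (level, title, page) tuple; level 1 is a
    top-level chapter. Every bookmark at or above the chosen depth starts a file.
    """
    return sorted({page for level, _title, page in bookmarks if level <= depth})


# A 25-page document split into files of at most 10 pages.
print(page_ranges(25, 10))  # [(1, 10), (11, 20), (21, 25)]

outline = [
    (1, "Chapter 1", 1), (2, "1.1", 2), (2, "1.2", 5),
    (1, "Chapter 2", 9), (2, "2.1", 10),
]
print(bookmark_split_pages(outline, 1))  # [1, 9]
print(bookmark_split_pages(outline, 2))  # [1, 2, 5, 9, 10]
```

Note that the last file of a page-based split may be shorter than the others, as with pages 21 to 25 here.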

The file name template can be anything and supports the standard .NET string formatting facilities for numbering, e.g. ‘split-{0:D3}’ uses 3 digits for the sequential number, starting at ‘split-001.pdf’. When splitting by bookmark, an optional {1} parameter can be inserted in the template to include the name of the bookmark as well.
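For readers less familiar with .NET format strings, the sketch below mimics in Python how a template with a zero-padded sequence number expands. It assumes the .NET “D” (decimal) specifier, written as {0:D3} for 3 digits, and handles only {0}, {0:Dn} and {1}. The expand_template helper is hypothetical and is not part of Flow or Muhimbi.

```python
import re


def expand_template(template: str, index: int, bookmark: str = "") -> str:
    """Expand a small subset of .NET composite formatting: {0}, {0:Dn} and {1}."""

    def substitute(match: re.Match) -> str:
        position, width = match.group(1), match.group(2)
        if position == "1":
            return bookmark  # {1} is the bookmark name
        # {0:Dn} pads the sequence number with zeros to at least n digits.
        return str(index).zfill(int(width)) if width else str(index)

    return re.sub(r"\{([01])(?::D(\d+))?\}", substitute, template)


print(expand_template("split-{0:D3}", 1) + ".pdf")   # split-001.pdf
print(expand_template("split-{0:D3}", 12) + ".pdf")  # split-012.pdf
print(expand_template("{1}-{0}", 4, "Chapter 2"))    # Chapter 2-4
```

The “.pdf” extension is appended by hand here to match the ‘split-001.pdf’ example above.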

If the “File name template” value is blank:

Scenario 1: If “Split by” is set to “Number of pages”, the action uses the SharePoint “File name” property followed by a 3-digit sequential number, starting at ‘<File name>-001.pdf’.

Scenario 2: If “Split by” is set to “Bookmark”, the action concatenates the SharePoint “File name” with the bookmark name.


Step 3: Use the “Create file” SharePoint action to write the PDF document to the SharePoint document library.

Note: The output of the Split action is an array of files. Flow will automatically put an ‘Apply to each’ loop around any action that uses the split results.

  • File name: For this demo, we will use “Processed file name”, listed under “Processed files” in the output of the “Split PDF” action.
  • File content: “Processed file content”, listed under “Processed files” in the output of the “Split PDF” action. Note: You may need to click on ‘see more’ to see this field.
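Conceptually, the ‘Apply to each’ loop around “Create file” does what the following Python sketch does with a local folder. This is an analogy only: ProcessedFile and save_split_files are hypothetical names standing in for the “Processed files” array and the SharePoint action, and no SharePoint or Muhimbi API is called.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, List


@dataclass
class ProcessedFile:
    name: str       # stands in for "Processed file name"
    content: bytes  # stands in for "Processed file content"


def save_split_files(processed_files: Iterable[ProcessedFile], folder: Path) -> List[Path]:
    """Write every split result to `folder`, one file per array item."""
    folder.mkdir(parents=True, exist_ok=True)
    written = []
    for item in processed_files:  # plays the role of 'Apply to each'
        target = folder / item.name
        target.write_bytes(item.content)  # plays the role of 'Create file'
        written.append(target)
    return written
```

Each item in the array produces exactly one file, which is why the file name template needs to yield a distinct name per item.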

The tutorial described in this blog post is also available as a video.

Thanks for reading.

2 thoughts on “Splitting PDF Files in Microsoft Flow”

  1. Nice write-up, thanks!

    Is it possible to “split-by-content” instead of number-of-pages as in your example above?
    My use case involves a batch scan of multiple invoices that arrive in a single file and that I want to split into separate documents. Each invoice within that single PDF is from 1 to n pages long, and is separated from the next by a special control page which contains a QR code. The process I’m looking for would split the PDF wherever it finds the page with the QR code and save each section into its own PDF. The page with the QR code needs to be removed. Could this be implemented with Muhimbi?

    Thanks,
    Dan

  2. Hi, I am glad that you liked the blog. I had a discussion with the Muhimbi team, and they informed me that at present they only support splitting by “number of pages” or by “bookmarks”. They are planning to implement other options, such as splitting based on the content of the page, in a future release.
