Web Scraping – Automate collecting tabular data from a web page using Power Automate Desktop

In this post, we will use Power Automate Desktop (RPA) to collect tabular data from a web page.

What is Web scraping?

Web scraping is the process of using bots to extract content and data from a website. Unlike screen scraping, which only copies the pixels displayed on screen, web scraping extracts the underlying HTML code.
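To make the idea concrete outside of Power Automate, here is a minimal Python sketch that pulls the rows of an HTML table out of the markup using only the standard library. The class name `TableExtractor`, the helper `extract_table`, and the sample markup are made up for illustration; they are not taken from any real site or from the flow built in this post.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row being built, or None when outside a <tr>
        self._cell = None   # text chunks of the cell being built

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_table(html):
    """Return the table cells found in `html` as a list of rows."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample markup, used only to exercise the parser.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Price</th></tr>
  <tr><td>Day 1</td><td>100.00</td></tr>
  <tr><td>Day 2</td><td>101.50</td></tr>
</table>
"""

rows = extract_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

A screen scraper would only see the rendered pixels; this sketch reads the cell values directly from the HTML structure, which is the same kind of information the ‘Extract data from web page‘ action works with.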

Introduction to Power Automate Desktop:

Power Automate Desktop broadens the existing robotic process automation (RPA) capabilities in Power Automate and enables you to automate all repetitive desktop processes. It’s quicker and easier than ever to automate with the new intuitive Power Automate Desktop designer using the prebuilt drag-and-drop actions or recording your own desktop flows to run later.

Before you begin, please make sure the following prerequisites are in place:

  • Power Automate Desktop is installed and configured on your computer.
  • The Power Automate browser extension is installed and enabled in your browser.
  • You have a working knowledge of Power Automate Desktop.
  • If you plan to run the Flows in attended or unattended mode, you need the on-premises data gateway on your device so that the desktop flow can be triggered by Power Automate.

Scenario:

Your boss wants to keep track of gold prices every day. As the RPA manager, you need to automate the process using Power Automate Desktop.

You have used Process Advisor (Hindi Video) and found that an employee uses the URL (https://www.bullion-rates.com/gold/INR/2021-5-history.htm) to track daily gold prices. The website https://www.bullion-rates.com/ provides a good tabular structure and accurate pricing.
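The history URL appears to carry the year and month as `2021-5`. Assuming other months follow the same pattern (the post only shows May 2021, so this is an assumption), a small Python sketch of building the URL from a month number could look like this. The names `BASE_URL` and `history_url` are illustrative and are not part of Power Automate.

```python
# Prefix taken from the URL quoted in this post.
BASE_URL = "https://www.bullion-rates.com/gold/INR"


def history_url(month_number, year=2021):
    """Build the monthly history URL for a month given as an int or a string.

    Assumes the site keeps the '<year>-<month>-history.htm' pattern
    seen in the single example URL.
    """
    month = int(month_number)
    if not 1 <= month <= 12:
        raise ValueError(f"month must be between 1 and 12, got {month}")
    return f"{BASE_URL}/{year}-{month}-history.htm"


print(history_url("5"))
```

Accepting a string matters here because the month number reaches the desktop flow as text typed in by the user.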

Part 1: Configuring Power Automate Desktop

Step 1:

  • Open Power Automate Desktop.
  • Click on ‘+ New Flow’, give your Flow a meaningful name and click on the ‘Create‘ button at the bottom of the screen.

Step 2:

  • Initialize two variables:
    • ‘MonthNumber‘ – Variable Type (Input)
    • ‘PadTable‘ – Variable Type (Output)
  • We will use the ‘MonthNumber‘ variable to pass the month number from Power Automate Cloud to Power Automate Desktop, and the ‘PadTable‘ variable to return the value from Power Automate Desktop to Power Automate Cloud.

Step 3:

Step 4:

  • Add the ‘Extract data from web page‘ action.

Step 5:

  • Add the ‘Set variable‘ action.
    • Set ‘PadTable‘ to %DataFromWebPage% (the output of the ‘Extract data from web page‘ action).

Step 6:

  • Add the ‘Close web browser‘ action and click on ‘Save‘.

Part 2: Configuring Power Automate Cloud

Step 1:

For this demo, we will use the ‘Manual‘ trigger, which lets you start the Flow manually.

In the trigger, click on ‘Add an Input‘ and choose the type ‘Text‘.

Step 2:

  • Add the ‘Run a flow built with Power Automate Desktop’ action.
  • Run Mode: select the Power Automate Desktop Flow from the drop-down.
  • NewInput: ‘MonthNumber‘, the output of the Manual trigger.

Note: This action requires the on-premises data gateway to be configured (see the prerequisites).

Step 3:

  • Add the Muhimbi ‘Convert document‘ action.
  • Source file name: For this demo we will hardcode the name as ‘Gold.txt’.
  • Source file content: Enter ‘PadTable‘, the output of the ‘Run a flow built with Power Automate Desktop’ action.
  • Output Format: ‘PDF‘.
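The conversion step receives the table as text. This post does not show how ‘PadTable‘ is serialized on its way to the cloud flow, so the following Python sketch only illustrates the general idea: it assumes the table is available as a list of rows and renders it as aligned plain text, similar to what a file like ‘Gold.txt’ might contain. The function name `format_table` and the sample rows are hypothetical.

```python
def format_table(rows):
    """Render a list of rows as left-aligned, space-padded plain text."""
    if not rows:
        return ""
    # Pad short rows so every row has the same number of columns.
    n_cols = max(len(row) for row in rows)
    padded = [list(row) + [""] * (n_cols - len(row)) for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(str(row[i])) for row in padded) for i in range(n_cols)]
    lines = []
    for row in padded:
        cells = [str(cell).ljust(widths[i]) for i, cell in enumerate(row)]
        lines.append("  ".join(cells).rstrip())
    return "\n".join(lines)


# Hypothetical rows, standing in for the extracted table.
sample = [["Date", "Price"], ["Day 1", "100.00"], ["Day 2", "101.50"]]
print(format_table(sample))
```

Plain text keeps the conversion input simple; a different converter or output layout might call for CSV or HTML instead.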

For details see the screenshot below:

Step 4:

  • Add the ‘Create file‘ SharePoint action to create the PDF document in the SharePoint document library.
  • Folder Path: Specify the output path to write the PDF file to.
  • File Name: For this demo we will use the file name ‘Gold Price Month-@{triggerBody()['number']}.pdf’.
  • File Content: ‘Processed file content‘ is the output variable of the ‘Convert document‘ action.
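The file name is simply a fixed prefix with the month number from the trigger spliced in. As a rough Python equivalent of that expression (the helper name `pdf_file_name` is made up for illustration, and the digit check is an extra safeguard that the Flow itself does not perform):

```python
def pdf_file_name(month_number):
    """Build the PDF file name for a month number supplied as text or int."""
    month = str(month_number).strip()
    # The trigger input is free text, so reject anything that is not a number.
    if not month.isdigit():
        raise ValueError(f"expected a month number, got {month_number!r}")
    return f"Gold Price Month-{month}.pdf"


print(pdf_file_name("5"))
```

Validating the input before it reaches the file name avoids creating oddly named files when the Flow is started with a typo.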

All done! Start your Flow manually. After a few seconds, Power Automate Desktop should launch and scrape the data from the web page, and a PDF document will be created in the destination library.
