JavaScript Rendering
Sometimes the data you need is not in the initial HTML response, because many modern websites rely on JavaScript to load content dynamically.
To handle this type of website, we offer a feature called JavaScript rendering, which emulates a real browser environment to fully load and render the page before extracting the data.
It also allows you to interact with the page, for example by clicking buttons or filling in forms.
Enabling JavaScript Rendering
To enable JavaScript rendering, pass render=true with the request. This opens a real browser environment to render the page, lets you interact with it, and lets you extract data that is not available in the initial HTML response.
A request with JavaScript rendering enabled costs ten times as much as a standard request.
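As a minimal sketch of what such a request could look like, the snippet below builds a query string with render=true in Python. The endpoint URL and the `url` parameter name are placeholders assumed for illustration, and authentication is omitted; only render=true comes from the text above.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): replace with the real API endpoint.
API_ENDPOINT = "https://api.example.com/scrape"


def build_render_url(target_url: str) -> str:
    """Return a request URL with JavaScript rendering switched on."""
    params = {
        "url": target_url,  # assumed name for the target-page parameter
        "render": "true",   # documented flag that enables rendering
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


render_url = build_render_url("https://example.com")
print(render_url)
```

Since rendered requests cost more, it may be worth enabling the flag only for pages whose content is missing from the plain HTML response.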
Features of JavaScript Rendering
Several features are available when JavaScript rendering is enabled, such as:
- Screenshot: Allows you to take a screenshot of the rendered page.
- Blocking Resources: Blocks specific resources from loading, e.g. images, scripts, ads, and cookie banners.
- Instructions: Allows you to interact with the page, like clicking on buttons, filling forms, etc.
- Window Width & Height: Enables you to specify the browser’s window width and height.
Window Width & Height
Sometimes, adjusting the browser window size is necessary to render a webpage correctly. Some elements may be hidden due to responsiveness, or the default screen dimensions might not fit your use case.
To handle this, ScrapeAutomate lets you customize the browser’s width and height using the window_width and window_height query parameters.
Setting Window Width
The window_width parameter specifies the browser’s width in pixels. It must be a positive integer; otherwise, you will receive a validation error. The maximum window width is 3820.
In the following example, we set the window width to 1920 pixels:
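A minimal Python sketch of such a request, assuming a placeholder endpoint and a `url` parameter name that are not taken from this page. The local range check mirrors the limits described above so that an invalid width fails before the request is sent.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.example.com/scrape"  # placeholder, not the real endpoint
MAX_WINDOW_WIDTH = 3820  # maximum width stated in the text above


def width_request_url(target_url: str, width: int) -> str:
    """Build a rendered request URL with a locally validated window width."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(width, bool) or not isinstance(width, int):
        raise ValueError("window_width must be an integer")
    if not 1 <= width <= MAX_WINDOW_WIDTH:
        raise ValueError(f"window_width must be between 1 and {MAX_WINDOW_WIDTH}")
    params = {
        "url": target_url,       # assumed parameter name
        "render": "true",        # window sizing is a rendering feature
        "window_width": width,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


width_url = width_request_url("https://example.com", 1920)
print(width_url)
```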
Setting Window Height
Similarly, you can specify the browser height using the window_height parameter. The value must be a positive integer, and the maximum height is 2680 pixels. If you exceed this limit, you will receive a validation error.
In this example, we set the window height to 1080 pixels:
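A minimal Python sketch, again assuming a placeholder endpoint and a hypothetical `url` parameter name; only render and window_height come from this page.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.example.com/scrape"  # placeholder endpoint (assumption)

params = {
    "url": "https://example.com",  # assumed parameter name
    "render": "true",
    "window_height": 1080,         # must be a positive integer, at most 2680
}
height_url = f"{API_ENDPOINT}?{urlencode(params)}"
print(height_url)
```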
Using Both Width & Height Together
You can specify both window_width and window_height in a single request.
In this example, we set the browser window to 1920x1080:
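A minimal Python sketch, assuming a placeholder endpoint and a hypothetical `url` parameter name. The helper `window_request_url` is illustrative only; it adds a dimension to the query string only when a value is supplied.

```python
from typing import Optional
from urllib.parse import urlencode

API_ENDPOINT = "https://api.example.com/scrape"  # placeholder endpoint (assumption)


def window_request_url(
    target_url: str,
    width: Optional[int] = None,
    height: Optional[int] = None,
) -> str:
    """Build a rendered request URL, sending only the dimensions that were given."""
    params = {"url": target_url, "render": "true"}  # "url" is an assumed name
    if width is not None:
        params["window_width"] = width
    if height is not None:
        params["window_height"] = height
    return f"{API_ENDPOINT}?{urlencode(params)}"


full_hd_url = window_request_url("https://example.com", width=1920, height=1080)
print(full_hd_url)
```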
The default window size is 1920x1080 pixels. If you specify only one of the parameters, the other keeps its default value.
Instructions
ScrapeAutomate provides a set of instructions that let you interact with the page, such as waiting for an element, clicking buttons, filling in forms, and submitting forms. This gives you the flexibility to interact with the page and extract data that is not available in the initial HTML response.
Using Instructions
To use instructions, you must enable JavaScript rendering. Pass the instructions as an array of objects in the instructions query parameter. Each object can have different properties depending on the instruction you want to perform. When passing the instructions in the query parameters, make sure to stringify them into JSON before sending them to the API. You can use our Builder to generate the instructions easily.
Here is an example of using instructions:
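A minimal Python sketch of such an array. The exact object shape (one object per instruction, keyed by the instruction name) is an assumption made for illustration; the Builder is the safer way to confirm the real format.

```python
import json

# Assumed shape: each instruction is an object keyed by its name.
instructions = [
    {"wait": ".btn"},    # wait until an element matching .btn exists
    {"click": ".btn2"},  # then click the first element matching .btn2
]

# Stringify to JSON before placing it in the query string.
instructions_json = json.dumps(instructions)
print(instructions_json)
```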
This set of instructions waits for an element matching the .btn CSS selector to appear on the page, then clicks the first element matching the .btn2 CSS selector. The instructions parameter accepts an array of commands that execute sequentially, and it should be URL encoded before being passed to the API.
Here is an example of using instructions in a request:
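A minimal Python sketch of the stringify-then-encode step, assuming a placeholder endpoint, a hypothetical `url` parameter name, and an assumed instruction object shape. `urlencode` takes care of the URL encoding.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

API_ENDPOINT = "https://api.example.com/scrape"  # placeholder endpoint (assumption)

# Assumed instruction shape: one object per instruction, keyed by its name.
instructions = [{"wait": ".btn"}, {"click": ".btn2"}]

params = {
    "url": "https://example.com",              # assumed parameter name
    "render": "true",                          # instructions need rendering
    "instructions": json.dumps(instructions),  # stringify first
}
instructions_url = f"{API_ENDPOINT}?{urlencode(params)}"  # then URL encode
print(instructions_url)

# Round trip: decoding the query string yields the original list again.
decoded = json.loads(parse_qs(urlsplit(instructions_url).query)["instructions"][0])
```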
Waiting for a selector
Sometimes, elements on a page load dynamically, so you need to wait for them before taking any further action. The wait instruction pauses execution until the specified CSS selector is found on the page. This is very useful for handling SPAs (Single Page Applications) and other websites with dynamic content.
This ensures that the content you depend on has loaded before the next instruction runs. For instance, if you want to ensure that a message element is visible after submitting a form, you can use the wait instruction to wait for that element to appear. The wait instruction accepts a CSS selector as a string.
If the selector is not found or takes too long to appear, the API will return an error message.
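The form-submission case could be sketched as follows in Python. The object shape and both selectors (`#submit`, `#message`) are hypothetical and only illustrate the idea.

```python
import json

# Assumed shape: one object per instruction, keyed by the instruction name.
wait_steps = [
    {"click": "#submit"},  # hypothetical submit button
    {"wait": "#message"},  # pause until the confirmation message exists
]
wait_json = json.dumps(wait_steps)
print(wait_json)
```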
Click on an element
The click instruction simulates a click event on the element matching a specified CSS selector. It lets you programmatically click on a button, link, or any other element on the page. This is useful for navigating to another page, submitting a form, or triggering an action.
This instruction is often paired with the wait instruction to ensure that the element is present before clicking on it. For example, on a search page, you can wait for the search button to appear and then click on it to submit the search query.
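A minimal Python sketch of the wait-then-click pairing. The object shape and the `#search-button` selector are assumptions for illustration.

```python
import json

SEARCH_BUTTON = "#search-button"  # hypothetical selector

# Assumed shape: one object per instruction, keyed by the instruction name.
click_steps = [
    {"wait": SEARCH_BUTTON},   # make sure the button is present first
    {"click": SEARCH_BUTTON},  # then click it to submit the search
]
click_json = json.dumps(click_steps)
print(click_json)
```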
Delay
The delay instruction pauses execution for a specified duration (in milliseconds) before performing the next action. This is useful when additional time is needed for animations, transitions, or content loading. For example, delay: 5000 will pause for 5 seconds before proceeding to the next instruction.
For example, you can use the delay instruction to give a modal time to appear after clicking a button.
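The modal case could be sketched like this in Python. The object shape and the `#open-modal` selector are hypothetical; the 5000 ms value matches the example in the text.

```python
import json

# Assumed shape: one object per instruction, keyed by the instruction name.
delay_steps = [
    {"click": "#open-modal"},  # hypothetical button that opens a modal
    {"delay": 5000},           # pause 5000 ms (5 seconds) for the animation
]
delay_json = json.dumps(delay_steps)
print(delay_json)
```

When the element you are waiting for has a stable selector, a wait instruction is usually the better fit, since a fixed delay always costs the full duration.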
Fill in an input field
The fill instruction inputs text into a specified form field. It accepts the following properties:
- selector: A valid CSS selector for the input field.
- text: The text you want to enter in that field.
- delay: The typing speed in ms, given as a number. This is optional and defaults to 0.
This instruction is useful for filling out forms, entering search queries, or providing user input. For example, you can use the fill instruction to input a search query into a search field.
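A minimal Python sketch of a fill instruction. The nesting of the properties under a `fill` key, the helper name `fill`, and the `#search` selector are assumptions for illustration; only the property names selector, text, and delay come from the text above.

```python
import json
from typing import Any, Dict


def fill(selector: str, text: str, delay_ms: int = 0) -> Dict[str, Any]:
    """Build one fill instruction (assumed shape) with an optional typing delay."""
    if delay_ms < 0:
        raise ValueError("delay must not be negative")
    return {"fill": {"selector": selector, "text": text, "delay": delay_ms}}


fill_steps = [
    {"wait": "#search"},                           # hypothetical search field
    fill("#search", "web scraping", delay_ms=50),  # type the query
]
fill_json = json.dumps(fill_steps)
print(fill_json)
```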
Example of using multiple instructions
Let’s walk through a realistic example that uses multiple instructions, based on the DuckDuckGo and Wikipedia websites. The process starts by visiting the DuckDuckGo homepage, searching for “wikipedia”, and clicking on the first search result, which takes us to the Wikipedia homepage. There, we search for “automation”. We can summarize the process in a few steps:
- Wait for the search box to appear, fill it with “wikipedia”, then click on the search button.
- Wait for the search results to appear and click on the first result.
- Wait for the Wikipedia search box to appear, fill it with “automation”, then click on the search button.
- Wait for the search results to appear.
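The steps above can be sketched in Python as follows. Everything specific here is an assumption for illustration: the endpoint is a placeholder, the `url` parameter name is hypothetical, the instruction object shape is assumed, and all CSS selectors are illustrative and need to be checked against the live pages.

```python
import json
from typing import Any, Dict, List
from urllib.parse import urlencode

API_ENDPOINT = "https://api.example.com/scrape"  # placeholder endpoint

# Illustrative selectors only; verify them against the live pages.
DDG_SEARCH_BOX = "input[name='q']"
DDG_SEARCH_BUTTON = "button[type='submit']"
DDG_FIRST_RESULT = "article a"
WIKI_SEARCH_BOX = "input[name='search']"
WIKI_SEARCH_BUTTON = "button[type='submit']"
WIKI_RESULTS = "#content"

Step = Dict[str, Any]


def search_steps(box: str, button: str, query: str) -> List[Step]:
    """Wait for a search box, type a query into it, and click the search button."""
    return [
        {"wait": box},
        {"fill": {"selector": box, "text": query}},
        {"click": button},
    ]


steps: List[Step] = (
    search_steps(DDG_SEARCH_BOX, DDG_SEARCH_BUTTON, "wikipedia")
    + [{"wait": DDG_FIRST_RESULT}, {"click": DDG_FIRST_RESULT}]
    + search_steps(WIKI_SEARCH_BOX, WIKI_SEARCH_BUTTON, "automation")
    + [{"wait": WIKI_RESULTS}]
)

params = {
    "url": "https://duckduckgo.com",    # assumed parameter name
    "render": "true",                   # instructions need rendering
    "instructions": json.dumps(steps),  # stringify, then URL encode
}
multi_step_url = f"{API_ENDPOINT}?{urlencode(params)}"
print(multi_step_url)
```

Because the instructions execute sequentially, each wait guards the action that follows it, so a missing element surfaces as an error at that step rather than as an empty result later.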
This is just a simple example; you can create more complex instruction sequences based on your requirements.