Generally, recipes with routine steps can handle most scraping tasks. However, some specific scenarios need further actions to make scraping possible, for example:
- the page loads slowly, so we need to wait several seconds before starting to scrape
- scraping should start only after some element has disappeared
- we need to wait for new elements to be ready after clicking a pagination button
- we need to click a link to load the complete content to scrape
- we need to delete all items on the page after scraping is done
To handle these scenarios, NDS provides the following interceptors, where you can add associated actions to prepare the page for scraping:
Pre-transit - actions
The actions to execute before entering the next node. In a Transit node, all actions before the '>Enter Next Node Here<' action are pre-transit actions. If there is no '>Enter Next Node Here<' action, all actions are pre-transit actions by default.
Usually, we use pre-transit actions to make the page ready for a List or Detail node.
Pre-list - Actions after pagination triggered, before data loaded
If the new list page loads slowly, we can add a wait action here so that the List node starts handling the page only after it has finished loading.
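To make the idea of a wait action concrete, here is a minimal Python sketch of a polling wait. It is not NDS code: in NDS the wait is configured as an action in the tool, and the names `wait_until`, `condition`, `timeout`, and `interval` are hypothetical, chosen only for illustration.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper: `condition` is any zero-argument callable, for
    example a check that the new list page has finished loading.
    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The same pattern covers both "wait until new elements are ready" and "wait until an element has disappeared"; only the condition passed in changes.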
Pre-block - Actions before Processing Next Block
We add actions here to make the next block ready for scraping.
Pre-field - simple actions
We can add simple actions here to prepare the field(s) to scrape. All of these actions are applied to the current field or the current tab.
Post-transit - actions
In a Transit node, if there is a '>Enter Next Node Here<' action, all actions after it are called post-transit actions. These actions are not executed until all following nodes have been processed.
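The pre-transit and post-transit rules for a Transit node can be sketched in Python as follows. This is only an illustration of the ordering described here, not how NDS is implemented: the action list is modeled as plain strings, and `split_transit_actions`, `run_transit`, and `process_following_nodes` are hypothetical names.

```python
MARKER = ">Enter Next Node Here<"


def split_transit_actions(actions):
    """Split a Transit node's actions into (pre_transit, post_transit).

    Without the marker, every action counts as pre-transit.
    """
    if MARKER not in actions:
        return list(actions), []
    i = actions.index(MARKER)
    return list(actions[:i]), list(actions[i + 1:])


def run_transit(actions, process_following_nodes, log):
    """Record the order in which actions and following nodes would run."""
    pre, post = split_transit_actions(actions)
    for action in pre:
        log.append(f"pre:{action}")
    # All following nodes are processed before any post-transit action.
    process_following_nodes(log)
    for action in post:
        log.append(f"post:{action}")
    return log
```

For example, with the actions `["wait", MARKER, "delete items"]`, the sketch runs "wait" first, then the following nodes, and "delete items" last, which matches the case of deleting all items on the page after scraping is done.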