What is a Node?

A Node is a package of instructions to finish a specific task.

To make scraping simple, NDS provides three kinds of Node: Transit, List, and Detail. You can define a complete recipe by combining these nodes.

  •   Transit

    A Transit node is a container of actions. The actions you add here are executed sequentially. A Transit node is typically used to open a URL, submit a search, or prepare for the next node.

    A Transit node executes its actions one by one, from top to bottom. When it encounters the action '>Enter Next Node Here<', it enters the next node immediately and resumes the remaining actions after all following nodes have been processed. If the action list contains no '>Enter Next Node Here<', the Transit node enters the next node automatically after all actions have been executed.

  •  List
    A List node handles the scraping of repeating data.

    For example:

    • rows in a structured table
    • items on an eCommerce list page
    • entries on a search result page

    All these pages contain multiple blocks with a similar structure or layout, and each block contains similar fields, such as title, price, and brief description. The following is a Google Maps screenshot:

    [Screenshot panels: Blocks (left), Fields (right)]

    Here, each restaurant highlighted on the left is a block, and the restaurant name inside the block is highlighted as a field on the right.

    Websites usually load more blocks when you scroll down or click a page-turning button.

    A List node has three tabs: Data, Pages, and Navigation.

    •  Data Tab

      to declare the block and its fields, and any actions to be executed before each block is processed
    •  Pages Tab

      to declare how to turn pages or scroll to load more blocks, and any actions to be executed before a new page/list is loaded
    •  Navigation Tab

      to declare how to navigate to the next node

  •   Detail
    Unlike a List node, a Detail node scrapes one-time content from the current page.

In a List or Detail node, you can link the current page to the node via the pin icon after the node name. The browser then loads the linked page automatically whenever you switch to the node, whether by clicking it or through navigation.
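
The order in which a Transit node runs its actions can be easier to follow as code. The Python sketch below is only an illustration of the rule described in the Transit section, not how NDS is implemented. Apart from the marker text '>Enter Next Node Here<', every name in it (run_transit, do_action, enter_next_node, and the sample action labels) is hypothetical. It also assumes that a marker appearing more than once would enter the next node each time, which the description above does not specify.

```python
from typing import Callable, List

# Marker text as it appears in the NDS action list; all other names are hypothetical.
ENTER_NEXT = ">Enter Next Node Here<"


def run_transit(
    actions: List[str],
    do_action: Callable[[str], None],
    enter_next_node: Callable[[], None],
) -> None:
    """Run actions top to bottom, entering the next node at the marker."""
    entered = False
    for action in actions:
        if action == ENTER_NEXT:
            # Enter the next node now; the loop resumes only after
            # enter_next_node() returns, i.e. after all following nodes finish.
            enter_next_node()
            entered = True
        else:
            do_action(action)
    if not entered:
        # No marker in the list: enter the next node after the last action.
        enter_next_node()


# Example: the marker sits between two ordinary actions.
log: List[str] = []
run_transit(
    ["open url", ENTER_NEXT, "log out"],
    do_action=log.append,
    enter_next_node=lambda: log.append("<next node>"),
)
print(log)  # ['open url', '<next node>', 'log out']
```

Without the marker, the same call would run every action first and append "<next node>" last, which matches the automatic behavior described for Transit nodes.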