Field attributes

Once the field element(s) are selected, NDS can extract various attributes from them. The following built-in attributes are available in the Attribute drop-down box.

Built-in Attribute

  • Default

    The Default attribute extracts the content of the target element based on the element's type. For example, if the target element is an input, it returns the value of the input box; if the target element is a select, it returns the text of the selected option.

    The Default attribute usually works well for most elements.

    Here, the result of the Default attribute is the same as that of the Direct Text attribute.

  • Direct Text

    Returns the selected element's inner text.

  • HTML Code

    Returns the selected element's outer HTML code.

  • Link URL

    Returns the element's HREF attribute (the URL address) if the selected element is an A element.

    On some web pages, the cursor changes to a pointer when the mouse moves over an element even though the element itself is not an A element. NDS cannot extract a link URL from such an element.

  • Image URL

    Returns the element's SRC attribute if the selected element is an IMG element (image URL), a video element (video URL), or an iframe element (iframe URL).

  • Image FileName

    Returns the image file name if the selected element is an IMG element and a file name exists.

  • CSS background Image URL

    If the image is rendered not by an IMG element but by CSS on the element, use this attribute to extract the background image URL.

  • Email

    Extracts all email addresses in the selected element's content.

  • OCR-Number

    When the element's content is encrypted or encoded, you can use OCR to recognize the content.

    This attribute recognizes the element content as numbers.

    For example, suppose the price element on the page is encoded with a custom font, so extracting the element's Default attribute (or Direct Text) does not return the expected value.

    Change the attribute to OCR-Number, and then click the preview icon. An OCR button now appears next to the price column title in the preview table.

    Click the OCR button, and NDS runs OCR to recognize the first 3 elements of the field.

    If the OCR result is correct, the recipe will follow the same steps to recognize all fields when scraping.

  • OCR-English

    Recognizes the element content as English text.

  • OCR-Simple Chinese

    Recognizes the element content as Simplified Chinese text.

NOTE: The result of OCR extraction is affected by various factors, such as page layout and screen resolution. Preview the OCR result before relying on it in the recipe.
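To make the non-OCR built-in attributes more concrete, here is a minimal Python sketch of how such extraction rules could be expressed. It is not NDS's implementation. The Element class and every function name are hypothetical, the email pattern is a simplified one, and the background-image lookup assumes the image is declared in an inline style attribute (NDS may also read computed styles).

```python
import re
from dataclasses import dataclass, field
from posixpath import basename
from urllib.parse import unquote, urlparse


@dataclass
class Element:
    """Hypothetical stand-in for a selected page element."""
    tag: str                                   # lower-case tag name, e.g. "a"
    attrs: dict = field(default_factory=dict)  # HTML attributes
    text: str = ""  # inner text; for a select, the selected option's text


def default_value(el):
    """Default: an input yields its value, everything else its text."""
    if el.tag == "input":
        return el.attrs.get("value", "")
    return el.text


def link_url(el):
    """Link URL: only A elements carry an href to return."""
    return el.attrs.get("href") if el.tag == "a" else None


def image_url(el):
    """Image URL: src of an img, video, or iframe element."""
    return el.attrs.get("src") if el.tag in ("img", "video", "iframe") else None


def image_filename(el):
    """Image FileName: last path segment of an img src, if there is one."""
    src = el.attrs.get("src") if el.tag == "img" else None
    if not src:
        return None
    return unquote(basename(urlparse(src).path)) or None


# Assumption: the background image is set in the inline style attribute.
_BG_URL = re.compile(
    r"background(?:-image)?\s*:[^;]*?url\(\s*['\"]?([^'\")]+?)['\"]?\s*\)",
    re.IGNORECASE,
)


def css_background_image_url(el):
    """CSS background Image URL: the url(...) value of the background."""
    match = _BG_URL.search(el.attrs.get("style", ""))
    return match.group(1) if match else None


# Simplified pattern; real-world address syntax is broader than this.
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails(el):
    """Email: every address found in the element's text."""
    return _EMAIL.findall(el.text)
```

For instance, link_url(Element("span", {"href": "/about"})) returns None, which mirrors the case described under Link URL where an element looks clickable but is not an A element.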

Custom Attribute

In addition to the built-in attributes above, you can enter any attribute name of the selected element here to extract the corresponding value.
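The idea of a custom attribute can be illustrated outside NDS with Python's standard html.parser module. The sketch below collects the value of a named attribute from every matching tag. The AttributeCollector class, the custom_attribute function, and the data-price attribute are illustrative assumptions, not part of NDS.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collects the value of one attribute from every matching start tag."""

    def __init__(self, tag, attr_name):
        super().__init__()
        self.tag = tag
        self.attr_name = attr_name
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != self.tag:
            return
        for name, value in attrs:
            if name == self.attr_name:
                self.values.append(value)


def custom_attribute(html, tag, attr_name):
    """Return the attr_name values of all <tag> elements found in html."""
    collector = AttributeCollector(tag, attr_name)
    collector.feed(html)
    collector.close()
    return collector.values


page = (
    '<div class="item" data-price="19.99">Widget</div>'
    '<div class="item">No price</div>'
)
prices = custom_attribute(page, "div", "data-price")
```

Elements that lack the attribute contribute nothing to the result, so only the first div above yields a value.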

Once the field attribute is selected, click the preview icon next to the flow box, and NDS will show the current node's scraping result on the current web page.

Fixed Field Value

If you want to put a fixed value in a field instead of extracting content from an element, leave Element empty and enter the value directly in the 'Default' input box.
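The fixed-value rule can be summarized in a few lines of Python. This is only a sketch of the behavior described above, with a hypothetical resolve_field function and an extract callback standing in for NDS's extraction step. How NDS uses the 'Default' box when Element is not empty is not covered here, so the sketch simply ignores it in that case.

```python
from typing import Callable, Optional


def resolve_field(element: Optional[str], default: str,
                  extract: Callable[[str], str]) -> str:
    """Return the fixed value when no element is configured, else extract."""
    if not element:
        # Element left empty: the field always gets the fixed value.
        return default
    return extract(element)


# Hypothetical usage: a constant currency column next to an extracted price.
currency = resolve_field("", "USD", lambda selector: "unused")
price = resolve_field("span.price", "", lambda selector: "19.99")
```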

If you want to process the value extracted from the target element before saving it to the output table, refer to the "Data Transforming" section for more details.