Deduplication

Data deduplication filters out duplicate records that may be encountered during scraping.

When creating a new output table, you can select one or more fields as the primary key to filter out duplicates. In the screenshot, the 'name' and 'link' fields are selected as the primary key.
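To illustrate how a multi-field primary key filters duplicates, here is a minimal Python sketch. It is not NDS code: the `dedupe` function, the sample rows, and the choice to keep the first row seen for each key are all assumptions made for illustration.

```python
def dedupe(rows, key_fields):
    """Keep only the first row seen for each combination of key field values.

    Hypothetical helper for illustration; NDS may resolve duplicates differently.
    """
    seen = set()
    unique = []
    for row in rows:
        # The key is the tuple of values in the selected fields, e.g. (name, link).
        key = tuple(row.get(field) for field in key_fields)
        if key in seen:
            continue  # same key already stored, so this row is a duplicate
        seen.add(key)
        unique.append(row)
    return unique


# Made-up sample data: the first two rows share both 'name' and 'link'.
rows = [
    {"name": "Widget", "link": "https://example.com/a", "price": "10"},
    {"name": "Widget", "link": "https://example.com/a", "price": "12"},
    {"name": "Widget", "link": "https://example.com/b", "price": "10"},
]
unique_rows = dedupe(rows, ("name", "link"))
```

In this sketch the second row is dropped because its 'name' and 'link' both match the first row, while the third row is kept because its 'link' differs. Fields outside the primary key, such as 'price' here, play no part in the comparison.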

In NDS, each data table can receive scraped data from more than one recipe. The configured unique field(s) act as a filter for every recipe that saves data to the table.
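The shared filter can be pictured with the following Python sketch. It is a conceptual model only, assuming the unique fields belong to the table rather than to any one recipe; the `OutputTable` class, its `save` method, and the recipe names are hypothetical and do not reflect how NDS is implemented.

```python
class OutputTable:
    """Hypothetical model of an output table whose unique fields are set at creation."""

    def __init__(self, name, unique_fields):
        self.name = name
        self.unique_fields = tuple(unique_fields)  # chosen once, when the table is created
        self.rows = []
        self._seen = set()

    def save(self, recipe_name, row):
        """Store the row unless its key already exists; return True if it was stored."""
        key = tuple(row.get(field) for field in self.unique_fields)
        if key in self._seen:
            return False  # duplicate, no matter which recipe produced it
        self._seen.add(key)
        # Record the source recipe only to make the example easier to inspect.
        self.rows.append({**row, "_recipe": recipe_name})
        return True


table = OutputTable("products", ["name", "link"])
first = table.save("recipe_a", {"name": "Widget", "link": "https://example.com/a"})
second = table.save("recipe_b", {"name": "Widget", "link": "https://example.com/a"})
third = table.save("recipe_b", {"name": "Gadget", "link": "https://example.com/g"})
```

Here the row from `recipe_b` that repeats the 'name' and 'link' already saved by `recipe_a` is rejected, because the uniqueness check is applied at the table level.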

Once unique field(s) have been configured for a data table, they cannot be edited. To use different unique field(s), change the 'Save to' name to create a new table, then set a new primary key for that table in the recipe saving dialog.

Note: changing the output data table name when starting does not affect the unique fields. The old unique fields (if any) are applied to the new data table automatically.