Webscraper io

webscraper_io

WebScraper.IO is a web scraping tool that makes web data extraction easy and accessible for everyone through a cloud-based API.

Actions: 10
Triggers: 0
Authentication
Managed OAuth: No

Technical information: the parameter, schema, and trigger details on this page are intended for integration teams. If you only need to know whether your favorite tool is available, the list of actions is enough.

Available actions (10)

Each action is an operation that the agent can run against this connector. Click an action to view its parameters.

Create Sitemap (WEBSCRAPER_IO_CREATE_SITEMAP, Action)

Tool to create a new sitemap configuration for web scraping. Use when you need to define a new scraping structure with start URLs and selector rules for data extraction from a website.

Input parameters

  • startUrl (string[], required)

    Array of starting URLs where scraping begins. At least one URL is required.

  • selectors (object[], required)

    Array of selector objects defining data extraction rules. At least one selector is required.

  • sitemap_id (string, required)

    Unique identifier for the sitemap. Must be alphanumeric with hyphens (e.g., 'webscraper-io-landing').

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful
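
As a rough illustration of the input described above, the following Python sketch assembles and checks the arguments for WEBSCRAPER_IO_CREATE_SITEMAP. The helper name `build_create_sitemap_args` is hypothetical, and the selector fields (`id`, `type`, `parentSelectors`, `selector`, `multiple`) are an assumption based on the usual Web Scraper sitemap format; this page only states that selectors are objects.

```python
import re

# Assumption: "alphanumeric with hyphens" is read as letters, digits, and hyphens only.
_SITEMAP_ID_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def build_create_sitemap_args(sitemap_id, start_urls, selectors):
    """Return an input dict for WEBSCRAPER_IO_CREATE_SITEMAP (hypothetical helper)."""
    if not _SITEMAP_ID_PATTERN.fullmatch(sitemap_id):
        raise ValueError("sitemap_id must contain only letters, digits, and hyphens")
    if not start_urls:
        raise ValueError("startUrl needs at least one URL")
    if not selectors:
        raise ValueError("selectors needs at least one selector object")
    return {
        "sitemap_id": sitemap_id,
        "startUrl": list(start_urls),
        "selectors": list(selectors),
    }


# Assumed selector shape; the selector fields are not documented on this page.
title_selector = {
    "id": "title",
    "type": "SelectorText",
    "parentSelectors": ["_root"],
    "selector": "h1",
    "multiple": False,
}

create_args = build_create_sitemap_args(
    "webscraper-io-landing",
    ["https://webscraper.io/"],
    [title_selector],
)
```

Validating before the call keeps obviously malformed input (an empty URL list, an identifier with spaces) from reaching the connector at all.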

Delete Sitemap (WEBSCRAPER_IO_DELETE_SITEMAP, Action)

Tool to permanently delete a sitemap configuration from a Web Scraper Cloud account. Use when you need to remove a sitemap that is no longer needed.

Input parameters

  • sitemap_id (integer, required)

    The unique identifier of the sitemap to delete

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful

Disable Sitemap Scheduler (WEBSCRAPER_IO_DISABLE_SITEMAP_SCHEDULER, Action)

Tool to disable automatic scheduling for a sitemap. Use when you need to stop automated scraping jobs from running on a schedule.

Input parameters

  • sitemap_id (integer, required)

    The unique identifier of the sitemap whose scheduler should be disabled

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful

Enable Sitemap Scheduler (WEBSCRAPER_IO_ENABLE_SITEMAP_SCHEDULER, Action)

Tool to enable and configure automatic scheduling for sitemap scraping jobs. Use when you need scraping jobs to run at specific times, defined by a cron expression, with customizable request intervals, page load delays, driver types, and proxy settings.

Input parameters

  • proxy (any)

    Proxy configuration. Use the format 'datacenter-{country_code}' (e.g., 'datacenter-us') or 'residential-{country_code}' (e.g., 'residential-us'), 0 for no proxy, 1 to use a proxy, or a proxy ID for Scale plan users.

  • driver (string, required, enum)

    Scraper driver type. 'fast' does not execute JavaScript and extracts from raw HTML. 'fulljs' is the full driver with JavaScript execution.

    Allowed values: fast, fulljs

  • cron_day (string, required)

    Day-of-month field of the cron expression. Use '*' for any day, '1-31' for a range, or specific values.

  • cron_hour (string, required)

    Hour field of the cron expression. Use '*' for any hour, '0-23' for a range, or specific values like '9,17'.

  • cron_month (string, required)

    Month field of the cron expression. Use '*' for any month, '1-12' for a range, or specific values.

  • sitemap_id (integer, required)

    The unique identifier of the sitemap to enable scheduling for

  • cron_minute (string, required)

    Minute field of the cron expression. Use '*' for any minute, '*/10' for every 10 minutes, or specific values like '0,15,30,45'.

  • cron_weekday (string, required)

    Day-of-week field of the cron expression. Use '*' for any weekday, '0-6' for a range (0=Sunday), or specific values.

  • cron_timezone (string, required)

    Timezone for the cron schedule in tz database format (e.g., 'Europe/Riga', 'America/New_York', 'Asia/Tokyo').

  • page_load_delay (integer, required)

    Time in milliseconds that the scraper waits for the page to load before extracting data. Default is 2000 ms (2 seconds).

  • request_interval (integer, required)

    Delay in milliseconds between page requests during scraping. Default is 2000 ms (2 seconds).

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful
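
The scheduler input splits one cron expression across five string fields, which is easy to get wrong. The Python sketch below shows one way the arguments might be assembled and then read back as a conventional five-field cron line. The helper names, the default values for the cron fields, and the sitemap ID 123 are all hypothetical; only the parameter names, the two driver values, and the 2000 ms defaults come from this page.

```python
ALLOWED_DRIVERS = {"fast", "fulljs"}
CRON_FIELDS = ("cron_minute", "cron_hour", "cron_day", "cron_month", "cron_weekday")


def build_enable_scheduler_args(
    sitemap_id,
    cron_timezone,
    cron_minute="0",   # hypothetical default: on the hour
    cron_hour="*",
    cron_day="*",
    cron_month="*",
    cron_weekday="*",
    driver="fast",
    request_interval=2000,  # milliseconds, default stated on this page
    page_load_delay=2000,   # milliseconds, default stated on this page
    proxy=None,             # None means "leave the optional proxy field out"
):
    """Return an input dict for WEBSCRAPER_IO_ENABLE_SITEMAP_SCHEDULER (hypothetical helper)."""
    if driver not in ALLOWED_DRIVERS:
        raise ValueError(f"driver must be one of {sorted(ALLOWED_DRIVERS)}")
    if request_interval < 0 or page_load_delay < 0:
        raise ValueError("intervals are in milliseconds and cannot be negative")
    args = {
        "sitemap_id": sitemap_id,
        "cron_minute": cron_minute,
        "cron_hour": cron_hour,
        "cron_day": cron_day,
        "cron_month": cron_month,
        "cron_weekday": cron_weekday,
        "cron_timezone": cron_timezone,
        "driver": driver,
        "request_interval": request_interval,
        "page_load_delay": page_load_delay,
    }
    # Compare against None so that proxy=0 ("no proxy") is still sent.
    if proxy is not None:
        args["proxy"] = proxy
    return args


def cron_expression(args):
    """Join the five cron fields in minute, hour, day, month, weekday order."""
    return " ".join(args[field] for field in CRON_FIELDS)


# Run at 09:00 and 17:00 Riga time every day, with JavaScript execution.
scheduler_args = build_enable_scheduler_args(
    sitemap_id=123,
    cron_timezone="Europe/Riga",
    cron_minute="0",
    cron_hour="9,17",
    driver="fulljs",
    proxy="datacenter-us",
)
schedule_line = cron_expression(scheduler_args)  # "0 9,17 * * *"
```

Printing `schedule_line` next to `cron_timezone` before enabling the scheduler is a cheap way to confirm the schedule reads the way it was intended.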

Get Account Info (WEBSCRAPER_IO_GET_ACCOUNT_INFO, Action)

Tool to retrieve account information including email and page credits. Use when you need to check account details or available credits.

Input parameters

No parameters.

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful
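
Every action on this page returns the same three output fields, so a caller may want a single place to check them. The following Python sketch shows one possible way to do that. `ActionError` and `unwrap_result` are hypothetical names, and the sample results are made up for illustration; the keys inside `data` are not documented here, so the sketch does not read any of them.

```python
class ActionError(RuntimeError):
    """Raised when an action result reports failure (hypothetical exception type)."""


def unwrap_result(result):
    """Return result["data"] if the action succeeded, otherwise raise ActionError."""
    if not result.get("successful", False):
        # "error" is optional, so fall back to a generic message when it is missing.
        raise ActionError(result.get("error") or "action failed without an error message")
    return result["data"]


# Made-up result envelopes; real "data" contents depend on the action.
ok_result = {"data": {}, "error": None, "successful": True}
failed_result = {"data": {}, "error": "Unauthorized", "successful": False}

account_data = unwrap_result(ok_result)

try:
    unwrap_result(failed_result)
except ActionError as exc:
    failure_message = str(exc)
```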

Get Scraping Jobs (WEBSCRAPER_IO_GET_SCRAPING_JOBS, Action)

Tool to retrieve all scraping jobs for the account with optional filtering and pagination. Use when you need to list scraping jobs, check job status, or filter jobs by sitemap or tag.

Input parameters

  • tag (string)

    Filter jobs by tag name. Use to retrieve jobs with a specific tag.

  • page (integer)

    Page number for pagination. Use to retrieve a specific page of results.

  • sitemap_id (integer)

    Filter jobs by sitemap ID. Use to retrieve jobs for a particular sitemap.

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful
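
Since all three inputs of this action are optional, a caller would typically send only the filters that are actually set. The Python sketch below illustrates that idea with a hypothetical helper, `build_get_scraping_jobs_args`. It assumes page numbers start at 1, which this page does not state outright, and the sitemap ID 123 is a placeholder.

```python
def build_get_scraping_jobs_args(sitemap_id=None, tag=None, page=None):
    """Return an input dict for WEBSCRAPER_IO_GET_SCRAPING_JOBS with unset filters omitted."""
    args = {}
    if sitemap_id is not None:
        args["sitemap_id"] = int(sitemap_id)
    if tag is not None:
        args["tag"] = tag
    if page is not None:
        # Assumption: pages are numbered from 1.
        if page < 1:
            raise ValueError("page is assumed to start at 1")
        args["page"] = page
    return args


# Second page of jobs for one sitemap, with no tag filter.
jobs_args = build_get_scraping_jobs_args(sitemap_id=123, page=2)
```

Omitting unset keys, rather than sending them as null, avoids depending on how the connector treats empty filter values.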

Get Sitemap (WEBSCRAPER_IO_GET_SITEMAP, Action)

Tool to retrieve a specific sitemap configuration by ID. Use when you need to inspect or reference an existing sitemap's configuration.

Input parameters

  • sitemap_id (integer, required)

    The numeric identifier of the sitemap to retrieve

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful

Get Sitemaps (WEBSCRAPER_IO_GET_SITEMAPS, Action)

Tool to retrieve all sitemaps for the authenticated account. Use when you need to list available sitemaps or filter them by tag. Supports optional pagination via the page parameter and filtering by tag name.

Input parameters

  • tag (string)

    Filter sitemaps by tag name to retrieve only sitemaps with a specific tag.

  • page (integer)

    Page number for pagination (e.g., 2 for the second page).

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful

Get Sitemap Scheduler (WEBSCRAPER_IO_GET_SITEMAP_SCHEDULER, Action)

Tool to retrieve the scheduler configuration for a sitemap. Use when you need to check scheduling settings, including cron configuration and proxy settings.

Input parameters

  • sitemap_id (integer, required)

    The unique identifier of the sitemap

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful

Update Sitemap (WEBSCRAPER_IO_UPDATE_SITEMAP, Action)

Tool to update an existing sitemap configuration, including its structure, URLs, and selectors. Use when you need to modify sitemap settings.

Input parameters

  • _id (string, required)

    Internal identifier for the sitemap

  • startUrl (string[], required)

    Array of URLs where scraping begins

  • selectors (object[], required)

    Array of selector objects defining data extraction rules

  • sitemap_id (integer, required)

    The unique identifier of the sitemap to update

Output parameters

  • data (object, required)

    Data from the action execution

  • error (string)

    Error, if any occurred during the execution of the action

  • successful (boolean, required)

    Whether the action execution was successful