
ScrapingAnt

scrapingant

ScrapingAnt is a web scraping API service that enables data extraction from websites through headless Chrome browsers, rotating proxies, CAPTCHA/Cloudflare bypass, LLM-ready markdown output, and AI-powered structured data extraction.

Actions: 5
Triggers: 0
Authentication
Managed OAuth: No
Technical information: the details of parameters, schemas, and triggers on this page are intended for integration teams. If you only need to know whether your favorite tool is available, the list of actions is enough.

Available actions (5)

Each action is an operation that the agent can run against this connector; its parameters are listed under it.

Extract Content as Markdown (action: SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN)

This tool extracts content from a given URL and converts it into Markdown. It is particularly useful for preparing text for large language models (LLMs) and retrieval-augmented generation (RAG) systems. It supports the GET, POST, PUT, and DELETE methods.

Input parameters

  • url (string, required, format: uri)

    The URL of the web page to scrape and convert to Markdown.

  • method (string, enum)

    HTTP method to use for the request.

    Allowed values: get, post, put, delete

  • browser (boolean)

    Enables the use of a headless browser for scraping. Default is true.

  • cookies (string)

    Cookies to include with the request.

  • js_snippet (string)

    Base64-encoded JavaScript to execute on the page after it loads.

  • proxy_type (string)

    Specifies the type of proxy to use.

  • proxy_country (string)

    Specifies the country for the proxy (e.g., US, GB).

  • block_resource (string[])

    List of resource types to block (e.g., image, script, stylesheet, font, media, websocket, other).

  • wait_for_selector (string)

    CSS selector to wait for before returning the result.

  • return_page_source (boolean)

    Returns the raw HTML as received from the server, without JavaScript rendering. Default is false.
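
Because js_snippet must be Base64-encoded, it can help to assemble the input arguments in code. The following is a minimal sketch, not part of the connector: build_markdown_args is a hypothetical helper, the URL and script are placeholders, and how the resulting dictionary is handed to the agent runtime is left out because this page does not describe it.

```python
import base64


def build_markdown_args(url, js_source=None, **options):
    """Assemble input arguments for SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN.

    Hypothetical helper: only the parameter names come from the list above.
    """
    args = {"url": url, **options}
    if js_source is not None:
        # js_snippet is documented as Base64-encoded JavaScript.
        encoded = base64.b64encode(js_source.encode("utf-8"))
        args["js_snippet"] = encoded.decode("ascii")
    return args


args = build_markdown_args(
    "https://example.com/article",  # placeholder URL
    js_source="window.scrollTo(0, document.body.scrollHeight);",
    method="get",
    wait_for_selector="article",
    block_resource=["image", "font"],
)
```

Keeping the plain JavaScript in the calling code and encoding it at the last moment makes the snippet easier to read and review than a hand-pasted Base64 string.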

Output parameters

  • data (object, required)

    Data from the action execution.

  • error (string)

    Error, if any, that occurred during the execution of the action.

  • successful (boolean, required)

    Whether the action execution was successful.
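
Every action on this page returns the same three output fields, so a caller can check them in one place. The sketch below assumes the result arrives as a Python dictionary with those fields; unwrap and ActionError are hypothetical names, and the sample dictionaries are hand-written placeholders rather than real responses.

```python
class ActionError(RuntimeError):
    """Raised when an action result reports successful == False."""


def unwrap(result):
    """Return result['data'], or raise ActionError with the reported error."""
    if not result.get("successful", False):
        message = result.get("error") or "action failed without an error message"
        raise ActionError(message)
    return result["data"]


# Hand-written sample results; the contents of "data" are placeholders.
data = unwrap({"data": {"placeholder": "value"}, "error": None, "successful": True})

try:
    unwrap({"data": {}, "error": "example failure", "successful": False})
except ActionError as exc:
    failure_message = str(exc)
```

Checking successful before touching data avoids treating an empty or partial data object as a valid result.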

Extract Data with AI (action: SCRAPINGANT_EXTRACT_DATA_WITH_AI)

This tool extracts structured data from a web page using ScrapingAnt's AI-powered extraction capabilities. You provide a URL and an AI query (prompt) describing the data you want, and the tool returns the extracted data in a structured format. It supports additional parameters for browser rendering, proxies, and cookies to handle dynamic content and localization.

Input parameters

  • url (string, required)

    The URL of the page to extract data from.

  • cookies (string)

    Cookies to use for the request (e.g., cookie1=value1; cookie2=value2).

  • proxy_type (string)

    Proxy type to use for the request (datacenter, residential).

  • return_text (boolean)

    Return the text content of the page (default: false).

  • proxy_country (string)

    Proxy country to use for the request (e.g., US, GB, DE).

  • enable_javascript (boolean)

    Enable browser rendering (default: true).

  • wait_for_selector (string)

    Wait for a specific selector to appear on the page before extracting data.

  • extract_properties (string, required)

    Free-form text describing the data you want to extract.
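
The cookies parameter takes a single string in the cookie1=value1; cookie2=value2 form, and extract_properties is required. A minimal sketch of preparing both is shown here; format_cookies and build_ai_extraction_args are hypothetical helpers, and the URL, cookie values, and query text are placeholders.

```python
def format_cookies(cookies):
    """Join name/value pairs as 'cookie1=value1; cookie2=value2'."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def build_ai_extraction_args(url, extract_properties, cookies=None, **options):
    """Assemble input arguments for SCRAPINGANT_EXTRACT_DATA_WITH_AI.

    Hypothetical helper: only the parameter names come from the list above.
    """
    if not extract_properties.strip():
        raise ValueError("extract_properties is required and must not be empty")
    args = {"url": url, "extract_properties": extract_properties, **options}
    if cookies:
        args["cookies"] = format_cookies(cookies)
    return args


args = build_ai_extraction_args(
    "https://example.com/product/42",  # placeholder URL
    "product title, price, and availability",
    cookies={"session": "abc123", "locale": "en-GB"},
    proxy_type="residential",
    proxy_country="GB",
)
```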

Output parameters

  • data (object, required)

    Data from the action execution.

  • error (string)

    Error, if any, that occurred during the execution of the action.

  • successful (boolean, required)

    Whether the action execution was successful.

Get API Credits Usage (action: SCRAPINGANT_GET_API_CREDITS_USAGE)

This tool retrieves the current API credit usage for the authenticated ScrapingAnt account. It lets users monitor their credit consumption and compare it against their subscription limits.

Input parameters

No parameters.

Output parameters

  • data (object, required)

    Data from the action execution.

  • error (string)

    Error, if any, that occurred during the execution of the action.

  • successful (boolean, required)

    Whether the action execution was successful.

Scrape Web Page (action: SCRAPINGANT_SCRAPE_WEB_PAGE)

This tool scrapes a web page using the ScrapingAnt API and fetches the HTML content of the specified URL. Users can customize the scraping behavior by enabling a headless browser, using proxies, waiting for specific elements, executing JavaScript, passing cookies, and blocking certain resources.

Input parameters

  • url (string, required, format: uri)

    URL of the web page to scrape.

  • browser (boolean)

    Uses a headless browser for scraping. Defaults to true. If false, JavaScript is not rendered.

  • cookies (string)

    Cookies to pass with the scraping request.

  • js_snippet (string)

    Base64-encoded JavaScript snippet to execute on the page. Requires the headless browser.

  • proxy_type (string, enum)

    Specifies the type of proxy to use.

    Allowed values: datacenter, residential

  • proxy_country (string)

    Specifies the country for the proxy.

  • block_resource (string[])

    List of resource types to block. Requires the headless browser.

  • wait_for_selector (string)

    CSS selector to wait for before returning the result. Requires the headless browser.

  • return_page_source (boolean)

    Returns the raw HTML from the server without JavaScript rendering. Requires the headless browser. Defaults to false.
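
Several of the parameters above are marked as requiring the headless browser, so a quick check before running the action can catch arguments that have no effect when browser is false. This is a sketch under that reading of the descriptions; check_scrape_args is a hypothetical helper, and the connector itself may handle such combinations differently.

```python
# Parameters whose descriptions above say "Requires the headless browser".
BROWSER_ONLY = ("js_snippet", "block_resource", "wait_for_selector", "return_page_source")


def check_scrape_args(args):
    """Return the browser-only parameters that are set while browser is false."""
    if args.get("browser", True):  # browser defaults to true
        return []
    return [name for name in BROWSER_ONLY if args.get(name)]


conflicts = check_scrape_args(
    {
        "url": "https://example.com",  # placeholder URL
        "browser": False,
        "wait_for_selector": "#main",
        "proxy_type": "datacenter",
    }
)
```

Here conflicts contains wait_for_selector, since waiting for a selector depends on the browser that was switched off.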

Output parameters

  • data (object, required)

    Data from the action execution.

  • error (string)

    Error, if any, that occurred during the execution of the action.

  • successful (boolean, required)

    Whether the action execution was successful.

Scrape with Extended JSON Output (action: SCRAPINGANT_SCRAPE_WITH_EXTENDED_JSON_OUTPUT)

This tool scrapes a target URL and returns an extended JSON response. It uses ScrapingAnt's /v2/extended endpoint, which provides richer information than the standard scraping tool, including page HTML, cookies, headers, and additional details.

Input parameters

  • url (string, required)

    The URL of the web page to scrape.

Output parameters

  • data (object, required)

    Data from the action execution.

  • error (string)

    Error, if any, that occurred during the execution of the action.

  • successful (boolean, required)

    Whether the action execution was successful.