Scrapingant
scrapingant
ScrapingAnt is a web scraping API service that extracts data from websites using headless Chrome browsers, rotating proxies, CAPTCHA and Cloudflare bypass, LLM-ready Markdown output, and AI-powered structured data extraction.
Available actions (5)
Each action is an operation that the agent can run against this connector.
Extract Content as Markdown
SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN (action)
This tool extracts content from a given URL and converts it to Markdown. It is particularly useful for preparing text for large language models (LLMs) and retrieval-augmented generation (RAG) systems. It supports the GET, POST, PUT, and DELETE methods.
Input parameters
url (string, required, uri): The URL of the web page to scrape and convert to Markdown.
method (string, enum: get, post, put, delete): HTTP method to use for the request.
browser (boolean): Enables a headless browser for scraping. Default is true.
cookies (string): Cookies to include with the request.
js_snippet (string): Base64-encoded JavaScript to execute on the page after it loads.
proxy_type (string): The type of proxy to use.
proxy_country (string): The country for the proxy (e.g., US, GB).
block_resource (string[]): List of resource types to block (e.g., image, script, stylesheet, font, media, websocket, other).
wait_for_selector (string): CSS selector to wait for before returning the result.
return_page_source (boolean): Returns the raw HTML as received from the server, without JavaScript rendering. Default is false.
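Because js_snippet expects Base64-encoded JavaScript, it can help to encode the script at the point where the arguments are assembled. The following is a minimal sketch in Python, not the connector's own client: the helper name build_markdown_args, the example URL, and the example selector are hypothetical, and only the parameter names come from the list above. How the arguments are handed to the agent depends on your client and is not shown.

```python
import base64


def build_markdown_args(url, js_source=None, **options):
    """Assemble input arguments for SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN.

    Hypothetical helper: only the parameter names are taken from the
    documented input parameters.
    """
    args = {"url": url}
    if js_source is not None:
        # js_snippet must be Base64-encoded, so encode the UTF-8 bytes
        # and store the result as an ASCII string.
        encoded = base64.b64encode(js_source.encode("utf-8"))
        args["js_snippet"] = encoded.decode("ascii")
    # Any remaining documented parameters are passed through unchanged.
    args.update(options)
    return args


args = build_markdown_args(
    "https://example.com/article",  # placeholder URL
    js_source="window.scrollTo(0, document.body.scrollHeight);",
    method="get",
    wait_for_selector="article",  # placeholder selector
    block_resource=["image", "font"],
)
print(args)
```

Keeping the encoding inside one helper means callers can write plain JavaScript and never pass an unencoded snippet by accident.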
Output parameters
data (object, required): Data from the action execution.
error (string): Error message, if any error occurred during the execution of the action.
successful (boolean, required): Whether the action execution was successful.
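Every action on this page returns the same three output fields, so a caller can check successful before touching data. The sketch below assumes the result arrives as a Python dict with exactly those keys; the function name unwrap_result and the sample payloads are hypothetical, and the contents of data are not documented here, so a placeholder is used.

```python
def unwrap_result(result):
    """Return result["data"], or raise if the action reported a failure."""
    if not result.get("successful", False):
        # error is optional, so fall back to a generic message.
        raise RuntimeError(result.get("error") or "action failed without an error message")
    return result["data"]


# Sample envelopes; the shape of "data" is a placeholder, not documented.
ok = {"successful": True, "data": {"placeholder": "value"}, "error": None}
failed = {"successful": False, "data": {}, "error": "timeout"}

print(unwrap_result(ok))
try:
    unwrap_result(failed)
except RuntimeError as exc:
    print("action failed:", exc)
```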
Extract Data with AI
SCRAPINGANT_EXTRACT_DATA_WITH_AI (action)
This tool extracts structured data from a web page using ScrapingAnt's AI-powered extraction capabilities. You provide a URL and an AI query (prompt) describing the data you want, and the tool returns the extracted data in a structured format. Additional parameters for browser rendering, proxies, and cookies help handle dynamic content and localization.
Input parameters
url (string, required): The URL of the page to extract data from.
cookies (string): Cookies to use for the request (e.g., cookie1=value1; cookie2=value2).
proxy_type (string): Proxy type to use for the request (datacenter, residential).
return_text (boolean): Return the text content of the page. Default is false.
proxy_country (string): Proxy country to use for the request (e.g., US, GB, DE).
enable_javascript (boolean): Enable browser rendering. Default is true.
wait_for_selector (string): Wait for a specific selector to appear on the page before extracting data.
extract_properties (string, required): Free-form text describing the data you want to extract.
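The cookies parameter takes a single string in the cookie1=value1; cookie2=value2 form, and both url and extract_properties are required. A minimal Python sketch of preparing these arguments follows; the helper names format_cookies and build_ai_extract_args, the example URL, and the example prompt are hypothetical and only illustrate the documented parameter shapes.

```python
def format_cookies(cookies):
    """Join a dict into the 'name=value; name2=value2' form shown above."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def build_ai_extract_args(url, extract_properties, cookies=None, **options):
    """Assemble input arguments for SCRAPINGANT_EXTRACT_DATA_WITH_AI."""
    if not url or not extract_properties:
        # Both fields are marked as required in the parameter list.
        raise ValueError("url and extract_properties are required")
    args = {"url": url, "extract_properties": extract_properties}
    if cookies:
        args["cookies"] = format_cookies(cookies)
    # Optional documented parameters (proxy_type, proxy_country, ...) pass through.
    args.update(options)
    return args


args = build_ai_extract_args(
    "https://example.com/products/1",  # placeholder URL
    "product title, price, and availability",  # placeholder prompt
    cookies={"cookie1": "value1", "cookie2": "value2"},
    proxy_type="residential",
    proxy_country="DE",
)
print(args)
```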
Output parameters
data (object, required): Data from the action execution.
error (string): Error message, if any error occurred during the execution of the action.
successful (boolean, required): Whether the action execution was successful.
Get API Credits Usage
SCRAPINGANT_GET_API_CREDITS_USAGE (action)
This tool retrieves the current API credit usage for the authenticated ScrapingAnt account. It lets users monitor their credit consumption and compare current usage against their subscription limits.
Input parameters
No parameters.
Output parameters
data (object, required): Data from the action execution.
error (string): Error message, if any error occurred during the execution of the action.
successful (boolean, required): Whether the action execution was successful.
Scrape Web Page
SCRAPINGANT_SCRAPE_WEB_PAGE (action)
This tool scrapes a web page using the ScrapingAnt API and fetches the HTML content of the specified URL. Users can customize the scraping behavior by enabling a headless browser, using proxies, waiting for specific elements, executing JavaScript, passing cookies, and blocking certain resources.
Input parameters
url (string, required, uri): URL of the web page to scrape.
browser (boolean): Enables a headless browser for scraping. Defaults to true. If false, JavaScript is not rendered.
cookies (string): Cookies to pass with the scraping request.
js_snippet (string): Base64-encoded JavaScript snippet to execute on the page. Requires headless browser.
proxy_type (string, enum: datacenter, residential): The type of proxy to use.
proxy_country (string): The country for the proxy.
block_resource (string[]): List of resource types to block. Requires headless browser.
wait_for_selector (string): CSS selector to wait for before returning the result. Requires headless browser.
return_page_source (boolean): Returns the raw HTML from the server without JavaScript rendering. Requires headless browser. Defaults to false.
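Several of the parameters above are marked as requiring the headless browser, so setting them together with browser set to false is likely a mistake. The sketch below is a hypothetical client-side check in Python, not part of the connector; the names BROWSER_ONLY and check_scrape_args are assumptions, and the list of browser-only options is copied from the descriptions above.

```python
# Options documented above as "Requires headless browser".
BROWSER_ONLY = ("js_snippet", "block_resource", "wait_for_selector", "return_page_source")


def check_scrape_args(args):
    """Return the browser-only options that are set while browser is False."""
    # browser defaults to true when it is omitted.
    if args.get("browser", True):
        return []
    return [name for name in BROWSER_ONLY if args.get(name)]


conflicting = check_scrape_args({
    "url": "https://example.com",  # placeholder URL
    "browser": False,
    "wait_for_selector": "#main",  # placeholder selector
    "proxy_type": "datacenter",
})
print(conflicting)
```

Returning the list of conflicting names, rather than raising, leaves it to the caller to decide whether to drop those options or turn the browser back on.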
Output parameters
data (object, required): Data from the action execution.
error (string): Error message, if any error occurred during the execution of the action.
successful (boolean, required): Whether the action execution was successful.
Scrape with Extended JSON Output
SCRAPINGANT_SCRAPE_WITH_EXTENDED_JSON_OUTPUT (action)
This tool scrapes a target URL and returns an extended JSON response. It uses ScrapingAnt's /v2/extended endpoint, which provides richer information than the standard scraping tool, including page HTML, cookies, headers, and additional details.
Input parameters
url (string, required): The URL of the web page to scrape.
Output parameters
data (object, required): Data from the action execution.
error (string): Error message, if any error occurred during the execution of the action.
successful (boolean, required): Whether the action execution was successful.