Scrape Do
scrape_do
Scrape.do is a web scraping API offering rotating residential, datacenter, and mobile proxies with headless browser support and session management to bypass anti-bot protections (e.g., Cloudflare, Akamai) and extract data at scale in formats like JSON and HTML.
Available actions (19)
Each action is an operation that the agent can run against this connector.
Get Account Information
SCRAPE_DO_GET_ACCOUNT_INFO (Action)
Retrieves account information and usage statistics from Scrape.do. This action makes a GET request to the Scrape.do info endpoint to fetch:
- subscription status
- concurrent request limits and usage
- monthly request limits and remaining requests
- real-time usage statistics
Rate limit: maximum 10 requests per minute.
Input parameters
token (string, required): Authentication token for the Scrape.do API
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
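Every action on this page returns the same three output fields, so a caller can handle them in one place. The following is a minimal sketch, assuming the action result arrives as a plain dictionary; the names `ActionError` and `unwrap` are hypothetical helpers and not part of the connector.

```python
from typing import Any, Mapping


class ActionError(RuntimeError):
    """Raised when an action result reports failure (hypothetical helper)."""


def unwrap(result: Mapping[str, Any]) -> Any:
    """Return result['data'] when 'successful' is true, otherwise raise."""
    if not result.get("successful", False):
        # 'error' is optional in the output schema, so fall back to a
        # generic message when it is missing or empty.
        raise ActionError(result.get("error") or "action failed without an error message")
    return result["data"]
```

Wrapping each call in `unwrap(...)` keeps the success check out of the scraping logic itself.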
Get rendered page content
SCRAPE_DO_GET_RENDER_PAGE (Action)
This tool scrapes web pages with JavaScript rendering enabled. It is particularly useful for dynamic websites where content is loaded through JavaScript. The tool waits for the JavaScript to execute and returns the fully rendered HTML content.
Input parameters
url (string, required): The target web page URL to scrape
width (integer): Browser viewport width in pixels
device (string, enum: desktop, mobile, tablet): Device type to emulate
height (integer): Browser viewport height in pixels
render (boolean): Enable JavaScript rendering
timeout (integer): Maximum timeout in milliseconds (5000-120000)
waitUntil (string, enum: domcontentloaded, networkidle0, networkidle2, load): When to consider navigation complete
customWait (integer): Additional wait time in milliseconds after page load (0-35000)
waitSelector (string): CSS selector to wait for before returning content
customHeaders (boolean): Enable custom header forwarding
blockResources (boolean): Block CSS and image resources
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
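The ranges and enum values listed for this action can be checked on the client before a request is sent. The sketch below assumes the arguments are passed to the action as a dictionary keyed by the parameter names above; `build_render_args` is a hypothetical helper, and whether the service itself rejects out-of-range values is not stated here.

```python
from typing import Any, Dict, Optional

DEVICES = {"desktop", "mobile", "tablet"}
WAIT_UNTIL = {"domcontentloaded", "networkidle0", "networkidle2", "load"}


def _check_range(name: str, value: int, low: int, high: int) -> int:
    """Raise ValueError unless low <= value <= high."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value


def build_render_args(
    url: str,
    *,
    device: Optional[str] = None,
    timeout: Optional[int] = None,
    wait_until: Optional[str] = None,
    custom_wait: Optional[int] = None,
    wait_selector: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an argument dict for the render action, validating documented limits."""
    if not url:
        raise ValueError("url is required")
    args: Dict[str, Any] = {"url": url, "render": True}
    if device is not None:
        if device not in DEVICES:
            raise ValueError(f"device must be one of {sorted(DEVICES)}")
        args["device"] = device
    if timeout is not None:
        args["timeout"] = _check_range("timeout", timeout, 5000, 120000)
    if wait_until is not None:
        if wait_until not in WAIT_UNTIL:
            raise ValueError(f"waitUntil must be one of {sorted(WAIT_UNTIL)}")
        args["waitUntil"] = wait_until
    if custom_wait is not None:
        args["customWait"] = _check_range("customWait", custom_wait, 0, 35000)
    if wait_selector is not None:
        args["waitSelector"] = wait_selector
    return args
```

Options left as `None` are omitted, so the service defaults apply to anything the caller does not set.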
Scrape webpage using scrape.do
SCRAPE_DO_SCRAPE_DO_GET_PAGE (Action)
A tool to scrape web pages using Scrape.do's API service. It makes a basic GET request to fetch the content of a target webpage while handling anti-bot protections and proxy rotation automatically.
Input parameters
url (string, required): Target web page URL to scrape
super (boolean): Use residential and mobile proxy networks
width (integer): Browser viewport width
device (string): Device type (desktop, mobile, tablet)
height (integer): Browser viewport height
output (string): Output format (raw or markdown)
render (boolean): Enable headless browser rendering
timeout (integer): Maximum request timeout in ms (5000-120000)
geo_code (string): Country to route the request through (e.g. 'us', 'gb')
return_json (boolean): Return network requests in JSON format
set_cookies (string): Cookies to set for the target web page
extra_headers (boolean): Add or modify headers
retry_timeout (integer): Maximum retry timeout in ms (5000-55000)
custom_headers (boolean): Handle all request headers
block_resources (boolean): Block CSS and image resources
disable_redirection (boolean): Disable request redirection
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
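For readers who want to see what a basic GET request of this kind might look like underneath the connector, the sketch below builds a request URL with the target URL percent-encoded. It is illustrative only: the base URL `https://api.scrape.do/`, the helper `build_request_url`, and the query names used in the usage note (such as `geoCode`) are assumptions that are not confirmed on this page and should be checked against the official Scrape.do documentation.

```python
from typing import Any
from urllib.parse import urlencode

# Assumption: base endpoint of the HTTP API; not confirmed by this page.
API_BASE = "https://api.scrape.do/"


def build_request_url(token: str, target_url: str, **options: Any) -> str:
    """Return a GET URL carrying the token, the encoded target URL, and options."""
    params = {"token": token, "url": target_url}
    for key, value in options.items():
        if value is None:
            continue  # leave unset options to the service defaults
        if isinstance(value, bool):
            # Query strings carry text, so booleans become 'true' / 'false'.
            value = "true" if value else "false"
        params[key] = value
    # urlencode percent-encodes the target URL so its own '?' and '&'
    # are not mistaken for parameters of the outer request.
    return API_BASE + "?" + urlencode(params)
```

A call such as `build_request_url("TOKEN", "https://example.com/?q=1", render=True, geoCode="us")` keeps the target's own query string intact inside the `url` parameter.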
Use Scrape.do Proxy Mode
SCRAPE_DO_SCRAPE_DO_PROXY_MODE (Action)
This tool implements the proxy mode functionality of Scrape.do, which routes requests through its proxy server. It provides an alternative way to access web scraping capabilities, handling complex JavaScript-rendered pages, geolocation-based routing, device simulation, and built-in anti-bot and retry mechanisms.
Input parameters
url (string, required): The target URL to scrape
device (string): Device type to simulate (desktop, mobile, tablet)
render (boolean): Enable or disable JavaScript rendering
geo_code (string): Geographic location for the request (e.g., 'us', 'gb')
custom_headers (boolean): Whether to forward custom headers to the target website
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set Cookies for Scraping
SCRAPE_DO_SCRAPE_DO_SET_COOKIES (Action)
This tool sets specific cookies for scraping requests to a target website. It is useful for maintaining session state or authentication through cookies.
Input parameters
url (string, uri, required): Target web page URL where cookies will be set
cookies (string, required): Cookie string in the format 'name1=value1;name2=value2' (will be URL-encoded)
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
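The cookie string format can be produced from a dictionary, as in the short sketch below. Both helpers are hypothetical. The action is described as URL-encoding the string itself, so `encode_cookie_string` only illustrates what that step plausibly produces; the exact encoding applied by the action is an assumption.

```python
from typing import Mapping
from urllib.parse import quote


def build_cookie_string(cookies: Mapping[str, str]) -> str:
    """Join cookies as 'name1=value1;name2=value2' (the documented format)."""
    return ";".join(f"{name}={value}" for name, value in cookies.items())


def encode_cookie_string(cookie_string: str) -> str:
    """Percent-encode the whole cookie string, including '=' and ';'."""
    return quote(cookie_string, safe="")


cookie_string = build_cookie_string({"session": "abc123", "theme": "dark"})
encoded = encode_cookie_string(cookie_string)
```

Pass the unencoded `cookie_string` as the `cookies` parameter; encoding it a second time on the client would likely corrupt the values.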
Set Scrape.do Super Mode
SCRAPE_DO_SCRAPE_DO_SET_SUPER_MODE (Action)
This tool enables enhanced scraping through residential and mobile proxies, bypassing blocks and restrictions associated with datacenter IPs. When the 'super' parameter is set to true, it activates a mode that uses a network of residential IP addresses, which is particularly useful for bypassing strict anti-bot measures and for accessing websites that block datacenter IPs.
Input parameters
super_mode (boolean, required): Enable or disable Super Mode for enhanced scraping using residential and mobile proxies
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Block specific URLs during scraping
SCRAPE_DO_SET_BLOCK_URLS (Action)
This tool blocks specific URLs during the scraping process. It is particularly useful for blocking unwanted resources such as analytics scripts, advertisements, or any other URLs that might interfere with the scraping process or slow it down. Users specify the URL patterns to block, which gives granular control, improves scraping performance, and helps maintain privacy.
Input parameters
urls (string[], required): List of URL patterns to block during scraping. Can be full URLs or patterns.
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set custom headers for scrape.do request
SCRAPE_DO_SET_CUSTOM_HEADERS (Action)
A tool to send custom headers with Scrape.do requests. It allows simulating specific browser behaviors or adding authentication headers by controlling all headers sent to the target website.
Input parameters
url (string, required): Target web page URL to scrape
headers (object, required): Dictionary of custom headers to send with the request
custom_headers (boolean): Enable custom headers mode (default: True)
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set Custom Wait Time
SCRAPE_DO_SET_CUSTOM_WAIT (Action)
This tool sets the custom wait time in milliseconds after page load when using the render option in Scrape.do. It is particularly useful for dynamic content, ensuring that it is fully loaded before scraping, especially on JavaScript-heavy websites or single-page applications. The action gives fine-grained control over the rendering wait time and must be used with render=true.
Input parameters
custom_wait (integer): The time to wait in milliseconds after page load when using the render option (0-35000 ms)
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set Device Type for Scraping
SCRAPE_DO_SET_DEVICE_TYPE (Action)
This tool sets the device type (desktop, mobile, or tablet) for scraping requests. It emulates different devices, which helps in testing responsive designs or fetching device-specific content.
Input parameters
url (string, required): The target URL to scrape
device_type (string, required): The type of device to emulate for scraping requests
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set Disable Redirection
SCRAPE_DO_SET_DISABLE_REDIRECTION (Action)
Controls the automatic redirection behavior of Scrape.do requests. When enabled (disable_redirection=true), it prevents redirects from being followed automatically during web scraping operations. This allows:
- inspection of the redirect chain
- capturing intermediate redirect responses
- manual control of redirection flow
- analysis of HTTP status codes of redirect responses
The redirect URL will be available in the Scrape.do-Target-Redirected-Location response header.
Input parameters
disable_redirection (boolean): Whether to disable automatic following of redirects. When true, redirects are not followed and the redirect responses can be inspected.
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
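Reading the redirect target from the response headers can be sketched as follows. This assumes the response headers are available to the caller as a name-to-value mapping; how the action exposes them inside `data` is not documented here, and `redirected_location` is a hypothetical helper.

```python
from typing import Mapping, Optional

# Header name as given in the action description; HTTP header names are
# case-insensitive, so the lookup below ignores case.
REDIRECT_HEADER = "scrape.do-target-redirected-location"


def redirected_location(headers: Mapping[str, str]) -> Optional[str]:
    """Return the redirect target reported by the service, or None if absent."""
    wanted = REDIRECT_HEADER.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value
    return None
```

A caller that wants manual control of the redirection flow can issue the next request to the returned URL, or stop when the function returns `None`.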
Set Pure Cookies Mode
SCRAPE_DO_SET_PURE_COOKIES (Action)
This tool returns the original Set-Cookie headers from the target website instead of the processed Scrape.do-Cookies header format that is used by default.
Input parameters
pure_cookies (boolean, required): When enabled, returns the original Set-Cookie headers from the target website instead of the processed Scrape.do-Cookies format.
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set Regional Geolocation for Scraping
SCRAPE_DO_SET_REGIONAL_GEO_CODE (Action)
This tool sets broader geographical targeting by specifying a region code instead of a specific country code. This is useful when you want to scrape content from an entire region rather than a single country. Note that this feature requires Super Mode to be enabled and is only available on Business plan or higher subscriptions.
Input parameters
url (string, required): The target URL to scrape with the specified regional geo code
regional_geo_code (string, required): The region code to target for scraping requests
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set Retry Timeout
SCRAPE_DO_SET_RETRY_TIMEOUT (Action)
This tool sets the maximum wait time (in milliseconds) before retrying a failed request in Scrape.do. It takes a 'retry_timeout' parameter (integer) that specifies the maximum time to wait before retrying, with a default of 15000 ms. It is designed to improve the reliability of web scraping operations, especially when dealing with unstable or slow-responding websites.
Input parameters
retry_timeout (integer): The maximum time in milliseconds to wait before retrying a failed request (5000-55000 ms)
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set Screenshot Capture for Scraping
SCRAPE_DO_SET_SCREENSHOT (Action)
This tool enables the screenshot functionality of the Scrape.do API, allowing users to capture a visual representation of the scraped webpage. When enabled, the API returns a screenshot of the rendered page along with the regular response. Features:
- basic screenshot capture
- full page screenshot capture
- capture of a specific area using a CSS selector
Input parameters
url (string, required): The URL of the webpage to take a screenshot of
enabled (boolean): Whether to enable screenshot capture for scraping requests
selector (string): CSS selector to capture a specific area of the page
full_page (boolean): Whether to capture a full page screenshot
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set Session ID for Sticky Sessions
SCRAPE_DO_SET_SESSION_ID (Action)
This tool implements the session ID functionality of Scrape.do to maintain a sticky session with the same proxy IP across multiple requests. It does this by adding a sessionid parameter to the query parameters of any scraping request, which is crucial for session consistency when scraping websites with stringent session requirements.
Input parameters
session_id (integer, required): An integer between 0 and 1,000,000 used as the session identifier. The same session ID keeps the same proxy IP for up to 5 minutes of inactivity.
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
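The documented range and the 5-minute inactivity window can be tracked on the client side. The sketch below is pure bookkeeping under stated assumptions: the range is treated as inclusive, `StickySession` and `new_session_id` are hypothetical names, and the class only estimates whether the proxy IP is likely still pinned; it does not query the service.

```python
import random
import time
from typing import Callable, Optional

MAX_SESSION_ID = 1_000_000      # documented upper bound (assumed inclusive)
IDLE_LIMIT_SECONDS = 5 * 60     # documented inactivity window


def new_session_id() -> int:
    """Pick a random session ID inside the documented range."""
    return random.randint(0, MAX_SESSION_ID)


class StickySession:
    """Client-side record of when a session ID was last used."""

    def __init__(self, session_id: int, clock: Callable[[], float] = time.monotonic):
        if not 0 <= session_id <= MAX_SESSION_ID:
            raise ValueError(f"session_id must be between 0 and {MAX_SESSION_ID}")
        self.session_id = session_id
        self._clock = clock
        self._last_used: Optional[float] = None

    def touch(self) -> None:
        """Call after each request made with this session ID."""
        self._last_used = self._clock()

    def probably_same_ip(self) -> bool:
        """True if the last request was less than 5 minutes ago."""
        if self._last_used is None:
            return False
        return self._clock() - self._last_used < IDLE_LIMIT_SECONDS
```

When `probably_same_ip()` returns `False`, a multi-step flow such as a login followed by page fetches is safer restarted from its first step, since the proxy IP may have changed.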
Set Wait For Selector
SCRAPE_DO_SET_WAIT_FOR_SELECTOR (Action)
This action sets a CSS selector to wait for before the page load is considered complete. It is particularly useful when scraping JavaScript-heavy pages, to ensure that dynamically loaded elements are present.
Input parameters
timeout (integer): Maximum time to wait in milliseconds (1000-35000)
selector (string, required): CSS selector to wait for in the target web page
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Set Wait Until Condition
SCRAPE_DO_SET_WAIT_UNTIL (Action)
This tool sets the waitUntil parameter of the Scrape.do API, which defines when the renderer considers the page loaded during JavaScript execution. It is particularly useful for handling dynamic websites by specifying conditions such as 'domcontentloaded', 'networkidle0', or 'networkidle2'.
Input parameters
wait_until (string, required, enum: domcontentloaded, networkidle0, networkidle2): The condition that determines when the page is considered loaded.
- 'domcontentloaded': waits for the DOMContentLoaded event.
- 'networkidle0': waits until there are no network connections for 500 ms.
- 'networkidle2': waits until there are at most 2 network connections for 500 ms.
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
Monitor WebSocket requests using scrape.do
SCRAPE_DO_SHOW_WEBSOCKET_REQUESTS (Action)
This tool shows the WebSocket requests made by a webpage. It requires the render=true and returnjson=true parameters along with showwebsocketrequests=true to enable logging of WebSocket requests.
Input parameters
url (string, required): Target web page URL whose WebSocket requests are monitored
timeout (integer): Maximum request timeout in ms (5000-120000)
session_id (string): Optional session ID for maintaining state across requests
Output parameters
data (object, required): Data from the action execution
error (string): Error if any occurred during the execution of the action
successful (boolean, required): Whether the action execution was successful
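Several options on this page only take effect together with another option: custom wait needs rendering, regional geo codes need Super Mode, and WebSocket logging needs both rendering and JSON output. A small pre-flight check can catch a missing prerequisite before a request is spent. The sketch below is illustrative; the rule table, the helper name, and the key `show_websocket_requests` are hypothetical, while the other keys reuse parameter names listed above.

```python
from typing import Any, Dict, List, Mapping, Tuple

# option -> options that must also be enabled, per the descriptions above
RULES: Dict[str, Tuple[str, ...]] = {
    "custom_wait": ("render",),
    "regional_geo_code": ("super",),
    "show_websocket_requests": ("render", "return_json"),  # hypothetical key
}


def missing_prerequisites(params: Mapping[str, Any]) -> Dict[str, List[str]]:
    """Map each requested option to the prerequisites that are not enabled."""
    missing: Dict[str, List[str]] = {}
    for option, required in RULES.items():
        value = params.get(option)
        if value is None or value is False:
            continue  # option not requested, nothing to check
        absent = [name for name in required if not params.get(name)]
        if absent:
            missing[option] = absent
    return missing
```

An empty result means the combination is consistent with the documented requirements; plan-level restrictions, such as the Business plan requirement for regional geo codes, cannot be checked this way.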