This is how I do it. I send the URLs I want scraped to Urlbox[0]; it renders the pages and saves the HTML (plus a screenshot and metadata) to my S3 bucket[1]. I get a webhook[2] when it’s ready for me to process. I prefer to use Ruby, so Nokogiri[3] is the tool I use for the scraping step. This has been particularly useful when I’ve wanted to scrape some pages live from a web app and don’t want to manage running Puppeteer or…
– Source: Hacker News / 7 months ago
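The flow described above — render via Urlbox, persist to S3, receive a webhook, then parse with Nokogiri — could look roughly like the sketch below. It assumes a Sinatra endpoint and the aws-sdk-s3 gem; the webhook payload fields (`result.htmlKey`) and the bucket name are placeholder assumptions for illustration, not the documented Urlbox schema.

```ruby
# Minimal sketch: receive the "render ready" webhook, pull the saved HTML
# from S3, and run the actual scraping step with Nokogiri.
require 'sinatra'
require 'json'
require 'aws-sdk-s3'
require 'nokogiri'

S3 = Aws::S3::Client.new(region: 'us-east-1')

post '/urlbox-webhook' do
  payload = JSON.parse(request.body.read)

  # Assumed payload shape: the S3 key where the rendered HTML was stored.
  html_key = payload.dig('result', 'htmlKey')
  halt 400, 'missing html key' unless html_key

  html = S3.get_object(bucket: 'my-scrape-bucket', key: html_key).body.read

  # Scraping step: Nokogiri over the fully rendered HTML.
  doc   = Nokogiri::HTML(html)
  title = doc.at_css('title')&.text
  links = doc.css('a').map { |a| a['href'] }.compact

  # Hand off for whatever processing comes next (store, enqueue, etc.).
  puts "#{title}: #{links.size} links"
  status 200
end
```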
Hi there, I run urlbox.io, which is a screenshot API that allows clicking elements, waiting for elements, injecting custom JS/CSS etc.
– Source: Hacker News / almost 2 years ago
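To make the features mentioned in that comment concrete, here is a minimal sketch of a render request using only Ruby's standard library. The endpoint path and the option names (`click`, `wait_for`, `js`) are assumptions for illustration; the real parameter names are in the Urlbox documentation.

```ruby
require 'net/http'
require 'uri'

API_KEY = ENV.fetch('URLBOX_API_KEY')

params = {
  url:      'https://example.com',
  click:    '#accept-cookies',     # assumed name: click an element before capture
  wait_for: '.results-loaded',     # assumed name: wait for a selector to appear
  js:       'document.body.classList.add("scrape-mode")' # assumed name: inject custom JS
}

# Assumed endpoint shape: render the page and return/store the HTML.
uri = URI("https://api.urlbox.io/v1/#{API_KEY}/html")
uri.query = URI.encode_www_form(params)

response = Net::HTTP.get_response(uri)
puts response.code
```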