Web Scraping with AWS Lambda

  • We tried running our Lambda in the same location as the site we were trying to scrape; it didn't help.
  • We tried rotating cookies; that didn't improve things much either.
  • The cause appears to be warm starts. It looks like the outgoing IP changes less frequently than we need, so if I fire 100 parallel requests, there is a high chance that all 100 come from the same IP.
  • Try forcing a cold start by adding adequate sleep time between invocations. There is no fixed idle time after which a cold start occurs; it can vary from 5 to 60 minutes. This won't solve the problem entirely, but it does let us change the IP after a small interval.
  • Another approach is to deploy your scraper as n different Lambda functions and rotate between them.
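
The last idea, rotating across n Lambda functions, could be sketched roughly as below. This is only a minimal illustration under stated assumptions, not the setup used above: the function names, the `{"url": ...}` payload shape, and the helpers `make_rotator`, `invoke_with_boto3`, and `scrape_all` are all hypothetical, and the premise that separate functions give separate outgoing IPs comes from the notes above rather than from anything this code guarantees.

```python
import itertools
import json
from typing import Callable, Iterable, Iterator, List, Tuple


def make_rotator(function_names: Iterable[str]) -> Iterator[str]:
    """Cycle endlessly over the given Lambda function names (round-robin)."""
    names = list(function_names)
    if not names:
        raise ValueError("at least one function name is required")
    return itertools.cycle(names)


def invoke_with_boto3(function_name: str, payload: dict) -> dict:
    """Synchronously invoke one Lambda function and decode its JSON response."""
    import boto3  # imported lazily so the rotation logic runs without boto3

    client = boto3.client("lambda")
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())


def scrape_all(
    urls: Iterable[str],
    function_names: Iterable[str],
    invoke: Callable[[str, dict], dict] = invoke_with_boto3,
) -> List[Tuple[str, dict]]:
    """Send each URL to the next function in the rotation.

    Returns (function_name, result) pairs in the order the URLs were given.
    The payload shape {"url": ...} is an assumption about the scraper.
    """
    rotator = make_rotator(function_names)
    results = []
    for url in urls:
        name = next(rotator)
        results.append((name, invoke(name, {"url": url})))
    return results
```

With AWS credentials configured, a call might look like `scrape_all(urls, ["scraper-1", "scraper-2", "scraper-3"])`, where the three names stand in for whatever copies of the scraper you have deployed. The `invoke` parameter is there so the rotation can be tried locally with a stub before any real Lambda is called.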

reena .m
