Your first encounter with a page full of dynamic content can be pretty confusing. You can usually request the HTML, but the data you are after is missing from it.
- If you go to http://screen-scraper.com/infinite%20scroller/demo.html you can see my sample page. In this case it’s one of those pages that keeps tacking content onto the end forever, like Facebook or Pinterest.
- If you make a scrapeable file of http://screen-scraper.com/infinite%20scroller/demo.html you can get a successful response, but the content text isn’t there.
- Now pull out the screen-scraper proxy and proxy the request. You will see that this one page makes 3 requests:
- http://screen-scraper.com/infinite%20scroller/demo.html -> The landing page
Now you have the response. In this case it’s JSON, which you can either run extractor patterns against or parse directly.
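To make the two options concrete, here is a minimal sketch in Python (not screen-scraper's own scripting). The article doesn't show the JSON the demo page returns, so the payload below and its field names (`items`, `text`, `next_page`) are made up for illustration. The regular expression is only a rough stand-in for an extractor pattern.

```python
import json
import re

# Hypothetical payload: the real field names returned by the demo page
# are not shown in the article, so these are assumptions.
sample_response = """
{"items": [
  {"id": 1, "text": "First block of content"},
  {"id": 2, "text": "Second block of content"}
], "next_page": 2}
"""


def parse_with_json(body):
    """Option 1: parse the whole response and walk the structure."""
    data = json.loads(body)
    return [item["text"] for item in data["items"]]


def parse_with_pattern(body):
    """Option 2: pull values out by matching the text around them,
    roughly the way an extractor pattern would."""
    return re.findall(r'"text":\s*"([^"]*)"', body)


texts = parse_with_json(sample_response)
pattern_texts = parse_with_pattern(sample_response)

for line in texts:
    print(line)
```

Parsing tends to hold up better when the JSON is nested or contains escaped quotes, which this simple pattern would not handle. Pattern matching can be quicker to set up when you only need one or two fields.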