I have to scrape a number of ecommerce websites for their products every week. New websites join the bunch daily, so I have to write new scrapers for them quite fast.
For every product I need the title, the price and the thumbnail. Websites made with Shopify luckily provide the /products.json endpoint. But for the others, AFAIK I have to scrape the HTML, or maybe the sitemap.
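For context, the Shopify case can be sketched roughly as below. This is only a minimal sketch, assuming the payload has a top-level `products` list whose entries carry `title`, a `variants` list with `price`, and an `images` list with `src`. The function names `extract_products` and `fetch_products`, the User-Agent string, and the choice of the first variant and first image are all my own placeholders.

```python
import json
from urllib.request import Request, urlopen


def extract_products(payload):
    """Pull title, price and thumbnail out of a parsed /products.json payload."""
    items = []
    for product in payload.get("products", []):
        variants = product.get("variants") or []
        images = product.get("images") or []
        items.append({
            "title": product.get("title"),
            # Assumption: the first variant's price is representative.
            "price": variants[0].get("price") if variants else None,
            # Assumption: the first image doubles as the thumbnail.
            "thumbnail": images[0].get("src") if images else None,
        })
    return items


def fetch_products(base_url, timeout=10):
    """Download and parse one page of products from a Shopify-style store."""
    request = Request(
        base_url.rstrip("/") + "/products.json",
        headers={"User-Agent": "product-scraper-sketch"},  # placeholder value
    )
    with urlopen(request, timeout=timeout) as response:
        return extract_products(json.load(response))
```

Keeping the parsing separate from the download means the same three-field output could come from a different source (HTML, sitemap) for the non-Shopify sites.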
My questions are:
1) Are there other endpoints/solutions that I should be aware of? (like sitemaps, or products.json for Shopify)
2) Are there any ready-made libraries that take care of scraping products from ecommerce websites?
3) If not, which frameworks/libraries would you suggest to strike a balance between writing scrapers fast and having code flexible enough to be updated and rerun weekly?
For now I'm going with Scrapy, but sometimes I feel it's a bit of an overkill.