python 2.7 - Scrapy Request callback method never called
I am building a CrawlSpider using Scrapy 0.22.2 on Python 2.7.3, and I am having problems with requests: the callback method I specify is never called. Here is a snippet from my parsing method that initiates a request within the elif block:
        elif current_status == "superseded":
            # Need more work here. Have to check whether there is a replacement unit available.
            # If there isn't, download whatever outline there is.
            # We need the <td> element that contains "is superseded " and follow its link.
            updated_unit = hxs.xpath('/html/body/div[@id="page"]/div[@id="layoutwrapper"]/div[@id="twocollayoutwrapper"]/div[@id="twocollayoutleft"]/div[@class="layoutcontentwrapper"]/div[@class="outer"]/div[@class="fieldset"]/div[@class="display-row"]/div[@class="display-row"]/div[@class="display-field-info"]/div[@class="t-widget t-grid"]/table/tbody/tr[1]/td[contains(., "is superseded ")]/a')
            # We need the href of the child <a> element.
            updated_unit_link = updated_unit.xpath('@href').extract()[0]
            updated_url = "http://training.gov.au" + updated_unit_link
            print "\033[0;31msuperceded "+updated_url+"\033[0m"  # prints in red
            # The unit is superseded, so we need to follow the link to the current one.
            yield Request(url=updated_url, callback='sortsuperseded', dont_filter=True)

    def sortsuperseded(self, response):
        print "\033[0;35mtest callback called\033[0m"
There are no errors when I execute it, and the URL is OK, but sortsuperseded is never called: I never see 'test callback called' printed in the console.
The URL I am extracting is within the domain I specify for the CrawlSpider:
    allowed_domains = ["training.gov.au"]
Where am I going wrong?
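The quotes are not required around the callback method name. Request expects the callback to be a callable, such as a bound method, not a string that names the method. Change this line:
    yield Request(url=updated_url, callback='sortsuperseded', dont_filter=True)
to
    yield Request(updated_url, callback=self.sortsuperseded, dont_filter=True)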
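The difference between the two forms can be seen in plain Python, without Scrapy. The following is only a minimal sketch: DemoSpider and the "fake response" string are hypothetical stand-ins, not part of the original spider or of Scrapy's API.

```python
# Plain-Python illustration only; DemoSpider is a hypothetical stand-in
# and no Scrapy code is involved.

class DemoSpider(object):
    def sortsuperseded(self, response):
        return "callback called with %s" % response


spider = DemoSpider()

as_string = 'sortsuperseded'        # the quoted form from the question
as_method = spider.sortsuperseded   # the bound method from the answer

# A string is just a name; it cannot be invoked like a function.
string_is_callable = callable(as_string)

# A bound method carries both the function and the spider instance.
method_is_callable = callable(as_method)

# The bound method can be invoked directly with a response-like argument.
result = as_method("fake response")

# If only the name is available, it has to be looked up explicitly first.
looked_up = getattr(spider, as_string)
same_result = looked_up("fake response")
```

Passing self.sortsuperseded hands over an object that can be called as-is once the response arrives, which is why the unquoted form is the one to use in the Request.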