Commit f0c8d31

Merge pull request #4694 from Jgaldos/improve-httpstatus-all-meta
Improve http status all on http error middleware
2 parents cc095aa + a41c205 commit f0c8d31

3 files changed: 16 additions and 2 deletions

docs/topics/spider-middleware.rst

Lines changed: 2 additions & 1 deletion
@@ -253,7 +253,8 @@ this::
 The ``handle_httpstatus_list`` key of :attr:`Request.meta
 <scrapy.http.Request.meta>` can also be used to specify which response codes to
 allow on a per-request basis. You can also set the meta key ``handle_httpstatus_all``
-to ``True`` if you want to allow any response code for a request.
+to ``True`` if you want to allow any response code for a request, and ``False`` to
+disable the effects of the ``handle_httpstatus_all`` key.
 
 Keep in mind, however, that it's usually a bad idea to handle non-200
 responses, unless you really know what you're doing.

scrapy/spidermiddlewares/httperror.py

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ def process_spider_input(self, response, spider):
         if 200 <= response.status < 300:  # common case
             return
         meta = response.meta
-        if 'handle_httpstatus_all' in meta:
+        if meta.get('handle_httpstatus_all', False):
             return
         if 'handle_httpstatus_list' in meta:
             allowed_statuses = meta['handle_httpstatus_list']
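
The behavioral difference in this hunk can be illustrated with plain dictionaries. The following is a minimal sketch, not Scrapy code: the helper names `allows_all_before` and `allows_all_after` are hypothetical and only mirror the old and new condition shown in the diff.

```python
# Illustrative stand-ins for the middleware's early-exit condition.
# These helpers are hypothetical; they are not part of Scrapy.

META_KEY = 'handle_httpstatus_all'


def allows_all_before(meta):
    """Old condition: the mere presence of the key allows every status."""
    return META_KEY in meta


def allows_all_after(meta):
    """New condition: the key must be present and hold a truthy value."""
    return bool(meta.get(META_KEY, False))


if __name__ == '__main__':
    # Compare both conditions for an absent key, True, and False.
    for meta in ({}, {META_KEY: True}, {META_KEY: False}):
        print(meta, allows_all_before(meta), allows_all_after(meta))
```

Only the last case differs: with the key explicitly set to ``False``, the old condition still let every response through, while the new one falls through to the remaining checks.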

tests/test_spidermiddleware_httperror.py

Lines changed: 13 additions & 0 deletions
@@ -139,6 +139,19 @@ def test_meta_overrides_settings(self):
         self.assertIsNone(self.mw.process_spider_input(res404, self.spider))
         self.assertRaises(HttpError, self.mw.process_spider_input, res402, self.spider)
 
+    def test_httperror_allow_all_false(self):
+        crawler = get_crawler(_HttpErrorSpider)
+        mw = HttpErrorMiddleware.from_crawler(crawler)
+        request_httpstatus_false = Request('http://scrapytest.org', meta={'handle_httpstatus_all': False})
+        request_httpstatus_true = Request('http://scrapytest.org', meta={'handle_httpstatus_all': True})
+        res404 = self.res404.copy()
+        res404.request = request_httpstatus_false
+        res402 = self.res402.copy()
+        res402.request = request_httpstatus_true
+
+        self.assertRaises(HttpError, mw.process_spider_input, res404, self.spider)
+        self.assertIsNone(mw.process_spider_input(res402, self.spider))
+
 
 class TestHttpErrorMiddlewareIntegrational(TrialTestCase):
     def setUp(self):
