KeyError [b'frontier'] on Request Creation from Spider

Issue might be related to #337

Hi,

I have read in other discussions here that the scheduling of requests should be done by frontera, and apparently even the creation of requests should be done by the frontier and not by the spider.
However, the scrapy and frontera documentation both say that requests should be yielded from the spider's `parse` function.

What should the process look like if requests are to be created by the crawling strategy rather than yielded by the spider? How does the spider trigger that?

In my use case, I am using [scrapy-selenium](https://github.com/clemfromspace/scrapy-selenium) with scrapy and frontera (I use SeleniumRequests so that I can wait for elements loaded by JS).

I have to generate the URLs I want to scrape in two phases: first I yield them from the `start_requests()` method of the spider instead of using a seeds file, and then I yield requests for the extracted links in the first of two `parse` functions.

Yielding SeleniumRequests from `start_requests` works, but yielding SeleniumRequests from the `parse` function afterwards results in the following error (I pasted only an extract, because the same frames are printed over and over):

```
return (_set_referer(r) for r in result or ())
  File "/Users/user/opt/anaconda3/envs/frontera-update/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 64, in _evaluate_iterable
    for r in iterable:
  File "/Users/user/opt/anaconda3/envs/frontera-update/lib/python3.8/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/user/opt/anaconda3/envs/frontera-update/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 64, in _evaluate_iterable
    for r in iterable:
  File "/Users/user/opt/anaconda3/envs/frontera-update/lib/python3.8/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/user/opt/anaconda3/envs/frontera-update/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 64, in _evaluate_iterable
    for r in iterable:
  File "/Users/user/opt/anaconda3/envs/frontera-update/lib/python3.8/site-packages/frontera/contrib/scrapy/schedulers/frontier.py", line 112, in process_spider_output
    frontier_request = response.meta[b'frontier_request']
KeyError: b'frontier_request'
```
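To make the failure concrete, the lookup on the last traceback line can be sketched with plain dictionaries. This is only an illustration of the dictionary behaviour, not frontera's actual code: the `meta` contents, the `lookup` and `describe` helpers, and the `'driver'` entry are made up for the example. It shows that the error appears whenever the response's `meta` does not contain the bytes key `b'frontier_request'`, and that a `str` key with the same name does not count as a match.

```python
# Plain-dict stand-ins for response.meta; all values here are hypothetical.
meta_from_frontier = {b'frontier_request': '<frontier request object>'}
meta_without_key = {'driver': '<selenium driver>'}


def lookup(meta):
    """Mimic the dictionary lookup shown on the last line of the traceback."""
    return meta[b'frontier_request']


def describe(meta):
    """Return the looked-up value, or the KeyError text if the key is absent."""
    try:
        return lookup(meta)
    except KeyError as exc:
        return 'KeyError: {}'.format(exc)


result_ok = describe(meta_from_frontier)
result_missing = describe(meta_without_key)
# A str key is a different key than the bytes key, so this lookup also fails.
result_str_key = describe({'frontier_request': 'x'})

print(result_ok)
print(result_missing)
print(result_str_key)
```

So my guess is that the response handed to `parse` for a SeleniumRequest carries a `meta` dict without that entry, but I have not verified where it gets lost.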

I would be very thankful for any hints and examples!

KeyError [b'frontier'] on Request Creation from Spider #401