It's a crawler that extracts Python job offers from websites, mostly Brazilian ones.
- Check that you have libxml2-dev, libffi-dev, libssl-dev, libxslt-dev and mongodb installed; if you don't, install them:
sudo apt-get install libxml2-dev libffi-dev libssl-dev libxslt-dev mongodb
- Install project requirements
pip install -r requirements.txt
Please be kind to yourself and install it in a virtualenv! :)
scrapy crawl ceviu
scrapy crawl catho
scrapy crawl vagas
scrapy crawl empregos
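If you'd rather not type the four commands one by one, a small wrapper can run them in sequence. The sketch below is not part of this project: the `build_command` and `run_all` helpers are hypothetical, and it assumes `scrapy` is on your PATH and that you call it from the project root.

```python
import subprocess

# Spider names taken from the `scrapy crawl` commands in this README.
SPIDERS = ["ceviu", "catho", "vagas", "empregos"]


def build_command(spider):
    """Return the argv list equivalent to `scrapy crawl <spider>`."""
    return ["scrapy", "crawl", spider]


def run_all(spiders=SPIDERS, runner=subprocess.call):
    """Run each spider in sequence and return {spider_name: exit_code}.

    `runner` is injectable so the function can be tried out without
    actually launching Scrapy.
    """
    return {name: runner(build_command(name)) for name in spiders}
```

Calling `run_all()` launches each crawl in turn; any spider with a non-zero exit code in the returned dict is worth a second look.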
[x] - Iterate over CEVIU search pages
[x] - Store items in a database, preferably a NoSQL database such as MongoDB
[x] - Implement Catho.com.br spider
[x] - Implement Empregos.com.br spider
[x] - Implement Vagas.com.br spider
[] - Build a web interface to search for jobs
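
For anyone curious how the MongoDB storage item can work in Scrapy, here is a minimal sketch of an item pipeline. It is an illustration only and may differ from the pipeline this project actually ships: the class name `MongoJobsPipeline`, the connection URI, the database and collection names, and the `url` field used as a unique key are all assumptions.

```python
class MongoJobsPipeline:
    """Hypothetical Scrapy item pipeline that upserts job offers into MongoDB."""

    def __init__(self, mongo_uri="mongodb://localhost:27017",
                 db_name="jobs", collection_name="offers"):
        # All three defaults are assumptions, not values from this project.
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.collection_name = collection_name
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Imported lazily so the module still loads where pymongo is absent.
        import pymongo
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        doc = dict(item)
        # Assumes every item carries a unique "url" field, so re-crawling
        # updates the stored offer instead of inserting a duplicate.
        self.collection.update_one({"url": doc["url"]}, {"$set": doc}, upsert=True)
        return item
```

A pipeline like this is switched on through the `ITEM_PIPELINES` setting in the project's `settings.py`; the module path to use there depends on where the class lives. Upserting on `url` is one possible design choice: it keeps repeated crawls from piling up duplicates, at the cost of assuming each offer has a stable URL.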