forked from sananth12/ImageScraper
ImageScraper
============
A simple Python script that downloads all images from a given webpage.
Download
--------
tar file:
Grab the latest build from https://pypi.python.org/pypi/ImageScraper
pip install:
$ pip install ImageScraper
Usage
-----
Using the tar file:
Extract the contents of the tar file.
Note that ``ImageScraper`` depends on ``lxml`` and ``requests``.
If compiling ``lxml`` through ``pip`` fails, install the ``libxml2-dev`` and ``libxslt-dev`` packages on your system.
$ cd ImageScraper/image_scraper/
$ python __init__.py
$ Enter URL to scrap: https://github.com
$ Found 6 images:
$ How many images do you want ? : 6
$ Done.
If installed using pip:
Open Python in a terminal:
$ python
>>> import image_scraper
Enter URL to scrap: https://github.com
Found 6 images:
How many images do you want ? : 6
Done.
NOTE:
A new folder called "images" will be created in the directory where you run the script, containing all the downloaded images.
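To illustrate the note above, here is a minimal sketch of how a download location inside an ``images`` folder could be derived from an image URL. The helper name ``target_path``, its ``base_dir`` parameter, and the ``unnamed`` fallback are hypothetical and are not taken from ImageScraper's code; only the standard library is used.

```python
import os
from urllib.parse import urlparse


def target_path(image_url, base_dir="."):
    """Return the path under <base_dir>/images where image_url would be saved.

    Hypothetical helper for illustration; not ImageScraper's actual code.
    """
    folder = os.path.join(base_dir, "images")
    # Create the folder on first use; do nothing if it already exists.
    os.makedirs(folder, exist_ok=True)
    # Use the last path component of the URL as the file name,
    # ignoring any query string. Fall back to a placeholder if it is empty.
    name = os.path.basename(urlparse(image_url).path) or "unnamed"
    return os.path.join(folder, name)
```

Calling ``target_path("https://example.com/a/logo.png")`` would create ``./images`` if needed and return a path ending in ``images/logo.png``.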
Upgrading
---------
Check for updates and upgrade using:
$ sudo pip install ImageScraper --upgrade
Issues
------
Q.) Not all images were downloaded?
The missing images may have been injected into the page via JavaScript. This scraper only reads the HTML returned by the server and does not run JavaScript.
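The limitation can be seen in a small sketch. It uses the standard library's ``html.parser`` as a stand-in for ``lxml`` (which ImageScraper depends on), and the names ``ImgCollector`` and ``find_images`` are hypothetical. A parser that only reads the served HTML sees the static ``<img>`` tag but not the one a script would create in a browser.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in static HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_images(html):
    """Return the src values of <img> tags present in the HTML text."""
    parser = ImgCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


# One image is in the markup; the other exists only after a browser
# executes the script, so a static parser never sees it.
page = """
<html><body>
  <img src="/static/logo.png">
  <script>
    var img = document.createElement("img");
    img.src = "/dynamic/banner.png";
    document.body.appendChild(img);
  </script>
</body></html>
"""

print(find_images(page))  # ['/static/logo.png']
```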
Todo
----
Support scraping sites that inject image tags via JavaScript, using PhantomJS or Selenium.