forked from FCC/Crawler
Crawler is a bare-bones spider designed to quickly and effectively build an index of all files and pages on a given Web site as well as the link relationship (both incoming and outgoing) between each page.
flyabroad/Crawler
TO USE:

1. Edit config.PHP with the appropriate database and domain information.
2. (For now) In phpMyAdmin, insert the seed URL into the urls table.
   * The URL should use the www. form of the domain.
   * The URL should have a trailing slash.
   * (For now) You may also want to set clicks to '0' to avoid problems.
3. Open crawler.php.
4. (Optional) Open stats.php to watch progress.

TIPS:

Changes to php.ini
1. Increase the memory limit (1GB).
2. Remove the execution time limit.

Changes to mysql.ini
* Increase the max query size (to avoid the "MySQL server has gone away" error).

Additional documentation (source code) is in (/source).
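The configuration tips above might translate into settings like the ones below. This is a sketch rather than something taken from the project. The directive names are the standard PHP and MySQL ones, but reading "max query size" as max_allowed_packet is an assumption, and the 64M value is purely illustrative, since the README gives no figure.

```ini
; php.ini
; Raise the memory ceiling to 1 GB, as the tips suggest.
memory_limit = 1G
; 0 disables the script execution time limit.
max_execution_time = 0
```

```ini
; MySQL server configuration (the README calls it mysql.ini;
; on many installs the file is named my.ini or my.cnf)
[mysqld]
; Assumed setting for "max query size"; 64M is an illustrative value.
max_allowed_packet = 64M
```

Restart the web server and the MySQL server after editing so the new values take effect.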