Commit e5956c2

fix bug --resume + --workdir

yaroslaff committed Apr 10, 2023
1 parent a6e7dc0

Showing 3 changed files with 13 additions and 33 deletions.
19 changes: 0 additions & 19 deletions README.md
@@ -81,25 +81,6 @@ Page B: *sasha grey* from 18 Apr (16 images, 12 clearly nsfw, 4 are clearly safe
|detect-image-aid (docker) | 124s | 10 | 28s | 6 (false negatives) |
|detect-image-nudenet (scripts) | 90s | 57 | 24s | 12 |

### Example usage:
Check one page (using built-in :nude filter):
~~~
nudecrawler -v --url1 https://telegra.ph/your-page-address
~~~

~~~
nudecrawler -w urls.txt --nude 5 -d 30 -f 5 --stats .local/mystats.json --log .local/nudecrawler.log
~~~
Process URLs from urls.txt; report a page if it has 5 or more nude images (or, by default, any 1 video); nudity must score over the 0.5 threshold; check dates from today back to 30 days ago; append all found pages to .local/nudecrawler.log and save periodic statistics to .local/mystats.json.

If the crawler sees page `Sasha-Grey-01-23-100` but `Sasha-Grey-01-23-101` is 404 Not Found, it will try `-102`, and so on. It stops only after 5 (-f) pages in a row fail.
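The stop condition above can be sketched in a few lines (a minimal illustration; `page_exists` is a hypothetical stand-in for the crawler's real HTTP check, passed in as a callable):

```python
def probe_pages(page_exists, base, start, max_fails=5):
    """Probe numbered pages base-<n>, stopping after max_fails consecutive misses."""
    found = []
    fails = 0
    n = start
    while fails < max_fails:
        url = f"{base}-{n}"
        if page_exists(url):
            found.append(url)
            fails = 0      # any hit resets the counter: misses must be consecutive
        else:
            fails += 1
        n += 1
    return found
```

With -f 5, a single existing page anywhere in a run of misses resets the counter, so sparsely numbered pages are still reached.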

~~~
nudecrawler -v --detect-image bin/detect-image-nsfw-api.py -f 5 --total 10 --nude 3 -w urls.txt --stats .local/stats.json --log .local/urls.log --stop 1000 --refresh bin/refresh-nsfw-api.sh
~~~

Work verbosely (-v), use NSFW API for image detection, and call the refresh-nsfw-api.sh script to restart its docker container after every 1000 images.
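The --stop/--refresh pairing amounts to a periodic-restart counter; a minimal sketch, with the refresh action injected as a callable rather than the real shell script:

```python
class RefreshCounter:
    """Invoke a refresh action after every `every` processed images (sketch)."""
    def __init__(self, refresh, every=1000):
        self.refresh = refresh   # e.g. a wrapper that runs refresh-nsfw-api.sh
        self.every = every
        self.count = 0

    def tick(self):
        self.count += 1
        if self.count % self.every == 0:
            self.refresh()       # restart the detector container

calls = []
counter = RefreshCounter(refresh=lambda: calls.append(counter.count), every=1000)
for _ in range(2500):
    counter.tick()
# refresh fired after images 1000 and 2000
```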

## Options
~~~
usage: nudecrawler [-h] [-d DAYS] [--url1 URL] [-f FAILS] [--day MONTH DAY] [--expr EXPR] [--total N] [--max-errors N] [--min-content-length N] [-a] [--detect-image SCRIPT]
~~~
25 changes: 12 additions & 13 deletions bin/nudecrawler
@@ -337,7 +337,18 @@ def main():
sanity_check(args)

# when fastforward, we go to specific word/day/count quickly
    fastforward = False

if args.unbuffered:
sys.stdout = Unbuffered(sys.stdout)


if args.workdir:
for attr in ['cache', 'wordlist', 'log', 'resume', 'stats']:
old = getattr(args, attr)
if old is not None:
new = os.path.join(args.workdir, old)
setattr(args, attr, new)

if args.resume:
print("Resume from", args.resume)
@@ -351,20 +362,8 @@ def main():
fastforward = True
else:
stats['cmd'] = ' '.join(sys.argv)



if args.unbuffered:
sys.stdout = Unbuffered(sys.stdout)


if args.workdir:
for attr in ['cache', 'wordlist', 'log', 'resume', 'stats']:
old = getattr(args, attr)
if old is not None:
new = os.path.join(args.workdir, old)
setattr(args, attr, new)


# nude = args.nude
# video = args.video
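The fix moves the --workdir prefixing above the --resume handling, so the resume file is looked up inside the workdir rather than the current directory. The prefixing loop can be exercised in isolation (a sketch using a SimpleNamespace in place of the real argparse result):

```python
import os
from types import SimpleNamespace

def apply_workdir(args):
    # Same loop as in the diff: prefix every path-like option with --workdir.
    for attr in ['cache', 'wordlist', 'log', 'resume', 'stats']:
        old = getattr(args, attr)
        if old is not None:
            setattr(args, attr, os.path.join(args.workdir, old))

args = SimpleNamespace(workdir='.local', cache=None, wordlist='words.txt',
                       log=None, resume='resume.json', stats='stats.json')
apply_workdir(args)
# args.resume now points inside .local/, so the later open() finds it
```

Because the loop rewrites args.resume in place, it must run before the resume file is opened; in the old order the unprefixed path was used, which appears to be the bug this commit fixes.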
2 changes: 1 addition & 1 deletion nudecrawler/version.py
@@ -1 +1 @@
version="0.3.8"
version="0.3.9"
