Junk

Junk Commit Details

Date: 2015-09-29 11:13:38 (4 years 1 month ago)
Author: Nicola Fontana
Branch: master
Commit: c6889901786e2d7832a9f903249b5d2529ebb4e8
Parents: aaae3083ec5f242de60b2268305844ebf105bd1b
Message: grabweb: initial implementation of a web grabber

Based on wget, this script attempts to grab a whole domain for offline
browsing. It follows every link inside the domain specified in the
arguments, e.g.:

grabweb http://www.entidi.com/

follows every link inside the entidi.com domain.
Changes:
A misc/grabweb (full)

File differences

misc/grabweb
#! /bin/sh
[ -z "$1" ] && echo "Usage: $0 [WGET OPTIONS] url" && exit 0
# Set domains to be followed if no -D is found in [WGET OPTIONS]
args=$(echo "$@" | sed -e '/-D/!s/^.*[ .\/]\(.\+\..\+\)$/-D \1 &/')
echo $args; exit 0
wget -HprEk \
--limit-rate=200k \
--no-clobber \
--no-host-directories \
--random-wait \
-e robots=off \
-U Mozilla/5.0 \
$args
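
The sed expression above tries to guess a value for wget's -D option from the last argument. As a rough illustration of that idea, here is a minimal sketch that derives the domain with POSIX parameter expansion instead of sed. The `domain_of` helper is hypothetical and not part of this commit. It assumes the URL is passed as a single argument and that dropping the first host label (e.g. `www`) gives the domain to follow, which will not hold for every site.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): guess the domain to pass
# to wget -D from a URL, using only POSIX parameter expansion.
domain_of() {
    host=${1#*://}      # drop the scheme, e.g. "http://"
    host=${host%%/*}    # drop the path, including a trailing slash
    host=${host%%:*}    # drop an optional port
    case $host in
        # Three or more labels: strip the first one (www.entidi.com -> entidi.com)
        *.*.*) printf '%s\n' "${host#*.}" ;;
        # Otherwise keep the host as it is
        *)     printf '%s\n' "$host" ;;
    esac
}

domain_of http://www.entidi.com/
```

Unlike the sed approach, this variant never keeps the trailing slash of the URL in the result, but it only inspects one URL and ignores any other wget options.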
