- WGET questions. Please Help.
- Posted by Crikey Schmikey on February 28th, 2004
How can I instruct WGET to retrieve the webpages linked from a webpage
I specify? For example, if http://www.somesite.com/somepage.html has
several links to other pages in its text, how can I get WGET to
fetch those linked pages as well?
Another example would be,
http://hotwired.lycos.com/webmonkey/...xml/index.html has
several links, such as "Sharing Your Site with RSS"
(http://hotwired.lycos.com/webmonkey/...?tw=authoring),
"Getting Your Feet Wet With SOAP"
(http://hotwired.lycos.com/webmonkey/...l?tw=authoring)
and so on. And I wish for WGET to retrieve those webpages as well.
I know this is copyrighted material, but please rest assured, I have
no plans whatsoever to use this for any other purposes or reasons
other than self education. I read that this site will be closed down
soon by its parent company and wish to save some of the pages for my
own use only.
I'd really appreciate it if someone could provide me with the correct
syntax or flags so I can get WGET to perform what I explained above.
Or is this even possible with WGET? Thanks for your time and
courtesy.
- Posted by Chris F.A. Johnson on February 28th, 2004
On Sat, 28 Feb 2004 at 23:41 GMT, Crikey Schmikey wrote:
The wget man page has all the information you need. Look at the -r
option.
--
Chris F.A. Johnson http://cfaj.freeshell.org/shell
===================================================================
My code (if any) in this post is copyright 2004, Chris F.A. Johnson
and may be copied under the terms of the GNU General Public License
- Posted by Crikey Schmikey on February 28th, 2004
On 29 Feb 2004 00:05:52 GMT, "Chris F.A. Johnson"
<c.fa.johnson@rogers.com> wrote:
Yep, did that, and specified -l=0. Didn't work at all. What it
actually did, even though I used -np, was start retrieving
pages from Wired's own page!
- Posted by Paul Boekholt on February 29th, 2004
On Sun, 29 Feb 2004 00:56:12 GMT, Crikey Schmikey <holy@nospam.com> said:
`-i FILE'
`--input-file=FILE'
Read URLs from FILE, in which case no URLs need to be on the
command line. If there are URLs both on the command line and in
an input file, those on the command lines will be the first ones to
be retrieved. The FILE need not be an HTML document (but no harm
if it is)--it is enough if the URLs are just listed sequentially.
Simply get the page, extract the links you want to download into a file, and
give that file to wget with -i.
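A rough sketch of that approach, for illustration only. The helper name extract_links and the file names page.html and urls.txt are made up for this example, and it assumes a grep that supports -o (GNU and BSD grep do). It only catches absolute links written as lowercase href="http..." with double quotes; relative links would need to be turned into full URLs first.

```shell
# extract_links is a hypothetical helper name, not part of wget.
# It reads HTML on stdin and prints every absolute http(s) URL found in an
# href="..." attribute, one per line. Relative links are skipped.
extract_links() {
    grep -o 'href="http[^"]*"' | sed -e 's/^href="//' -e 's/"$//'
}

# Intended use (not run here, since it needs network access):
#   wget -O page.html http://www.somesite.com/somepage.html
#   extract_links < page.html > urls.txt
#   wget -i urls.txt

# Small offline demonstration on an inline sample:
sample='<p><a href="http://www.somesite.com/a.html">A</a> <a href="/local.html">B</a></p>'
printf '%s\n' "$sample" | extract_links
```

Editing urls.txt by hand before the final wget -i step is an easy way to drop links you do not want.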
- Posted by Noi on February 29th, 2004
On Sun, 29 Feb 2004 18:37:55 +0000, Paul Boekholt thoughtfully wrote:
Also, it's -l 0, not -l=0, and add --no-clobber to cut traffic and duplicates.
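For reference, a minimal sketch of a recursive fetch with the corrected option syntax. Note that in wget a depth of 0 means unlimited recursion, so this sketch uses -l 1 on the assumption that only the pages linked directly from the starting page are wanted. The wrapper name fetch_linked_pages and the WGET override variable are invented for this example. -np is left out because it stops wget from fetching anything outside the starting page's directory, and without -H wget stays on the starting host.

```shell
# fetch_linked_pages is a hypothetical wrapper name for this sketch.
#   -r     recurse into links found on the starting page
#   -l 1   follow links one level deep only (-l 0 would mean unlimited depth)
#   -nc    --no-clobber: skip files that already exist locally
# WGET is a made-up override so the command can be previewed without
# downloading anything, e.g. WGET=echo fetch_linked_pages URL
fetch_linked_pages() {
    ${WGET:-wget} -r -l 1 -nc "$1"
}

# Usage (not run here, since it needs network access):
#   fetch_linked_pages http://www.somesite.com/somepage.html
```

Running it with WGET=echo prints the options and URL that would be passed to wget, which is a cheap way to check the flags before starting a download.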
- Posted by Steve Lee on March 1st, 2004
On Sun, 29 Feb 2004 21:36:12 GMT, Noi <noi@siam.com> wrote:
Super, super, thanks a million, guys. I really appreciate it. I'll
give it a try.