Webchecker
Tools/webchecker/webchecker.py in Python distribution
Not a CGI application but a web client application
- while there are still pages to do:
  - request page via HTTP
  - parse HTML, collecting links
- pages once requested won't be requested again
- links outside the original tree are treated as leaves
  - existence checked but links not followed
- reports on bad links
  - what the bad URL is
  - on which page(s) it is referenced
- could extend for other reporting
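
The loop described above can be sketched in a few lines of modern Python. This is an illustrative sketch only, not the code in webchecker.py: the names LinkCollector, http_fetch and check are hypothetical, and passing the fetch function as a parameter is an assumption made here so the crawl logic can be tried without a network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def http_fetch(url):
    """Return the page body as text, or None if the request fails."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses;
        # ValueError is raised for malformed URLs.
        return None


def check(root, fetch=http_fetch):
    """Crawl from root; return {bad_url: sorted list of referring pages}."""
    todo = [root]
    done = {}       # url -> True if it loaded, False if it did not
    referers = {}   # url -> set of pages that link to it
    while todo:                       # while there are still pages to do
        url = todo.pop()
        if url in done:               # never request a page twice
            continue
        body = fetch(url)
        done[url] = body is not None
        if body is None or not url.startswith(root):
            continue  # bad link, or external leaf: checked, not followed
        parser = LinkCollector()
        parser.feed(body)
        for href in parser.links:
            target, _fragment = urldefrag(urljoin(url, href))
            if not target.startswith(("http://", "https://")):
                continue              # skip mailto:, javascript:, etc.
            referers.setdefault(target, set()).add(url)
            if target not in done:
                todo.append(target)
    return {u: sorted(referers.get(u, ())) for u, ok in done.items() if not ok}
```

Calling check("http://example.com/docs/") would crawl over the network; supplying a different fetch function, for example one backed by a dict of canned pages, exercises the same logic offline. Other reports (redirects, slow pages, and so on) could be built from the same done and referers tables.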