
User:BryceHarrington/Code examples 1



Code Examples:

Hmm, good idea. Here's code to do it:

First thing, nuke all the empty pages that just say "Describe the new page..."

 mkdir /tmp/emptypages
 cd [....]/wiki/lib-http/db/wiki/html
 for file in `find . -type f -size -1400c ! -name '*["]*' -print | xargs grep -l 'Describe the new page here.'`; do mv "$file" /tmp/emptypages; done

Review and delete the stuff in /tmp/emptypages at your convenience.
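
If you'd rather see what would get moved before moving anything, here is a rough dry-run sketch. The function name `list_empty_pages` is made up for illustration, and it assumes the placeholder text and the 1400-byte cutoff used above; it only prints file names and changes nothing.

```shell
# Hypothetical helper (not part of the wiki software): print the files under
# directory $1 that are smaller than 1400 bytes and contain the placeholder
# text. Nothing is moved or deleted.
list_empty_pages() {
    # -exec ... {} + hands the file names straight to grep, so unusual
    # characters in a name are not re-parsed by the shell.
    # grep -l prints only the names of files that contain the text.
    find "$1" -type f -size -1400c \
        -exec grep -l 'Describe the new page here.' {} +
}
```

Run it as `list_empty_pages .` from the html directory, or pipe it through `wc -l` to get a count.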

Now for the fun part

 cd [....]/wiki/lib-http/db/wiki/html
 for file in `find . -type f -size -1800c -print`; do ls -ol "$file"; done

This one lists every page with fewer than about 500 characters of content. Knock the number up or down depending on where you wish to draw the line.
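
To make the cutoff easier to play with, the listing can be wrapped in a small function that takes the byte limit as an argument and sorts the result by size. This is only a sketch: `list_short_pages` is a made-up name, the default of 1800 bytes is carried over from the command above, and it assumes file names contain no newlines.

```shell
# Hypothetical helper: print "size path" for every file under directory $1
# that is smaller than $2 bytes (default 1800), smallest first.
list_short_pages() {
    dir=$1
    limit=${2:-1800}
    find "$dir" -type f -size "-${limit}c" -print |
    while IFS= read -r file; do
        # wc -c counts bytes; tr strips the padding some wc versions add.
        printf '%s %s\n' "$(wc -c < "$file" | tr -d ' ')" "$file"
    done | sort -n
}
```

For example, `list_short_pages . 1600` tightens the cutoff and `list_short_pages . 2500` loosens it.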

-- BryceHarrington

wikipedia.org dumped 2003-03-17 with terodump