How to handle Web Crawler Activities?

Moderators: chulett, rschirm, roy

jgreve
Premium Member
Posts: 107
Joined: Mon Sep 25, 2006 4:25 pm

don't go manual...

Post by jgreve »

Manual-anything will tend to suck... both for the volunteers who verify things and for would-be account holders who end up waiting over weekends, vacations, or crunch times to be verified.

At account creation time, throw in an image test (http://en.wikipedia.org/wiki/CAPTCHA) to help weed out bots.

Hmm... actually, the account-creation page is REALLY busy now. Consider breaking it into 2 pages:
1st page: isolate the 3 dozen informational / background questions.
2nd page: account id, permanent email, image test.

Keeping the 2nd page that short makes it less painful to loop when the requested account id collides with one already in use.

I'm not saying it would be impossible to defeat this, but getting past the image validation and correlating an email reply with automated requests takes rather more effort.
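
To make the second-page idea concrete, here is a rough sketch in Python (purely for illustration; it is not how phpBB or this site does it). Everything in it is hypothetical: the `Registration` class, its method names, and the `captcha_ok` flag, which stands in for whatever image-test library actually checks the answer. It only shows the loop on account-id collisions and the email-reply correlation via a one-time token.

```python
import secrets


class Registration:
    """In-memory sketch of the short second registration page (hypothetical)."""

    def __init__(self):
        self.accounts = {}  # account_id -> email, for confirmed accounts
        self.pending = {}   # token -> (account_id, email), awaiting email reply

    def submit_page_two(self, account_id, email, captcha_ok):
        """Return (token, error). On error only this short page is shown again."""
        if not captcha_ok:
            # captcha_ok is assumed to come from a separate image-test check
            return None, "image test failed"
        taken = set(self.accounts) | {a for a, _ in self.pending.values()}
        if account_id in taken:
            return None, "account id already in use"
        # Random token that would be mailed to the permanent email address
        token = secrets.token_urlsafe(16)
        self.pending[token] = (account_id, email)
        return token, None

    def confirm(self, token):
        """Activate the account when the emailed token comes back."""
        entry = self.pending.pop(token, None)
        if entry is None:
            return False  # unknown or already-used token
        account_id, email = entry
        self.accounts[account_id] = email
        return True
```

A bot would have to pass the image test and also read the mailbox to get the token back, which is the extra work described above. A collision only costs the user a retry of the three fields on this page, not the whole questionnaire.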

As for spammers that defeat the above process, handle them manually. People can complain about "Hey, id 'jgreve' is posting spam - suspend their account!" or whatever in the admin forum.

p.s. a neat looking library for PHP/image-validation: http://www.ejeliot.com/pages/2
(not that I know anything about php; don't touch the stuff myself :wink: ).

sud wrote:
chulett wrote:... another such way is a manual approval step for new posters, but that puts a burden on Walter and may prove a barrier to growth. Anyone sign up with ADN way back when it was a one man manual process and remember how long you had to wait before you could start posting?...
Actually we can have new registrations only through two channels:

1> Through reference from an existing user
2> Manual Approval from Walter

That way Walter won't get killed with requests :!: :roll:
whardeman
Posts: 111
Joined: Mon Oct 21, 2002 11:17 am
Location: Fort Worth, Tx

Post by whardeman »

There is, in fact, CAPTCHA image validation already in place. It was added in a recent release of phpBB, so you may not have seen it.

It is possible that the CAPTCHA has been defeated by smart bots though...
Walter Hardeman
Webmaster