How to handle Web Crawler Activities?

narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

What should be done in future about crawler activity in our posts? :evil:

- Ignore them? Let Walter look into it.
- Should anybody reply with comments?
- Should someone just mail Walter, when such activities happen?

We need some rules about what action to take in such cases, at least until the site handles these crawlers more cleverly.
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

We email Walter, but he is entitled to his weekends. The first to encounter the problem should email Walter and post that he/she has done so, so that Walter does not get deluged with emails.

It might be worth seeking legal advice. I don't know what US law is; such activity is illegal here in Australia, under the Spam Act. There have been successful prosecutions.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

As a signal to others, maybe start a new post with the heading "Email sent to Walter regarding Spam" or something along those lines. I sent him an email at the first sight of spam activity and later realized that Craig had already sent one. I only found that out when I went through all the posts. :roll:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

I also noticed, along with others, that the forum that gets hit the most is the "Editor's Corner".
The reason could be that it is the first one to appear in the Forum list. :idea:
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

narasimha wrote: I also noticed, along with others, that the forum that gets hit the most is the "Editor's Corner".
The reason could be that it is the first one to appear in the Forum list. :idea:
Ditto observation and conclusion.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Perhaps a new first forum is required, in which posts are automatically deleted and the poster barred. Flamed would be nice, but this never gets back to the perpetrator.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Only the brain-dead spambots stay in the first forum and reply to existing threads. I've seen plenty of occurrences on other sites; new posts can show up in any forum.

Others have suggested a hidden first forum as Spam Bait, like a Roach Motel if that means anything to people out there ( spam checks in but it doesn't check out ) but it's not a silver bullet.
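
For illustration only, here is a minimal sketch of how such a bait forum might behave, assuming a toy in-memory board. The `Board` class, the `HONEYPOT_FORUM_ID` value and the `submit` method are hypothetical names and do not correspond to any real forum software. Any post sent to the hidden forum is discarded and its author is barred from posting anywhere afterwards.

```python
from dataclasses import dataclass, field

# Hypothetical id of the hidden bait forum; real forum software would
# configure this differently.
HONEYPOT_FORUM_ID = 0


@dataclass
class Board:
    """Toy in-memory board used only to illustrate the bait-forum idea."""
    banned_users: set = field(default_factory=set)
    posts: list = field(default_factory=list)

    def submit(self, user: str, forum_id: int, text: str) -> bool:
        """Store the post and return True, or reject it and return False."""
        if user in self.banned_users:
            # Already barred: nothing from this account is accepted.
            return False
        if forum_id == HONEYPOT_FORUM_ID:
            # Humans never see this forum, so the poster is assumed to be
            # a bot: drop the post and bar the account.
            self.banned_users.add(user)
            return False
        self.posts.append((user, forum_id, text))
        return True
```

As noted above, this only catches bots that post to the first forum they find, so it complements registration controls rather than replacing them.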

You need to deny them their ability to auto-register and post right away. Besides the image recognition you've probably seen elsewhere (What does this blot look like? Umm... bacon!) another such way is a manual approval step for new posters, but that puts a burden on Walter and may prove a barrier to growth. Anyone sign up with ADN way back when it was a one man manual process and remember how long you had to wait before you could start posting? :evil:

Ray, I don't think anything can get back to the perpetrator - unless you click on the link and I don't think that's the 'get back' you had in mind. Keep in mind the fact that if you do click on the link, that fact gets back to the spammer and lets them know it is working - and your site name goes on 'the list'. Please resist the urge to see exactly what she may or may not have inside waiting just for you...
-craig

"You can never have too many knives" -- Logan Nine Fingers
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

That's what I was thinking as well. The forum could be named "Spam Catchers", and a general alert could be posted about it. Something to that effect.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Roach Motel is a great name in this context, and a weird memory.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
narasimha
Charter Member
Posts: 1236
Joined: Fri Oct 22, 2004 8:59 am
Location: Staten Island, NY

Post by narasimha »

Roach Motel - That's a nice one!

We subscribe to a Belgian company called fantomaster to get a list of valid worldwide search engine IPs, User Agents and domain names.
We restrict all others from crawling our site.
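
A rough sketch of what such an allowlist check could look like, under stated assumptions: the network range and agent token below are placeholder values (the range is a reserved documentation range), not entries from the fantomaster list, and `is_allowed_crawler` is a hypothetical helper name. A request that claims to be a crawler is accepted only if both its IP address and its User-Agent match the list.

```python
import ipaddress

# Placeholder allowlist entries (assumptions for illustration only).
# A real list would be loaded from the subscription feed.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
ALLOWED_AGENT_TOKENS = ("examplebot",)


def is_allowed_crawler(remote_ip: str, user_agent: str) -> bool:
    """Return True only if both the IP and the User-Agent are allowlisted."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Malformed address: treat as not allowed.
        return False
    ip_ok = any(addr in net for net in ALLOWED_NETWORKS)
    agent = user_agent.lower()
    agent_ok = any(token in agent for token in ALLOWED_AGENT_TOKENS)
    return ip_ok and agent_ok
```

Requiring both checks matters because a User-Agent string is trivially forged, while the source address is much harder to fake.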

HTH
Narasimha Kade

Finding answers is simple, all you need to do is come up with the correct questions.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Is there any way to validate a DataStage license if one is entered during registration? That would admit only those who have access to DataStage.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

That would count me out some of the time. I am a self-employed consultant; there are times when I am not doing DataStage things.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

kumar_s wrote:...validate Datastage License...
I'd be out of here as well, since I can't afford my own license; I just leech off the clients that I work for.
sud
Premium Member
Posts: 366
Joined: Fri Dec 02, 2005 5:00 am
Location: Here I Am

Post by sud »

chulett wrote:... another such way is a manual approval step for new posters, but that puts a burden on Walter and may prove a barrier to growth. Anyone sign up with ADN way back when it was a one man manual process and remember how long you had to wait before you could start posting?...
Actually, we could allow new registrations only through two channels:

1> Through reference from an existing user
2> Manual Approval from Walter

That way Walter won't get killed with requests :!: :roll:
It took me fifteen years to discover I had no talent for ETL, but I couldn't give it up because by that time I was too famous.
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I think most of us who are working consultants would not be able to use DSXchange if a license number is required :roll: . Merci merci :wink:
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.