Welcome to Discuss Everything Forums...

  1. #1
    nicholasdewaal

    Which is better for making a fast and efficient web spider? Python, C, Perl, or other?

    I want to collect specific parts of several web pages that change rapidly. That means heavy use of regular expressions and text analysis, so speed matters.
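
To make the regular-expression part of the question concrete, here is a minimal Python sketch of pulling one kind of fragment out of a page. The `price` class, the pattern, and the `extract_prices` name are all hypothetical stand-ins for whatever "specific parts" the spider is after. Compiling the pattern once and reusing it is the usual way to keep repeated matching cheap.

```python
import re

# Hypothetical pattern: captures the text inside every <span class="price"> element.
# Compiled once at import time so repeated calls reuse the same compiled object.
PRICE_RE = re.compile(r'<span class="price">\s*([^<]+?)\s*</span>')


def extract_prices(html: str) -> list[str]:
    """Return the captured text of each matching span, in document order."""
    return PRICE_RE.findall(html)


sample = '<div><span class="price"> 19.99 </span><span class="price">5.00</span></div>'
print(extract_prices(sample))
```

Regular expressions work for small, stable fragments like this one; for pages whose markup shifts often, an HTML parser is usually less brittle.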

  2. #2
    Seigneur A

    I think the actual HTTP request to fetch each page is going to take far longer than your processing. You also don't want to make a lot of requests per second to a website, because you could end up causing an unintentional DoS attack (which can be considered trespass to chattels, and could get you sued). Leave a crawl delay of about 10 seconds between requests; you can do a lot of processing in that time. So choose the language you're most comfortable with.
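
The crawl delay described above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the `crawl` and `fetch` names and their parameters are made up for the example, and the 10-second default is simply the figure suggested in this post. Processing time counts against the delay, so the loop sleeps only for whatever is left of it.

```python
import time
import urllib.request
from typing import Callable, Iterable

CRAWL_DELAY_SECONDS = 10.0  # figure suggested in the post; tune per site


def fetch(url: str) -> str:
    """Download a page and decode it as text (blocking network call)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(
    urls: Iterable[str],
    process: Callable[[str], object],
    fetch_page: Callable[[str], str] = fetch,
    delay: float = CRAWL_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> list:
    """Fetch and process each URL, leaving at least `delay` seconds between requests."""
    url_list = list(urls)
    results = []
    for index, url in enumerate(url_list):
        started = clock()
        results.append(process(fetch_page(url)))
        if index < len(url_list) - 1:
            # Time spent fetching and processing counts toward the delay.
            elapsed = clock() - started
            sleep(max(0.0, delay - elapsed))
    return results
```

Something like `crawl(["https://example.com/"], len)` would fetch one page and return its length. The `fetch_page`, `sleep`, and `clock` parameters exist only so the loop can be exercised without touching the network or actually waiting.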

  3. #3
    martinthurn

    Perl is perfect for that. There are DOZENS of pre-made libraries for doing what you describe, so you will have to write VERY LITTLE code yourself!

 

 
