I want to collect specific parts of several web pages that change rapidly, which means heavy use of regular expressions and text analysis, and a need for speed.
I think the actual HTTP request to fetch the page is going to take far longer than your processing. You also don't want to make many requests per second to a website: you could end up mounting an unintentional DoS attack (which can be considered trespass to chattels, and you may face a lawsuit). A crawl delay of about 10 seconds is polite, and that interval gives you plenty of time for processing. So the bottleneck is the network, not the language: choose the one you're most comfortable with.
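To make the advice above concrete, here is a minimal sketch of a polite crawler: it waits between requests and extracts the parts you care about with a regex. It is illustrative only; the `fetch` callable, the 10-second delay, and the `<title>` regex are assumptions for the example (in real code `fetch` might wrap `urllib.request.urlopen`).

```python
import re
import time

CRAWL_DELAY = 10  # seconds between requests, per the advice above (assumed value)

def extract_titles(html):
    """Pull <title> contents out of a page with a regex (example pattern)."""
    return re.findall(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)

def crawl(urls, fetch, delay=CRAWL_DELAY):
    """Fetch each URL politely, sleeping `delay` seconds between requests.

    `fetch` is any callable taking a URL and returning the page body as a
    string; injecting it keeps the sketch self-contained and testable.
    """
    results = {}
    for i, url in enumerate(urls):
        if i:                      # no wait before the very first request
            time.sleep(delay)
        results[url] = extract_titles(fetch(url))
    return results
```

Note that the `time.sleep` between fetches dwarfs any regex work, which is the point being made above: the language's raw processing speed barely matters here.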
Perl is perfect for that. CPAN has dozens of ready-made modules for exactly what you describe (for example LWP::UserAgent for fetching and WWW::Mechanize for navigating pages), so you will have to write very little code yourself.