[SOLVED] Is there any way to download all the comments from a news site like www.cbc.ca?

Darkmatterx

Distinguished
Apr 8, 2003
552
6
19,015
Hi, I know this is an odd question, and I really didn't know where to put it, but I want to download all the comments posted on a news story on www.cbc.ca.

Part of the problem is that, as far as I can tell, you have to keep hitting "load more comments" after every 5 or so. If there are a lot of comments, that would take hours. Plus, if a person's text is too long, there's also a "more" button to show the rest of the comment, and if there are too many replies, you have to click a button to show more. :(
 

Darkmatterx

Are you sure you're allowed to scrape the contents of that site? I'd say the "Load more" system is made that way intentionally, precisely as protection against actions like yours.

The purpose of RSS is exactly that: to provide for "automatic" reading of site updates.

I looked, and the RSS feed doesn't include the comments, even with scripts enabled. This is probably on purpose, but since it's publicly available information, you COULD do it manually; it would just take forever.

I imagine you could do it using one of the scripting languages, like Python or JavaScript.

Unfortunately, I couldn't code, or script my way out of a wet bag made of soggy noodles, so unless I can find something already made (and I will look) I'm probably SOL.

Maybe there is something for this on Git. I'll have to check.

Thanks
 
Put an ad on Craigslist offering however much you are willing to pay to have it done.
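
For anyone who would rather script it than pay for it: the repeated "load more comments" clicking described in the first post amounts to a pagination loop. Below is a minimal Python sketch of that loop only. Everything in it is an assumption for illustration: `collect_all_comments`, `fetch_page`, the cursor idea, and the fake data are hypothetical names, and nothing here reflects how www.cbc.ca actually serves its comments. The part that talks to a real site is deliberately left out, since that depends on the site and on whether its terms allow it.

```python
import time
from typing import Callable, List, Optional, Tuple

# One "page" of results: the comments on it, plus a cursor for the next page
# (None when there is nothing left to load). This shape is an assumption.
Page = Tuple[List[dict], Optional[str]]


def collect_all_comments(
    fetch_page: Callable[[Optional[str]], Page],
    delay_seconds: float = 1.0,
    max_pages: int = 10_000,
) -> List[dict]:
    """Keep "loading more" until the source reports no further pages."""
    comments: List[dict] = []
    cursor: Optional[str] = None  # None means "start from the first page"
    for _ in range(max_pages):  # hard cap so a bad cursor cannot loop forever
        page_comments, cursor = fetch_page(cursor)
        comments.extend(page_comments)
        if cursor is None:
            break
        time.sleep(delay_seconds)  # pause between requests; do not hammer the server
    return comments


# Stand-in for a real site: three fake pages keyed by cursor.
# A real fetch_page would make an HTTP request instead; the endpoint and
# response format are site-specific and are NOT shown here.
FAKE_PAGES = {
    None: ([{"author": "a", "text": "first"}, {"author": "b", "text": "second"}], "p2"),
    "p2": ([{"author": "c", "text": "third"}], "p3"),
    "p3": ([{"author": "d", "text": "fourth"}], None),
}


def fake_fetch_page(cursor: Optional[str]) -> Page:
    return FAKE_PAGES[cursor]


if __name__ == "__main__":
    all_comments = collect_all_comments(fake_fetch_page, delay_seconds=0)
    for comment in all_comments:
        print(comment["author"], comment["text"])
```

The real work would be writing a `fetch_page` for the site in question. The browser's developer tools (Network tab) usually show what request a "load more" button sends, and the same loop could be repeated per comment for the "show more replies" button.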