How to extract the main body (text content) from an arbitrary webpage?
Hi all, In my current project, I need to write Python code that extracts the text from a large number of pages grabbed from the web. By extraction, I mean stripping all tags and comments and, if possible, filtering out small sections such as navigation links. The only thing left should be the lengthy paragraphs, if there are any. ...
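To make the requirement concrete, here is a rough sketch of the kind of extraction described above, using only the standard library's html.parser. It is not a definitive solution: the names BlockTextParser and extract_main_text, the lists of skipped and block-level tags, and the min_words threshold are all assumptions chosen for illustration, and a word-count cutoff is only a crude stand-in for "filter out navigation links".

```python
from html.parser import HTMLParser

# Assumption: contents of these tags are never wanted as text.
SKIP_TAGS = {"script", "style", "noscript", "template"}

# Assumption: a rough, incomplete list of tags that separate text blocks.
BLOCK_TAGS = {
    "p", "div", "br", "li", "ul", "ol", "table", "tr", "td", "th",
    "h1", "h2", "h3", "h4", "h5", "h6", "section", "article",
    "nav", "header", "footer", "blockquote", "pre", "form",
}


class BlockTextParser(HTMLParser):
    """Collects visible text, split into blocks at block-level tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []       # finished text blocks
        self._buf = []         # text fragments of the current block
        self._skip_depth = 0   # > 0 while inside a SKIP_TAGS element

    def _flush(self):
        # Join fragments and collapse runs of whitespace.
        text = " ".join("".join(self._buf).split())
        if text:
            self.blocks.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip_depth:
            self._buf.append(data)

    # HTML comments are dropped because handle_comment is not overridden
    # (the base class implementation does nothing).

    def close(self):
        super().close()
        self._flush()


def extract_main_text(html, min_words=20):
    """Return text blocks with at least min_words words (hypothetical cutoff)."""
    parser = BlockTextParser()
    parser.feed(html)
    parser.close()
    return [b for b in parser.blocks if len(b.split()) >= min_words]
```

Calling extract_main_text(page_source) returns a list of strings, one per surviving block. Short items such as menu entries fall below the word threshold and are dropped, while inline tags like <b> or <a> inside a paragraph do not split it. The threshold would need tuning per site, and short but meaningful paragraphs would be lost as well.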