HTML Text Extractor

A few days ago I posted a very simple app that extracts the text from any HTML file, and today I have posted its source code. It is so simple that I can’t believe some people actually charge for apps like this one. Of course, this one is so basic that it can do almost nothing aside from removing the HTML tags; I am sure most commercial apps do a lot more, or do a better job.

I wrote this code with Visual C++ 2008 Express Edition, mainly to try out the compiler, so you will find nothing fancy in it. It took less than an hour, so it is very simple.
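
To give a rough idea of what "removing the HTML tags" amounts to, here is a minimal sketch. It is not the code from the download; `strip_tags` is a hypothetical name, and the approach (drop everything between `<` and `>`) is an assumption about the simplest way to do it. It ignores comments, `<script>` and `<style>` contents, and entities such as `&amp;`.

```cpp
#include <string>

// Hypothetical helper, not taken from the posted app: copies every character
// that lies outside a <...> pair and drops everything inside one.
std::string strip_tags(const std::string& html) {
    std::string text;
    text.reserve(html.size());  // the result is never longer than the input
    bool in_tag = false;
    for (std::string::size_type i = 0; i < html.size(); ++i) {
        const char c = html[i];
        if (c == '<') {
            in_tag = true;             // start skipping at the opening bracket
        } else if (c == '>' && in_tag) {
            in_tag = false;            // resume copying after the closing bracket
        } else if (!in_tag) {
            text += c;                 // ordinary text, including a stray '>'
        }
    }
    return text;
}
```

Calling `strip_tags` on the contents of an HTML file returns the text with the markup removed; whitespace and line breaks are left exactly as they were in the file.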

What would be cool, I think, is to improve a simple app like this one to support more file formats, and then add something like plug-ins so that support for new formats can be added just by installing new plug-ins. That would be awesome, right? I think I’ll start working on something like that soon.
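
One way the plug-in idea could look is sketched here. This is only an assumption about a possible design, not something that exists in the posted code: `TextExtractor`, `PlainTextExtractor` and `ExtractorRegistry` are all hypothetical names, and real plug-ins would also need to be loaded from separate files, which this sketch leaves out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical interface: each plug-in would implement one of these.
class TextExtractor {
public:
    virtual ~TextExtractor() {}
    // True if this extractor understands files with the given extension.
    virtual bool handles(const std::string& extension) const = 0;
    // Returns the plain text found in the file content.
    virtual std::string extract(const std::string& content) const = 0;
};

// Trivial example extractor: a .txt file is already plain text.
class PlainTextExtractor : public TextExtractor {
public:
    bool handles(const std::string& extension) const {
        return extension == "txt";
    }
    std::string extract(const std::string& content) const {
        return content;
    }
};

// Hypothetical registry: the app asks it which extractor fits a file.
// It does not own the extractors; the caller keeps them alive.
class ExtractorRegistry {
public:
    void add(const TextExtractor* extractor) {
        extractors_.push_back(extractor);
    }
    // Returns the first registered extractor for the extension, or 0 if none.
    const TextExtractor* find(const std::string& extension) const {
        for (std::size_t i = 0; i < extractors_.size(); ++i) {
            if (extractors_[i]->handles(extension)) {
                return extractors_[i];
            }
        }
        return 0;
    }
private:
    std::vector<const TextExtractor*> extractors_;
};
```

With something like this, the app itself would only talk to the registry, and supporting a new format would mean registering one more `TextExtractor` without touching the rest of the code.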

So anyway, I posted the code here because I think it could be useful to someone developing a similar tool; maybe they can take this code and improve on it. The code has no license terms: it is public domain, so anyone can take it and do whatever they want with it.

You can find the app and the code on the Downloads page.