ParseHTML

ParseHTML is an HTML parser for Ruby 1.8 and above. It also tries to handle invalid HTML to some degree.

ParseHTML was originally written to supplement the BlueCloth HTML-to-Markdown library.

The Git repository is located at http://github.com/cpjolicoeur/parsehtml

More info coming soon.