Introduction
html2xhtml converts HTML files into XHTML. It can fix many common errors in HTML files, such as missing end tags, elements whose content violates their content model, and non-standard elements or attributes. It can also take invalid or non-well-formed XHTML as input and clean it up to produce well-formed, valid XHTML output. The output document type can be selected from several XHTML DTDs (1.0, 1.1, Basic, etc.).
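To make the term "well-formed" concrete, here is a minimal Python sketch that uses only the standard library's XML parser. It is not part of html2xhtml, the helper name `is_well_formed` is made up for this illustration, and it checks well-formedness only, not validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, i.e. it is well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Typical HTML input: the <p> elements are never closed.
html_input = "<html><body><p>one<p>two</body></html>"

# The same content with every element explicitly closed.
closed_markup = "<html><body><p>one</p><p>two</p></body></html>"

print(is_well_formed(html_input))
print(is_well_formed(closed_markup))
```

The first document is acceptable to most HTML parsers but is rejected by an XML parser because of the missing end tags, which is the kind of error described above.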
You can convert HTML files from a Web browser using the online conversion form, or download the program and run it on your own computer as a command-line tool, which is much more convenient for batch conversion and offline use. The program is free software, licensed under the terms of the GNU General Public License (GPL) version 2.
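As an illustration of batch conversion with the command-line tool, the following Python sketch converts every `.html` file in a directory. It rests on assumptions that should be checked against the program's own help output: that the executable is installed as `html2xhtml` on the PATH, that it accepts the input file name as its argument, and that it writes the converted document to standard output. The function names `convert_file` and `convert_tree` are made up for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def convert_file(src: Path, dst: Path, exe: str = "html2xhtml",
                 run=subprocess.run) -> None:
    """Convert one file by running the command-line tool.

    Assumption: `html2xhtml FILE` prints the XHTML result on stdout.
    `run` can be replaced by a stand-in, which is handy for dry runs.
    """
    result = run([exe, str(src)], capture_output=True, check=True)
    dst.write_bytes(result.stdout)


def convert_tree(src_dir: Path, dst_dir: Path, **kwargs) -> List[Path]:
    """Convert every *.html file in src_dir into dst_dir as *.xhtml."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(src_dir.glob("*.html")):
        dst = dst_dir / (src.stem + ".xhtml")
        convert_file(src, dst, **kwargs)
        written.append(dst)
    return written
```

A call such as `convert_tree(Path("site"), Path("site-xhtml"))` would then process a whole directory in one go. Because `check=True` is passed, a non-zero exit status from the tool raises `subprocess.CalledProcessError` instead of silently writing an empty file.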
If you need to call html2xhtml from a program in the .NET framework, there is a separate project, authored by another developer, that provides a .NET 4.0 library (also called Html2Xhtml). The library uses html2xhtml internally and has also been released under the GPL, version 2 or later.
The program is written in C and does not depend on any libraries other than GNU libc and GNU libiconv. It has been tested on GNU/Linux and Windows, but I hope it can also be compiled for other environments. Please let me know if you manage to build and run it on other platforms.
A Web API for developers has recently been released (it is still in beta). It allows other programs to invoke html2xhtml remotely over HTTP.
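The following Python sketch shows what a remote invocation over HTTP could look like from a client program. The endpoint URL is a placeholder, and the request shape (a POST with the HTML document as the body and a `text/html` content type) is an assumption; the real address, parameters and headers should be taken from the Web API documentation. The function names are made up for this example.

```python
import urllib.request

# Placeholder only: replace with the endpoint given in the API documentation.
API_URL = "http://example.org/html2xhtml-api"


def build_request(html: bytes, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying the HTML document as its body.

    Assumption: the service reads the document from the request body.
    """
    return urllib.request.Request(
        url,
        data=html,
        headers={"Content-Type": "text/html"},
        method="POST",
    )


def convert_remote(html: bytes, url: str = API_URL,
                   timeout: float = 30.0) -> bytes:
    """Send the document to the service and return the response body."""
    with urllib.request.urlopen(build_request(html, url), timeout=timeout) as resp:
        return resp.read()
```

Keeping `build_request` separate from `convert_remote` makes it possible to inspect the request without touching the network.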
Contribute!
Want to contribute? You can contribute or contact me through the GitHub page of html2xhtml.
Bug reports may be filed in the html2xhtml issue tracker on GitHub.
Other resources
The xhtmlpedia is a browsable list of XHTML elements and attributes. It lists the elements available for each XHTML DTD, their content rules, attributes, etc. I find it much easier to read and browse than the actual DTDs. The xhtmlpedia has been automatically created from the DTDs with the help of the module of html2xhtml that encodes the definitions of the XHTML DTDs. It is updated frequently to keep it in sync with the official DTDs.