Tuesday, June 17, 2008

HTML Tidy

HTML Tidy is a computer program and a library whose purpose is to fix invalid HTML and give the source code a reasonable layout (aka indent style).

It was developed by Dave Raggett of the W3C and later handed over to become a SourceForge project. Its source code is written in ANSI C for maximum portability, and precompiled binaries are available for a variety of platforms. It is available under the W3C license (a permissive, BSD-style license).

Examples of what it is able to do:

  • Fix missing or mismatched end tags and mixed-up tags
  • Add missing items (some tags, quotes, ...)
  • Report proprietary HTML extensions
  • Change the layout to match a predefined style
  • Transform characters from some encodings into HTML entities
  • Clean up presentational markup

HTML Tidy Project Page
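
To make the idea of repairing missing end tags more concrete, here is a toy Java sketch. It is not HTML Tidy's algorithm or API; the class name `TagCloserSketch`, the method `closeOpenTags`, and the short list of void elements are all made up for illustration. It only tracks open elements on a stack and appends the end tags that are still missing when the input runs out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration only: this is NOT how HTML Tidy is implemented.
class TagCloserSketch {
    // Matches a start or end tag; group 1 is "/" for end tags, group 2 is the tag name.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>");

    // Assumption: a small, incomplete set of elements that never take an end tag.
    private static final Set<String> VOID =
            Set.of("br", "hr", "img", "input", "meta", "link");

    // Returns the input with end tags appended for every element left open.
    static String closeOpenTags(String html) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(2).toLowerCase(Locale.ROOT);
            boolean selfClosing = m.group().endsWith("/>");
            if (VOID.contains(name) || selfClosing) {
                continue; // nothing to close later
            }
            if (m.group(1).isEmpty()) {
                open.push(name); // start tag: remember it as open
            } else {
                open.removeFirstOccurrence(name); // end tag: close innermost match
            }
        }
        StringBuilder out = new StringBuilder(html);
        while (!open.isEmpty()) {
            // Close the innermost element first.
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(closeOpenTags("<ul><li>one</li><li>two"));
    }
}
```

A real cleaner has to do far more than this: the sketch ignores comments, scripts, attribute values that contain ">", and the nesting rules of HTML, and it only ever appends tags at the very end of the input.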

JTidy

JTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML. In addition, JTidy provides a DOM interface to the document being processed, so you can effectively use JTidy as a DOM parser for real-world HTML.
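
As a rough sketch of what a DOM interface buys you, the Java snippet below walks an `org.w3c.dom.Document` and collects link targets. To keep it runnable with the JDK alone, the document is built by the JDK's own XML parser from markup that is already well formed; that parser is only a stand-in here. With JTidy on the classpath, the `Document` would presumably come from its `Tidy.parseDOM` method instead, which is what lets messy real-world HTML be handled the same way. The names `LinkExtractorSketch`, `hrefs`, and `parseWellFormed` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class LinkExtractorSketch {
    // Collects the href attribute of every <a> element that has one.
    static List<String> hrefs(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element a = (Element) anchors.item(i);
            if (a.hasAttribute("href")) {
                result.add(a.getAttribute("href"));
            }
        }
        return result;
    }

    // Stand-in for a tidying parser: only accepts markup that is already well formed.
    static Document parseWellFormed(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("markup is not well formed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseWellFormed(
                "<html><body><a href=\"a.html\">A</a></body></html>");
        System.out.println(hrefs(doc));
    }
}
```

Because `hrefs` depends only on the standard DOM interfaces, it does not care which parser produced the tree; the parser is the part a tool like JTidy would replace.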

JTidy was written by Andy Quick, who later stepped down from the maintainer position. Now JTidy is maintained by a group of volunteers.

More information on JTidy can be found on the JTidy SourceForge project page.
