For documentation, simple markup is often best. For instance, I used to write this blog in reStructured Text; the blogging engine’s Python back-end converted the reStructured Text markup to HTML. (I now use Markdown.)
When I started writing more Scala, I wanted to do something similar with the documents (the user’s guides, README files, etc.) that go with the code I write. I looked for a reStructured Text parser in Scala or Java, but I couldn’t find one, and I didn’t feel like writing one. I could probably have used the existing DocUtils package with Jython, but that seemed like a lot of extra overhead, just to generate HTML from a simple text markup file.
Eventually, I settled on Markdown. Initially, I chose the Java-based MarkdownJ parser. Unfortunately, I ran into some problems, chief among them that I kept getting exceptions when running MarkdownJ under Java 6. It ran fine under Java 5, so this problem wasn’t insurmountable. But MarkdownJ also doesn’t appear to be aggressively maintained these days. So, when I began converting my Scala code to Scala 2.8, I also looked for another way to convert my Markdown documents.
Googling around, I ran across Brian Carper’s Clojure and Markdown blog entry, in which he describes using The Mozilla Rhino Javascript engine to run the Javascript Showdown Markdown parser. One of Brian’s reasons for his approach was consistency: He wanted to be able to show users a preview of parsed Markdown in the browser, but also have his Clojure-based backend parse the same Markdown document. Using the same code (Showdown, in his case) increased the chances that the HTML output would match.
This approach seemed reasonable to me, too, so I wrote a small function to
do the same thing in Scala. The Java 6 JDK I’m using on my Mac,
SoyLatte, does not ship with the JSR 223 (i.e., javax.script
)
bindings for Rhino, so I elected to use the Rhino API directly.
Here’s a simple Scala function that takes an iterator over lines of Markdown (presumed not to have a trailing newline) and returns the HTML markup produced by the Markdown processor.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
|
The generated HTML markup does not contain html
or body
tags, so we can
use a simple wrapper function, combined with Scala’s inline XML
capabilities, to generate a full XHTML-compliant document. The following
function takes an iterator over lines of Markdown and optional
Cascading Style Sheet content, and produces a complete HTML document.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
|
There. Markdown to HTML in Scala, by way of Showdown and Rhino. It’s kind of a roundabout way to get there, but it gets the job done.
I’m currently using a variant of this technique in the SBT build file for one of my Scala projects, to convert various documents (including a user’s guide) from Markdown to HTML. I’m seriously considering putting a more generic version, similar to the above, in my Grizzled Scala library.
Update (11 February, 2010): I put a version of this code in my Grizzled Scala library.
Update (3 March, 2010): Tristan Juricek’s Knockoff Markdown parser looks very interesting. It’s written in Scala, and it parses Markdown into an internal object format.
Update (14 December, 2010): MarkWrap, a Scala wrapper API for various lightweight markup APIs, replaces the Markdown parser that was in my Grizzled Scala library. MarkWrap parses Markdown and Textile, and it can easily be extended to handle others.