<p><strong>Syntax highlighting on this blog using semantic tags and Vim</strong><br />Olivier Mengué, 2007-08-25</p>
<p>Regular readers of the HTML view of this blog (if they exist) may have noticed some changes in the syntax highlighting of the source code in my posts. I've tweaked the CSS style sheet to use the <a href="http://wiseheartdesign.com/2006/3/11/ruby-blue-textmate-theme">RubyBlue Vim theme</a> instead of the Blue Vim theme. This is a good occasion to explain how I format source code samples in HTML.</p>
<p>I'm a <a href="http://www.vim.org/">Vim</a> addict, using it on Linux, AIX and Windows. Vim has a powerful and extensible syntax highlighting engine that can format almost any existing text file format. And most importantly, it has a plugin that can export the highlighted code as HTML.</p>
<p>There are many advantages in using Vim and storing statically highlighted code:</p>
<ul>
<li>I can use the huge set of languages supported by Vim highlighting;</li>
<li>I can use the huge set of themes built for Vim and easily convert them to CSS for the web;</li>
<li>if a language is not supported, I can define the highlighting myself (<a href="http://www.vim.org/account/profile.php?user_id=4820">I already did it</a> for 4 languages) and it will be done once for use both in Vim and on my blog;</li>
<li>as I store only pure HTML as data in the blog engine (no special wiki code, no plugin), I am not dependent on the engine I'm currently using;</li>
<li>no load on the server (as with PHP formatting engines such as <a href="http://qbnz.com/highlighter/">GeSHi</a>) or on the client (as with <a href="http://code.google.com/p/syntaxhighlighter/">syntaxhighlighter</a>);</li>
<li>as the code sits in plain <code>&lt;pre&gt;</code> tags, no extra characters or tags pollute it: the reader can simply select the text and copy it to the clipboard;</li>
<li>last but not least, I can tweak the output to improve it and fix the highlighting bugs (some languages are very hard to parse).</li>
</ul>
<p>In Vim you can invoke the conversion to HTML from the "Syntax" menu or with this command:</p>
<pre class="code vim vimft-vim">
<span class="Statement">:runtime syntax/2html.vim</span>
</pre>
<p>To get the best XHTML code, I use the following settings in <code>$HOME/.vimrc</code>:</p>
<pre class="code vim vimft-vim">
<span class="Statement">syntax</span> <span class="Type">on</span>
<span class="Comment">" Conversion HTML (:help 2html.vim)</span>
<span class="Statement">let</span> g:html_use_css <span class="Operator">=</span> <span class="Constant">1</span>
<span class="Statement">let</span> g:html_use_encoding <span class="Operator">=</span> <span class="Constant">"utf8"</span>
<span class="Statement">let</span> g:use_xhtml <span class="Operator">=</span> <span class="Constant">1</span>
</pre>
<p>For example, here is the HTML that <code>2html.vim</code> generates from the code above, once I have removed everything around the <code>&lt;pre&gt;</code> element:</p>
<pre class="code vim vimft-html">
<span class="Identifier">&lt;</span><span class="Statement">pre</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Comment"</span><span class="Identifier">&gt;</span><span class="Special">&amp;quot;</span> Conversion HTML (:help 2html.vim)<span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Statement"</span><span class="Identifier">&gt;</span>let<span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span> g:html_use_css <span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Operator"</span><span class="Identifier">&gt;</span>=<span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span> <span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Constant"</span><span class="Identifier">&gt;</span>1<span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Statement"</span><span class="Identifier">&gt;</span>let<span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span> g:html_use_encoding <span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Operator"</span><span class="Identifier">&gt;</span>=<span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span> <span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Constant"</span><span class="Identifier">&gt;</span><span class="Special">&amp;quot;</span>utf8<span class="Special">&amp;quot;</span><span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Statement"</span><span class="Identifier">&gt;</span>let<span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span> g:use_xhtml <span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Operator"</span><span class="Identifier">&gt;</span>=<span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span> <span class="Identifier">&lt;</span><span class="Statement">span</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"Constant"</span><span class="Identifier">&gt;</span>1<span class="Identifier">&lt;/</span><span class="Statement">span</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;/</span><span class="Statement">pre</span><span class="Identifier">&gt;</span>
</pre>
<p>I just have to add my own set of classes to enable highlighting:</p>
<pre class="code vim vimft-html">
<span class="Identifier">&lt;</span><span class="Statement">pre</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"code vim vimft-vim"</span><span class="Identifier">&gt;</span>...<span class="Identifier">&lt;/</span><span class="Statement">pre</span><span class="Identifier">&gt;</span>
</pre>
<p>Here is the meaning of each class:</p>
<ul>
<li><code class="vim vimft-html"><span class="Constant">code</span></code> is my generic class for source code blocks </li>
<li><code class="vim vimft-html"><span class="Constant">vim</span></code> is for source code formatted using the Vim classes for highlighting </li>
<li><code class="vim vimft-html"><span class="Constant">vimft-html</span></code> identifies the language of the code (here HTML): the suffix is the value of Vim's <a href="http://vimdoc.sourceforge.net/htmldoc/options.html#'ft'">filetype</a> option, displayed with "<code class="vim vimft-vim">:<span class="Statement">set</span> <span class="PreProc">ft</span>?</code>".</li>
</ul>
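<p>The manual step described above (keep only the <code>&lt;pre&gt;</code> element, then add the classes) could also be scripted. Here is a minimal sketch in JavaScript, not the process I actually use: the function name <code>extractHighlightedPre</code> is hypothetical, and it assumes that the page generated by <code>2html.vim</code> contains a single <code>&lt;pre&gt;</code> element.</p>

```javascript
// Hypothetical helper (not part of Vim or 2html.vim): keep only the first
// <pre> element of a page generated by 2html.vim and tag it with the
// classes used on this blog.
function extractHighlightedPre(page, filetype) {
  // [^>]* tolerates attributes on the generated <pre> tag.
  const match = /<pre\b[^>]*>([\s\S]*?)<\/pre>/i.exec(page);
  if (match === null) {
    throw new Error('no <pre> element found');
  }
  return `<pre class="code vim vimft-${filetype}">${match[1]}</pre>`;
}
```

<p>The non-greedy match stops at the first literal closing tag, which should be the real one as long as the sample code itself was escaped by the converter.</p>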
<p>For terminal output samples, I use my own highlighting, based on semantic XHTML tags:</p>
<ul>
<li><code class="vim vimft-html"><span class="Identifier">&lt;</span><span class="Statement">pre</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"terminal"</span><span class="Identifier">&gt;</span></code>, the enclosing tag, with an optional class:
<ul>
<li><code class="vim vimft-html"><span class="Constant">unix</span></code> for Unix/Linux samples;</li>
<li><code class="vim vimft-html"><span class="Constant">cmd</span></code> for Windows <code>cmd.exe</code> shell code.</li>
</ul>
</li>
<li><code class="vim vimft-html"><span class="Identifier">&lt;</span><span class="Statement">kbd</span><span class="Identifier">&gt;</span></code> for what is typed in the terminal, with the following optional classes:
<ul>
<li><code class="vim vimft-html"><span class="Constant">shell</span></code> for any Unix shell sample;</li>
<li><code class="vim vimft-html"><span class="Constant">bash</span></code> or <code class="vim vimft-html"><span class="Constant">ksh</span></code> (in addition to <code class="vim vimft-html"><span class="Constant">shell</span></code>) for Unix shell samples that use features which are not in the standard POSIX shell;</li>
<li><code class="vim vimft-html"><span class="Constant">cmd</span></code> for Windows <code>cmd.exe</code> shell code.</li>
</ul>
</li>
<li><code class="vim vimft-html"><span class="Identifier">&lt;</span><span class="Statement">samp</span><span class="Identifier">&gt;</span></code> for program output, with the following optional classes:
<ul>
<li><code class="vim vimft-html"><span class="Constant">prompt</span></code> for shell or interactive program prompts;</li>
<li><code class="vim vimft-html"><span class="Constant">shell</span></code>, <code class="vim vimft-html"><span class="Constant">bash</span></code>, <code class="vim vimft-html"><span class="Constant">ksh</span></code> or <code class="vim vimft-html"><span class="Constant">cmd</span></code> for shell prompts (in addition to <code class="vim vimft-html"><span class="Constant">prompt</span></code>);</li>
<li><code class="vim vimft-html"><span class="Constant">sqlite</span></code>, <code class="vim vimft-html"><span class="Constant">sqlite3</span></code> for SQLite client samples...</li>
</ul>
</li>
<li><code class="vim vimft-html"><span class="Identifier">&lt;</span><span class="Statement">var</span><span class="Identifier">&gt;</span></code> for variable input/output. Everything except the <code>&lt;var&gt;</code> content should match exactly if you reproduce the session yourself. The <code class="Identifier">title</code> attribute can indicate what the variable represents, and on which data it depends. The tag is always a direct child of either <code>&lt;kbd&gt;</code> or <code>&lt;samp&gt;</code>.</li>
</ul>
<p>Here is an example:</p>
<pre class="code vim vimft-html">
<span class="Identifier">&lt;</span><span class="Statement">pre</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"terminal unix"</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;</span><span class="Statement">samp</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"prompt shell"</span><span class="Identifier">&gt;</span>$ <span class="Identifier">&lt;/</span><span class="Statement">samp</span><span class="Identifier">&gt;</span><span class="Identifier">&lt;</span><span class="Statement">kbd</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"shell"</span><span class="Identifier">&gt;</span>echo Hello, world!<span class="Identifier">&lt;/</span><span class="Statement">kbd</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;</span><span class="Statement">samp</span><span class="Identifier">&gt;</span>Hello, world!<span class="Identifier">&lt;/</span><span class="Statement">samp</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;</span><span class="Statement">samp</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"prompt shell"</span><span class="Identifier">&gt;</span>$ <span class="Identifier">&lt;/</span><span class="Statement">samp</span><span class="Identifier">&gt;</span><span class="Identifier">&lt;</span><span class="Statement">kbd</span><span class="Identifier"> </span><span class="Type">class</span><span class="Identifier">=</span><span class="Constant">"shell"</span><span class="Identifier">&gt;</span>date<span class="Identifier">&lt;/</span><span class="Statement">kbd</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;</span><span class="Statement">samp</span><span class="Identifier">&gt;&lt;</span><span class="Statement">var</span><span class="Identifier">&gt;</span>samedi 25 août 2007, 18:51:00 (UTC+0200)<span class="Identifier">&lt;/</span><span class="Statement">var</span><span class="Identifier">&gt;&lt;/</span><span class="Statement">samp</span><span class="Identifier">&gt;</span>
<span class="Identifier">&lt;/</span><span class="Statement">pre</span><span class="Identifier">&gt;</span>
</pre>
<p>And the final result:</p>
<pre class="terminal unix">
<samp class="prompt shell">$ </samp><kbd class="shell">echo Hello, world!</kbd>
<samp>Hello, world!</samp>
<samp class="prompt shell">$ </samp><kbd class="shell">date</kbd>
<samp><var>samedi 25 août 2007, 18:51:00 (UTC+0200)</var></samp>
</pre>
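<p>Writing this markup by hand is repetitive, so it could be generated from a list of commands and outputs. The following JavaScript is only a sketch of that idea: <code>escapeHtml</code> and <code>renderTerminal</code> are hypothetical names, the step fields (<code>command</code>, <code>output</code>, <code>variable</code>) are my own convention for this example, and only the <code>unix</code>/<code>shell</code> classes are handled.</p>

```javascript
// Escape the characters that are special in HTML text.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: build a <pre class="terminal unix"> block from a list
// of steps. Each step has a `command`, an optional `output`, and an optional
// `variable` flag meaning the output changes from one run to the next.
function renderTerminal(steps, prompt = '$ ') {
  const lines = ['<pre class="terminal unix">'];
  for (const { command, output, variable = false } of steps) {
    // The prompt is program output (<samp>), the command is typed (<kbd>).
    lines.push(
      `<samp class="prompt shell">${escapeHtml(prompt)}</samp>` +
      `<kbd class="shell">${escapeHtml(command)}</kbd>`
    );
    if (output !== undefined) {
      const text = escapeHtml(output);
      lines.push(`<samp>${variable ? `<var>${text}</var>` : text}</samp>`);
    }
  }
  lines.push('</pre>');
  return lines.join('\n');
}
```

<p>Called with the two steps of the example above (<code>echo Hello, world!</code>, then <code>date</code> with <code>variable: true</code>), this should produce the same markup as the block shown as the final result.</p>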
<p>These semantic tags will later allow me to provide additional features with JavaScript. I'm thinking of a button that would hide all <code class="vim vimft-html"><span class="Identifier">&lt;</span><span class="Statement">samp</span><span class="Identifier">&gt;</span></code> elements and keep only the <code class="vim vimft-html"><span class="Identifier">&lt;</span><span class="Statement">kbd</span><span class="Identifier">&gt;</span></code> elements, to make it easier to copy the commands into a terminal and run them.</p>
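<p>A minimal sketch of what such a feature could look like, under stated assumptions: the function names <code>extractCommands</code> and <code>setOutputHidden</code> are hypothetical, and <code>pre</code> is expected to be a <code>pre.terminal</code> element (or any object exposing the DOM method <code>querySelectorAll</code>). This is an illustration, not code that runs on this blog.</p>

```javascript
// Collect the commands typed in a terminal sample: the text of every <kbd>
// element, one command per line. In the browser, `pre` is an Element.
function extractCommands(pre) {
  return Array.from(pre.querySelectorAll('kbd'))
    .map((kbd) => kbd.textContent)
    .join('\n');
}

// Hide (or show again) the program output, prompts included, so that only
// the typed commands stay visible.
function setOutputHidden(pre, hidden) {
  for (const samp of pre.querySelectorAll('samp')) {
    samp.hidden = hidden;
  }
}
```

<p>A button handler could call <code>setOutputHidden(pre, true)</code> before the reader selects the text, or pass the result of <code>extractCommands(pre)</code> to <code>navigator.clipboard.writeText</code> in browsers that support it.</p>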
<p>With these tags in place, the CSS stylesheet is quite short and simple. More importantly it is easily replaceable in case I change the theme of the blog.</p>
<pre class="code vim vimft-css">
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim<span class="Normal">,</span>
<span class="Statement">pre</span><span class="Special">.</span>terminal <span class="Identifier">{</span> <span class="Type">margin-left</span>: <span class="Constant">1pt</span>; <span class="Type">padding</span>: <span class="Constant">5pt</span>; <span class="Identifier">}</span>
<span class="Comment">/* Text not embedded in samp or kbd will be in red, to easily detect errors */</span>
<span class="Statement">pre</span><span class="Special">.</span>terminal <span class="Identifier">{</span> <span class="Type">background</span>: <span class="Constant">#000</span>; <span class="Type">color</span>: <span class="Constant">#f00</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span>terminal <span class="Statement">samp</span><span class="Special">.</span>prompt <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#888</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span>terminal <span class="Statement">samp</span> <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#eee</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span>terminal <span class="Statement">kbd</span> <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#fff</span>; <span class="Type">font-weight</span>: <span class="Type">bold</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span>terminal <span class="Statement">var</span> <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#55f</span>; <span class="Type">font-style</span>: <span class="Type">italic</span>; <span class="Identifier">}</span>
<span class="Comment">/* colorscheme <a href="http://wiseheartdesign.com/2006/3/11/ruby-blue-textmate-theme">rubyblue</a> */</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#c7d4e2</span>; <span class="Type">background-color</span>: <span class="Constant">#162433</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Statement">a</span><span class="Special">[</span>href<span class="Special">]</span> <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#0f0</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span>Constant <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#0c0</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span>Comment <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#428bdd</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span>Identifier <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#fff</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span><span class="Statement">Label</span> <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#ff0</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span>Operator <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#ff0</span>; <span class="Type">font-weight</span>: <span class="Type">bold</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span>PreProc <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#f9bb00</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span>Special <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#0c0</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span>Statement <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#f9bb00</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span><span class="Statement">Title</span> <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#fff</span>; <span class="Type">font-weight</span>: <span class="Type">bold</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span>Type <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#fff</span>; <span class="Type">text-decoration</span>: <span class="Type">underline</span>; <span class="Identifier">}</span>
<span class="Statement">pre</span><span class="Special">.</span><span class="Statement">code</span><span class="Special">.</span>vim <span class="Special">.</span>Underlined <span class="Identifier">{</span> <span class="Type">color</span>: <span class="Constant">#208aff</span>; <span class="Type">text-decoration</span>: <span class="Type">underline</span>; <span class="Identifier">}</span>
</pre>
<p>I had to tweak the blog engine I use (<a href="http://www.dotclear.net/">DotClear</a>) a bit to add this style sheet: modifying the <code>template.php</code> in the theme directory is not enough because the theme is used only on the public part of the blog. So I added an <code class="vim vimft-css"><span class="PreProc">@import</span></code> rule in <code>ecrire/style/default.css</code> to enable the CSS in the private area of the blog.</p>
<p>I would be glad to read about your experience with source code formatting on your own blog or CMS.</p>
<p class="update"><strong>Update 2007-09-06:</strong> added missing information about the usage of <code>&lt;var&gt;</code> tags.</p>