Creating Custom Pandoc Writers in Lua
Introduction
If you need to render a format not already handled by pandoc, or you want to change how pandoc renders a format, you can create a custom writer using the Lua language. Pandoc has a built-in Lua interpreter, so you needn’t install any additional software to do this.
A custom writer is a Lua file that defines how to render the
document. Writers must define just a single function, named either
Writer or ByteStringWriter, which gets
passed the document and writer options, and then handles the
conversion of the document, rendering it into a string. This
interface was introduced in pandoc 2.17.2, with ByteString writers
becoming available in pandoc 3.0.
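As a rough illustration of this contract, the following sketch shows a complete writer file. The file name plain-blocks.lua and the output layout (one paragraph of plain text per top-level block) are assumptions made for this example only; pandoc.utils.stringify is used to strip all formatting.

```lua
-- Sketch of a minimal new-style writer (hypothetical file: plain-blocks.lua).
-- Each top-level block is reduced to plain text; blocks are separated
-- by a blank line.
function Writer (doc, opts)
  local parts = {}
  for _, block in ipairs(doc.blocks) do
    -- stringify drops all formatting and returns the textual content
    parts[#parts + 1] = pandoc.utils.stringify(block)
  end
  return table.concat(parts, '\n\n') .. '\n'
end
```

Such a file would be passed to pandoc as the output format, e.g., pandoc -t plain-blocks.lua input.md.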
Pandoc also supports “classic” custom writers, where a Lua function must be defined for each AST element type. Classic style writers are deprecated and should be replaced with new-style writers if possible.
Writers
Custom writers using the new style must contain a global
function named Writer or
ByteStringWriter. Pandoc calls this function with the
document and writer options as arguments, and expects the function
to return a UTF-8 encoded string.

function Writer (doc, opts)
  -- ...
end

Writers that do not return text but binary data should define a
function with name ByteStringWriter instead. The
function must still return a string, but it does not have to be
UTF-8 encoded and can contain arbitrary binary data.
If both Writer and ByteStringWriter
functions are defined, then only the ByteStringWriter function
will be used.
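A binary writer could be sketched as follows. The output framing (a 4-byte little-endian length prefix followed by the plain-text rendering) is invented purely for illustration and does not correspond to any real format; string.pack assumes Lua 5.3 or later.

```lua
-- Sketch of a binary writer. The framing used here is hypothetical:
-- a 4-byte little-endian length, followed by the document as plain text.
function ByteStringWriter (doc, opts)
  local text = pandoc.write(doc, 'plain', opts)
  -- '<I4' packs an unsigned 32-bit integer in little-endian byte order
  return string.pack('<I4', #text) .. text
end
```

Because the result is not guaranteed to be valid UTF-8, it is probably best written to a file with -o rather than to the terminal.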
Format extensions
Writers can be customized through format extensions, such as
smart, citations, or
hard_line_breaks. The global Extensions
table indicates supported extensions with a key. Extensions
enabled by default are assigned a true value, while those that are
supported but disabled are assigned a false value.
Example: A writer with the following global table supports the
extensions smart, citations, and
foobar, with smart enabled and the
others disabled by default:

Extensions = {
  smart = true,
  citations = false,
  foobar = false
}

Users control extensions as usual, e.g.,
pandoc -t my-writer.lua+citations. The extensions are
accessible through the writer options’ extensions
field, e.g.:

function Writer (doc, opts)
  print(
    'The citations extension is',
    opts.extensions:includes 'citations' and 'enabled' or 'disabled'
  )
  -- ...
end

Default template
The default template of a custom writer is defined by the
return value of the global function Template. Pandoc
uses the default template for rendering when the user has not
specified a template but has invoked pandoc with the
-s/--standalone flag.
The Template global can be left undefined, in
which case pandoc will throw an error when it would otherwise use
the default template.
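A default template might be provided along these lines. The template text is only an illustration; it uses the standard template variables title and body and is not taken from any existing writer.

```lua
-- Sketch: the return value of the global function Template serves as
-- the default template. The template text below is illustrative only.
function Template ()
  return table.concat({
    '$if(title)$',
    '$title$',
    '',
    '$endif$',
    '$body$',
  }, '\n') .. '\n'
end
```

With such a definition in place, pandoc -s -t my-writer.lua would fall back to this template when no --template option is given.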
Example: modified Markdown writer
Writers have access to all modules described in the Lua filters
documentation. This includes pandoc.write, which
can be used to render a document in a format already supported by
pandoc. The document can be modified before this conversion, as
demonstrated in the following short example. It renders a document
as GitHub Flavored Markdown, but always uses fenced code blocks,
never indented code.

function Writer (doc, opts)
  local filter = {
    CodeBlock = function (cb)
      -- only modify if code block has no attributes
      if cb.attr == pandoc.Attr() then
        local delimited = '```\n' .. cb.text .. '\n```'
        return pandoc.RawBlock('markdown', delimited)
      end
    end
  }
  return pandoc.write(doc:walk(filter), 'gfm', opts)
end

Template = pandoc.template.default 'gfm'

Reducing boilerplate with pandoc.scaffolding.Writer
The pandoc.scaffolding.Writer structure is a
custom writer scaffold that helps to avoid common boilerplate
code when defining a custom writer. The object can be used as a
function and takes care of details like metadata and template
handling, so that only the render functions for each AST element
type need to be defined.
The value of pandoc.scaffolding.Writer is a
function that should usually be assigned to the global
Writer:

Writer = pandoc.scaffolding.Writer

The render functions for Block and Inline values can then be
added to Writer.Block and Writer.Inline,
respectively. The functions are passed the element and the
WriterOptions.

-- layout primitives used by the render functions below
local cr, space = pandoc.layout.cr, pandoc.layout.space

Writer.Inline.Str = function (str)
  return str.text
end
Writer.Inline.SoftBreak = function (_, opts)
  return opts.wrap_text == "wrap-preserve"
    and cr
    or space
end
Writer.Inline.LineBreak = cr
Writer.Block.Para = function (para)
  return {Writer.Inlines(para.content), pandoc.layout.blankline}
end

The render functions must return a string, a pandoc.layout
Doc element, or a list of such elements. In the latter
case, the values are concatenated as if they were passed to
pandoc.layout.concat. If the value does not depend on
the input, a constant can be used as well.
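The different kinds of return values could look as follows. This is a sketch only; the output syntax (asterisks for emphasis) is an arbitrary choice made for this example.

```lua
local layout = pandoc.layout

Writer = pandoc.scaffolding.Writer

-- A plain string is a valid return value.
Writer.Inline.Str = function (str)
  return str.text
end

-- A constant can be used when the output does not depend on the element.
Writer.Inline.Space = layout.space

-- A list of strings and Doc values; the items are concatenated.
Writer.Inline.Emph = function (emph)
  return {'*', Writer.Inlines(emph.content), '*'}
end

-- A single Doc value, here the rendered inline content.
Writer.Block.Plain = function (plain)
  return Writer.Inlines(plain.content)
end
```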
The tables Writer.Block and
Writer.Inline can be used as functions; they apply
the right render function for an element of the respective type.
E.g., Writer.Block(pandoc.Para 'x') will delegate to
the Writer.Block.Para render function and will return the
result of that call.
Similarly, the functions Writer.Blocks and
Writer.Inlines can be used to render lists of
elements, and Writer.Pandoc renders the document’s
blocks. The function Writer.Blocks can take a
separator as an optional second argument, e.g.,
Writer.Blocks(blks, pandoc.layout.cr); the default
block separator is pandoc.layout.blankline.
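The optional separator is useful when nested blocks should be packed more tightly than top-level ones. The sketch below renders the content of a block quote with one block per line and then prefixes each line; the "> " marker and the minimal set of render functions are assumptions made for this example.

```lua
local layout = pandoc.layout

Writer = pandoc.scaffolding.Writer

Writer.Inline.Str = function (str)
  return str.text
end
Writer.Inline.Space = layout.space

Writer.Block.Para = function (para)
  return Writer.Inlines(para.content)
end

-- Separate the quoted blocks with a single line break (layout.cr)
-- instead of the default blank line, then prefix every line.
Writer.Block.BlockQuote = function (quote)
  local inner = Writer.Blocks(quote.content, layout.cr)
  return layout.prefixed(inner, '> ')
end
```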
All predefined functions can be overwritten when needed.
The resulting Writer uses the render functions to handle metadata values and converts them to template variables. The template is applied automatically if one is given.
Classic style
A writer using the classic style defines rendering functions for each element of the pandoc AST. Note that this style is deprecated and may be removed in later versions.
For example,

function Para(s)
  return "<paragraph>" .. s .. "</paragraph>"
end

Template variables
New template variables can be added, or existing ones modified,
by returning a second value from function Doc.
For example, the following will add the current date in
variable date, unless date is already
defined as either a metadata value or a variable:

function Doc (body, meta, vars)
  -- prefer an existing variable, then the metadata value, then today's date
  vars.date = vars.date or meta.date or os.date '%B %e, %Y'
  return body, vars
end

Changes in pandoc 3.0
Custom writers were reworked in pandoc 3.0. For technical
reasons, the global variables PANDOC_DOCUMENT and
PANDOC_WRITER_OPTIONS are set to the empty document
and default values, respectively. The old behavior can be restored
by adding the following snippet, which turns a classic writer into a
new-style writer.

function Writer (doc, opts)
  PANDOC_DOCUMENT = doc
  PANDOC_WRITER_OPTIONS = opts
  loadfile(PANDOC_SCRIPT_FILE)()
  return pandoc.write_classic(doc, opts)
end