NAV
shell perl http

WORK IN PROGRESS

Introduction

Welcome to a practical cookbook of LaTeXML-related invocations, and customization switches. This resource contains a variety of recipes for using LaTeXML and its ecosystem, as well as a full enumeration of the public API

Basic use

The most common use pattern is to produce an HTML5 equivalent for a given TeX input, which we'll use to get started.

We start with a simple hello world snippet

% file.tex
\documentclass{article}
\begin{document}
Hello World!
\end{document}

And convert:

use LaTeXML;
use LaTeXML::Common::Config;

my $config = LaTeXML::Common::Config->new(format=>'html5');
my $converter = LaTeXML->get_converter($config);

$response = $converter->convert('file.tex');

my ($result, $log, $status, $status_code) = map {$$response{$_}} qw(result log status status_code);
latexml file.tex --dest=file.xml
latexmlpost file.xml --dest=file.html

# Alternatively, latexmlc = latexml+latexmlpost+(optional http client)
latexmlc file.tex --dest=file.html
POST /convert HTTP/1.1
Host: latexml.mathweb.org
Content-Type: application/x-www-form-urlencoded
Content-Length: 30

tex=Hello%20World!&format=html

In a nutshell, LaTeXML will always start with an input in the TeX/LaTeX ecosystem, and map it into a target format of choice, with a wide range of choices for customizing individual aspects of the conversion. This documentation effort will try to enumerate as many as possible from the pragmatic uses of LaTeXML we know of, with brief explanations of the exact invocation.

LaTeXML can be used with different degrees of complexity, proportionally to the complexity of the input text. Certain questions only become relevant with document size (e.g. cross-linking multiple web pages corresponding to book chapters), while others become relevant with input volume (e.g. converting millions of individual formulas from Wikipedia).

Converting a single formula

On the command side, you have the choice between using the dedicated out-of-the-box latexmlmath executable, or the omni-executable latexmlc, for "formula-to-formula" conversions. We will provide examples using both latexmlmath and latexmlc where possible.

To Presentation MathML

# The general form of latexmlmath is:
latexmlmath '\sqrt{x}'

# The equivalent formula-to-formula call of latexmlc is:
latexmlc --whatsin=math --whatsout=math --format=html 'literal:\sqrt{x}'
my $converter = LaTeXML->get_converter(LaTeXML::Common::Config->new(
  whatsin=>'math', whatsout=>'math', format=>'html' ));

$converter->convert('literal:\sqrt{x}');
POST /convert HTTP/1.1
Host: latexml.mathweb.org
Content-Type: application/x-www-form-urlencoded
Content-Length: 57

tex=%5Csqrt%7Bx%7D&whatsin=math&whatsout=math&format=html

Result:

<math xmlns="http://www.w3.org/1998/Math/MathML" alttext="\sqrt{x}" display="block">
  <msqrt>
    <mi>x</mi>
  </msqrt>
</math>

Presentation MathML is the default output of latexmlmath, and is also the default serialization for all of LaTeXML's post-processing formats (ePub, (X)HTML, JATS, ...). Instead, the defaults for latexml and latexmlc target LaTeXML's own XML schema, as they do not include post-processing in a default run.

To full MathML (presentation+content)

# Cross-referenced presentation and content in a single formula output:
latexmlc --pmml --cmml --whatsin=math --whatsout=math --format=html 'literal:\sqrt{x}'

# Alternatively, separate presentation and content written in single-formula files:
latexmlmath --pmml=sqrt.pmml --cmml=sqrt.cmml '\sqrt{x}'

To full MathML with annotations

To SVG image

To PNG image

Short article

Book-sized manuscript

Advanced Uses for Mathematics

Advanced Uses for Formats (EPUB, JATS)

Batch processing

...

API

All Customization switches

Errors

LaTeXML has a sizeable collection of different errors, which have evolved over time from stress testing the application over the 1+ million articles in arXiv.org. The general convention is introduced in the manual link here. In this cookbook, we try to enumerate the intuitions of each concrete error message.

Status Codes

Each conversion ends with a given severity, or status, coded as:

Severity/Status Code
no_problem 0
warning 1
error 2
fatal 3

No Problems

When no error-level messages are emitted during conversion and post-processing, the results is considered to have "No Obvious Problems" and terminates successfully.

Warning

Severity Category What Description
Warning expected <metadata> expected metadata was not found, such as the refered targets of labels, sources of graphics inclusions, or missing numeric arguments for some assignments (treated as zero)
Warning imageprocessing Crop empty images can not be cropped, minor, ignored.
Warning latex \GenericWarning Usually when a .sty or .cls file is interpreted raw, and latexml triggers a warning coded by the original package author.
Warning limitation <filename> Usually a complex transform limitation in Post::Graphics, operation may be ignored.
Warning malformed labels No open node can be assigned labels, so the labels in question will be dropped.
Warning misdefined \command minor convention violations, e.g. a conditional macro being defined without starting with an \if prefix.
Warning missing_file <filename> A raw TeX dependency (usually .sty, .cls or .def) can not be loaded. Either the dependency is missing from the system entirely, or when --includestyles is not specified, a .ltxml binding is missing.
Warning not_parsed <GRAMMAR_ROLES+> A warning emitted when latexml's MathGrammar fails to construct a tree for a given TeX formula. The error includes a window of the grammar lexemes where the parse failed to proceed.
Warning perl warn A native perl warning that did not have a dedicated handler, e.g. redefined subroutines; wrong offsets in indexing
Warning undefined <name> an undefined construct with minor impact was encountered, commonly a TeX counter
Warning unexpected <construct> A macro, value, XML element, or inclusion was not expected in the current context, but a clear recovery path is known.
Warning uninitialized <$value> Caught a native perl warning when encountering operations over uninitialized variables

Error

Severity Category What Description
Error expected <token> Cases where a mandatory argument was necessary, but no argument was available. Processing proceeds with an empty or default value.
Error I/O <filename> Could not read auxiliary file, ignoring when caught.
Error imageprocessing `Read Scale
Error latex \GenericError Usually seen when a .sty or .cls file is interpreted raw, and latexml triggers an error coded by the original package author. Generally due to latexml mistakes in emulating pdflatex.
Error malformed <ns:element> The document construction phase could not use the given element in the document fragment built so far, as enforced by the latexml XML schema. Usually a result of a previous processing error, sometimes a discrepancy between the TeX boxing model and the XML tree paradigm.
Error misdefined <token> A token was not meaningful in the current context (e.g. it reached the stomach without a definition, or was not part of a mini language for a package, as in an RGB color specification). Usually processing ignores the token and proceeds.
Error missing_file <filename> can't read an auxiliary file for a binding macro, usually left ignored as empty.
Error pgfparse pgfparse failure to parse a pgf expression, either due to bad syntax, or latexml binding deficiency
Error recursion \command Macro may be defined as expanding into itself, which would be an infinite loop if unchecked. Expansion is ignored when caught.
Error undefined \command An error where a TeX \command was attempted to be digested without first being defined. While this could be an author error, it is also common where prerequisite .sty or .cls libraries were not present or failed to correctly load.
Error unexpected \command or <token> Processing did not expect the content at this expansion point. Any of: ending a wrong mode or environment; using a construct in the wrong TeX mode; or expanding illegal content in restricted processing (such as inside \csname).

Fatal

Severity Category What Description
Fatal die <filename> A native perl death, as in Fatal:perl:die, caught by a different handler (to be refactored)
Fatal expected <tokens> Expected specific tokens that were not found, usually as part of a matching the signature of a TeX macro's arguments.
Fatal I/O unreadable It was impossible to read a mandatory file
Fatal internal <recursion> Detected an infinite recursion during TeX expansion, leading to a hard abort.
Fatal invalid binary main TeX source seems to have been a binary file, abort.
Fatal invalid archive did not detect a main TeX source when examining an input archive, abort.
Fatal misdefined \command The defined expansion of the macro in question was considered unusable (e.g. unbalanced {})
Fatal missing_file <filename> The main TeX file was missing from the file system
Fatal perl deep_recursion Perl exceeded its maximum allowed recursion depth in native subroutine calls. This could indicate an infinite loop, so is treated as a hard abort.
Fatal perl die critical error in native Perl, usually preceded by a cascade of lighter errors with TeX interpretation
Fatal timeout timedout Took longer than the alloted --timeout seconds, and aborted.
Fatal too_many_errors 100 When latexml encounters 100 errors, it considers the document too badly broken to continue, and terminates early without creating any output.
Fatal unexpected <endgroup> Conversion ran into too many closing groups, the last of which was locked and should have never been ended