InDesign 20.0 Goes to MathML — Part 1
October 18, 2024 | Snippets | en
The latest version of InDesign provides an internal solution for formatting math formulas. The underlying technology is a UXP WebView panel (Math Expressions) interacting with MathJax, a popular JavaScript package capable of processing maths in various formats. Here it takes MathML as input and produces SVG as output.
Even if some implementation choices lack maturity (not to mention the very adoption of MathML, which is intimidating for the novice), we salute Adobe's effort to respond to a long-standing demand from designers of scientific works.
MathML is a markup language focused on mathematical descriptors, so it is in line with many XML/HTML dialects intended to improve both the semantics and the rendering of documents. Although virtually as powerful (and as complicated!) as LaTeX lingo, it remains cumbersome to grasp and debug.
It seems that engineers still haven't figured out that for a non-developer, opening and closing XML branches is an unbearable chore. Considering that MathJax also natively supports AsciiMath, we could hope that the Math Expressions dialog will expand and offer this input alternative in the near future. Could we?
MathML Basics
MathJax supports the MathML 3.0 mathematics tags, “with some limitations” (see MathML Support). For its part, Adobe indicates that “some glyphs, such as Closed surface integral, Thick Space, Differential D, Imaginary I, Exponential E, and a rare closing bracket symbol, are not supported and appear as missing glyphs in the generated expression.” Developers promise to fix these issues in future updates.
For now, all you need to know is that a MathML expression has the following form:
<math> ... </math>
and that all inner XML elements will start with the letter ‘m’. They are “meant to express the syntactic structure of mathematical notation (…) Because of this, a single row of identifiers and operators will often be represented by multiple nested mrow elements rather than a single mrow.”
The typical example, ‘x+a/b’, is encoded:
<math> <mrow> <mi> x </mi> <mo> + </mo> <mrow> <mi> a </mi> <mo> / </mo> <mi> b </mi> </mrow> </mrow> </math>
So you can see that each fundamental element has its own locker, explicitly associated with one of the following tokens:
Token | Meaning | Examples |
---|---|---|
mi | identifier | x ; a ; β ; sin ; f |
mn | number | 1 ; 3.1416 ; 0xFF |
mo | operator | + ; – ; × ; ∀ ; ≈ ; ⊂ ; ⊕ |
mtext | arbitrary text | Theorem 1: |
mspace | space | (any blank space) |
With these elements in mind you have the essential bricks to get started.
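To see the five token types side by side, here is a small sketch of my own. The sentence and the values are arbitrary, and I am assuming (without having checked every case in InDesign 20.0) that the panel tolerates XML comments and renders mspace and mtext as MathJax normally does:

```xml
<math>
  <mrow>
    <!-- mtext: free text, not interpreted as mathematics -->
    <mtext>Theorem 1:</mtext>
    <!-- mspace: an explicit blank, here 1em wide -->
    <mspace width="1em"/>
    <!-- mi: identifier; mo: operator; mn: number -->
    <mi>β</mi>
    <mo>+</mo>
    <mn>1</mn>
    <mo>≈</mo>
    <mn>3.1416</mn>
  </mrow>
</math>
```

Each visible piece sits in exactly one token element, which is the habit to acquire before moving on to nested structures. If the comments cause trouble in the panel, they can simply be deleted.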
Nested Terms and Formatting
MathML then provides formatting schemes for handling groups, superscripts/subscripts, fractional notations, etc. Here are the main tools:
Element | Arguments | Effect |
---|---|---|
mrow | any | Grouping (inline) |
mfrac | 2 | Fraction (num/den) |
msqrt | 1 | Square root |
mroot | 2 | Nth root |
msub | 2 | Subscript |
msup | 2 | Superscript |
msubsup | 3 | Sub & Superscript |
munder | 2 | Underscript |
mover | 2 | Overscript |
munderover | 3 | Under & Overscript |
mtable | any | Table or matrix |
mtr | any | Row in table/matrix |
mtd | 1 | Cell in table/matrix |
mfenced | any | Fences, e.g. parentheses |
menclose | 1 | Enclosing notation |
mpadded | 1 | Padding attributes |
mstyle | 1 | Styling attributes |
The deciding parameter is the number of arguments (and the order in which you declare them). For example, the sub-expression x³ requires the encoding
<msup><mi>x</mi><mn>3</mn></msup>
which tells the engine: “I want a superscript structure (hence msup), my base element is the identifier x (1st argument) and my exponent is the number 3 (2nd argument).”
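The same reasoning applies to every structure in the table: count the arguments, then mind their order. As an illustrative sketch (the formula itself is meaningless and only serves to line up three structures), note in particular that mroot takes the radicand first and the index second:

```xml
<math>
  <mrow>
    <!-- msubsup: base, then subscript, then superscript -->
    <msubsup><mi>x</mi><mi>i</mi><mn>2</mn></msubsup>
    <mo>+</mo>
    <!-- mroot: radicand first, index second (cube root of y) -->
    <mroot><mi>y</mi><mn>3</mn></mroot>
    <mo>=</mo>
    <!-- mfrac: numerator, then denominator -->
    <mfrac><mn>1</mn><mn>2</mn></mfrac>
  </mrow>
</math>
```

Swapping the two children of mroot would produce the y-th root of 3, which is valid MathML but probably not what you meant.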
Here is another example… with a tiny bug:
<mi> i </mi> <mo> = </mo> <msqrt> <mo> - </mo> <mn> 1 </mn> </msqrt>
If I refer to my table, the number of arguments of msqrt should be 1. The above code, <msqrt><mo>-</mo><mn>1</mn></msqrt>, gives the false impression that it could digest two successive elements. In fact, this is a syntax error that InDesign tries to fix on its own by inserting missing mrow element(s).
Note: Without going into too many technical details, let's point out that the event handler of the UXP resource com.adobe.indesign.mathexprpanel implements some (basic) MathML code cleanup routines. They are far from eliminating all problems, and some can even add unnecessary mrow tags!
The important thing to remember is that the mrow element allows you to “group any number of sub-expressions horizontally” in order to respect the formal syntax, whenever needed. Hence, if a structure is designed to eat one argument, reach for the mrow glue:
<math> <mrow> <mi> i </mi> <mo> = </mo> <msqrt> <mrow> <mo> - </mo> <mn> 1 </mn> </mrow> </msqrt> </mrow> </math>
Best practice is to correct your code manually, to avoid both XML parsing and MathML interpretation errors. For a more comprehensive view of inline groupings, read the section on inferred mrow elements in the specification.
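The same glue is needed wherever the argument count is fixed. As a minimal sketch, take the fraction (a+b)/2: mfrac expects exactly two children, so the three elements of the numerator have to be wrapped into one mrow, whereas the single-token denominator can stay bare.

```xml
<math>
  <mfrac>
    <!-- 1st argument (numerator): a + b, grouped into a single mrow -->
    <mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>
    <!-- 2nd argument (denominator): a single token needs no mrow -->
    <mn>2</mn>
  </mfrac>
</math>
```

Without the mrow, mfrac would receive four children instead of two, and no automatic cleanup can reliably guess where the numerator ends.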
Style changes (color, size) are primarily handled by common attributes (mathcolor, mathsize) available on most inner elements. The dedicated mstyle container can define a consistent set of attributes that apply at its level and to its child elements. InDesign's Math Expressions panel also offers a pair of “Typeface Properties” (font size and fill color) that govern the overall formatting of the expression. Spacing and alignment issues can be controlled from both the mspace token and the mpadded element.
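A short sketch may help to picture how these attributes combine. The color and size values below are arbitrary choices of mine, and I have not verified how they interact with the panel's own Typeface Properties, so treat the result as what MathML 3.0 specifies rather than as guaranteed InDesign output:

```xml
<math>
  <!-- mstyle: attributes set here are inherited by its children -->
  <mstyle mathcolor="#CC0000">
    <mrow>
      <mi>E</mi>
      <mo>=</mo>
      <!-- mathsize set locally on a single token -->
      <mi mathsize="150%">m</mi>
      <!-- mspace: explicit horizontal gap of half an em -->
      <mspace width="0.5em"/>
      <msup><mi>c</mi><mn>2</mn></msup>
    </mrow>
  </mstyle>
</math>
```

Setting the color once on mstyle avoids repeating mathcolor on every token, while the local mathsize shows that an individual element can still set its own value.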
Tokens (in particular mi) also enjoy the important mathvariant attribute, which “specifies the logical class of the token.” It allows the formatting of conventional glyphs (bold, double-struck, fraktur, script) while conveying semantic intent. Here's an example for Cantor fans:
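One possible illustration, offered as my own sketch and not necessarily the formula the author had in mind, is the cardinality of the continuum. It assumes that the font in use provides the fraktur and double-struck glyphs, which may not hold for every typeface in InDesign:

```xml
<math>
  <mrow>
    <!-- fraktur c: the cardinality of the continuum -->
    <mi mathvariant="fraktur">c</mi>
    <mo>=</mo>
    <!-- 2 raised to aleph-null: msup takes base, then exponent -->
    <msup>
      <mn>2</mn>
      <msub><mi>ℵ</mi><mn>0</mn></msub>
    </msup>
    <mo>=</mo>
    <mo>|</mo>
    <!-- double-struck R: the set of real numbers -->
    <mi mathvariant="double-struck">R</mi>
    <mo>|</mo>
  </mrow>
</math>
```

The source keeps plain ASCII letters (c, R) and leaves the choice of glyph to mathvariant, which is what preserves the semantic intent mentioned above.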
This short introduction to MathML syntax in InDesign does not come close to covering all the subtleties and options available. To learn how to write more elaborate expressions, I recommend:
• elsenaju.eu/mathml/MathML-Examples.htm
• Geeks for Geeks: MathML Tutorial
• mathjax.github.io
The second part of this article will focus on the outgoing side of Math Expressions (a mysterious SVG stream…) and various issues related to the Scripting DOM. (Coming soon.)