Marking up text documents is an increasingly popular method of encoding information in a way that is both comprehensible to humans and machine-processable.
In their simplest form, markup languages can be used to include processing instructions for text, for example to indicate how the text should be displayed. But markup languages can also be used to encode information about the text itself, so that software can work with the meaning of the text and not just its appearance.
Because the marked-up documents reside in simple text files, it is relatively easy to open and view them in a text editor. They are also less likely to become obsolete and unreadable, particularly if the markup language follows a standard.
There are several commonly used markup languages. In some cases, the markup languages themselves have been extended for more specific purposes. The sections below link to additional information about some common markup systems.
LaTeX is a markup system that generates print-quality output, with an emphasis on rendering mathematical expressions.
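As a rough illustration of what LaTeX markup looks like, here is a minimal sketch of a complete document containing one displayed equation. The quadratic formula is chosen only as a familiar example; it is not taken from any of the resources listed here.

```latex
\documentclass{article}
\begin{document}

% Inline math is wrapped in dollar signs.
The roots of $ax^2 + bx + c = 0$ are given by

% The equation environment produces a numbered, displayed formula.
\begin{equation}
  x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
\end{equation}

\end{document}
```

Pasting this into a LaTeX editor should produce a one-page PDF with the formula typeset and numbered.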
There are several LaTeX editors that provide an editing pane and a WYSIWYG preview pane. A popular online editor is Overleaf. With a free account, you can save documents in the cloud and collaborate with others. The output is rendered as a PDF that can be downloaded to your local computer.
Detexify is a tool that allows you to hand-draw symbols, then choose from its best attempt to match your drawing with known symbols encoded as LaTeX. The code you choose can be copied and pasted into your LaTeX editor.
A free online tutorial is available from ShareLaTeX.
Markdown is increasingly popular as a lightweight way to include formatting in text documents. It is the markup system used by the popular GitHub repository system.
XML provides a powerful way to encode complex information in a text document. It can also be used to impose structure on data. As a W3C standard, it is used widely across the Internet in forms such as XHTML web pages and SVG graphics. There are also standards for accessing and manipulating XML data, such as XPath, XSLT, and XQuery.
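To make the idea of structured, queryable XML more concrete, here is a minimal Python sketch that uses only the standard library. The sample document, its element names, and the helper functions `titles_before` and `title_by_year` are all hypothetical, invented for illustration. Note that `xml.etree.ElementTree` supports only a limited subset of XPath, so this is not a full XPath or XQuery processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
XML_TEXT = """<library>
  <book year="1851"><title>Moby-Dick</title><author>Herman Melville</author></book>
  <book year="1813"><title>Pride and Prejudice</title><author>Jane Austen</author></book>
</library>"""


def titles_before(xml_text, year):
    """Return the titles of all books with a year attribute earlier than `year`."""
    root = ET.fromstring(xml_text)
    # The path selects the <book> children of the root element; the numeric
    # comparison is done in Python because ElementTree's XPath subset
    # does not support it.
    return [
        book.findtext("title")
        for book in root.findall("./book")
        if int(book.get("year")) < year
    ]


def title_by_year(xml_text, year):
    """Return the title of the first book whose year attribute equals `year`, or None."""
    root = ET.fromstring(xml_text)
    # Attribute-equality predicates are part of the supported XPath subset.
    match = root.find(f"./book[@year='{year}']/title")
    return match.text if match is not None else None


if __name__ == "__main__":
    print(titles_before(XML_TEXT, 1850))
    print(title_by_year(XML_TEXT, 1851))
```

Because the structure is explicit in the markup, the same queries would work on a document with thousands of `book` elements without any change to the code.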
If you are interested in learning more about technology related to XML, check out the XQuery working group.
Basic introduction to XML (slides). Includes an introduction to TEI.
TEI is commonly used in the Digital Humanities to mark up text documents for processing and data extraction. The Music Encoding Initiative (MEI) is a similar system that is used to mark up text that denotes musical scores.
Extensible Stylesheet Language Transformations (XSLT) is a language for transforming XML documents into other XML documents.
XQuery is a powerful functional programming language that can operate on XML. For further information, see the XQuery resources page.
JSON is an increasingly popular way to transmit data in a structured way. It is probably the most common way that data are provided to software and web pages via APIs.
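As a small sketch of how software typically consumes JSON, the Python example below parses a JSON string into native data structures and serializes it back. The record contents and the helper names `summarize` and `add_field` are hypothetical; a real API would define its own field names.

```python
import json

# Hypothetical API response, invented for this illustration.
RESPONSE_TEXT = '{"title": "Moby-Dick", "year": 1851, "subjects": ["whaling", "sea stories"]}'


def summarize(response_text):
    """Parse a JSON record and return a one-line, human-readable summary."""
    record = json.loads(response_text)  # JSON object -> Python dict
    subjects = ", ".join(record["subjects"])  # JSON array -> Python list
    return f"{record['title']} ({record['year']}): {subjects}"


def add_field(response_text, key, value):
    """Return a new JSON string with one extra key/value pair added."""
    record = json.loads(response_text)
    record[key] = value
    # sort_keys makes the serialized output order predictable.
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(summarize(RESPONSE_TEXT))
    print(add_field(RESPONSE_TEXT, "language", "en"))
```

The round trip from text to dictionary and back is the whole appeal: the same string can be read by a person, sent over a network, and loaded directly into a program's data structures.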