Pre-calculated line breaks for HTML / CSS

Although slowly improving, typography on web pages is of considerably lower quality than high-quality print / PDF typography, such as that produced by LaTeX or Adobe InDesign. In particular, line breaking and hyphenation need considerable improvement. CSS originally did not specify which line breaking algorithm should be used, but browsers all converged on greedy line breaking, which produces poor-quality typography but is fast, simple, and stable. CSS Text Module Level 4 standardizes the current behavior as the default value of a text-wrap property and introduces a pretty option, which instructs the browser to use a higher-quality line breaking algorithm. However, as of the time of writing, no browser supports this property.

I recently came across a CSS library for emulating LaTeX’s default appearance.[1] However, it doesn’t emulate the Knuth–Plass line breaking algorithm, which is one of the things that makes LaTeX output look good. This got me wondering whether it’s possible to emulate the algorithm’s results with plain HTML and CSS. A JavaScript library for this already exists, but it adds extra complexity and is a bit slow. It turns out that it is possible to pre-calculate line breaks and hyphenation for specific column widths in a manner that can be encoded in HTML and CSS, as long as web fonts are used to standardize the text appearance across browsers.

The key is to wrap all the potential line breaks (inserted via ::after pseudo-elements) and hyphens in <span> elements that are hidden by default with display: none;. Media queries are then used to selectively show the line breaks specific to a given column width. Since every line now ends with an explicit line break, each line counts as a last line, so justification needs to be enabled using text-align-last: justify;. In addition, word-spacing: -10px; is used to avoid additional automatic line breaks due to slight formatting differences between browsers. However, this presents a problem for the actual last line of each paragraph, since it is now also justified instead of left-aligned. This is solved by wrapping each possible last line in a <span> element. Using media queries, the <span> element corresponding to the given column width is set to use display: flex;, which left-aligns the content and makes it take up only the minimum space required, thereby undoing the justification; word-spacing: 0; is also set to restore normal word spacing. Unfortunately, the nested <span> elements are problematic, because there are no spaces between them; this is fixed by including a space in the HTML markup at the beginning of each <span> and setting white-space: pre; to force the space to appear.
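The pieces described above can be put together in a minimal sketch. The class names (precalc, lb, w300, w400, last, last-w300, last-w400) are hypothetical, only two column widths are shown, and the break positions are illustrative placeholders rather than output of the Knuth–Plass algorithm. A real page would also load a web font so that text metrics match across browsers.

```html
<style>
  /* All class names here are hypothetical, chosen for illustration. */
  .precalc {
    text-align: justify;
    text-align-last: justify; /* each line ends in a forced break, so each is a "last" line */
    word-spacing: -10px;      /* slack so the browser does not wrap before the forced break */
  }
  .lb { display: none; }                          /* breaks and hyphens hidden by default */
  .lb::after { content: "\A"; white-space: pre; } /* forced line break when shown */

  @media (max-width: 399px) {
    .precalc { width: 300px; }
    .lb.w300 { display: inline; }
    .last-w300 { display: flex; word-spacing: 0; } /* real last line: left-aligned */
  }
  @media (min-width: 400px) {
    .precalc { width: 400px; }
    .lb.w400 { display: inline; }
    .last-w400 { display: flex; word-spacing: 0; }
    .last-w400 .last { white-space: pre; } /* keep the leading space of nested spans */
  }
</style>

<!-- Break positions are placeholders, not computed by Knuth–Plass. -->
<p class="precalc">Browsers break lines greedily, filling
<span class="lb w300"></span>each line before
<span class="lb w400"></span>moving on, whereas TeX
con<span class="lb w300">-</span>siders the whole paragraph<span
class="last last-w400"> and picks<span
class="last last-w300"> the breaks that minimize total badness.</span></span></p>
```

In this sketch the 300 px last line is nested inside the 400 px one, because the wider column's last line starts earlier in the text. The nested span is kept on a single source line, since white-space: pre would otherwise also preserve any newlines in the markup.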

I’ve prepared a demo page that shows this technique. It was constructed by calculating line breaks in Firefox 76 using the tex-linebreak bookmarklet and manually inserting the markup corresponding to the line breaks; some fixes were made by hand because the library does not properly support em dashes. Line breaks were calculated for column widths between 250 px and 500 px in 50 px increments. The Knuth–Plass line breaks lead to a considerable improvement in the text appearance, particularly for narrower column widths. In addition to the improved line breaks, I also implemented protrusion of hyphens, periods, and commas into the right margin, a microtypography technique that further improves the appearance. To (hopefully) avoid issues with screen readers, aria-hidden="true" is set on the added markup; user-select: none; is also set, to avoid issues with text copying.
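As a small illustration of the accessibility and copying safeguards, the added markup might be annotated as follows. The lb class name is hypothetical, and the prefixed -webkit-user-select declaration is an addition assumed here for WebKit-based browsers; it is not mentioned in the text above.

```html
<style>
  /* .lb is a hypothetical class name for the added break / hyphen markup. */
  .lb {
    display: none;             /* shown only by the matching media query */
    -webkit-user-select: none; /* prefixed form, assumed for WebKit-based browsers */
    user-select: none;         /* keep inserted hyphens out of selected and copied text */
  }
</style>

<!-- aria-hidden hides the inserted hyphen from assistive technology. -->
<p>micro<span class="lb" aria-hidden="true">-</span>typography</p>
```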

While this technique works fine in Firefox and Chrome, it does not work in Safari, since Safari does not support text-align-last as of Safari 13.[2] Although the property does not work, the corresponding WebKit bug is marked as “resolved fixed”; it seems that support was actually added in 2014, but it sits behind the CSS3_TEXT compile-time flag, which is disabled by default. Thus, I devised an alternative method that uses invisible 100% width elements to force line breaks without explicit line breaks. This again worked in Firefox and Chrome, although it caused minor issues with text selection, but it again had significant issues in Safari. It appears that Safari does not properly handle justified text with negative word spacing; relaxing the word spacing, however, causes extra line breaks due to formatting differences, which breaks the technique. At this point, I gave up on supporting Safari and just let it use the browser default line breaking by placing the technique’s CSS behind an @supports query for text-align-last: justify.
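The fallback can be sketched as below, again with hypothetical class names and a single column width. The rule that hides the added markup stays outside the @supports block, so browsers that fail the query neither show the inserted hyphens nor apply the negative word spacing.

```html
<style>
  /* Class names are hypothetical. Always hide added breaks and hyphens by default. */
  .lb { display: none; }

  /* Browsers without text-align-last skip this whole block
     and fall back to their default line breaking. */
  @supports (text-align-last: justify) {
    .precalc {
      text-align: justify;
      text-align-last: justify;
      word-spacing: -10px;
    }
    .lb::after { content: "\A"; white-space: pre; }
    @media (min-width: 400px) {
      .lb.w400 { display: inline; }
    }
  }
</style>
```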

Automated creation of the markup would be necessary to make this technique more generally useful, but the demo page serves as a proof of concept. Ideally, browsers would implement an improved line breaking algorithm, which would make this technique obsolete.

  1. Also see corresponding Hacker News discussion.  

  2. Even Internet Explorer 6 supports this.  
