How To Create A Table Of Contents Shortcode

It’s time for another WordPress plugin development tutorial 🙂 In this post, I will provide a step-by-step explanation of how to create a WordPress plugin that lets the user add an automatically generated table of contents (TOC) to their posts by using a simple shortcode.

The completed plugin will support the following syntax:

[toc title="Contents" class="css classes" headings="1,2,3"]

For example, the table of contents for this post was generated by this plugin.

Resources

Setting goals

Before we begin, we need to decide what features our plugin is going to have. Let’s keep it simple:

  • Generate a hierarchical TOC based on heading tags.
  • Configurable title.
  • Configurable CSS class.
  • Let the user decide which heading levels (H1-H6) to include in the TOC.
  • Basic styling for the TOC.

I will briefly discuss other possible features and improvements at the end of this article.

Another thing that we need to figure out is which versions of PHP and WordPress we are going to support. This will affect what PHP features and WP APIs we’ll be able to use. Again, let’s keep it simple and stick to WP 3.3+ and PHP 5.2+. Being able to use PHP 5.3 features like closures would be nice, but most WordPress users are probably still on 5.2.x and wouldn’t be able to run the plugin if it used 5.3-specific syntax.

Finally, we’ll need a name – or at least a working title. Let’s go with “Easy TOC”.

Getting it working

Basic setup

First, let’s get the basic plugin header in place. This will allow us to activate the plugin in WordPress right away and test and debug it as we go. I’ll also add a skeleton of the main plugin class.

<?php
/**
 * Plugin Name: Easy Table of Contents
 * Plugin URI: http://w-shadow.com/2012/06/17/table-of-contents-plugin-tutorial/
 * Version: 1.0
 * Author: Janis Elsts
 * Author URI: http://w-shadow.com/
 * Description: Insert a table of contents into your posts by using a [toc] shortcode.
 */

class EasyTOC {
    public function __construct() {
        //TODO
    }
}

new EasyTOC();

For reference: WordPress File Headers.

You may notice that I’ve left out the closing “?>” tag. This is intentional. The end-of-file closing tag is optional and omitting it helps prevent accidental “headers already sent” errors. See also: more reasons to omit the closing PHP tag.

Algorithm overview

There are four things we need to do to build a table of contents for a post:

  1. Find and parse the [toc] shortcode to figure out which heading levels to include in the TOC, what title to use (if any), and so on.
  2. Parse the current post for eligible heading tags and build a list of TOC entries.
  3. Add an anchor name to each heading so that we can link to it from the table of contents. Anchors must be unique.
  4. Actually render the TOC.

We can’t do everything inside a shortcode callback because #3 involves modifying the entire post. So we will generate the TOC in two large steps – first, parse the shortcode, store its attributes, and replace it with a temporary placeholder. Then, in a separate “the_content” hook, process the headings, build the TOC and replace the placeholder we created earlier with the result.

Creating the [toc] shortcode

Modify the plugin class like this:

class EasyTOC {
	const TOC_POSITION_MARKER = '/--Put_TOC_here-/';

	private $tocSettings;

	public function __construct() {
		add_shortcode('toc', array($this, 'doTocShortcode'));
	}

	public function doTocShortcode($attributes) {
		//Only display the TOC on single posts and pages, not archives.
		if ( !(is_single() || is_page()) ) {
			return '';
		}

		$attributes = shortcode_atts(
			array(
				'title' => '',
				'class' => '',
				'headings' => '1,2,3,4,5,6',
			),
			$attributes
		);
		$attributes['headings'] = array_map('intval', explode(',', $attributes['headings']));

		//We'll use these settings later.
		$this->tocSettings = $attributes;

		return self::TOC_POSITION_MARKER;
	}
}
/* ... */

We register the [toc] shortcode using the add_shortcode() API function and set the doTocShortcode() method as the callback. The callback parses shortcode attributes (using sensible defaults) and stores them in a private field. Finally, it replaces the shortcode with a placeholder string that acts as a marker for where to insert the TOC.
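
A side note on the headings attribute: because it is split with explode() and cast with intval(), a value like "2, 3" works fine, but a typo such as "x" silently turns into 0. If you want stricter input handling, something along these lines could work. This is only a sketch: easyTocParseHeadingLevels() is a hypothetical helper that is not part of the plugin, and it assumes invalid or duplicate levels should simply be dropped.

```php
<?php
/**
 * Hypothetical helper (not part of the tutorial's plugin): turn a
 * comma-separated "headings" attribute into a clean list of levels.
 */
function easyTocParseHeadingLevels($value) {
	$levels = array();
	foreach (explode(',', $value) as $part) {
		$level = intval(trim($part));
		// Keep only valid heading levels (H1-H6) and skip duplicates.
		if ($level >= 1 && $level <= 6 && !in_array($level, $levels, true)) {
			$levels[] = $level;
		}
	}
	sort($levels);
	return $levels;
}

// Example: "x" and 9 are dropped, the duplicate 2 is ignored.
print_r(easyTocParseHeadingLevels('3, 2,2,x,9')); // array with the values 2 and 3
```

If the result is empty, the caller would probably want to fall back to the default of all six levels.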

Preparing TOC entries

Next, we’ll scan the post for headings and prepare a list of TOC entries. We can use the well-known “the_content” filter to intercept and modify the post’s content.

The WordPress core function that processes shortcodes is also attached to this filter and runs at priority 11. We will attach our own hook with priority 20 to make sure it runs after the [toc] shortcode has already been processed.

Add this to the class constructor:

add_filter('the_content', array($this, 'insertToc'), 20); //Run after shortcodes.

Here’s the insertToc() callback and supporting code:

class EasyTOC {
	private $tocEntries;
	/* ... */

	public function insertToc($content) {
		//Replace the marker with the actual table of contents.
		if (strpos($content, self::TOC_POSITION_MARKER) !== false) {
			list($entries, $content) = $this->collectTocEntries($content, $this->tocSettings['headings']);
			$toc = $this->buildToc($entries, $this->tocSettings['title'], $this->tocSettings['class']);
			$content = str_replace(self::TOC_POSITION_MARKER, $toc, $content);
		}
		return $content;
	}

	private function collectTocEntries($content, $headingLevels) {
		$this->tocEntries = array();

		$pattern = sprintf(
			'@<h(?P<level>%s)[^>]*>(?P<content>.*?)</h(?P=level)>@si',
			implode('|', $headingLevels)
		);
		$content = preg_replace_callback($pattern, array($this, 'processHeading'), $content);

		return array($this->tocEntries, $content);
	}

	public function processHeading($matches) {
		$anchor = $this->addTocEntry($matches['content'], $matches['level']);
		return $this->addAnchorTarget($matches[0], $anchor);
	}

	private function addTocEntry($title, $level) {
		$title = strip_tags($title);
		$level = intval($level);

		//Each TOC entry needs a unique anchor.
		$anchor = sanitize_title($title);
		if ( array_key_exists($anchor, $this->tocEntries) ) {
			$anchor = $anchor . '-' . count($this->tocEntries);
		}

		$this->tocEntries[$anchor] = array($level, $title);
		return $anchor;
	}

	private function addAnchorTarget($heading, $anchor) {
		//For in-page links, <elem id="..."> is preferred to <a name="..."></a>.
		//See http://stackoverflow.com/questions/484719/html-anchors-with-name-or-id
		$target = sprintf('<span id="%s"></span>', esc_attr($anchor));
		return $target . $heading;
	}

	private function buildToc($entries, $title = '', $class = '') {
		//TODO: Implement TOC rendering.
	}

	/* ... */

Okay, let’s go through the above code step-by-step.

insertToc() is very simple as it delegates all the real work to other methods. If the post contains our placeholder string, insertToc() builds a table of contents based on the shortcode attributes retrieved by doTocShortcode() and inserts it into the post. If not, it returns the post without modification.

collectTocEntries() is a utility function that scans an input string for heading tags and uses them to generate a list of TOC entries. It also adds an anchor target to each heading. Again, most of the work is delegated to other methods.

The HTML4 spec allows for six levels of headings, from H1 to H6. We use regular expressions to find the headings that match the levels specified by the headings="…" shortcode attribute. For example, if headings="1,2,3", the regex generated by the sprintf() call will look like this:

'@<h(?P<level>1|2|3)[^>]*>(?P<content>.*?)</h(?P=level)>@si'

To help you understand how it works, here’s the same regular expression in tree form:
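
Read from left to right, the pattern does the following: <h(?P<level>1|2|3) matches the start of an opening heading tag and captures its level, [^>]*> skips any attributes, (?P<content>.*?) lazily captures the heading's inner HTML, and </h(?P=level)> requires a closing tag of the same level. The "s" flag lets the dot match newlines and "i" makes the match case-insensitive. Here is a small sketch that runs the pattern on a made-up HTML fragment. The demoFindHeadings() helper and the sample markup are assumptions for illustration and are not part of the plugin.

```php
<?php
// Hypothetical demo helper: list the headings that the tutorial's pattern finds.
function demoFindHeadings($html, $headingLevels) {
	$pattern = sprintf(
		'@<h(?P<level>%s)[^>]*>(?P<content>.*?)</h(?P=level)>@si',
		implode('|', $headingLevels)
	);

	// PREG_SET_ORDER groups the results by match rather than by capture group.
	preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

	$found = array();
	foreach ($matches as $match) {
		// "level" is the digit after <h, "content" is the heading's inner HTML.
		$found[] = 'H' . $match['level'] . ': ' . strip_tags($match['content']);
	}
	return $found;
}

// Made-up sample content. The H4 is skipped because only levels 1-3 are requested.
$sample = '<h2 class="intro">Setup</h2><p>Text</p>'
	. '<H3>Install <em>it</em></H3><h4>Ignored</h4>';

echo implode("\n", demoFindHeadings($sample, array(1, 2, 3))), "\n";
// Prints:
// H2: Setup
// H3: Install it
```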

Note: Normally, parsing HTML with regular expressions is a Bad Idea™. It’s better to use a real HTML parser. In PHP that would mean using DOMDocument and friends. However, in this case we’re not just parsing HTML – we’re modifying it, too. DOMDocument is not really suited for modifying HTML fragments like WP posts as it will add extraneous <html>, <head> and <body> elements to the output. We could come up with a work-around or use a third-party parser, but that would unnecessarily complicate the plugin. Regular expressions it is, then.

addTocEntry() uses a particularly handy WP function called sanitize_title() to convert each heading into a “safe” string that can be used as an anchor or a URL component. For example, sanitize_title() will turn “My Nifty Hēadiņg” into “my-nifty-heading”. To ensure that the anchor is unique, addTocEntry() also checks it against the list of already processed headings.
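
One caveat: appending count($this->tocEntries) makes a clash very unlikely, but the suffixed anchor is not checked again, so it could in theory collide with a heading whose own slug already ends in that number. If that ever matters, a loop that keeps incrementing the suffix is one option. The following is only a sketch. demoSlugify() is a crude stand-in for sanitize_title() so that the example runs outside WordPress, and demoUniqueAnchor() is a hypothetical helper that is not part of the plugin.

```php
<?php
// Crude stand-in for WordPress's sanitize_title(); an assumption for this
// demo only. The real function also handles accents and much more.
function demoSlugify($title) {
	$slug = strtolower(trim($title));
	$slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
	return trim($slug, '-');
}

// Hypothetical variant of the uniqueness check: keep bumping a numeric
// suffix until the anchor is not in the list of used anchors.
function demoUniqueAnchor($title, $usedAnchors) {
	$base = demoSlugify($title);
	$anchor = $base;
	$suffix = 2;
	while (in_array($anchor, $usedAnchors, true)) {
		$anchor = $base . '-' . $suffix;
		$suffix++;
	}
	return $anchor;
}

$used = array();
foreach (array('Examples', 'Notes', 'Examples', 'Examples') as $title) {
	$used[] = demoUniqueAnchor($title, $used);
}
echo implode(', ', $used), "\n"; // examples, notes, examples-2, examples-3
```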

addAnchorTarget() simply adds an invisible <span> element to each heading. The element has an ID attribute set to the specified anchor name. This enables us to link to the heading by using the following syntax:

<a href="#anchor">...</a>

As you may know, there are actually two ways to create an element that we can link to using that syntax:

<element id="anchor">...</element>
<a name="anchor">...</a>

I chose the first approach because the “name” attribute is no longer valid for <a> tags in HTML5. While this plugin does not specifically target any HTML standard, it’s still a good idea to generate valid HTML5 markup to ensure it won’t break any HTML5 themes. See this StackOverflow question for a deeper discussion.

Rendering the TOC

All right, time to finally get something on the screen! Here’s the code that renders a hierarchical table of contents:

private function buildToc($entries, $title = '', $class = '') {
	//No entries - no TOC.
	if ( empty($entries) ) {
		return '';
	}

	$class = 'toc ' . $class;
	$html = '<div class="' . esc_attr($class) . '">';

	if ( !empty($title) ) {
		//The title comes from a user-supplied shortcode attribute, so escape it.
		$html .= '<span class="toc-title">' . esc_html($title) . '</span>';
	}

	//Determine the minimum heading level.
	$minLevel = PHP_INT_MAX;
	foreach($entries as $entry) {
		$minLevel = min($entry[0], $minLevel);
	}

	$currentLevel = $minLevel - 1;

	foreach($entries as $anchor => $entry) {
		list($level, $title) = $entry;

		if ( $currentLevel < $level ) {
			$html .= str_repeat('<ul><li>', $level - $currentLevel);
		} else {
			//Close any deeper lists, then close the previous item before starting a sibling.
			$html .= str_repeat('</li></ul>', $currentLevel - $level) . '</li><li>';
		}
		$currentLevel = $level;

		$html .= sprintf('<a href="#%s">%s</a>', esc_attr($anchor), $title);
	}

	//Close any open lists.
	if ( $currentLevel > $minLevel - 1 ) {
		$html .= str_repeat('</li></ul>', $currentLevel - $minLevel + 1);
	}

	$html .= '</div>';
	return $html;
}

The resulting HTML will look something like this (re-formatted for readability):

<div class="toc other classes">
	<span class="toc-title">Contents</span>
	<ul>
		<li><a href="#heading-1">Heading 1</a></li>
		<li><a href="#heading-2">Heading 2</a>
			<ul>
				<li><a href="#sub-heading-1">Sub-heading 1</a></li>
				<li><a href="#sub-heading-2">Sub-heading 2</a></li>
			</ul>
		</li>
		<li><a href="#heading-3">Heading 3</a></li>
	</ul>
</div>

The first half of the buildToc() method is fairly simple – check if there are any entries to display, make a container element, set its CSS classes and add a title (if any).

The trickiest part of generating a hierarchical table of contents is making sure all the different heading levels and the resulting nested lists are handled correctly. There are several possible complications and edge cases we need to consider here:

  • The first heading we find probably won’t be an H1 – that’s usually reserved for the post title or the site name.
  • The first heading will normally be one of the highest-level headings used in the post (e.g. an H2), but that’s not guaranteed.
  • Some heading levels might be skipped, e.g. H2 followed by H4.

The plugin solves these problems by figuring out the minimum (= highest importance) heading level ahead of time and keeping track of the current level. Whenever it encounters a lower level heading, it creates the required number of nested lists to “descend” to that level. On the other hand, when the current entry is a higher level heading, the plugin closes the required number of lists to “ascend” to that level. In each case, the output string ($html) is left in a state where it’s ready for the next TOC entry to be added.
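
To make the descend/ascend logic easier to trace, here is a stripped-down, standalone sketch of the same loop. It takes array(level, title) pairs and outputs bare titles instead of links. demoRenderNestedList() is a hypothetical name used only for this illustration. Note that this variant writes an explicit </li> before each sibling item so that the output is easy to check by eye.

```php
<?php
// Standalone sketch of the nesting logic (hypothetical helper, not the
// plugin's buildToc()). $entries is a list of array(level, title) pairs.
function demoRenderNestedList($entries) {
	if (empty($entries)) {
		return '';
	}

	// The smallest level number is the outermost list.
	$minLevel = PHP_INT_MAX;
	foreach ($entries as $entry) {
		$minLevel = min($entry[0], $minLevel);
	}

	$html = '';
	$currentLevel = $minLevel - 1;
	foreach ($entries as $entry) {
		list($level, $title) = $entry;
		if ($currentLevel < $level) {
			// Descend: open one list per level, even for skipped levels.
			$html .= str_repeat('<ul><li>', $level - $currentLevel);
		} else {
			// Ascend (or stay): close deeper lists, then start a sibling item.
			$html .= str_repeat('</li></ul>', $currentLevel - $level) . '</li><li>';
		}
		$currentLevel = $level;
		$html .= $title;
	}

	// Close everything that is still open.
	return $html . str_repeat('</li></ul>', $currentLevel - $minLevel + 1);
}

// H2, H3, H2: one nested list under the first item.
echo demoRenderNestedList(array(array(2, 'A'), array(3, 'B'), array(2, 'C'))), "\n";
// <ul><li>A<ul><li>B</li></ul></li><li>C</li></ul>
```

Try feeding it a skipped level, e.g. H2 followed by H4, to see the extra wrapper list that gets created for the missing H3.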

Here’s a work-in-progress screenshot:

An example TOC (WIP)

Styling

Hey, aside from the rather plain title, that TOC actually looks pretty good! Maybe we can skip adding any CSS and just ship the plugin as-is?

Alas. The fine looks are mostly due to the theme I use on my test site – Twenty Eleven. Here’s what the same table of contents looks like with a different theme:

Example TOC (different theme)

We’ll need to do something about those margins. And while we’re at it, let’s bold the title and tweak the alignment slightly. Create a file named style.css in the plugin’s directory and paste this CSS into it:

.toc ul,
.toc li,
.toc ul ul
{
	margin: 0;
	padding: 0;
	list-style-position: outside;
}

.toc > ul {
	margin: 0.2em 0 1em 1.2em;
}

.toc ul ul {
	margin-left: 2.5em;
}

.toc .toc-title {
	font-weight: bold;
}

Now we need to add the stylesheet to the pages that contain the shortcode. To do this, we’ll use the wp_register_style() function to register it during the “wp_enqueue_scripts” action, and the wp_enqueue_style() function to enqueue it from the doTocShortcode() method.

First, let’s add this line to the constructor:

add_action('wp_enqueue_scripts', array($this, 'registerStyle'));

And add this method before the end of the class definition:

public function registerStyle() {
	wp_register_style('easy-toc-style', plugins_url('style.css', __FILE__), array(), '20120617');
}

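Finally, insert these two lines before the final “return” statement in doTocShortcode() (the one that returns the placeholder), so that the stylesheet is only enqueued when a TOC will actually be shown: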

//Add our stylesheet to pages that contain a TOC.
wp_enqueue_style('easy-toc-style');

Here’s what our table of contents looks like now:

Finished TOC (“Twenty Eleven” theme)

Finished TOC (different theme)

If you haven’t already done so, you can download the completed plugin in the Resources section.

Possible improvements

  • Add a link to the comments to the TOC.
    This one is fairly easy – the comment area usually has the ID “comments”, so you could just add a TOC entry with an anchor “#comments”.
  • Add “Back to Top” links.
    Hint: Instead of trying to figure out where each section of the post ends, you can add them before or after each heading, plus one at the end of the post.
  • Add the TOC automatically if no shortcode is present.
    You’ll probably need a way for the user to specify settings for auto-generated TOCs. See Settings API.
  • Handle multi-page posts.
    You can detect if the current post is split into multiple pages by checking if the global variable $multipage equals 1. The number of pages is stored in the $numpages global, and each page’s contents – in the $pages array. You can use the _wp_link_page() core function as a template for generating URLs to different pages.
  • Add a visual editor button.
    See the TinyMCE Custom Buttons page in the WordPress Codex.
  • Add a Show/hide button, like on Wikipedia.
    A little bit of jQuery makes this easy. Remember to be a good citizen and only add your script to pages that actually contain your shortcode.
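
As a starting point for the “Back to Top” idea, here is a rough sketch that follows the hint above: it puts a link before every heading and one more at the end of the content. Everything in it is an assumption for illustration. demoAddBackToTopLinks() is a made-up helper, and the "toc-top" target only works if some element on the page (for example, the TOC container) actually has that ID.

```php
<?php
// Hypothetical helper, not part of the plugin: add a "Back to Top" link
// before every heading and one more at the end of the content.
function demoAddBackToTopLinks($content, $targetId) {
	// Keep the ID to simple characters so it is safe inside the attribute
	// and inside the preg_replace() replacement string.
	$targetId = preg_replace('/[^A-Za-z0-9_-]/', '', $targetId);
	$link = '<p class="back-to-top"><a href="#' . $targetId . '">Back to Top</a></p>';

	// $0 is the matched opening heading tag; the link goes right before it.
	$content = preg_replace('@<h[1-6][^>]*>@i', $link . '$0', $content);

	return $content . $link;
}

echo demoAddBackToTopLinks('<h2>One</h2><p>...</p><h2>Two</h2><p>...</p>', 'toc-top'), "\n";
```

In the plugin itself this would most likely run inside insertToc(), and you would probably want to skip the link before the very first heading.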

That’s it! I hope you found this tutorial useful. If you have any questions, please leave a comment below.
