cliffordp
7/6/2017 - 2:16 PM

Use DOMDocument to do a more robust job at force_balance_tags.

<?php

/**
 * Use DOMDocument to balance tags more robustly than force_balance_tags() does.
 * 
 * "force_balance_tags() is not a really safe function. It doesn’t use an HTML parser 
 * but a bunch of potentially expensive regular expressions. You should use it only if 
 * you control the length of the excerpt too. Otherwise you could run into memory issues 
 * or some obscure bugs." <http://wordpress.stackexchange.com/a/89169/8521>
 *
 * For more reasons not to use regular expressions on markup, see http://stackoverflow.com/a/1732454/93579
 * 
 * @link http://wordpress.stackexchange.com/questions/89121/why-doesnt-default-wordpress-page-view-use-force-balance-tags
 * @see force_balance_tags()
 *
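 * @param string $markup HTML fragment that may contain unbalanced tags.
 * @return string The fragment as re-serialized by the HTML parser, with tags balanced.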
 */
function force_balanced_tags2( $markup ) {
	$dom = new DOMDocument();
	// Note the meta charset is used to prevent UTF-8 data from being interpreted as Latin1, thus corrupting it
	$html = '<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body>';
	$html .= $markup;
	$html .= '</body></html>';
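	// Collect libxml parse errors internally so malformed markup does not emit PHP warnings.
	$previous = libxml_use_internal_errors( true );
	$dom->loadHTML( $html );
	libxml_clear_errors();
	libxml_use_internal_errors( $previous );
	$body = $dom->getElementsByTagName( 'body' )->item( 0 );
	// Serialize only the children of <body>, so the wrapper tags never appear in the result.
	$markup = '';
	foreach ( $body->childNodes as $child ) {
		$markup .= $dom->saveHTML( $child );
	}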
	return $markup;
}
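
/*
 * Usage sketch (an illustration, not part of the original snippet).
 *
 * Assumptions: force_balanced_tags2() above is already loaded, and the
 * sample fragment and the $fragment / $balanced variable names are
 * hypothetical. The parser is expected to close the open <b> and <p>
 * elements; exact whitespace in the result may vary between libxml versions,
 * so no literal output is shown here.
 */
$fragment = '<p>Unclosed <b>bold text';
$balanced = force_balanced_tags2( $fragment );
echo $balanced;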