kmcm

Web Development with PHP, HTML, CSS, & JavaScript

Stripping HTML tags from content with PHP

Friday 23rd October 2009

It is sometimes useful to remove the HTML tags from a portion of text, for example when you want to display a snippet of text in a paragraph, or add it to an XML feed without breaking it. There are a couple of ways to do this. The most obvious, blanket solution is PHP's strip_tags() function. It takes a string and returns that string with all HTML tags removed.

strip_tags($text);
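As a rough illustration of the blanket approach, the sketch below runs strip_tags() over a made-up string (the content of $text is an assumption, not taken from a real site). Note that the function only removes tags; it does not insert any spacing where a block element ended.

```php
<?php
// Hypothetical content, as it might come out of a CMS.
$text = '<h2>Latest news</h2><p>The <a href="/shop">shop</a> is <em>open</em>.</p>';

// Remove every tag, leaving only the text between them.
$plain = strip_tags($text);

echo $plain, "\n";
// Latest newsThe shop is open.
```

The heading and the paragraph run together in the output, which is worth keeping in mind when the result is shown to readers.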

There's also an optional second parameter to this function, which you can use to specify HTML tags that you want to keep in the string. This is a string of the tags, such as '<p><a><h1><ul><li>' (as of PHP 7.4 it can also be an array of tag names). For example, to remove all HTML elements from a string except for paragraph and anchor/link tags, you would use:

strip_tags($text, '<p><a>');
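Here is a small sketch of what the allowed-tags parameter does to a sample string (the markup in $text is invented for illustration). Tags that are not listed are removed but their text is kept, and tags that are listed keep their attributes untouched.

```php
<?php
// Hypothetical content containing block, inline, and link tags.
$text = '<div><p class="intro">Read <a href="/about">about us</a> in <b>bold</b>.</p></div>';

// Keep only <p> and <a>; <div> and <b> are stripped, but their text stays.
$filtered = strip_tags($text, '<p><a>');

echo $filtered, "\n";
// <p class="intro">Read <a href="/about">about us</a> in bold.</p>
```

Because attributes on the kept tags pass through unchanged, this is better suited to tidying trusted content than to cleaning up untrusted input.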

This works well when you know exactly which HTML tags will be in the string, since the allowed tags parameter lets you target specific tags for removal by exclusion. But if you are unsure which tags will appear, for example because the string is constantly being edited in a user-controlled content management system, you may need to explicitly target certain tags and leave the others alone. Below is a function that takes two parameters, a tag name and a string, removes every instance of that tag from the string, and returns the result:

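function stripTags($tag, $string) {
    // preg_replace() is used because ereg_replace() was removed in PHP 7.0.
    // The pattern matches the opening tag (with any attributes) and the closing tag;
    // the lookahead stops "p" from also matching <pre> or <param>.
    $regExp = '/<\/?' . preg_quote($tag, '/') . '(?=[\s\/>])[^>]*>/i';
    return preg_replace($regExp, '', $string);
}

// The tag name must be passed as a quoted string.
$newstring = stripTags('p', $originalstring);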
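To make the effect of targeting a single tag concrete, here is a self-contained sketch with sample input and output. It relies on preg_replace(), since the ereg functions are no longer available from PHP 7.0 onwards; the helper name stripSingleTag() and the sample string are assumptions made up for this example.

```php
<?php
// Hypothetical helper: removes opening and closing instances of one tag.
function stripSingleTag($tag, $string) {
    // (?=[\s\/>]) requires whitespace, "/" or ">" right after the tag name,
    // so asking for "p" leaves <pre> alone. The /i flag ignores case.
    $pattern = '/<\/?' . preg_quote($tag, '/') . '(?=[\s\/>])[^>]*>/i';
    return preg_replace($pattern, '', $string);
}

$original = '<p class="lead">Intro</p><pre>code</pre><P>More <span>text</span></P>';

echo stripSingleTag('p', $original), "\n";
// Intro<pre>code</pre>More <span>text</span>
```

Only the paragraph tags disappear; the pre and span elements are left exactly as they were, which is the opposite of how the allowed-tags list in strip_tags() works.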

There may also be times when simply removing the tag is not sufficient, and it needs to be replaced with something. The following function is like stripTags(), but instead of removing the tag entirely, it removes the opening tag and replaces the closing tag with a line break.

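function replaceTags($tag, $string) {
    $name = preg_quote($tag, '/');
    // Replace each closing tag with a line break.
    $string = preg_replace('/<\/' . $name . '\s*>/i', '<br />', $string);
    // Remove the opening tag and its attributes (ereg_replace() was removed in PHP 7.0).
    $string = preg_replace('/<' . $name . '(?=[\s\/>])[^>]*>/i', '', $string);
    return $string;
}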
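The difference between removing and replacing is easiest to see side by side. The sketch below is a minimal, self-contained version of the same idea; the helper name closingTagToBreak() and the sample paragraphs are assumptions for illustration only.

```php
<?php
// Hypothetical helper: closing tag becomes <br />, opening tag is removed.
function closingTagToBreak($tag, $string) {
    $name = preg_quote($tag, '/');
    // Swap each closing tag for a line break first.
    $string = preg_replace('/<\/' . $name . '\s*>/i', '<br />', $string);
    // Then drop the opening tag together with any attributes.
    return preg_replace('/<' . $name . '(?=[\s\/>])[^>]*>/i', '', $string);
}

$original = '<p>First paragraph.</p><p class="note">Second paragraph.</p>';

echo strip_tags($original), "\n";
// First paragraph.Second paragraph.

echo closingTagToBreak('p', $original), "\n";
// First paragraph.<br />Second paragraph.<br />
```

With plain stripping the two paragraphs collapse into one run of text, whereas the replacement keeps a visible break where each paragraph ended.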
