Expression Engine 2 Plugin: SuperGeekery Tag Stripper, version 1.0
Note: Since its initial release, the SuperGeekery Tag Stripper has had a few minor updates. It is now at version 1.0.2. The links in the article download the most recent version. Eventually, I will write a proper standalone page for this add-on, including a real change log.
The redesign of SuperGeekery coincided with the release of Expression Engine 2, which gave me the opportunity to jump into the latest version of EE on a project where I was the client, so I could potentially break things without affecting a client. It also gave me the opportunity to try my hand at creating my first EE plugin. It's called SuperGeekery Tag Stripper, and you're welcome to download the plugin and use it in your own projects, personal or commercial. And the price is right. It's free, as in beer.
The license is the GNU General Public License.
What does the SuperGeekery Tag Stripper plugin do?
The plugin is actually a pretty simple thing. It takes a block of HTML and strips out tags, based on what you want to keep or remove. I built it because I needed it for the site's redesign. You may notice that the front page of this site shows an excerpt of each full article without paragraph breaks, images, or anything else except for <a> links. When you click through to the full article, there are paragraph breaks, images, and other HTML tags that would have messed up the layout I wanted on the home page. I also wanted to keep the authoring and entry of each post as simple as possible. I didn't want to manually create a 'summary' portion of the article without the unwanted tags. I wanted to write each entry once and have it formatted by EE as intended. Call it laziness or efficiency as you choose, but it led me to build the plugin.
Installation of the plugin.
Thanks to the redesigned architecture of Expression Engine 2, installing add-ons to EE has become incredibly simple. You install this one like you install most add-ons: after you download the plugin, unzip it and put the tagstripper folder in the system/expressionengine/third_party folder within your site's directory. After you do that, check the Plugins page of your control panel to be sure it is installed correctly. It should be listed as SuperGeekery Tag Stripper.
How to use the plugin.
The plugin basically wraps some regular expressions in an easy-to-use EE tag set that lets you specify what tags to keep or kill. It has 3 options.
1. exp:tagstripper:stripAllTags - Removes all HTML tags. Ignores all arguments passed in.
BEFORE EXAMPLE (wrapped in the appropriate EE tag):
{exp:tagstripper:stripAllTags}
<h1>Example of exp:tagstripper:stripAllTags</h1>
<h2>This is an h2 tag.</h2> <a href="http://www.flickr.com/photos/morton/3969410575/" title="My Monitors Rock by John Morton, on Flickr">A photo of my <strong>computer</strong>.</a> <img src="http://farm3.static.flickr.com/2609/3969410575_0987891ac7_t.jpg" width="100" height="75" alt="My Monitors Rock"/>
{/exp:tagstripper:stripAllTags}
AFTER EXAMPLE:
Example of exp:tagstripper:stripAllTags
This is an h2 tag. A photo of my computer.
2. exp:tagstripper:tagsToSave tags='h1|span|img' - Removes all HTML tags except those passed in through a 'tags' parameter. Multiple tags are passed in separated by the pipe | character, sometimes referred to as the OR operator. The 'tags' parameter is optional; if you leave it out, this tag in essence operates like exp:tagstripper:stripAllTags. The 'tags' parameter can also take a regexp range. For example, passing in 'h[1-3]' would match h1, h2, and h3, but not h4, h5, etc.
BEFORE EXAMPLE (wrapped in the appropriate EE tag):
{exp:tagstripper:tagsToSave tags="h1"}
<h1>Example of exp:tagstripper:tagsToSave tags="h1"</h1>
<a href="http://www.flickr.com/photos/morton/3969410575/" title="My Monitors Rock by John Morton, on Flickr">A photo of my <strong>computer</strong>.</a> <img src="http://farm3.static.flickr.com/2609/3969410575_0987891ac7_t.jpg" width="100" height="75" alt="My Monitors Rock" />
{/exp:tagstripper:tagsToSave}
AFTER EXAMPLE:
<h1>Example of exp:tagstripper:tagsToSave tags="h1"</h1> A photo of my computer.
3. exp:tagstripper:tagsToStrip tags='img|a' - Removes specified HTML tags passed in through a 'tags' parameter. Multiple tags are passed in separated by the pipe | character, sometimes referred to as the OR operator. The 'tags' parameter is optional, but if you're not going to strip out any tags, you probably should just not use this plugin. :)
BEFORE EXAMPLE (wrapped in the appropriate EE tag):
{exp:tagstripper:tagsToStrip tags='img|a'}
<h1>Example of exp:tagstripper:tagsToStrip tags='img|a'</h1>
<a href="http://www.flickr.com/photos/morton/3969410575/" title="My Monitors Rock by John Morton, on Flickr">A photo of my <strong>computer</strong>.</a> <img src="http://farm3.static.flickr.com/2609/3969410575_0987891ac7_t.jpg" width="100" height="75" alt="My Monitors Rock" />
{/exp:tagstripper:tagsToStrip}
AFTER EXAMPLE:
<h1>Example of exp:tagstripper:tagsToStrip tags='img|a'</h1> A photo of my <strong>computer</strong>.
As of version 1.0.2, you have the option of escaping special HTML characters. There is a post about this update here. Basically, there is now an added option, 'escapeHTMLchars', that you can set to 'true', which will convert special HTML characters to their HTML entity equivalents.
Look like something you can use? Download the SuperGeekery Tag Stripper plugin for Expression Engine 2 here. Let me know if you have any feedback using the comment area below. Thanks.
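For the curious, here is a rough Python sketch of how the three modes described above could be built on regular expressions. To be clear, this is not the plugin's actual code (EE plugins are written in PHP), and the function names, the patterns, and the use of Python's html.escape to stand in for the 'escapeHTMLchars' option are all my own assumptions for illustration. It only tries to reproduce the behavior shown in the before/after examples.

```python
import html
import re

# Assumption: a "tag" is anything from "<" (optionally followed by "/")
# and a letter up to the next ">". Attributes containing ">" would fool it.
ANY_TAG = re.compile(r"</?[a-zA-Z][^>]*>")


def _named_tag(tags):
    """Build a pattern for a pipe-separated tag list such as 'img|a' or 'h[1-3]'.

    The \\b stops 'a' from also matching longer names like 'abbr'.
    """
    return re.compile(r"</?(?:%s)\b[^>]*>" % tags, re.IGNORECASE)


def strip_all_tags(text):
    """Like stripAllTags: remove every tag, keep the text between them."""
    return ANY_TAG.sub("", text)


def tags_to_save(text, tags=""):
    """Like tagsToSave: remove every tag except those named in `tags`."""
    if not tags:
        # No tags given, so nothing is saved.
        return strip_all_tags(text)
    keep = _named_tag(tags)
    # Visit every tag and keep it only if it is one of the named tags.
    return ANY_TAG.sub(
        lambda m: m.group(0) if keep.fullmatch(m.group(0)) else "", text
    )


def tags_to_strip(text, tags=""):
    """Like tagsToStrip: remove only the tags named in `tags`."""
    if not tags:
        return text
    return _named_tag(tags).sub("", text)


def escape_html_chars(text):
    """Hypothetical stand-in for escapeHTMLchars: &, <, > and quotes become entities."""
    return html.escape(text)


if __name__ == "__main__":
    sample = (
        '<h1>Title</h1> <a href="http://example.com/">A photo of my '
        '<strong>computer</strong>.</a> <img src="x.jpg" alt="pic" />'
    )
    print(strip_all_tags(sample))
    print(tags_to_save(sample, "h1"))
    print(tags_to_strip(sample, "img|a"))
    print(escape_html_chars('this product measures 24" long'))
```

Note that a regex approach like this only removes the tags themselves. Anything between an opening and closing tag, including the contents of a script element, is left in place.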
Comments on this post.
Thanks a million for this plug in. I really needed it.
There are several mentions of Tweetify in the code.
I’m guessing you cut and pasted some stuff, so you might want to correct it to avoid confusion.
I’ve been looking for an EE2 version of the 1.6 plugin, Cleaner by Silenz, before I can upgrade some sites, so hopefully I’ve found it.
Yes, Paul. I used Tweetify as my starter template. Let me know if the plug in works for you. I’ll clean it up and repost soon.
Thanks for the tip, Paul. I’ve fixed that and reposted.
Working perfectly.
I’m using it to take the Summary text and then strip all tags and insert into the meta description content= in the header.
I’m not sure if it is still needed with Google and its clever algorithms, but it might help.
I’m now thinking of other uses due to the flexible stripping options.
Cool, Paul. That’s a great idea!
Thanks for making this - I also used this for “purifying” the title and meta description content.
Woohooo, thanks for a super great plugin!
interesting, bookmarked and link from my blog.
Thanks man!!! this is exactly what I needed. Really helpful.
Let’s say the HTML I want to strip has a double or single quote character in the text (not in an HTML tag, but just in the plain text). Like this:
this product measures 24” long
If I take that line of text and put it into a meta description tag, the quote mark messes everything up. Is there a way to get your add-on to strip out quote marks too?
Is it possible to remove ‘ ’ with this module? Thanks
Oops, in my last comment I meant to have this ‘& n b s p ;’.
Hi James,
It looks like there is a plug in by Lowe that is ideal for what you’re looking to do:
http://devot-ee.com/add-ons/find-and-replace
Check it out and let me know if that works for you.
Would it be possible to strip out script tags and their content too?
I’m having problems when there is an email address that is encoded within the stripped text.
Paul,
So are you trying to get something like:
<a href="mailto:someone@example.com?Subject=Hello again">
Send Mail</a>
to be removed entirely?
-John
No, it’s the javascript that is created when EE automatically encodes an email address to hide it from spammers.
Thanks - I really needed this plug-in. I’m in a similar position where I’m taking the plunge for my client’s redesign and I could have played it safe with EE1.6 but as the site is a few months off I’ve taken a gamble on EE2. Just hoping all my favourite plug-ins and extensions will be redeveloped by the time the site goes live.