Monday, March 16, 2015

How to remove HTML tags and display plain text using XSLT


In this post I will show how you can use XSLT to strip HTML tags from HTML data sources (RSS feeds, SharePoint list items, database fields, etc.) and display plain text.

Below is the named template (used here like a function) that removes HTML tags:

  <xsl:template name="removeHtmlTags">
    <xsl:param name="html"/>
    <xsl:choose>
      <xsl:when test="contains($html, '&lt;')">
        <xsl:value-of select="substring-before($html, '&lt;')"/>
        <!-- Recurse through HTML -->
        <xsl:call-template name="removeHtmlTags">
          <xsl:with-param name="html" select="substring-after(substring-after($html, '&lt;'), '&gt;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$html"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

As you can see, the template is named removeHtmlTags and accepts one parameter, html, which in my case is the Description field containing HTML tags.
The logic is simple. It is a recursive template that looks for '&lt;' (that is, '<'), which marks the start of an HTML tag. It outputs the text before that '<' using substring-before($html, '&lt;'), then calls itself again with the rest of the string after '&gt;' (that is, '>'), which marks the end of the tag. When no '<' is left, it outputs the remaining text as is.
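
To see the recursion in action, here is a minimal standalone sketch that can be run against any XML input. The sample string and the match="/" wrapper are assumptions made up for illustration, not part of my original stylesheet. The copy of the template here takes substring-after on the text that follows the matched '<', so a stray '>' earlier in the text is not mistaken for the end of a tag.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample: the string <p>Hello <b>world</b>!</p> -->
  <xsl:variable name="sample"
                select="'&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;!&lt;/p&gt;'"/>

  <!-- Entry point: ignores the input document and strips the sample -->
  <xsl:template match="/">
    <xsl:call-template name="removeHtmlTags">
      <xsl:with-param name="html" select="$sample"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="removeHtmlTags">
    <xsl:param name="html"/>
    <xsl:choose>
      <xsl:when test="contains($html, '&lt;')">
        <!-- Emit the text in front of the next tag -->
        <xsl:value-of select="substring-before($html, '&lt;')"/>
        <!-- Recurse on whatever follows the '>' that closes that tag -->
        <xsl:call-template name="removeHtmlTags">
          <xsl:with-param name="html"
              select="substring-after(substring-after($html, '&lt;'), '&gt;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- No tag left: emit the remainder unchanged -->
        <xsl:value-of select="$html"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Each call emits one piece of text: first an empty string (nothing precedes the opening p tag), then "Hello ", then "world", then "!", so the combined output is "Hello world!".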

This is how this function will be called:

    <xsl:template name="RssCell">
        <xsl:variable name="pureText">
            <xsl:call-template name="removeHtmlTags">
                <xsl:with-param name="html" select="DescriptionField" />
            </xsl:call-template>
        </xsl:variable>

        <div height='40' class='blog_text'>
            <xsl:value-of disable-output-escaping="yes" select="substring($pureText, 1, 175)"/>
        </div>
    </xsl:template>

A variable named pureText is declared. The removeHtmlTags template strips out the HTML tags, and the resulting plain text is stored in pureText.
I am passing DescriptionField, which is my DB field containing HTML tags.

Finally, I display at most 175 characters of the plain text inside a DIV using substring($pureText, 1, 175). XPath string positions start at 1, so a start position of 0 would return only 174 characters.
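
If you also want to tidy up the truncated text, the sketch below shows one possible approach. The template name truncateText, its parameters, and the sample string are hypothetical names chosen for this illustration. It collapses the whitespace that stripped tags tend to leave behind with normalize-space(), cuts the text with substring() starting at position 1 (XPath string positions start at 1), and appends "..." only when something was actually cut off.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Entry point: hypothetical sample text with messy whitespace -->
  <xsl:template match="/">
    <xsl:call-template name="truncateText">
      <xsl:with-param name="text"
                      select="'  Hello    world, this is plain text  '"/>
      <xsl:with-param name="max" select="11"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: normalize whitespace, then cut to $max chars -->
  <xsl:template name="truncateText">
    <xsl:param name="text"/>
    <xsl:param name="max" select="175"/>
    <!-- Trim the ends and collapse inner runs of whitespace -->
    <xsl:variable name="clean" select="normalize-space($text)"/>
    <!-- Positions are 1-based, so start at 1 to get exactly $max chars -->
    <xsl:value-of select="substring($clean, 1, $max)"/>
    <!-- Mark the cut only if the text was longer than $max -->
    <xsl:if test="string-length($clean) &gt; $max">
      <xsl:text>...</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With the sample above, the output is "Hello world...". In a real stylesheet you would pass $pureText as the text parameter and leave max at its default of 175.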

That's it!

