Removing HTML Tags From Your Shopify Product CSV Export With EZ Exporter

When you export your Shopify product data to CSV, the product description is exported with its HTML code intact, in a field called "Body (HTML)."

If you plan to use this product CSV on other platforms, for example as a Facebook Product Feed or a Google Shopping Feed, you'll normally want the HTML tags removed, since those platforms either won't render them or won't allow them at all. The product description should be in plain text.

EZ Exporter can automatically strip these out for you on export using our Calculated Fields feature. We have a function called strip_html_tags() that will take care of this during the export.

For example, the body_html field coming from Shopify might look something like this:

<h2><span style="text-decoration: underline;"><strong>Best seller!</strong></span></h2>
<p>Qui saepe facere provident commodi doloremque maxime corporis. At delectus earum error et ut voluptatem.</p>
<p>Occaecati eum officia et ut voluptas quaerat beatae harum. Temporibus saepe veniam sit sapiente quo. Perspiciatis omnis beatae soluta consequatur nihil. Laborum sequi reprehenderit et.</p>

In EZ Exporter, you can use a formula like this to remove the HTML tags:

strip_html_tags({{ body_html }})

And the output will be like this instead:

Best seller!
Qui saepe facere provident commodi doloremque maxime corporis. At delectus earum error et ut voluptatem.
Occaecati eum officia et ut voluptas quaerat beatae harum. Temporibus saepe veniam sit sapiente quo. Perspiciatis omnis beatae soluta consequatur nihil. Laborum sequi reprehenderit et.
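
To make the idea concrete, here is a minimal Python sketch of what stripping tags from a description involves. It is not EZ Exporter's implementation: the names TagStripper and strip_tags_sketch are hypothetical, and the sketch simply uses the standard library's HTML parser to keep the text nodes and discard the markup.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are never stored.
        self.parts.append(data)


def strip_tags_sketch(html):
    """Return the text of `html` with all tags dropped (illustrative only)."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


sample = "<h2><strong>Best seller!</strong></h2>\n<p>Plain text here.</p>"
print(strip_tags_sketch(sample))
```

Note that an approach like this keeps whatever text sits between an opening and closing tag, whichever tag that is.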

Recently, we had a customer who needed something more advanced, because the product descriptions in their store contained CSS rules inside an HTML style tag. For example:

<style><!--
* {
box-sizing: border-box;
}

body {
}

/* Create two equal columns that floats next to each other */
.column {
float: left;
padding: 17px
}

/* Clear floats after the columns */
.row:after {
content: "";
display: table;
clear: both;
}

.left {
width: 65%;
}

.right {
width: 35%;
padding-right: 5px;
}
--></style>
<div class="row">
<div class="column left">
<div><span face="Arial" size="3" style="font-family: Arial; font-size: medium;"></span></div>
This thing rocks!
<div></div>
<ul style="list-style-type: bullet;">
<li><span face="Arial" size="3" style="font-family: Arial; font-size: medium;">Availability: Available Now</span></li>
<li><span face="Arial" size="3" style="font-family: Arial; font-size: medium;">Size: 5.25 X 3.75 in.</span></li>
</ul>
</div>
<div class="column right" style="background-color: #f3f3f3;"><span size="2" style="font-size: small;"><b>SKU: </b>12345 <br /> <b> Category: </b>Apparel<br /><br /> <b>More Details</b> <br /> Brand: SUPER DUPER<br /> Color: Multi-Color<br /> Material: Steel<br /><br /> </span></div>
</div>

The strip_html_tags() function won't remove the CSS values as it's meant to just remove HTML tags. So we ended up writing a new function specifically for removing the contents of a specific HTML tag.

We called this function strip_html_tag_contents(), and it is used like this:

strip_html_tag_contents({{ body_html }}, "style")

The formula above will remove the content of the style tag. We can then combine this formula with strip_html_tags() to remove both the CSS values and the tags themselves like this:

strip_html_tags(strip_html_tag_contents({{ body_html }}, "style"))

To clean up further, we can append .strip() to the formula to also remove any leading and trailing spaces and newline characters.

strip_html_tags(strip_html_tag_contents({{ body_html }}, "style")).strip()

This will output:

This thing rocks!


Availability: Available Now
Size: 5.25 X 3.75 in.


SKU: 12345   Category: Apparel More Details  Brand: SUPER DUPER Color: Multi-Color Material: Steel

As you can see, it's much cleaner and more readable. :)
