What kinds of things do you, as a web developer, need to keep in mind when baking HTML into code? Here’s one that recently caught me by surprise…
I work with several web sites that use AddSearch for their on-site search engine. AddSearch, based in Helsinki, Finland, provides a highly customizable search interface that you can add to almost any web site.
Recently, the search index for one of our sites dropped a noticeable number of PDF documents. Artem, an AddSearch support rep, snapped into action and found that the cause was the canonical tag on a dynamic PHP page that lists PDF documents by category. Each category’s page is shown using a parameter in the URL, similar to this:
However, the page’s canonical tag stays the same for each page load, regardless of the category:
<link rel="canonical" href="https://website.com/animals.php" />
Artem explained that AddSearch indexes only the first page it finds with a given canonical tag and ignores any later pages that repeat it. This means that, in our case, only one category was being indexed. He fixed the problem by turning off canonical tag checking for our site. On the next crawl, each individual category’s URL sprang to life in the index, along with the PDF documents for each category. That saved us from having to write and test code changes on a legacy site. The canonical tag setting is still under wraps in the AddSearch interface, but Artem did say they might make it available in a future version.
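To make the effect concrete, here is a minimal Python sketch of the behavior as I understood it from Artem. It is a toy model, not AddSearch's actual crawler. The `category` parameter name and the category values are made up for illustration; only the base URL comes from the canonical tag above.

```python
from urllib.parse import urlencode

# Base URL taken from the canonical tag shown in the post.
BASE_URL = "https://website.com/animals.php"


def page_url(category):
    # "category" is a hypothetical parameter name; the real one may differ.
    return BASE_URL + "?" + urlencode({"category": category})


def canonical_for(category, per_category):
    # per_category=False mimics our legacy page: every category declares
    # the same canonical URL. per_category=True is the code fix we avoided.
    return page_url(category) if per_category else BASE_URL


def index_pages(categories, per_category):
    """Toy crawler: keep only the first page seen for each canonical URL."""
    index = {}
    for category in categories:
        canonical = canonical_for(category, per_category)
        # setdefault stores the first page and ignores later duplicates.
        index.setdefault(canonical, page_url(category))
    return sorted(index.values())


categories = ["cats", "dogs", "birds"]  # made-up category names
shared = index_pages(categories, per_category=False)
distinct = index_pages(categories, per_category=True)
print(len(shared), len(distinct))
```

With one shared canonical URL, only the first category page survives in the toy index. Giving each category its own canonical URL keeps all of them, which is roughly the code change we would have needed if the check could not be switched off.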
I’ve studied canonical tags in other contexts before, but this particular problem never occurred to me. I’d assumed the tags only affect SEO and didn’t realize their impact on search functionality.
So where do the worlds of SEO and search functionality intersect? And how do they differ? I wonder how profitable that space is as its own niche.
After a little more research, I’m reminded that this is just the tip of the canonical iceberg. Here’s a comprehensive guide from Yoast…