What kinds of things do you, as a web developer, need to keep in mind when baking HTML into code? Here’s one…
I work with several web sites that use AddSearch for their on-site search engine. AddSearch, based in Helsinki, Finland, provides a highly customizable search interface that you can add to almost any web site.
Recently, one of our search indexes dropped a noticeable number of PDF documents. Artem, an AddSearch support rep, snapped into action and found the culprit. It was caused by the canonical tag on a page that lists PDF documents by category, where each category listing is displayed as a separate page. In other words, each category’s page is shown using a parameter in the URL like this:
However, the page’s canonical tag is the same for each page load, regardless of the category:
<link rel="canonical" href="https://website.com/animals.php" />
Artem explained that AddSearch indexes only the first page it encounters with a given canonical tag and ignores any later pages that carry a duplicate of that tag. In our case, that meant only one category was being indexed. He fixed the problem by turning off canonical tag checking, and on the next crawl, each individual category URL sprang to life in the index, along with the PDF documents for each category. That saved us from having to make code changes to the site, changes that were not in scope for a legacy project. The canonical tag setting is still under wraps in the AddSearch interface, but Artem did say they might make it available in a future version.
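To make that behavior concrete, here is a rough Python sketch of a toy crawler that keeps only the first page it sees for each canonical URL. This is not AddSearch’s actual code, just an illustration of the deduplication Artem described. The `category` parameter name and the category values are assumptions made up for the example, and `per_category_canonical` shows one possible way a site could emit a distinct canonical per category if code changes were on the table.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

BASE = "https://website.com/animals.php"
# Hypothetical category URLs; the parameter name "category" is an assumption.
CATEGORY_URLS = [f"{BASE}?category={name}" for name in ("cats", "dogs", "birds")]


def shared_canonical(url: str) -> str:
    """Mimics the site as described: every category page declares the same canonical."""
    return BASE


def per_category_canonical(url: str, keep=("category",)) -> str:
    """One possible fix: keep the query parameters that change the page content."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k in keep]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


def crawl(urls, canonical_of):
    """Toy index: keep only the first page seen for each canonical URL."""
    index = {}
    for url in urls:
        index.setdefault(canonical_of(url), url)
    return list(index.values())


indexed_shared = crawl(CATEGORY_URLS, shared_canonical)
indexed_per_category = crawl(CATEGORY_URLS, per_category_canonical)
print(len(indexed_shared), len(indexed_per_category))  # prints "1 3"
```

With the shared canonical, only the first category page survives. With a canonical that keeps the category parameter, all three pages are indexed, and unrelated parameters such as tracking codes are still stripped.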
I’ve studied canonical tags before, but this particular problem never occurred to me. I’d assumed the tags only affect SEO and didn’t realize their impact on search functionality.
So where do the worlds of SEO and search functionality intersect? And how do they differ? I wonder how profitable that space is as its own niche.
After a little more research, I’m reminded that this is just the tip of the canonical iceberg. Here’s a comprehensive guide from Yoast…