Handling SEO for Discourse Communities
If you are concerned about how your community is performing in search engine results (SEO), there are a number of questions to keep in mind when looking to optimize your community software of choice. We frequently get asked questions in our support inboxes and in our public community about how Discourse handles SEO. In this article, we’ll aim to demystify the most common Discourse SEO questions, including metadata, sitemaps, subfolders versus subdomains, and migrations.
If you’re new to search engine optimization, be sure to check out our primer on community SEO before diving in here.
Ready? Let’s go!
Titles, Descriptions, and Other Metadata
One of the most common questions we get asked is how admins can customize title tags, descriptions, and other HTML metadata elements to affect how the page appears on a search results page. At the most basic level, Discourse auto-generates these elements based upon content on the page. The title tag is generated from the site or topic title, the description from the first post content, and so on.
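To make the auto-generation idea concrete, here is a minimal Python sketch of how a title tag and meta description could be derived from a topic title and first post. It is not Discourse's actual implementation: the function name, the "topic - site" title format, and the 160-character cutoff are all assumptions chosen for illustration.

```python
import html
import re


def build_head_metadata(site_title: str, topic_title: str, first_post_text: str,
                        max_description_chars: int = 160) -> str:
    """Return <title> and meta description tags derived from page content.

    Hypothetical helper: the title format and length limit are assumptions.
    """
    # Collapse whitespace so line breaks in the post do not leak into the tag.
    description = re.sub(r"\s+", " ", first_post_text).strip()
    if len(description) > max_description_chars:
        # Cut at the last word boundary that fits, then mark the truncation.
        description = description[:max_description_chars].rsplit(" ", 1)[0] + "..."
    title = f"{topic_title} - {site_title}"
    # Escape both values so post content cannot break out of the HTML tags.
    return (
        f"<title>{html.escape(title)}</title>\n"
        f'<meta name="description" content="{html.escape(description)}">'
    )
```

The point of the sketch is that both tags are functions of the content itself, which is why editing the topic title or first post is the way to change them.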
It’s not possible to set custom values on each individual page’s metadata aside from editing the appropriate settings or content fields themselves. For example, if you’d like to edit the title of a topic as it will appear on Google, edit the topic title. The next time Google crawls the site, it will pick up the new title. Editing topic titles to accurately reflect the contents of the topic is a common practice all community managers should be doing. Topics generally get more (and better) responses if the title accurately reflects the content of the topic.
Descriptions are a little trickier for two reasons. First, the description is generated automatically from the contents of a topic’s first post. To adjust the description means you’ll need to edit the contents of that post. If you’ve set expectations in your community’s guidelines or terms, this shouldn’t be a problem; however, if you get too aggressive with editing posts solely for SEO purposes, you may upset your community members by altering their voice too much or removing information they found important to include.
The second challenge with descriptions is that Google and other search engines often create their own descriptions to show on the results page. The search engine will often pick the most relevant piece of content from the page that correlates with the search term. This is because search engines are primarily concerned with your content’s relevance to the user’s query, and they’re smart enough now to know what’s on the page.
This brings us to the most important thing to keep in mind about SEO: relevance is king. While honing your title, description, and other metadata may enhance your click-through rate from the search engine results page (SERP), the most important thing you can do is ensure your community is generating quality content that is relevant to what people are searching for.
Sitemaps
A sitemap is an XML file containing a list of the paths to every page on your website. In the past, our research showed sitemaps were helpful only for larger Discourse sites. However, in recent months, we've discovered that sitemaps can dramatically help Google to properly index every public URL on a site.
While search engines can index your site by crawling links, a sitemap offers a few benefits, chiefly that it hands crawlers the full list of URLs instead of leaving them to discover each page by following links. We've seen link-based crawling miss pages during the indexing process. It’s often stated that having a sitemap is crucial to your SEO efforts because it allows search engines to crawl your site more effectively. However, we’ve found this advice to be only part of the picture.
The growth of public Discourse sites rests heavily upon visitors finding the site via search. To help with the indexing process, Discourse supports sitemaps out of the box without the addition of a plugin as of version 2.9.0.beta4.
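For readers who have not seen one, the following Python sketch builds a small sitemap in the standard sitemaps.org XML format. It only illustrates the file's shape; it is not how Discourse generates its sitemaps, and the function name and example URL are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries: list[tuple[str, str]]) -> str:
    """Build sitemap XML from (absolute_url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # full URL of the page
        ET.SubElement(url, "lastmod").text = lastmod  # date in YYYY-MM-DD form
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical topic URL, used only to show the output structure.
print(build_sitemap([("https://community.example.com/t/welcome/1", "2023-01-15")]))
```

Each url entry gives a crawler a page to fetch directly, so it does not have to find that page by following links.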
A brief note about what’s indexed on Discourse sites: there are a few cases where your Discourse site, or parts of it, will not be indexed. If your site is set to login-only, no content will be indexed. If your site is public, certain areas such as user pages will still be excluded to avoid indexing duplicate content.
Let’s Talk About Subfolders
There’s mixed advice regarding one hotly debated SEO topic: hosting a site as a subfolder vs. on a subdomain. The main contention between the two is ranking -- does a site rank higher as a subfolder under the main domain, or is there no noticeable difference?
Both Matt Cutts and John Mueller of Google have said in the past there’s essentially no difference between a subfolder and a subdomain in Google’s eyes.
But others, such as Rand Fishkin, founder and former CEO of Moz, have shared case studies of sites that migrated from subdomain to subfolder and saw a bump in traffic.
On the other hand, Ahrefs, another SEO tool, recently shared a few case studies showing the opposite effect.
GitHub changed their blog from a subfolder to a subdomain, and then again to a dedicated domain. The traffic graph for the blog shows a dip after each migration, which is expected and is caused by reindexing. However, once search engines caught back up, no one method proved better than the others at driving traffic to the site.
It’s also important to remember that domain is not the only factor affecting your SEO performance. We’ll state this point again -- relevant content that other people link to (backlinks) is the primary influence on your site’s rankings. If your site isn’t all that helpful to those that find it, you won’t rank well, period.
Also, the Ahrefs article linked above references multiple other factors that can affect ranking, such as:
- Temporary signal changes
- Tracking or measuring issues
- Blocked pages
- Internal linking changes
- Removed/updated content
To summarize, Google says there’s no SEO benefit to hosting in a subfolder, and studies with conflicting evidence are unreliable because there are many variables affecting the SEO performance of a given site.
How does this impact your Discourse site? First, we’ve recommended for years that Discourse sites use subdomains over subfolders, because search engines aim to treat the two equivalently.
Also, while hosting a Discourse site under a subfolder is possible, it adds quite a bit of technical complexity. A subfolder setup requires a special proxy and correct routing of traffic to the right places, and in our experience it often introduces more technical problems and longer downtime when issues occur.
The beauty of open source is you have the choice to do what’s best for your site! However, we highly recommend Discourse sites, whether hosting with us or self-hosting, use a subdomain for its superior supportability and stability in light of the indeterminate impacts on the site’s search ranking.
Migrations and Redirects
We frequently migrate communities onto Discourse from other platforms, and a common question that we are asked is whether their Google rankings will be affected because the URL structure has changed. In short, the answer is no, but there are a few technical details to keep in mind.
In our team’s migration process, we work with our customers to determine if URL mapping is needed for the import. Discourse has a built-in redirect function solely for this purpose. If your old community has a URL at example.com/community.php?tid=555, we’ll create a redirect during the import process so it properly maps to the new topic URL at example.com/t/-/1234. The next time your site is crawled, the crawler will follow the redirect to the correct URL and update it in the search engine’s database.
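The mapping described here can be sketched as a lookup from old topic IDs to new ones. The Python below is an illustration under assumptions, not Discourse's redirect feature: the `OLD_TO_NEW_TOPIC_ID` table and the `redirect_target` function are hypothetical, and a real import would record one entry per migrated topic.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Hypothetical table recorded during an import: old topic ID -> new topic ID.
OLD_TO_NEW_TOPIC_ID = {555: 1234}


def redirect_target(old_url: str) -> Optional[str]:
    """Return the new topic path for an old community.php URL, or None."""
    parts = urlsplit(old_url)
    if not parts.path.endswith("/community.php"):
        return None  # not one of the old topic URLs
    tid_values = parse_qs(parts.query).get("tid")
    if not tid_values or not tid_values[0].isdecimal():
        return None  # missing or malformed topic ID
    new_id = OLD_TO_NEW_TOPIC_ID.get(int(tid_values[0]))
    # Unknown IDs return None so the caller can fall back to a 404.
    return f"/t/-/{new_id}" if new_id is not None else None
```

With the example from the text, `redirect_target("https://example.com/community.php?tid=555")` yields `/t/-/1234`, the path a crawler would be redirected to.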
Another detail to keep in mind is whether your community’s primary URL is changing in the process. If it’s staying the same, you have no work to do aside from pointing it to the new hosting location. However, if the URL is changing from something like community.example.com, you will need to set up a server-side 301 redirect from the old domain to the new one (including passing along the full path and query parameters) so Discourse can properly parse the redirect. Once this is done, again, search engines will re-crawl the site, follow the redirects, and update the records in their data.
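As a rough illustration of a domain-level 301 that keeps the path and query intact, the Python sketch below computes the status code and Location header for a request to the old domain. In practice this rule usually lives in your web server or proxy configuration; the function name and host names are hypothetical, and redirecting to https is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def moved_permanently(request_url: str, new_host: str) -> tuple[int, dict[str, str]]:
    """Return a 301 status and Location header pointing at new_host.

    The path and query string are carried over unchanged so that each old
    URL lands on its counterpart under the new domain.
    """
    parts = urlsplit(request_url)
    # Assumption: the new site is served over https. The fragment is dropped
    # because browsers do not send it to the server.
    location = urlunsplit(("https", new_host, parts.path or "/", parts.query, ""))
    return 301, {"Location": location}
```

For example, a request for `http://old.example.com/community.php?tid=555` with `new_host="new.example.com"` produces a Location of `https://new.example.com/community.php?tid=555`, which leaves the path and query available for any per-topic redirect on the new site.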
Focus on the Highest Value Items
If there’s anything you take away from this post, it’s to focus on the key components of SEO first, namely relevance to a user’s search, and handle the other technical elements of SEO as you are able. If your site has content that’s relevant to a searcher and other sites link to it, Google and other search engines will rank it. Other elements, such as metadata or sitemaps, while important, are not nearly as important as the content's relevance.
Did we miss anything? Have any additional questions? Let us know your thoughts in the discussion linked below.