Blog

Updates & community insights

  • Discourse team grows to 50

    It’s been a long, long time since we wrote about growing our team to 20. The last few years have been good to us and as a result, we’ve grown steadily and have also continued giving back to open source wherever possible. You might have heard of the new arrivals if you frequent the Discourse Meta, but here’s a list anyway.

    Meet the rest of the team!

    • Daniela Bogazzi - Technical Advocate
    • Kyle Mitchell - Lawyer
    • Jeff Wong - Software Engineer
    • Johani Faris Saeed - Designer
    • Ginevra Brown - Community Accounts Specialist
    • David Taylor - Software Engineer
    • Rishabh Nambiar - Community Team Lead
    • Bianca Nenciu - Software Engineer
    • Penar Musaraj - Software Engineer
    • Saj Goonatilleke - Operations Engineer
    • Dan Ungureanu - Software Engineer
    • Taylor Henry - Technical Advocate
    • Roman Rizzi - Software Engineer
    • Justin DiRose - Technical Advocate
    • Daniel Waterworth - Software Engineer
    • Jarek Radosz - Software Engineer
    • Kris Kotlarek - Software Engineer
    • Mark VanLandingham - Software Engineer
    • Martin Brennan - Software Engineer
    • Osioke Itseuwa - Community Advocate
    • Will Chau - Customer Success Manager
    • Jordan Vidrine - Designer
    • Kane York - Software Engineer
    • Michelle Vendrame - Technical Advocate
    • Tobias Eigen - Teams Product Manager
    • Jamie Wilson - Software Engineer
    • Michael Fitz-Payne - Operations Engineer
    • Osama Sayegh - Software Engineer
    • Blake Sorrell - Customer Success Manager
    • Eleni Michalaki - Operations Engineer
    • Andrei Prigorshnev - Software Engineer
    • Alex Reed - Administrative Assistant

    While it’s not a prerequisite, it’s clear we love to hire from our community. To read more about each member (ft. glorious drawings) and working with us, check out discourse.org/team.

    We’re a fully remote company, working from 19 different countries and 15 different time zones. Does that make you wonder how we coordinate our work?

    That’s right, we use Discourse as our primary team coordination tool to build Discourse! As it excels at asynchronous, distributed teamwork, we can keep interruptions like instant messaging, calls, and meetings to a minimum. If that approach sounds interesting, don’t forget to try Discourse for Teams.

    Here’s to…the future of Discourse and to our community 🍻

  • Standing on the Shoulders of a Giant Elephant: Upgrading Discourse to PostgreSQL 13

    When the Discourse project started, way back in 2013, the team had to choose from a handful of tools what would become our “stack”, the foundational software on which Discourse would be built. Some choices proved sub-optimal early on, but we were able to migrate away quickly, as with our move from CoffeeScript to JavaScript.

    On the other hand, most of our choices proved to be great, with picking PostgreSQL for the database being the finest. To illustrate how happy we are with it, let’s talk about our favorite feature of the latest PostgreSQL version: B-Tree deduplication.

    A little backstory on our hosting service

    While Discourse is, of course, 100% open source software first and foremost, we are also a hosting company. Since we started our commercial hosting service back in 2014, we have grown to serve over 400 million page views and store over 4 million new posts each month.

    All this data is stored in PostgreSQL instances, so as you can imagine we were very interested when the PostgreSQL 13 release notes contained news about “significant improvements to its indexing and lookup system that benefit large databases, including space savings and performance gains for indexes”. It even made us consider breaking from our tradition of skipping the odd-numbered PostgreSQL versions and upgrading only every two years. In order to make an informed decision, we had to benchmark.

    Activate the Shrink Ray

    To evaluate whether the new B-Tree deduplication feature would benefit Discourse, we decided to check its effect on the largest table in most Discourse instances we host, the post_timings table. This table stores the read time of each user for each post and is defined as:

    discourse=# \d post_timings
                  Table "public.post_timings"
       Column    |  Type   | Collation | Nullable | Default 
    -------------+---------+-----------+----------+---------
     topic_id    | integer |           | not null | 
     post_number | integer |           | not null | 
     user_id     | integer |           | not null | 
     msecs       | integer |           | not null | 
    Indexes:
        "index_post_timings_on_user_id" btree (user_id)
        "post_timings_summary" btree (topic_id, post_number)
        "post_timings_unique" UNIQUE, btree (topic_id, post_number, user_id)
    

    We are also investigating whether we can drop the post_timings_summary index, as its columns are a left-most prefix of the columns in post_timings_unique, which means the unique index could potentially serve the same lookups.
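
    To see why a left-most prefix matters, here is a minimal Python sketch. It is not how PostgreSQL implements B-Trees; it only models an index as a sorted list of (topic_id, post_number, user_id) tuples, and the values and the function name lookup_by_prefix are made up for illustration. Because the entries are sorted by the first two columns before the third, all entries for a given (topic_id, post_number) pair sit next to each other and can be found with a range search:

```python
from bisect import bisect_left, bisect_right

# Toy stand-in for the post_timings_unique index: entries kept sorted by
# (topic_id, post_number, user_id). The values are invented for illustration.
unique_index = sorted([
    (1, 1, 10), (1, 1, 11), (1, 2, 10),
    (2, 1, 10), (2, 1, 12), (2, 1, 13),
])

def lookup_by_prefix(index, topic_id, post_number):
    """Return every entry whose first two columns match the given prefix."""
    # A shorter tuple sorts before any longer tuple sharing its prefix,
    # so this finds the first entry for (topic_id, post_number).
    lo = bisect_left(index, (topic_id, post_number))
    # Infinity as the third column sorts after every real user_id.
    hi = bisect_right(index, (topic_id, post_number, float("inf")))
    return index[lo:hi]

print(lookup_by_prefix(unique_index, 2, 1))
# [(2, 1, 10), (2, 1, 12), (2, 1, 13)]
```

    Whether the database would actually prefer the wider index for such queries depends on its size and on the planner's cost estimates, which is why this is something to investigate rather than assume.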

    In one particular instance we host, this table recently passed a billion rows, so we used that number of rows for our test. Also, since in a live system this table receives a constant influx of updates, MVCC can leave us with quite a bit of “bloat” that would skew our analysis. So, to compare in a clean environment, we used brand-new installs of the latest releases of PostgreSQL 12 and 13. After loading the data into each version, the numbers are as follows:

    Total table size

    PostgreSQL 12: 114 GB
    PostgreSQL 13:  85 GB
    

    A 25% reduction in the relation size? That’s awesome! 🥳

    Digging into specifics we have:

    PostgreSQL 12
      Table: 42 GB
      Index: 72 GB
    
    PostgreSQL 13
      Table: 42 GB
      Index: 43 GB
    

    As foretold in the release notes, the optimization only applies to the indexes, and we can reproduce it here. The table size is still the same, but the total index size shrank by about 40%.
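
    The percentages can be checked directly from the figures reported above. The short Python snippet below does only that arithmetic; the helper name reduction_pct is ours, not part of any tool:

```python
# Sizes in GB, copied from the measurements reported above.
sizes = {
    "PostgreSQL 12": {"table": 42, "index": 72},
    "PostgreSQL 13": {"table": 42, "index": 43},
}

def reduction_pct(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100 * (before - after) / before

pg12, pg13 = sizes["PostgreSQL 12"], sizes["PostgreSQL 13"]
total_12 = sum(pg12.values())  # 114 GB
total_13 = sum(pg13.values())  # 85 GB

print(f"total: {reduction_pct(total_12, total_13):.1f}% smaller")          # total: 25.4% smaller
print(f"table: {reduction_pct(pg12['table'], pg13['table']):.1f}% smaller")  # table: 0.0% smaller
print(f"index: {reduction_pct(pg12['index'], pg13['index']):.1f}% smaller")  # index: 40.3% smaller
```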

    If we enhance it further:

    PostgreSQL 12
    
                   relation               |    size    
    --------------------------------------+------------
     public.post_timings                  |   42 GB
     public.post_timings_unique           |   30 GB
     public.index_post_timings_on_user_id |   21 GB
     public.post_timings_summary          |   21 GB
    
    
    PostgreSQL 13
    
                   relation               |    size    
    --------------------------------------+------------
     public.post_timings                  |   42 GB
     public.post_timings_unique           |   30 GB
     public.post_timings_summary          | 6939 MB
     public.index_post_timings_on_user_id | 6766 MB
    

    Again, as expected, the UNIQUE index, which by definition has zero duplication, saw no change in its size, but the indexes with repeating values shrank to roughly a third of their original size.
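
    The intuition behind this difference can be sketched in a few lines of Python. This is a deliberately simplified model, not PostgreSQL's implementation (the real feature works within individual index pages and has more details): without deduplication an index stores one copy of the key per row, while with deduplication each distinct key is stored once alongside a list of the rows that share it. The sample data and function names are invented for illustration:

```python
from collections import defaultdict

def key_copies_plain(entries):
    """Without deduplication: one stored key per (key, row_id) entry."""
    return len(entries)

def deduplicate(entries):
    """Group row ids under each distinct key, storing the key only once."""
    grouped = defaultdict(list)
    for key, row_id in entries:
        grouped[key].append(row_id)
    return dict(grouped)

# Hypothetical rows as (topic_id, post_number, user_id).
rows = [(1, 1, 7), (1, 2, 7), (1, 3, 7), (2, 1, 8), (2, 2, 8), (3, 1, 9)]

# Index on user_id alone: many rows share the same key.
by_user = [(user_id, row_id) for row_id, (_, _, user_id) in enumerate(rows)]
# Unique index on all three columns: every key is distinct.
unique = [(row, row_id) for row_id, row in enumerate(rows)]

print(key_copies_plain(by_user), len(deduplicate(by_user)))  # 6 3
print(key_copies_plain(unique), len(deduplicate(unique)))    # 6 6
```

    In this toy model the user_id index stores half as many keys after grouping, while the unique index gains nothing, which mirrors the pattern in the measurements above.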

    Not only does the index size change, but so does performance. According to the PostgreSQL documentation on the topic:

    This significantly reduces the storage size of indexes where each value (or each distinct combination of column values) appears several times on average. The latency of queries can be reduced significantly. Overall query throughput may increase significantly. The overhead of routine index vacuuming may also be reduced significantly.

    They also add the caveat that write-heavy workloads with no duplication will incur a small, fixed performance penalty. That’s not our case here, but if it were, it would be alleviated by the fact that these writes happen in a completely async code path in our application: it’s a background request in our client and a non-blocking route in our Rails app that leverages Rack Hijack.

    [Image: presentation slide about Rack Hijack]

    So the prophecy was true: PostgreSQL 13 brings significant improvement to Discourse!

    That’s a big deal, because here we saw the effect on one table in a single database, while our database schema has dozens of tables. And we host thousands of Discourse instances, each with multiple PostgreSQL instances for high availability, so the gains are multiplied many times over.

    Discourse ❤️ PostgreSQL

    As we said in Discourse Gives Back 2017, Discourse has always been a 100% open source project that builds upon the decades of hard work of many other open source projects to survive. As we grow we’re happy to be able to also contribute directly to funding the projects we rely on the most. That is why last year we made another monetary donation to the PostgreSQL foundation and we aim to do the same every year.

  • Discourse Essentials Starter Kit

    Want to use Discourse but unsure about where to start? This curated list of articles will help enhance your Discourse knowledge right away!

    Photo by Dave Catchpole / CC BY

    Discourse New User Guide

    Dive into your first Discourse site after learning how to browse through topics, read posts and participate in civilized discussion!

    Discourse Moderation Guide

    If you’re a Discourse moderator, this guide will run through most common scenarios in detail and show you how each can be handled with Discourse.

    Interface Nomenclature Guide

    Categories or channels? Topics or threads? Posts or messages? Read our nomenclature guide and know the correct term for every situation, every time.

    Beginner’s guide to using Discourse Themes

    Fascinated by a Discourse that looks nothing like it’s supposed to? Find out more by jumping into the world of themes, theme components, color palettes and more. Discourse can be customized to nearly any extent; see our diverse list of customer sites if you don’t believe that yet.


    All of the advice above is valid for every Discourse instance, regardless of whether you self-host or use our fully managed hosting service. If you have more questions about Discourse, do a quick search or post them on the Discourse Meta, where our helpful community will be happy to assist.
