
Introducing Discourse AI

Falco

Apr 24, 2023 • 6 min read

We are happy to announce Discourse AI, the first step in integrating Artificial Intelligence with Discourse communities. This new plugin is a one-stop solution that both adds new features and enhances existing ones.

With this first release, we are shipping 7 different Discourse AI modules. These modules are designed to help community managers, members, and moderators with tasks such as semantic search, suggestions of related topics to explore, sentiment analysis, toxicity detection, topic and chat summarization, NSFW image detection, and improving posts through proofreading, suggested edits, and translation.

Read on to find out more about each of these features, as well as what is coming up next on our roadmap!

Discourse AI Modules

For Discourse AI, we have opted to keep all of its features in a single plugin, with separate modules that you can enable independently depending on your community’s needs.

We've also made it a priority not to lock you into a single company's API, so every community can pick the provider that makes sense for them, balancing data privacy, performance, feature sets and vendor lock-in.

So, without further ado, let's see what modules you can start using today:

Semantic Related Topics

When you get to the end of a topic, Discourse presents you with 5 suggested topics to read next. Currently, we pick 5 random topics for anonymous users and use unread topics for logged-in users to populate that list, which makes it quick to generate but not very useful when you are researching a specific subject.

With the new Semantic Suggested feature, we use Semantic Textual Similarity between the current topic and all the other topics in your instance to suggest topics that are potentially more relevant to what a person is looking for.

A topic about a video game will suggest other similar games or sequels, a topic about technology will suggest other related technologies, and a topic about a new band album will suggest other albums by the same band, or similar ones.
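
To make the idea concrete, here is a minimal sketch of ranking topics by cosine similarity between embedding vectors. It is not the plugin's actual code: the three-dimensional vectors, the topic slugs, and the `related_topics` helper are all made up for illustration, and real embeddings produced by a model have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def related_topics(current_id, embeddings, limit=5):
    """Return up to `limit` topic ids ranked by similarity to `current_id`."""
    current = embeddings[current_id]
    scored = [
        (cosine_similarity(current, vector), topic_id)
        for topic_id, vector in embeddings.items()
        if topic_id != current_id  # never suggest the topic being read
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [topic_id for _, topic_id in scored[:limit]]


# Hypothetical toy embeddings; a real model would derive these from topic text.
toy_embeddings = {
    "zelda-sequel-thoughts": [0.9, 0.1, 0.0],
    "best-open-world-games": [0.8, 0.2, 0.1],
    "new-band-album-review": [0.1, 0.9, 0.2],
    "postgres-upgrade-guide": [0.0, 0.1, 0.9],
}

suggestions = related_topics("zelda-sequel-thoughts", toy_embeddings, limit=2)
print(suggestions)
```

In this toy data the other gaming topic ranks first because its vector points in nearly the same direction as the current topic's. On a real instance, comparing against every topic in Python would be slow, which is why a vector index in the database is the more likely home for this kind of lookup.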

We have been running this new suggested topics feature at Meta for the last few weeks, and we are very excited about the results so far.

A list of topics semantically related to this Discourse feature request

Community Sentiment

With the sentiment module, we automatically classify every post in your community across sentiment (positive or negative) and/or emotion (joy, surprise, anger, disgust, fear, sadness, or neutral). This allows your staff team to gain valuable insights into the health of the community and facilitates analysis of sentiment across various axes, including category, topic, and user levels.

User overall sentiment on the user profile page - Future UI mockup

Overall community sentiment over time - Future UI mockup
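
As a rough illustration of what analysis "across various axes" could look like once every post carries a sentiment label, the sketch below computes the share of positive posts per category and per user. The record layout, the sample data, and the `positive_share` helper are assumptions made for this example, not the module's real data model.

```python
from collections import defaultdict

# Hypothetical classifier output: one record per post.
classified_posts = [
    {"user": "alice", "category": "support", "sentiment": "negative"},
    {"user": "alice", "category": "feature", "sentiment": "positive"},
    {"user": "bob", "category": "support", "sentiment": "positive"},
    {"user": "bob", "category": "support", "sentiment": "negative"},
    {"user": "carol", "category": "feature", "sentiment": "positive"},
]


def positive_share(posts, axis):
    """Fraction of positive posts for each value of `axis` (e.g. "category")."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for post in posts:
        key = post[axis]
        totals[key] += 1
        if post["sentiment"] == "positive":
            positives[key] += 1
    return {key: positives[key] / totals[key] for key in totals}


by_category = positive_share(classified_posts, "category")
by_user = positive_share(classified_posts, "user")
print(by_category)
print(by_user)
```

The same grouping works for any field on the record, so adding a topic id or a date bucket would give per-topic or over-time views without changing the helper.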

Composer AI Helper

Have you ever found yourself or someone you know in one of the following scenarios?

Writing a new topic in your favorite Discourse instance but having a hard time coming up with a good title. Naming things is hard!

Trying to reply to a topic where you know you can be helpful, but while you understand the language it was written in well enough, it's not as easy to put your well-thought-out reply into words in that same language.

You just finished writing a good-sized post. Your arguments are solid, and you're pretty sure they could be presented better if you took the time to revise the whole thing, but ain't nobody got the time for that!

We have found ourselves in situations just like these many times, so to ease communication on Discourse, we created the Composer AI Helper module.

After composing your post, click on the ✨ icon and select any of the following options:

Suggest titles
Translate to English
Proofread
Create table

And after a couple of seconds, you will get some help from the AI.

Example of an AI proofread pass in a topic

5 proposed topic titles generated by the AI Helper module.

Toxicity Detection

The toxicity module scans new posts and chat messages and scores them for toxicity across a variety of labels. Those scores are all available in reports, where community moderators can identify content that may not be appropriate for your instance.

And, if you want to go one step further, you can enable automatic flagging of content that crosses a customizable toxicity threshold, which will put the potentially problematic content into the Discourse Review Queue, where it can be manually analyzed by your mod team.

Example of a post that was automatically flagged because it was deemed toxic
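
The threshold rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the label names and scores are invented sample data, the helper names are hypothetical, and treating a score equal to the threshold as "crossing" it is a choice made for this example rather than documented plugin behavior.

```python
def labels_over_threshold(scores, threshold):
    """Return the labels whose score meets or exceeds `threshold`, sorted by name."""
    return sorted(label for label, score in scores.items() if score >= threshold)


def should_flag(scores, threshold):
    """A post is flagged for review when at least one label crosses the threshold."""
    return bool(labels_over_threshold(scores, threshold))


# Hypothetical per-label scores for a single post, each between 0 and 1.
post_scores = {"toxicity": 0.91, "insult": 0.84, "threat": 0.02}

flagged_labels = labels_over_threshold(post_scores, threshold=0.8)
print(flagged_labels)
print(should_flag(post_scores, threshold=0.95))
```

Raising the threshold trades fewer false positives in the review queue for more missed posts, so it is worth tuning against what your moderators actually see in the reports.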

NSFW Image Detection

The NSFW module automatically scans every new upload in user posts and classifies each image found for what's usually considered inappropriate content. The classification results are available to your moderator team via reports and, optionally, you can enable automatic flagging of content that crosses a certain threshold.

Example of a post that was automatically flagged because it contained an NSFW image

Semantic Search

Using the same embeddings technique as Related Topics, we have also introduced a semantic search option to the full page search. This is particularly useful when searching using fully worded questions instead of just keywords.

Summarization

The Summarization module provides users with short summaries of Topics or Chat Channels, so you can get an overview of a long discussion in a few seconds.

Example of an AI generated summary of a Discourse UI change announcement topic

Module Providers

As we said above, we are committed to offering new AI features without compromising your privacy. See below for the current providers and models for each module. CDCK handles hosting for open-source models in our infrastructure and API keys for SaaS providers like OpenAI.

Toxicity detection is powered by https://github.com/unitaryai/detoxify
Sentiment uses https://huggingface.co/j-hartmann/emotion-english-distilroberta-base and https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest
NSFW detection uses https://github.com/GantMan/nsfw_model and https://github.com/bhky/opennsfw2
Semantic Suggested Topics and Semantic Search use either https://github.com/UKPLab/sentence-transformers or OpenAI to generate embeddings, and https://github.com/pgvector/pgvector for storage and search.
Composer AI Helper uses either the OpenAI or the Anthropic API.
Summarization uses https://huggingface.co/philschmid/bart-large-cnn-samsum, https://huggingface.co/philschmid/flan-t5-base-samsum, https://huggingface.co/pszemraj/long-t5-tglobal-base-16384-book-summary, OpenAI or Anthropic.

Let us know if you want to see another provider for existing modules, as we are looking to increase our customers' options.

What’s next?

We are excited to continue improving the features we’re shipping today based on your feedback and we also have plenty of ideas for things to add next to enrich your community experience on Discourse. Some examples are:

Extraction of text from images
PII detection
Generate topic thumbnails using AI
Automatic image captions
Emoji contextual suggestions

As always, we are eager for your feedback, so share your thoughts in the comments below.

Disclaimer

We are being very mindful with our experimentation around AI. The algorithms we are leaning on are only as good as the data they were trained on. Bias, inaccuracies, and hallucinations are all possibilities we need to allow for. We regularly revisit, test, and refine our AI modules.

If you encounter any concerning issues with our AI modules, please contact ai-safety@discourse.org.

Installing Discourse AI on your community

We are making a preview of Discourse AI available to all our Enterprise customers today! Please contact our support team to get it installed and configured on your instance.

Based on feedback from our customer base and real-world usage, we will decide on availability for other hosted plans. Please contact team@discourse.org if you are interested!

If you self-host Discourse on your own server, you can use the modules that rely on OpenAI or Anthropic right away. To also use the modules powered by open-source models, check out the self-hosting Discourse AI guide topic on Meta.