Storing Rich Text

I've been researching rich text editors and different approaches to storing rich text. It's a surprisingly complicated question, so I thought I would post what I'm finding.

Discourse uses (and expects all users to use) Markdown, and shows a preview panel while you are writing posts. The post editor comes with a toolbar that (similar to what Discord does while I'm writing this post) provides shortcuts to help the user with Markdown syntax. The Discourse API gives you both the original Markdown and the 'cooked' HTML, which I guess it renders when the post is saved or published. Sometimes HTML tags are also included in the Markdown if you use a custom plugin.
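To make the raw/cooked idea concrete, here is a minimal sketch of storing both forms side by side. The `StoredPost` shape, the `raw` field name and the `cookMarkdown` helper are my own assumptions for illustration, and the converter is a toy that only handles bold and italic; a real setup would use a proper Markdown library.

```typescript
// Hypothetical record: the Markdown the author typed plus the HTML rendered from it.
interface StoredPost {
  raw: string;    // original Markdown source
  cooked: string; // HTML generated from `raw` at save time
}

// Escape characters that would otherwise be read as HTML.
function escapeMarkdownText(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Toy converter (assumption, not Discourse's pipeline): paragraphs,
// **bold** and *italic* only.
function cookMarkdown(raw: string): string {
  return raw
    .split(/\n{2,}/)
    .map((paragraph) => paragraph.trim())
    .filter((paragraph) => paragraph.length > 0)
    .map(
      (paragraph) =>
        "<p>" +
        escapeMarkdownText(paragraph)
          .replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>")
          .replace(/\*(.+?)\*/g, "<em>$1</em>") +
        "</p>"
    )
    .join("\n");
}

// Render once when saving, so readers never pay the conversion cost.
function savePost(raw: string): StoredPost {
  return { raw, cooked: cookMarkdown(raw) };
}

const examplePost = savePost("Hello **world**\n\nThis is *rich* text");
console.log(examplePost.cooked);
```

Keeping `raw` means the post can be re-rendered later if the presentation rules change, which is the "agnostic to presentation style" benefit of Markdown.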

As a user experience, I think that even with toolbars that provide shortcuts, expecting users to learn and write (or even look at) Markdown is probably the biggest issue with Discourse, particularly when writing long or media-rich posts. Markdown is very simple to store though, and benefits from being more agnostic to presentation style.

For editing long-form content like blog posts, I guess the most popular paradigm is WYSIWYG rich text editing like you get in Microsoft Word, WordPress etc.

In total I've looked into the following open source options for rich text editors:

  • CKEditor
  • Quill
  • Editor.js
  • Draft.js
  • Slate
  • ProseMirror

basically my main findings are:

  • the main options for storing rich text are HTML, Markdown and JSON. HTML is generated by most older WYSIWYG editors such as CKEditor. A lot of people prefer storing text as JSON for lots of reasons, but I guess the big-picture reason is that it provides an abstraction for content that can be resilient and useful across a variety of contexts, whereas HTML could be limiting because it ties the content to web markup
  • storing rich text as JSON basically means that the text is stored as a tree of nested node objects. I first encountered the concept with Editor.js, which is marketed as a block editor library. This appealed to me because I’ve really enjoyed using Notion’s editor (block-based with drag and drop) and Squarespace. We wouldn’t need anything so advanced, but having the freedom to define custom media blocks is a massive plus for making it easy to integrate lots of media types. I think it’s also going to be good for accessibility. I just realised last night that Editor.js doesn’t store inline styles as JSON, just blocks: anything inside a block is styled using HTML. Other libraries like Slate store all content as JSON, both blocks and inline nodes. An advantage of this, I think, is that it makes it easier to implement Operational Transformation, which is a technique for reconciling simultaneous edits and is what enables real-time collaboration
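
To illustrate the difference between the two JSON approaches, here is a rough sketch in TypeScript. Both shapes are simplified from memory of what Editor.js and Slate produce, so treat the exact field names as assumptions; the `toPlainText` and `toHtml` helpers are hypothetical and only there to show why a full tree is handy across contexts.

```typescript
// Editor.js-style (simplified): blocks are JSON, but inline formatting
// stays as an HTML string inside the block's data.
const blockStyleDoc = {
  blocks: [{ type: "paragraph", data: { text: "Hello <b>world</b>" } }],
};

// Slate-style (simplified): inline formatting is part of the tree too.
type TextNode = { text: string; bold?: boolean; italic?: boolean };
type ElementNode = { type: string; children: Array<ElementNode | TextNode> };
type RichNode = ElementNode | TextNode;

const treeStyleDoc: ElementNode[] = [
  {
    type: "paragraph",
    children: [{ text: "Hello " }, { text: "world", bold: true }],
  },
];

function isTextNode(node: RichNode): node is TextNode {
  return "text" in node;
}

// Same tree, first output: plain text (e.g. for search or previews).
function toPlainText(node: RichNode): string {
  return isTextNode(node) ? node.text : node.children.map(toPlainText).join("");
}

function escapeNodeText(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Same tree, second output: HTML for the web.
function toHtml(node: RichNode): string {
  if (isTextNode(node)) {
    let html = escapeNodeText(node.text);
    if (node.bold) html = `<strong>${html}</strong>`;
    if (node.italic) html = `<em>${html}</em>`;
    return html;
  }
  const inner = node.children.map(toHtml).join("");
  const tag = node.type === "paragraph" ? "p" : "div"; // unknown types fall back to <div>
  return `<${tag}>${inner}</${tag}>`;
}

console.log(toPlainText(treeStyleDoc[0]));
console.log(toHtml(treeStyleDoc[0]));
```

With the block-style document you would have to parse the HTML string inside `data.text` to get at the bold span, whereas in the tree-style document every formatted run is already its own node.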