Slate editors can edit complex, nested data structures. And for the most part this is great. But in certain cases inconsistencies in the data structure can be introduced, most often when allowing a user to paste arbitrary rich-text content.
"Normalizing" is how you can ensure that your editor's content is always of a certain shape. It's similar to "validating", except instead of just determining whether the content is valid or invalid, its job is to fix the content to make it valid again.
Slate editors come with a few built-in constraints out of the box. These constraints are there to make working with content much more predictable than standard contenteditable. All of the built-in logic in Slate depends on these constraints, so unfortunately you cannot omit them. They are...
These default constraints are all mandated because they make working with Slate documents much more predictable.
🤖 Although these constraints are the best we've come up with for now, we're always looking for ways to make Slate's built-in constraints less constraining if possible—as long as it keeps standard behaviors easy to reason about. If you come up with a way to reduce or remove a built-in constraint with a different approach, we're all ears!
The built-in constraints are fairly generic. But you can also add your own constraints on top of the built-in ones that are specific to your domain.
To do this, you extend the normalizeNode function on the editor. The normalizeNode function gets called every time an operation is applied that inserts or updates a node (or its descendants), giving you the opportunity to ensure that the changes didn't leave it in an invalid state, and correcting the node if so.
For example here's a plugin that ensures paragraph blocks only have text or inline elements as children:
This example is fairly simple. Whenever normalizeNode gets called on a paragraph element, it loops through each of its children, ensuring that none of them are block elements. And if one is a block element, it gets unwrapped, so that the block is removed and its children take its place. The node is "fixed".
But what if the child has nested blocks?
One thing to understand about normalizeNode constraints is that they are multi-pass.
If you check the example above again, you'll notice the return statement that follows the call to Transforms.unwrapNodes.

You might at first think this is odd, because with that early return, the original normalizeNode will never be called, and the built-in constraints won't get a chance to run their own normalizations.
But, there's a slight "trick" to normalizing.
When you call Transforms.unwrapNodes, you're actually changing the content of the node that is currently being normalized. So even though you're ending the current normalization pass, by making a change to the node you're kicking off a new normalization pass. This results in a sort of recursive normalizing.
This multi-pass characteristic makes it much easier to write normalizations, because you only ever have to worry about fixing a single issue at once, and not fixing every possible issue that could be putting a node in an invalid state.
To see how this works in practice, let's start with this invalid document:
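Sketched in pseudo-JSX, where the a, b and c "attributes" are just labels so we can refer to each node, not real properties:

```jsx
<editor>
  <paragraph a>
    <paragraph b>
      <paragraph c>word</paragraph>
    </paragraph>
  </paragraph>
</editor>
```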
The editor starts by running normalizeNode on <paragraph c>. And it is valid, because it contains only text nodes as children.
But then, it moves up the tree, and runs normalizeNode on <paragraph b>. This paragraph is invalid, since it contains a block element (<paragraph c>). So that child block gets unwrapped, resulting in a new document of:
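In the same pseudo-JSX sketch, <paragraph c> is now gone and its text has been lifted into <paragraph b>:

```jsx
<editor>
  <paragraph a>
    <paragraph b>word</paragraph>
  </paragraph>
</editor>
```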
And in performing that fix, the top-level <paragraph a> changed. It gets normalized, and it is invalid, so <paragraph b> gets unwrapped, resulting in:
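That leaves a single, flat paragraph:

```jsx
<editor>
  <paragraph a>word</paragraph>
</editor>
```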
And now when normalizeNode runs, no changes are made, so the document is valid!
🤖 For the most part you don't need to think about these internals. You can just know that anytime normalizeNode is called and you spot an invalid state, you can fix that single invalid state and trust that normalizeNode will be called again until the node becomes valid.
The one pitfall to avoid however is creating an infinite normalization loop. This can happen if you check for a specific invalid structure, but then don't actually fix that structure with the change you make to the node. This results in an infinite loop because the node continues to be flagged as invalid, but it is never fixed properly.
For example, consider a normalization that ensured link elements have a valid url property:
This fix is incorrectly written. It wants to ensure that all link elements have a string url property. But to fix invalid links it sets the url to null, which is still not a string!

In this case you'd either want to unwrap the link, removing it entirely, or expand your validation to accept an "empty" url == null as well.