Originally created by IMB developers working on the DPIA product.
- Install the package by following the steps in Installing the Package.
- Set up the package by following the steps in Basic Setup Guide.
- Output is HTML content in string form, not markdown.
- General Information
- Installing the Package - Start Here!
- Basic Setup Guide - Setting up after installing.
- Sanitization - How the output is sanitized.
- Applications using RichTextEditor - Find examples.
- Intended for React 18 apps running on Node.js 20.
- Works with vanilla JavaScript or TypeScript 5.
- Output is HTML content in string form, not markdown.
- Add the following line to your package.json:
{
  "dependencies": {
    "@bcgov/citz-imb-richtexteditor": "https://github.com/bcgov/citz-imb-richtexteditor/releases/download/v<VERSION>/bcgov-citz-imb-richtexteditor-<VERSION>.tgz",
    // The rest of your dependencies...
  }
}
- Replace <VERSION> with the version you wish to use. Refer to the releases for version numbers.
- Run npm install to add the package.
- Add the import: import { RichTextEditor } from '@bcgov/citz-imb-richtexteditor';
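The package.json dependency entry in the install steps above follows a fixed URL pattern, so the value can be generated from a version number. The following is a minimal sketch, assuming every release keeps the asset naming shown in that entry and uses a three-part version number. The helper name buildTarballUrl is hypothetical and not part of the package, and '1.0.0' is only a placeholder version.

```javascript
// Hypothetical helper: builds the release tarball URL used in package.json.
// Assumes every release follows the naming pattern shown in the install step.
function buildTarballUrl(version) {
  // Assumption: versions look like "1.2.3" (no leading "v", no suffix).
  if (!/^\d+\.\d+\.\d+$/.test(version)) {
    throw new Error(`Expected a version like "1.2.3", got "${version}"`);
  }
  const base = 'https://github.com/bcgov/citz-imb-richtexteditor/releases/download';
  return `${base}/v${version}/bcgov-citz-imb-richtexteditor-${version}.tgz`;
}

// Placeholder version; check the releases for a real version number.
console.log(buildTarballUrl('1.0.0'));
```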
Example:
import React, { useState } from 'react';
import { RichTextEditor } from '@bcgov/citz-imb-richtexteditor';

const MyComponent = () => {
  const [content, setContent] = useState('');
  const [readOnly, setReadOnly] = useState(false);

  return (
    <>
      <RichTextEditor content={content} setContent={setContent} readOnly={readOnly} />
    </>
  );
};
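Because content in the example above is an HTML string rather than plain text, an editor that looks empty may still hold markup. The sketch below shows one way to check for that before saving. It assumes the editor can leave placeholder markup such as <p><br></p> behind, which the source does not confirm, and isContentEmpty is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: reports whether an HTML string contains any visible text.
// Intended for simple validation only, not for sanitization.
function isContentEmpty(html) {
  const text = html
    .replace(/<[^>]*>/g, '')   // drop tags, keep the text between them
    .replace(/&nbsp;/g, ' ')   // treat non-breaking spaces as whitespace
    .trim();
  return text.length === 0;
}

console.log(isContentEmpty('<p><br></p>'));   // markup only, no text
console.log(isContentEmpty('<p>Hello</p>'));  // contains text
```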
The RichTextEditor provides a built-in function, sanitizeContent, which takes a string containing HTML content and sanitizes it by removing potentially malicious code. Here is a summary of what this sanitization handles and what it does not:
What it handles:
- Removal of DOCTYPE and XML declarations: Removes <!DOCTYPE ...> and <?xml ...?> declarations.
- Removal of <link> tags: <link> elements are stripped out; they are often used to include external resources like CSS files that could affect the styling or behavior of the page.
- Stripping of inline styles: It removes any style attributes from HTML tags to prevent unwanted styles from being applied.
- Removal of potentially harmful tags: It explicitly removes <script>, <style>, <iframe>, <form>, <object>, <embed>, and <applet> tags. These tags can be used to execute JavaScript or embed external resources, which can be a security risk.
- Removal of event handlers: All attributes starting with "on" (like onclick, onerror, etc.) are removed to prevent JavaScript code execution through event handlers.
- Blocking of javascript: and data: URLs: These are commonly used in XSS (Cross-Site Scripting) attacks to execute JavaScript or embed data directly into documents.
- Removal of XML and namespace attributes: Attributes like xmlns or those starting with "xml" (which are used to define namespaces in XML) are stripped out.
- Filtering out non-allowlisted tags: The function also loops through the HTML content, removing any tags that are not explicitly listed in the allowlist. Currently these are "p", "br", "h", "ul", "ol", "li", "b", "i", "s", "u", "a", "span".
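The steps listed above can be pictured with a small regex-based sketch. This is illustrative only and is not the package's sanitizeContent implementation: the function name sketchSanitize, the regular expressions, and the choice to list h1 through h6 explicitly instead of "h" are all assumptions. It covers only a subset of the listed steps, and because it works on raw text it cannot tell an attribute from similar-looking text outside a tag.

```javascript
// Illustrative only: NOT the package's sanitizeContent implementation.
// Assumption: headings are listed explicitly here instead of the "h" entry.
const ALLOWED_TAGS = ['p', 'br', 'ul', 'ol', 'li', 'b', 'i', 's', 'u', 'a', 'span',
  'h1', 'h2', 'h3', 'h4', 'h5', 'h6'];

function sketchSanitize(html) {
  return html
    // Remove DOCTYPE and XML declarations.
    .replace(/<!DOCTYPE[^>]*>/gi, '')
    .replace(/<\?xml[^>]*\?>/gi, '')
    // Remove dangerous elements together with their contents.
    .replace(/<(script|style|iframe|form|object|embed|applet)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, '')
    // Remove event handler attributes such as onclick or onerror.
    .replace(/\s+on\w+\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, '')
    // Remove inline style attributes.
    .replace(/\s+style\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, '')
    // Remove quoted href/src attributes that use javascript: or data: URLs.
    .replace(/\s+(href|src)\s*=\s*("\s*(javascript|data):[^"]*"|'\s*(javascript|data):[^']*')/gi, '')
    // Drop any remaining tag whose name is not on the allowlist (this also drops <link>).
    .replace(/<\/?([a-z][a-z0-9]*)\b[^>]*>/gi, (tag, name) =>
      (ALLOWED_TAGS.includes(name.toLowerCase()) ? tag : ''));
}

console.log(sketchSanitize('<p style="color:red" onclick="alert(1)">Hi</p><script>alert(2)</script>'));
```

The order matters in this sketch: elements such as <script> are removed with their contents before the allowlist step, which only strips the tags themselves and keeps the text between them.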
What it doesn't handle:
- Limited tag allowlist: The allowlist is not exhaustive and could filter out valid HTML5 elements that might be considered safe. Also, the code as written allows any tag starting with "h", which includes all heading tags (h1 through h6) but could inadvertently allow other, unsafe tags that start with "h" if not handled correctly.
- Potential for obfuscation to bypass sanitization: The code may not handle all cases of obfuscated JavaScript or other obfuscation techniques that could be used to bypass sanitization.
- Attribute sanitization: While it removes event handlers and specific dangerous attribute values like javascript: and data:, it does not check other attributes for potentially malicious content.
- Comments and conditional comments: The code appears to attempt to handle comments, but complex conditional comments could be missed.
- Other potential threats: The sanitization function does not consider other forms of exploits that can be hidden in seemingly benign attributes or in tags that are not included in the explicit block list.
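The "h" allowlist concern above can be made concrete with a short comparison. This is a hypothetical illustration of the issue, not the package's matching logic: allowedByPrefix shows one way an "h" entry could end up covering every tag that starts with "h", and allowedStrict shows a narrower check that only accepts h1 through h6. Both function names are made up for this sketch.

```javascript
// Allowlist as documented above.
const ALLOWLIST = ['p', 'br', 'h', 'ul', 'ol', 'li', 'b', 'i', 's', 'u', 'a', 'span'];

// Hypothetical prefix check: the "h" entry matches any tag name starting with "h".
function allowedByPrefix(tag) {
  const name = tag.toLowerCase();
  return ALLOWLIST.some((entry) =>
    (entry === 'h' ? name.startsWith('h') : name === entry));
}

// Hypothetical stricter check: "h" only stands for h1 through h6.
function allowedStrict(tag) {
  const name = tag.toLowerCase();
  if (/^h[1-6]$/.test(name)) return true;
  return name !== 'h' && ALLOWLIST.includes(name);
}

// Compare the two checks on heading and non-heading tags that start with "h".
for (const tag of ['h2', 'hr', 'html', 'head']) {
  console.log(tag, allowedByPrefix(tag), allowedStrict(tag));
}
```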
In summary, this function takes a series of important steps to sanitize HTML content, particularly aimed at preventing XSS attacks, but it isn't foolproof.
The following applications are currently using this package:
DPIA - Digital Privacy Impact Assessment