This project, data-obfuscation-tool, provides a solution for obfuscating sensitive data within databases. It is designed to protect the privacy and security of data by applying customizable obfuscation techniques.
Before you begin, ensure you have the following:
- Node.js installed on your machine.
- Access to an RDBMS, e.g. a MySQL database.
- The YAML configuration file set up for the database and tool settings.
To get started with the data-obfuscation-tool, follow these steps:
git clone https://github.com/QbDVision-Inc/data-obfuscation-tool.git
cd data-obfuscation-tool
npm install
- Create a Configuration File:
Copy config-template.yaml to config.yaml in the root directory.
cp config-template.yaml config.yaml
Here's an example format:
database:
  host: localhost
  username: root
  password: yourpassword
  database: yourdatabase
  dialect: mysql
  logging: false
inputDump: <Path to input dump>/database_dump.sql
outputDump: <Path to output dump>/obfuscated_dump.sql
Update this file with your database details, the name of the dump file, and the path for the obfuscated output.
- Define Obfuscation Rules:
Copy obfuscationCfg-template.yaml to obfuscationCfg.yaml in the root directory.
cp obfuscationCfg-template.yaml obfuscationCfg.yaml
The obfuscationCfg.yaml file is central to defining how data in your database should be obfuscated. It lets you specify rules for general data types, specific columns, and entire tables.
The configuration file is divided into three main sections: general, columns, and tables. The sections follow an inheritance order:
- tables inherits columns
- columns inherits general
This means you can set broad defaults in general, override them for specific columns, and, if needed, override them again for individual tables.
Each rule in the general section supports the following fields:
- type: Specifies the data type to which the rule applies (e.g., string).
- obfuscationRule: The name of the obfuscation function to use for this data type. Currently supported obfuscation rules are:
  - stringObfuscator: Masks the value with X except for the first character. The original value cannot be recovered, but somebody who knows the original might be able to guess it. Biopharma pilot becomes BXXXXXXXX p****.
  - xorObfuscator: Applies a bitwise XOR operation with a random key to each character. The same key must be used to de-obfuscate. Biopharma pilot becomes Id3ld4dasd aqwod (relatively random text).
  - dictionaryObfuscator: Replaces the value with a random value from a dictionary. The original value cannot be recovered. Biopharma pilot becomes Whimsical across (or some other random 9-character word).
  - noObfuscator: Replaces the value with itself. Biopharma pilot becomes Biopharma pilot (no change).
- ignorePattern: A regular expression pattern matching values that should not be obfuscated.
Example:
general:
  - type: "string"
    obfuscationRule: "stringObfuscator"
    ignorePattern: >
      (^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$)
      |(^\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b$)
      |(^\[?\s*\{.*?\}\s*(,\s*\{.*?\}\s*)*\]?$)
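To make the rule descriptions and ignorePattern concrete, here is a minimal sketch of how two of the obfuscators and the pattern check might behave. It is not the tool's actual code: the function bodies, the helper applyRule, the per-word masking, and the single numeric XOR key are all assumptions made for illustration, so the output can differ from what the tool produces.

```javascript
// Illustrative sketch only; the real implementations live in the tool's source.

// Assumption: mask each word separately, keeping its first character.
function stringObfuscator(value) {
  return value.replace(/\S+/g, (word) => word[0] + "X".repeat(word.length - 1));
}

// Assumption: one numeric key (0..65535) XORed with every UTF-16 code unit.
// Applying the function again with the same key restores the input.
function xorObfuscator(value, key) {
  let out = "";
  for (let i = 0; i < value.length; i++) {
    out += String.fromCharCode(value.charCodeAt(i) ^ key);
  }
  return out;
}

// Hypothetical helper: skip obfuscation when the value matches ignorePattern.
function applyRule(value, obfuscate, ignorePattern) {
  if (ignorePattern && new RegExp(ignorePattern).test(value)) {
    return value; // matched the ignore pattern, leave untouched
  }
  return obfuscate(value);
}

// The UUID alternative from the example pattern above.
const UUID_PATTERN =
  "^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$";

console.log(applyRule("Biopharma pilot", stringObfuscator, UUID_PATTERN));
console.log(applyRule("123e4567-e89b-12d3-a456-426614174000", stringObfuscator, UUID_PATTERN));
```

In this sketch the first call masks the text while the second returns the UUID unchanged, which is the usual reason for an ignorePattern: identifiers and structured values stay valid after obfuscation.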
Each entry in the columns section supports the following fields:
- name: The name of the column.
- ignore: Set to true to exclude this column from obfuscation, or false to include it. The default is false.
Example:
columns:
  - name: currentState
    ignore: true
  - name: objectModel
    ignore: true
  - name: fromLibraryModel
    ignore: true
Each entry in the tables section supports the following fields:
- name: The name of the table.
- ignore: Set to true to exclude the entire table from obfuscation.
- columns: An array of column-specific rules within the table.
Example:
tables:
  - name: SequelizeMeta
    ignore: true
  - name: Projects
    columns:
      - name: "name"
        obfuscationRule: "stringObfuscator"
        ignorePattern: null
  - name: UserActivities
    columns:
      - name: "requestContext"
        obfuscationRule: "requestContextObfuscator"
        ignorePattern: null
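The inheritance order (general, then columns, then tables) can be pictured as a merge in which the more specific section wins. The sketch below is one plausible reading, not the tool's actual resolution logic: the function resolveRule and its return shape are hypothetical, and it assumes that an explicit ignorePattern: null in a more specific section clears an inherited pattern.

```javascript
// Hypothetical resolver: more specific sections override broader ones.
function resolveRule(cfg, tableName, columnName, type) {
  const general = (cfg.general || []).find((r) => r.type === type) || {};
  const column = (cfg.columns || []).find((c) => c.name === columnName) || {};
  const table = (cfg.tables || []).find((t) => t.name === tableName) || {};
  const tableColumn =
    (table.columns || []).find((c) => c.name === columnName) || {};

  // Later spreads win: general < columns < tables[].columns.
  const merged = { ...general, ...column, ...tableColumn };

  return {
    // An ignored table excludes all of its columns.
    ignore: Boolean(table.ignore || merged.ignore),
    obfuscationRule: merged.obfuscationRule,
    ignorePattern: merged.ignorePattern ?? null,
  };
}

// Parsed form of a small obfuscationCfg.yaml (values are illustrative).
const cfg = {
  general: [
    { type: "string", obfuscationRule: "dictionaryObfuscator", ignorePattern: "^\\d+$" },
  ],
  columns: [{ name: "currentState", ignore: true }],
  tables: [
    { name: "SequelizeMeta", ignore: true },
    {
      name: "Projects",
      columns: [{ name: "name", obfuscationRule: "stringObfuscator", ignorePattern: null }],
    },
  ],
};

console.log(resolveRule(cfg, "Projects", "name", "string"));
console.log(resolveRule(cfg, "Projects", "description", "string"));
```

Here Projects.name picks up the table-level stringObfuscator, while Projects.description falls back to the general rule for strings.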
- This configuration file allows for fine-grained control over how different data types, columns, and tables are handled.
- Note that each rule must have an implementation in the ObfuscatorStrategyMap, and the function name must match the rule name.
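As a rough illustration of why the rule name and the function name must match, the sketch below looks rules up by name and fails loudly on an unknown one. The README does not show the real shape of ObfuscatorStrategyMap, so the Map used here, the helper getObfuscator, and the simplified stringObfuscator body are assumptions.

```javascript
// Assumed shape: rule names from obfuscationCfg.yaml mapped to functions.
const ObfuscatorStrategyMap = new Map([
  ["noObfuscator", (value) => value],
  [
    "stringObfuscator",
    // Simplified masking, for illustration only.
    (value) => value.replace(/\S+/g, (w) => w[0] + "X".repeat(w.length - 1)),
  ],
]);

// Hypothetical lookup helper: a misspelled rule name is reported immediately
// instead of silently leaving data unobfuscated.
function getObfuscator(ruleName) {
  const fn = ObfuscatorStrategyMap.get(ruleName);
  if (typeof fn !== "function") {
    throw new Error(`Unknown obfuscationRule "${ruleName}"`);
  }
  return fn;
}

console.log(getObfuscator("stringObfuscator")("Biopharma pilot"));
```

With a lookup like this, a configuration entry such as obfuscationRule: "stringObfuscate" (missing the final "r") would raise an error rather than be skipped.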
- Running the Tool:
node src/index.js
The script will run and obfuscate the data as per the configurations and rules you have defined.
If you are using MySQL, you may need to update your my.cnf/my.ini file to allow for larger packet sizes. This is necessary when importing large tables of data.
[mysqld]
max_allowed_packet=512M
Contributions to data-obfuscation-tool are welcome. Please feel free to fork the repository and submit pull requests.
This project is licensed under the MIT License.