data-sanitization

Pattern-based sanitization for sensitive data in objects and strings. Use it to mask or remove fields before logging, debugging, or sending data to systems that should not receive sensitive values such as secrets, PII, PHI, credentials, or other private data.

Works in Node.js and browsers with JavaScript and TypeScript. The package ships compiled JavaScript, TypeScript declarations, and source maps, with no runtime dependencies.

Installation

Install with the package manager used by your project.

npm

npm install data-sanitization

Yarn

yarn add data-sanitization

pnpm

pnpm add data-sanitization

Bun

bun add data-sanitization

Importing

The named export is recommended:

import { sanitizeData, DataSanitizationError } from 'data-sanitization';

The sanitizer is also available as the default export:

import sanitizeData from 'data-sanitization';

CommonJS consumers can require the compiled package:

const { sanitizeData } = require('data-sanitization');

Usage

Quick start

import { sanitizeData } from 'data-sanitization';

const input = {
  username: 'mark',
  password: 'super-secret',
  api_key: 'sk_live_abc123',
};

const result = sanitizeData(input);
// => { username: 'mark', password: '**********', api_key: '**********' }

Sanitize a string

String sanitization works with JSON-like strings, escaped JSON-like strings, and form-encoded strings:

sanitizeData('{"password":"secret","username":"mark"}');
// => '{"password":"**********","username":"mark"}'

sanitizeData('password=secret&username=mark');
// => 'password=**********&username=mark'

Remove fields instead of masking

sanitizeData(
  { password: 'secret', token: 'abc', username: 'mark' },
  { removeMatches: true },
);
// => { username: 'mark' }

Sanitize PII and PHI with custom patterns

Use customPatterns to mask fields that are sensitive for your domain, such as PII or PHI fields.

import { sanitizeData } from 'data-sanitization';

const sensitivePatterns = [
  'address',
  'date_of_birth',
  'email',
  'emergency_contact',
  'full_name',
  'health_card',
  'ip_address',
  'medications',
  'phone',
  'postal_code',
  'ssn',
];

const patient = {
  accountId: 'acct_123',
  full_name: 'Avery Example',
  email: 'avery@example.com',
  phone: '+1-555-0100',
  date_of_birth: '1989-04-12',
  health_card: 'HC-1234-5678',
  medications: ['example-medication'],
};

sanitizeData(patient, {
  customPatterns: sensitivePatterns,
  useDefaultPatterns: false,
});
// => {
//   accountId: 'acct_123',
//   full_name: '**********',
//   email: '**********',
//   phone: '**********',
//   date_of_birth: '**********',
//   health_card: '**********',
//   medications: '**********',
// }

Use removeMatches with the same patterns to remove those fields instead of masking them.

sanitizeData(patient, {
  customPatterns: sensitivePatterns,
  useDefaultPatterns: false,
  removeMatches: true,
});
// => { accountId: 'acct_123' }

Options

Option	Type	Default	Description
`patternMask`	`string`	`**********`	String used to replace matched string field values
`numericMask`	`number`	`9999999999`	Number used to replace matched number field values
`removeMatches`	`boolean`	`false`	Remove matched fields entirely instead of masking
`scanStringValues`	`boolean`	`true`	Scan string values on non-sensitive keys for embedded patterns (object input only)
`customPatterns`	`string[]`	`[]`	Additional field name patterns to match
`customMatchers`	`DataSanitizationMatcher[]`	`[]`	Additional regex matchers for custom string formats
`useDefaultPatterns`	`boolean`	`true`	Whether to include the built-in default patterns
`useDefaultMatchers`	`boolean`	`true`	Whether to include the built-in default matchers

Default patterns

The following field name patterns are matched by default using a case-insensitive substring match:

apikey
api_key
password
secret
token

A field named db_password or client_secret_key would also match because these patterns match as substrings.
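
The substring rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's implementation: `matchesAnyPattern` and `DEFAULT_PATTERNS_SKETCH` are hypothetical names, and the pattern list is copied from the list above.

```typescript
// Hypothetical sketch of case-insensitive substring matching on field names.
// The list mirrors the documented defaults; the names here are illustrative only.
const DEFAULT_PATTERNS_SKETCH: readonly string[] = [
  'apikey',
  'api_key',
  'password',
  'secret',
  'token',
];

function matchesAnyPattern(fieldName: string, patterns: readonly string[]): boolean {
  const lowered = fieldName.toLowerCase();
  // A field matches when any pattern occurs anywhere inside its name.
  return patterns.some((pattern) => lowered.indexOf(pattern.toLowerCase()) !== -1);
}

const checks = ['db_password', 'client_secret_key', 'API_KEY', 'username'].map(
  (fieldName) => [fieldName, matchesAnyPattern(fieldName, DEFAULT_PATTERNS_SKETCH)] as const,
);
console.log(checks);
```

Under a substring rule, a broad pattern such as token would also match a field like tokenizer_config, so it is worth reviewing custom patterns for accidental matches.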

Default matchers

Three matchers are included by default:

JSON matcher — matches "fieldName":"value" patterns in JSON and JSON-like strings
Escaped JSON matcher — matches \"fieldName\":\"value\" patterns in JSON embedded inside JSON string values
Form-encoded matcher — matches fieldName=value and fieldName:value patterns in URL-encoded and similarly delimited strings
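
To make the matcher idea concrete, here is a simplified sketch of a JSON matcher and a form-encoded matcher. The regular expressions are assumptions written for illustration and are not the library's actual expressions; `MatcherSketch`, `jsonMatcherSketch`, `formMatcherSketch`, and `applyMatchersSketch` are hypothetical names. The escaped JSON variant is omitted, and field names are matched exactly rather than as substrings to keep the regexes short.

```typescript
// Hypothetical stand-ins for the built-in matchers; not the library's real regexes.
type MatcherSketch = (pattern: string) => RegExp;

// Matches "fieldName":"value" and captures the text on both sides of the value.
const jsonMatcherSketch: MatcherSketch = (pattern) =>
  new RegExp(`("${pattern}"\\s*:\\s*")[^"]*(")`, 'gi');

// Matches fieldName=value or fieldName:value up to the next &, whitespace, or end of input.
const formMatcherSketch: MatcherSketch = (pattern) =>
  new RegExp(`(${pattern}[=:])[^&\\s]*(&|\\s|$)`, 'gi');

function applyMatchersSketch(
  input: string,
  patterns: string[],
  matchers: MatcherSketch[],
  mask: string = '**********',
): string {
  let output = input;
  for (const pattern of patterns) {
    for (const matcher of matchers) {
      // Keep both captured groups and swap only the value for the mask.
      output = output.replace(
        matcher(pattern),
        (_match: string, head: string, tail: string) => head + mask + tail,
      );
    }
  }
  return output;
}

const bothMatchers = [jsonMatcherSketch, formMatcherSketch];
console.log(applyMatchersSketch('{"password":"secret","username":"mark"}', ['password'], bothMatchers));
console.log(applyMatchersSketch('password=secret&username=mark', ['password'], bothMatchers));
```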

Custom patterns and matchers

import { sanitizeData } from 'data-sanitization';

const data = {
  username: 'mark',
  ssn: '123-45-6789',
  credit_card: '4111111111111111',
};

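// Add custom patterns on top of the defaults.
sanitizeData(data, {
  customPatterns: ['ssn', 'credit_card'],
});

// Match only the custom patterns and skip the defaults.
sanitizeData(data, {
  customPatterns: ['ssn'],
  useDefaultPatterns: false,
});

// Replace matched string values with a custom mask.
sanitizeData(data, {
  patternMask: '[REDACTED]',
});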

Number-typed sensitive values are masked with numericMask to preserve the field's type:

sanitizeData({ password: 12345, username: 'mark' });
// => { password: 9999999999, username: 'mark' }

sanitizeData({ password: 12345, username: 'mark' }, { numericMask: 0 });
// => { password: 0, username: 'mark' }

For custom data formats, provide a DataSanitizationMatcher: a function that takes a pattern string and returns a global, case-insensitive RegExp. The regex must define two capture groups, referenced as $1 and $2 during replacement. The first preserves the field name and its separator, the second preserves the trailing delimiter, and the value between them is replaced.

const headerMatcher = (pattern: string) =>
  new RegExp(`(${pattern}:\\s*).+?(\\n|$)`, 'gi');

sanitizeData('authorization: Bearer abc123\nuser: mark', {
  customPatterns: ['authorization'],
  customMatchers: [headerMatcher],
  useDefaultMatchers: false,
});
// => 'authorization: **********\nuser: mark'
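
When a matcher interpolates the pattern verbatim, as the header matcher above does, a pattern containing regex metacharacters (for example a dot) changes what the regex matches. The sketch below shows one way to guard against that. Everything in it is an assumption for illustration: `escapeRegExpSketch`, `safeHeaderMatcher`, and `maskWithMatcher` are hypothetical names, and `maskWithMatcher` only stands in for the replacement step so the example runs without the package. Whether the library escapes patterns itself is not stated here.

```typescript
// Hypothetical helper: escape regex metacharacters before interpolating a pattern.
function escapeRegExpSketch(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Header-style matcher that follows the two-capture-group convention described above.
const safeHeaderMatcher = (pattern: string): RegExp =>
  new RegExp(`(${escapeRegExpSketch(pattern)}:\\s*).+?(\\n|$)`, 'gi');

// Stand-in for the replacement step: keep both groups, swap the value for the mask.
function maskWithMatcher(input: string, regex: RegExp, mask: string = '**********'): string {
  return input.replace(
    regex,
    (_match: string, head: string, tail: string) => head + mask + tail,
  );
}

const maskedHeader = maskWithMatcher(
  'x-api.key: abc123\nuser: mark',
  safeHeaderMatcher('x-api.key'),
);
console.log(maskedHeader === 'x-api.key: **********\nuser: mark'); // true
```

With escaping in place, the dot in x-api.key matches only a literal dot, so a header named x-apiXkey is left alone.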

Error handling

sanitizeData throws a DataSanitizationError when:

The input is not a string, object, or null.
An unexpected error occurs during sanitization.

import { sanitizeData, DataSanitizationError } from 'data-sanitization';

try {
  sanitizeData(123 as any);
} catch (error) {
  if (error instanceof DataSanitizationError) {
    console.error(error.message); // 'Invalid data type'
    console.error(error.details); // { inputType: 'number' }
  }
}

Error details are limited to safe diagnostic metadata and do not include the original input payload.
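
Because sanitization can throw, a logging path may want a fallback that never emits the raw payload. The sketch below is one possible wrapper, not part of the package: `Sanitizer` and `sanitizeForLog` are hypothetical names, and the sanitizer is passed in as a parameter so the example stays self-contained. In an application you would pass sanitizeData, wrapped if needed to fit the assumed `Sanitizer` type.

```typescript
// Hypothetical wrapper; the sanitizer is injected so this sketch runs without the package.
type Sanitizer = (data: unknown) => unknown;

function sanitizeForLog(sanitize: Sanitizer, payload: unknown): unknown {
  try {
    return sanitize(payload);
  } catch (error) {
    // Never fall back to the raw payload: return a placeholder with the error name only.
    const errorName = error instanceof Error ? error.name : 'UnknownError';
    return `[unsanitized payload withheld: ${errorName}]`;
  }
}

// Stub sanitizer that rejects numbers, standing in for a real sanitizer.
const stubSanitizer: Sanitizer = (data) => {
  if (typeof data === 'number') throw new TypeError('Invalid data type');
  return data;
};

console.log(sanitizeForLog(stubSanitizer, 123));
console.log(sanitizeForLog(stubSanitizer, 'ok'));
```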

How it works

String input is sanitized directly via regex replacement with the configured matchers.
Object input is sanitized recursively by key name without JSON serialization. Sensitive keys are masked or removed regardless of whether their values are strings, numbers, arrays, objects, or other primitives.
Plain nested objects and arrays are cloned as they are sanitized. Non-plain object instances are preserved without modification to avoid corrupting their prototypes.
Null input is accepted and returns null.
For object input, each configured pattern is matched case-insensitively against keys. String values on non-sensitive keys are also scanned for embedded patterns by default (scanStringValues: true), which catches credentials embedded in log messages or other free-text fields. For string input, each pattern is passed to each matcher to produce regex instances that find and replace sensitive values in the string directly.
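
The object path described above can be approximated with a short recursive walk. This is a sketch under stated assumptions rather than the library's code: `SketchOptions`, `isPlainObject`, `isSensitiveKey`, and `sanitizeSketch` are hypothetical names, string-value scanning is left out, and the choice to give every non-number sensitive value the string mask is inferred from the examples in this README.

```typescript
// Hypothetical sketch of key-based recursive sanitization; not the library's code.
interface SketchOptions {
  patterns: string[];
  patternMask: string;
  numericMask: number;
  removeMatches: boolean;
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  if (typeof value !== 'object' || value === null) return false;
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

function isSensitiveKey(key: string, patterns: string[]): boolean {
  const lowered = key.toLowerCase();
  return patterns.some((pattern) => lowered.indexOf(pattern.toLowerCase()) !== -1);
}

function sanitizeSketch(value: unknown, options: SketchOptions): unknown {
  if (Array.isArray(value)) {
    // Arrays are cloned element by element.
    return value.map((item) => sanitizeSketch(item, options));
  }
  if (!isPlainObject(value)) {
    // Primitives and non-plain instances (such as Date) are returned untouched.
    return value;
  }
  const sanitized: Record<string, unknown> = {};
  for (const key of Object.keys(value)) {
    const current = value[key];
    if (isSensitiveKey(key, options.patterns)) {
      if (options.removeMatches) continue; // drop the field entirely
      // Assumption: numbers keep their type, every other value gets the string mask.
      sanitized[key] = typeof current === 'number' ? options.numericMask : options.patternMask;
    } else {
      sanitized[key] = sanitizeSketch(current, options);
    }
  }
  return sanitized;
}

const sketchOptions: SketchOptions = {
  patterns: ['password', 'token', 'api_key'],
  patternMask: '**********',
  numericMask: 9999999999,
  removeMatches: false,
};
console.log(
  JSON.stringify(
    sanitizeSketch({ user: 'mark', nested: { password: 'x' }, list: [{ api_key: 'k' }] }, sketchOptions),
  ),
);
```

Because the walk builds new plain objects and arrays, the input is not mutated, while instances with their own prototypes are passed through by reference.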

Performance

sanitizeData is designed for in-process sanitization of log payloads, request/response objects, and similar data before they leave your application. It is not designed for streaming pipelines or bulk batch processing of large files.

String-value scanning (scanStringValues: true, the default) adds overhead on object workloads. The cost depends on how many non-sensitive string fields the input has and how long they are. Rough throughput on a modern laptop (Apple M-series, Node.js 22):

Workload	ops/s	ms/call	scan overhead
Shallow object (1 sensitive key)	~243,000	~0.004	~55%
Log object, stack trace with credentials	~79,000	~0.013	~80%
Log object, clean stack trace	~199,000	~0.005	~49%
Object with 10KB non-sensitive string	~143,000	~0.007	~76%
Large flat object (50 fields, 1 sensitive key)	~69,000	~0.015	~15%
Array (1,000 items, 1 sensitive key each)	~2,043	~0.49	~4%
Array (1,000,000 items, 1 sensitive key each)	~1.7	~574	~4%

Array workloads pay ~2–4% overhead regardless of size — the per-item pre-filter cost is negligible. The cost is most visible on individual objects with long non-sensitive string values such as stack traces or large text fields; a single 10KB non-sensitive string value incurs ~76% overhead.

Set scanStringValues: false to recover the pre-scanning performance when you control your data structure and know sensitive values only appear on sensitive-named keys.
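
To check the trade-off on your own payloads, a rough timing helper is often enough. The sketch below is a hypothetical harness and is not the project's benchmark suite: `measureOpsPerSecond` and `msPerCall` are illustrative names, and Date.now() is used only because it is available in both Node.js and browsers. It also shows how the ops/s and ms/call columns relate.

```typescript
// Hypothetical micro-benchmark helper; coarse, but portable across Node.js and browsers.
function measureOpsPerSecond(fn: () => void, iterations: number): number {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  // Clamp to 1 ms so very fast runs do not divide by zero.
  const elapsedMs = Math.max(Date.now() - start, 1);
  return (iterations / elapsedMs) * 1000;
}

// Milliseconds per call is the reciprocal of ops/s, scaled to milliseconds.
function msPerCall(opsPerSecond: number): number {
  return 1000 / opsPerSecond;
}

// 243,000 ops/s works out to roughly 0.004 ms per call, matching the first table row.
console.log(msPerCall(243000).toFixed(4)); // '0.0041'
console.log(measureOpsPerSecond(() => JSON.stringify({ sample: true }), 10000) > 0); // true
```

To compare settings, pass two closures that call sanitizeData on the same payload, one with the defaults and one with scanStringValues: false, and use enough iterations that each run lasts well over a millisecond.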

For full benchmark tables, charts, and scaling analysis see docs/performance.md. To run the suite:

yarn bench

Contributing

For development setup, testing, and release process, see docs/development.md. For future direction, see docs/ROADMAP.md.

License

MIT
