@solvaratech/drawline-core

Mathematically Grounded, Engineering-Strong Database Seeding Engine

Drawline Core is a production-grade TypeScript library for intelligent, deterministic test data generation across multiple database systems. It provides a unified interface for schema inference, relationship resolution, and referentially intact data seeding with strong mathematical guarantees on data consistency.

Overview

Drawline addresses a hard problem in test engineering: generating realistic, referentially intact test data at scale across heterogeneous database systems. Traditional approaches rely on simple random generation or on expensive database lookups to maintain foreign key integrity. Drawline instead uses a mathematically derived, deterministic generation protocol that guarantees referential integrity without any database queries during generation.

Core Problem Statement

Given:

A database schema $S$ with collections $C = \{c_1, c_2, \ldots, c_n\}$
Relationships $R = \{r_1, r_2, \ldots, r_m\}$ defining foreign key dependencies
A generation seed $\sigma \in \mathbb{N}$

Generate documents $D_c = \{d_1, d_2, \ldots, d_k\}$ for each collection $c$ such that:

All foreign key references point to existing primary keys
The generation is fully deterministic: $G(\sigma, c, i) \rightarrow d_i$
No database queries are required during generation

Features Achieved

🎨 Drawline Semantic Engine (NEW v0.2.0)

Drawline now includes the Drawline Semantic Engine, powered by 60+ curated industry datasets. Instead of "Lorem Ipsum" or generic faker output, your test databases contain high-fidelity, domain-specific values.

60+ Industry Domains: Finance, Healthcare, Aviation, Logistics, Law, Science, Tech, and more.
Context-Aware Inference: The engine automatically detects field names like pan_card (Indian context), flight_number (Aviation), or diagnosis_code (Healthcare) and routes them to the correct semantic generator.
Zero-Dependency Core: High-performance generation without bloated external libraries.
Deterministic Randomness: Uses Xoshiro128 PRNG for repeatable, seed-based data generation across all 60+ datasets.
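
To make the idea of seed-based repeatability concrete, here is a minimal sketch of a seeded generator in the xoshiro128** family. The source only says "Xoshiro128", so the exact variant, the class name, and the seed-expansion step (a MurmurHash3-style finalizer) are assumptions of this example rather than Drawline's actual implementation.

```typescript
// Bijective 32-bit mixer, used here only to expand one seed into four state
// words. This seeding scheme is an assumption of the sketch.
function mix32(x: number): number {
  let z = x >>> 0;
  z = Math.imul(z ^ (z >>> 16), 0x85ebca6b);
  z = Math.imul(z ^ (z >>> 13), 0xc2b2ae35);
  return (z ^ (z >>> 16)) >>> 0;
}

// 32-bit left rotation.
const rotl = (x: number, k: number): number => (x << k) | (x >>> (32 - k));

class Xoshiro128 {
  private a: number;
  private b: number;
  private c: number;
  private d: number;

  constructor(seed: number) {
    const golden = 0x9e3779b9;
    // Four distinct mixer inputs, so the state can never be all zeros.
    this.a = mix32(seed + golden);
    this.b = mix32(seed + 2 * golden);
    this.c = mix32(seed + 3 * golden);
    this.d = mix32(seed + 4 * golden);
  }

  // One xoshiro128** step: scramble a state word, then advance the state.
  nextUint32(): number {
    const result = Math.imul(rotl(Math.imul(this.b, 5), 7), 9) >>> 0;
    const t = this.b << 9;
    this.c ^= this.a;
    this.d ^= this.b;
    this.b ^= this.c;
    this.a ^= this.d;
    this.c ^= t;
    this.d = rotl(this.d, 11);
    return result;
  }

  // Uniform float in [0, 1).
  nextFloat(): number {
    return this.nextUint32() / 4294967296;
  }
}

// Two generators with the same seed produce the same sequence.
const firstRng = new Xoshiro128(12345);
const secondRng = new Xoshiro128(12345);
const sameSequence = [0, 1, 2].every(
  () => firstRng.nextUint32() === secondRng.nextUint32()
);
console.log(sameSequence); // true
```

Because the whole state derives from the seed, rerunning generation with the same seed replays the same values for every dataset.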

🏢 Industry Templates

Drawline provides ready-to-use schema templates for various sectors:

Ecommerce: Multi-table setup with users, products, orders, and logistics tracking.
OTT Streaming: Profiles, movie titles, genres, and watch history.
Fintech: Transactions, bank details, and financial tax types.
Logistics: Carriers, shipments, and global tracking states.
Healthcare: Appointments, vitals, and medical specialties.
...and 6 more industry presets.

🛡️ Unified Validation CLI

A single entry point for all project health checks:

Dataset Integrity: Automatically validates all 60+ JSON semantic collections.
Performance Benchmarking: Non-interactive suite that measures TPS (Transactions Per Second) and latency.
Unit Testing: Full integration with Vitest for 100% core logic verification.

Multi-Database Adapter Architecture

Drawline implements a unified adapter pattern supporting 11+ database systems:

Adapter	Status	Key Features
PostgreSQL	✅ Complete	Schema inference, FK constraints, serial types
MySQL	✅ Complete	AUTO_INCREMENT, foreign keys
SQLite	✅ Complete	Embedded testing, local file support
MongoDB	✅ Complete	ObjectId generation, document embedding
CSV Export	✅ Complete	Automated Export alongside test reports
...and more		DynamoDB, Firestore, Redis, SQL Server

Field Inference Engine

Smart field generation with score-based routing:

// Automatic Industry Routing
this.addRule('flight_status', ['flight', 'status'], 10, (r) => SemanticProvider.getFlightStatus(r));
this.addRule('diagnosis', ['diagnosis'], 10, (r) => SemanticProvider.getHealthcareDiagnosis(r));

CI/CD Integration

Automated Benchmarking: Measures TPS and memory usage on every PR.
Artifact Upload: Generates and uploads PDF/Markdown reports and sample CSV data for every CI run.
Version Gating: Ensures npm publish only occurs if all 60+ datasets are valid and benchmarks are stable.

Technical Architecture

Core Data Flow

┌────────────────────────────────────────────────────────┐
│                TestDataGeneratorService                │
├────────────────────────────────────────────────────────┤
│  1. initialize(config, collections, relationships)     │
│     ├── Preload metadata from target DB                │
│     ├── Build relationship map                         │
│     └── Initialize seeded RNG                          │
│                                                        │
│  2. buildDependencyOrder()                             │
│     ├── Build DAG from relationships                   │
│     ├── Detect and break cycles                        │
│     └── Return topological sort                        │
│                                                        │
│  3. generateAndPopulate()                              │
│     ├── For each collection in order:                  │
│     │   ├── ensureCollection()                         │
│     │   ├── generateCollectionData()                   │
│     │   └── insertDocuments()                          │
│     └── Validate referential integrity                 │
└────────────────────────────────────────────────────────┘

Adapter Interface

abstract class BaseAdapter {
  // Connection management
  abstract connect(): Promise<void>;
  abstract disconnect(): Promise<void>;
  
  // Schema operations
  abstract collectionExists(name: string): Promise<boolean>;
  abstract ensureCollection(name: string, fields: SchemaField[]): Promise<void>;
  abstract getCollectionDetails(name: string): Promise<CollectionDetails>;
  abstract getCollectionSchema(name: string): Promise<SchemaField[]>;
  
  // Data operations
  abstract insertDocuments(
    collectionName: string, 
    documents: GeneratedDocument[]
  ): Promise<(string | number)[]>;
  
  abstract clearCollection(name: string): Promise<void>;
  abstract getDocumentCount(name: string): Promise<number>;
  
  // Validation
  abstract validateReference(
    collectionName: string, 
    fieldName: string, 
    value: unknown
  ): Promise<boolean>;
}

Class Hierarchy

BaseAdapter
├── PostgresAdapter
├── MySQLAdapter
├── SQLiteAdapter
├── MongoDBAdapter
├── DynamoDBAdapter
├── FirestoreAdapter
├── RedisAdapter
├── SQLServerAdapter
├── InMemoryAdapter (for testing)
├── EphemeralAdapter (for demos)
├── NullAdapter (no-op)
└── CSVExportAdapter (export)

Mathematical Foundations

1. Topological Sort for Generation Ordering

Problem: Given a DAG $G = (V, E)$ where $V = C$ and edges represent dependencies, find a linear ordering $\tau: V \rightarrow [1, |V|]$ such that $\forall (u, v) \in E: \tau(u) < \tau(v)$.

Algorithm: Kahn's algorithm with in-degree counting:

TOPOLOGICAL-SORT(G):
  Compute in-degree(v) for all v ∈ V
  Queue ← { v | in-degree(v) = 0 }
  result ← []
  
  while Queue not empty:
    v ← Queue.pop()
    result.append(v)
    for each edge (v, w):
      in-degree(w) ← in-degree(w) - 1
      if in-degree(w) = 0:
        Queue.push(w)
  
  return result

Complexity: $O(|V| + |E|)$
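
The pseudocode above can be sketched in TypeScript as follows. The function name, the edge representation (a `[parent, child]` pair meaning the parent must be generated first), and the error thrown on a leftover cycle are assumptions of this example, not the library's API.

```typescript
type DependencyEdge = [string, string]; // [parent, child]: parent is generated first

function topologicalSort(nodes: string[], edges: DependencyEdge[]): string[] {
  const inDegree = new Map<string, number>();
  const children = new Map<string, string[]>();
  for (const n of nodes) {
    inDegree.set(n, 0);
    children.set(n, []);
  }
  for (const [u, v] of edges) {
    children.get(u)?.push(v);
    inDegree.set(v, (inDegree.get(v) ?? 0) + 1);
  }

  // Start with every node that has no incoming edge.
  const queue = nodes.filter((n) => inDegree.get(n) === 0);
  const order: string[] = [];
  let head = 0;
  while (head < queue.length) {
    const v = queue[head++] as string;
    order.push(v);
    for (const w of children.get(v) ?? []) {
      const remaining = (inDegree.get(w) ?? 0) - 1;
      inDegree.set(w, remaining);
      if (remaining === 0) queue.push(w);
    }
  }

  // Nodes left unprocessed are on a cycle; Kahn's algorithm cannot order them.
  if (order.length !== nodes.length) throw new Error("cycle detected");
  return order;
}

const sampleEdges: DependencyEdge[] = [
  ["users", "posts"],
  ["posts", "comments"],
  ["users", "comments"],
];
const generationOrder = topologicalSort(["posts", "users", "comments"], sampleEdges);
console.log(generationOrder); // [ 'users', 'posts', 'comments' ]
```

Each node and each edge is visited once, which matches the $O(|V| + |E|)$ bound.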

2. Deterministic ID Generation

Theorem: Let $A$ and $B$ be collections with relationship $R: A \rightarrow B$, where each document in $A$ holds a foreign key to $B$, and let $id_B(j)$ be the ID generated for the $j$-th document in $B$ (indices start at 0). Then the foreign key $FK(i)$ of the $i$-th document in $A$ satisfies:

$$\forall i \in [0, |A| - 1]: FK(i) = id_B(i \mod |B|)$$

Proof: IDs are produced by the deterministic hash: $$id(c, i) = \text{hash}(\text{collection}_c \oplus i \oplus \sigma)_{constrained}$$

The FK resolution computes: $$parentIndex = i \mod |parent|$$ $$FK(i) = id(parent, parentIndex)$$

By substitution: $$FK(i) = \text{hash}(\text{collection}_{parent} \oplus (i \mod |parent|) \oplus \sigma)_{constrained}$$ $$= id(parent, i \mod |parent|)$$

Since $0 \le i \mod |parent| < |parent|$, this is the ID of a document that is actually generated for the parent collection.

$\square$
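
A small sketch may help show why no lookup is needed. It assumes a 32-bit FNV-1a hash over a string key as a stand-in for the hash and $\oplus$ combination in the proof; the function names and the ID format are hypothetical, and a 32-bit hash can collide, so a real implementation would need a wider or constrained ID space.

```typescript
// 32-bit FNV-1a, standing in for the unspecified hash in the proof.
function fnv1a(text: string): number {
  let h = 0x811c9dc5;
  for (let k = 0; k < text.length; k++) {
    h ^= text.charCodeAt(k);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// id(c, i): depends only on the collection name, the index, and the seed.
function idFor(collection: string, index: number, seed: number): string {
  return fnv1a(`${collection}:${index}:${seed}`).toString(16).padStart(8, "0");
}

// FK(i) = id(parent, i mod |parent|): recomputed, never looked up.
function foreignKeyFor(
  parent: string,
  parentCount: number,
  childIndex: number,
  seed: number
): string {
  if (parentCount <= 0) throw new Error("parent collection is empty");
  return idFor(parent, childIndex % parentCount, seed);
}

const demoSeed = 12345;
const userIds = new Set<string>();
for (let j = 0; j < 100; j++) userIds.add(idFor("users", j, demoSeed));

// Every one of 1000 posts points at a user ID that generation produces.
let allResolved = true;
for (let i = 0; i < 1000; i++) {
  if (!userIds.has(foreignKeyFor("users", 100, i, demoSeed))) allResolved = false;
}
console.log(allResolved); // true
```

The child side recomputes the parent's ID from the same inputs, so the reference is valid by construction.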

3. Cycle Detection and Breaking

Theorem: Any finite directed graph can be made acyclic by removing a finite set of edges that contains at least one edge from every cycle.

Algorithm: Modified DFS with cycle breaking:

DETECT-CYCLE(G):
  visited ← ∅
  recursionStack ← ∅
  
  DFS(v):
    visited.add(v)
    recursionStack.add(v)
    
    for each neighbor u of v:
      if u ∉ visited:
        if DFS(u) return true
      if u ∈ recursionStack:
        return CYCLE-DETECTED(v, u)
    
    recursionStack.delete(v)
    return false
  
  for each vertex v:
    if v ∉ visited:
      if DFS(v) return true
  
  return false

Breaking Strategy: When cycles are detected, prioritize removing weak dependencies (non-required FKs) to preserve data integrity.
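
The detection and breaking steps can be combined in a short sketch. The `Dependency` shape, the function names, and the tie-breaking rule (drop the first non-required edge on the cycle, otherwise the first edge) are assumptions for illustration and may differ from what the library does.

```typescript
interface Dependency {
  from: string;      // collection holding the FK
  to: string;        // collection being referenced
  required: boolean; // false marks a weak (non-required) FK
}

// Returns the edges of one cycle, or null when the graph is acyclic.
function findCycle(edges: Dependency[]): Dependency[] | null {
  const visited = new Set<string>();
  const onPath = new Set<string>(); // the recursion stack
  const path: Dependency[] = [];    // edges on the current DFS path

  const dfs = (v: string): Dependency[] | null => {
    visited.add(v);
    onPath.add(v);
    for (const e of edges) {
      if (e.from !== v) continue;
      if (onPath.has(e.to)) {
        // Back edge: the cycle is the path suffix starting at e.to, plus e.
        const start = path.findIndex((p) => p.from === e.to);
        return start === -1 ? [e] : path.slice(start).concat(e);
      }
      if (!visited.has(e.to)) {
        path.push(e);
        const found = dfs(e.to);
        if (found) return found;
        path.pop();
      }
    }
    onPath.delete(v);
    return null;
  };

  for (const e of edges) {
    if (!visited.has(e.from)) {
      const found = dfs(e.from);
      if (found) return found;
    }
  }
  return null;
}

// Removes one edge per detected cycle, preferring weak dependencies.
function breakCycles(edges: Dependency[]): { kept: Dependency[]; removed: Dependency[] } {
  let kept = edges.slice();
  const removed: Dependency[] = [];
  for (let cycle = findCycle(kept); cycle !== null; cycle = findCycle(kept)) {
    const victim = cycle.find((e) => !e.required) ?? cycle[0];
    if (!victim) break;
    removed.push(victim);
    kept = kept.filter((e) => e !== victim);
  }
  return { kept, removed };
}

const cyclicSchema: Dependency[] = [
  { from: "employees", to: "departments", required: true },
  { from: "departments", to: "employees", required: false },
];
const broken = breakCycles(cyclicSchema);
console.log(broken.removed.map((e) => `${e.from}->${e.to}`)); // [ 'departments->employees' ]
```

The removed edges would typically be filled in after both collections exist, which is why dropping a nullable FK is the safer choice.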

4. Field Inference Scoring

Problem: Given a field name $f$, select the best generator from a rule set $R$.

Algorithm: Score-based matching:

$$\text{score}(r, f) = r_{score} + \text{match}(r, f) - \text{noise}(r, f)$$

Where:

$\text{match}(r, f) = 5$ if $|tokens(f)| = |tokens(r)|$ (perfect match)
$\text{noise}(r, f) = 0.5 \times (|tokens(f)| - |tokens(r)|)$

Select $r^* = \text{argmax}_r \text{score}(r, f)$
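
As a worked illustration of the scoring formula, the sketch below ranks two rules against a field name. It assumes that $\text{match}$ is 0 when the token counts differ, that a rule is only a candidate when all of its tokens appear in the field name, and that field names are split on non-alphanumeric characters; the `Rule` shape and function names are hypothetical.

```typescript
interface Rule {
  id: string;
  tokens: string[];
  baseScore: number; // r_score in the formula
}

// Split a field name such as "flight_status" into lowercase tokens.
const tokenize = (field: string): string[] =>
  field.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);

// score(r, f) = r_score + match(r, f) - noise(r, f); null if r does not apply.
function scoreRule(rule: Rule, fieldTokens: string[]): number | null {
  if (!rule.tokens.every((t) => fieldTokens.includes(t))) return null;
  const match = fieldTokens.length === rule.tokens.length ? 5 : 0;
  const noise = 0.5 * (fieldTokens.length - rule.tokens.length);
  return rule.baseScore + match - noise;
}

// argmax over all applicable rules.
function pickRule(rules: Rule[], field: string): Rule | null {
  const fieldTokens = tokenize(field);
  let best: Rule | null = null;
  let bestScore = -Infinity;
  for (const r of rules) {
    const s = scoreRule(r, fieldTokens);
    if (s !== null && s > bestScore) {
      best = r;
      bestScore = s;
    }
  }
  return best;
}

const demoRules: Rule[] = [
  { id: "status", tokens: ["status"], baseScore: 10 },
  { id: "flight_status", tokens: ["flight", "status"], baseScore: 10 },
];

// "flight_status": flight_status scores 10 + 5 - 0 = 15, status scores 10 + 0 - 0.5 = 9.5.
console.log(pickRule(demoRules, "flight_status")?.id); // flight_status
// "order_status": only the generic rule applies, with 10 + 0 - 0.5 = 9.5.
console.log(pickRule(demoRules, "order_status")?.id); // status
```

The noise term penalizes rules that explain only part of a long field name, so the most specific applicable rule wins.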

5. Composite FK Resolution

For composite FKs $(f_1, ..., f_k) \rightarrow (p_1, ..., p_k)$:

Select parent row index $r = i \mod |parent|$
Retrieve cached parent row $P[r]$
For each component $f_j$: $$value[f_j] = P[r][p_j]$$

This ensures all FK components reference the same parent row.
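
The three steps can be sketched as a single function. The `Row` type, the function name, and the sample enrollment data are assumptions for illustration; the cached parent rows are passed in as a plain array.

```typescript
type Row = Record<string, string | number>;

// Copies every FK component from one cached parent row, chosen by i mod |parent|.
function resolveCompositeFk(
  parentRows: Row[],
  childIndex: number,
  childFields: string[],
  parentFields: string[]
): Row {
  if (parentRows.length === 0) throw new Error("parent cache is empty");
  if (childFields.length !== parentFields.length) {
    throw new Error("composite key field lists differ in length");
  }
  const parentRow = parentRows[childIndex % parentRows.length] as Row;
  const values: Row = {};
  childFields.forEach((field, j) => {
    const parentField = parentFields[j] as string;
    values[field] = parentRow[parentField] as string | number;
  });
  return values;
}

const enrollments: Row[] = [
  { student_id: 1, course_id: "CS101" },
  { student_id: 2, course_id: "MA201" },
];

// Child row 3 maps to parent row 3 mod 2 = 1.
const gradeKeys = resolveCompositeFk(
  enrollments,
  3,
  ["enrollment_student_id", "enrollment_course_id"],
  ["student_id", "course_id"]
);
console.log(gradeKeys); // { enrollment_student_id: 2, enrollment_course_id: 'MA201' }
```

Resolving each component independently could pair a student from one parent row with a course from another; selecting the row once avoids that.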

6. Cross-Column Constraint Satisfaction

For constraints like $A > B$ where $B$ is generated first:

$$value[A] = \max(generated, value[B] + \delta)$$

Where $\delta$ is a small deterministic offset to maintain both uniqueness and constraint satisfaction.
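
A minimal numeric sketch of this rule follows. The function names and the way $\delta$ is derived from the row index are assumptions for this example; the only requirement taken from the text is that $\delta$ is small, positive, and deterministic.

```typescript
// Hypothetical deterministic offset: depends only on the row index, always >= 1.
const deltaFor = (rowIndex: number): number => 1 + (rowIndex % 5);

// value[A] = max(generated, value[B] + delta), which guarantees A > B when delta > 0.
function enforceGreaterThan(generatedA: number, valueB: number, delta: number): number {
  return Math.max(generatedA, valueB + delta);
}

// Row 7: delta = 1 + (7 mod 5) = 3.
console.log(enforceGreaterThan(100, 120, deltaFor(7))); // 123 (generated value violated A > B)
console.log(enforceGreaterThan(200, 120, deltaFor(7))); // 200 (generated value already valid)
```

Values that already satisfy the constraint pass through unchanged, so the adjustment only touches rows that would otherwise be invalid.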

Usage

Installation

npm install @solvaratech/drawline-core

Basic Generation

import { TestDataGeneratorService } from "@solvaratech/drawline-core/server";
import { PostgresAdapter } from "@solvaratech/drawline-core/generator/adapters/PostgresAdapter";

// 1. Configure adapter
const adapter = new PostgresAdapter({
  connectionString: "postgres://user:pass@localhost:5432/mydb"
});
await adapter.connect();

// 2. Initialize service
const service = new TestDataGeneratorService(adapter);

// 3. Define schema
const collections = [
  {
    id: "users",
    name: "users",
    fields: [
      { id: "id", name: "id", type: "uuid", isPrimaryKey: true },
      { id: "email", name: "email", type: "string", required: true },
      { id: "name", name: "name", type: "string" }
    ]
  },
  {
    id: "posts",
    name: "posts",
    fields: [
      { id: "id", name: "id", type: "uuid", isPrimaryKey: true },
      { id: "user_id", name: "user_id", type: "uuid", isForeignKey: true, 
        referencedCollectionId: "users" },
      { id: "title", name: "title", type: "string" }
    ]
  }
];

const relationships = [
  {
    id: "posts->users",
    fromCollectionId: "posts",
    toCollectionId: "users",
    type: "many-to-one",
    fromField: "user_id",
    toField: "id"
  }
];

// 4. Generate configuration
const config = {
  collections: [
    { collectionName: "users", count: 100 },
    { collectionName: "posts", count: 1000 }
  ],
  seed: 12345
};

// 5. Execute generation
const result = await service.generateAndPopulate(
  collections, 
  relationships, 
  config
);

console.log(`Generated ${result.totalDocumentsGenerated} documents`);

// 6. Release the database connection
await adapter.disconnect();

Schema Diff and Migration

import { computeSchemaDiff, generateDDL } from "@solvaratech/drawline-core/schema";

// Compare current schema with database
const diff = computeSchemaDiff(databaseSnapshot, newSchema, "additive");

// Generate migration SQL
const statements = generateDDL(diff);

for (const stmt of statements) {
  console.log(stmt.sql);
}

ORM Code Generation

import { PrismaGenerator } from "@solvaratech/drawline-core/generators/orm";

const generator = new PrismaGenerator();
const output = generator.generate(collections, relationships);

console.log(output.content); // Prisma schema.prisma content

Roadmap

Short Term (v0.2.0 - v0.3.0)

Enhanced Validation: Post-generation integrity validation
Data masking: Sensitive data identification and redaction
Incremental generation: Delta seeding for existing databases
Distribution profiles: Normal, exponential, power-law distributions
Relationship visualization: Draw relationship graphs

Medium Term (v0.4.0 - v0.5.0)

Web UI Dashboard: Visual schema editor and generator interface
Data Templates: Reusable generation templates
Export formats: More export adapters (Excel, JSON Lines)
Audit logging: Generation audit trail
CI/CD integration: GitHub Actions, GitLab CI

Long Term (v1.0.0)

GraphQL API: REST/GraphQL API for remote generation
Multi-tenant: Isolated multi-tenant support
Enterprise features: SSO, RBAC, audit
Cloud dashboard: SaaS management console
Plug-in system: Third-party generator plugins

Development

Prerequisites

Node.js 18+
TypeScript 5.9+
pnpm or npm

Setup

npm install
npm run build

Testing

# Run all tests
npm test

# Watch mode
npm run test:watch

# UI
npm run test:ui

# CI (with coverage)
npm run test:ci

Type Checking

npm run type-check

CLI

npm run cli:build
npm link  # Link globally

drawline init
drawline gen --schema schema.json --config config.json

API Reference

Core Exports

// Main exports
export * from "./types/schemaDesign";      // Schema types
export * from "./types/schemaDiff";       // Diff types
export * from "./utils/schemaConverter";  // Converters
export * from "./utils/errorMessages"; // Errors
export * from "./schema";              // Schema engine
export * from "./generators/orm";      // ORM generators

// Server exports
export * from "./connections";         // Database connections
export * from "./generator";         // Generation engine

Key Interfaces

interface SchemaCollection {
  id: string;
  name: string;
  fields: SchemaField[];
  schema?: string;
  dbName?: string;
  position?: { x: number; y: number };
}

interface SchemaField {
  id: string;
  name: string;
  type: FieldType;
  required?: boolean;
  isPrimaryKey?: boolean;
  isForeignKey?: boolean;
  isSerial?: boolean;
  compositePrimaryKeyIndex?: number;
  compositeKeyGroup?: string;
  referencedCollectionId?: string;
  foreignKeyTarget?: string;
  rawType?: string;
  arrayItemType?: string;
  defaultValue?: any;
  constraints?: FieldConstraints;
}

interface SchemaRelationship {
  id: string;
  fromCollectionId: string;
  toCollectionId: string;
  type: "one-to-one" | "one-to-many" | "many-to-many";
  fromField?: string;
  toField?: string;
  fromFields?: string[];
  toFields?: string[];
}

interface TestDataConfig {
  collections: CollectionConfig[];
  seed?: number | string;
  batchSize?: number;
  onProgress?: (progress: ProgressUpdate) => Promise<void>;
}

License

MIT License. See LICENSE file for details.

Contributing

See CONTRIBUTING.md for development guidelines.

Support

GitHub Issues: https://github.com/solvaratech/drawline-core/issues
Documentation: https://drawline.app/docs
