Skip to content

feat: OracleDB support for CDC via LogMiner#4036

Open
josephwoodward wants to merge 27 commits intomainfrom
jw/oracledb_cdc
Open

feat: OracleDB support for CDC via LogMiner#4036
josephwoodward wants to merge 27 commits intomainfrom
jw/oracledb_cdc

Conversation

@josephwoodward
Copy link
Contributor

@josephwoodward josephwoodward commented Feb 27, 2026

This pull request adds support for the CDC pattern via LogMiner. LogMiner is traditionally a tool for providing a relational interface to Oracle's Redo Log Files for the purposes of tasks such as data recovery and auditing, it's also commonly also used for supporting the change data capture pattern. As such, there's some additional work for enabling CDC via LogMiner which I'll break down below.

How CDC Works with LogMiner

From a high level, LogMiner works by loading redo logs then allowing SQL based access to the change events. From an architectural perspective the following core components are:

LogMiner (logminer/logminer.go)

The continuous streaming engine. Each mining cycle:

  1. Queries OracleDB to determine which redo log files cover the SCN window since the last cycle (this is needed as logs can be cycled based on configurations such as size and duration)
  2. Deduplicates log files by preferring archived logs over online logs.
  3. Registers the log files with LogMiner and starts a bounded session (see Session Manager below)
  4. Reads the log contents, parsing DML entries through the SQL redo parser (see SQL Redo Parser below)
  5. Backs off when caught up with the current database SCN

Session Manager (logminer/session.go)

This is responsible for loading log files into LogMiner as they're cycled. It does this in a mining loop that continuously checks for the latest latest SCN.

Transaction Buffer (logminer/cache.go)

Event rows queried via LogMiner begin with a transaction start and eventually end with a commit/rollback. This means as we're reading events we have to buffer them until we reach the associative commit or rollback. Rollbacks result in the buffer being discarded and commits result in the buffer being processed by the SQL Redo Parser (discussed below) before published to Benthos pipeline. As such, it's possible that long-running transactions on a high-throughput system can lead to memory exhaustion, so it's important to have controls in place to limit this (max_transaction_events allows users to specify an upper limit).

As mentioned, the current implementation is in-memory but it'd be good to add support for other storage mechanisms (such as redis, or potentially even OracleDB - similar to what we do with the checkpoint cache)

SQL Redo Parser (logminer/sqlredo/)

Parses the reconstructed SQL redo statements produced by LogMiner back into structured DMLEvent values (table, operation type, column values).

The queries we get back from log miner literally look like this:

insert into "MYAPP"."CUSTOMERS" ("ID","NAME","EMAIL") values ('1','John Doe','john@example.com')`.

In some instances they'll have functions in them such as below:

insert into "MYAPP"."ORDERS" ("ID","ORDER_DATE") values ('100',TO_DATE('2020-01-15','YYYY-MM-DD'))

Based on the function name we have to perform conversions in Go (see logminer/sqlredo/valueconverter_test.go for tests).

Things to follow up in separate PR

  • Add more detailed documentation on how to configure Oracle to support CDC
  • Support for CLOB types

@claude
Copy link

claude bot commented Feb 27, 2026

Claude finished @josephwoodward's task in 4m 22s —— View job


  • Validate commit policy
  • Read project patterns (godev.md, tester.md)
  • Review code changes
  • Post inline comments
  • Post summary

Commits

  1. oracle_cdc: snapshotting across multiple schemas working — does not use imperative mood; reads as a status update. Should be e.g. oracle_cdc: support snapshotting across multiple schemas.
  2. oracle_cdc: test verifying various types passing — does not use imperative mood; vague. Should be e.g. oracle_cdc: add test for various column types.

Review

Well-structured CDC implementation with good architectural separation (LogMiner engine, session manager, transaction buffer, SQL redo parser). The integration tests are thorough and follow project patterns (testcontainers-go, integration.CheckSkip, assert.Eventually with mutex, license.InjectTestService). The component registration, public wrapper, bundle imports, info.csv entry, and license headers are all correct.

  1. Bug: ConvertValue receives column name instead of column type (parser.go:79) — The value converter's type-directed conversion paths are dead code because column names (e.g. "ORDER_DATE") are passed where Oracle type names (e.g. "DATE") are expected. Only the fallback pattern matching on the value string works. See inline comment for details.

@claude
Copy link

claude bot commented Feb 27, 2026

Claude Code is working…

  • Validate commit policy
  • Read project patterns (godev.md, tester.md)
  • Review code changes
  • Post inline comments
  • Post summary

View job run

@claude
Copy link

claude bot commented Feb 27, 2026

Claude finished @josephwoodward's task in 8m 22s —— View job


  • Validate commit policy
  • Read project patterns (godev.md, tester.md)
  • Review code changes
  • Post inline comments
  • Post summary

Commits

  1. 7be6aecde046b oracle_cdc: setting up local test infrastructure — "setting up" is not imperative mood. Should be e.g. oracle_cdc: set up local test infrastructure.
  2. fe1209b72c8c5 oracle_cdc: snapshotting across multiple schemas working — not imperative mood; reads as status update. Should be e.g. oracle_cdc: support snapshotting across multiple schemas.
  3. 878dff1e35684 oracle_cdc: test verifying various types passing — not imperative mood; reads as status update. Should be e.g. oracle_cdc: add test for various column types.
  4. f6277c3d67580 oracle_cdc: clean up and refactorings following code reviews — "refactorings" is a noun, not imperative. Should be e.g. oracle_cdc: clean up and refactor following code reviews.

Review

Well-structured CDC implementation with good architectural separation (LogMiner engine, session manager, transaction buffer, SQL redo parser). Component registration (init() + MustRegisterBatchInput), public wrapper (public/components/oracledb/package.go), bundle imports (all/package.go, cloud/package.go), info.csv entry, and RCL license headers are all correct. Integration tests follow project patterns (testcontainers-go, integration.CheckSkip, assert.Eventually with mutex, license.InjectTestService, context.Background() in cleanup, no require inside assert.Eventually). The previous review's ConvertValue column-type bug has been addressed — the converter now works solely on value string patterns.

  1. Context-unaware time.Sleep in mining loop (logminer.go:138) — time.Sleep blocks without checking context cancellation, delaying graceful shutdown by up to the user-configured backoff_interval. Should use select with ctx.Done() and time.After. See inline comment for details.

openai_transcription ,processor ,openai_transcription ,4.32.0 ,certified ,n ,y ,y
openai_translation ,processor ,openai_translation ,4.32.0 ,certified ,n ,y ,y
opensearch ,output ,OpenSearch ,0.0.0 ,certified ,n ,y ,y
oracledb_cdc ,input ,oracledb_cdc ,0.0.0 ,enterprise ,n ,y ,y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to set a version.

lm.log.Infof("Caught up with redo logs, backing off")
}
time.Sleep(lm.cfg.MiningBackoffInterval)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: time.Sleep blocks without checking context cancellation, so a shutdown signal arriving during the backoff interval is not noticed until the sleep completes. Since MiningBackoffInterval is user-configurable (and could be set to minutes), this delays graceful shutdown by up to that duration.

Per godev patterns — check for cancellation in long-running loops using a select on ctx.Done(). Replace the time.Sleep with a context-aware backoff:

select {
case <-ctx.Done():
    return ctx.Err()
case <-time.After(lm.cfg.MiningBackoffInterval):
}

This also lets you remove the select/default wrapper above (lines 128-131) since cancellation is now checked at the sleep point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intentional as it's a tight loop so can result in a lot of unnecessary allocations, will need to benchmark to understand impact vs the additional wait time.


== Permissions

When using the default Oracle based cache, the Connect user requires permission to create tables and stored procedures, and the ` + "rpcn" + ` schema must already exist. Refer to ` + "`" + ociFieldCheckpointCacheTableName + "`" + ` for more information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should have more actionable information here if possible

Comment on lines +246 to +248
} else if policy.IsNoop() {
policy.Count = 1
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is to say that Count must be > 0?

// Reset our stop signal
i.stopSig = shutdown.NewSignaller()

go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider extracting to function that can return error and do not do ad-hoc

				i.stopSig.TriggerHasStopped()
				return

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could actually use the wg, _ := errgroup.WithContext(softCtx) defined below

return
}

if err = i.cacheSCN(softCtx, maxSCN); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the error reporting would be easier in a method.

}

var currentSCN SCN
if err := s.db.QueryRowContext(ctx, "SELECT CURRENT_SCN FROM V$DATABASE").Scan(&currentSCN); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I saw sometimes we would use ` not " for SQL queries

}
var types []*sql.ColumnType
if types, err = batchRows.ColumnTypes(); err != nil {
batchRows.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite risky if we miss it, could we put that to defer?

}
// Add greater-than condition for current column
paramIdx++
condParts = append(condParts, fmt.Sprintf(`"%s" > :%d`, pk[i], paramIdx))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe would be easier with StringBuilder?

// UPDATE: update "schema"."table" set "C1" = 'v1', "C2" = 'v2' where "C1" = 'old1' and "C2" = 'old2';
// DELETE: delete from "schema"."table" where "C1" = 'v1' and "C2" = 'v2';
type Parser struct {
valueConverter *OracleValueConverter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This package seems to be quite pointer heavy, maybe we could be using less pointers?

return nil
}

dateStr := matches[1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Safer with named groups

Comment on lines +18 to +24
const (
OciFieldLogMiner = "logminer"

OciFieldSCNWindowSize = "scn_window_size"
OciFieldBackoffInterval = "backoff_interval"
OciFieldMiningStrategy = "strategy"
OciFieldMaxTransactionEvents = "max_transaction_events"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like service integration should be in the top level package.

}

// StartSession starts a LogMiner session with ONLINE_CATALOG strategy
func (sm *SessionManager) StartSession(ctx context.Context, startSCN, endSCN uint64, committedDataOnly bool) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add protection against calling it out of order?


switch dml.Operation {
case sqlredo.OpInsert:
m.Operation = replication.MessageOperationInsert.String()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have typed m.Operation?

if err := lm.sessionMgr.EndSession(ctx); err != nil {
lm.log.Errorf("ending LogMiner session on exit: %v", err)
}
lm.sessionActive = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move session state to SessionManager

@claude
Copy link

claude bot commented Mar 2, 2026

Claude finished @josephwoodward's task in 5m 23s —— View job


  • Validate commit policy
  • Read project patterns (godev.md, tester.md)
  • Review code changes
  • Post inline comments
  • Post summary

Commits

  1. 7be6aecde046b oracle_cdc: setting up local test infrastructure — "setting up" is not imperative mood. Should be e.g. oracle_cdc: set up local test infrastructure.
  2. fe1209b72c8c5 oracle_cdc: snapshotting across multiple schemas working — not imperative mood; reads as status update. Should be e.g. oracle_cdc: support snapshotting across multiple schemas.
  3. 878dff1e35684 oracle_cdc: test verifying various types passing — not imperative mood; reads as status update. Should be e.g. oracle_cdc: add test for various column types.
  4. f6277c3d67580 oracle_cdc: clean up and refactorings following code reviews — "refactorings" is a noun, not imperative. Should be e.g. oracle_cdc: clean up and refactor following code reviews.
  5. ae3db490a45c9 oracle_cdc: snapshot refactorings and tweaks based on feedback — "refactorings" is a noun, not imperative. Should be e.g. oracle_cdc: refactor snapshots and apply review feedback.
  6. 8a7713dee1ff5 oracle_cdc: export config params following feedbac — message is truncated ("feedbac" instead of "feedback").

Review

Well-structured CDC implementation with good architectural separation (LogMiner engine, session manager, transaction buffer, SQL redo parser). Component registration (init() + MustRegisterBatchInput), public wrapper (public/components/oracledb/package.go), bundle imports (all/package.go, cloud/package.go), info.csv entry, and RCL license headers are all correct. Integration tests follow project patterns (testcontainers-go, integration.CheckSkip, assert.Eventually with mutex, license.InjectTestService, context.Background() in cleanup, no require inside assert.Eventually). Previous review issues (ConvertValue column-type bug, time.Sleep backoff, dedicated sql.Conn for LogMiner) have been addressed or discussed.

LGTM


@claude
Copy link

claude bot commented Mar 2, 2026

Claude finished @josephwoodward's task in 4m 40s —— View job


  • Validate commit policy
  • Read project patterns (godev.md, tester.md)
  • Review code changes
  • Post inline comments
  • Post summary

Commits

  1. 2ae48ec oracle_cdc: setting up local test infrastructure — "setting up" is not imperative mood. Should be e.g. oracle_cdc: set up local test infrastructure.
  2. bdc99c3 oracle_cdc: snapshotting across multiple schemas working — not imperative mood; reads as status update. Should be e.g. oracle_cdc: support snapshotting across multiple schemas.
  3. 328ac83 oracle_cdc: test verifying various types passing — not imperative mood; reads as status update. Should be e.g. oracle_cdc: add test for various column types.
  4. 0a68f65 oracle_cdc: clean up and refactorings following code reviews — "refactorings" is a noun, not imperative. Should be e.g. oracle_cdc: clean up and refactor following code reviews.
  5. 1ffab81 oracle_cdc: snapshot refactorings and tweaks based on feedback — "refactorings" is a noun, not imperative. Should be e.g. oracle_cdc: refactor snapshots and apply review feedback.
  6. e5f5a5b oracle_cdc: export config params following feedbac — message is truncated ("feedbac" instead of "feedback").

Review

Well-structured CDC implementation with good architectural separation (LogMiner engine, session manager, transaction buffer, SQL redo parser). Component registration (init() + MustRegisterBatchInput), public wrapper (public/components/oracledb/package.go), bundle imports (all/package.go, cloud/package.go), info.csv entry, and RCL license headers are all correct. Integration tests follow project patterns (integration.CheckSkip, assert.Eventually with mutex, license.InjectTestService, context.Background() in cleanup, no require inside assert.Eventually). Previous review issues (ConvertValue column-type bug, time.Sleep backoff, dedicated sql.Conn for LogMiner) have been addressed or discussed.

LGTM


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants