53 changes: 33 additions & 20 deletions handwritten/spanner/src/index.ts
@@ -89,6 +89,7 @@ import * as v1 from './v1';
import {
ObservabilityOptions,
ensureInitialContextManagerSet,
isTracingEnabled,
} from './instrument';
import {
attributeXGoogSpannerRequestIdToActiveSpan,
@@ -496,12 +497,14 @@ class Spanner extends GrpcService {
this.directedReadOptions = directedReadOptions;
this.defaultTransactionOptions = defaultTransactionOptions;
this._observabilityOptions = options.observabilityOptions;
if (isTracingEnabled(this._observabilityOptions)) {
ensureInitialContextManagerSet();
}
this.sessionLabels = options.sessionLabels || null;
this.commonHeaders_ = getCommonHeaders(
this.projectFormattedName_,
this._observabilityOptions?.enableEndToEndTracing,
);
ensureInitialContextManagerSet();
this._nthClientId = nextSpannerClientId();
this._universeDomain = universeEndpoint;
this.projectId_ = options.projectId;
@@ -1677,7 +1680,7 @@ class Spanner extends GrpcService {
* @param {function} callback Callback function
*/
prepareGapicRequest_(config, callback) {
this.auth.getProjectId((err, projectId) => {
const proceed = (err?: Error | null, projectId?: string | null) => {
if (err) {
callback(err);
return;
@@ -1692,12 +1695,8 @@ class Spanner extends GrpcService {
}
const gaxClient = this.clients_.get(clientName)!;
let reqOpts = extend(true, {}, config.reqOpts);
reqOpts = replaceProjectIdToken(reqOpts, projectId!);
// It would have been preferable to replace the projectId already in the
Contributor

Should we really remove this comment? Isn't it still valid?

// constructor of Spanner, but that is not possible as auth.getProjectId
// is an async method. This is therefore the first place where we have
// access to the value that should be used instead of the placeholder.
if (!this.projectIdReplaced_) {
reqOpts = replaceProjectIdToken(reqOpts, projectId!);
this.projectId = replaceProjectIdToken(this.projectId, projectId!);
this.projectFormattedName_ = replaceProjectIdToken(
this.projectFormattedName_,
@@ -1715,20 +1714,22 @@ class Spanner extends GrpcService {
);
});
});
config.headers[CLOUD_RESOURCE_HEADER] = replaceProjectIdToken(
config.headers[CLOUD_RESOURCE_HEADER],
projectId!,
);
this.projectIdReplaced_ = true;
}
Comment on lines +1717 to +1722
Contributor

high

The CLOUD_RESOURCE_HEADER replacement has been moved inside the if (!this.projectIdReplaced_) block, meaning it will be skipped for all requests after the first one. This header must be updated for every request to ensure the correct project ID is sent to the backend.

        this.projectIdReplaced_ = true;
      }
      config.headers[CLOUD_RESOURCE_HEADER] = replaceProjectIdToken(
        config.headers[CLOUD_RESOURCE_HEADER],
        projectId!,
      );

Contributor Author (@surbhigarg92, May 21, 2026)

During the first request, the SDK permanently replaces the {{projectId}} token in place in all cached instance and database objects (instance.formattedName_, database.formattedName_). Headers for subsequent requests are built from these already-resolved properties, so config.headers[CLOUD_RESOURCE_HEADER] already contains the correct project ID. Repeating the replacement on every subsequent request is redundant and adds unnecessary regex overhead on Node's single-threaded event loop.

Contributor

I do think this is an actual problem, specifically for non-cached resource names. See this test case:

it('real application flow: should replace project ID tokens on Backup objects after initial Spanner request', done => {
  const {Instance} = require('../src/instance');
  const {Backup} = require('../src/backup');
  // Stub setup...
  const appSpanner = new Spanner({projectId: '{{projectId}}'});
  asAny(appSpanner).auth.getProjectId = callback => {
    callback(null, PROJECT_ID);
  };
  const realInstance = new Instance(appSpanner, 'my-instance');
  appSpanner.instances_.set('my-instance', realInstance);
  // Backup created before project ID is replaced
  const realBackup = new Backup(realInstance, 'my-backup');
  // 1. Initial request triggers prepareGapicRequest_
  FAKE_GAPIC_CLIENT.getInstanceConfig = (reqOpts, gaxOpts, callback) => {
    callback(null, {});
  };
  appSpanner.getInstanceConfig('nam1', err => {
    if (err) return done(err);
    // At this point, appSpanner.projectIdReplaced_ is true.
    // 2. Application calls backup.getMetadata()
    FAKE_GAPIC_CLIENT.getBackup = reqOpts => {
      try {
        assert.strictEqual(
          reqOpts.name,
          `projects/${PROJECT_ID}/instances/my-instance/backups/my-backup`,
        );
        done();
      } catch (e) {
        done(e);
      }
      return Promise.resolve([{}]);
    };
    realBackup.getMetadata();
  });
});
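
The failure mode exercised by the test above can be reduced to a few lines. This is a minimal sketch, not the library's code: `replaceToken`, `FakeClient`, and `prepare` are hypothetical stand-ins. The only things taken from the thread are the `{{projectId}}` placeholder and the assumption that replacement runs only on the first request.

```typescript
// Hypothetical stand-ins for illustration; not the Spanner client's API.
const TOKEN = '{{projectId}}';

function replaceToken(value: string, projectId: string): string {
  return value.split(TOKEN).join(projectId);
}

class FakeClient {
  projectId: string;
  replaced = false;
  // Names the client knows about and rewrites in place on the first request.
  cachedNames: string[] = [];

  constructor(projectId: string) {
    this.projectId = projectId;
  }

  // Mirrors the guarded variant under discussion: replacement happens once.
  prepare(resourceHeader: string): string {
    if (!this.replaced) {
      this.cachedNames = this.cachedNames.map(n =>
        replaceToken(n, this.projectId),
      );
      this.replaced = true;
      return replaceToken(resourceHeader, this.projectId);
    }
    // Later requests trust the header as-is.
    return resourceHeader;
  }
}

const client = new FakeClient('my-project');
const cachedName = `projects/${TOKEN}/instances/my-instance`;
client.cachedNames.push(cachedName);

// A name built before the first request and never stored in cachedNames,
// comparable to the Backup object in the test case.
const uncachedName = `projects/${TOKEN}/instances/my-instance/backups/my-backup`;

const firstHeader = client.prepare(cachedName); // token resolved
const laterHeader = client.prepare(uncachedName); // token left in place
```

In this model the second call returns the header with the placeholder still in it, because nothing rewrote the uncached name and the guard skips the replacement.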

Contributor Author

I missed this case. It seems to be a problem only for classes like Backup.

Any DataClient operation goes through an instance object that is created from the Spanner instance, and for those objects we already replace the project ID:

this.instances_.forEach(instance => {
          instance.formattedName_ = replaceProjectIdToken(
            instance.formattedName_,
            projectId!,
          );
          instance.databases_.forEach(database => {
            database.formattedName_ = replaceProjectIdToken(
              database.formattedName_,
              projectId!,
            );
          });
        });

I think the same problem will occur for instanceConfigs objects as well. Using the Spanner client for admin operations is deprecated, but this may still affect existing customers. Let me think of an alternative solution.

config.headers[CLOUD_RESOURCE_HEADER] = replaceProjectIdToken(
config.headers[CLOUD_RESOURCE_HEADER],
projectId!,
);
// Do context propagation
propagation.inject(context.active(), config.headers, {
set: (carrier, key, value) => {
carrier[key] = value; // Set the span context (trace and span ID)
},
});
// Attach the x-goog-spanner-request-id to the currently active span.
attributeXGoogSpannerRequestIdToActiveSpan(config);
if (isTracingEnabled(this._observabilityOptions)) {
// Do context propagation
propagation.inject(context.active(), config.headers, {
set: (carrier, key, value) => {
carrier[key] = value; // Set the span context (trace and span ID)
},
});
// Attach the x-goog-spanner-request-id to the currently active span.
attributeXGoogSpannerRequestIdToActiveSpan(config);
}
const interceptors: any[] = [];
if (this._metricsEnabled) {
interceptors.push(MetricInterceptor);
@@ -1796,7 +1797,19 @@ class Spanner extends GrpcService {
};

callback(null, wrappedRequestFn);
});
};

if (
this.projectIdReplaced_ &&
this.projectId &&
this.projectId !== '{{projectId}}'
) {
process.nextTick(() => {
proceed(null, this.projectId);
});
} else {
this.auth.getProjectId(proceed);
}
}

/**
62 changes: 59 additions & 3 deletions handwritten/spanner/src/instrument.ts
@@ -119,6 +119,56 @@ function ensureInitialContextManagerSet() {

export {ensureInitialContextManagerSet};

let globalTracingEnabled: boolean | undefined = undefined;

/**
* isGlobalTracingEnabled returns true if tracing is enabled globally,
* respecting cached status and active recording spans.
*
* @returns {boolean} True if global tracing is enabled.
*/
Comment thread
surbhigarg92 marked this conversation as resolved.
function isGlobalTracingEnabled(): boolean {
if (globalTracingEnabled !== undefined) {
return globalTracingEnabled;
Contributor

This caching means that the result that is returned the first time is valid for the lifetime of the application. This means that:

  1. If this function happens to be called before the application has configured OpenTelemetry, then the value that it calculates and caches can be wrong.
  2. It does not take into account the (maybe theoretical) possibility that an application could change its configuration later.
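
The ordering problem in point 1 can be shown with a small model. This is a sketch under stated assumptions: `globalProvider`, `isEnabledCached`, and `isEnabledUncached` are hypothetical names, and the global registry is reduced to a single variable rather than the real OpenTelemetry API.

```typescript
// Hypothetical, self-contained model of a memoized "is tracing on?" check.
type Provider = {name: string} | undefined;

let globalProvider: Provider = undefined; // stand-in for a global registry
let cached: boolean | undefined = undefined;

function isEnabledCached(): boolean {
  if (cached === undefined) {
    // The first answer is stored and reused for the rest of the process.
    cached = globalProvider !== undefined;
  }
  return cached;
}

function isEnabledUncached(): boolean {
  return globalProvider !== undefined;
}

// The client is constructed, and the check runs, before tracing is configured.
const atConstruction = isEnabledCached();

// The application registers its provider afterwards.
globalProvider = {name: 'app-provider'};

const cachedAfter = isEnabledCached(); // still the value computed earlier
const uncachedAfter = isEnabledUncached(); // reflects the current registration
```

Under these assumptions the cached check keeps reporting "disabled" after registration, which is the behaviour the comment is warning about.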

Contributor Author (@surbhigarg92, May 21, 2026)

> If this function happens to be called before the application has configured OpenTelemetry, then the value that it calculates and caches can be wrong.

The expectation is that global OpenTelemetry registration happens before the Spanner instance is created. Java has the same expectation: customers register before creating the Spanner instance, and a registration done later is not picked up for adding traces.

> It does not take into account the (maybe theoretical) possibility that an application could change its configuration later.

If an OpenTelemetry provider is passed when creating the Spanner object, it is honored, but the global configuration is not expected to change later.

Contributor Author

If we try to accommodate customers registering OpenTelemetry later, we cannot avoid registering AsyncHooksContextManager(), and registering it adds significant load to the application.

}

const globalProvider = trace.getTracerProvider();
if (globalProvider) {
const probeSpan = globalProvider
.getTracer(TRACER_NAME, TRACER_VERSION)
.startSpan('probe');
const isRecording = probeSpan.isRecording();
Contributor

I don't think this is a safe assumption. It assumes that probeSpan.isRecording() will return true if tracing is enabled. But if tracing is enabled with a 5% sample rate, then there is no guarantee that this will return true. Or am I misunderstanding what is going on here?
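
A simplified model makes the sampling concern concrete. The sketch below is an assumption-laden stand-in, not the OpenTelemetry API: `RatioSampledTracer` and `ProbeSpan` are hypothetical, and sampling is modelled as one random draw per span, whereas a real ratio sampler typically decides from the trace ID.

```typescript
// Hypothetical model of ratio-based sampling; not the OpenTelemetry API.
interface ProbeSpan {
  isRecording(): boolean;
}

class RatioSampledTracer {
  ratio: number;
  random: () => number;

  constructor(ratio: number, random: () => number) {
    this.ratio = ratio;
    this.random = random;
  }

  // A span records only when the sampling draw falls below the ratio.
  startSpan(): ProbeSpan {
    const sampled = this.random() < this.ratio;
    return {isRecording: () => sampled};
  }
}

// Deterministic draws so the outcome is reproducible.
let draw = 0.5;
const tracer = new RatioSampledTracer(0.05, () => draw);

// Draw 0.5 is above the 5% ratio: the probe span is not sampled.
const probeSaysEnabled = tracer.startSpan().isRecording();

// Draw 0.01 is below the ratio: a later span on the same tracer is sampled.
draw = 0.01;
const laterSpanRecords = tracer.startSpan().isRecording();
```

In this model tracing is configured the whole time, yet a single probe reports "not recording" for roughly 95% of draws, so caching that one answer would disable tracing for spans that would have been sampled.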

Contributor Author

Thanks for highlighting this; I completely missed it. Another option I tried is shown below, but it also didn't look like a strong approach. I am checking this with the OpenTelemetry team. Unlike Java, Node does not expose a way to check whether a tracer is enabled: https://github.com/open-telemetry/opentelemetry-java/blob/31b3cd5f561a7cf6278a255fad33d40887c1a48b/api/all/src/main/java/io/opentelemetry/api/trace/Tracer.java#L72

const globalProvider = trace.getTracerProvider();
  if (globalProvider) {
    let delegate = globalProvider;
    if (typeof (globalProvider as any).getDelegate === 'function') {
      delegate = (globalProvider as any).getDelegate();
    }
    if (delegate) {
      const name = delegate.constructor.name;
      // Exclude the dummy NoopTracerProvider and uninitialized ProxyTracerProvider
      if (name !== 'NoopTracerProvider' && name !== 'ProxyTracerProvider') {
        globalTracingEnabled = true;
        return true;
      }
    }
  }

probeSpan.end();

if (isRecording) {
globalTracingEnabled = true;
return true;
}
}
globalTracingEnabled = false;
return false;
}

/**
* isTracingEnabled returns true if tracing is enabled for the given options
* or globally.
*
* @param {ObservabilityOptions} [opts] The observability options.
* @returns {boolean} True if tracing is enabled.
*/
export function isTracingEnabled(opts?: ObservabilityOptions): boolean {
if (opts?.tracerProvider) {
return true;
}

return isGlobalTracingEnabled();
}

/** Only exported for resetting state in unit tests. */
export function _resetTracingEnabledForTest(): void {
globalTracingEnabled = undefined;
}

/**
* startTrace begins an active span in the current active context
* and passes it back to the set callback function. Each span will
@@ -132,6 +182,10 @@ export function startTrace<T>(
config: traceConfig | undefined,
cb: (span: Span) => T,
): T {
if (!isTracingEnabled(config?.opts)) {
return cb(new noopSpan());
}

if (!config) {
config = {} as traceConfig;
}
@@ -245,9 +299,11 @@ export function setSpanErrorAndException(
* @returns {Span} the non-null span.
*/
export function getActiveOrNoopSpan(): Span {
const span = trace.getActiveSpan();
if (span) {
return span;
if (isTracingEnabled()) {
Contributor

Note that this could just as well call isGlobalTracingEnabled directly, as it does not supply any options. So the more specific check whether a tracer has been set on any options is always skipped. Is that intentional?

Also, calling trace.getActiveSpan() should be an extremely cheap method to call when tracing is disabled, so I am not sure this entire method really optimizes anything.
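
The first point can be illustrated with a reduced version of the two checks. The function names mirror the diff, but the bodies are simplified stand-ins and `globalEnabled` is a hypothetical placeholder for the global-provider probe.

```typescript
// Hypothetical reduction of the two checks; bodies are simplified stand-ins.
interface Options {
  tracerProvider?: object;
}

// Stand-in for the result of probing the global tracer provider.
const globalEnabled: boolean = false;

function isGlobalTracingEnabled(): boolean {
  return globalEnabled;
}

function isTracingEnabled(opts?: Options): boolean {
  if (opts && opts.tracerProvider) {
    return true;
  }
  return isGlobalTracingEnabled();
}

// A client configured with its own provider while nothing is registered globally.
const clientOpts: Options = {tracerProvider: {}};

const withOpts = isTracingEnabled(clientOpts); // per-client provider is seen
const withoutOpts = isTracingEnabled(); // falls through to the global check
```

If the real functions behave like this sketch, a call without options cannot observe a provider supplied only through the client's options, so a client configured that way would get a no-op span from this method.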

const span = trace.getActiveSpan();
if (span) {
return span;
}
}
return new noopSpan();
}
57 changes: 14 additions & 43 deletions handwritten/spanner/src/request_id_header.ts
@@ -22,6 +22,8 @@ const randIdForProcess = randomBytes(8)
.readUint32LE(0)
.toString(16)
.padStart(8, '0');
const REQUEST_HEADER_VERSION = 1;
const PROCESS_PREFIX = `${REQUEST_HEADER_VERSION}.${randIdForProcess}.`;
const X_GOOG_SPANNER_REQUEST_ID_HEADER = 'x-goog-spanner-request-id';

class AtomicCounter {
@@ -57,15 +59,13 @@ class AtomicCounter {
}
}

const REQUEST_HEADER_VERSION = 1;

function craftRequestId(
nthClientId: number,
channelId: number,
nthRequest: number,
attempt: number,
) {
return `${REQUEST_HEADER_VERSION}.${randIdForProcess}.${nthClientId}.${channelId}.${nthRequest}.${attempt}`;
return `${PROCESS_PREFIX}${nthClientId}.${channelId}.${nthRequest}.${attempt}`;
}

const nthClientId = new AtomicCounter();
@@ -118,15 +118,6 @@ function injectRequestIDIntoError(config: any, err: Error) {
}
}

interface withNextNthRequest {
_nextNthRequest: Function;
}

interface withMetadataWithRequestId {
_nthClientId: number;
_channelId: number;
}

function injectRequestIDIntoHeaders(
headers: {[k: string]: string},
session: any,
@@ -136,52 +127,31 @@ function injectRequestIDIntoHeaders(
if (!session) {
return headers;
}

const database = session.parent;
if (!nthRequest) {
const database = session.parent as withNextNthRequest;
if (!(database && typeof database._nextNthRequest === 'function')) {
if (!database || typeof database._nextNthRequest !== 'function') {
return headers;
}
nthRequest = database._nextNthRequest();
}
const clientId = database ? database._nthClientId || 1 : 1;
const channelId = database ? database._channelId || 1 : 1;

attempt = attempt || 1;
return _metadataWithRequestId(session, nthRequest!, attempt, headers);
}

function _metadataWithRequestId(
session: any,
nthRequest: number,
attempt: number,
priorMetadata?: {[k: string]: string},
): {[k: string]: string} {
if (!priorMetadata) {
priorMetadata = {};
}
const withReqId = {
...priorMetadata,
};
const database = session.parent as withMetadataWithRequestId;
let clientId = 1;
let channelId = 1;
if (database) {
clientId = database._nthClientId || 1;
channelId = database._channelId || 1;
}
const withReqId = {...headers};
withReqId[X_GOOG_SPANNER_REQUEST_ID_HEADER] = craftRequestId(
clientId,
channelId,
nthRequest,
attempt,
nthRequest || 1,
attempt || 1,
);
return withReqId;
}

function nextNthRequest(database): number {
if (!(database && typeof database._nextNthRequest === 'function')) {
return 1;
if (database && typeof database._nextNthRequest === 'function') {
return database._nextNthRequest();
}
return database._nextNthRequest();
return 1;
}

export interface RequestIDError extends grpc.ServiceError {
@@ -214,6 +184,7 @@ export {
X_GOOG_SPANNER_REQUEST_ID_SPAN_ATTR,
attributeXGoogSpannerRequestIdToActiveSpan,
craftRequestId,
PROCESS_PREFIX,
Contributor

Is this really used elsewhere?

injectRequestIDIntoError,
injectRequestIDIntoHeaders,
nextNthRequest,
8 changes: 7 additions & 1 deletion handwritten/spanner/test/spanner.ts
@@ -91,7 +91,11 @@ const {
InMemorySpanExporter,
} = require('@opentelemetry/sdk-trace-node');
const {SimpleSpanProcessor} = require('@opentelemetry/sdk-trace-base');
const {startTrace, ObservabilityOptions} = require('../src/instrument');
Contributor

(not related to this line, but it feels like the most logical place to add this comment)

We are not adding any new tests for this. Should we add tests that verify that:

  1. It does not matter when the OpenTelemetry configuration is done (before or after creating a Spanner instance).
  2. It does not matter what the trace sampling is. When tracing is enabled, even with a 1% sampling rate, then the request ID should be added to the traces.

Contributor Author

> It does not matter when the OpenTelemetry configuration is done (before or after creating a Spanner instance).

As mentioned in my reply to the previous comment, we need to discuss whether we want to allow this use case.

> It does not matter what the trace sampling is. When tracing is enabled, even with a 1% sampling rate, then the request ID should be added to the traces.

Sure, I will add it.

const {
startTrace,
ObservabilityOptions,
_resetTracingEnabledForTest,
} = require('../src/instrument');

function numberToEnglishWord(num: number): string {
switch (num) {
@@ -7112,6 +7116,7 @@ describe('Spanner with mock server', () => {
spanProcessors: [new SimpleSpanProcessor(exporter)],
});
provider.register();
_resetTracingEnabledForTest();

after(async () => {
await provider.shutdown();
@@ -7205,6 +7210,7 @@ describe('Spanner with mock server', () => {
provider.register();

beforeEach(async () => {
_resetTracingEnabledForTest();
await exporter.forceFlush();
await exporter.reset();
});