
[GEN-2473] Refactor: migrate to synapseclient.models.Table API #214

Open
xindiguo wants to merge 1 commit into develop from gen-2473-dec-update

Conversation

@xindiguo
Contributor

Problem:

The script used the deprecated legacy Synapse Table API:

  • syn.tableQuery("SELECT ...").asDataFrame() for querying tables
  • syn.store(Table(schema, dataframe, etag=...)) for updating rows
  • Manual etag management was required to handle concurrent updates
  • The legacy Table import from synapseclient conflicted with the new models.Table

Solution:

  • Replace syn.tableQuery(...).asDataFrame() with the Table(id=...).query() method
  • Replace syn.store(Table(...)) with:
    • Table.upsert_rows(values=df, primary_keys=["variable"]) for updates
    • Table.store_rows(values=df) for new-row insertion
  • Move 'from synapseclient.models import Table' after the utilities import
    so it properly overrides the legacy Table reference
  • Remove manual etag handling; the new API manages this internally

This ensures compatibility with the latest synapseclient library and follows the recommended patterns for Synapse Table operations.
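The migrated flow described above can be sketched as follows. This is a hedged sketch, not the PR's actual code: it assumes the synapseclient.models.Table methods named in the bullets (query, upsert_rows) with the argument shapes shown in the diff, and `catalog_id` / `vars_to_update_df` are hypothetical placeholders.

```python
def update_catalog(catalog_id, vars_to_update_df):
    """Sketch of the migrated read/write flow described in this PR.

    Assumes the synapseclient.models.Table API (query / upsert_rows);
    catalog_id and vars_to_update_df are hypothetical placeholders.
    """
    # Imported inside the function (after any legacy utilities import)
    # so the new models.Table overrides the legacy Table name.
    from synapseclient.models import Table

    table = Table(id=catalog_id)

    # Read: replaces syn.tableQuery("SELECT ...").asDataFrame()
    existing_df = table.query(query=f"SELECT * FROM {catalog_id}")

    # Write: replaces syn.store(Table(schema, df, etag=...)).
    # No manual etag bookkeeping; the new API handles it internally.
    table.upsert_rows(values=vars_to_update_df, primary_keys=["variable"])
    return existing_df
```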

@xindiguo xindiguo requested a review from a team as a code owner January 27, 2026 18:47
@xindiguo xindiguo requested review from a team and Copilot and removed request for a team January 27, 2026 18:48

Copilot AI left a comment


Pull request overview

Refactors update_data_element_catalog.py to use the newer synapseclient.models.Table API for querying and mutating Synapse Tables, replacing deprecated legacy table query/store patterns and removing manual etag handling.

Changes:

  • Switch table reads from syn.tableQuery(...).asDataFrame() to Table(id=...).query(...).
  • Switch table writes from syn.store(Table(..., etag=...)) to Table(...).upsert_rows(...) and Table(...).store_rows(...).
  • Reorder imports so synapseclient.models.Table overrides the legacy Table imported via utilities.
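The two write paths in the list above differ in semantics. A minimal sketch of the append-only path, assuming the `store_rows(values=...)` signature quoted in the overview; the table id and row contents are hypothetical:

```python
import pandas as pd

def append_new_rows(catalog_id, new_rows: pd.DataFrame):
    """Sketch of the append-only write path (store_rows), as opposed
    to upsert_rows, which matches existing rows on primary_keys.
    catalog_id is a hypothetical placeholder."""
    from synapseclient.models import Table
    Table(id=catalog_id).store_rows(values=new_rows)

# Hypothetical rows to append; column names follow the catalog diff.
new_rows = pd.DataFrame({"variable": ["newVar"], "colLabels": ["New variable"]})
```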


Comment on lines 278 to +282
```diff
      ["synColSize", "numCols", "colLabels"]
  ]
- vars_to_update_df = syn.store(
-     Table(table_schema, vars_to_update_df, etag=results.etag)
+ Table(id=catalog_id).upsert_rows(
+     values=vars_to_update_df,
+     primary_keys=["variable"]
```

Copilot AI Jan 27, 2026


upsert_rows(..., primary_keys=["variable"]) requires the primary key column(s) to be present in values. Here vars_to_update_df is sliced down to only synColSize, numCols, and colLabels, so the variable column is missing and the upsert can’t match rows to update. Keep/include variable in the DataFrame passed to upsert_rows (and only drop it after the upsert, if needed).
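The slicing bug can be reproduced with plain pandas, independent of Synapse. The DataFrame below is hypothetical sample data, and the commented-out upsert_rows call marks where the fixed frame would be passed:

```python
import pandas as pd

# Hypothetical sample standing in for the catalog rows.
vars_to_update_df = pd.DataFrame({
    "variable": ["age", "sex"],
    "synColSize": [10, 2],
    "numCols": [1, 1],
    "colLabels": ["Age", "Sex"],
})

# BUG: this slice drops "variable", so upsert_rows(...,
# primary_keys=["variable"]) has no key column to match rows on.
sliced = vars_to_update_df[["synColSize", "numCols", "colLabels"]]
assert "variable" not in sliced.columns

# FIX: keep the primary-key column in the frame passed to upsert_rows.
fixed = vars_to_update_df[["variable", "synColSize", "numCols", "colLabels"]]
assert "variable" in fixed.columns
# Table(id=catalog_id).upsert_rows(values=fixed, primary_keys=["variable"])
```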

```diff
  # add the new cohort_dd column to the table schema
- if "%s_dd" % cohort not in results.asDataFrame().columns:
+ # Check if column exists by querying
+ existing_df = Table(id=catalog_id).query(
```
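The membership test itself is plain pandas and can be exercised without Synapse. Here `existing_df` is a hypothetical stand-in for the `Table(id=catalog_id).query(...)` result, and the cohort names are invented:

```python
import pandas as pd

# Hypothetical stand-in for the query result frame.
existing_df = pd.DataFrame(columns=["variable", "cohortA_dd"])

cohort = "cohortB"
new_col = "%s_dd" % cohort  # same formatting as the script

# Same membership check as before, now against the queried frame
# instead of the legacy results.asDataFrame().
needs_new_column = new_col not in existing_df.columns
assert needs_new_column
assert "cohortA_dd" in existing_df.columns
```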
Contributor


Do you know how big the catalog_id tables are typically? I worry that the table may not be available to query so soon after it's been updated (above).
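One hedged way to address this concern is a generic retry wrapper around the query call. This is not part of the PR; the callable, attempt count, and backoff numbers below are illustrative:

```python
import time

def query_with_retry(run_query, attempts=5, base_delay=2.0):
    """Retry a zero-argument query callable, e.g.
    lambda: Table(id=catalog_id).query(...), in case the table is
    still processing a recent update. Illustrative sketch only."""
    last_err = None
    for attempt in range(attempts):
        try:
            return run_query()
        except Exception as err:  # narrow to the client's error type in practice
            last_err = err
            time.sleep(base_delay * (attempt + 1))  # linear backoff
    raise last_err
```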

Contributor


This script lives in the table_updates module alongside other scripts and a shared Dockerfile/requirements.txt, and I don't see that those other scripts have been updated to use the new synapseclient models yet.

I would note in the README the specific synapseclient version required to run this script, rather than relying on the usual Dockerfile/requirements.txt setup, as that won't work here.
