Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces significant flexibility for Data Connect users by allowing them to define and utilize a custom PostgreSQL schema name. This change enables better organization of database objects and improved integration with existing PostgreSQL environments where a non-public schema is preferred. The modifications span configuration, core logic for identifying database components, and command-line tools, ensuring that the specified schema is consistently applied across all relevant Data Connect operations. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
The pull request introduces the ability to specify a custom PostgreSQL schema name in the Data Connect configuration, enhancing its flexibility. However, a critical security vulnerability has been identified: the implementation is susceptible to SQL injection. The schemaName is directly interpolated into several SQL queries without proper escaping or sanitization, posing a risk where arbitrary SQL could be executed with the privileges of the database user if the dataconnect.yaml configuration is influenced by external contributors. To address this, it is essential to properly escape all identifiers using double quotes (and doubling any existing double quotes) or to use parameterized queries where supported.
| const conn: pg.PoolClient = await pool.connect(); | ||
|
|
||
| // Set search_path to the configured PostgreSQL schema so unqualified table names resolve correctly. | ||
| await conn.query(`SET search_path TO "${schemaName}"`); |
There was a problem hiding this comment.
The schemaName variable is interpolated directly into a SQL query without proper escaping. Since schemaName is derived from the user-controlled dataconnect.yaml file, this allows for SQL injection. An attacker could provide a malicious schema name like public"; DROP TABLE users; -- to execute arbitrary SQL commands when a developer runs the SQL shell.
| await conn.query(`SET search_path TO "${schemaName}"`); | |
| await conn.query('SET search_path TO "' + schemaName.replace(/"/g, '""') + '"'); |
| databaseId, | ||
| [`SET ROLE "${firebaseowner(databaseId)}"`, ...commandsToExecuteByOwner.map((d) => d.sql)], | ||
| [ | ||
| `SET ROLE "${firebaseowner(databaseId, schemaName)}"`, |
There was a problem hiding this comment.
The schemaName variable is used to construct a SET ROLE SQL command. Neither the schemaName nor the resulting role name are escaped, leading to a SQL injection vulnerability. A malicious schema name could be used to break out of the double quotes and execute arbitrary SQL during the migration process.
| `SET ROLE "${firebaseowner(databaseId, schemaName)}"`, | |
| 'SET ROLE "' + firebaseowner(databaseId, schemaName).replace(/"/g, '""') + '"', |
| await cloudSqlAdminClient.createUser(projectId, instanceId, mode, user); | ||
|
|
||
| const fdcSqlRole = fdcSqlRoleMap[role as keyof typeof fdcSqlRoleMap](databaseId); | ||
| const fdcSqlRole = fdcSqlRoleMap[role as keyof typeof fdcSqlRoleMap](databaseId, schema); |
There was a problem hiding this comment.
fredzqm
left a comment
There was a problem hiding this comment.
Looks great! Great work!!
Can you add a CHANGELOG to announce this feature?
Have you verified that it works with a prod instance?
b47578b to
803dda2
Compare
- Added support for specifying a custom PostgreSQL schema name in the configuration. - Updated the `getIdentifiers` function to return the schema name, defaulting to 'public' if not specified. - Modified relevant functions to utilize the schema name for database operations. - Added tests to validate schema name handling in various scenarios.
- Added support for specifying a custom PostgreSQL schema name in the configuration. - Updated the `getIdentifiers` function to return the schema name, defaulting to 'public' if not specified. - Modified relevant functions to utilize the schema name for database operations. - Added tests to validate schema name handling in various scenarios.
getIdentifiersfunction to return the schema name, defaulting to 'public' if not specified.Resolves Issue Data Connect ignores datasource.postgresql.schema and always qualifies objects with public #9271
CLI TEST:
