Overview
We currently support two approaches for manually forcing shard routing:
- Comment:
/* pgdog_shard: 1234 */ SELECT * from x
- Session SET:
SET pgdog.shard = 5; + SELECT * from x (two round trips).
My understanding is that Session SET is effectively a hack for ORM users who don't have an easy way to inject comments. I say this because any application developer using raw SQL would obviously opt for the comment approach and save a round trip and connection management mental overhead.
I propose we add a new, better hack to circumvent the commenting limitations of ORMs.
The CTE hack is still a hack, but it's a better one than SETs IMO. To enable the CTE strategy, we plan to reserve the name pgdog_overrides (or an alternate) for this purpose.
It would look like this:
WITH pgdog_overrides AS (
SELECT $1::INTEGER AS shard
)
SELECT *
FROM comments
WHERE user_id = $2
LIMIT 10;
Clients bind one parameter ($1), the integer shard index.
Scenario
Imagine an internal feedback app where users comment on their company’s wall.
- The database shards by
organization_id
- The
comments table only records user_id and body (ie, no shard key).
- Every company’s comments live on and only on its organization’s shard.
- The
User struct contains organization_id, so the application can derive the correct shard for users.
Implementation Approaches
Session SET
Pros: Simple, uses standard SQL.
Cons: Requires two round trips per query (SET, then query). Not atomic with the query.
Single-statement CTE
Pros: One round trip. Shard context travels with the statement. Atomic.
Cons: Requires proxy support for parsing the reserved CTE.
Rust Example (sqlx)
async fn create_comment(
pool: &sqlx::PgPool,
user: &User,
text: &str,
) -> Result<Comment, sqlx::Error> {
sqlx::query_as!(Comment,
r#"
WITH pgdog_overrides AS (
SELECT $1::integer AS shard
)
INSERT INTO comments (user_id, body)
VALUES ($2, $3)
RETURNING *
"#,
user.organization_id, // binds to $1
user.id, // binds to $2
text // binds to $3
)
.fetch_one(pool)
.await
}
ActiveRecord Example
The whole purpose of the SET hack is to support active record. This is how you would do it with the override CTE.
Define a one-time helper in ApplicationRecord:
# app/models/application_record.rb
class ApplicationRecord < ActiveRecord::Base
self.abstract_class = true
scope :with_shard, ->(shard) {
cte_sql = "SELECT #{connection.quote(shard.to_i)}::integer AS shard"
with(pgdog_overrides: Arel.sql(cte_sql))
}
end
Usage:
comments = Comment
.with_shard(user.organization_id) # 🎉 TADA!
.where(user_id: user.id)
.limit(10)
Emits:
WITH pgdog_overrides AS (
SELECT 12::integer AS shard
)
SELECT "comments".*
FROM "comments"
WHERE "comments"."user_id" = 12345
LIMIT 10;
Implementation Steps
- Reserve CTE name
pgdog_overrides in the proxy parser.
- In
QueryParser, detect the pgdog_overrides CTE at the start of the query.
- Extract the integer from the
shard column. Mark routed = true.
- Route the query to the specified shard index.
- Strip the CTE before forwarding the statement to the backend. (100% optional, it's more to have undecorated usage logs in our users' database dashboards)
- Deprecate SET?
Overview
We currently support two approaches for manually forcing shard routing:
/* pgdog_shard: 1234 */ SELECT * from xSET pgdog.shard = 5;+SELECT * from x(two round trips).My understanding is that Session SET is effectively a hack for ORM users who don't have an easy way to inject comments. I say this because any application developer using raw SQL would obviously opt for the comment approach and save a round trip and connection management mental overhead.
I propose we add a new, better hack to circumvent the commenting limitations of ORMs.
The CTE hack is still a hack, but it's a better one than SETs IMO. To enable the CTE strategy, we plan to reserve the name
pgdog_overrides(or an alternate) for this purpose.It would look like this:
Clients bind one parameter (
$1), the integer shard index.Scenario
Imagine an internal feedback app where users comment on their company’s wall.
organization_idcommentstable only recordsuser_idandbody(ie, no shard key).Userstruct containsorganization_id, so the application can derive the correct shard for users.Implementation Approaches
Session SET
Pros: Simple, uses standard SQL.
Cons: Requires two round trips per query (SET, then query). Not atomic with the query.
Single-statement CTE
Pros: One round trip. Shard context travels with the statement. Atomic.
Cons: Requires proxy support for parsing the reserved CTE.
Rust Example (sqlx)
ActiveRecord Example
The whole purpose of the SET hack is to support active record. This is how you would do it with the override CTE.
Define a one-time helper in
ApplicationRecord:Usage:
Emits:
Implementation Steps
pgdog_overridesin the proxy parser.QueryParser, detect thepgdog_overridesCTE at the start of the query.shardcolumn. Markrouted = true.