[SS-69] user-facing docs for Iceberg sink, now with GCP support #36773

Closed
ublubu wants to merge 27 commits into MaterializeInc:kynan/iceberg-gcp from ublubu:kynan/iceberg-gcp-doc

ublubu commented May 28, 2026
Contributor

Adding docs for:

  • GCP connection
  • GCP Lakehouse/BigLake Iceberg Catalog Connection (BigLake is still the name of the API and everything in the GCP console, but it lives under a Lakehouse umbrella now.)

Small changes to docs for Iceberg sink.

stacked on #36695

ublubu force-pushed the kynan/iceberg-gcp-doc branch 3 times, most recently from 81cac44 to f582bfa, on May 28, 2026 19:49
ublubu marked this pull request as ready for review on May 28, 2026 20:17
ublubu requested a review from a team as a code owner on May 28, 2026 20:17
ublubu force-pushed the kynan/iceberg-gcp-doc branch from 1f326ad to 07ae4e1 on May 28, 2026 20:58

The name of the user to connect as.

- name: "syntax-gcp"

Contributor

Q: How does this affect https://preview.materialize.com/materialize/36773/sql/create-connection/#s3-compatible-object-storage where we mention Google Cloud Storage ... ?

Contributor Author

Interesting! That feature is not connected to this one, and they use different auth mechanisms. Maybe one day we'll make a GCS-native "copy from" operation, and that would use the new GCP connection primitive.

Contributor

Ah .. thanks ... I think we'll want to then be a little more explicit in the connection page. Will add my comments to that page then.

Comment thread doc/user/data/examples/create_connection.yml Outdated
Comment thread doc/user/content/serve-results/sink/iceberg.md
Comment thread doc/user/content/sql/create-connection.md Outdated
Comment thread doc/user/content/sql/create-sink/iceberg.md
Comment thread doc/user/data/examples/create_connection.yml Outdated
Comment thread doc/user/content/serve-results/sink/iceberg.md Outdated
Comment thread doc/user/content/serve-results/sink/iceberg.md Outdated
Comment thread doc/user/data/examples/create_connection.yml
Comment thread doc/user/content/sql/create-sink/iceberg.md
ublubu force-pushed the kynan/iceberg-gcp-doc branch from 07ae4e1 to aa79174 on June 4, 2026 15:32
ublubu requested review from a team as code owners on June 4, 2026 15:32
ublubu force-pushed the kynan/iceberg-gcp branch 3 times, most recently from acc2db7 to 26b818e, on June 4, 2026 18:28
ublubu force-pushed the kynan/iceberg-gcp-doc branch from aa79174 to 7912fa8 on June 4, 2026 18:29
ublubu force-pushed the kynan/iceberg-gcp branch from 26b818e to a3408b2 on June 4, 2026 18:35
ublubu force-pushed the kynan/iceberg-gcp-doc branch from 7912fa8 to 0479930 on June 4, 2026 18:35
ublubu force-pushed the kynan/iceberg-gcp branch from a3408b2 to 7c400e7 on June 4, 2026 21:14
ublubu force-pushed the kynan/iceberg-gcp-doc branch 2 times, most recently from c8b6ed1 to 4f0aa58, on June 5, 2026 16:48
ublubu force-pushed the kynan/iceberg-gcp branch from 3bcf327 to 8ec6f5b on June 8, 2026 16:30
ublubu force-pushed the kynan/iceberg-gcp-doc branch from a5b9e2b to 1ab3fd1 on June 8, 2026 16:30
ublubu force-pushed the kynan/iceberg-gcp branch from 8ec6f5b to e89154b on June 8, 2026 16:45
ublubu force-pushed the kynan/iceberg-gcp-doc branch 2 times, most recently from 41dfc25 to 7b1feb5, on June 8, 2026 18:48
ublubu force-pushed the kynan/iceberg-gcp branch from e89154b to fdbb619 on June 8, 2026 18:48
ublubu force-pushed the kynan/iceberg-gcp branch from fdbb619 to 96d20a7 on June 8, 2026 19:28
ublubu force-pushed the kynan/iceberg-gcp-doc branch from 7b1feb5 to 7fc632d on June 8, 2026 19:28
ublubu force-pushed the kynan/iceberg-gcp-doc branch from 7fc632d to a9a6ed9 on June 9, 2026 15:27

);
```

### GCP

Contributor

In the above S3 compatible object storage section:

  • "You can use an AWS connection to perform bulk exports and bulk imports ..." -> "You can use an AWS connection to perform bulk exports (`COPY TO`) and bulk imports (`COPY FROM`) ..."

  • We could also add a disambiguation sentence that points readers who are creating a GCP Iceberg sink to this section.


A Google Cloud Platform (GCP) connection gives Materialize
a [service account](https://docs.cloud.google.com/iam/docs/service-account-overview)
in your GCP project. You can use a GCP connection to authenticate with

Contributor

I would break this up and reorder so that the usage (i.e., creating an Iceberg catalog) is more prominent. This might make it easier to reword that service account thing as well.


### GCP

A Google Cloud Platform (GCP) connection gives Materialize

Contributor

"gives MZ a service account" makes it seem like creating a connection creates a service account. It just uses the service account when authenticating, yes?

- name: "`<connection_name>`"
  description: |
    A name for the connection.
- name: "`<service_account_key>`"

Contributor

Ah ... I should have been clearer in my comment.

Since this is the syntax for `CREATE CONNECTION`, we don't need to explain the format of the service account key as part of it. By "incorporating", I meant that the SERVICE ACCOUNT KEY description should reference the secret that stores the service account key, and recommend base64-encoding the key and using `decode` when creating the secret.

Also, did you mean to remove the example?

- name: "syntax-iceberg-catalog"
- name: "syntax-gcp"
  code: |
    CREATE SECRET <secret_name> AS decode('<service_account_key>', 'base64');

Contributor

This already assumes that `<service_account_key>` is base64-encoded.
We can probably add a comment saying that `<service_account_key>` is base64-encoded. If you want to have an example, we can remove this `CREATE SECRET ... decode` from here.


## Prerequisites

Google Cloud [documents the Lakehouse/BigLake setup process here](https://docs.cloud.google.com/lakehouse/docs/lakehouse-iceberg-rest-catalog). The parts you'll need:

Contributor

Start with the actual prerequisites, the bullet points.

- A Google Cloud project with the BigLake API enabled.
- A Google Cloud Storage bucket to serve as the Iceberg warehouse.
- A Lakehouse runtime catalog backed by your warehouse bucket.
- _NOTE: Materialize uses a service account key, not catalog-vended credentials, to write Iceberg data files._

Contributor

We generally use {{< note >}} {{< /note >}} blocks for notes.

   - `serviceusage.serviceUsageConsumer` (Service Usage Consumer)
3. Grant the service account this role on your **Iceberg warehouse bucket**:
   - `storage.objectUser` (Storage Object User)
4. [Create a service account key.](https://docs.cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-gcloud)

Contributor

The Google docs tell people to use JSON and not P12 ... not sure if we want to just say "create a service account key (JSON format)" or something.

Contributor

Do we need to base64-encode it here? Or is it already base64-encoded when you get it from Google? That is, the example in step 2 has a comment showing how to base64-encode the key.json ... but shouldn't we do it explicitly as a step here if people need to do it?
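
For reference, the encoding step under discussion could look like the sketch below. It is only an illustration under stated assumptions: that the downloaded key is a plain JSON file (called `key.json` here, following the example mentioned above) and that its base64 text is what gets substituted for `<service_account_key>` in `decode('<service_account_key>', 'base64')`. The helper name `encode_key_file` is hypothetical and not part of any Materialize or Google tooling.

```python
import base64
import json
from pathlib import Path


def encode_key_file(path: str) -> str:
    """Return the base64 text of a service account key file (hypothetical helper)."""
    raw = Path(path).read_bytes()
    # Assumption: the key file is plain JSON; fail early if it is not.
    json.loads(raw)
    # Encode the exact file bytes so that decoding restores the original JSON.
    return base64.b64encode(raw).decode("ascii")
```

Calling `encode_key_file("key.json")` would return a single-line string that can be pasted into the `CREATE SECRET` statement; decoding it yields the original file bytes unchanged.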


### Limitations

{{% include-headless "/headless/iceberg-sinks/limitations-list" %}}

Contributor

This talks about AWS.

Contributor

Can we add the compaction here?

ublubu force-pushed the kynan/iceberg-gcp branch 4 times, most recently from 7428408 to 19a77b5, on June 10, 2026 19:26
ublubu deleted the branch MaterializeInc:kynan/iceberg-gcp on June 10, 2026 19:48
ublubu closed this on Jun 10, 2026

ublubu commented Jun 11, 2026
Contributor Author

Uh, this got automatically closed when I merged the previous PR. I will reopen after getting it on the correct base branch again.

ublubu commented Jun 11, 2026
Contributor Author

Because I accidentally pushed kynan/iceberg-gcp-doc to my fork (instead of to MaterializeInc/materialize), this PR seems to have permanently closed instead of updating to point at main when kynan/iceberg-gcp got merged and deleted.

def- commented Jun 13, 2026
Contributor

You can just open a new PR from the same branch.

ublubu added a commit that referenced this pull request Jun 18, 2026
reopening #36773 

----

Adding docs for:
- GCP connection
- GCP Lakehouse/BigLake Iceberg Catalog Connection _(BigLake is still
the name of the API and everything in the GCP console, but it lives
under a Lakehouse umbrella now.)_

Small changes to docs for Iceberg sink.