storage: create a separate service for each source #12770
ports: vec![
    ServicePort {
        name: "controller".into(),
        port_hint: 2100,
Should this be `assigned.ports["controller"]`, and similarly for below?
Ah, nope! `assigned` is only in scope on L420-429. The idea is that a service requests a particular port, and it may or may not actually get that port. With the process orchestrator, for example, not every `storaged` process can get port 2100. But in Kubernetes, since each pod is isolated, you can hand out 2100 to every pod.
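To make the port-hint idea above concrete, here is a minimal sketch in Rust. Only `ServicePort`, `name`, and `port_hint` come from the diff; `SharedHostAllocator`, `assign`, `assign_isolated`, and the "probe upward for the next free port" strategy are hypothetical and are not taken from the actual orchestrator code, which may pick ports differently.

```rust
use std::collections::{HashMap, HashSet};

/// A port a service asks for. Mirrors the fields visible in the diff;
/// the real type may carry more information.
#[derive(Debug, Clone)]
pub struct ServicePort {
    pub name: String,
    pub port_hint: u16,
}

/// Hypothetical allocator for an orchestrator whose services all share
/// one host (as with local processes), so ports can collide.
#[derive(Debug, Default)]
pub struct SharedHostAllocator {
    in_use: HashSet<u16>,
}

impl SharedHostAllocator {
    /// Grants the hinted port when it is free; otherwise probes upward
    /// for the next free port. The hint is a request, not a guarantee.
    pub fn assign(&mut self, ports: &[ServicePort]) -> HashMap<String, u16> {
        let mut assigned = HashMap::new();
        for p in ports {
            let mut candidate = p.port_hint;
            while self.in_use.contains(&candidate) {
                candidate = candidate.checked_add(1).expect("ran out of ports");
            }
            self.in_use.insert(candidate);
            assigned.insert(p.name.clone(), candidate);
        }
        assigned
    }
}

/// Hypothetical assignment for isolated services (one network namespace
/// per service, as with pods): every hint can be honored as-is.
pub fn assign_isolated(ports: &[ServicePort]) -> HashMap<String, u16> {
    ports
        .iter()
        .map(|p| (p.name.clone(), p.port_hint))
        .collect()
}
```

In this sketch, two services that both hint 2100 on a shared host end up with different ports, which is why callers would look up the assigned port by name rather than assume the hint was granted.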
memory_limit: None,
scale: NonZeroUsize::new(1).unwrap(),
labels: HashMap::new(),
availability_zone: None,
No need to block this for a prototype, but this is something we should be careful about if we care about co-locating `storaged` pods in an AZ for network traffic reasons (ingress/egress, latency).
Yeah, we'll need to think through how we want to allow users to constrain their sources to AZs!
necaris left a comment
Looks great so far, per the existing orchestrator semantics in Kubernetes!
Since MaterializeInc#12216, tables are now entirely handled by the controller. There is no longer a need to send a `CreateSourceCommand` to the `storaged` process for table sources. We should eventually adjust the types here to enforce this statically, but this quick fix will unblock MaterializeInc#12770.
Resolve a TODO to read the Debezium transactional metadata source out of persist, rather than re-rendering the source. This PR will unblock creating a pod per source (MaterializeInc#12770), but it is blocked on reverting MaterializeInc#12082, which is no longer necessary now that the TCP boundary has been removed.
@petrosagg this is ready for review! I expect all tests to pass and will rebase after #12798 merges.
Yikes, that recursive struct mapping introduced quite a bit of boilerplate. I forgot that we rely on the I explored an alternative path that makes storage ingestion descriptions work just like dataflow descriptions where they must carry with them a
Looks great!
Create a separate orchestrator service for each source. This means a
process per source when running locally and a pod per source when
running in Kubernetes. This is our storage scalability story for
Materialize Platform.
This will cause some fallout. As written, this creates a process for
every system table, which is hefty.
Fix MaterializeInc/cloud#2708.
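As a rough illustration of the "one orchestrator service per source" design described above, the following Rust sketch registers one service for each source ID. Everything here is hypothetical: the `Orchestrator` trait, `InMemoryOrchestrator`, `ServiceConfig`, the `storage-{id}` naming scheme, and `create_source_services` are stand-ins for illustration and are not the actual Materialize API.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily simplified service description.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ServiceConfig {
    pub image: String,
    pub scale: usize,
}

/// Hypothetical orchestrator interface: a backend could map a service
/// to a local process or to a Kubernetes pod.
pub trait Orchestrator {
    /// Creates the service if it is missing, or updates it in place.
    fn ensure_service(&mut self, id: &str, config: ServiceConfig);
}

/// In-memory stand-in that only records which services exist.
#[derive(Debug, Default)]
pub struct InMemoryOrchestrator {
    pub services: BTreeMap<String, ServiceConfig>,
}

impl Orchestrator for InMemoryOrchestrator {
    fn ensure_service(&mut self, id: &str, config: ServiceConfig) {
        self.services.insert(id.to_string(), config);
    }
}

/// Assumed naming scheme: one service per source, keyed by source ID.
pub fn service_name(source_id: u64) -> String {
    format!("storage-{}", source_id)
}

/// Ensures exactly one service exists for each source ID given.
pub fn create_source_services<O: Orchestrator>(orch: &mut O, source_ids: &[u64]) {
    for id in source_ids {
        orch.ensure_service(
            &service_name(*id),
            ServiceConfig {
                image: "storaged".into(),
                scale: 1,
            },
        );
    }
}
```

Because the number of services grows linearly with the number of sources, this shape also shows why creating a service for every system table is expensive.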
Motivation
Tips for reviewer
Review just the last commit! The first two commits are from #12798 and will be rebased away after that PR merges.
Testing
Release notes
This PR includes the following user-facing behavior changes: