Merged
84 commits
31b7ad0
Add the auth.EnvVars function
shreyas-goenka Feb 28, 2025
bf234c4
Add the auth.ProcessEnv function
shreyas-goenka Feb 28, 2025
42f6ecf
-
shreyas-goenka Feb 28, 2025
4f43fb9
[WIP] Add bundle exec
shreyas-goenka Mar 2, 2025
90c6ad0
-
shreyas-goenka Mar 3, 2025
6002e0d
some more tests
shreyas-goenka Mar 3, 2025
1b5ff48
execute scripts from bundle root
shreyas-goenka Mar 3, 2025
c442378
clarify the cwd plan
shreyas-goenka Mar 3, 2025
152d982
add cases for the target flag
shreyas-goenka Mar 3, 2025
f0d1dc5
fix replacement
shreyas-goenka Mar 3, 2025
080d1bf
make the profile is passed test work
shreyas-goenka Mar 3, 2025
58d28a9
add more tests for profile and target
shreyas-goenka Mar 3, 2025
9939562
added flags are not parsed check
shreyas-goenka Mar 3, 2025
05cd18c
exit code done
shreyas-goenka Mar 3, 2025
a2748d5
cleanup
shreyas-goenka Mar 3, 2025
898b2c1
merge
shreyas-goenka Mar 3, 2025
1996d3f
do not run on cloud
shreyas-goenka Mar 3, 2025
88b6dc8
use $CLI instead of databricks
shreyas-goenka Mar 3, 2025
fd2600c
return stdout / stderr errors after
shreyas-goenka Mar 3, 2025
93c5e3f
simplify streaming
shreyas-goenka Mar 3, 2025
bbbbf30
-
shreyas-goenka Mar 3, 2025
eb58d11
-
shreyas-goenka Mar 3, 2025
007a714
-
shreyas-goenka Mar 3, 2025
0c9fbf7
-
shreyas-goenka Mar 3, 2025
edf7051
more cleanup
shreyas-goenka Mar 3, 2025
769acf7
-
shreyas-goenka Mar 3, 2025
4061036
-
shreyas-goenka Mar 3, 2025
c0d34f7
-
shreyas-goenka Mar 3, 2025
112a9de
-
shreyas-goenka Mar 3, 2025
647dab0
some cleanup
shreyas-goenka Mar 3, 2025
be3be9c
try streaming output
shreyas-goenka Mar 4, 2025
150a51c
merge
shreyas-goenka Mar 5, 2025
39d0b13
-
shreyas-goenka Mar 5, 2025
2212aa2
proper printing
shreyas-goenka Mar 5, 2025
7fb464b
lint
shreyas-goenka Mar 5, 2025
b5e2123
-
shreyas-goenka Mar 5, 2025
39ec48a
remove error struct
shreyas-goenka Mar 5, 2025
2c77368
split windows test
shreyas-goenka Mar 5, 2025
4ef33dc
Revert "split windows test"
shreyas-goenka Mar 5, 2025
996b58a
fix pwd
shreyas-goenka Mar 5, 2025
fcc2966
-
shreyas-goenka Mar 5, 2025
09c05db
-
shreyas-goenka Mar 5, 2025
acff901
fix windows
shreyas-goenka Mar 5, 2025
de5a348
Merge remote-tracking branch 'origin' into bundle-exec
shreyas-goenka Mar 5, 2025
7dc24b6
move tests
shreyas-goenka Mar 5, 2025
90a9584
-
shreyas-goenka Mar 5, 2025
8c094db
general stderr
shreyas-goenka Mar 5, 2025
5586121
read target from flag
shreyas-goenka Mar 11, 2025
26684b1
:-
shreyas-goenka Mar 11, 2025
8e19474
Merge remote-tracking branch 'origin' into bundle-exec
shreyas-goenka Mar 11, 2025
d9b5f5e
fix build
shreyas-goenka Mar 11, 2025
987220b
update docs
shreyas-goenka Mar 11, 2025
2ba81e3
Merge remote-tracking branch 'origin' into bundle-exec
shreyas-goenka Apr 9, 2025
98f33a8
-
shreyas-goenka Apr 9, 2025
9ea6144
use execve
shreyas-goenka Apr 9, 2025
3c4ff15
switch over to run
shreyas-goenka Apr 9, 2025
2866956
-gs
shreyas-goenka Apr 9, 2025
6239f4b
-
shreyas-goenka Apr 9, 2025
28ea87e
-
shreyas-goenka Apr 9, 2025
a6d161a
-
shreyas-goenka Apr 9, 2025
a7ffe5d
-
shreyas-goenka Apr 9, 2025
c56f6f5
-
shreyas-goenka Apr 9, 2025
a0fcc5d
-
shreyas-goenka Apr 9, 2025
f30dffa
-
shreyas-goenka Apr 9, 2025
7d46902
-
shreyas-goenka Apr 9, 2025
c0c1ac6
-
shreyas-goenka Apr 9, 2025
3b57925
-
shreyas-goenka Apr 9, 2025
e5e6ea4
cleanup
shreyas-goenka Apr 9, 2025
b0bb768
-
shreyas-goenka Apr 9, 2025
fa6151b
-
shreyas-goenka Apr 9, 2025
b515f2d
Merge remote-tracking branch 'origin' into bundle-exec
shreyas-goenka Apr 9, 2025
396fc21
Add acceptance test for bundle run
shreyas-goenka Apr 9, 2025
8add0f8
-
shreyas-goenka Apr 9, 2025
8056fd2
-
shreyas-goenka Apr 9, 2025
a6f0694
-
shreyas-goenka Apr 9, 2025
fd43bd7
checkout from main
shreyas-goenka May 9, 2025
fd2f755
merge
shreyas-goenka May 9, 2025
ff71a22
-
shreyas-goenka May 9, 2025
576a16e
-
shreyas-goenka May 9, 2025
1f36d74
undo fix
shreyas-goenka May 9, 2025
4c327f5
add stateful tracking for runs
shreyas-goenka May 9, 2025
2a06b49
Revert "add stateful tracking for runs"
shreyas-goenka May 9, 2025
72305f1
proper stateful tracking
shreyas-goenka May 9, 2025
92bf456
-
shreyas-goenka May 9, 2025
1 change: 1 addition & 0 deletions .wsignore
@@ -19,6 +19,7 @@ acceptance/selftest/record_cloud/volume-io/hello.txt

# "bundle run" has trailing whitespace:
acceptance/bundle/integration_whl/*/output.txt
acceptance/bundle/run/basic/output.txt
Contributor

why is it here? The last command is "echo" which includes a newline.

Contributor Author

The run command itself has a trailing space in its output, not the echo command. Specifically:

[DATE] HH:MM:SS "run-name" TERMINATED

Contributor

Is this easily fixable? I would prefer to keep this ignore list to the absolute minimum.

Contributor Author

Trying to fix in #2864


# "bundle init" has trailing whitespace:
acceptance/bundle/templates-machinery/helpers-error/output.txt
18 changes: 18 additions & 0 deletions acceptance/bundle/run/basic/databricks.yml
@@ -0,0 +1,18 @@
bundle:
  name: caterpillar


resources:
  jobs:
    foo:
      name: foo
      tasks:
        - task_key: task
          spark_python_task:
            python_file: ./foo.py
          environment_key: default

      environments:
        - environment_key: default
          spec:
            client: "2"
1 change: 1 addition & 0 deletions acceptance/bundle/run/basic/foo.py
@@ -0,0 +1 @@
print(1)
34 changes: 34 additions & 0 deletions acceptance/bundle/run/basic/output.txt
@@ -0,0 +1,34 @@

=== no run key specified
>>> [CLI] bundle run
Error: expected a KEY of the resource to run

Exit code: 1

=== deploy and run resource
>>> [CLI] bundle deploy
Uploading bundle files to /Workspace/Users/[USERNAME]/.bundle/caterpillar/default/files...
Deploying resources...
Updating deployment state...
Deployment complete!

>>> [CLI] bundle run foo
Run URL: [DATABRICKS_URL]/job/run/1

[DATE] HH:MM:SS "run-name" TERMINATED

=== no resource key with --
>>> [CLI] bundle run --
Error: expected a KEY of the resource to run

Exit code: 1

=== resource key with parameters
>>> [CLI] bundle run foo -- arg1 arg2
Run URL: [DATABRICKS_URL]/job/run/2

[DATE] HH:MM:SS "run-name" TERMINATED

=== inline script
>>> [CLI] bundle run -- echo hello
hello
16 changes: 16 additions & 0 deletions acceptance/bundle/run/basic/script
@@ -0,0 +1,16 @@
title "no run key specified"
errcode trace $CLI bundle run
Contributor

Nit: I personally prefer smaller tests; they are easier to review and debug, and faster to run. For example, you could split the error-raising commands from the rest.

If you have only one command, you also don't need to specify title (it's the test name), errcode, or trace.

Contributor Author

Fair enough, but in this case it's only 5 commands, so it is still small. The flip side is that you end up with a bunch of really granular files and tests, which does not seem optimal.


title "deploy and run resource"
errcode trace $CLI bundle deploy

errcode trace $CLI bundle run foo

title "no resource key with --"
errcode trace $CLI bundle run --

title "resource key with parameters"
errcode trace $CLI bundle run foo -- arg1 arg2

title "inline script"
errcode trace $CLI bundle run -- echo "hello"
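The five cases exercised by this script follow a small argument-splitting rule. Below is a minimal sketch of that rule, inferred only from the expected output above; it is not the CLI's actual implementation, and `runPlan`, `planRun`, and `errNoKey` are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// runPlan is a hypothetical description of what a "run" command should do.
type runPlan struct {
	Key    string   // resource key to run; empty for an inline command
	Params []string // parameters that follow "--" when a key is given
	Inline []string // inline command that follows "--" when no key is given
}

// errNoKey mirrors the error text shown in output.txt (assumed wording reuse).
var errNoKey = errors.New("expected a KEY of the resource to run")

// planRun splits args at the first "--" and decides between running a
// resource, running an inline command, or reporting a missing key.
func planRun(args []string) (runPlan, error) {
	dash := -1
	for i, a := range args {
		if a == "--" {
			dash = i
			break
		}
	}

	before, after := args, []string(nil)
	if dash >= 0 {
		before, after = args[:dash], args[dash+1:]
	}

	switch {
	case len(before) > 0:
		// A key is present; anything after "--" becomes its parameters.
		return runPlan{Key: before[0], Params: after}, nil
	case len(after) > 0:
		// No key, but a command follows "--": treat it as an inline script.
		return runPlan{Inline: after}, nil
	default:
		return runPlan{}, errNoKey
	}
}

func main() {
	cases := [][]string{
		{},
		{"foo"},
		{"--"},
		{"foo", "--", "arg1", "arg2"},
		{"--", "echo", "hello"},
	}
	for _, args := range cases {
		plan, err := planRun(args)
		fmt.Printf("%q -> %+v, err=%v\n", args, plan, err)
	}
}
```

Under this sketch, `bundle run` and `bundle run --` both end in the missing-key error, while `bundle run -- echo hello` is treated as an inline command, matching the test titles.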
7 changes: 7 additions & 0 deletions acceptance/bundle/run/basic/test.toml
@@ -0,0 +1,7 @@
[[Repls]]
Old = "(?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]"
New = "HH:MM:SS"

[[Repls]]
Old = '20\d\d-\d\d-\d\d'
New = '[DATE]'
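The two replacements above are what turn a timestamped log line into the stable `[DATE] HH:MM:SS` form seen in output.txt. A minimal sketch of that normalization, assuming the patterns are applied in order as ordinary regular expressions and that `New` is inserted literally (the test harness may do this differently); `repl` and `normalize` are hypothetical names, and the sample input line is made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// repl pairs a pattern with its literal replacement (hypothetical type).
type repl struct {
	pattern     *regexp.Regexp
	replacement string
}

// repls copies the two patterns from test.toml.
var repls = []repl{
	{regexp.MustCompile(`(?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]`), "HH:MM:SS"},
	{regexp.MustCompile(`20\d\d-\d\d-\d\d`), "[DATE]"},
}

// normalize applies every replacement in order.
// ReplaceAllLiteralString is used so "$" in a replacement is never expanded.
func normalize(s string) string {
	for _, r := range repls {
		s = r.pattern.ReplaceAllLiteralString(s, r.replacement)
	}
	return s
}

func main() {
	// Made-up sample line in the shape that "bundle run" prints.
	fmt.Println(normalize(`2025-05-09 14:03:27 "run-name" TERMINATED`))
	// prints: [DATE] HH:MM:SS "run-name" TERMINATED
}
```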
26 changes: 26 additions & 0 deletions acceptance/internal/handlers.go
@@ -4,6 +4,7 @@ import (
"encoding/json"
"fmt"
"net/http"
"strconv"

"github.com/databricks/databricks-sdk-go/service/catalog"
"github.com/databricks/databricks-sdk-go/service/iam"
@@ -194,6 +195,31 @@ func addDefaultHandlers(server *testserver.Server) {
return req.Workspace.JobsList()
})

server.Handle("POST", "/api/2.2/jobs/run-now", func(req testserver.Request) any {
var request jobs.RunNow
if err := json.Unmarshal(req.Body, &request); err != nil {
return testserver.Response{
Body: fmt.Sprintf("internal error: %s", err),
StatusCode: 500,
}
}

return req.Workspace.JobsRunNow(request.JobId)
})

server.Handle("GET", "/api/2.2/jobs/runs/get", func(req testserver.Request) any {
runId := req.URL.Query().Get("run_id")
runIdInt, err := strconv.ParseInt(runId, 10, 64)
if err != nil {
return testserver.Response{
Body: fmt.Sprintf("internal error: %s", err),
StatusCode: 500,
}
}

return req.Workspace.JobsGetRun(runIdInt)
})

server.Handle("GET", "/oidc/.well-known/oauth-authorization-server", func(_ testserver.Request) any {
return map[string]string{
"authorization_endpoint": server.URL + "oidc/v1/authorize",
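The `runs/get` handler above reads `run_id` from the query string and answers with a 500 when it cannot be parsed. A minimal standalone sketch of that parsing step using only the standard library, showing that a missing `run_id` takes the same error path as a non-numeric one; `getRunHandler` and `status` are hypothetical names, and the `testserver` package from the PR is not used here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// getRunHandler is a hypothetical stand-in for the runs/get handler.
// It parses run_id the same way: base 10, 64-bit.
func getRunHandler(w http.ResponseWriter, r *http.Request) {
	runID, err := strconv.ParseInt(r.URL.Query().Get("run_id"), 10, 64)
	if err != nil {
		// An empty string also fails to parse, so a missing parameter lands here.
		http.Error(w, fmt.Sprintf("internal error: %s", err), http.StatusInternalServerError)
		return
	}
	fmt.Fprintf(w, `{"run_id": %d}`, runID)
}

// status issues an in-memory GET request and returns the response status code.
func status(target string) int {
	rec := httptest.NewRecorder()
	getRunHandler(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	fmt.Println(status("/api/2.2/jobs/runs/get?run_id=1"))   // 200
	fmt.Println(status("/api/2.2/jobs/runs/get?run_id=abc")) // 500
	fmt.Println(status("/api/2.2/jobs/runs/get"))            // 500
}
```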
75 changes: 63 additions & 12 deletions libs/testserver/fake_workspace.go
@@ -20,13 +20,16 @@ import (

  // FakeWorkspace holds a state of a workspace for acceptance tests.
  type FakeWorkspace struct {
- mu sync.Mutex
+ mu  sync.Mutex
+ url string

  directories map[string]bool
  files map[string][]byte
  // normally, ids are not sequential, but we make them sequential for deterministic diff
- nextJobId int64
- jobs      map[int64]jobs.Job
+ nextJobId    int64
+ nextJobRunId int64
+ jobs         map[int64]jobs.Job
+ jobRuns      map[int64]jobs.Run

  Pipelines map[string]pipelines.PipelineSpec
  Monitors map[string]catalog.MonitorInfo
@@ -70,19 +73,21 @@ func MapDelete[T any](w *FakeWorkspace, collection map[string]T, key string) Res
  return Response{}
  }

- func NewFakeWorkspace() *FakeWorkspace {
+ func NewFakeWorkspace(url string) *FakeWorkspace {
  return &FakeWorkspace{
+ url: url,
  directories: map[string]bool{
  "/Workspace": true,
  },
- files:     map[string][]byte{},
- jobs:      map[int64]jobs.Job{},
- nextJobId: 1,
-
- Pipelines: map[string]pipelines.PipelineSpec{},
- Monitors:  map[string]catalog.MonitorInfo{},
- Apps:      map[string]apps.App{},
- Schemas:   map[string]catalog.SchemaInfo{},
+ files:        map[string][]byte{},
+ jobs:         map[int64]jobs.Job{},
+ jobRuns:      map[int64]jobs.Run{},
+ nextJobId:    1,
+ nextJobRunId: 1,
+ Pipelines:    map[string]pipelines.PipelineSpec{},
+ Monitors:     map[string]catalog.MonitorInfo{},
+ Apps:         map[string]apps.App{},
+ Schemas:      map[string]catalog.SchemaInfo{},
  }
  }


@@ -247,6 +252,52 @@ func (s *FakeWorkspace) JobsGet(jobId string) Response {
}
}

func (s *FakeWorkspace) JobsRunNow(jobId int64) Response {
defer s.LockUnlock()()

_, ok := s.jobs[jobId]
if !ok {
return Response{
StatusCode: 404,
}
}

runId := s.nextJobRunId
s.nextJobRunId++
s.jobRuns[runId] = jobs.Run{
RunId: runId,
State: &jobs.RunState{
LifeCycleState: jobs.RunLifeCycleStateRunning,
},
RunPageUrl: fmt.Sprintf("%s/job/run/%d", s.url, runId),
RunType: jobs.RunTypeJobRun,
RunName: "run-name",
}

return Response{
Body: jobs.RunNowResponse{
RunId: runId,
},
}
}

func (s *FakeWorkspace) JobsGetRun(runId int64) Response {
defer s.LockUnlock()()

run, ok := s.jobRuns[runId]
if !ok {
return Response{
StatusCode: 404,
}
}

// Mark the run as terminated.
run.State.LifeCycleState = jobs.RunLifeCycleStateTerminated
Contributor

This seems very specific to a particular test and might not be reusable. Perhaps you could instead have an explicit endpoint to update this field?

Contributor Author

@shreyas-goenka May 9, 2025

The bundle run command blocks until the run terminates. The run state is an internal detail, and all runs automatically transition to terminated, so this is a fair way to model it in a fake workspace.

Contributor

OK, we can extend this later.

return Response{
Body: run,
}
}

func (s *FakeWorkspace) PipelinesGet(pipelineId string) Response {
defer s.LockUnlock()()

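The modelling choice discussed in the review thread above (sequential run IDs, and a run that is marked terminated as soon as it is fetched) can be illustrated in isolation. This is a minimal sketch with plain types rather than the SDK's `jobs.Run`; `fakeRuns`, `runNow`, `get`, and `waitForTermination` are hypothetical names, and the polling loop is only an assumed stand-in for how a blocking client might behave.

```go
package main

import (
	"fmt"
	"sync"
)

type runState string

const (
	stateRunning    runState = "RUNNING"
	stateTerminated runState = "TERMINATED"
)

// fakeRun is a simplified, hypothetical stand-in for a job run.
type fakeRun struct {
	ID    int64
	State runState
}

// fakeRuns hands out sequential run IDs so that test output is deterministic.
type fakeRuns struct {
	mu     sync.Mutex
	nextID int64
	runs   map[int64]*fakeRun
}

func newFakeRuns() *fakeRuns {
	return &fakeRuns{nextID: 1, runs: map[int64]*fakeRun{}}
}

// runNow records a new run in the RUNNING state and returns its ID.
func (f *fakeRuns) runNow() int64 {
	f.mu.Lock()
	defer f.mu.Unlock()

	id := f.nextID
	f.nextID++
	f.runs[id] = &fakeRun{ID: id, State: stateRunning}
	return id
}

// get marks the run as terminated before returning a copy of it,
// so the first fetch already observes a finished run.
func (f *fakeRuns) get(id int64) (fakeRun, bool) {
	f.mu.Lock()
	defer f.mu.Unlock()

	run, ok := f.runs[id]
	if !ok {
		return fakeRun{}, false
	}
	run.State = stateTerminated
	return *run, true
}

// waitForTermination polls until the run is terminated or maxPolls is reached.
// It returns the number of polls performed.
func waitForTermination(f *fakeRuns, id int64, maxPolls int) (int, error) {
	for i := 1; i <= maxPolls; i++ {
		run, ok := f.get(id)
		if !ok {
			return i, fmt.Errorf("run %d not found", id)
		}
		if run.State == stateTerminated {
			return i, nil
		}
	}
	return maxPolls, fmt.Errorf("run %d still running after %d polls", id, maxPolls)
}

func main() {
	f := newFakeRuns()
	id := f.runNow()
	polls, err := waitForTermination(f, id, 10)
	fmt.Println(id, polls, err) // prints: 1 1 <nil>
}
```

Because the transition happens inside `get`, a blocking caller finishes after a single poll, which keeps such a test fast; an explicit state-update endpoint, as suggested in the review, would be the alternative if tests ever need to observe intermediate states.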
2 changes: 1 addition & 1 deletion libs/testserver/server.go
@@ -240,7 +240,7 @@ func (s *Server) getWorkspaceForToken(token string) *FakeWorkspace {
  defer s.mu.Unlock()

  if _, ok := s.fakeWorkspaces[token]; !ok {
- s.fakeWorkspaces[token] = NewFakeWorkspace()
+ s.fakeWorkspaces[token] = NewFakeWorkspace(s.Server.URL)
  }

  return s.fakeWorkspaces[token]