Skip to content
This repository was archived by the owner on Feb 20, 2020. It is now read-only.

Bug 1516458 - take rootURL as config and set TASKCLUSTER_ROOT_URL and TASKCLUSTER_PROXY_URL#135

Merged
petemoore merged 9 commits intotaskcluster:masterfrom
djmitche:bug1516458
Jan 4, 2019
Merged

Bug 1516458 - take rootURL as config and set TASKCLUSTER_ROOT_URL and TASKCLUSTER_PROXY_URL#135
petemoore merged 9 commits intotaskcluster:masterfrom
djmitche:bug1516458

Conversation

@djmitche
Copy link
Copy Markdown
Contributor

For tests, we just run `go get github.com/taskcluster/taskcluster-proxy, so this uses the newest version. I'm not sure how to accomplish that in the prod deployment.

@djmitche djmitche self-assigned this Dec 28, 2018
@djmitche djmitche requested review from petemoore and walac December 28, 2018 22:01
@coveralls
Copy link
Copy Markdown

coveralls commented Dec 28, 2018

Coverage Status

Coverage decreased (-52.9%) to 17.7% when pulling b8cc67a on djmitche:bug1516458 into 7859cdf on taskcluster:master.

@djmitche djmitche force-pushed the bug1516458 branch 2 times, most recently from 557de62 to 851aded Compare December 28, 2018 22:38
@djmitche
Copy link
Copy Markdown
Contributor Author

hm, well, this isn't working yet..

@djmitche
Copy link
Copy Markdown
Contributor Author

(somehow the tests passed for me locally once, but not anymore)

@djmitche
Copy link
Copy Markdown
Contributor Author

OK, I think I figured it out -- generateCommand is called before the features are initialized, so the Env property of the task payload has already been read at feature initialization time. I switched to modifying each exec.Cmd's .Env property directly, as suggested by

    // generate commands, in case features want to modify them
    for i := range task.Payload.Command {
        err := task.generateCommand(i) // platform specific                                                                                                                                                                                                                                 
        if err != nil {
            panic(err)
        }    
    }    

This fixes the tests for me locally. The error in CI on Windows is

C:\Users\task_1546265954\gopath1.10.3\src\github.com\taskcluster\generic-worker\testdata\TestTaskclusterProxy\task_1546265954>go run "curlget.go" $TASKCLUSTER_PROXY_URL/queue/v1/task/KTBKfEgxR5GdfIIREQIvFQ/runs/0/artifacts/SampleArtifacts/_/X.txt 
2018/12/31 14:28:57 Get http://localhost:34569/queue/v1/task/KTBKfEgxR5GdfIIREQIvFQ/runs/0/artifacts/SampleArtifacts/_/X.txt: dial tcp: lookup localhost: getaddrinfow: A non-recoverable error occurred during a database lookup.
exit status 1

which confuses me -- the PR makes the following change to the URL:

-                               fmt.Sprintf("http://localhost:%v/queue/v1/task/KTBKfEgxR5GdfIIREQIvFQ/runs/0/artifacts/SampleArtifacts/_/X.txt", config.TaskclusterProxyPort),
+                               fmt.Sprintf("$TASKCLUSTER_PROXY_URL/queue/v1/task/KTBKfEgxR5GdfIIREQIvFQ/runs/0/artifacts/SampleArtifacts/_/X.txt"),

so both before and (from the logging) after, it's http://localhost:<port>/.... Why would it suddenly have trouble looking up localhost?

@djmitche
Copy link
Copy Markdown
Contributor Author

The Travis build is failing because env vars aren't available to PRs

@djmitche djmitche force-pushed the bug1516458 branch 2 times, most recently from 0e61869 to edf05e1 Compare December 31, 2018 16:34
@djmitche
Copy link
Copy Markdown
Contributor Author

The same TestTaskclusterProxy test that's failing here seems to be failing reliably on master, just differently:

https://taskcluster-artifacts.net/flKO8uO0T3yA5rfyfQe2fg/0/public/logs/live_backing.log
https://taskcluster-artifacts.net/FbZXyJP7TDSirPQLhqdaIQ/0/public/logs/live_backing.log
https://taskcluster-artifacts.net/EGzzMzPNR6-gF9QiW5p9Lg/0/public/logs/live_backing.log

--- FAIL: TestTaskclusterProxy (67.03s)
	helper_test.go:211: Scheduled task b9dUhBvITfWfhHwCVfMPCQ
	helper_test.go:224: Something went wrong executing worker - got exit code 69 but was expecting exit code 0

(note when looking at those logs, there are two different test cases named TestTaskclusterProxy)

..so perhaps this has never worked, and my patch is unrelated?

@petemoore
Copy link
Copy Markdown
Member

The same TestTaskclusterProxy test that's failing here seems to be failing reliably on master, just differently:

https://taskcluster-artifacts.net/flKO8uO0T3yA5rfyfQe2fg/0/public/logs/live_backing.log
https://taskcluster-artifacts.net/FbZXyJP7TDSirPQLhqdaIQ/0/public/logs/live_backing.log
https://taskcluster-artifacts.net/EGzzMzPNR6-gF9QiW5p9Lg/0/public/logs/live_backing.log

--- FAIL: TestTaskclusterProxy (67.03s)
	helper_test.go:211: Scheduled task b9dUhBvITfWfhHwCVfMPCQ
	helper_test.go:224: Something went wrong executing worker - got exit code 69 but was expecting exit code 0

(note when looking at those logs, there are two different test cases named TestTaskclusterProxy)

..so perhaps this has never worked, and my patch is unrelated?

It looks like it may be related to your patch. See

https://tools.taskcluster.net/groups/WtVE-HaNSjW2zOvFQ2veGQ/tasks/b9dUhBvITfWfhHwCVfMPCQ/runs/0/logs/public%2Flogs%2Flive.log#L27-59

@petemoore
Copy link
Copy Markdown
Member

I suspect changes to the taskcluster proxy test(s) will be needed in light of taskcluster/taskcluster-proxy#38

@petemoore
Copy link
Copy Markdown
Member

Although the comments in that PR say "This should be 100% backward compatible" so I'm not sure. Looking deeper now...

@djmitche
Copy link
Copy Markdown
Contributor Author

djmitche commented Jan 2, 2019

This PR contains the fixes to make generic-worker work with the latest tc-proxy (after
taskcluster/taskcluster-proxy#38). But since generic-worker does not use a pinned version of the proxy, it makes sense that landing the tc-proxy PR broke CI for generic-worker until this is landed.

So we're back to localhost not resolving on windows. Googling suggests that's because %SYSROOT% isn't set, and from the look of the logs, it is not set:

https://taskcluster-artifacts.net/NIlDOfEvR5msuvQlCi03gw/0/public/logs/live_backing.log

Z:\task_1546277086\gopath1.10.3\src\github.com\taskcluster\generic-worker\testdata\TestTaskclusterProxy\task_1546277086>set
COMSPEC=C:\windows\system32\cmd.exe
GOPATH=Z:\task_1546277086\gopath1.10.3
GOROOT=Z:\task_1546277086\go1.10.3\go
PATH=Z:\task_1546277086\git\bin;Z:\task_1546277086\gopath1.10.3\bin;Z:\task_1546277086\go1.10.3\go\bin;C:\Program Files\Mercurial;C:\mozilla-build\7zip;C:\mozilla-build\info-zip;C:\mozilla-build\kdiff3;C:\mozilla-build\moztools-x64\bin;C:\mozilla-build\mozmake;C:\mozilla-build\msys\bin;C:\mozilla-build\msys\local\bin;C:\mozilla-build\nsis-3.0b3;C:\mozilla-build\nsis-2.46u;C:\mozilla-build\python;C:\mozilla-build\python\Scripts;C:\mozilla-build\python3;C:\mozilla-build\upx391w;C:\mozilla-build\wget;C:\mozilla-build\yasm;C:\windows\system32;C:\windows;C:\windows\System32\Wbem;C:\windows\System32\WindowsPowerShell\v1.0\;C:\mozilla-build\python27;C:\mozilla-build\python27\Scripts;C:\mozilla-build\vim\vim72;C:\CoreUtils\bin;C:\mozilla-build\buildbotve\scripts;c:\Program Files\Microsoft Windows Performance Toolkit\;C:\Users\cltbld\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0;C:\ProgramData\chocolatey\bin;C:\Program Files\Puppet Labs\Puppet\bin;C:\mozilla-build\hg;C:\Program Files\GNU\GnuPG\pub;C:\Program Files\Mercurial\;
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.JS;.WS;.MSC
PROMPT=$P$G
TASKCLUSTER_PROXY_URL=http://localhost:34569
TASKCLUSTER_ROOT_URL=https://taskcluster.net
TASK_ID=NIlDOfEvR5msuvQlCi03gw
tcexitcode=0

This has a few advantages:
 * permits adding the queue:create-artifact:.. scope
 * avoides an unnecessary additional call to queue.task(..) by the proxy
 * duplicates docker-worker's approach to minimize chance of
   behavior change
@djmitche
Copy link
Copy Markdown
Contributor Author

djmitche commented Jan 2, 2019

So, I think the options are:

  • Always have an accessToken (this is beyond my capabilities)
  • Change the order of feature initialization and put the value in Payload.Env
  • Special-case this feature and add a set command to the batch script

@petemoore which do you prefer?

@djmitche
Copy link
Copy Markdown
Contributor Author

djmitche commented Jan 2, 2019

However, after this PR, env is []string{"TASKCLUSTER_PROXY_URL=.."}, and that is put into cmd.Env unchanged, so all of the other environment variables disappear.

This was slightly off-base: env is not []string{"TASKCLUSTER_PROXY_URL=.."}, it is still nil. But the taskcluster_proxy feature later appends "TASKCLUSTER_PROXY_URL=.." to cmd.Env, changing that field from nil to []string{"TASKCLUSTER_PROXY_URL=.."}.

I added some debugging to catch the case where the process environment is required and accessToken is nil, but that doesn't catch this particular error.

So, I think editing cmd.Env in a feature's setup function is not a good idea. The options, then, are

  • Change the order of feature initialization and put the value in Payload.Env
  • Special-case this feature and add a set command to the batch script

I'm leaning toward the second one.

@djmitche djmitche force-pushed the bug1516458 branch 3 times, most recently from fbf6cb7 to 30ab2b1 Compare January 2, 2019 23:19
@djmitche djmitche requested a review from petemoore January 2, 2019 23:56
@djmitche
Copy link
Copy Markdown
Contributor Author

djmitche commented Jan 2, 2019

It's green!

@djmitche djmitche force-pushed the bug1516458 branch 3 times, most recently from d07d20b to 07c912c Compare January 3, 2019 17:50
@djmitche djmitche requested review from petemoore and removed request for walac January 3, 2019 18:56
@djmitche
Copy link
Copy Markdown
Contributor Author

djmitche commented Jan 3, 2019

OK, that's all green except for coveralls. I don't know why it thinks coveraged dropped 52.9% - that's obviously wrong!

Copy link
Copy Markdown
Member

@petemoore petemoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last change request, and then we're done!

You can ignore the nit - the other change request is some straggling code that I think you accidentally left in when you added and then removed some debugging statements...


// Set an environment variable in each command. This can be called from a feature's
// NewTaskFeature method to set variables for the task.
func (task *TaskRun) setVariable(variable string, value string) error {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a nit, not a required change.

Nit: I would have probably called this setEnvironmentVariable for clarity, and made it a function of the environment rather than the task, but it doesn't really matter. This avoids the duplication of the loop over the task commands in both platform implementations. If you provide a method that just updates an environment block in a platform-independent way, you can have a single loop over the task commands, that is platform independent. It doesn't really matter though.

Note, originally the windows and all-unix-style process implementations were quite different, but with go 1.10 it became possible to use the standard library for running processes as a different user under windows, at which point it became possible to have very similar implementations for both. I intend to clean up some of the duplication in a future PR, and at that point will probably introduce a separate go type for an environment block (something like type Environment []string) that will have its own func (env *Environment) Set(envVar string, value string), func (env *Environment) Clear(envVar string) functions for modifying it. So it makes sense to rework this at the same time, so we can leave it as is for now.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave that to you :)

* factor out and test non-platform-specific MergeEnvLists
* handle the task-user and current-user modes differently
With this change, the proxy knows the rootUrl, and the task has a
reliable indication of the proxy URL

There is no provision for a feature to set a task environment variable,
and in fact features are not even initialized yet when the task commands
are created, so this special-cases the TaskclusterProxyFeature for
inclusion in the task environment.
Copy link
Copy Markdown
Member

@petemoore petemoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks Dustin!

Dustin determined that the code coverage drop is simply due to the PR coming from a forked repo, and that we disabled access to production taskcluster for integration tests from forked repos. When this lands on master, we can check the code coverage report there.

@petemoore petemoore merged commit 01bde78 into taskcluster:master Jan 4, 2019
@djmitche
Copy link
Copy Markdown
Contributor Author

djmitche commented Jan 4, 2019

Coverage looks good on master. Note that Travis doesn't support using env vars on PRs!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants