VAULT-1564 report in-flight requests by hghaf099 · Pull Request #13024 · hashicorp/vault

hghaf099 · 2021-11-03T00:25:59Z

Adding a trace capability to the request handling or HTTP layer that shows, for each request received but not yet answered:

time duration since the request was received
the request RemoteAddress
the operation and path in the request, but not payload

This would provide a point-in-time snapshot of what user requests Vault is handling, highlighting any deadlocks or abnormally long response times.
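As a rough illustration of the idea (not the code in this PR), the sketch below wraps an `http.Handler` so that each request is recorded on arrival and removed once it has been answered. The names `tracker`, `inFlightEntry`, `wrap`, `snapshot`, and `observe` are all hypothetical and chosen only for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// inFlightEntry is a hypothetical record for one unanswered request.
// It holds the start time, remote address, and path, but no payload.
type inFlightEntry struct {
	Start      time.Time
	RemoteAddr string
	Path       string
}

// tracker is a hypothetical registry of in-flight requests.
type tracker struct {
	mu      sync.Mutex
	nextID  int
	entries map[int]inFlightEntry
}

func newTracker() *tracker {
	return &tracker{entries: make(map[int]inFlightEntry)}
}

// wrap records each request on arrival and forgets it once the wrapped
// handler has returned.
func (t *tracker) wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		t.mu.Lock()
		id := t.nextID
		t.nextID++
		t.entries[id] = inFlightEntry{Start: time.Now(), RemoteAddr: r.RemoteAddr, Path: r.URL.Path}
		t.mu.Unlock()

		defer func() {
			t.mu.Lock()
			delete(t.entries, id)
			t.mu.Unlock()
		}()
		next.ServeHTTP(w, r)
	})
}

// snapshot returns a point-in-time copy of the in-flight entries.
func (t *tracker) snapshot() []inFlightEntry {
	t.mu.Lock()
	defer t.mu.Unlock()
	out := make([]inFlightEntry, 0, len(t.entries))
	for _, e := range t.entries {
		out = append(out, e)
	}
	return out
}

// observe sends one request through the tracker and reports how many entries
// were visible while the handler ran and after it returned.
func observe(path string) (during, after int) {
	tr := newTracker()
	h := tr.wrap(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		during = len(tr.snapshot())
	}))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("GET", path, nil))
	return during, len(tr.snapshot())
}

func main() {
	during, after := observe("/v1/secret/data/68")
	fmt.Println("in flight during handling:", during, "after:", after)
}
```

A request that is stuck (for example on a deadlock) would simply never reach the deferred delete, so it would keep showing up in every snapshot with a growing elapsed time.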

The API reporting this information should be called as part of the debug command. It also supports unauthenticated requests if a new profiling config option is set.

Adding a new metric for the total number of in-flight requests.
Adding documentation for the metric and the new endpoint.
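For the metric, one plausible shape is an atomic counter that is incremented when a request starts and decremented when it finishes. The sketch below is an assumption about how such a gauge could work, not the PR's code; `inFlightCount`, `handle`, and `observe` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// inFlightCount is a hypothetical gauge of requests currently being handled.
var inFlightCount int64

// handle increments the gauge on entry and decrements it on exit,
// even if work panics.
func handle(work func()) {
	atomic.AddInt64(&inFlightCount, 1)
	defer atomic.AddInt64(&inFlightCount, -1)
	work()
}

// observe starts n handlers that block until released, then returns the gauge
// value while all of them are in flight and again after all have finished.
func observe(n int) (during, after int64) {
	var started, done sync.WaitGroup
	release := make(chan struct{})
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			handle(func() {
				started.Done() // the gauge has already been incremented here
				<-release
			})
		}()
	}
	started.Wait()
	during = atomic.LoadInt64(&inFlightCount)
	close(release)
	done.Wait()
	after = atomic.LoadInt64(&inFlightCount)
	return during, after
}

func main() {
	during, after := observe(5)
	fmt.Println("during:", during, "after:", after)
}
```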

Here is a sample result:
{
  "7ecdd692-d934-e668-365d-b4a6664e1548": {
    "start_time": "2021-11-03T14:54:45.774893-07:00",
    "client_remote_address": "127.0.0.2:53771",
    "request_path": "/v1/secret/data/68",
    "duration": "230750 microseconds"
  }
}
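The sample result above is a JSON object keyed by request ID. A client could decode it roughly as follows; the JSON field names come from the sample, while the Go type name `inFlightRecord` and the helper `parseInFlight` are assumptions made for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// inFlightRecord mirrors the fields of the sample output. The Go type name
// is hypothetical; only the JSON keys are taken from the sample.
type inFlightRecord struct {
	StartTime        time.Time `json:"start_time"`
	ClientRemoteAddr string    `json:"client_remote_address"`
	RequestPath      string    `json:"request_path"`
	Duration         string    `json:"duration"`
}

// sample is the result shown in the PR description.
const sample = `{
  "7ecdd692-d934-e668-365d-b4a6664e1548": {
    "start_time": "2021-11-03T14:54:45.774893-07:00",
    "client_remote_address": "127.0.0.2:53771",
    "request_path": "/v1/secret/data/68",
    "duration": "230750 microseconds"
  }
}`

// parseInFlight decodes a response body into a map keyed by request ID.
func parseInFlight(raw []byte) (map[string]inFlightRecord, error) {
	var out map[string]inFlightRecord
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	recs, err := parseInFlight([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for id, r := range recs {
		fmt.Printf("%s %s %s (%s)\n", id, r.ClientRemoteAddr, r.RequestPath, r.Duration)
	}
}
```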

http/handler_test.go

command/debug.go

http/handler.go

http/sys_in_flight_reqeusts_test.go

http/sys_in_flight_requests.go

vault/core.go

vault/logical_system.go

vault/core_metrics.go

vault/logical_system.go

addressing comments

ncabatoff · 2021-11-05T12:12:16Z

command/debug.go

 		healthInfo, err := c.cachedClient.Sys().Health()
 		if err != nil {
 			c.captureError("server-status.health", err)
+			return


Although it's true that an error on the health endpoint will probably mean one on the seal-status endpoint too, I'd rather we didn't assume that. Can you revert this change please?
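To make the point concrete, here is a minimal sketch of the "keep going" behaviour being asked for: every check runs, and failures are recorded per target instead of aborting the whole collection. The `probe` type and the `captureAll` and `demo` helpers are hypothetical and are not part of command/debug.go.

```go
package main

import (
	"errors"
	"fmt"
)

// probe is a hypothetical named check, standing in for a call such as
// Health() or SealStatus().
type probe struct {
	target string
	run    func() error
}

// captureAll runs every probe and records failures by target. It does not
// return at the first error, so later probes still get a chance to succeed.
func captureAll(probes []probe) map[string]error {
	errs := make(map[string]error)
	for _, p := range probes {
		if err := p.run(); err != nil {
			errs[p.target] = err
		}
	}
	return errs
}

// demo simulates a failing health check followed by a working seal check.
func demo() (failed map[string]error, sealRan bool) {
	probes := []probe{
		{"server-status.health", func() error { return errors.New("health unavailable") }},
		{"server-status.seal", func() error { sealRan = true; return nil }},
	}
	failed = captureAll(probes)
	return failed, sealRan
}

func main() {
	failed, sealRan := demo()
	fmt.Println("failed targets:", len(failed), "seal check ran:", sealRan)
}
```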

ncabatoff · 2021-11-05T12:12:33Z

command/debug.go

 		sealInfo, err := c.cachedClient.Sys().SealStatus()
 		if err != nil {
 			c.captureError("server-status.seal", err)
+			return


ncabatoff · 2021-11-05T12:14:31Z

command/debug.go

+			defer resp.Body.Close()
+			err = jsonutil.DecodeJSONFromReader(resp.Body, &data)
+			if err != nil {
+				c.captureError("inFlightReq-status", err)


It looks like the first arg to captureError is the target, which we're now calling requests, right?

ncabatoff · 2021-11-05T12:15:29Z

command/debug.go

+				return
+			}
+
+			if data != nil && len(data) > 0 {


I don't think we need this conditional, do we?

yeah, there should always be at least one entry which is the /v1/sys/in-flight-req related info
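A side note on the conditional quoted above: assuming `data` is a map (its type is not shown in the diff), the `data != nil` half is redundant in Go regardless of what the endpoint returns, because `len` of a nil map is 0. A small sketch with hypothetical helper names:

```go
package main

import "fmt"

// hasEntries relies on len of a nil map being 0.
func hasEntries(data map[string]interface{}) bool {
	return len(data) > 0
}

// hasEntriesVerbose is the form from the diff, with the extra nil check.
func hasEntriesVerbose(data map[string]interface{}) bool {
	return data != nil && len(data) > 0
}

func main() {
	var nilMap map[string]interface{}
	empty := map[string]interface{}{}
	one := map[string]interface{}{"id": "value"}

	// Both helpers agree on every input.
	for _, m := range []map[string]interface{}{nilMap, empty, one} {
		fmt.Println(hasEntries(m), hasEntriesVerbose(m))
	}
}
```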

ncabatoff · 2021-11-05T12:16:50Z

http/handler.go

+				ClientRemoteAddr: r.RemoteAddr,
+				ReqPath: r.URL.Path,
+			})
+		if err != nil {


What error are we handling here?

Ah, thanks for catching this!

command/server/config.go

vault/core.go

ncabatoff · 2021-12-07T12:59:17Z

vault/core.go

 }

+type InFlightRequests struct {
+	l                sync.RWMutex


I'd like us to avoid having both a mutex and sync.Map; if we're going to use a mutex to guard the map, we may as well just use a regular map, since it's way cheaper than sync.Map if concurrent access is protected otherwise.

That said, I think you could get rid of the lock if you stored InFlightReqData instead of *InFlightReqData. The race occurs when we try to update data referenced by a pointer while another goroutine is trying to load the data. If instead updating consists of copying the data out of the map, updating that copy, then storing a copy back into the map, I don't think it'll be racy.

going to remove the mutex.
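The copy, update, store pattern suggested above might look like the sketch below. `reqData` is a hypothetical stand-in for InFlightReqData, and `setClientID` and `demo` are names invented for the example. Note the assumption, stated elsewhere in this thread, that each key has a single writer: with several concurrent writers to the same key, the load and store pair could lose an update.

```go
package main

import (
	"fmt"
	"sync"
)

// reqData is a hypothetical stand-in for InFlightReqData. It is stored in
// the sync.Map by value, never by pointer.
type reqData struct {
	Path     string
	ClientID string
}

// setClientID loads a copy of the entry, modifies the copy, and stores it
// back. A reader that loaded the old value holds its own copy, so nothing it
// sees is mutated underneath it.
func setClientID(m *sync.Map, key, clientID string) bool {
	v, ok := m.Load(key)
	if !ok {
		return false
	}
	d := v.(reqData) // d is a copy of the stored value
	d.ClientID = clientID
	m.Store(key, d)
	return true
}

// demo stores an entry, takes a reader's copy, updates the entry, and returns
// both the reader's (unchanged) copy and the current stored value.
func demo() (before, after reqData) {
	var m sync.Map
	m.Store("req-1", reqData{Path: "/v1/secret/data/68"})

	v, _ := m.Load("req-1")
	before = v.(reqData)

	setClientID(&m, "req-1", "client-1")

	v, _ = m.Load("req-1")
	after = v.(reqData)
	return before, after
}

func main() {
	before, after := demo()
	fmt.Printf("reader copy: %+v\nstored now: %+v\n", before, after)
}
```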

ncabatoff · 2021-12-07T13:00:56Z

vault/testing.go


 func TestCoreWithCustomResponseHeaderAndUI(t testing.T, CustomResponseHeaders map[string]map[string]string, enableUI bool) (*Core, [][]byte, string) {
 	confRaw := &server.Config{
+		LogRequestsLevel: "basic",


This isn't a valid log level, right?

yeah, I wanted to make sure invalid values are not translated to valid ones

But we probably shouldn't do that kind of thing outside of a test specific to the feature. Unless it's somehow relevant to custom response header behaviour?

hmm, makes sense.

ncabatoff · 2021-12-07T18:28:57Z

vault/core.go

+	currentInFlightReqMap := make(map[string]InFlightReqData)
+	c.inFlightReqData.InFlightReqMap.Range(func(key, value interface{}) bool {
+		// there is only one writer to this map, so skip checking for errors
+		v, _ := value.(InFlightReqData)


Minor nit, but: by using the two-argument form, you're deliberately ignoring errors. Either we don't believe errors are possible, in which case you can use the single-argument form which panics on error, or we do, in which case we should handle the error.

I think at some point, you mentioned that there is only one writer for this and there is no need to check if the assertion worked or not. The comment I have on line 3006 talks about that. But, happy to add the error check for this.

I'm ok with there being no error check, I'm just saying if we're not going to check for errors because we think they can't happen, why not use the single-argument form? If we're right that errors can't happen, then they're equivalent. If we're wrong, we'll learn of it because of panics.

Sounds good! I think it is highly unlikely that a panic happens. I am going to use the single-argument form.
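For reference, the three options discussed in this thread behave as sketched below. `reqData` and the helper names are hypothetical; the behaviour of the type assertion forms is standard Go.

```go
package main

import "fmt"

// reqData is a hypothetical stand-in for InFlightReqData.
type reqData struct{ Path string }

// pathIgnored uses the two-value form and discards ok. A wrong type silently
// yields the zero value.
func pathIgnored(v interface{}) string {
	d, _ := v.(reqData)
	return d.Path
}

// pathChecked uses the two-value form and reports a failed assertion.
func pathChecked(v interface{}) (string, error) {
	d, ok := v.(reqData)
	if !ok {
		return "", fmt.Errorf("unexpected type %T", v)
	}
	return d.Path, nil
}

// pathUnchecked uses the single-value form, which panics on a wrong type.
func pathUnchecked(v interface{}) string {
	return v.(reqData).Path
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	good := reqData{Path: "/v1/secret/data/68"}
	fmt.Println(pathUnchecked(good))
	fmt.Println(panics(func() { pathUnchecked(42) }))
	_, err := pathChecked(42)
	fmt.Println(err != nil, pathIgnored(42) == "")
}
```

If the assumption that only one writer stores values holds, the first and third forms are equivalent; if it ever breaks, only the third makes the mistake visible.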

hghaf099 added 2 commits November 2, 2021 17:25

VAULT-1564 report in-flight requests

3463e90

adding a changelog

6420c0c

Changing some variable names and fixing comments

2c3fefe

minor style change

e2031da

adding unauthenticated support for in-flight-req

29c5f24

adding documentation for the listener.profiling stanza

17a4928

hghaf099 commented Nov 3, 2021

hghaf099 requested a review from ncabatoff November 3, 2021 22:38

ncabatoff reviewed Nov 4, 2021

adding an atomic counter for the inflight requests

18ab7da

addressing comments

hghaf099 marked this pull request as ready for review November 5, 2021 00:21

hghaf099 requested a review from taoism4504 as a code owner November 5, 2021 00:21

hghaf099 requested a review from ncabatoff November 5, 2021 01:01

ncabatoff reviewed Nov 5, 2021

fixing couple of tests

2a53824

ncabatoff reviewed Dec 6, 2021

changing log_requests_info to log_requests_level

9893d2e

minor style change

d565b64

fixing a test

97c4cf3

ncabatoff reviewed Dec 7, 2021

ncabatoff reviewed Dec 7, 2021

removing the lock in InFlightRequests

5649a60

hghaf099 requested a review from ncabatoff December 7, 2021 18:13

ncabatoff reviewed Dec 7, 2021

use single-argument form for interface assertion

a0bec8d

ncabatoff approved these changes Dec 7, 2021

hghaf099 added 3 commits December 7, 2021 15:18

adding doc for the new configuration paramter

087fb00

adding the new doc to the nav data file

5ed569b

minor fix

8c8e5d8

ncabatoff mentioned this pull request Jan 17, 2022

Add the duration and start time to logged completed requests. #13682

Merged
