Skip to content

Redesign QueryContext class#13071

Merged
abhishekagarwal87 merged 10 commits into
apache:masterfrom
paul-rogers:220911-context
Oct 15, 2022
Merged

Redesign QueryContext class#13071
abhishekagarwal87 merged 10 commits into
apache:masterfrom
paul-rogers:220911-context

Conversation

@paul-rogers
Copy link
Copy Markdown
Contributor

@paul-rogers paul-rogers commented Sep 11, 2022

Query context revision which builds on the work started in PR #13022.

Release Notes

Most of the work in this PR is transparent to users. This PR includes Issue #13120 which introduces a user-visible change. We introduce two new configuration keys that refine the query context security model controlled by druid.auth.authorizeQueryContextParams. When that value is set to true then two other configuration options become available:

  • druid.auth.unsecuredContextKeys: The set of query context keys that do not require a security check. Use this for the "white-list" of key to allow. All other keys go through the existing context key security checks.
  • druid.auth.securedContextKeys: The set of query context keys that do require a security check. Use this when you want to allow all but a specific set of keys: only these keys go through the existing context key security checks.

Both are set using JSON list format:

druid.auth.securedContextKeys=["secretKey1", "secretKey2"]

You generally set one or the other values. If both are set, unsecuredContextKeys acts as exceptions to securedContextKeys.

In addition, Druid defines two query context keys which always bypass checks because Druid uses them internally:

  • sqlQueryId
  • sqlStringifyArrays

Backward Compatibility

When upgrading Druid, if query context security is disabled (druid.auth.authorizeQueryContextParams=false, the default) then you will see no change.

If query context security is enabled (druid.auth.authorizeQueryContextParams=true), then the two keys listed above will always be allowed, regardless of any security rules that may have been set. You can remove these two keys from your rules if you have added them.

The two new config values (unsecuredContextKeys, and securedContextKeys) will not be set in your configuration (or in the default configurations) and will thus behavior after an upgrade will not change. You can, however, shift context key security rules you have set for all users into the new unsecuredContextKeys config property.

Background

Recent work uncovered two issues with the way Druid handles query contexts:

During the investigation of the above it became clear that the semantics around the query context are a bit unclear and could be improved.

In particular, the current QueryContext class promises more than it can deliver. It groups context values into three groups: default, system values and user values. Splitting up the values was done to allow authorizing just the user-provided keys. However, this didn't quite work as intended for several reasons. First, most code does not recognize this format, and instead "merges" all three maps before making additional changes. Second, the only use of the user-provided values is to authorize them, something that can be done more simply via other means. Third, the concept of "user-set" keys is incorrect: keys set may come from the user, or may come from an application or the Router.

As a result, the QueryContext class creates confusion rather than tidying things up as was intended. The tidying up requires that we pull in Issue #13120 to resolve a context key security ambiguity.

Overview of Revisions

This PR achieves the desired goals in a simpler way, while unifying the many ways that the code currently works with the query context.

  • The original three-map QueryContext is replaced by a "facade" on top of a single map.
  • The original QueryContext was created to allow authorizing user context keys. This PR solves that issue by making a copy of the keys separate from the context, allowing the SQL engine to modify the context without creating conflicts with the security feature.
  • Uses of QueryContext to fetch values are replaced with a reference the new QueryContext class as a facade.
  • Modification to the context in the planning layer occurs directly on the context map; it is no longer done via the old QueryContext class since there is no real value in creating a layer on top of the map.
  • Avoid ClassCastException when getting values from QueryContext #13022 adds a number of type-safe method to the Query interface. However, doing so clutters up that interface, and we still need the QueryContexts methods to get at specific context values. See the discussion in Avoid ClassCastException when getting values from QueryContext #13022 for details. This PR, by contrast, puts all context access methods on the revised QueryContext facade, making for cleaner code.
  • All type-unsafe context value accesses were replaced in Avoid ClassCastException when getting values from QueryContext #13022. They are replaced again here by the new QueryContext type-safe methods.
  • The old type-unsafe methods are marked deprecated. No code in Druid-proper references them, though we must assume that extensions may continue to use them.
  • All specific methods in QueryContexts which take a Query or context map parameter are replaced by methods on the revised QueryContext class.
  • QueryContexts methods which rewrote a query moved to the Queries class.
  • Validation of a context value now reliably throws a BadQueryContextException with a clear explanation of the problem. Previously, we threw both BadQueryContextException and ISE for such errors, and the errors have variations in format and wording.
  • get methods are standardized. get<Type>() without a default returns a boxed value, or null if the value is not set. This allows code to implement logic of the form "if the value is set, do something with the value, else do something else."
  • get(defaultValue) methods return an unboxed value, since they will return the default if the value is unset.
  • DruidPlanner is refactored a bit to better handle context key security.
  • AuthConfig implements the enhanced context key security model.

Specifics

The original QueryContext class was first removed: all references to that class were modified to refer to the context map instead.

A new class of the same name was created. The only creator of the new class is Query.queryContext(). The new class holds onto the context map associated with the query, and provides the many type-safe and value-specific methods. Since QueryContext is simply a facade, it is cheap to create an instance as needed. This lets us convert code of the form:

long foo = QueryContexts.getAsLong(query, FOO_NAME, FOO_DEFAULT);
long bar = QueryContexts.getBar(query);

To

long foo = query.queryContext().getLong(FOO_NAME, FOO_DEFAULT);
long bar = query.queryContext().getBar();

When a bit of code makes multiple references to context values (as above), we can also do:

QueryContext queryContext = query.queryContext();
long foo = queryContext.getLong(FOO_NAME, FOO_DEFAULT);
long bar = queryContext.getBar();

The following methods in Query are marked deprecated:

interface Query... {
 @Deprecated
  <ContextType> ContextType getContextValue(String key);
  @Deprecated
  <ContextType> ContextType getContextValue(String key, ContextType defaultValue);
  @Deprecated
  default boolean getContextBoolean(String key, boolean defaultValue);

Although the above methods are deprecated, they are reimplemented as default methods. As a result, the methods of the same names are removed from subclasses since they are redundant.

The Query.getQueryContext() method is removed as a way of marking that the old QueryContext is no longer available. The old method didn't really even work: the different context types were lost when merging into a single map. A new queryContext() method returns the facade version.

Alternatives

The discussions in #13049 and #13022 suggested other alternatives:

  • Work only with the context map and the QueryContexts methods as shown above. This works fine, and is one of the steps taken in Fix QueryContext race condition #13049 that lead to this PR, but is a bit cumbersome.
  • Add methods to the Query method as in an early draft of Avoid ClassCastException when getting values from QueryContext #13022. Doing so clutters the Query interface, and does not help in the SQL layer where we work with contexts before the native Query is created. Unless we really want to clutter the Query interface, we'd still need the value-specific methods in QueryContexts.
  • Leave well enough alone: status quo. While the existing code works, as noted above, the semantics are a bit messy which creates confusion for readers. Tidying up the code has no benefit to the computer, but is helpful for us poor humans who have to understand the code.

Conclusion

The result is a solution that:


This PR has:

  • been self-reviewed.
  • has been validated using existing tests.
  • has modified unit tests as needed to reflect changed methods, exceptions and semantics.
  • been tested in a test Druid cluster.

@FrankChen021
Copy link
Copy Markdown
Member

I agree with your proposal on the new API form as

long foo = query.queryContext().getLong(FOO_NAME, FOO_DEFAULT);
long bar = query.queryContext().getBar();

It makes the APIs much simpler and cleaner.

@paul-rogers
Copy link
Copy Markdown
Contributor Author

@FrankChen021, thanks for the encouraging comment! Since this PR builds on the work you recently did, can you perhaps give it a review? This one touches many files and so is at risk of conflicts, so we might want to get it in if things look OK. Thanks!

@FrankChen021
Copy link
Copy Markdown
Member

@FrankChen021, thanks for the encouraging comment! Since this PR builds on the work you recently did, can you perhaps give it a review? This one touches many files and so is at risk of conflicts, so we might want to get it in if things look OK. Thanks!

Sure. I will review it on this weekends.

Comment thread processing/src/main/java/org/apache/druid/query/Query.java
Comment thread processing/src/main/java/org/apache/druid/query/Query.java Outdated
Comment thread processing/src/main/java/org/apache/druid/query/Query.java
Comment thread processing/src/main/java/org/apache/druid/query/QueryContext.java Outdated
Comment thread server/src/main/java/org/apache/druid/server/QueryLifecycle.java Outdated
Copy link
Copy Markdown
Contributor Author

@paul-rogers paul-rogers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abhishekagarwal87, thanks for your review! Addressed your questions and comments. There is one backward-compatibility issue that one of your questions raised: please let me know what you think about the issue.

Comment thread server/src/main/java/org/apache/druid/server/QueryLifecycle.java Outdated
Comment thread processing/src/main/java/org/apache/druid/query/Query.java
Comment thread processing/src/main/java/org/apache/druid/query/Query.java Outdated
Comment thread processing/src/main/java/org/apache/druid/query/Query.java
* Removes the QueryContext class and its attendant race conditions.
* Replaces it with a light-weight facade to access the context.
* Uses a simpler, alternative way to authorize user context values.
* Moved query-related methods to Queries
* Moved more context value methods to QueryContext
* Revised the non-default "get" methods to return null if
  the value is not set.
* Various build fixes
Copy link
Copy Markdown
Member

@FrankChen021 FrankChen021 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @paul-rogers for the hard work.

Reviewed most of the files(excluding test cases). Here're some of suggestions besides above comments.

  1. A force-push is not recommended to an open PR since it breaks the review state kept by the GitHub. See https://github.com/apache/druid/blob/master/CONTRIBUTING.md

    Avoid rebasing and force pushes after submitting a pull request, since these make it difficult for reviewers to see what you've changed in response to their reviews. The Druid committer that merges your change will rebase and squash it into a single commit before committing it to master.

  2. Seems like your IDE adds extra line between import javax and import java directives, these changes are unneccessary

  3. Regarding the QueryContext, my opinion is that caller side should always see this type instead of seeing its underlying storage type which is Map<String, Object>. We can see we have getContext(), context(), and even queryContextMap(), these are a little confusion.

  4. Separation of the QueryContext and Map<String, Object> also bring another problem. There're a lot of call of context().getXXX, each call of context() generates a new fascade object QueryContext, I think it's a little bit waste of memory and unnessary.

Comment thread processing/src/main/java/org/apache/druid/query/scan/ScanQueryEngine.java Outdated
Comment thread server/src/main/java/org/apache/druid/server/QueryLifecycle.java
Comment thread server/src/main/java/org/apache/druid/server/QueryResource.java Outdated
Comment thread server/src/main/java/org/apache/druid/server/QueryResource.java Outdated
Comment thread processing/src/main/java/org/apache/druid/query/QueryContext.java Outdated
@paul-rogers
Copy link
Copy Markdown
Contributor Author

@FrankChen021, thanks much for your detailed review! Very helpful. To answer your specific questions:

  1. Sorry for the force-push. On many projects, the build is done on the PR itself. Any conflicts are resolved at the end of the review process. In Druid, each PR build is first rebased on master. Failure to rebase causes the build to fail and we get no feedback on the code. As a result, when there is a merge conflict, we are forced to rebase, which requires a force push. I wonder, is there some trick that others use to avoid this issue?
  2. Sorry for the extra spaces in imports. That's just how Eclipse works. I'll make a note to manually remove such newlines in the future.
  3. See below.
  4. See below.

Your point 3 suggests that callers always see QueryContext and point 4 worries about the cost of creating QueryContext per call. Looking more carefully at the prior implementation, there is a path to make this work. That fix appears in the latest commit. The result is we eliminate the copies that were your concern in point 4.

Point 3 also suggests that callers always see QueryContext instead of the map. I agree that is the goal. The one caveat is that, for backward compatibility, the original map-based getContext() must exist and be a JSON property so that the JSON form of a query is unchanged.

Thank you for pointing out the potential bug with updating a query context in place in a query. The general rule, after this change, should be:

  • The query context is created, as a map, before the query is constructed.
  • Once the query is constructed, the context is accessed via QueryContext and is immutable. (Again, the one exception is that the map is presented to Jackson for JSON serialization.)
  • If the context must change (such as while processing a native query), we must make a copy of the context then make a copy of the entire query with the new context. This ensures that the query is immutable and avoids the kinds of race conditions which first brought my attention to this area.

Decimal parsing for long parameters
Streamline context access in queries
@FrankChen021
Copy link
Copy Markdown
Member

Sorry for the force-push. On many projects, the build is done on the PR itself. Any conflicts are resolved at the end of the review process. In Druid, each PR build is first rebased on master. Failure to rebase causes the build to fail and we get no feedback on the code. As a result, when there is a merge conflict, we are forced to rebase, which requires a force push. I wonder, is there some trick that others use to avoid this issue?

To resolve merge conflict, another way is to merge the latest master back to our own development branch.

Comment thread processing/src/main/java/org/apache/druid/query/QueryContexts.java Outdated
Comment thread processing/src/main/java/org/apache/druid/query/QueryContexts.java Outdated
Comment thread processing/src/main/java/org/apache/druid/query/QueryContexts.java
Comment thread processing/src/main/java/org/apache/druid/query/Query.java
Comment thread processing/src/main/java/org/apache/druid/query/QueryContext.java Outdated
Comment thread sql/src/main/java/org/apache/druid/sql/AbstractStatement.java
Comment thread sql/src/main/java/org/apache/druid/sql/SqlQueryPlus.java
Comment thread sql/src/main/java/org/apache/druid/sql/avatica/DruidConnection.java Outdated
Comment thread sql/src/main/java/org/apache/druid/sql/avatica/DruidConnection.java Outdated
Copy link
Copy Markdown
Contributor Author

@paul-rogers paul-rogers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FrankChen021, thanks again for your very thorough review. I'm impressed at the detailed items you identified: you somehow find all the tender bits in the code, which is wonderful.

I did a re-inspection of each of the areas you identified. I'm hoping that, this time, the story is consistent in each area.

Comment thread processing/src/main/java/org/apache/druid/query/QueryContext.java Outdated
Comment thread sql/src/main/java/org/apache/druid/sql/avatica/DruidConnection.java Outdated
Comment thread sql/src/main/java/org/apache/druid/sql/SqlQueryPlus.java
Comment thread sql/src/main/java/org/apache/druid/sql/AbstractStatement.java
Required to get a security IT to pass
Revises context security in native, SQL query path
Moves two SQL context keys to QueryContexts for visibility to AuthConfig
@paul-rogers
Copy link
Copy Markdown
Contributor Author

paul-rogers commented Oct 12, 2022

@FrankChen021, thanks for the approval!

So, it turns out that I had to pull the work for Issue #13120 into this PR in order to get the security IT to pass. See below.

@paul-rogers
Copy link
Copy Markdown
Contributor Author

PR is updated to pull the work for Issue #13120. That change is needed to get the security IT to pass. Basically, that IT does context key security checks using JDBC. JDBC inserts a key, which fails the security check. This worked previously when we separated system and user keys. (In fact, this may be the reason that the prior PR did the split.)

This latest PR adds two config values, defined in the "release notes" section in the revised description. To make this work:

  • The AuthConfig is not the one-stop shop to prepare the list of context keys to use for authorization checks.
  • Three new set operations are added to CollectionUtils because Java, in its infinite wisdom, doesn't provide them.
  • Two context keys moved from the SQL package to QueryContexts so that they are visible to AuthConfig.
  • These two keys are "out-of-the-box freebies" for context security: they are always allowed because they are set by Druid itself in either the Router (query ID) or JDBC path (stringify arrays.)
  • The code in the planner to gather the resources is tidied up a bit.

We'll want to review this final change before we merge.

Copy link
Copy Markdown
Contributor

@abhishekagarwal87 abhishekagarwal87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, Paul. I don't see any change in the docs. will you add that in another PR?

Comment thread server/src/main/java/org/apache/druid/server/security/AuthConfig.java Outdated
@abhishekagarwal87
Copy link
Copy Markdown
Contributor

It will also make life easier for cluster admin to have a default list of unsecured context variables.

@abhishekagarwal87
Copy link
Copy Markdown
Contributor

There are legit CI failures.

@abhishekagarwal87 abhishekagarwal87 merged commit f4dcc52 into apache:master Oct 15, 2022
@abhishekagarwal87
Copy link
Copy Markdown
Contributor

Thank you @paul-rogers. We can make the changes I suggested in a follow-up PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants