Skip to content

Java API: Add mechanism for configuring an input/processor/output with a confg map #8240

Open
jordansissel wants to merge 17 commits intomainfrom
feature/java-plugin/configuration
Open

Java API: Add mechanism for configuring an input/processor/output with a confg map #8240
jordansissel wants to merge 17 commits intomainfrom
feature/java-plugin/configuration

Conversation

@jordansissel
Copy link
Copy Markdown
Contributor

@jordansissel jordansissel commented Sep 13, 2017

The idea behind ConstructingObjectParser was largely lifted from
Elasticsearch's class by the same name. The reasons I liked this approach was that it provided a type-safe way to define how user input (a Map, in Logstash's case) is parsed to create an object like an input plugin.

The ConstructingObjectParser roughly implements in Java what config/mixin.rb provides for Ruby plugins.

This class allows us to take a Map that comes from the config parser and
build a pipeline plugin (input, filter, output).

Differences from the ruby api:

  • Custom object validation is possible. Example, http_poller has a map of urls which are name => { configuration }, and we can achieve type-safe validation of this urls thing which benefits us as plugin authors and benefits users through better error reporting.
  • Explicit default value is not required because Java class fields already allow you to set default values :)

Request for comments:

  • Is builder api clear?

Note: Exposing the builders to Logstash will require #8227 to be completed and is outside the scope of this PR.

public class Foo {
  // a factory which calls a zero-argument constructor
  public static final ObjectFactory<Foo> FOO1 = new ObjectFactory(Foo::new);

  // one integer argument to the constructor, one optional string field:
  public static final ObjectFactory<Foo> FOO2 = new ObjectFactory(Foo::new, Field.declareInteger("port"))
    .define(Field.declareString("greeting"), Foo::setGreeting);

  private int port;
  private String greeting = "Hello"; // the default greeting

  public Foo() { }
  public Foo(int port) { this.port = port; }

  private void setGreeting(String greeting) {
    this.greeting = greeting;
  }

Example invocation:

Map<String, Object> config = ...; // this is the plugin definition that comes from the Logstash config parser
Foo foo = FOO1.apply(config);
  • Validation failure currently throws IllegalArgumentException[1]. Should we make our own exception class for this?
  • All builder arguments are required to be present (implied :required => true in Ruby plugins). If you do not provide all builder args, a NullPointerException is thrown (because the missing values are null?)
  • I am experimenting with JUnit features (Enclosed and Parameterized test runners). Do you like either of these?

[1] I chose IllegalArgumentException because it is an unchecked exception we can throw while still satisfying the constraints of the java.util.function interfaces (BiFunction, etc)

Todo:

  • Add support for other base types (boolean, float, double, etc)
  • Add support for typed lists like declareList("hosts", Elasticsearch::setHosts, transformString)
  • Add obsolete and deprecated
  • Add any necessary javadoc.
  • Add more examples (Added TranslateFilter example code)

@jordansissel
Copy link
Copy Markdown
Contributor Author

In implementing the other types (boolean, float, etc), I had forgotten that primitives are keywords in Java, so youc an't have a method called 'float'. So I will move everything to be declare[Type](...) like declareFloat

@jordansissel jordansissel force-pushed the feature/java-plugin/configuration branch 4 times, most recently from 08ec826 to 8ef455f Compare September 14, 2017 04:30
@jordansissel
Copy link
Copy Markdown
Contributor Author

I spiked out an implementation of the translate filter:

private final static ConstructingObjectParser<TranslateFilter> TRANSLATE = new ConstructingObjectParser<>(TranslateFilterPlugin::newFilter);
static {
TRANSLATE.declareString("field");
// Things get a bit weird here. I can explain. The translate filter has two modes of operation:
// One, where the user specifies a 'dictionary'. Two, where the user specifies 'dictionary_path'.
// These two modes are mutually exclusive, so it makes sense to have a custom builder method to
// determine which mode to use based on the given arguments. See TranslateFilter.newFilter.
TRANSLATE.declareConstructorArg("dictionary", (config) -> config);
TRANSLATE.declareConstructorArg("dictionary_path", (object) -> Paths.get(ObjectTransforms.transformString(object)));
TRANSLATE.declareString("destination", TranslateFilter::setDestination);
TRANSLATE.declareBoolean("exact", TranslateFilter::setExact);
TRANSLATE.declareBoolean("override", TranslateFilter::setOverride);
TRANSLATE.declareBoolean("regex", TranslateFilter::setRegex);
TRANSLATE.declareString("fallback", TranslateFilter::setFallback);
// Special handling of refresh_interval to reject when dictionary_path is not also set.
TRANSLATE.declareInteger("refresh_interval", TranslateFilterPlugin::setRefreshInterval);
}
private static TranslateFilter newFilter(Object[] args) {
String source = (String) args[0];
if (args[1] != null && args[2] != null) {
throw new IllegalArgumentException("You must specify either dictionary or dictionary_path, not both. Both are set.");
}
if (args[1] != null) {
// "dictionary" field was set, so args[0] is a map.
return new TranslateFilter(source, (Map<String, Object>) args[1]);
} else {
// dictionary_path set, so let's use a file-backed translate filter.
return new FileBackedTranslateFilter(source, (Path) args[2]);
}
}
private static void setRefreshInterval(TranslateFilter filter, int refresh) {
if (filter instanceof FileBackedTranslateFilter) {
((FileBackedTranslateFilter) filter).setRefreshInterval(refresh);
} else {
throw new IllegalArgumentException("refresh_interval is only valid when using dictionary_path.");
}
}

The Ruby API config :dictionary, :validate => :hash etc are now using ConstructingObjectParser features (defined in the top of the TranslateFilterPlugin class)

@jordansissel jordansissel force-pushed the feature/java-plugin/configuration branch from 0c04c0b to e2be201 Compare September 15, 2017 22:47
@jakelandis
Copy link
Copy Markdown
Contributor

jakelandis commented Sep 18, 2017

@jordansissel and I spoke on this subject on Slack, and came to some conclusions around the requirements:

  • Must be easily consumed (read and write) by Plugin authors.
  • Should be strongly typed. e.g. don't push everything as an Object or String.
  • Must support a large number of configuration options in a manner that does not adversely impact read/write-ability for plugin authors. For example a constructor that takes in 20+ arguments is a no-go.
  • Should support custom transformations to a different (stronger) type. Since the configuration language is not strongly typed, but the configuration should be, we should support the ability for plugin developers to transform types. For example, "http://localhost:9200" should be URL type, but when read from the configuration it is a String. We should allow the plugin developer to express this configuration in his/her best suited typed via a custom transformation of type.
  • Should support custom validation, with knowledge of entire configuration set. For example in the Translate filter it is invalid to have both a dictionary and a dictionary_path defined. We should provide hooks such that a plugin developer may error early and gracefully in that case.
  • Must support the ability for the plugin developer to express configuration as obsolete and deprecated withOUT adversely affecting the read/write-ability.
  • Should support some form of core/re-usable validation, such as non-null, or isAURL. Common types of validation that assist the plugin developer to ensure sane input.
  • Should support some form of core/re-usable transformations. For example, converting a String to URL is a common pattern and we shouldn't ask plugin developers to re-code it every time (no matter how simple).
  • Should support by some means a way to express Logstash core configuration to plugins. For example the number of workers may influence the way plugin should work or is configured without having to duplicate the configuration specifically for the plugin.
  • Should support by some means a way to configure common categories of things, such http clients or SSLContext's. Sane defaults will go along way to a healthier, more secure system overall. (note - this is a different type of configuration and may result in a different strategy)
  • It would be nice if the implementation for the configuration can be re-used for the settings. Configuration and settings are very similar, and less code the better.
  • Should be able to split one configuration into two or more. For example a complex Hash defined in the configuration should be able to be simplified and split into it's appropriate parts when presented to the plugin .

@jordansissel
Copy link
Copy Markdown
Contributor Author

jordansissel commented Sep 22, 2017

I think this is ready for review. It may be best to pair up on this one in order to navigate the functionality.

I have written a "sample filter" which demonstrates the use of the ObjectFactory and in a way that I think we will use when writing Logstash plugins (See TranslateFilterPlugin.java). The Translate filter is a bit of a complex plugin because it has model operations (depends on whether 'dictionary' or 'dictionary_path' are set).

  • Authors will be interacting primarily with ObjectFactory which takes a config map and builds an object.
  • Strong typing is applied where possible, for example, if you have a setter take a string, it is invalid to try and apply an integer field to it. (It wouldnt' compile, which is nice)
  • Tests are included for a variety of use cases.

Still todo is documentation on how to use it, but I will write that in a future PR with the Plugin API PR.

…ion map.

The idea behind ConstructingObjectParser was largely lifted from Elasticsearch's class by the same name. The reasons I liked this approach was that it provided a type-safe way to define how user input (a Map, in Logstash's case) is parsed to create an object like an input plugin.

The ConstructingObjectParser roughly implements in Java what `config/mixin.rb` provides for Ruby plugins.

This class allows us to take a Map that comes from the config parser and
build a pipeline plugin (input, filter, output).

Differences from the ruby api:

* Custom object validation is possible. Example, `http_poller` has a map of `urls` which are `name => { configuration }`, and we can achieve type-safe validation of this `urls` thing which benefits us as plugin authors and benefits users through better error reporting.
* Explicit `default` value is not required because Java class fields already allow you to set default values :)

Some rationale on IllegalArgumentException:

* Object construction will throw IllegalArgumentException upon validation failure. This was chosen because IllegalArgumentException is an unchecked exception we can throw while still satisfying the constraints of the java.util.function interfaces (BiFunction, etc)
Also renamed noun-verb things (stringTransform) to verb-noun (transformString)
* note IllegalArgumentException throws
* method order slightly rearranged.
* javadocs
…ete.

This mirrors the current `:deprecated => "details"` and `:obsolete => "details"` in the Ruby side of Logstash.

* When a field is obsolete, using it will raise an exception.
* When a field is deprecated, using it will log a warning.

In tests, I used a trick I learned from Go to make compile-time assertions about return types. This is the reason for `Field x` in various places used in assignment but otherwise not used.
Previously this was in org.logstash.plugin, and it felt not the right package.

Big code reorg also:
* refactored all the shorthand helpers like declareInteger into default methods of a new ObjectParser interface.
* moved object transforms like transformInteger to a new ObjectTransforms class
This exists primarily to practice and demonstrate the capabilities of ConstructingObjectParser
This exists primarily to practice and demonstrate the capabilities of ConstructingObjectParser
Renamed ConstructingObjectParser -> ObjectFactory (shorter name, felt more appropriate)

Fields (setters, etc) are now declared using a fluent interface on ObjectFactory and don't need to be done in a `static { ... }` block anymore

Also added back deprecated and obsolete fields.
@jordansissel jordansissel force-pushed the feature/java-plugin/configuration branch from 4a95760 to a67644f Compare September 27, 2017 06:12
@jordansissel
Copy link
Copy Markdown
Contributor Author

Very rough sketch to see how another complicated filter (mutate) would exist in this new world ---

https://github.com/elastic/logstash/pull/8240/files/a67644f75c918e34f396211c23b546f687f897bd..590584f8ef3d33201c295eb24f2f8546d001c84c

Major concept here is that mutate can compile to a chain of Consumers when the plugin is loaded, so we can try doing that.

@jordansissel
Copy link
Copy Markdown
Contributor Author

@jakelandis and I spend lots of time talking about this project.

We compare annotations-vs-factories pretty frequently. Annotations some really nice readability properties that factories simply don't have.

I'm leaning Jake's suggestion of declaring settings and capabilities with annotations. In fact, the last sketch we drew on this topic looks identical to how Graylog's plugins are configured -- these use a mix of https://github.com/joschi/JadConfig and Java EE Inject.

I'll have a go sketching a few more plugins using annotations to see how it feels.

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Mar 5, 2025

This pull request does not have a backport label. Could you fix it @jordansissel? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • backport-8.x is the label to automatically backport to the 8.x branch.

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Mar 5, 2025

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Mar 5, 2025
@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Mar 5, 2025

This pull request is now in conflicts. Could you fix it @jordansissel? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feature/java-plugin/configuration upstream/feature/java-plugin/configuration
git merge upstream/main
git push upstream feature/java-plugin/configuration

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-8.x Automated backport to the 8.x branch with mergify status:needs-triage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants