Skip to content

String truncation#135

Merged
gregkalapos merged 12 commits intoelastic:masterfrom
gregkalapos:StringTruncation
Mar 7, 2019
Merged

String truncation#135
gregkalapos merged 12 commits intoelastic:masterfrom
gregkalapos:StringTruncation

Conversation

@gregkalapos
Copy link
Copy Markdown
Contributor

Solves #129.

This is very Json.NET specific. Maybe we move to another json serializer later, but still I think it's nice to have this now, so I'd say: let's solve it for the current serializer 🚀

After this change:

  • by default we trim every string property
  • by default we also trim the values inside Dictionary<string, string>
  • in case we don't want to trim something we can mark it with [NoTruncationInJsonNet]
  • added tests.

get => _name;
set
{
if (value.Length > Consts.PropertyMaxLength)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't truncate here anymore. This happens before serialization.

agent.Tracer.CaptureTransaction("TestTransaction", "Test", (t) => { t.CaptureSpan(spanName.ToString(), "test", () => { }); });

Assert.NotNull(payloadSender.FirstSpan);
Assert.Equal(1024, payloadSender.FirstSpan.Name.Length);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't valid anymore. We can think about testing more in PayloadSender, but atm if we go though MockPayloadSender the trimming logic is not triggered. With Intake V2 there'll be a new PayloadSender, so I'd rather not focus on this now... we have good coverage by the other new tests I added.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ MockPayloadSender is only used to assert messages get send not necessarily how.

More isolated serialization tests would be helpful

@skynet2
Copy link
Copy Markdown
Contributor

skynet2 commented Feb 28, 2019

HI @gregkalapos ,
Correct me if I'm wrong but your code will truncate all string and all dictionaries.
According to json schema only some fields should be truncated, maybe it`s better to create attribute Truncate and truncate only that strings which are marked with that attribute?

@gregkalapos
Copy link
Copy Markdown
Contributor Author

HI @gregkalapos ,
Correct me if I'm wrong but your code will truncate all string and all dictionaries.
According to json schema only some fields should be truncated, maybe it`s better to create attribute Truncate and truncate only that strings which are marked with that attribute?

By default we truncate everything. I basically have an inverse logic here: by default we truncate, and if we don't want to truncate something we add [NoTruncationInJsonNet] to it. I do this, because at first glance I think we have more fields with a maxLength limit than fields without limit and in case we don't truncate the server would reject the whole payload, so it's better to be on the safe side.

I added the attribute to (I think) all fields where it's needed. It'd be nice to see #134 merged (since that moves some types that are effected here), and then I'd make Span.Context things also public and then I'll go through all fields again to make sure we don't miss anything we should not truncate.

@gregkalapos gregkalapos self-assigned this Mar 1, 2019
Copy link
Copy Markdown
Contributor

@SergeyKleyman SergeyKleyman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

{
internal class Stacktrace
{
[NoTruncationInJsonNet]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be cleaner to declare the reverse? [Truncate(int chars)] instead?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just spotted @skynet2 comment. I tend to agree, a default truncate feels implicit rather than explicit.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, since my logic seems to be strange for basically everyone, we go with the reverse.

Just for reference: the main reason for trimming by default was this:
If we trim by default and we forget to add [DontTruncate] -> We trim the string, but the first 1024 chars and everything else is visible.
If we don't trim and forget to add [Truncate] -> The server rejects the request (the whole transaction or span), nothing is visible.

Of course we should not forget to add attributes in the first place :)

Anyway, I accept that trimming by default would be strange. Reversed the logic.

/// Automatically applies <see cref="StringValueProvider"/> to <see cref="String"/> properties, and <see cref="StringDictionaryValueProvider"/>
/// to <see cref="Dictionary{String, String}"/> unless they are marked with <see cref="NoTruncationInJsonNetAttribute"/>.
/// </summary>
internal class StringTruncationValueResolver : CamelCasePropertyNamesContractResolver
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could potentially just be two JsonConverter e.g TrimmedStringJsonConverter and TagsJsonConverter (for the Dictionary<string, string> case handled here as well.

This would eschew the custom contract resolver and value providers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Agreed, it makes things easier. Added the two converters, removed the contract resolver and the value provider.

Removed NoTruncationInJsonNetAttribute, by default we do not trancate strings.
@gregkalapos
Copy link
Copy Markdown
Contributor Author

@Mpdreamz thanks for the feedback!

I pushed another version.

I created the 2 converters, and I simply attach those to the properties that we want to trim. I think this is the easiest and simplest solution.

Another option would be to activate those globally through JsonSerializerSettings, but in that case we need additional attributes to mark things for trimming. I think the current solution is better, because this way we don't have to check for attributes within the converter. Once the converter fires we can trim, since we only add those to properties that we wanna trim.

....will rebase this PR soon....

Copy link
Copy Markdown
Member

@Mpdreamz Mpdreamz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a super small suggestion, nothing worth blocking merging over 😄

Thanks for the rewrite to converters, this will also help when we later on move to a different serializer!

Mpdreamz and others added 4 commits March 7, 2019 11:25
Co-Authored-By: gregkalapos <gergo@kalapos.net>
Removed SpanNameLengthTest. since it's coverd in SerializationTests
@codecov-io
Copy link
Copy Markdown

codecov-io commented Mar 7, 2019

Codecov Report

Merging #135 into master will increase coverage by 3.43%.
The diff coverage is 89.47%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #135      +/-   ##
==========================================
+ Coverage   80.03%   83.46%   +3.43%     
==========================================
  Files          40       40              
  Lines        1292     1173     -119     
  Branches      207      187      -20     
==========================================
- Hits         1034      979      -55     
+ Misses        188      122      -66     
- Partials       70       72       +2
Impacted Files Coverage Δ
src/Elastic.Apm/Model/Payload/Stacktrace.cs 100% <ø> (ø) ⬆️
src/Elastic.Apm/Model/Payload/Error.cs 80% <ø> (-3.34%) ⬇️
src/Elastic.Apm/Api/Context.cs 100% <ø> (ø) ⬆️
src/Elastic.Apm/Model/Payload/Transaction.cs 97.38% <ø> (+11.66%) ⬆️
src/Elastic.Apm/Consts.cs 100% <ø> (ø) ⬆️
src/Elastic.Apm/Report/PayloadSender.cs 65.15% <ø> (+6.26%) ⬆️
src/Elastic.Apm/Helpers/StringExtensions.cs 100% <100%> (+100%) ⬆️
src/Elastic.Apm/Model/Payload/Span.cs 89.47% <100%> (+1.41%) ⬆️
...Report/Serialization/TrimmedStringJsonConverter.cs 50% <50%> (+50%) ⬆️
...stic.Apm/Report/Serialization/TagsJsonConverter.cs 90% <90%> (+90%) ⬆️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d40b0a4...749efff. Read the comment docs.

@gregkalapos gregkalapos merged commit 58e1302 into elastic:master Mar 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants