Skip to content

Correct behavior for XML canonicalization with namespaces and nested elements#242

Merged
LoneRifle merged 4 commits intonode-saml:masterfrom
mthadley:canonicalization-with-ancestor-namespaces
Jul 8, 2022
Merged

Correct behavior for XML canonicalization with namespaces and nested elements#242
LoneRifle merged 4 commits intonode-saml:masterfrom
mthadley:canonicalization-with-ancestor-namespaces

Conversation

@mthadley
Copy link
Contributor

@mthadley mthadley commented Mar 17, 2022

Hey there! I wanted to start by saying thank you for the hard work maintaining this package.

I've been running into problems related to mismatched signature digests when validating a signed XML document, and I've narrowed the mismatch down to differences in how this library and another canonicalize the document. While the real world example involves SAML response XML, I've tried to instead demonstrate the behavior with a minimal failing test.

For the http://www.w3.org/TR/2001/REC-xml-c14n-20010315 method, I noticed that xml-crypto seems to be repeating namespaces that are already defined on an ancestor node.

If we start with this document...

<root xmlns:aaa='bbb'>
  <child1>
    <child2>
      <child3 aaa:foo='bar'>
      </child3>
    </child2>
  </child1>
</root>

...and like in the test, canonicalize the subset of the document with child1, we get...

<child1 xmlns:aaa="bbb">
  <child2>
    <child3 xmlns:aaa="bbb" aaa:foo="bar">
    </child3>
  </child2>
</child1>

Notice that xmlns:aaa gets pushed down from the root to child1, which I expect, but then the namespace is also repeated on child3.

Is this expected behavior? It seems incorrect, due to the duplication of xmlns:aaa.

@mthadley
Copy link
Contributor Author

In addition to the failing test, I've pushed up what I think may be a fix, in 27ca996.

@cjbarth
Copy link
Contributor

cjbarth commented Jul 7, 2022

I seems to me that duplicating the namespace doesn't result in non-conforming XML, just hard-to-read XML. If you apply a linter, the XML could clean right up. Having said that, I'm all for cleanly generated XML.

For example, you mention that this library results in:

<child1 xmlns:aaa="bbb">
  <child2>
    <child3 xmlns:aaa="bbb" aaa:foo="bar">
    </child3>
  </child2>
</child1>

After running xmllint --nsclean, the result is:

<child1 xmlns:aaa="bbb">
  <child2>
    <child3 aaa:foo="bar">
    </child3>
  </child2>
</child1>

That says to me that this library could do a better job and generating XML and the PR helps with that.

See https://stackoverflow.com/questions/27821554/how-to-eliminate-duplicate-xml-namespace-definitions

Copy link
Collaborator

@LoneRifle LoneRifle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, taking reference from @cjbarth and node-saml/node-saml#36

@LoneRifle LoneRifle merged commit 49f93d8 into node-saml:master Jul 8, 2022
@mthadley mthadley deleted the canonicalization-with-ancestor-namespaces branch July 9, 2022 00:07
@cjbarth cjbarth added the bug label May 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants