Skip to content

Ensure that dict keys are strings when json encoding#40

Open
mattboehm wants to merge 2 commits into
smiley-debugger:masterfrom
mattboehm:json-dict-key
Open

Ensure that dict keys are strings when json encoding#40
mattboehm wants to merge 2 commits into
smiley-debugger:masterfrom
mattboehm:json-dict-key

Conversation

@mattboehm
Copy link
Copy Markdown
Contributor

If dict keys are not strings, a TypeError is thrown while trying to encode that dict to JSON.
While ints as dict keys seem to work fine, a tuple of ints does not (i.e. d = {(1,2): 3}

This issue was referenced by @mNantern in #22 but I've also independently run into this.

@mattboehm
Copy link
Copy Markdown
Contributor Author

A part of me thinks that smiley should be very liberal about catching exceptions when encoding this json, logging the error and replacing the string with "" or "<SMILEY_ERROR>".

I'm concerned that edge cases will continue to pop up, and each one will cause a recording to catastrophically abort.

@dhellmann
Copy link
Copy Markdown
Collaborator

I like the idea of protecting against this, but I would also prefer to understand the nature of the problem. The dictionary in question holds the local variables, and all of those should have string names. I haven't been able to reproduce the bug, but if you can it would be good to get more detail about which value is actually causing the error so we can add a unit test.

@mattboehm
Copy link
Copy Markdown
Contributor Author

I believe the problem is not with the root dict, but when a local var's value contains a dict with tuple keys. While testing this issue, I ran against a test program that contained the line:

d = {1:2,(3,0):4}

This caused local_vars to be {"d": {1:2,(3,0):4}}, which triggered the exception.

I can add unit tests to this PR (either for jsonutils, db.DB.trace(), or both.)

@dhellmann
Copy link
Copy Markdown
Collaborator

Ah, that makes much more sense, sure.

@dhellmann
Copy link
Copy Markdown
Collaborator

This looks good for protecting us against JSON encoding errors. I wonder if we need to deal with the representation differently in the UI, though. Would it be confusing to have the key shown as a string when it was really a tuple?

I had though of looking at msgpack as a performance improvement. I wonder if it would solve this problem more cleanly?

@mattboehm
Copy link
Copy Markdown
Contributor Author

Another format could possibly help.

msgpack-python does not support this out of the box; tuples are converted to lists when packing, so when trying to unpack, it tries to set the dict key to a list, creating a python error:

>>> packed = msgpack.packb({(1, 2): 3})
>>> msgpack.unpackb(packed)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "msgpack/_unpacker.pyx", line 139, in msgpack._unpacker.unpackb (msgpack/_unpacker.cpp:139)
TypeError: unhashable type: 'list'
>>> 

However, msgpack has the concept of extended types, so it's possible that implementing this on the msgpack side would be a bit cleaner than trying to shoehorn type annotations into the json encoding.

The only thing that'd work out of the box for this particular case is pickle, but that comes with its own set of headaches.

I think for the short term, it makes sense to make the JSON encoding resilient to catastrophic errors, but not spending too much time on the more minor issues until other formats can be experimented with more.

FWIW, the main advantage of JSON for me is that it is easy to read and closely matches the Python syntax. I also get the sense that the JSON encoding is not the biggest performance issue, but it's hard to know until something else is tested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants