Class-level attributes dropped during serialization across processes #398

Description

@jrbourbeau

I came across a use case where class-level attributes don't seem to survive a serialization / deserialization round trip when serialization and deserialization happen in different processes.

For example, if I define a Foo class in a test_module.py Python module that keeps track of state through a class-level bar attribute:

# test_module.py

class Foo:
    bar = []

    def append(self, value):
        self.bar.append(value)

and then, in a separate script, create a Foo instance (named foo), append a value to the class-level bar list, and send foo to a subprocess, the state stored in foo.bar isn't included in the serialized form of foo:

import subprocess
import sys

import cloudpickle
from test_module import Foo

# Create Foo instance, add a value to the Foo.bar class-level
# attribute, and then display the value
foo = Foo()
foo.append(123)
print(f"(main process) {foo.bar = }")

# Serialize foo, then deserialize it and display foo.bar in a subprocess.
# The pickled bytes are embedded in the command as a bytes literal.
s = cloudpickle.dumps(foo)
command = ("import cloudpickle; "
           f"result = cloudpickle.loads({s!r}); "
           "print(f'(subprocess) foo.bar = {result.bar}')")
subprocess.call([sys.executable, '-c', command])

This outputs:

(main process) foo.bar = [123]
(subprocess) foo.bar = []

Though I would have expected the output to be:

(main process) foo.bar = [123]
(subprocess) foo.bar = [123]

For reference, this issue originated in distributed (xref dask/distributed#4233). The above was run using cloudpickle 1.6.0.
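
A possible reason for this, offered as an assumption and not verified against cloudpickle 1.6.0 internals: Python's pickling machinery normally stores an instance's own state (its `__dict__`) and refers to an importable class by name, so anything that lives only on the class is not part of the payload. The sketch below does no pickling at all; it only shows where the value from the example ends up. The names `instance_state` and `class_level_bar` are illustrative.

```python
class Foo:
    bar = []  # class-level attribute: stored on Foo, shared by all instances

    def append(self, value):
        # self.bar resolves to Foo.bar, so this mutates the class-level list
        self.bar.append(value)


foo = Foo()
foo.append(123)

instance_state = vars(foo)          # per-instance __dict__
class_level_bar = vars(Foo)["bar"]  # the list stored on the class itself

print(f"{instance_state = }")
print(f"{class_level_bar = }")
print(f"{foo.bar is Foo.bar = }")
```

Here `instance_state` is an empty dict while `class_level_bar` holds the appended value, so a serializer that only captures instance state would have nothing from `bar` to send. If that assumption holds, a subprocess that re-imports test_module would see a fresh `Foo.bar`.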
