Skip to content

Fix #94: allow pickling Logger objects#96

Merged
pitrou merged 4 commits into
cloudpipe:masterfrom
pitrou:pickle_loggers
May 31, 2017
Merged

Fix #94: allow pickling Logger objects#96
pitrou merged 4 commits into
cloudpipe:masterfrom
pitrou:pickle_loggers

Conversation

@pitrou

@pitrou pitrou commented May 31, 2017

Copy link
Copy Markdown
Member

No description provided.

@pitrou pitrou mentioned this pull request May 31, 2017
@codecov-io

codecov-io commented May 31, 2017

Copy link
Copy Markdown

Codecov Report

Merging #96 into master will increase coverage by 0.3%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master      #96     +/-   ##
=========================================
+ Coverage   79.88%   80.18%   +0.3%     
=========================================
  Files           2        2             
  Lines         522      530      +8     
  Branches      109      110      +1     
=========================================
+ Hits          417      425      +8     
  Misses         75       75             
  Partials       30       30
Impacted Files Coverage Δ
cloudpickle/cloudpickle.py 80.07% <100%> (+0.3%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 91eae7b...26f91d7. Read the comment docs.

Comment thread tests/cloudpickle_test.py

dumped = cloudpickle.dumps(logger)

code = """if 1:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use textwrap.dedent here to fix the indentation instead of the if.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

textwrap.dedent is a bit more annoying to use (you have to make sure the first line has the same indent as the other ones, which this idiom doesn't require).

@pitrou

pitrou commented May 31, 2017

Copy link
Copy Markdown
Member Author

Interesting: the dispatch dict doesn't support old-style classes, which is why the test fails on 2.6.

@mrocklin

Copy link
Copy Markdown
Contributor

So, my experience when I run into the issue of cloudpickle not being able to serialize a logger is that cloudpickle is actually trying to serialize everything. The solution in these cases is almost always to find out why. It's probably still correct for cloudpickle to learn how to serialize logger objects, but we may want to ask ourselves why this is occurring.

@pitrou

pitrou commented May 31, 2017

Copy link
Copy Markdown
Member Author

It's probably still correct for cloudpickle to learn how to serialize logger objects, but we may want to ask ourselves why this is occurring.

It's occurring simply because Logger objects don't have a custom __reduce__ method, so pickle tries to reconstruct them structurally (which entails also reconstructing the handlers attached to them, which is a slippery slope as often those have I/O objects attached to them).

@pitrou

pitrou commented May 31, 2017

Copy link
Copy Markdown
Member Author

Note reconstructing by calling getLogger(name) is really the right thing to do, as loggers really are lazily-instantiated named singletons. Their configuration needn't be serialized and may well be different in different processes or runs.

@pitrou

pitrou commented May 31, 2017

Copy link
Copy Markdown
Member Author

Are there any other comments before this get merged?

@pitrou pitrou merged commit 5eefe28 into cloudpipe:master May 31, 2017
@pitrou pitrou deleted the pickle_loggers branch May 31, 2017 20:08
ghost pushed a commit to dbtsai/spark that referenced this pull request Aug 22, 2017
## What changes were proposed in this pull request?

Based on apache#18282 by rgbkrk this PR attempts to update to the current released cloudpickle and minimize the difference between Spark cloudpickle and "stock" cloud pickle with the goal of eventually using the stock cloud pickle.

Some notable changes:
* Import submodules accessed by pickled functions (cloudpipe/cloudpickle#80)
* Support recursive functions inside closures (cloudpipe/cloudpickle#89, cloudpipe/cloudpickle#90)
* Fix ResourceWarnings and DeprecationWarnings (cloudpipe/cloudpickle#88)
* Assume modules with __file__ attribute are not dynamic (cloudpipe/cloudpickle#85)
* Make cloudpickle Python 3.6 compatible (cloudpipe/cloudpickle#72)
* Allow pickling of builtin methods (cloudpipe/cloudpickle#57)
* Add ability to pickle dynamically created modules (cloudpipe/cloudpickle#52)
* Support method descriptor (cloudpipe/cloudpickle#46)
* No more pickling of closed files, was broken on Python 3 (cloudpipe/cloudpickle#32)
* ** Remove non-standard __transient__check (cloudpipe/cloudpickle#110)** -- while we don't use this internally, and have no tests or documentation for its use, downstream code may use __transient__, although it has never been part of the API, if we merge this we should include a note about this in the release notes.
* Support for pickling loggers (yay!) (cloudpipe/cloudpickle#96)
* BUG: Fix crash when pickling dynamic class cycles. (cloudpipe/cloudpickle#102)

## How was this patch tested?

Existing PySpark unit tests + the unit tests from the cloudpickle project on their own.

Author: Holden Karau <holden@us.ibm.com>
Author: Kyle Kelley <rgbkrk@gmail.com>

Closes apache#18734 from holdenk/holden-rgbkrk-cloudpickle-upgrades.
andrewdhicks added a commit to opendatacube/datacube-core that referenced this pull request Sep 5, 2017
jeremyh pushed a commit to opendatacube/datacube-core that referenced this pull request Sep 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants