Remove deallocate and reallocate from GCAllocator #8554
Thanks for your pull request and interest in making D better, @pbackus! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Testing this PR locally: If you don't have a local development environment set up, you can use Digger to test this PR: dub run digger -- build "master + phobos#8554"
What about a use case where you want to use the GC as a standard allocator, but get the benefits of having the memory scanned because it's in the GC? I'm wondering if there might be template config flags we should use on the GCAllocator. I'd also like to see a way to generate blocks with the NO_SCAN bit set.
We can add a new allocator for this if it's actually something people really want.
Out of scope for this PR, but please file an enhancement request on bugzilla so this doesn't get lost.
What about making deallocate absent entirely? I would interpret that if an allocator does not have deallocate, its memory does not need to be freed manually.
I subjectively disagree with this paragraph from the issue description rationalizing this change:
I don't think so. The definition is an implementation of an allocator backed by the GC heap.
A no-op deallocate would still satisfy the interface. In support of my interpretation: there is precedent for having a no-op deallocate.

I cannot find any precedent for a no-op deallocate.

The only thing you can conclude from the absence of a deallocate method is that you cannot deallocate through this allocator.
Of course. But that's a pretty niche requirement; most data structures are perfectly capable of working with garbage collection.
The DDoc comment describes it as "D's built-in garbage-collected allocator." That seems pretty unambiguous to me.
Yeah, well, not really? How do you define what it means when the method returns true? I agree that it's not great because it can still cause regressions. Memory that was previously freed will now not be freed, potentially for a long time if some garbage somewhere (a dangling pointer that is never going to be dereferenced) is, or is interpreted as, a reference to it, and it will potentially result in an unbounded memory leak if the application chains these events, forming an accidental ever-growing linked list.
Right, that's not what I meant and not related to what I was trying to say.
Yes, I think garbage-collector backed allocators are a bit of a special case here.
Um, I think looking at the allocator's interface is literally the core of Design by Introspection, which is what std.allocator is built on. But it's an interesting idea; is there precedent for this?
Sorry, what does that have to do with garbage collection? We may be talking past each other.
I don't understand how you're reaching that conclusion just from that description. By itself, it seems at odds with the general description of std.allocator allocators, which aim to facilitate manual memory management, right?
My understanding is that GCAllocator is meant to be D's built-in GC exposed through the standard allocator interface.
That makes sense. So it comes back to: if the question is "can this allocator deallocate?", and the answer is "yes, but it's not going to happen right away", how do we represent this in the allocator interface in a meaningful way? Generalizing that to "not immediately, but it's not something you should worry about", as you mentioned, does make sense to me... Though we can already represent the "don't worry about it" part in the current allocator interface: just have deallocate do nothing and return true.

As such, a more correct approach to me here seems to be to add a layer which adds a no-op deallocate on top of a GC-backed allocator. Does that make sense?

Also, perhaps it would be useful regardless to have an allocator backed by the GC heap that does call GC.free on deallocation.
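To make the idea concrete, here is a minimal sketch of what a GC-backed allocator with a deliberately no-op deallocate could look like. GCNoFree is an invented name (not part of Phobos or this PR), and the alignment value is an assumption.

```d
// Hypothetical sketch: a GC-backed allocator whose deallocate does
// nothing, leaving reclamation entirely to the garbage collector.
import core.memory : GC;

struct GCNoFree
{
    enum uint alignment = 16; // assumption; real code should use platformAlignment

    void[] allocate(size_t n)
    {
        auto p = GC.malloc(n); // block is scanned and collected by the GC
        return p is null ? null : p[0 .. n];
    }

    bool deallocate(void[] b)
    {
        // Deliberate no-op: report success and let the GC reclaim the
        // block once it becomes unreachable.
        return true;
    }
}
```

Whether reporting success from a method that frees nothing is acceptable is exactly the point under debate in this thread.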
Another reason I'm uneasy about this change: We have the following:
We want to add the following:
However, I think that might be incompatible with things like round-robin allocators, which could try to deallocate a piece of memory from all child allocators in turn until one returns true.
I think the relevant question to ask is: "should the client of this allocator call deallocate?"

Since we're using DbI, the presence or absence of a deallocate method is how an allocator answers that question. For the GC, we know the answer should be "no", because the entire point of GC is that you don't free memory manually. So in order to communicate that answer to clients of GCAllocator, we should remove deallocate.

Technically, it would not break anything to implement a no-op deallocate, but it would communicate the wrong answer.
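The DbI pattern described here can be sketched as follows: a generic client frees a block only when the allocator actually declares deallocate, and otherwise does nothing. The release helper is an invented example, not a Phobos function.

```d
// Sketch of the DbI pattern: introspect the allocator for deallocate
// and only call it when it exists.
import std.traits : hasMember;

void release(Allocator)(ref Allocator alloc, ref void[] block)
{
    static if (hasMember!(Allocator, "deallocate"))
    {
        alloc.deallocate(block); // manual deallocation is supported
    }
    // else: nothing to do; the allocator (e.g. a GC) reclaims memory itself
    block = null;
}
```

Under this pattern, removing deallocate from GCAllocator makes such clients skip manual freeing automatically, with no special-casing.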
Yes, perhaps. @schveiguy suggested the same thing in an earlier comment. I'm not opposed to it at all, as long as the name makes the new allocator's behavior clear.
I'm not sure this is useful. It's also not how DbI is used for other methods. For shim allocators, introspection is mainly done so that the primitive's presence is propagated up the chain. For other cases, the presence is used like: "Does the underlying object support operation X? If yes, great, I'll use it! If not, oh well, I'll do something less efficient or signal an error." In other words, when a primitive is absent, the client falls back to a slower path or an error, rather than concluding the operation is unnecessary.

I understand that you're trying to say that not calling deallocate is harmless for a GC-backed allocator, but this sort of thing ought to be explicit.

(Edit: deleted a message I need to think more about)
Granted, but I'm not talking about other methods, I'm talking about deallocate specifically.
When the user passes GCAllocator to a container, the absence of deallocate already tells the container everything it needs to know.
You've completely lost me here. As far as I can tell, transmission of information is entirely one-way, even with no shim layer.
Okay. But I think it would also be a bug to ignore a deallocate method that is present.
No, that's not enough. The user should be able to treat the container as a black box. The container describes its allocator requirements by mandating that the given allocator implements the needed primitives. Different containers may have different needs:
Currently we're not looking at the distinction between the last two points. For the first two points, this proposal enables accidentally passing an allocator which actually doesn't (and will never) deallocate to the second type of container, and the program will happily compile. I think this is not OK, and we need the design to address it. The current design does: absence of deallocate turns that mistake into a compile-time error.

You have not addressed this BTW: #8554 (comment)

In short, this is a bad idea and makes the std.allocator interface less useful and more complicated for the sake of one corner case.
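The compile-time enforcement argued for here can be sketched like this: a container that genuinely requires manual deallocation states that requirement in its template constraint, so an allocator without deallocate is rejected when the container is instantiated. FixedBuffer is an invented example type, not from Phobos.

```d
// Sketch: a container whose constraint demands the deallocate primitive.
import std.traits : hasMember;

struct FixedBuffer(Allocator)
    if (hasMember!(Allocator, "allocate") && hasMember!(Allocator, "deallocate"))
{
    Allocator alloc;
    void[] payload;

    void reserve(size_t n) { payload = alloc.allocate(n); }

    ~this()
    {
        // The constraint guarantees deallocate exists, so this always compiles.
        if (payload !is null)
            alloc.deallocate(payload);
    }
}
```

With this design, instantiating FixedBuffer with an allocator that lacks deallocate fails to compile, rather than silently leaking.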
Yes, sorry, I already retracted that post.
Here is a proposal for a way forward:
Sure, sounds good. I'll submit a new PR for the unsafe one.
I'm not opposed on principle, but I don't see any reason to do this now, since it doesn't block anything. Either way, it's a separate PR.
Agreed, just making sure we have a path forward for this use case. Thank you!
I guess we should merge that first.
It has been proposed to change the behavior of GCAllocator to more closely match that of the built-in GC (in particular, to make it @safe). In preparation for that change, this change makes the original behavior of GCAllocator available under a new name, GCHeapMallocator. When GCAllocator is changed, users may choose to either adopt the new behavior or migrate to GCHeapMallocator and retain the original behavior. For the rationale behind these changes, see PR dlang#8554 and issue 23318.
In preparation for the removal of GCAllocator.deallocate and GCAllocator.reallocate, make that functionality available via a new allocator with a different name. Users of GCAllocator that wish to continue using manual @System memory management instead of automatic @safe memory management can switch to GCHeapMallocator. For the rationale behind these changes, see PR dlang#8554 and issue 23318.
I'm still trying to figure out what reason one would have to want to do that.
That's the problem: if client code forgets to deallocate, it will be wrong for every other allocator. The documentation does say not to implement deallocate as a no-op. Since client code has to know when to possibly deallocate, why would one ever want to use GCAllocator in a context that isn't benchmarking it against other allocators?

There is, however, something to be said for using an allocator that can't possibly deallocate, since then we'll never have aliasing issues when freeing - it can't happen. But reallocation still remains a problem in that case.
Those containers aren't allocator-aware; any that are need to deallocate.
Same reasons you'd want to use GC over manual memory management in any other context. The most obvious one is memory safety. If this PR is merged, then containers instantiated with the new version of GCAllocator can be used in @safe code.
That burden exists regardless of whether this PR is merged. (If desired, I can split that addition out into a separate PR.)
Reallocation will be handled the same way clients handle any allocator without reallocate: by allocating a new block and copying.
The point is that if this PR is merged, an allocator-aware container instantiated with the new version of GCAllocator will never attempt manual deallocation in the first place.
This is necessary to make clients of GCAllocator (e.g., generic containers) usable in @safe code. Memory allocated by GCAllocator will still be freed by the GC when it is no longer reachable.

Rejected alternatives:
- Making GCAllocator.deallocate a no-op method that always returns false. The documentation in std.experimental.allocator.building_blocks specifically says not to do that.
- Special-casing client code for GCAllocator to avoid calling its 'deallocate' method, while still calling 'deallocate' for other allocators.

'deallocate' has been documented as an optional method since std.experimental.allocator was introduced in 2015 (commit 8b249a6), so this change will not break code that correctly adheres to the documented allocator interface (i.e., checks for the presence of 'deallocate' before calling it).

Fixes issue 23318.
Same rationale as for GCAllocator.deallocate. See previous commit for details.
@atilaneves Ping, anything I can do to help move this and #8555 forward? Would be happy to discuss options at this weekend's online beerconf, if that would help.
@atilaneves Ping. |
I missed the first ping somehow, sorry about that. We should probably chat about this; feel free to email me.
@pbackus @atilaneves have you reached a consensus on this?
Not yet, since we haven't talked, and I really think we should. |
Due to the issues outlined in this thread, I am no longer prioritizing work on this PR.
I think @CyberShadow's concerns about requiring deallocation can be addressed by checking if the allocator being used is the GC.
How does one do that in case it is under some layer? AIUI, this information needs to be propagatable somehow, either explicitly or implicitly (which is what this discussion was trying to find a way of doing). |
One probably can't, and aside from allocators like AffixAllocator, which could be treated as the GC when backed by it, most allocators can't anyway. But it's an interesting idea to propagate the fact that all memory would be GC-allocated even if the allocator isn't exactly the GC.
In Phobos we have things like make and dispose. What about changing allocations that don't need explicit freeing to use a different function name, so the distinction is visible at the call site?
The more I think about it, the more having GCAllocator simply not deallocate seems like the right call.
We should ultimately provide both options, but I expect the default to explicitly not deallocate too.
People will be very surprised when their blocks are not deallocated when they call deallocate.
Even when they find out it's because it was the GC all along?
I'm wondering if there is perhaps some clever DbI-aligned way to detect whether deallocate actually does anything.
According to the official docs, deallocate is an optional primitive and clients must check for its presence. Of course, the fact that everyone assumes a present deallocate actually frees memory is another matter.
There isn't. The closest you can get is checking whether deallocate is a static method.
That's good enough, right? Assuming we aren't going to deprecate calling static methods via an instance.
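The check being discussed can be written with compile-time reflection. This is a sketch using invented toy types (StaticFree, InstanceFree), not real Phobos allocators; note that Phobos allocators do not all follow the static-method convention, which is part of why the check is fragile.

```d
// Sketch: detect whether Allocator.deallocate is declared static,
// using __traits-based introspection.
enum bool hasStaticDeallocate(Allocator) =
    __traits(hasMember, Allocator, "deallocate") &&
    __traits(isStaticFunction, __traits(getMember, Allocator, "deallocate"));

struct StaticFree   { static bool deallocate(void[] b) { return true; } }
struct InstanceFree {        bool deallocate(void[] b) { return true; } }

static assert( hasStaticDeallocate!StaticFree);
static assert(!hasStaticDeallocate!InstanceFree);
```

Since static methods can still be called through an instance, existing call sites keep compiling either way; only the introspection result changes.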
It works, but it's very easy for an allocator (or range) author to forget to mark the relevant method as static.
We have people asking all the time why the GC doesn't clean up their no-longer-referenced data. You think people will not be questioning why deallocate does nothing? I also very much dislike the idea of deallocate doing nothing.
If deallocate is a static method, that doesn't indicate it doesn't do anything. Perhaps if it's static and pure?

In any case, the GC doesn't fit in with allocators at all IMO, because the whole dynamic of how to allocate memory is different. You allocate and then you are done. Concepts that work one way with explicit memory management work differently with GC.
The thing is, it's the official docs for something that is still in std.experimental.
Bump allocators can implement deallocate only for the most recent allocation, and can reclaim everything at once with deallocateAll.
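A minimal sketch of that bump-allocator behavior (an invented toy type with a fixed 1 KiB region, not a Phobos allocator):

```d
// Sketch: deallocate can only roll back the topmost allocation;
// deallocateAll resets the whole region.
struct Bump
{
    ubyte[1024] storage;
    size_t used;

    void[] allocate(size_t n)
    {
        if (used + n > storage.length)
            return null;
        void[] b = storage[used .. used + n];
        used += n;
        return b;
    }

    bool deallocate(void[] b)
    {
        // Succeeds only if b is the most recent allocation.
        if (b.ptr + b.length == cast(void*) storage.ptr + used)
        {
            used -= b.length;
            return true;
        }
        return false;
    }

    bool deallocateAll()
    {
        used = 0; // the whole region becomes reusable
        return true;
    }
}
```

So a bump allocator's deallocate is best-effort by design: it returns false for anything but the topmost block, which is allowed by the primitive's documented contract.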
That would lead me to believe that not deallocating straight away is ok since we already kind of do it right now (the GC).
That's how I was tending to think until @pbackus made the point that if deallocate silently does nothing, the interface is lying to its callers.
Yes, as we discussed over email, the allocator API will require significant changes in order to allow non-GC allocators to work in @safe code. Even if we did want to develop the new API in Phobos, I would recommend doing it in a separate package, so the existing API is left undisturbed while the new design is iterated on.
Actually, you can define the new API in a separate package today.
Submitted as a draft to see what breaks on Buildkite.
This is going to be pretty disruptive, but better to rip the band-aid off now, while allocators are still in std.experimental, than to be stuck with a design mistake forever.

TODO:
- Fix atilaneves/automem (green but will break)
  - Depends on fixes to make/dispose in this PR