Skip to content

json+struct codec should take pointers to pointers to **non-const** buffers #3425

@bhaller

Description

@bhaller

Making a new issue for this at the request of @jeromekelleher since #3306 was merged and closed. Tagging @petrelharp since he'll be a client of this API too, in pyslim.

Copy-paste from #3306:

In the declaration:

tsk_json_struct_metadata_get_blob(const char *metadata, tsk_size_t metadata_length, const char **json, tsk_size_t *json_length, const uint8_t **blob, tsk_size_t *blob_length)
can const uint8_t **blob be uint8_t **blob without the const? SLiM sometimes actually munges the incoming mutation metadata (remapping population indices if the user has asked for that, prior to the main metadata-parsing code working with the data). I could of course make a copy of the binary blob, but since it is expected to be quite large I'd prefer not to, so as to not push the memory high-water mark unnecessarily. It doesn't seem like tskit has any special interest in keeping that metadata const, right? I'd have to cast away the const to modify it, though, which according to the C++ standard is "undefined behavior" although I imagine it would probably work in practice.

For consistency the JSON data should perhaps also not be const, but I don't care about that one way or the other; SLiM doesn't need to modify that, and if it did, I could just copy the data since that is small.

(If you want the mainstream API to be const for safety, perhaps you could provide a non-const version of it for those who need it?)

In followup discussion, @jeromekelleher wrote:

These have been changes to const char **blob, for consistency. The const is necessary here for us to compile and signals that the function doesn't alter the buffers.

I replied (edited/expanded a bit):

It's a bit annoying to require a caller to pass a pointer to const just because the function itself doesn't alter the buffers. If it was just a question of a const char * parameter and the user passing in a char * to the function, that would be fine; the compiler would automatically accept the char * for the parameter, rather than requiring the caller to provide a const char *. But because of the extra level of indirection, a const char ** parameter does not automatically accept a char ** parameter, requiring the caller to do some faffing about with casts and such. I would argue that it is not standard practice to put the const there – when passing a pointer to a pointer to data, it is already assumed that the function is not supposed to muck around inside the data that is ultimately pointed to, and that guarantee is not typically made with const in APIs (and in many cases cannot be made, in C/C++ syntax, such as guaranteeing that a function won't modify stuff that is pointed to by pointers contained within a const structure that it is passed in; C and C++ don't have any concept of "deep" constness). So I don't think you ought to have that const there. But it's not the end of the world; I thought initially that dealing with this API issue would require SLiM to cast away const, producing undefined behavior, but that is not the case. It's just annoying and non-standard (IMHO).

(I'd question the decision to use char instead of uint8_t, though; they are not the same thing, and I think the semantics of uint8_t are more appropriate – for the binary data, at least.)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions