@@ -0,0 +1,3 @@
+Speed up :func:`csv.reader` by appending parsed fields to the output row
+without an extra reference-count round-trip (using the internal
+reference-stealing list append helper). Patch by Omkar Kabde.
5 changes: 2 additions & 3 deletions Modules/_csv.c
@@ -14,6 +14,7 @@ module instead.
 #endif

 #include "Python.h"
+#include "pycore_list.h" // _PyList_AppendTakeRef()
 #include "pycore_pyatomic_ft_wrappers.h"

 #include <stddef.h> // offsetof()
@@ -685,11 +686,9 @@ parse_save_field(ReaderObj *self)
         }
         self->field_len = 0;
     }
-    if (PyList_Append(self->fields, field) < 0) {
-        Py_DECREF(field);
+    if (_PyList_AppendTakeRef((PyListObject *)self->fields, field) < 0) {

Contributor

The list used here is self->fields. Can we guarantee that it is not mutated by concurrent threads?

Contributor Author

While `parse_save_field` is running, `self->fields` is reachable only through `self`, and `self` is locked by `Reader_iternext`'s critical section, so I don't think any other thread can mutate the list.

static PyObject *
Reader_iternext(PyObject *op)
{
    PyObject *result;
    Py_BEGIN_CRITICAL_SECTION(op);
    result = Reader_iternext_lock_held(op);
    Py_END_CRITICAL_SECTION();
    return result;
}
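
The locking argument above can be sketched outside CPython with a plain mutex. This is an analogy only, under stated assumptions: `Reader`, `reader_next`, and `reader_next_lock_held` are hypothetical names, and a `pthread_mutex_t` stands in for the per-object lock that `Py_BEGIN_CRITICAL_SECTION` takes on free-threaded builds. The real macros differ in detail.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_FIELDS 16

/* Hypothetical reader object; `lock` plays the role of the per-object lock. */
typedef struct {
    pthread_mutex_t lock;
    int fields[MAX_FIELDS];   /* reachable only through the Reader itself */
    size_t nfields;
} Reader;

/* Must only be called while r->lock is held: it mutates r->fields freely. */
static int reader_next_lock_held(Reader *r)
{
    r->nfields = 0;
    for (int i = 0; i < 3; i++) {
        assert(r->nfields < MAX_FIELDS);
        r->fields[r->nfields++] = i;   /* unsynchronized write, safe under the lock */
    }
    return (int)r->nfields;
}

/* Public entry point: take the lock, delegate, release the lock. */
static int reader_next(Reader *r)
{
    pthread_mutex_lock(&r->lock);
    int n = reader_next_lock_held(r);
    pthread_mutex_unlock(&r->lock);
    return n;
}
```

In this sketch `fields` is written only by the `_lock_held` helper, and that helper runs only under the lock, so two threads calling `reader_next` on the same reader cannot interleave their writes.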

Contributor Author

what do you think?

Contributor Author

cc @eendebakpt, just a nudge

Contributor

The object is locked and self->fields is not accessible via other paths so this is indeed safe.

If redesigning, I would probably make `fields` (and maybe some other struct fields) a local variable inside Reader_iternext_lock_held. The one minor thing to take care of is that `fields` currently acts as a guard against re-entrant calls. This is out of scope for the PR though.

I would drop the news entry (or at least omit the implementation details).

@omkar-334 This needs to be reviewed by a core dev, so it might take some time.

         return -1;
     }
-    Py_DECREF(field);
     return 0;
 }
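
To illustrate why both `Py_DECREF(field)` calls disappear in this hunk, here is a toy reference-counting model in plain C. It assumes nothing about CPython's real structures: `Obj`, `List`, `list_append`, `list_append_take_ref`, and the `refcount_ops` counter are hypothetical stand-ins for `PyObject`, `PyListObject`, `PyList_Append`, and `_PyList_AppendTakeRef`.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins; not CPython's definitions. */
typedef struct { int refcnt; } Obj;
typedef struct { Obj **items; size_t len, cap; } List;

static int refcount_ops = 0;   /* counts increfs and decrefs, for illustration */

static Obj *obj_new(void)
{
    Obj *o = malloc(sizeof *o);
    if (o != NULL) o->refcnt = 1;   /* the caller owns the initial reference */
    return o;
}

static void obj_incref(Obj *o) { o->refcnt++; refcount_ops++; }

static void obj_decref(Obj *o)
{
    assert(o->refcnt > 0);
    refcount_ops++;
    if (--o->refcnt == 0) free(o);
}

static int list_reserve(List *l)
{
    if (l->len < l->cap) return 0;
    size_t cap = l->cap ? l->cap * 2 : 4;
    Obj **items = realloc(l->items, cap * sizeof *items);
    if (items == NULL) return -1;
    l->items = items;
    l->cap = cap;
    return 0;
}

/* Non-stealing append: the list takes its own reference; the caller keeps theirs. */
static int list_append(List *l, Obj *o)
{
    if (list_reserve(l) < 0) return -1;
    obj_incref(o);
    l->items[l->len++] = o;
    return 0;
}

/* Stealing append: the caller's reference is consumed on success and on failure. */
static int list_append_take_ref(List *l, Obj *o)
{
    if (list_reserve(l) < 0) { obj_decref(o); return -1; }
    l->items[l->len++] = o;
    return 0;
}

/* Shape of the old code: append, then drop our own reference on both paths. */
static int save_field_old(List *fields, Obj *field)
{
    if (list_append(fields, field) < 0) { obj_decref(field); return -1; }
    obj_decref(field);
    return 0;
}

/* Shape of the new code: ownership moves to the list, so nothing is left to release. */
static int save_field_new(List *fields, Obj *field)
{
    if (list_append_take_ref(fields, field) < 0) return -1;
    return 0;
}
```

In this model the old shape performs one increment and one decrement per appended field, while the new shape performs none on the success path. Both leave the list holding the only reference.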
