Two tests fail on macOS suddenly due to pandas + pyarrow #3732

@seisman

Two tests started failing on macOS in the latest scheduled CI run (see https://github.com/GenericMappingTools/pygmt/actions/runs/12540348940), although yesterday's CI run passed (https://github.com/GenericMappingTools/pygmt/actions/runs/12532978920).

Details
=================================== FAILURES ===================================
___________________ test_vectors_to_arrays_pyarrow_datetime ____________________

    @pytest.mark.skipif(not _HAS_PYARROW, reason="pyarrow is not installed.")
    def test_vectors_to_arrays_pyarrow_datetime():
        """
        Test the vectors_to_arrays function with pyarrow arrays containing date32/date64
        types.
        """
        vectors = [
>           pd.Series(
                data=[datetime.date(2020, 1, 1), datetime.date(2021, 12, 31)],
                dtype="date32[day][pyarrow]",
            ),
            pd.Series(
                data=[datetime.date(2022, 1, 1), datetime.date(2023, 12, 31)],
                dtype="date64[ms][pyarrow]",
            ),
        ]

../pygmt/tests/test_clib_vectors_to_arrays.py:79: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/series.py:428: in __init__
    dtype = self._validate_dtype(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/generic.py:458: in _validate_dtype
    dtype = pandas_dtype(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/dtypes/common.py:1679: in pandas_dtype
    result = registry.find(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/dtypes/base.py:521: in find
    return dtype_type.construct_from_string(dtype)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'pandas.core.arrays.arrow.dtype.ArrowDtype'>
string = 'date32[day][pyarrow]'

    @classmethod
    def construct_from_string(cls, string: str) -> ArrowDtype:
        """
        Construct this type from a string.
    
        Parameters
        ----------
        string : str
            string should follow the format f"{pyarrow_type}[pyarrow]"
            e.g. int64[pyarrow]
        """
        if not isinstance(string, str):
            raise TypeError(
                f"'construct_from_string' expects a string, got {type(string)}"
            )
        if not string.endswith("[pyarrow]"):
            raise TypeError(f"'{string}' must end with '[pyarrow]'")
        if string == "string[pyarrow]":
            # Ensure Registry.find skips ArrowDtype to use StringDtype instead
            raise TypeError("string[pyarrow] should be constructed by StringDtype")
    
        base_type = string[:-9]  # get rid of "[pyarrow]"
        try:
>           pa_dtype = pa.type_for_alias(base_type)
E           NameError: name 'pa' is not defined

../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/arrays/arrow/dtype.py:218: NameError
_____________________ test_virtualfile_from_vectors_pandas _____________________

dtypes_pandas = (<class 'numpy.int8'>, <class 'numpy.int16'>, <class 'numpy.int32'>, <class 'numpy.int64'>, <class 'numpy.longlong'>, <class 'numpy.uint8'>, ...)

    def test_virtualfile_from_vectors_pandas(dtypes_pandas):
        """
        Pass vectors to a dataset using pandas.Series, checking both numpy and pyarrow
        dtypes.
        """
        size = 13
    
        for dtype in dtypes_pandas:
>           data = pd.DataFrame(
                data={
                    "x": np.arange(size),
                    "y": np.arange(size, size * 2, 1),
                    "z": np.arange(size * 2, size * 3, 1),
                },
                dtype=dtype,
            )

../pygmt/tests/test_clib_virtualfile_from_vectors.py:157: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/frame.py:650: in __init__
    dtype = self._validate_dtype(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/generic.py:458: in _validate_dtype
    dtype = pandas_dtype(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/dtypes/common.py:1679: in pandas_dtype
    result = registry.find(dtype)
../../../../micromamba/envs/pygmt/lib/python3.11/site-packages/pandas/core/dtypes/base.py:521: in find
    return dtype_type.construct_from_string(dtype)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'pandas.core.arrays.arrow.dtype.ArrowDtype'>
string = 'int8[pyarrow]'

    @classmethod
    def construct_from_string(cls, string: str) -> ArrowDtype:
        """
        Construct this type from a string.
    
        Parameters
        ----------
        string : str
            string should follow the format f"{pyarrow_type}[pyarrow]"
            e.g. int64[pyarrow]
        """
        if not isinstance(string, str):
            raise TypeError(
                f"'construct_from_string' expects a string, got {type(string)}"
            )
        if not string.endswith("[pyarrow]"):
            raise TypeError(f"'{string}' must end with '[pyarrow]'")
        if string == "string[pyarrow]":
            # Ensure Registry.find skips ArrowDtype to use StringDtype instead
            raise TypeError("string[pyarrow] should be constructed by StringDtype")
    
        base_type = string[:-9]  # get rid of "[pyarrow]"
        try:
>           pa_dtype = pa.type_for_alias(base_type)
E           NameError: name 'pa' is not defined
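
Both tracebacks stop at the same line inside pandas, where `pa` is used but was never bound. One way such a `NameError` can arise is a guarded optional import whose failure is swallowed at module load time. The sketch below only illustrates that general pattern; it is an assumption about the mechanism, not a confirmed reading of the pandas source, and the module name `_hypothetical_missing_dependency`, the flag `HAS_DEP`, and the functions `construct` and `demo` are made up for illustration.

```python
# Hypothetical stand-in for a library module that treats a dependency as optional.
try:
    # Stands in for an optional import (e.g. of pyarrow) that fails at load time.
    import _hypothetical_missing_dependency as pa

    HAS_DEP = True
except ImportError:
    # The failure is swallowed here, so the name `pa` is never bound.
    HAS_DEP = False


def construct(alias: str):
    """Mimic the failing call site: use `pa` without checking HAS_DEP first."""
    return pa.type_for_alias(alias)


def demo() -> str:
    """Return the error message produced when `pa` was never imported."""
    try:
        construct("int8")
    except NameError as err:
        return str(err)
    return "no error"


if __name__ == "__main__":
    print(demo())  # name 'pa' is not defined
```

If this pattern applies, the real problem would be the earlier, silent import failure rather than the line that raises.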

There are no changes in the PyGMT source code, and the differences between the two environments seem irrelevant:

diff old.txt new.txt
47c47
<     coverage                          7.6.9         py311h4921393_0          conda-forge
---
>     coverage                          7.6.10        py311h4921393_0          conda-forge
59c59
<     fonttools                         4.55.3        py311h4921393_0          conda-forge
---
>     fonttools                         4.55.3        py311h4921393_1          conda-forge
62c62
<     gdal                              3.10.0        py311h6d86783_10         conda-forge
---
>     gdal                              3.10.0        py311h32e851c_13         conda-forge
95c95
<     libabseil                         20240722.0    cxx17_hf9b8971_1         conda-forge
---
>     libabseil                         20240722.0    cxx17_h07bc746_2         conda-forge
114,115c114,115
<     libgdal-core                      3.10.0        hcf82b6a_10              conda-forge
<     libgdal-jp2openjpeg               3.10.0        h4ea06f0_10              conda-forge
---
>     libgdal-core                      3.10.0        h9ef0d2d_13              conda-forge
>     libgdal-jp2openjpeg               3.10.0        h5de94d9_13              conda-forge
122c122
<     libheif                           1.18.2        gpl_he913df3_100         conda-forge
---
>     libheif                           1.19.5        gpl_h297b2c4_100         conda-forge
136c136
<     libspatialindex                   2.0.0         h00cdb27_0               conda-forge
---
>     libspatialindex                   2.1.0         h57eeb1c_0               conda-forge
205c205
<     rtree                             1.3.0         py311hc46b6d3_2          conda-forge
---
>     rtree                             1.3.0         py311heb40887_3          conda-forge
215c215
<     sphinx-gallery                    0.18.0        pyhd8ed1ab_0             conda-forge
---
>     sphinx-gallery                    0.18.0        pyhd8ed1ab_1             conda-forge
246c246
<     zstd                              1.5.6         hb46c0d2_0               conda-forge
---
>     zstd                              1.5.6         hb46c0d2_0               conda-forge
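
A plain `diff` mixes version bumps with rebuilds of the same version. As a rough sketch, the two listings can also be compared field by field. This assumes `old.txt` and `new.txt` hold package listings with name, version, build, and channel columns, as in the lines above; the helpers `parse_listing` and `changed_packages` are hypothetical names, and the sample data is copied from the diff.

```python
def parse_listing(text: str) -> dict[str, tuple[str, str]]:
    """Map package name -> (version, build) from a whitespace-separated listing."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comment lines, and rows without name/version/build.
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        name, version, build = fields[:3]
        packages[name] = (version, build)
    return packages


def changed_packages(old: str, new: str) -> dict[str, tuple]:
    """Return {name: (old_entry, new_entry)} for packages whose entry differs."""
    before, after = parse_listing(old), parse_listing(new)
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(before.keys() | after.keys())
        if before.get(name) != after.get(name)
    }


# Sample rows taken from the diff above.
OLD = """
    coverage      7.6.9     py311h4921393_0    conda-forge
    zstd          1.5.6     hb46c0d2_0         conda-forge
"""
NEW = """
    coverage      7.6.10    py311h4921393_0    conda-forge
    zstd          1.5.6     hb46c0d2_0         conda-forge
"""

if __name__ == "__main__":
    for name, (old_entry, new_entry) in changed_packages(OLD, NEW).items():
        print(name, old_entry, "->", new_entry)
```

Entries whose version and build match exactly, such as `zstd` here, drop out of the result, which leaves only genuine version or build changes to inspect.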

A similar issue, which also occurs on macOS, has been reported to the pandas repository: pandas-dev/pandas#60573.
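
Because both failures stop at `pa` being undefined inside pandas rather than in PyGMT, a quick check in the failing environment is whether pyarrow itself imports cleanly. The snippet below is a minimal sketch of such a check, not part of the PyGMT test suite; `check_import` is a hypothetical helper name.

```python
import importlib
import traceback


def check_import(module_name: str) -> tuple[bool, str]:
    """Try to import a module; return (ok, version string or full traceback)."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Catch broadly so that any import-time failure is reported, not hidden.
        return False, traceback.format_exc()
    return True, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in ("pyarrow", "pandas"):
        ok, detail = check_import(name)
        print(f"{name}: {'OK ' + detail if ok else 'FAILED'}")
        if not ok:
            print(detail)
```

If the pyarrow import fails here, the traceback it prints should point closer to the underlying cause than the `NameError` raised later inside pandas.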

Labels: upstream (Bug or missing feature of upstream core GMT)
