xarray>=2025.1.2 has inconsistent treatment of np.datetime64 and datetime #10220

Open
DahnJ opened this issue Apr 11, 2025 · 1 comment

DahnJ commented Apr 11, 2025

What is your issue?

In xarray prior to 2025.1.2, constructing a coordinate from np.datetime64 or from datetime produced the same dtype. For example,

```python
from datetime import datetime
import numpy as np
import xarray as xr

a = xr.DataArray(coords={'time': [np.datetime64('2020')]}, dims=['time'])
b = xr.DataArray(coords={'time': [datetime(2020, 1, 1)]}, dims=['time'])

print(a['time'].dtype)
print(b['time'].dtype)
```

would yield

```
datetime64[ns]
datetime64[ns]
```

From version 2025.1.2, this is no longer true, and we get

```
datetime64[s]
datetime64[ns]
```

Together with the equality check failing (pandas-dev/pandas#55694), this means comparing DataArrays and Datasets is now subtly dependent on whether datetime or np.datetime64 was used to construct them.

Is this the intended behaviour? Is there a recommended way to deal with this difference beyond requiring users to always specify np.datetime64(..., 'ns')?

DahnJ added the "needs triage" label (Issue that has not been reviewed by xarray team member) on Apr 11, 2025

spencerkclark (Member) commented

In some sense, yes. In alignment with pandas, we intentionally relaxed the previous automatic casting of all datetime-like objects passed to xarray to nanosecond resolution. For certain use cases it is beneficial to be able to use coarser resolution times. However, we agree that the dtype equality issue for DatetimeIndex-backed variables is a particularly awkward unintended consequence (see also #10045). Ideally this could be addressed upstream.

The permitted resolutions are now "s", "ms", "us", and "ns". Following pandas, np.datetime64 values with a resolution coarser than "s" are cast to "s", values with a resolution finer than "ns" are cast to "ns", and anything in between passes through unmodified. Xarray similarly defers to pandas on how to translate datetime.datetime objects into np.datetime64 values. For better or worse, pandas still always chooses "ns" resolution in that context.
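
To make the casting rule above concrete, here is a minimal sketch written with NumPy only. The helper `expected_unit` and the two constants are hypothetical names introduced for illustration; this is not xarray's or pandas' actual implementation, only a restatement of the rule as described in this comment.

```python
import numpy as np

# Resolutions that, per the rule described above, pass through unmodified.
SUPPORTED_UNITS = ("s", "ms", "us", "ns")

# NumPy datetime64 units ordered from coarsest to finest.
UNIT_ORDER = ("Y", "M", "W", "D", "h", "m", "s", "ms", "us", "ns", "ps", "fs", "as")


def expected_unit(value):
    """Hypothetical helper: the unit the rule above predicts for a np.datetime64 input."""
    dtype = np.asarray(value).dtype
    if dtype.kind != "M":
        # datetime.datetime objects follow a separate, pandas-defined path.
        raise TypeError("expected a np.datetime64 scalar or array")
    unit, _ = np.datetime_data(dtype)
    if unit == "generic":
        raise ValueError("datetime64 value has no unit")
    if unit in SUPPORTED_UNITS:
        return unit  # passes through unmodified
    if UNIT_ORDER.index(unit) < UNIT_ORDER.index("s"):
        return "s"  # coarser than seconds: cast to "s"
    return "ns"  # finer than nanoseconds: cast to "ns"


print(expected_unit(np.datetime64("2020")))                     # s
print(expected_unit(np.datetime64("2020-01-01T00:00:00.000")))  # ms
print(expected_unit(np.datetime64(0, "ps")))                    # ns
```

The first call corresponds to the `np.datetime64('2020')` case from the report: a year-resolution value ends up at "s" rather than "ns".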

To the extent that we support datetime values with different resolutions, it seems somewhat inevitable that without special care users will end up with DataArrays of different dtypes, but it would be nice if equality was not sensitive to dtype.
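
Until equality is insensitive to dtype, one possible user-side workaround is to normalize the time coordinate to a single resolution before comparing. The sketch below is only an illustration: `with_time_unit` is a hypothetical helper, not an xarray API, and it assumes the coordinate is named "time" and holds datetime64 values. Requesting a unit other than "ns" presumably only has an effect on xarray versions that support non-nanosecond resolutions.

```python
from datetime import datetime

import numpy as np
import xarray as xr


def with_time_unit(obj, unit="ns", coord="time"):
    """Hypothetical helper: return a copy of obj with coord cast to datetime64[unit]."""
    values = obj[coord].values.astype(f"datetime64[{unit}]")
    # Re-assign the coordinate along its own dimension with the cast values.
    return obj.assign_coords({coord: (coord, values)})


a = xr.DataArray(coords={"time": [np.datetime64("2020")]}, dims=["time"])
b = xr.DataArray(coords={"time": [datetime(2020, 1, 1)]}, dims=["time"])

a_ns = with_time_unit(a)
b_ns = with_time_unit(b)

# After normalization both coordinates should share one dtype,
# so the comparison no longer depends on how they were constructed.
print(a_ns["time"].dtype, b_ns["time"].dtype)
print(a_ns.equals(b_ns))
```

Casting to "ns" narrows the representable date range compared with coarser units, so a coarser common unit may be the better choice for data far from the present.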

spencerkclark removed the "needs triage" label (Issue that has not been reviewed by xarray team member) on Apr 19, 2025