My script loads some ~1 GB CSV files with Dask and writes the data to Parquet. However, the Dask job sometimes fails with aiohttp.client_exceptions.ServerTimeoutError: Timeout on reading data from socket, caused by fsspec.exceptions.FSTimeoutError.
import dask.dataframe as dd

STORAGE_OPTIONS = {'account_name': '8200datalakestdev', 'anon': False}

ddf = dd.read_csv(
    input_path,
    storage_options=STORAGE_OPTIONS,
    sep=';')
ddf.to_parquet(output_path, storage_options=STORAGE_OPTIONS)
Is there a way to configure the timeout?
I have tried STORAGE_OPTIONS = {'account_name': '8200datalakestdev', 'anon': False, 'timeout': 1}, but it made no difference.
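As a stopgap, since the failures are intermittent, I can wrap the write in a plain retry loop that catches fsspec.exceptions.FSTimeoutError (a generic sketch using only the standard library; the helper name and backoff values are my own), but I would much rather configure the timeout properly:

```python
import time

def with_retries(fn, retries=3, backoff=5, exceptions=(Exception,)):
    """Call fn(), retrying up to `retries` times on the given exceptions.

    Sleeps backoff, 2*backoff, ... seconds between attempts and re-raises
    the last exception if all attempts fail.
    """
    for attempt in range(retries):
        try:
            return fn()
        except exceptions:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))

# In my script this would look something like:
# from fsspec.exceptions import FSTimeoutError
# with_retries(
#     lambda: ddf.to_parquet(output_path, storage_options=STORAGE_OPTIONS),
#     exceptions=(FSTimeoutError,))
```

This only papers over the problem, though, because each retry re-runs the whole write.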
Perhaps my question is related to PR #364.