From 8159a8edec30736a40ac4116310ce7ede901b342 Mon Sep 17 00:00:00 2001
From: thodson-usgs
Date: Sun, 21 Jun 2026 16:25:01 -0500
Subject: [PATCH 1/5] docs+types: add NGWMN README section + samples reference page, fix conf root_doc, drop dead nblink media, unify last_modified type

Several small, independent fixups surfaced in a library review:

- docs: add a `reference/samples.rst` page and wire it into the API
  reference toctree. `dataretrieval.samples` is public (in `__all__`) but
  had no reference page.
- docs(conf): `main_doc` is not a recognized Sphinx setting (the correct
  name is `root_doc`), so it was silently ignored. Rename it so the intent
  takes effect rather than relying on the `index` default.
- docs(examples): drop the `extra-media: ../../../demos/datasets` entry
  from `peak_streamflow_trends.nblink`. That directory does not exist and
  the notebook reads no local files, so nbsphinx_link warned or failed
  while copying missing media on every docs build.
- types(waterdata): widen the `last_modified` parameter of `get_daily` and
  `get_continuous` from `str | None` to `str | Iterable[str] | None`,
  matching the other eight OGC getters. `last_modified` is routed through
  `_format_api_dates`, which accepts a single interval string or a
  two-element [start, end] range list, so the narrow annotation rejected a
  valid input shape and was inconsistent with the parallel getters.
- README: add an NGWMN usage example (state -> sites -> water levels) and
  an NGWMN entry under Available Data Services. Reformat that section as a
  function index: each service is led by its function name (e.g. `get_dv`)
  with a brief description, and each subsection is tagged with its module.
- README: document `get_ratings` (usage example + index entry) and note its
  `dict`-of-rating-tables return shape. Pass a Series to a getter directly
  in the NGWMN example (drop the redundant `.tolist()`).

mypy --strict passes on the changed module.
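
For reviewers, the two input shapes that the widened `last_modified`
annotation admits can be sketched roughly as below. This is a hypothetical
stand-in, not the body of `_format_api_dates`: the name
`format_dates_sketch`, the two-element check, and the slash-joined output
are assumptions made only to illustrate why `str | None` was too narrow.

```python
from __future__ import annotations

from collections.abc import Iterable


def format_dates_sketch(value: str | Iterable[str] | None) -> str | None:
    """Hypothetical stand-in for a date formatter; not the library's code."""
    if value is None:
        return None
    if isinstance(value, str):
        # A single interval string is passed through unchanged.
        return value
    parts = list(value)
    if len(parts) != 2:
        raise ValueError("expected a two-element [start, end] range")
    # Assumption: a [start, end] pair becomes one slash-delimited interval.
    return "/".join(parts)


# Both shapes type-check against `str | Iterable[str] | None`;
# only the first would under the old `str | None` annotation.
single = format_dates_sketch("2024-01-01/2024-02-01")
ranged = format_dates_sketch(["2024-01-01", "2024-02-01"])
```

Under the old annotation, the second call would be flagged by a strict type
checker even though the runtime path handles a list.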
Co-Authored-By: Claude Opus 4.8 (1M context) Claude-Session: https://claude.ai/code/session_01Sjb14HkwuCydKSKMsaXsgd --- README.md | 105 +++++++++++++----- dataretrieval/waterdata/api.py | 4 +- docs/source/conf.py | 4 +- .../examples/peak_streamflow_trends.nblink | 5 +- docs/source/reference/index.rst | 1 + docs/source/reference/samples.rst | 8 ++ 6 files changed, 90 insertions(+), 37 deletions(-) create mode 100644 docs/source/reference/samples.rst diff --git a/README.md b/README.md index d651be6c..add08b2d 100644 --- a/README.md +++ b/README.md @@ -105,6 +105,18 @@ df, metadata = waterdata.get_continuous( print(f"Retrieved {len(df)} continuous gage height measurements") ``` +Rating curves come back as a `dict` of rating tables (one per rating file), +keyed by rating id — unlike the other getters, which return a +`(DataFrame, metadata)` pair: + +```python +# Get stage-discharge rating curves for a streamgage +ratings = waterdata.get_ratings(monitoring_location_id='USGS-01646500') + +for rating_id, table in ratings.items(): + print(f"{rating_id}: {len(table)} rating points") +``` + Visit the [API Reference](https://doi-usgs.github.io/dataretrieval-python/reference/waterdata.html) for more information and examples on available services and input parameters. @@ -118,6 +130,33 @@ import logging logging.basicConfig(level=logging.DEBUG) ``` +### National Ground-Water Monitoring Network (NGWMN) + +Access groundwater data aggregated from many state, federal, and local +agencies. NGWMN is a sibling of the Water Data API built on the same engine, +so chunking, pagination, and result shaping behave the same way: + +```python +from dataretrieval import ngwmn + +# Find the groundwater monitoring sites in a state +# (state accepts a full name, a postal code like 'WI', or a FIPS code like '55') +sites, metadata = ngwmn.get_sites(state='Wisconsin') + +print(f"Found {len(sites)} NGWMN sites in Wisconsin") + +# Pull water levels for those sites over a time window. 
The [:20] keeps this +# example small; drop it to pull the whole state — a multi-site request is +# split into URL-safe chunks automatically. +site_ids = sites['monitoring_location_id'][:20] +water_levels, metadata = ngwmn.get_water_level( + monitoring_location_id=site_ids, + datetime=['2022-01-01', '2024-01-01'] +) + +print(f"Retrieved {len(water_levels)} water-level observations") +``` + ### Water Quality Portal (WQP) Access water quality data from multiple agencies: @@ -170,35 +209,43 @@ print(f"Found {len(flowlines)} upstream tributaries within 50km") ## Available Data Services -### Modern USGS Water Data APIs (Recommended) -- **Daily values**: Daily statistical summaries (mean, min, max) -- **Instantaneous values**: High-frequency continuous data -- **Field measurements**: Discrete measurements from field visits -- **Monitoring locations**: Site information and metadata -- **Time series metadata**: Information about available data parameters -- **Latest daily values**: Most recent daily statistical summary data -- **Latest instantaneous values**: Most recent high-frequency continuous data -- **Daily, monthly, and annual statistics**: Median, maximum, minimum, arithmetic mean, and percentile statistics -- **Samples data**: Discrete USGS water quality data - -### Legacy NWIS Services (Deprecated) -- **Daily values (dv)**: Legacy daily statistical data -- **Instantaneous values (iv)**: Legacy continuous data -- **Site info (site)**: Basic site information -- **Statistics (stat)**: Statistical summaries -- **Discharge peaks (peaks)**: Annual peak discharge events - -### Water Quality Portal -- **Results**: Water quality analytical results from USGS, EPA, and other agencies -- **Sites**: Monitoring location information -- **Organizations**: Data provider information -- **Projects**: Sampling project details - -### Network Linked Data Index (NLDI) -- **Basin delineation**: Watershed boundaries for any point -- **Flow navigation**: Upstream/downstream network 
traversal -- **Feature discovery**: Find monitoring sites, dams, and other features -- **Hydrologic connectivity**: Link data across the stream network +### Modern USGS Water Data APIs (Recommended) — `dataretrieval.waterdata` +- `get_daily`: Daily statistical summaries (mean, min, max) +- `get_continuous`: High-frequency continuous (instantaneous) values +- `get_field_measurements`: Discrete measurements from field visits +- `get_monitoring_locations`: Site information and metadata +- `get_time_series_metadata`: A location's available data parameters +- `get_latest_daily`: Most recent daily statistical summary +- `get_latest_continuous`: Most recent high-frequency value +- `get_stats_por` / `get_stats_date_range`: Daily, monthly, and annual statistics +- `get_samples`: Discrete USGS water-quality samples +- `get_ratings`: Stage-discharge rating curves (returns a `dict` of rating tables, not `(df, metadata)`) + +### National Ground-Water Monitoring Network (NGWMN) — `dataretrieval.ngwmn` +- `get_sites`: Groundwater monitoring-location metadata across many agencies +- `get_water_level`: Depth-to-water and water-level observations +- `get_lithology`: Geologic-material logs by depth interval +- `get_well_construction`: Casing, screen, and build-out records +- `get_providers`: Contributing data-provider organizations + +### Legacy NWIS Services (Deprecated) — `dataretrieval.nwis` +- `get_dv`: Legacy daily statistical data +- `get_iv`: Legacy continuous (instantaneous) data +- `get_info`: Basic site information +- `get_stats`: Statistical summaries +- `get_discharge_peaks`: Annual peak discharge events + +### Water Quality Portal — `dataretrieval.wqp` +- `get_results`: Water-quality analytical results from USGS, EPA, and other agencies +- `what_sites`: Monitoring-location information +- `what_organizations`: Data-provider information +- `what_projects`: Sampling-project details + +### Network Linked Data Index (NLDI) — `dataretrieval.nldi` +- `get_basin`: Watershed 
boundary for a point or feature +- `get_flowlines`: Upstream/downstream flowline navigation +- `get_features`: Find monitoring sites, dams, and other features along the network +- `get_features_by_data_source`: Features from a specific data source ## More Examples diff --git a/dataretrieval/waterdata/api.py b/dataretrieval/waterdata/api.py index b2092d62..8d686f74 100644 --- a/dataretrieval/waterdata/api.py +++ b/dataretrieval/waterdata/api.py @@ -65,7 +65,7 @@ def get_daily( unit_of_measure: str | Iterable[str] | None = None, qualifier: str | Iterable[str] | None = None, value: str | Iterable[str] | None = None, - last_modified: str | None = None, + last_modified: str | Iterable[str] | None = None, skip_geometry: bool | None = None, time: str | Iterable[str] | None = None, bbox: list[float] | None = None, @@ -288,7 +288,7 @@ def get_continuous( unit_of_measure: str | Iterable[str] | None = None, qualifier: str | Iterable[str] | None = None, value: str | Iterable[str] | None = None, - last_modified: str | None = None, + last_modified: str | Iterable[str] | None = None, time: str | Iterable[str] | None = None, limit: int | None = None, filter: str | None = None, diff --git a/docs/source/conf.py b/docs/source/conf.py index 9d478f98..b129d35b 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -39,8 +39,8 @@ # suffix of source documents source_suffix = ".rst" -# The main toctree document. -main_doc = "index" +# The root toctree document. 
+root_doc = "index" # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the diff --git a/docs/source/examples/peak_streamflow_trends.nblink b/docs/source/examples/peak_streamflow_trends.nblink index 1bf99495..b707fea1 100644 --- a/docs/source/examples/peak_streamflow_trends.nblink +++ b/docs/source/examples/peak_streamflow_trends.nblink @@ -1,6 +1,3 @@ { - "path": "../../../demos/peak_streamflow_trends.ipynb", - "extra-media": [ - "../../../demos/datasets" - ] + "path": "../../../demos/peak_streamflow_trends.ipynb" } diff --git a/docs/source/reference/index.rst b/docs/source/reference/index.rst index 48947ff8..23608ce1 100644 --- a/docs/source/reference/index.rst +++ b/docs/source/reference/index.rst @@ -12,6 +12,7 @@ API reference ngwmn nldi nwis + samples streamstats utils waterdata diff --git a/docs/source/reference/samples.rst b/docs/source/reference/samples.rst new file mode 100644 index 00000000..902dd297 --- /dev/null +++ b/docs/source/reference/samples.rst @@ -0,0 +1,8 @@ +.. _samples: + +dataretrieval.samples +--------------------- + +.. automodule:: dataretrieval.samples + :members: + :special-members: From a35f2d254a2ca03a787134175a055f4669553cc6 Mon Sep 17 00:00:00 2001 From: Timothy Hodson <34148978+thodson-usgs@users.noreply.github.com> Date: Mon, 22 Jun 2026 11:47:58 -0500 Subject: [PATCH 2/5] Update README.md --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index add08b2d..1b3f994f 100644 --- a/README.md +++ b/README.md @@ -133,8 +133,8 @@ logging.basicConfig(level=logging.DEBUG) ### National Ground-Water Monitoring Network (NGWMN) Access groundwater data aggregated from many state, federal, and local -agencies. NGWMN is a sibling of the Water Data API built on the same engine, -so chunking, pagination, and result shaping behave the same way: +agencies. 
NGWMN uses the same OGC engine as the Water Data API, +so chunking and pagination behave the same way: ```python from dataretrieval import ngwmn From fa4402cf86ef4f7a46caac4564c55156809ee05b Mon Sep 17 00:00:00 2001 From: Timothy Hodson <34148978+thodson-usgs@users.noreply.github.com> Date: Mon, 22 Jun 2026 11:48:05 -0500 Subject: [PATCH 3/5] Update README.md --- README.md | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 1b3f994f..692991b4 100644 --- a/README.md +++ b/README.md @@ -145,12 +145,9 @@ sites, metadata = ngwmn.get_sites(state='Wisconsin') print(f"Found {len(sites)} NGWMN sites in Wisconsin") -# Pull water levels for those sites over a time window. The [:20] keeps this -# example small; drop it to pull the whole state — a multi-site request is -# split into URL-safe chunks automatically. -site_ids = sites['monitoring_location_id'][:20] +# Pull water levels from the first twenty sites over a time window. water_levels, metadata = ngwmn.get_water_level( - monitoring_location_id=site_ids, + monitoring_location_id=sites['monitoring_location_id'][:20], datetime=['2022-01-01', '2024-01-01'] ) From d7d838b754cd1d8da7c83c09a41577a7d1292e71 Mon Sep 17 00:00:00 2001 From: Timothy Hodson <34148978+thodson-usgs@users.noreply.github.com> Date: Mon, 22 Jun 2026 11:48:13 -0500 Subject: [PATCH 4/5] Update README.md --- README.md | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/README.md b/README.md index 692991b4..4937829d 100644 --- a/README.md +++ b/README.md @@ -105,18 +105,6 @@ df, metadata = waterdata.get_continuous( print(f"Retrieved {len(df)} continuous gage height measurements") ``` -Rating curves come back as a `dict` of rating tables (one per rating file), -keyed by rating id — unlike the other getters, which return a -`(DataFrame, metadata)` pair: - -```python -# Get stage-discharge rating curves for a streamgage -ratings = waterdata.get_ratings(monitoring_location_id='USGS-01646500') - -for 
rating_id, table in ratings.items(): - print(f"{rating_id}: {len(table)} rating points") -``` - Visit the [API Reference](https://doi-usgs.github.io/dataretrieval-python/reference/waterdata.html) for more information and examples on available services and input parameters. From f0edb49e1cb7a5b2402bd22e09e9d103694c08ee Mon Sep 17 00:00:00 2001 From: Timothy Hodson <34148978+thodson-usgs@users.noreply.github.com> Date: Mon, 22 Jun 2026 11:48:18 -0500 Subject: [PATCH 5/5] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 4937829d..bf3a319d 100644 --- a/README.md +++ b/README.md @@ -204,7 +204,7 @@ print(f"Found {len(flowlines)} upstream tributaries within 50km") - `get_latest_continuous`: Most recent high-frequency value - `get_stats_por` / `get_stats_date_range`: Daily, monthly, and annual statistics - `get_samples`: Discrete USGS water-quality samples -- `get_ratings`: Stage-discharge rating curves (returns a `dict` of rating tables, not `(df, metadata)`) +- `get_ratings`: Stage-discharge rating curves ### National Ground-Water Monitoring Network (NGWMN) — `dataretrieval.ngwmn` - `get_sites`: Groundwater monitoring-location metadata across many agencies