Open-source data pipelines that publish 193 space, astronomy, and physics datasets to Hugging Face in Parquet format. Covers satellites, orbital mechanics, asteroids, space weather, solar activity, exoplanets, gravitational waves, pulsars, radio surveys, X-ray catalogs, space probes, particle physics, and more — sourced from NASA, NOAA, ESA, SpaceX, Wikidata, and other public APIs. Updated daily via GitHub Actions.
All datasets are loadable in one line (`load_dataset("juliensimon/...")`), require no API keys, and work with pandas, polars, or any Parquet-compatible tool.
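As a minimal sketch of the one-line loading described above, the helpers below build a repository id and load it with the Hugging Face `datasets` library. The helper names and the `train` split are assumptions for illustration; the actual split name should be checked on each dataset card.

```python
OWNER = "juliensimon"  # Hugging Face namespace quoted in the text above


def repo_id(name: str, owner: str = OWNER) -> str:
    """Build the full Hub repository id for a dataset name (hypothetical helper)."""
    return f"{owner}/{name}"


def load_as_pandas(name: str, split: str = "train"):
    """Load one dataset and return it as a pandas DataFrame.

    The default split "train" is an assumption, not confirmed by the README.
    """
    # Imported lazily so repo_id() works even without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id(name), split=split).to_pandas()
```

Calling `load_as_pandas("spacex-launches")` needs network access and the `datasets` package; no token is needed for reading.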
23,417 downloads (+550) · 7 likes · 199 datasets · updated 2026-04-24
| # | Dataset | Downloads |
|---|---|---|
| 1 | space-track-tle-history | 2,158 |
| 2 | esa-exomars-tgo-observations | 1,229 (+3) |
| 3 | wmo-oscar-satellites | 907 (+214) |
| 4 | constellation-tle-latest | 627 (+24) |
| 5 | spacex-launches | 517 (+11) |
| 6 | gaia-dr3-eclipsing-binaries | 376 (+1) |
| 7 | starlink-fleet-data | 363 (+13) |
| 8 | gaia-dr3-young-stellar-objects | 361 (+2) |
| 9 | gaia-dr3-white-dwarfs | 360 (+3) |
| 10 | open-agent-traces | 352 (+28) |
Track every object orbiting Earth and beyond. This collection covers the complete NORAD satellite catalog, daily Starlink constellation health, two-line element sets dating back to 1959, launch records, and near-Earth asteroid monitoring from NASA JPL. Whether you're propagating orbits with SGP4, analyzing space debris trends, or studying asteroid close approaches, these datasets provide the foundation for orbital mechanics research and space situational awareness.
| Dataset | Description | Last Updated | Schedule | Size |
|---|---|---|---|---|
| ast-spacemobile-fleet-data | Daily AST SpaceMobile BlueBird + BlueWalker direct-to-cell constellation health | | Daily | <1 MB |
| asterank-asteroid-mining | Mining economics for 400K+ asteroids: estimated value, profit, delta-v, and spectral types from Asterank | — | Static | ~20 MB |
| asteroid-lightcurves-lcdb | Rotation periods, lightcurve amplitudes, diameters, and taxonomies for 20K+ asteroids from LCDB | — | Static | ~1 MB |
| blue-origin-launches | Complete Blue Origin launch manifest — New Shepard + New Glenn flights (past and upcoming) | | Daily | <1 MB |
| bus-demeo-asteroid-taxonomy | Reference Bus-DeMeo spectroscopic taxonomy for 371 asteroids (24 classes, 0.45-2.45 um) | — | Static | <1 MB |
| comet-catalog | 1,278 comets with orbital elements, discoverers, and discovery dates from Wikidata | | Quarterly | <1 MB |
| constellation-census | 19 satellite constellations (Starlink, OneWeb, Kuiper, GPS, etc.) — 11K+ satellites | | Daily | 0.4 MB |
| constellation-tle-latest | Daily TLE snapshots for 18 constellations: GNSS, OneWeb, Iridium, Planet, SES, Intelsat, and more | | Daily | <5 MB |
| fcc-ngso-filings | FCC IBFS filings for major NGSO constellations — requested satellite counts, shells, status | | Weekly | <1 MB |
| fireball-bolide-events | Fireball and bolide atmospheric impact events detected by US government sensors | | Weekly | <1 MB |
| gcat-launch-vehicles | 4,875 launch vehicles, engines, and stages from GCAT | — | Static | <1 MB |
| gcat-satellite-catalog | 68K+ satellites, rocket bodies, and debris from GCAT (Jonathan McDowell) | | Weekly | 2.6 MB |
| globalstar-fleet-data | Daily Globalstar constellation health — per-generation satellite counts and status (Amazon-owned as of 2026) | | Daily | <1 MB |
| iau-meteor-showers | 2,163 meteor shower records from the IAU Meteor Data Center | — | Static | <1 MB |
| jpl-small-body-database | 1.4M+ asteroids and comets with orbital elements and physical parameters | | Daily | 200 MB |
| kuiper-fleet-data | Daily Amazon Project Kuiper constellation health — per-shell satellite counts and status | | Daily | <1 MB |
| launch-cost-to-leo | Historical and current launch vehicle costs per kilogram to low Earth orbit (LEO) | — | Static | <1 MB |
| launch-vehicles | 230+ orbital launch vehicles with specs and payload capacity from Wikidata | | Quarterly | <1 MB |
| mpc-comet-elements | Orbital elements for all known comets from the Minor Planet Center | — | Static | <1 MB |
| neo-close-approaches | 35K+ near-Earth asteroid and comet close approaches from NASA JPL | | Daily | 3.2 MB |
| neowise-asteroid-properties | Diameters, albedos, and beaming parameters for 100K+ asteroids from WISE/NEOWISE | — | Static | ~10 MB |
| nesvorny-asteroid-families | 150K+ asteroids grouped into dynamical families by hierarchical clustering (Nesvorny et al.) | — | Static | ~10 MB |
| nhats-accessible-asteroids | 4,800+ asteroids accessible for human space missions with delta-v requirements | | Daily | <1 MB |
| oneweb-fleet-data | Daily OneWeb (Eutelsat) constellation health — per-plane satellite counts at ~1,200 km | | Daily | <1 MB |
| orbital-fragmentation-events | Catalog of orbital fragmentation events (breakups, explosions, collisions) from NORAD SATCAT | — | Static | <1 MB |
| reentry-events | 35K satellite and debris reentry events with decay dates and locations | | Daily | <1 MB |
| satnogs-transmitters | 10K+ satellite radio transmitters and frequencies from SatNOGS | | Weekly | 5 MB |
| sdss-asteroid-taxonomy | Compositional taxonomy for 50K+ SDSS observations of asteroids with ugriz reflectances | — | Static | ~5 MB |
| sentry-impact-risk | Near-Earth objects with non-zero Earth impact probability | | Daily | <1 MB |
| space-agency-database | Space agencies and governmental space organizations worldwide | | Quarterly | <1 MB |
| space-launch-log | Every orbital and suborbital launch since 1957 with sites and outcomes | | Weekly | 2.4 MB |
| space-missions | 24K+ crewed and uncrewed space missions from Wikidata | | Quarterly | <1 MB |
| space-track-satcat | Complete NORAD satellite catalog — 68K satellites, rocket bodies, and debris | | Daily | 1.6 MB |
| space-track-tle-history | 238 million orbital element sets for every cataloged object since 1959 | | Daily | 10.9 GB |
| spacecraft-database | 8K+ spacecraft with operators, manufacturers, and orbits from Wikidata | | Quarterly | <1 MB |
| spacex-launches | 659 SpaceX missions with timelines, descriptions, and carousel photos from spacex.com | | Daily | ~80 MB |
| ssodnet-asteroid-properties | Physical properties for 500K+ asteroids (diameters, albedos, taxonomy, masses) from IMCCE SsODNet | — | Static | ~50 MB |
| starlink-fleet-data | Daily Starlink constellation health — per-shell satellite counts and status | | Daily | 618 MB |
| starlink-ground-stations | Starlink gateway and point-of-presence locations worldwide | | Daily | 7 KB |
| starlink-tle-latest | Latest Starlink + GPS TLEs in raw and Parquet format | | Daily | 1.5 MB |
| tno-centaur-properties | 652 TNO/Centaur physical properties (diameter, albedo, density) from PDS | — | Static | <1 MB |
| ucs-satellite-database | 7,500+ active satellites with purpose, operator, and orbit metadata | | Quarterly | 5 MB |
| ula-launches | Complete United Launch Alliance manifest — Atlas V, Delta, Vulcan Centaur flights | | Daily | <1 MB |
| wmo-oscar-satellites | 1,025 Earth-observing satellites and 1,230 instruments from WMO OSCAR | | Quarterly | <1 MB |
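For readers new to two-line element sets, here is a minimal sketch of how a few orbital quantities can be read from the second TLE line using the standard fixed-width column layout. It works on raw TLE text, not on the Parquet column names of these datasets (which are not described here), and the sample line is a commonly cited ISS element set used purely for illustration. The period and semi-major axis come from Kepler's third law and are rough estimates, not a substitute for SGP4 propagation.

```python
import math

MU_EARTH_KM3_S2 = 398600.4418  # Earth's gravitational parameter (km^3/s^2)
SECONDS_PER_DAY = 86400.0


def parse_tle_line2(line2: str) -> dict:
    """Extract a few fixed-width fields from the second line of a TLE."""
    return {
        "norad_id": int(line2[2:7]),
        "inclination_deg": float(line2[8:16]),
        # Eccentricity is stored with an implied leading decimal point.
        "eccentricity": float("0." + line2[26:33]),
        "mean_motion_rev_per_day": float(line2[52:63]),
    }


def period_minutes(mean_motion_rev_per_day: float) -> float:
    """Orbital period implied by the mean motion."""
    return 1440.0 / mean_motion_rev_per_day


def semi_major_axis_km(mean_motion_rev_per_day: float) -> float:
    """Two-body estimate of the semi-major axis from Kepler's third law."""
    n_rad_s = mean_motion_rev_per_day * 2.0 * math.pi / SECONDS_PER_DAY
    return (MU_EARTH_KM3_S2 / n_rad_s**2) ** (1.0 / 3.0)


# Illustrative sample only (an often-quoted ISS element set), not dataset output.
SAMPLE_LINE2 = "2 25544  51.6416 247.4627 0006703 130.5360 325.0288 15.72125391563537"
elements = parse_tle_line2(SAMPLE_LINE2)
```

For actual orbit propagation, the parsed lines would normally be handed to an SGP4 implementation rather than to the two-body formulas above.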
Data returned by humanity's most distant spacecraft and surface explorers. Includes 50+ years of interplanetary measurements from Voyager and Pioneer, Cassini's Saturn observations, Mars surface weather and 2M+ images from Perseverance and Curiosity, MAVEN atmospheric key parameters, rock compositions from Curiosity's laser spectrometer, marsquake detections from InSight, and million-record observation logs from ESA's Mars Express, ExoMars TGO, Rosetta, BepiColombo, JUICE, and Huygens missions. Ideal for planetary science, mission planning studies, and multi-instrument data fusion.
| Dataset | Description | Last Updated | Schedule | Size |
|---|---|---|---|---|
| artemis-ii | Artemis II crewed lunar flyby: 1,285 trajectory vectors, crew manifest, mission timeline, payloads | | Daily | <1 MB |
| cassini-saturn-observations | 63K Saturn observation records from the Cassini mission (2004-2017) | — | Static | 1.6 MB |
| deep-space-probes | 1.2M hourly readings from Voyager 1+2 and Pioneer 10+11 (1972-2025) | | Monthly | 32 MB |
| esa-bepicolombo-observations | 176K+ observation records from ESA/JAXA BepiColombo Mercury mission (11 instruments, cruise + flybys) | | Weekly | ~20 MB |
| esa-exomars-tgo-observations | 27M+ observation records from ESA ExoMars TGO (4 instruments, since 2018) | | Weekly | ~2 GB |
| esa-huygens-titan-descent | 14K+ observation metadata from ESA Huygens Titan descent (8 instruments, 2005) | — | Static | <1 MB |
| esa-juice-observations | 6K+ observation records from ESA JUICE Jupiter mission (cruise phase, growing) | | Weekly | <1 MB |
| esa-mars-express-observations | 1.66M observation metadata from ESA Mars Express (8 instruments, since 2003) | | Weekly | 200 MB |
| esa-rosetta-observations | 14M+ observation records from ESA Rosetta at comet 67P (15 instruments incl. Philae) | — | Static | ~2 GB |
| esa-venus-express-observations | 525K observation metadata from ESA Venus Express (5 instruments, 2006-2014) | | Weekly | 21 MB |
| galileo-jupiter-atmosphere | Jupiter atmospheric profile from Galileo Probe descent (1995) — temperature, pressure, density to 24 bar | — | Static | <1 MB |
| gcat-deep-space | 1,206 deep space objects and 469 planetary landings from GCAT | | Weekly | <1 MB |
| huygens-titan-atmosphere | Titan atmospheric profile from Huygens Probe descent (2005) — 1,400 km to surface | — | Static | <1 MB |
| insight-marsquake-catalog | 2,715 marsquakes detected by InSight SEIS seismometer (2019-2022, final catalog) | — | Static | <1 MB |
| isro-missions | ISRO spacecraft, launchers, customer satellites, and research centres | | Quarterly | <1 MB |
| mars-chemcam-compositions | 30K+ Mars rock/soil oxide compositions from Curiosity ChemCam LIBS | — | Static | 1 MB |
| mars-perseverance-weather | Mars surface weather from Perseverance MEDA (temperature, pressure, wind, UV) | | Monthly | ~100 MB |
| nasa-eva-chronology | 375 spacewalks (EVAs) — complete history from Gemini to ISS | — | Static | <1 MB |
| nasa-mars-rover-images | 400K+ image metadata from Perseverance and Curiosity rovers (sol, camera, position, URLs) | | Weekly | ~50 MB |
| nasa-maven-kp-insitu | MAVEN Mars atmosphere key parameters: solar wind, magnetic field, ion composition at 4-8s cadence | | Quarterly | ~500 MB |
| pds-planetary-missions | NASA PDS mission catalog — 98 missions, 115 spacecraft, 748 instruments with targets and cross-references | — | Static | <5 MB |
| pluto-atmosphere | Pluto atmospheric profiles (temperature, pressure, composition, haze) from New Horizons | — | Static | <1 MB |
Explore the surfaces of other worlds through impact crater databases and geochemistry. Features the most comprehensive crater catalogs available — 1.3 million lunar craters, 384K Mars craters, and 44K Ceres craters mapped by the Dawn mission — alongside IAU-approved planetary nomenclature and the Meteoritical Society's record of every known meteorite fall on Earth.
| Dataset | Description | Last Updated | Schedule | Size |
|---|---|---|---|---|
| ceres-craters-dawn | 44,594 impact craters on Ceres (>=1 km) from the Dawn Framing Camera | — | Static | 9 MB |
| impact-craters | 4K+ impact craters across the solar system (Earth, Moon, Mars, etc.) from Wikidata | | Quarterly | <1 MB |
| lunar-craters-robbins | 1.3M+ lunar impact craters from the Robbins 2019 database | — | Static | 200 MB |
| lunar-sample-geochemistry | 58K geochemical analyses of Apollo/Luna/Chang'e 5 lunar samples (Astromat) | — | Static | 1.4 MB |
| mars-craters-robbins | 384K+ Mars impact craters from the Robbins & Hynek 2012 database | — | Static | 50 MB |
| mercury-crater-degradation | 3,253 Mercury craters with degradation classification (Kinczyk et al. 2020) | — | Static | <1 MB |
| mercury-craters-herrick | 16,876 Mercury impact craters from MESSENGER imagery (Herrick et al. 2011) | — | Static | <1 MB |
| meteorite-database | 1,200+ named meteorites with classification, mass, and fall location from Wikidata | | Quarterly | <1 MB |
| meteorite-landings | 45K+ known meteorite landings with classification and mass | — | Static | 5 MB |
| planetary-nomenclature | 15K+ IAU-approved named features on Moon, Mars, Venus, and Mercury | — | Static | 5 MB |
| solar-system-moons | All 200+ known natural satellites of planets and dwarf planets with orbital and physical parameters | — | Static | <1 MB |
Monitor the Sun-Earth connection in near real-time. These datasets track solar flares, coronal mass ejections, geomagnetic storms, and the solar wind: the key drivers of space weather that affect satellite operations, GPS accuracy, power grids, and astronaut safety. Includes essential indices for orbit propagation (Kp, Ap, F10.7), daily sunspot numbers going back to 1818, and official NOAA alerts. Most are updated daily from NOAA SWPC, NASA DONKI, WDC Kyoto, and other authoritative sources.
| Dataset | Description | Last Updated | Schedule | Size |
|---|---|---|---|---|
| auroral-electrojet-index | Hourly AE/AU/AL/AO auroral electrojet indices from Kyoto WDC | | Daily | 2 MB |
| celestrak-space-weather | Consolidated space weather data for orbit propagation (Kp, Ap, F10.7) | | Daily | 5 MB |
| donki-space-weather-events | 12K+ coronal mass ejections, geomagnetic storms, and solar particle events (2010+) | | Daily | 1.0 MB |
| dst-index | 600K+ hourly geomagnetic storm intensity readings since 1957 (Dst index) | | Daily | 1.7 MB |
| f107-solar-flux | Daily F10.7 cm solar radio flux since 1947 — primary proxy for atmospheric drag | | Daily | 2 MB |
| forbush-decreases | 7,097 Forbush decrease events (1957-2016) with solar wind, IMF, and CME parameters from IZMIRAN | — | Static | <1 MB |
| geomagnetic-kp-index | 3-hourly geomagnetic disturbance index (Kp 0-9) with NOAA storm scale | | Daily | 4 KB |
| iers-earth-orientation | Daily Earth orientation parameters (polar motion, UT1-UTC, LOD) since 1973 | | Daily | 5 MB |
| neutron-monitor-cosmic-rays | Hourly cosmic ray intensity from the global neutron monitor network | | Daily | <1 MB |
| omni-solar-wind-parameters | 561K+ hourly solar wind parameters (velocity, density, IMF) from NASA OMNI | | Daily | 20 MB |
| silso-sunspot-number | 120K+ daily sunspot numbers since 1818 from SILSO/Royal Observatory of Belgium | | Monthly | 3 MB |
| solar-flare-events | 16K+ individual solar flare detections from GOES X-ray sensors (2017+) | | Daily | 0.5 MB |
| solar-proton-events | Solar proton events (SPEs) affecting the Earth environment from 1976 to present | — | Static | <1 MB |
| solar-radio-bursts | Solar radio burst events (Type II/III/IV/V) from HEASARC | | Weekly | 5 MB |
| solar-wind | Real-time solar wind speed, density, temperature, and magnetic field from L1 | | Daily | 0.2 MB |
| space-weather-indices | Daily Kp, Ap, F10.7 solar and geomagnetic indices since 1957 | | Daily | 0.8 MB |
| substorm-onsets | 253K+ magnetospheric substorm onsets from 5 detection algorithms (SuperMAG) | | Quarterly | 3 MB |
| swpc-alerts | Official NOAA space weather alerts, watches, and warnings | | Daily | 2 MB |
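The Kp index and the NOAA storm scale mentioned in the table are related by a simple threshold mapping (Kp 5 corresponds to G1, up to Kp 9 for G5). The sketch below shows that mapping; the function name is hypothetical, and treating fractional Kp values by plain thresholding is a simplifying assumption rather than the method used by the pipeline.

```python
from typing import Optional

# NOAA geomagnetic storm levels keyed by the minimum Kp value, highest first.
_G_SCALE = [(9.0, "G5"), (8.0, "G4"), (7.0, "G3"), (6.0, "G2"), (5.0, "G1")]


def kp_to_g_scale(kp: float) -> Optional[str]:
    """Return the NOAA G-scale label for a Kp value, or None below storm level."""
    if not 0.0 <= kp <= 9.0:
        raise ValueError(f"Kp must be between 0 and 9, got {kp}")
    for threshold, label in _G_SCALE:
        if kp >= threshold:
            return label
    return None
```

With this rule a Kp of 8.67 (written 9- in the traditional notation) still maps to G4, and only a full Kp of 9 maps to G5.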
A broad survey of the observable universe — from exoplanets in our galactic neighborhood to quasars at the edge of the cosmos. Covers confirmed exoplanets from NASA, gravitational wave detections from LIGO/Virgo/KAGRA, gamma-ray bursts, fast radio bursts from CHIME, pulsars, variable stars, galaxy clusters, and million-source radio and X-ray sky surveys. These datasets support multi-messenger astronomy, cross-matching across wavelengths, and large-scale statistical studies of astrophysical populations.
| Dataset | Description | Last Updated | Schedule | Size |
|---|---|---|---|---|
| 4xmm-dr14-xray-sources | 630K+ unique X-ray sources from ESA XMM-Newton serendipitous survey (4XMM) | — | Static | ~80 MB |
| aavso-vsx-variable-stars | 1.5M+ variable stars from the AAVSO Variable Star Index (VSX) with types, periods, and magnitudes | — | Static | ~100 MB |
| apogee-dr17 | APOGEE DR17 stellar parameters and abundances from high-resolution infrared spectroscopy | — | Static | ~50 MB |
| astronaut-database | Every person who has been to space — 560+ astronauts/cosmonauts | | Monthly | <1 MB |
| astronomer-database | 11K+ astronomers with affiliations, awards, and fields of work from Wikidata | | Quarterly | <1 MB |
| black-hole-catalog | Known black hole systems and X-ray binaries from SIMBAD | | Weekly | 90 KB |
| bright-star-catalog | 9,110 naked-eye stars from the Bright Star Catalogue (BSC5, 5th Revised Edition) | — | Static | ~1 MB |
| brown-dwarf-catalog | 14K ultracool and brown dwarfs within 40 pc | — | Static | 10 MB |
| carbon-stars | 6,000+ Galactic carbon stars from the General Catalogue of Cool Carbon Stars (GCCS) | — | Static | <1 MB |
| cataclysmic-variable-catalog | 2,000+ cataclysmic variables — dwarf novae, polars, and classical novae | | Quarterly | <1 MB |
| chandra-x-ray-sources | 28K X-ray sources from the Chandra Source Catalog (CSC 2.1) | — | Static | 1.8 MB |
| chime-frb-catalog | 4,500+ fast radio bursts from the CHIME/FRB telescope | | Semi-annual | 5 MB |
| cns5-nearby-stars | Catalogue of Nearby Stars within 25 parsecs (CNS5) with astrometry and photometry | — | Static | <1 MB |
| constellation-catalog | 94 IAU constellations with abbreviations, areas, and brightest stars from Wikidata | | Quarterly | <1 MB |
| cosmic-void-catalog | 1,000+ cosmic voids from SDSS DR7 (Pan et al. 2012) | | Semi-annual | <1 MB |
| cosmicflows-galaxy-distances | 56K galaxy distances from Cosmicflows-4 (8 distance methods) | — | Static | 3.7 MB |
| desi-dr1-redshifts | 1M+ spectroscopic redshifts from the DESI Data Release 1 Bright Galaxy Survey | — | Static | ~100 MB |
| erosita-erass1-xray | 900K X-ray sources from the first eROSITA All-Sky Survey (eRASS1) | | Per release | 500 MB |
| euve-observations | 1,367 EUVE extreme-UV observations (1992–2001) — the only EUV space mission ever flown (70–760 Å) | | Quarterly | <1 MB |
| fermi-4fgl-dr4 | 7K gamma-ray sources from Fermi LAT 14-year all-sky survey | | Annual | 50 MB |
| first-radio-catalog | 946K radio sources from the VLA FIRST Survey at 1.4 GHz (5" resolution) | — | Static | 113 MB |
| fuse-observations | 5,729 FUSE far-UV spectra (1999–2007) — highest-resolution 905–1187 Å spectrograph ever flown | | Quarterly | <1 MB |
| gaia-dr3-cepheids | Gaia DR3 Cepheid variable stars with pulsation periods, multi-band photometry, and parallaxes | — | Static | ~10 MB |
| gaia-dr3-eclipsing-binaries | Gaia DR3 eclipsing binary candidates with orbital periods and light-curve parameters | — | Static | ~20 MB |
| gaia-dr3-rrlyrae | 272K RR Lyrae pulsating stars from Gaia DR3 — distance ladder | — | Static | 50 MB |
| gaia-dr3-spectroscopic-binaries | 180K+ spectroscopic binary star orbital solutions (SB1+SB2) from Gaia DR3 | — | Static | ~20 MB |
| gaia-dr3-white-dwarfs | 359K white dwarf candidates with atmospheric parameters and masses from Gaia DR3 | — | Static | ~50 MB |
| gaia-dr3-young-stellar-objects | 79K+ young stellar object (YSO) candidates with classification scores and variability from Gaia DR3 | — | Static | ~10 MB |
| galah-dr4-stellar-abundances | GALAH DR4 radial velocities, stellar parameters, and elemental abundances for 917K stars | — | Static | ~80 MB |
| galex-observations | 275K GALEX UV survey observations (2003–2013) tagged with survey type (AIS, MIS, DIS, NGS, etc.) | | Quarterly | ~6 MB |
| galaxy-clusters | 1,650+ galaxy clusters detected by Planck via the Sunyaev-Zeldovich effect | | Quarterly | 50 KB |
| galaxy-zoo-2-morphology | 243K citizen-science galaxy morphology classifications with vote fractions and debiased probabilities | — | Static | ~20 MB |
| gamma-ray-bursts | 4,200+ gamma-ray bursts from Fermi GBM with duration, flux, and spectral data | | Weekly | 0.3 MB |
| gcvs-variable-stars | 58K variable stars from the General Catalogue of Variable Stars | | Quarterly | 15 MB |
| geneva-copenhagen-stellar-survey | 16,682 F and G dwarf stars in the solar neighbourhood with ages, metallicities, and kinematics | — | Static | ~5 MB |
| globular-star-clusters | 167 Milky Way globular clusters with masses, structural parameters, and metallicities | — | Static | <1 MB |
| gravitational-lenses | 33K strong gravitational lenses from the lenscat community catalog | — | Static | 0.9 MB |
| gravitational-wave-events | 260+ black hole and neutron star mergers detected by LIGO/Virgo/KAGRA | | Weekly | 30 KB |
| grbweb-unified-grb-catalog | Unified GRB catalog from GRBweb combining Fermi, Swift, BATSE, BeppoSAX, and IPN detectors | — | Static | <1 MB |
| gswlc-galaxy-properties | 659K galaxies with stellar masses, star formation rates, and dust attenuation from GALEX-SDSS-WISE | — | Static | ~50 MB |
| hecate-nearby-galaxies | HECATE catalog of nearby galaxies within 200 Mpc with stellar masses, SFR, and morphology | — | Static | ~10 MB |
| hipparcos-catalog | 118K brightest stars with precise positions and parallaxes from ESA Hipparcos | — | Static | 30 MB |
| hst-observations | 2.6M+ Hubble Space Telescope observations (1990–present) — target, proposal, instrument, detector metadata from MAST | | Weekly | ~80 MB |
| icecube-neutrino-catalog | IceCube neutrino point sources from HEASARC | — | Static | <1 MB |
| icrf3-reference-frame | 3,417 ICRF3 extragalactic radio sources — THE celestial reference frame | — | Static | 2 MB |
| iue-observations | 102K IUE UV spectra (1978–1996) — the longest-running UV space observatory, from SWP, LWP, LWR cameras | | Quarterly | ~2 MB |
| jwst-observations | 960K+ JWST observations from MAST — proposal, target, instrument, timing, and wavelength metadata | | Weekly | ~150 MB |
| k2-observations | 765K K2 extended-mission observations (2014–2018, campaigns C0–C19) with parsed EPIC ID, campaign, and cadence | | Quarterly | ~20 MB |
| kepler-eclipsing-binaries | 2,177 Kepler eclipsing binary stars | — | Static | 1 MB |
| kepler-observations | 213K Kepler prime-mission observations (2009–2013) with KIC ID, cadence, and per-quarter observation mask | | Quarterly | ~5 MB |
| kepler-transit-timing | 295K transit times for 2,599 KOIs with O-C residuals, durations, and depths (Holczer+ 2016) | — | Static | ~5 MB |
| mcgill-magnetar-catalog | All known magnetars with spin parameters, magnetic field strengths, and X-ray properties | — | Static | <1 MB |
| messier-catalog | The classic Messier catalog — 110 galaxies, nebulae, and star clusters | | Quarterly | 10 KB |
| milliquas | Milliquas v8 — the Million Quasars Catalog with positions, redshifts, and radio/X-ray associations | — | Static | ~100 MB |
| nasa-exoplanets | 6,150 confirmed exoplanets with orbital, stellar, and discovery parameters | | Weekly | 0.5 MB |
| nebula-catalog | 60K+ nebulae (emission, reflection, dark, planetary) with coordinates and distances from Wikidata | | Quarterly | 1.7 MB |
| ngc-ic-catalog | 14K deep-sky objects — galaxies, nebulae, and star clusters (NGC + IC) | | Monthly | 0.5 MB |
| nvss-radio-catalog | 1.77M radio sources from the NRAO VLA Sky Survey at 1.4 GHz | — | Static | 150 MB |
| observatory-database | 640+ ground and space observatories with locations, apertures, and wavelengths from Wikidata | | Quarterly | <1 MB |
| open-star-clusters | 7,167 Gaia-era open star clusters with distances and ages | — | Static | 5 MB |
| open-supernova-catalog | 72K supernovae with light curves, spectra references, and host galaxies | | Weekly | 10 MB |
| otter-tde-catalog | Tidal disruption events (TDEs) from the Open TDE Catalog — stars torn apart by black holes | — | Static | <1 MB |
| pantheon-plus-sne-ia | 1,550 Type Ia supernovae — gold standard cosmological distance dataset | — | Static | 10 MB |
| planck-cold-clumps | 13K+ Galactic cold clumps — pre-stellar cores and star-forming regions from Planck | — | Static | <1 MB |
| planck-sz2-clusters | 1,650+ galaxy clusters from Planck SZ2 catalog with mass and redshift | | Semi-annual | <1 MB |
| planetary-nebulae | 1,715 planetary nebulae from MUSE survey | — | Static | <1 MB |
| pulsar-catalog | 4,300+ pulsars with spin period, dispersion measure, and magnetic field | | Monthly | 0.2 MB |
| pulsar-glitch-catalog | 700+ pulsar glitch events from the Jodrell Bank Glitch Catalogue | | Quarterly | <1 MB |
| quasar-catalog | 50K quasars, Seyfert galaxies, blazars, and active galactic nuclei | | Weekly | 1.3 MB |
| rave-dr6 | RAVE DR6 stellar radial velocities, parameters, and elemental abundances for 518K spectra | — | Static | ~30 MB |
| rc3-galaxy-morphology | 23K bright galaxies with Hubble morphological types from RC3 | — | Static | 10 MB |
| roma-bzcat-blazars | 3,561 confirmed blazars (BL Lac + FSRQ) from Roma-BZCAT 5th edition | — | Static | <1 MB |
| solar-eclipse-catalog | 12,000+ solar eclipses spanning 5 millennia (-1999 to +3000) from NASA | — | Static | <1 MB |
| stackexchange-space-qa | 33K Q&A pairs from Astronomy + Space Exploration Stack Exchange sites — top-scored/accepted answers, CC-BY-SA 4.0 | | Annually | ~50 MB |
| sumss-radio-catalog | 211K southern radio sources at 843 MHz from SUMSS | — | Static | 30 MB |
| supernova-remnants | 310 Galactic supernova remnants with radio flux and spectral index | | Quarterly | 10 KB |
| tess-toi-candidates | 7K+ TESS Objects of Interest — active exoplanet candidates | | Weekly | 5 MB |
| tgss-radio-catalog | 624K radio sources at 150 MHz from GMRT TGSS ADR1 | — | Static | 80 MB |
| unified-radio-catalog | SPECFIND v3 unified radio source catalog cross-matching 50+ radio surveys | — | Static | ~100 MB |
| vlass-radio-sources | 3.4M radio sources from VLA Sky Survey Epoch 1 (VLASS) at 3 GHz | — | Static | 681 MB |
| wds-double-stars | 157K visual double star systems from the Washington Double Star Catalog | | Weekly | 50 MB |
| wise-hii-regions | 8,000+ Galactic HII regions from WISE mid-infrared survey (Anderson+ 2014) | | Quarterly | <1 MB |
| wolf-rayet-stars | 383 Galactic Wolf-Rayet stars with Gaia DR2 distances and spectral types | — | Static | <1 MB |
| xray-binary-catalog | 500+ high-mass and low-mass X-ray binaries (Liu et al. 2006/2007) | | Quarterly | <1 MB |
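Cross-matching across wavelengths, as mentioned in the introduction to this collection, usually starts from the angular separation between two sky positions. The sketch below is one simple way to do it: a haversine separation plus a brute-force nearest-neighbour search. The function names are hypothetical, coordinates are assumed to be right ascension and declination in degrees, and the linear scan is only practical for small catalogs.

```python
import math
from typing import Optional, Sequence, Tuple

Position = Tuple[float, float]  # (ra_deg, dec_deg)


def angular_separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (
        math.sin((dec2 - dec1) / 2.0) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2
    )
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def nearest_match(
    source: Position, catalog: Sequence[Position], radius_arcsec: float
) -> Optional[Tuple[int, float]]:
    """Return (index, separation_arcsec) of the closest catalog entry within the radius."""
    best: Optional[Tuple[int, float]] = None
    for i, (ra, dec) in enumerate(catalog):
        sep = angular_separation_deg(source[0], source[1], ra, dec) * 3600.0
        if sep <= radius_arcsec and (best is None or sep < best[1]):
            best = (i, sep)
    return best
```

For the million-source radio and X-ray surveys listed above, a spatially indexed matcher such as astropy's `SkyCoord.match_to_catalog_sky` would be a more realistic choice than this linear scan.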
Fundamental particle properties and high-energy astrophysics catalogs. Includes the Particle Data Group's authoritative summary of every known particle, cosmic ray energy spectra from 131 experiments, ultra-high-energy events from the Pierre Auger Observatory, and gamma-ray source catalogs spanning MeV to PeV energies from Fermi, Swift, INTEGRAL, HAWC, and LHAASO. Essential for particle physics, astroparticle research, and multi-wavelength source identification.
| Dataset | Description | Last Updated | Schedule | Size |
|---|---|---|---|---|
| auger-cosmic-rays | Ultra-high-energy cosmic ray events from Pierre Auger Observatory | — | Static | 100 MB |
| crdb-cosmic-ray-spectra | 316K cosmic ray measurements from 131 experiments | | Quarterly | 50 MB |
| fermi-3fhl-hard-gamma-ray | 1,558 hard gamma-ray sources (>10 GeV) from Fermi LAT 3FHL | — | Static | 0.6 MB |
| fermi-3pc-gamma-ray-pulsars | 7K+ gamma-ray pulsars from Fermi LAT Third Pulsar Catalog (3PC) | — | Static | 2.2 MB |
| fermi-4lac-agn-catalog | 3,409 gamma-ray AGN from Fermi LAT Fourth AGN Catalog (4LAC) | — | Static | 0.7 MB |
| fermi-gbm-triggers | 12.5K+ Fermi GBM triggers — all triggers, not just confirmed GRBs | | Daily | 1.8 MB |
| hawc-tev-gamma-ray | 65 TeV gamma-ray sources from the 3HWC HAWC catalog | — | Static | <1 MB |
| icecat-neutrino-alerts | High-energy neutrino alert events from the IceCube Neutrino Observatory (ICECAT-1) | — | Static | <1 MB |
| integral-ibis-hard-xray | 929 hard X-ray sources from INTEGRAL IBIS 17-year survey (17-290 keV) | — | Static | 0.3 MB |
| lhaaso-gamma-ray-sources | 180 ultra-high-energy gamma-ray sources from 1LHAASO (2024) | — | Static | <1 MB |
| pdg-particle-properties | Every known particle from the Particle Data Group | | Annual | 50 MB |
| physics-nobel-laureates | 229 Physics Nobel Prize laureates with institutions and cited work from Wikidata | | Quarterly | <1 MB |
| swift-bat-hard-xray-survey | 1,893 hard X-ray sources (14-195 keV) from Swift-BAT 157-month survey | — | Static | 0.3 MB |
| tevcat-tev-gamma-ray | 322 TeV gamma-ray sources — THE ground-based VHE reference catalog | — | Static | <1 MB |
- Orbital Mechanics — satellites, TLEs, launches, NEOs, asteroids, impact risk
- Space Probes & Missions — Voyager, Pioneer, Cassini, Mars Express, Rosetta, Curiosity, Perseverance, EVAs
- Planetary Science — lunar craters, Mars craters, Mars geochemistry, meteorites
- Space Weather — solar flares, CMEs, geomagnetic storms, solar wind, Kp/Ap/F10.7 indices
- Astronomy — exoplanets, pulsars, radio surveys, X-ray catalogs, variable stars, gravitational waves, galaxy morphology
- Physics — particle properties, cosmic ray spectra, hard X-ray surveys, gamma-ray catalogs (TeV/UHE)
- Solar System — planetary missions, craters (Moon/Mars/Ceres/Mercury), atmospheric profiles (Jupiter/Titan), named features
- Space Essentials — astronauts, space missions, meteorites, constellations, Nobel laureates. No jargon, just names, dates, and places.
Each dataset has a Python script in scripts/ and a GitHub Actions workflow in .github/workflows/. The scripts fetch data from public sources, convert to Parquet, and upload to Hugging Face.
Pipelines use two update strategies:
- Full rebuild — re-fetches the entire dataset from source. Used when the source is a single file with no delta endpoint (SATCAT, Space Weather) or the dataset is small enough that incremental updates aren't worth the complexity.
- Incremental — downloads the existing Parquet from HF, fetches only new/recent data, merges and deduplicates, then uploads. Falls back to full rebuild automatically when no existing data is found. Used by Starlink, Constellation Census, DONKI (14-day window), Dst Index (current month only), Solar Flares (SWPC daily append), Solar Wind (7-day rolling window), and Kp Index.
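The incremental strategy can be sketched as a small merge step. This is an illustrative version only: the function name, the `key_cols` parameter, and the choice to let fresh rows override older ones are assumptions, and downloading the existing Parquet file from Hugging Face is left out.

```python
from typing import Optional, Sequence

import pandas as pd


def merge_incremental(
    existing: Optional[pd.DataFrame],
    fresh: pd.DataFrame,
    key_cols: Sequence[str],
) -> pd.DataFrame:
    """Merge newly fetched rows into the existing dataset and deduplicate.

    When no existing data is available, the fresh rows alone are returned,
    which corresponds to the full-rebuild fallback.
    """
    keys = list(key_cols)
    if existing is None or existing.empty:
        combined = fresh
    else:
        combined = pd.concat([existing, fresh], ignore_index=True)
    # keep="last" lets freshly fetched rows replace older rows with the same key.
    combined = combined.drop_duplicates(subset=keys, keep="last")
    return combined.sort_values(keys).reset_index(drop=True)
```

A rolling window, such as the 7-day window mentioned for Solar Wind, fits this pattern naturally: rows inside the window are re-fetched and replace their earlier versions, while older rows are kept unchanged.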
The only secret needed is HF_TOKEN — a Hugging Face write token, set in the repo's GitHub Actions secrets.
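A minimal sketch of how a script might consume that token is shown below. It is not taken from the repository's scripts: the helper names and the default `path_in_repo` are hypothetical, and only `HfApi.upload_file` from `huggingface_hub` is an existing call.

```python
import os
from typing import Mapping, Optional


def get_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Read HF_TOKEN from the environment and fail early if it is missing."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; a Hugging Face write token is required")
    return token


def upload_parquet(
    local_path: str,
    repo_id: str,
    path_in_repo: str = "data/train.parquet",  # hypothetical layout
) -> None:
    """Upload one Parquet file to a dataset repository on the Hub."""
    # Imported lazily so get_hf_token() can be used without huggingface_hub.
    from huggingface_hub import HfApi

    api = HfApi(token=get_hf_token())
    api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        repo_type="dataset",
    )
```

In a GitHub Actions workflow the secret would typically be exposed to the step as an environment variable named `HF_TOKEN`, which is what `get_hf_token()` reads.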
pip install pandas pyarrow requests huggingface_hub[hf_xet]
# Orbital Mechanics
python scripts/update-asterank.py
python scripts/update-bus-demeo.py
python scripts/update-comets.py
python scripts/update-constellation-census.py
python scripts/update-constellation-tles.py
python scripts/update-fireballs.py
python scripts/update-fragmentation-events.py
python scripts/update-gcat.py
python scripts/update-gcat-satcat.py
python scripts/update-ground-stations.py
python scripts/update-launch-cost.py
python scripts/update-launch-log.py
python scripts/update-launch-vehicles.py
python scripts/update-lcdb.py
python scripts/update-meteor-showers.py
python scripts/update-meteorite-landings.py
python scripts/update-mpc-comets.py
python scripts/update-neo.py
python scripts/update-neowise.py
python scripts/update-nesvorny-families.py
python scripts/update-nhats.py
python scripts/update-reentry-events.py
python scripts/update-satcat.py
python scripts/update-satnogs.py
python scripts/update-sbdb.py
python scripts/update-sdss-taxonomy.py
python scripts/update-sentry.py
python scripts/update-space-agencies.py
python scripts/update-space-missions.py
python scripts/update-spacecraft.py
python scripts/update-spacex-launches.py
python scripts/update-ssodnet.py
python scripts/update-starlink.py
SPACETRACK_USER=xxx SPACETRACK_PASS=xxx python scripts/update-tle-history.py
python scripts/update-tle-latest.py
python scripts/update-tno-centaur.py
python scripts/update-ucs.py # requires: pip install openpyxl
python scripts/update-wmo-oscar.py
# Planetary Science
python scripts/update-ceres-craters.py
python scripts/update-impact-craters.py
python scripts/update-lunar-craters.py
python scripts/update-lunar-geochemistry.py
python scripts/update-mars-craters.py
python scripts/update-mercury-craters.py
python scripts/update-mercury-degradation.py
python scripts/update-meteorites.py
python scripts/update-planetary-nomenclature.py
python scripts/update-pluto-atmosphere.py
pip install beautifulsoup4 lxml && python scripts/update-solar-system-moons.py
# Space Weather
python scripts/update-ae-index.py
python scripts/update-celestrak-sw.py
python scripts/update-donki.py
python scripts/update-dst-index.py
python scripts/update-f107.py
python scripts/update-forbush-decreases.py
python scripts/update-iers-eop.py
python scripts/update-kp-index.py
python scripts/update-neutron-monitor.py
python scripts/update-omni.py
python scripts/update-solar-eclipses.py
pip install netCDF4 && python scripts/update-solar-flares.py
pip install beautifulsoup4 lxml && python scripts/update-solar-proton-events.py
python scripts/update-solar-radio.py
python scripts/update-solar-wind.py
python scripts/update-space-weather.py
python scripts/update-substorm-onsets.py
python scripts/update-sunspot.py
python scripts/update-swpc-alerts.py
# Space Probes & Missions
python scripts/update-artemis-ii.py
python scripts/update-astronauts.py
python scripts/update-bepicolombo.py
python scripts/update-cassini.py
python scripts/update-chemcam.py
python scripts/update-deep-space-probes.py
python scripts/update-eva.py
python scripts/update-exomars-tgo.py
python scripts/update-galileo-atmosphere.py
python scripts/update-gcat-deep-space.py
python scripts/update-huygens.py
python scripts/update-huygens-atmosphere.py
python scripts/update-insight-marsquakes.py
python scripts/update-isro.py
python scripts/update-juice.py
python scripts/update-mars-express.py
python scripts/update-mars-rovers.py
python scripts/update-maven.py
python scripts/update-meda-weather.py
python scripts/update-pds-missions.py
python scripts/update-rosetta.py
python scripts/update-venus-express.py
# Astronomy
python scripts/update-4xmm-dr14.py
python scripts/update-aavso-vsx.py
python scripts/update-apogee-dr17.py
python scripts/update-astronomers.py
python scripts/update-black-holes.py
python scripts/update-bright-stars.py
python scripts/update-brown-dwarfs.py
python scripts/update-carbon-stars.py
python scripts/update-cataclysmic-variables.py
python scripts/update-chandra.py
python scripts/update-chime-frb.py
python scripts/update-cns5.py
python scripts/update-constellations.py
python scripts/update-cosmic-voids.py
python scripts/update-cosmicflows.py
python scripts/update-desi.py
python scripts/update-erosita.py
python scripts/update-euve.py
python scripts/update-exoplanets.py
pip install astropy && python scripts/update-fermi-4fgl.py
python scripts/update-first.py
python scripts/update-fuse.py
python scripts/update-gaia-cepheids.py
python scripts/update-gaia-eb.py
python scripts/update-gaia-rrlyrae.py
python scripts/update-gaia-sb.py
python scripts/update-gaia-wd.py
python scripts/update-gaia-yso.py
pip install astropy && python scripts/update-galah.py
python scripts/update-galex.py
python scripts/update-galaxy-clusters.py
python scripts/update-galaxy-zoo.py
python scripts/update-gcvs.py
python scripts/update-geneva-copenhagen.py
python scripts/update-globular-clusters.py
python scripts/update-gravitational-lenses.py
python scripts/update-gravitational-waves.py
python scripts/update-grb.py
python scripts/update-grbweb.py
python scripts/update-gswlc.py
python scripts/update-hecate.py
python scripts/update-hst.py
python scripts/update-hii-regions.py
python scripts/update-hipparcos.py
python scripts/update-icecube.py
python scripts/update-icrf3.py
python scripts/update-iue.py
python scripts/update-jwst.py
python scripts/update-k2-obs.py
python scripts/update-kepler-eb.py
python scripts/update-kepler-obs.py
python scripts/update-kepler-ttv.py
python scripts/update-magnetars.py
python scripts/update-messier.py
python scripts/update-milliquas.py
python scripts/update-nebulae.py
python scripts/update-ngc-ic.py
python scripts/update-nvss.py
python scripts/update-observatories.py
python scripts/update-open-clusters.py
python scripts/update-otter-tde.py
python scripts/update-pantheon.py
python scripts/update-planck-pgcc.py
python scripts/update-planck-sz2.py
python scripts/update-planetary-nebulae.py
pip install beautifulsoup4 lxml && python scripts/update-pulsar-glitches.py
python scripts/update-pulsars.py
python scripts/update-quasars.py
python scripts/update-rave-dr6.py
python scripts/update-rc3.py
python scripts/update-roma-bzcat.py
python scripts/update-snr.py
python scripts/update-stackexchange-space.py
python scripts/update-sumss.py
python scripts/update-supernovae.py
python scripts/update-tess-toi.py
python scripts/update-tgss.py
python scripts/update-unified-radio.py
python scripts/update-vlass.py
python scripts/update-wds.py
python scripts/update-wolf-rayet.py
python scripts/update-xray-binaries.py
# Physics
python scripts/update-auger.py
pip install crdb && python scripts/update-crdb.py
python scripts/update-fermi-3fhl.py
python scripts/update-fermi-3pc.py
python scripts/update-fermi-4lac.py
python scripts/update-fermi-gbm-triggers.py
python scripts/update-hawc.py
python scripts/update-icecat.py
python scripts/update-integral-ibis.py
python scripts/update-lhaaso.py
pip install particle && python scripts/update-pdg.py
python scripts/update-physics-nobel.py
python scripts/update-swift-bat.py
python scripts/update-tevcat.py
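To run many of these scripts in one go rather than one command at a time, a small driver can loop over them. The sketch below is an assumption, not a tool shipped with the repo: `find_update_scripts` and `run_all` are hypothetical names, and it presumes the scripts live in `scripts/` and follow the `update-*.py` naming shown above. Scripts with extra dependencies (openpyxl, astropy, netCDF4, and so on) or Space-Track credentials still need those set up first.

```python
import subprocess
import sys
from pathlib import Path


def find_update_scripts(scripts_dir: str = "scripts") -> list:
    """Return every update-*.py script in scripts_dir, sorted by name."""
    return sorted(Path(scripts_dir).glob("update-*.py"))


def run_all(scripts_dir: str = "scripts", dry_run: bool = False) -> dict:
    """Run each update script in turn and return {script name: exit code}."""
    results = {}
    for script in find_update_scripts(scripts_dir):
        if dry_run:
            print(f"would run: {sys.executable} {script}")
            continue
        # check=False: one failing pipeline should not stop the others.
        proc = subprocess.run([sys.executable, str(script)], check=False)
        results[script.name] = proc.returncode
    return results


if __name__ == "__main__":
    failed = {name: code for name, code in run_all().items() if code != 0}
    sys.exit(1 if failed else 0)
```

Collecting exit codes instead of stopping at the first error keeps one unavailable upstream API from blocking every other dataset.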
build-tle-archive.py builds historical TLE data from Space-Track yearly bulk zip exports (1959–2025). Daily updates for the current year are automated via update-tle-history.py (fetches yesterday's GP history, appends to tle_{year}.parquet). Requires SPACETRACK_USER and SPACETRACK_PASS secrets.
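One detail worth spelling out in the daily update is the year boundary: a run on January 1 fetches December 31, so the rows belong in the previous year's file. The sketch below assumes the file is named after the year of the fetched day (the source only gives the `tle_{year}.parquet` pattern), and `tle_target_for` is a hypothetical helper name.

```python
from datetime import date, timedelta
from typing import Optional


def tle_target_for(today: Optional[date] = None) -> tuple:
    """Return (day to fetch, Parquet file to append to) for a daily run."""
    today = today or date.today()
    day = today - timedelta(days=1)  # the update fetches yesterday's GP history
    # Assumption: the file is keyed by the fetched day's year, not the run day's.
    return day, f"tle_{day.year}.parquet"


print(tle_target_for(date(2026, 1, 1)))
```

Note that `date.today()` uses the local clock; passing an explicit date avoids surprises when the run happens close to midnight.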
backfill-tle-history.py and backfill-starlink-snapshots.py are one-time scripts for filling gaps from Space-Track GP history. Not needed for ongoing operation.
All datasets are on Hugging Face. Load any dataset in one line:
from datasets import load_dataset
ds = load_dataset("juliensimon/<name>")
No API keys needed. Works with pandas, polars, DuckDB, and any Parquet-compatible tool.
Apache Parquet with zstd compression. Files range from a few KB to several GB.
~50 datasets update daily and ~20 update weekly; the rest are static snapshots. Each dataset page shows its schedule.
Yes. Code is MIT-licensed. Datasets are CC-BY-4.0 (with rare exceptions noted per dataset).
Each dataset page has a BibTeX citation block. See the Citation section below for citing the collection.
If you use these datasets, please cite:
@dataset{space_datasets,
author = {Simon, Julien},
title = {Space Datasets: Automated Space Data Pipelines for Hugging Face},
year = {2026},
publisher = {Hugging Face},
url = {https://github.com/juliensimon/space-datasets}
}
These datasets are built from the following public sources; please cite them as appropriate:
| Domain | Original source |
|---|---|
| Orbital Mechanics | CelesTrak (Dr. T.S. Kelso), Space-Track.org, GCAT (Jonathan McDowell), Starlink Insider, NASA/JPL CNEOS, NASA/JPL SSD, NASA NHATS, SatNOGS (Libre Space Foundation), UCS, IAU MDC, WMO OSCAR |
| Space Probes | NASA SPDF COHOWeb (Voyager, Pioneer), PDS Atmospheres (Cassini, MEDA), PDS Geosciences (ChemCam), ESA PSA (Mars Express, Rosetta, Venus Express), ISRO |
| Planetary Science | USGS Astrogeology (Robbins crater databases), Meteoritical Society (via NASA data.gov) |
| Space Weather | NOAA SWPC, WDC Kyoto, NASA CCMC DONKI, NOAA NCEI GOES-16 XRS, SILSO (Royal Observatory of Belgium), LASP LISIRD (F10.7), IERS |
| Astronomy | NASA Exoplanet Archive, NASA HEASARC (Fermi, Chandra, Swift), GWOSC (LIGO/Virgo/KAGRA), ATNF Pulsar Catalogue, OpenNGC, Green's SNR Catalog, SIMBAD (CDS Strasbourg), VizieR (CDS Strasbourg — VLASS, Cosmicflows-4, INTEGRAL, LHAASO, HAWC), Fermi LAT, CHIME/FRB, eROSITA, Pantheon+, lenscat |
| Physics | Particle Data Group (PDG), CRDB (Cosmic Ray DataBase), Pierre Auger Observatory (via Zenodo), IceCube (via HEASARC), Swift/BAT (NASA), INTEGRAL IBIS (ESA), TeVCat (via HEASARC), LHAASO (via VizieR), HAWC (via VizieR) |