Research-Grade Datasets.
Analysis-Ready. Curated over 4 years.
Now available at your fingertips.
Assembled and Standardised at ARTPARK.
Epidemiological data, weather records, administrative censuses and surveys, shapefiles, and more, all following common standards.
For Researchers & Scientists
Find the data you need, without the hassle
Browse through datasets curated over four years at ARTPARK. Each dataset comes with comprehensive data dictionaries, standardised formats, and clear documentation.
- Data Dictionaries: Every dataset includes detailed field descriptions, units, and methodology documentation
- Multiple Data Types: Tabular datasets, weather station records, and geospatial shapefiles in standard formats
- Consistent Structure: All datasets follow the same schema conventions, so they are easy to read and compare across sources
- Versioned Datasets: Track changes and access historical versions for reproducible research
| Dataset | Coverage | Access |
|---|---|---|
| IMD Weather Stations | 2020-2024 | Open |
| Karnataka Districts | 2023 | Open |
| Crop Yield Records | 2015-2023 | Request |
Quick Start
# Install with uv (recommended)
$ uv init && uv add dataio-artpark
# Register and obtain your API keys, then run:
$ uv run dataio init
# List datasets by collection
$ uv run dataio list-datasets --collection "Census Data"
# Download a dataset with metadata
$ uv run dataio download-dataset TS0001DS0042
For Developers
Powered by a first-class API
We built dataio to power our own research workflows — and now it powers this platform too! It's open-source on GitHub and available on PyPI, with a CLI for quick access, a Python SDK for scripting, and a REST API for full programmatic control.
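The Quick Start commands can also be driven from a Python script. The sketch below only wraps the CLI invocations shown on this page using the standard library; it does not use the dataio Python SDK, and the helper names (`list_datasets_cmd`, `download_dataset_cmd`, `render`, `run`) are hypothetical, chosen for illustration.

```python
import shlex
import subprocess

# Base invocation taken from the Quick Start: `uv run dataio ...`
BASE = ["uv", "run", "dataio"]


def list_datasets_cmd(collection: str) -> list[str]:
    """Build the argv for `dataio list-datasets --collection <name>`."""
    return [*BASE, "list-datasets", "--collection", collection]


def download_dataset_cmd(dataset_id: str) -> list[str]:
    """Build the argv for `dataio download-dataset <id>`."""
    return [*BASE, "download-dataset", dataset_id]


def render(cmd: list[str]) -> str:
    """Return a shell-quoted string of the command, useful for logging."""
    return shlex.join(cmd)


def run(cmd: list[str]) -> str:
    """Run the command and return its stdout.

    Raises subprocess.CalledProcessError on a non-zero exit code.
    Assumes `uv` is on PATH and `dataio init` has already been run.
    """
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with collection names that contain spaces, such as "Census Data". Calling `run(download_dataset_cmd("TS0001DS0042"))` would execute the same download as the last Quick Start command.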
- CLI Tool & Python SDK: Quick dataset discovery and download from your terminal or Python scripts
- REST API: Full programmatic access to datasets, tables, weather data, and shapefiles at state, district, or city level
- Reusable Code: Consistent dataset structure means code written for one dataset works for others
- Pipeline Ready: Integrate with CI/CD workflows and automation; we follow semver for reproducible systems
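Because releases follow semver, an automated pipeline can check the installed version before it runs. The sketch below is one possible guard, not part of dataio itself: it assumes version strings of the plain `MAJOR.MINOR.PATCH` form, and the function names are hypothetical. Only `importlib.metadata` from the standard library is used to read the installed version of the `dataio-artpark` package.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_semver(text: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH', dropping any '-prerelease' or '+build' suffix.

    Assumption: exactly three numeric components; anything else raises ValueError.
    """
    core = text.split("+", 1)[0].split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def is_compatible(installed: str, minimum: str) -> bool:
    """Caret-style check: installed >= minimum, with no breaking bump."""
    have, need = parse_semver(installed), parse_semver(minimum)
    if have < need:
        return False
    if need[0] == 0:
        # Under semver, 0.x releases may introduce breaking changes on a minor bump.
        return have[:2] == need[:2]
    return have[0] == need[0]


def check_installed(minimum: str, package: str = "dataio-artpark") -> bool:
    """Return True if the installed package satisfies the minimum version."""
    try:
        return is_compatible(version(package), minimum)
    except PackageNotFoundError:
        return False
```

A CI job could call `check_installed(...)` with the version it was last tested against and fail early if a major release has been pulled in.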
More features on the way
We're actively building new capabilities to make the platform even more powerful.
Advanced Analytics
Automated insights for every dataset you access.
- Statistical summaries & distributions
- Temporal trend visualisation
- Data quality scoring & anomaly detection
- Spatial coverage heatmaps
Data Playground
Interactive environment to explore and experiment with datasets.
- Jupyter-style notebook interface
- LLM integrations for natural language queries
- Auto-generated visualisations
- Share and collaborate on analyses
Auto Validation
Automated data quality checks during upload.
- Schema validation against templates
- Geospatial integrity checks
- Duplicate detection & deduplication
- Customisable validation rules
Open Standards, Transparent Design
Available for self-hosting, with full control over your data.
Open Source
Licensed under AGPL-3.0. Inspect the code, contribute improvements, or fork for your needs.
Self-Hostable
Deploy the entire platform on your own infrastructure. Full control over your data.
Open Standards
Standard data formats and APIs. No vendor lock-in, easy integration with existing tools.
Open Methodology
We're open-sourcing our data standardisation methodology so you can apply it to your own datasets.
Ready to explore?
Sign in to access the full catalog, download datasets, and generate API keys. No account yet? Sign in with your email to get started.