CoRE Stack Data Structure
Read this page before using CoRE Stack data deeply or building new pipelines.
It explains the main idea that ties the whole stack together:
- landscapes are represented through standardized spatial units
- many outputs are indexed to those units
- different pipeline families become joinable because they share those units
Note
This is a conceptual and documentation-facing model. It should be treated as the working contract for understanding the platform, not as a claim that one exact Django ORM schema already exposes all of it directly.
Why This Chapter Comes First
CoRE Stack does not start from dashboards or APIs. It starts from a way of representing landscapes.
The key move is the micro-watershed registry:
- hydrological units are better than purely administrative units for reasoning about water flow
- the units are stable enough to act like a registry
- once data is indexed to that registry, multiple layers can be joined and compared
Our Three Organizing Principles
1. Nested hydrological units
CoRE Stack treats river basins, sub-basins, watersheds, and micro-watersheds as a nested structure.
The micro-watershed is the most important practical planning unit in the current stack because many derived metrics eventually resolve there.
2. Crosswalks to administrative units
People often discover and use data through state, district, tehsil, village, and panchayat names.
That means CoRE Stack must constantly connect:
- hydrological correctness
- administrative usability
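A crosswalk of this kind can be pictured as a many-to-many lookup table. The sketch below is only an illustration: the row layout, the names `CROSSWALK` and `build_indexes`, and every value in it are assumptions made for this example, not the CoRE Stack schema.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (tehsil name, micro-watershed uid, village name).
# All values are invented for illustration.
CROSSWALK = [
    ("Tehsil A", "mws_001", "Village P"),
    ("Tehsil A", "mws_001", "Village Q"),
    ("Tehsil A", "mws_002", "Village Q"),
    ("Tehsil B", "mws_003", "Village R"),
]


def build_indexes(rows):
    """Index the crosswalk in both directions so either view can start a query."""
    mws_by_tehsil = defaultdict(set)
    villages_by_mws = defaultdict(set)
    for tehsil, mws_uid, village in rows:
        mws_by_tehsil[tehsil].add(mws_uid)
        villages_by_mws[mws_uid].add(village)
    return mws_by_tehsil, villages_by_mws


mws_by_tehsil, villages_by_mws = build_indexes(CROSSWALK)

# Administrative entry point: which hydrological units fall in a tehsil?
print(sorted(mws_by_tehsil["Tehsil A"]))   # ['mws_001', 'mws_002']

# Hydrological entry point: which villages does a micro-watershed touch?
print(sorted(villages_by_mws["mws_001"]))  # ['Village P', 'Village Q']
```

Note that "Village Q" appears under two micro-watersheds, which is the usual case when administrative and hydrological boundaries do not align.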
3. Connected rather than isolated entities
The stack is not only hierarchical. It is also networked.
Examples:
- upstream micro-watersheds drain into downstream ones
- waterbodies belong to or influence surrounding watershed units
- villages and assets intersect multiple hydrological contexts
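The upstream/downstream example can be made concrete with a small graph traversal. This is a minimal sketch, assuming each micro-watershed records at most one downstream neighbour; the `DRAINS_INTO` mapping, the `upstream_of` helper, and the uids are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical "drains into" links: uid -> downstream uid (None marks an outlet).
DRAINS_INTO = {
    "mws_a": "mws_c",
    "mws_b": "mws_c",
    "mws_c": "mws_d",
    "mws_e": "mws_d",
    "mws_d": None,
}


def upstream_of(target, drains_into):
    """Return every uid whose water eventually reaches `target`."""
    # Invert the links so each unit lists its direct upstream contributors.
    contributors = defaultdict(list)
    for source, downstream in drains_into.items():
        if downstream is not None:
            contributors[downstream].append(source)

    # Breadth-first walk against the direction of flow.
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for source in contributors[current]:
            if source not in seen:
                seen.add(source)
                queue.append(source)
    return seen


print(sorted(upstream_of("mws_c", DRAINS_INTO)))  # ['mws_a', 'mws_b']
print(sorted(upstream_of("mws_d", DRAINS_INTO)))  # ['mws_a', 'mws_b', 'mws_c', 'mws_e']
```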
Entity Relationship Diagram
classDiagram
class RiverBasin {
ID: string
name: string
geometry: Polygon
}
class SubBasin {
ID: string
name: string
geometry: Polygon
}
class Watershed {
ID: string
name: string
geometry: Polygon
}
class MicroWatershed {
ID: string
area: float (hectares)
uid: string
cropping_intensities: list<TimeSeries>
hydro_timeseries: list<TimeSeries>
terrain_breakdown: dict
aquifer_breakdown: dict
geometry: Polygon
}
class Waterbody {
ID: string
area: float (hectares)
surface_water_area: list<TimeSeries>
zone_of_influence: Polygon
zoi_cropping_intensities: list<TimeSeries>
zoi_fortnightly_ndvi: list<TimeSeries>
geometry: Polygon
}
class State {
ID: string
name: string
geometry: Polygon
}
class District {
ID: string
name: string
geometry: Polygon
}
class Tehsil {
ID: string
name: string
geometry: Polygon
}
class Village {
ID: string
name: string
geometry: Polygon
}
class NREGAAsset {
ID: string
name: string
work_type: string
geometry: Point
}
RiverBasin "1" *-- "many" SubBasin : contains
SubBasin "1" *-- "many" Watershed : contains
Watershed "1" *-- "many" MicroWatershed : contains
MicroWatershed "1" *-- "many" Waterbody : contains
State "1" *-- "many" District : contains
District "1" *-- "many" Tehsil : contains
Tehsil "1" *-- "many" Village : contains
Tehsil "1" *-- "many" MicroWatershed : contains
MicroWatershed "many" --> "many" Village : intersects
MicroWatershed "1" *-- "many" NREGAAsset : contains
Village "1" *-- "many" NREGAAsset : contains
MicroWatershed "many" --> "1" MicroWatershed : drains into
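For readers who think in code, a subset of the diagram can be written as typed records. This is a sketch of the conceptual model only, not the platform's actual classes: geometry and time-series fields are left out, and the attribute names `area_ha` and `drains_into` and the method `waterbody_area_share` are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Waterbody:
    id: str
    area_ha: float  # "area: float (hectares)" in the diagram


@dataclass
class MicroWatershed:
    id: str
    uid: str
    area_ha: float
    # uid of the single downstream unit ("drains into"); None for an outlet.
    drains_into: Optional[str] = None
    # "MicroWatershed contains many Waterbody" in the diagram.
    waterbodies: List[Waterbody] = field(default_factory=list)

    def waterbody_area_share(self) -> float:
        """Fraction of the unit's area covered by its waterbodies."""
        if self.area_ha <= 0:
            return 0.0
        return sum(w.area_ha for w in self.waterbodies) / self.area_ha


# Invented values for illustration.
mws = MicroWatershed(
    id="1",
    uid="mws_001",
    area_ha=1000.0,
    waterbodies=[Waterbody("wb_1", 12.5), Waterbody("wb_2", 7.5)],
)
print(mws.waterbody_area_share())  # 0.02
```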
What Each Layer Of The Model Is For
| Entity type | Why it exists in the stack | How users usually meet it |
|---|---|---|
| river basin, sub-basin, watershed | hydrological nesting and context | GEE assets, conceptual model, some datasets |
| micro-watershed | the main standardized planning and join unit | public APIs, STAC vector layers, dashboards, starter-kit outputs |
| waterbody | feature-level hydrological object nested inside larger units | waterbody APIs, vector layers, dashboards |
| state, district, tehsil | discovery and query entry points | GeoAdmin routes, public API parameters, STAC browsing |
| village and other enrichment layers | administrative and social context | geometry routes, enrichment outputs, planning analysis |
| assets such as NREGA works | intervention and planning context | enrichment layers and downstream use cases |
Why Micro-Watersheds And Not Only Villages
The rationale from the CoRE Stack registry writeup is important:
- water planning is fundamentally hydrological
- village and panchayat boundaries are useful, but they are not water-flow units
- hydrological units are more stable over time than many administrative boundaries
- a shared registry makes it easier for many actors to index and exchange data
CoRE Stack’s current registry is a pan-India micro-watershed delineation, with units of roughly 1,000 ha each, nested under larger hydrological boundaries.
That gives CoRE Stack something like a local-scale Earth-system grid:
- small enough for place-based planning
- standardized enough for large-scale data collation
How Data Gets Attached To The Registry
The core operational idea is:
- start from rasters and source layers
- compute derived signals such as terrain, land use, drought, or water balance
- vectorize or aggregate those signals onto standard spatial units
- store and publish them in a way that preserves stable identifiers
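The aggregation step above can be sketched as a zonal mean. The example assumes a toy raster stored as nested lists and a second grid of the same shape that labels each cell with a unit identifier; real pipelines would work on geospatial rasters, and the names `VALUES`, `UNITS`, and `zonal_mean` are hypothetical.

```python
from collections import defaultdict

# Hypothetical derived signal on a 3x4 grid; None marks a no-data cell.
VALUES = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 2.0, 5.0, 5.0],
    [3.0, 2.0, 6.0, None],
]

# Aligned grid giving the micro-watershed uid that each cell falls in.
UNITS = [
    ["mws_001", "mws_001", "mws_002", "mws_002"],
    ["mws_001", "mws_001", "mws_002", "mws_002"],
    ["mws_001", "mws_001", "mws_002", "mws_002"],
]


def zonal_mean(values, units):
    """Average the raster values per unit, skipping no-data cells."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value_row, unit_row in zip(values, units):
        for value, uid in zip(value_row, unit_row):
            if value is None:
                continue
            totals[uid] += value
            counts[uid] += 1
    # The result is keyed by uid, so it can be published against the registry.
    return {uid: totals[uid] / counts[uid] for uid in totals}


means = zonal_mean(VALUES, UNITS)
print(means["mws_001"])  # 2.0
print(means["mws_002"])  # 4.6
```

Keying the output by unit identifier rather than by grid position is what lets later layers be joined to it.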
That is why CoRE Stack repeatedly moves between:
- rasters
- vector layers
- watershed registries
- tehsil summaries
- API payloads
- STAC items and style files
The starter-kit mirrors this logic in Python by building structures like:
- tehsil
- micro-watersheds inside that tehsil
- waterbodies inside those micro-watersheds
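One way to picture such a nested structure is to group flat records into a tehsil -> micro-watershed -> waterbody tree. The sketch below does not reproduce the starter-kit's own code; the record fields, the `nest` helper, and the sample values are assumptions for illustration.

```python
# Hypothetical flat records, one per waterbody. A waterbody_id of None
# records a micro-watershed that has no waterbodies.
ROWS = [
    {"tehsil": "Tehsil A", "mws_uid": "mws_001", "waterbody_id": "wb_1"},
    {"tehsil": "Tehsil A", "mws_uid": "mws_001", "waterbody_id": "wb_2"},
    {"tehsil": "Tehsil A", "mws_uid": "mws_002", "waterbody_id": None},
    {"tehsil": "Tehsil B", "mws_uid": "mws_003", "waterbody_id": "wb_3"},
]


def nest(rows):
    """Group flat rows into {tehsil: {mws_uid: [waterbody_id, ...]}}."""
    tree = {}
    for row in rows:
        waterbodies = tree.setdefault(row["tehsil"], {}).setdefault(row["mws_uid"], [])
        if row["waterbody_id"] is not None:
            waterbodies.append(row["waterbody_id"])
    return tree


tree = nest(ROWS)
print(tree["Tehsil A"]["mws_001"])  # ['wb_1', 'wb_2']
print(tree["Tehsil A"]["mws_002"])  # []
```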
The Most Important Practical Relationships
- tehsil -> micro-watersheds: People often query by administrative name, but the computation is frequently organized on hydrological units.
- micro-watershed -> uid -> other datasets: The watershed `uid` is one of the most important stable keys in the current public surface.
- micro-watershed -> upstream/downstream connectivity: River rejuvenation, runoff reasoning, and catchment logic all depend on the fact that watersheds are connected, not isolated.
- micro-watershed -> villages -> assets: Many planning questions require hydrological and administrative layers together.
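The role of a stable key can be illustrated by joining two independently produced tables on the micro-watershed `uid`. This is a minimal sketch: the attribute names `terrain` and `runoff_mm`, the `join_on_uid` helper, and all values are invented, and real layers would carry many more fields.

```python
# Hypothetical outputs of two different pipeline families, both keyed by uid.
TERRAIN = {
    "mws_001": {"terrain": "plains"},
    "mws_002": {"terrain": "hills"},
}
HYDRO = {
    "mws_001": {"runoff_mm": 120.0},
    "mws_003": {"runoff_mm": 80.0},
}


def join_on_uid(left, right):
    """Inner-join two uid-keyed tables and report the unmatched uids."""
    shared = left.keys() & right.keys()
    joined = {uid: {**left[uid], **right[uid]} for uid in shared}
    only_left = left.keys() - right.keys()
    only_right = right.keys() - left.keys()
    return joined, only_left, only_right


joined, only_terrain, only_hydro = join_on_uid(TERRAIN, HYDRO)
print(joined)         # {'mws_001': {'terrain': 'plains', 'runoff_mm': 120.0}}
print(only_terrain)   # {'mws_002'}
print(only_hydro)     # {'mws_003'}
```

Reporting the unmatched identifiers is useful in practice, because a gap usually means one layer was computed for a different extent or registry version.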
What Users Should Remember
- If you understand the micro-watershed registry, the rest of the stack becomes much easier to navigate.
- If you only look at administrative units, you will miss how water actually moves.
- If you only look at hydrological units, you may miss how implementation and funding happen on administrative units.
- CoRE Stack tries to keep both views connected.
Limitations And What Can Improve
The current theory pages on the CoRE Stack site are very clear that this structure is useful, but not final.
Important limitations to keep in mind:
- surface-water logic is modeled much better than groundwater flow
- current boundaries were produced from SRTM-based delineation; newer DEMs such as FABDEM could improve future versions
- village and panchayat-level implementation still requires intersections and social coordination, not only hydrological correctness
- not every useful landscape entity is fully represented yet