Pipeline Integration Pattern

This page explains how existing CoRE Stack pipelines become fully integrated into the stack.

Read it as a pattern library, not as a single algorithm description.

Primary code surfaces:

What "Fully Integrated" Means Here

An integrated pipeline usually does most of the following:

accepts a stable ROI such as state, district, block, or project ROI
exposes an entry point through API, Django shell, or Celery
uses shared naming and asset-path helpers
runs the actual processing step
saves resulting layer metadata to the database
publishes vector or raster outputs to GeoServer when needed
optionally creates STAC or related catalog metadata
is documented on this docs site

flowchart LR
    A[Route or shell trigger] --> B[Task or direct function]
    B --> C[Processing logic]
    C --> D[Asset creation or local output]
    D --> E[Metadata in DB]
    D --> F[GeoServer]
    D --> G[STAC or related specs]
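
The flow above can be sketched as a thin orchestration function. This is a minimal sketch under stated assumptions, not the stack's actual code: `run_pipeline`, `PipelineResult`, and the callables passed in are hypothetical names invented for illustration. In a real pipeline the metadata and publication steps would be delegated to the shared helpers rather than to in-memory stand-ins, and the ordering of those steps is a choice made by this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineResult:
    """What one run produced and which integration steps were completed."""
    layer_name: str
    output: Dict
    steps_done: List[str] = field(default_factory=list)


def run_pipeline(
    roi: Dict,
    process: Callable[[Dict], Dict],
    save_metadata: Callable[[str, Dict], None],
    publish: Callable[[str, Dict], None],
    layer_name: str,
) -> PipelineResult:
    """Run processing, then metadata bookkeeping, then publication."""
    result = PipelineResult(layer_name=layer_name, output=process(roi))
    result.steps_done.append("processed")
    save_metadata(layer_name, result.output)  # stand-in for DB bookkeeping
    result.steps_done.append("metadata_saved")
    publish(layer_name, result.output)  # stand-in for a GeoServer upload
    result.steps_done.append("published")
    return result


# Hypothetical in-memory stand-ins so the sketch runs without any services.
metadata_store: Dict[str, Dict] = {}
published_layers: List[str] = []

demo = run_pipeline(
    roi={"state": "example_state", "district": "example_district", "block": "example_block"},
    process=lambda roi: {"feature_count": 3, "roi": roi},
    save_metadata=lambda name, out: metadata_store.update({name: out}),
    publish=lambda name, out: published_layers.append(name),
    layer_name="example_layer",
)
```

Passing the integration steps in as callables keeps the processing logic testable on its own, which matches the idea of adding publication only after the core logic works.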

Reusable Building Blocks

Concern	Typical helpers	Where to look
GEE setup	`ee_initialize()`	utilities/gee_utils.py
naming and path normalization	`valid_gee_text()`, `get_gee_asset_path()`, `get_gee_dir_path()`	utilities/gee_utils.py
vector export to GEE	`export_vector_asset_to_gee()`, `upload_shp_to_gee()`	utilities/gee_utils.py
raster export and publication	`sync_raster_to_gcs()`, `sync_raster_gcs_to_geoserver()`	utilities/gee_utils.py
vector publication	`sync_fc_to_geoserver()`, `sync_layer_to_geoserver()`, `push_shape_to_geoserver()`	computing/utils.py
layer bookkeeping	`save_layer_info_to_db()` and sync status helpers	computing/utils.py
route exposure	DRF views and task triggers	computing/api.py, computing/urls.py

Note

If your pipeline uses `ee_initialize()` or accepts `gee_account_id`, document the setup prerequisite explicitly and point operators to Google Earth Engine. In the current stack, that setup is required for most real compute runs.

Recurring Pipeline Shapes

1. Vector Clip and Publish

Used by pages such as:

Typical pattern:

build or load ROI
filter an external vector dataset
export to GEE or write local shapes
save layer metadata
publish to GeoServer
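
The filtering step in this pattern can be illustrated without any cloud services. The following is a simplified sketch: it assumes point features in a GeoJSON-like dictionary layout and approximates the ROI by a bounding box, whereas a real pipeline would clip against the actual ROI geometry in GEE or a geometry library. The function name `clip_points_to_bbox` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def clip_points_to_bbox(features: List[Dict], bbox: BBox) -> List[Dict]:
    """Keep only point features whose coordinates fall inside the ROI box."""
    min_x, min_y, max_x, max_y = bbox
    kept = []
    for feature in features:
        x, y = feature["geometry"]["coordinates"]
        if min_x <= x <= max_x and min_y <= y <= max_y:
            kept.append(feature)
    return kept


# Made-up external dataset with three points.
external_points = [
    {"id": "a", "geometry": {"type": "Point", "coordinates": (1.0, 1.0)}},
    {"id": "b", "geometry": {"type": "Point", "coordinates": (5.0, 5.0)}},
    {"id": "c", "geometry": {"type": "Point", "coordinates": (9.0, 9.0)}},
]

roi_bbox: BBox = (0.0, 0.0, 6.0, 6.0)
clipped = clip_points_to_bbox(external_points, roi_bbox)
clipped_ids = [feature["id"] for feature in clipped]
```

The clipped features are what the later steps (export, metadata, publication) would receive.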

2. Raster Clip and Publish

Used by pages such as:

Typical pattern:

build ROI
load a source raster
clip or derive the raster
export raster to GEE or GCS
publish through GeoServer
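
The clip-or-derive step can be sketched on a plain nested list standing in for a raster band. This is an illustration only: real pipelines in this pattern work on GEE images or raster files, and the names `clip_raster` and `derive_mask` are hypothetical. The ROI is assumed to have already been converted to a row and column window.

```python
from typing import List

Grid = List[List[float]]


def clip_raster(grid: Grid, row_start: int, row_stop: int,
                col_start: int, col_stop: int) -> Grid:
    """Return the window of the grid covered by the ROI (stop indices exclusive)."""
    return [row[col_start:col_stop] for row in grid[row_start:row_stop]]


def derive_mask(grid: Grid, threshold: float) -> List[List[int]]:
    """Derive a binary raster: 1 where the value meets the threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in grid]


# Made-up 4x4 source raster.
source = [
    [0.0, 1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 9.0, 10.0, 11.0],
    [12.0, 13.0, 14.0, 15.0],
]

clipped = clip_raster(source, 1, 3, 1, 3)
mask = derive_mask(clipped, threshold=6.0)
```

Keeping the clip and the derivation as separate functions makes it easy to publish either the clipped source or the derived product.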

3. Vector Enrichment or Spatial Join

Used by pages such as:

Typical pattern:

load ROI or administrative geometry
join with an external table or vector layer
compute attributes or dominant classes
export and publish the resulting feature collection
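
The join and dominant-class steps can be sketched with plain dictionaries. This is a minimal sketch assuming the external table shares a key (`unit_id` here) with the administrative features. A true spatial join would match on geometry instead, and all field names below are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def enrich_features(features: List[Dict], observations: List[Dict]) -> List[Dict]:
    """Attach an observation count and the dominant land class to each feature."""
    classes_by_unit: Dict[str, List[str]] = {}
    for obs in observations:
        classes_by_unit.setdefault(obs["unit_id"], []).append(obs["land_class"])

    enriched = []
    for feature in features:
        classes = classes_by_unit.get(feature["unit_id"], [])
        record = dict(feature)  # copy so the input features stay untouched
        record["observation_count"] = len(classes)
        # most_common(1) returns the most frequent class; None if no data joined.
        record["dominant_class"] = (
            Counter(classes).most_common(1)[0][0] if classes else None
        )
        enriched.append(record)
    return enriched


# Made-up administrative units and an external observation table.
units = [{"unit_id": "u1"}, {"unit_id": "u2"}, {"unit_id": "u3"}]
table = [
    {"unit_id": "u1", "land_class": "crop"},
    {"unit_id": "u1", "land_class": "forest"},
    {"unit_id": "u1", "land_class": "crop"},
    {"unit_id": "u2", "land_class": "water"},
]

result = enrich_features(units, table)
dominant = {record["unit_id"]: record["dominant_class"] for record in result}
```

Units with no matching rows keep a `None` dominant class rather than being dropped, so the output feature collection still covers the whole ROI.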

4. Mixed Raster and Vector Outputs

Used by pages such as:

These pages are useful when you want to study both raster publication and vector summary logic in one place.

5. Time-Series Helpers

Used by pages such as:

These are useful when your new pipeline depends on temporal compositing, per-class aggregation, or derived vegetation signals.
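
Temporal compositing, one of the dependencies mentioned above, can be sketched as grouping dated samples into periods and reducing each group. The example below is an assumption-laden illustration: it uses a monthly mean over made-up index values, while the actual helpers may use different periods or reducers. `monthly_composite` is a hypothetical name.

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Dict, List, Tuple

Period = Tuple[int, int]  # (year, month)


def monthly_composite(samples: List[Tuple[date, float]]) -> Dict[Period, float]:
    """Average all samples that fall in the same calendar month."""
    buckets: Dict[Period, List[float]] = defaultdict(list)
    for day, value in samples:
        buckets[(day.year, day.month)].append(value)
    # Sort by period so the composite reads in chronological order.
    return {period: mean(values) for period, values in sorted(buckets.items())}


# Made-up vegetation index samples for one location.
samples = [
    (date(2023, 2, 10), 0.625),
    (date(2023, 1, 5), 0.25),
    (date(2023, 1, 20), 0.75),
]

composite = monthly_composite(samples)
```

Swapping `mean` for `max` or a median gives other common composite styles without changing the grouping logic.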

Builder Workflow

Step 1: Decide the smallest useful output

Choose early whether the first useful version of your pipeline is:

vector only
raster only
tabular plus geometry
mixed raster and vector

That choice determines which shared helpers you will reuse.

Step 2: Start with the processing function

Write the core logic first. Make it clear what inputs the function needs and what output it returns before you add publication concerns.
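
One way to make inputs and outputs explicit is to give the core function typed input and output records and keep it free of I/O. The sketch below is hypothetical throughout: `AreaInput`, `AreaSummary`, and `summarize_parcels` are invented for illustration and do not correspond to an existing pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AreaInput:
    """Everything the processing step needs, stated up front."""
    roi_name: str
    parcel_areas_ha: List[float]
    min_area_ha: float


@dataclass(frozen=True)
class AreaSummary:
    """Everything the processing step returns; no files, no uploads."""
    roi_name: str
    parcel_count: int
    total_area_ha: float


def summarize_parcels(params: AreaInput) -> AreaSummary:
    """Pure processing logic: filter small parcels and total the rest."""
    kept = [area for area in params.parcel_areas_ha if area >= params.min_area_ha]
    return AreaSummary(
        roi_name=params.roi_name,
        parcel_count=len(kept),
        total_area_ha=sum(kept),
    )


summary = summarize_parcels(
    AreaInput(roi_name="example_block", parcel_areas_ha=[0.5, 2.0, 4.0], min_area_ha=1.0)
)
```

Because the function touches no database or server, it can be exercised from a shell or a unit test before any publication code exists.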

Step 3: Wrap it in a task or callable entry point

Depending on the use case, follow one of these patterns:

direct function for shell use
Celery task for async execution
DRF view in computing/api.py for HTTP callers
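
The three entry-point styles above can share one core function. The sketch below uses only the standard library so that it runs anywhere: the task-style and view-style wrappers are hypothetical stand-ins, not real Celery tasks or DRF views, and the view-style wrapper calls the task synchronously where a real view would more likely enqueue it and return immediately.

```python
from typing import Dict, Tuple

REQUIRED_FIELDS = ("state", "district", "block")


def compute_layer(state: str, district: str, block: str) -> Dict:
    """Hypothetical core logic, callable directly from a shell."""
    return {"layer_name": f"{district}_{block}_example".lower(), "state": state}


def task_entry(payload: Dict) -> Dict:
    """Task-style wrapper: accepts one JSON-serializable dict of arguments."""
    return compute_layer(payload["state"], payload["district"], payload["block"])


def http_entry(request_data: Dict) -> Tuple[int, Dict]:
    """View-style wrapper: validate the request, then hand off to the task."""
    missing = [name for name in REQUIRED_FIELDS if name not in request_data]
    if missing:
        return 400, {"error": f"missing fields: {', '.join(missing)}"}
    return 200, task_entry(request_data)


ok_status, ok_body = http_entry({"state": "S", "district": "North", "block": "B1"})
bad_status, bad_body = http_entry({"state": "S"})
```

Keeping the wrappers this thin means the shell, task, and HTTP paths cannot drift apart in behavior, since all of them end in the same `compute_layer` call.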

Step 4: Reuse shared integration helpers

Do not reinvent:

naming conventions
asset path generation
GeoServer upload paths
layer metadata persistence

Those are already shared across the stack.
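
To see why a single shared helper matters, consider a toy version of name normalization and path building. The functions below are hypothetical illustrations and are not the stack's `valid_gee_text()` or `get_gee_asset_path()`; the normalization rule and the path layout are assumptions made up for this sketch.

```python
import re


def normalize_name(text: str) -> str:
    """Lowercase, collapse runs of non-alphanumerics to '_', trim the ends."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def build_asset_path(root: str, state: str, district: str, block: str, suffix: str) -> str:
    """Build an asset path from ROI parts using one normalization rule."""
    parts = [normalize_name(part) for part in (state, district, block)]
    layer_name = f"{parts[1]}_{parts[2]}_{suffix}"
    return "/".join([root.rstrip("/"), *parts, layer_name])


path = build_asset_path(
    "projects/example/assets/", "Example State", "North District", "Block-7", "drainage"
)
```

If every pipeline wrote its own version of `normalize_name`, two pipelines could disagree on whether `Block-7` becomes `block_7` or `block7`, and their assets would no longer line up under the same ROI folder.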

Step 5: Document the pattern, not just the result

A good pipeline page should tell future contributors:

where the code lives
what entry point calls it
whether it is vector, raster, or mixed
which shared helpers it relies on
what cloud dependencies are required for full integration
how an operator obtains the `gee_account_id` if the pipeline needs one

Good Pages to Borrow From

If you are building a new pipeline, compare these examples:

If you need...	Start from
a vector clipping workflow	Drainage Lines
a raster clipping workflow	Catchment Area
a table-plus-geometry enrichment workflow	Facilities Proximity
mixed raster and vector publication	Stream Order
time-series processing ideas	NDVI Time Series

Incremental Adoption

A new contributor does not need to build the whole cloud publication chain on day one.

A practical way to start is:

understand an existing pipeline page in this section
prototype the data logic locally
expose a direct function or shell workflow
add API or Celery integration
add GEE, GeoServer, or STAC publication only when the science is stable

This staged approach mirrors the developer pages on this site: start with direct execution, then expose stable entry points, then add publication and metadata integration.