Local Pipeline First
This is the recommended first pipeline-development step for new or modified computations.
Before adding routes, Celery tasks, auth decorators, or dataset registration, make the actual analytical workflow work in a local, inspectable way.
Why Start Here
This sequence reduces confusion:
- if the science is wrong, you catch it before integration work
- if the data assumptions are wrong, you see them before HTTP and queue layers hide the problem
- if the output shape is unstable, you avoid publishing bad interfaces
The Local-First Sequence
flowchart LR
A[Choose one ROI and one output] --> B[Run the core function locally]
B --> C[Validate files, geometry, and fields]
C --> D[Make it repeatable from shell or script]
D --> E[Only then add Django Computing API]
What Counts As Local
Any of these are valid starting points:
- local sample files on your machine
- your own GEE assets
- public CoRE Stack layer metadata discovered from Public API References and STAC Specs
- existing CoRE Stack public layer inventories exposing gee_asset_path
Minimum Shape Of A Good First Prototype
- Keep the local entry point simple and explicit.
- Validate the result before you start thinking about APIs or publication.
- Return a path and a summary so the run is easy to inspect and compare.
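A minimal sketch of such an entry point, using only the Python standard library, might look like the following. The function name run_local_prototype, the bounding-box ROI, the output file name, and the area_deg2 field are all hypothetical placeholders rather than CoRE Stack APIs; the real analytical step would replace the stand-in computation.

```python
import json
from pathlib import Path


def run_local_prototype(roi_bbox, out_dir):
    """Hypothetical entry point: one ROI in, one GeoJSON file out.

    roi_bbox is (min_lon, min_lat, max_lon, max_lat). The "analysis" is a
    stand-in that writes the ROI rectangle with its area in square degrees.
    """
    min_lon, min_lat, max_lon, max_lat = roi_bbox
    if not (min_lon < max_lon and min_lat < max_lat):
        raise ValueError(f"degenerate ROI: {roi_bbox}")

    # Closed outer ring of the ROI rectangle (first vertex repeated at the end).
    ring = [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],
    ]
    feature = {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {"area_deg2": (max_lon - min_lon) * (max_lat - min_lat)},
    }
    collection = {"type": "FeatureCollection", "features": [feature]}

    out_path = Path(out_dir) / "prototype_output.geojson"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys keeps the file byte-stable across runs, which eases comparison.
    out_path.write_text(json.dumps(collection, indent=2, sort_keys=True))

    # Return both the path and a small summary so a run is easy to inspect.
    summary = {
        "feature_count": len(collection["features"]),
        "fields": sorted(feature["properties"]),
        "roi_bbox": list(roi_bbox),
    }
    return out_path, summary
```

Returning the summary alongside the path means two runs can be compared at a glance without opening the output file.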
What To Validate Before Moving On
- does the region of interest match what you intended?
- are the output fields stable and understandable?
- if the output is spatial, does it open cleanly in QGIS or Python?
- if the output is temporal, are the time windows and units clear?
- can you run it twice without changing behavior unexpectedly?
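Some of the checks above can be automated. The sketch below assumes the output is a GeoJSON file containing Polygon features and that the ROI is a simple bounding box; the helper names validate_geojson and file_digest are hypothetical, and a real workflow may need richer geometry handling than this.

```python
import hashlib
import json
from pathlib import Path


def validate_geojson(path, expected_fields, roi_bbox):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    data = json.loads(Path(path).read_text())
    features = data.get("features", [])
    if not features:
        problems.append("no features in output")

    min_lon, min_lat, max_lon, max_lat = roi_bbox
    for i, feature in enumerate(features):
        # Field stability: every feature carries exactly the expected fields.
        fields = set(feature.get("properties") or {})
        if fields != set(expected_fields):
            problems.append(
                f"feature {i}: fields {sorted(fields)} != {sorted(expected_fields)}"
            )

        # ROI match: every vertex of the polygon's outer ring lies inside the bbox.
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Polygon":
            problems.append(f"feature {i}: unsupported geometry {geometry.get('type')}")
            continue
        for vertex in geometry["coordinates"][0]:
            lon, lat = vertex[0], vertex[1]
            if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
                problems.append(f"feature {i}: vertex ({lon}, {lat}) outside ROI")
                break
    return problems


def file_digest(path):
    """SHA-256 of the file bytes, for comparing the outputs of two runs."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()
```

Running the computation twice and comparing file_digest of the two outputs is one simple way to check repeatability, provided the writer produces deterministic bytes (for example, sorted keys and no embedded timestamps).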
Practical Entry Points
Start from a plain callable function, a small script, or manage.py shell.
Open the first vector or raster result there before you wire publication.
If your workflow relies on GEE, confirm that the asset path or export target is correct before exposing it through an API.
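As an illustration of the "small script" option, the sketch below wraps a plain callable in an argparse entry point so the same run can be repeated with identical arguments. The compute function, the --bbox and --out flags, and the output layout are hypothetical; nothing here is an existing CoRE Stack command.

```python
import argparse
import json
from pathlib import Path


def compute(bbox):
    """Placeholder for the real analytical function (hypothetical)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return {
        "bbox": list(bbox),
        "width_deg": max_lon - min_lon,
        "height_deg": max_lat - min_lat,
    }


def main(argv=None):
    """Parse arguments, run the computation once, and write the result as JSON."""
    parser = argparse.ArgumentParser(description="Run the prototype for one ROI.")
    parser.add_argument(
        "--bbox", nargs=4, type=float, required=True,
        metavar=("MIN_LON", "MIN_LAT", "MAX_LON", "MAX_LAT"),
    )
    parser.add_argument("--out", type=Path, required=True, help="output JSON path")
    args = parser.parse_args(argv)

    result = compute(tuple(args.bbox))
    args.out.parent.mkdir(parents=True, exist_ok=True)
    args.out.write_text(json.dumps(result, indent=2, sort_keys=True))
    print(f"wrote {args.out}")
    return 0
```

From manage.py shell, main can be called directly with a list of arguments, for example main(["--bbox", "77", "12", "77.5", "12.5", "--out", "out/result.json"]). To run the file as a script, add the usual __main__ guard that calls main().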
Only After This Page
Once the local output is stable, the developer's next job is to decide:
- does it need an HTTP route?
- does it need async execution?
- does it need publication or dataset registration?
Those decisions belong to later pages.