LULC Generation Pipeline
Generate high-resolution Land Use Land Cover (LULC) maps using satellite imagery and machine learning.
Overview
graph TB
subgraph "Input"
A[Sentinel-2 Imagery]
B[Training Samples]
C[Auxiliary Data]
end
subgraph "Processing"
D[Preprocessing]
E[Feature Extraction]
F[Classification]
G[Post-processing]
end
subgraph "Output"
H[LULC Raster]
I[Accuracy Report]
J[Change Detection]
end
A --> D
B --> D
C --> D
D --> E
E --> F
F --> G
G --> H
G --> I
G --> J
The LULC pipeline classifies land into categories such as:
- Agriculture
- Forest
- Water Bodies
- Built-up Areas
- Barren Land
- Vegetation
Classification System
pie title LULC Classes Distribution
"Agriculture" : 45
"Forest" : 18
"Water" : 8
"Built-up" : 12
"Barren" : 10
"Vegetation" : 7
Class Definitions
| Class | Code | Description | Color |
|---|---|---|---|
| Agriculture | 1 | Croplands, plantations | #ffff00 |
| Forest | 2 | Dense tree cover | #006400 |
| Water | 3 | Rivers, lakes, reservoirs | #0000ff |
| Built-up | 4 | Urban areas, roads | #ff0000 |
| Barren | 5 | Bare soil, rock | #a52a2a |
| Vegetation | 6 | Grasslands, shrubs | #7cfc00 |
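The class codes and colors above can be kept in a single lookup so that rendering and statistics code agree on the scheme. The following is a minimal sketch; the names `LULC_CLASSES`, `hex_to_rgb`, and `class_rgb` are illustrative and not taken from the pipeline's code.

```python
# Class codes, names, and colors copied from the Class Definitions table.
# The dictionary and helper names are hypothetical.
LULC_CLASSES = {
    1: ("Agriculture", "#ffff00"),
    2: ("Forest", "#006400"),
    3: ("Water", "#0000ff"),
    4: ("Built-up", "#ff0000"),
    5: ("Barren", "#a52a2a"),
    6: ("Vegetation", "#7cfc00"),
}


def hex_to_rgb(color: str) -> tuple:
    """Convert '#rrggbb' to an (r, g, b) tuple of 0-255 integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def class_rgb(code: int) -> tuple:
    """Look up the display color for a class code."""
    _, color = LULC_CLASSES[code]
    return hex_to_rgb(color)
```

For example, `class_rgb(2)` gives the RGB triple for Forest, which could feed a color table when writing the color-mapped PNG.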
Algorithm Flow
sequenceDiagram
participant U as User
participant A as API
participant G as GEE
participant C as Classifier
participant S as Storage
U->>A: Request LULC Generation
A->>G: Query Sentinel-2
G->>A: Return ImageCollection
A->>A: Cloud Masking
A->>A: Compute Indices<br/>NDVI, NDWI, NDBI
A->>C: Train Random Forest
C->>C: Classify Pixels
A->>A: Majority Filter
A->>A: Sieve Small Polygons
A->>S: Export Results
S->>U: Download Link
Required Parameters
{
"state": "karnataka",
"district": "raichur",
"block": "devadurga",
"year": 2023,
"resolution": "10m"
}
graph LR
A[LULC Pipeline] --> B[Sentinel-2 L2A]
A --> C[Ground Truth Data]
A --> D[Existing LULC]
B --> E[10m Resolution<br/>Multi-spectral]
C --> F[Training Samples<br/>Reference Points]
D --> G[Validation Data<br/>Comparison]
| Data Source | Purpose | Resolution |
|---|---|---|
| Sentinel-2 L2A | Primary imagery | 10m |
| Training samples | Supervised classification | Point data |
| Existing LULC | Validation | 30-100m |
Processing Steps
1. Data Acquisition
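import ee

# `year` and `geometry` are assumed to come from the request parameters,
# and ee.Initialize() is assumed to have been called already.
# Query Sentinel-2 for the full calendar year; filterDate treats the end date as exclusive
start_date = f"{year}-01-01"
end_date = f"{year + 1}-01-01"
# Filter by region and keep scenes with less than 20% cloudy pixels
image_collection = (
    ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
    .filterDate(start_date, end_date)
    .filterBounds(geometry)
    .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20))
)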
2. Preprocessing
graph LR
A[Raw Imagery] --> B[Cloud Masking]
B --> C[Atmospheric Correction]
C --> D[Median Composite]
D --> E[Clip to Region]
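To illustrate the median composite step in the diagram above, here is a small pure-Python sketch of a per-pixel median over a time series in which cloud-masked pixels are ignored. It is not the Earth Engine implementation; the function name `median_composite` and the use of `None` as the masked value are assumptions made for the example.

```python
from statistics import median


def median_composite(stack, nodata=None):
    """Per-pixel median over a time series of single-band images.

    `stack` is a list of 2-D lists (one per acquisition date) with equal
    shapes. Pixels equal to `nodata` (e.g. cloud-masked) are skipped.
    A pixel that is masked on every date stays `nodata`.
    """
    rows, cols = len(stack[0]), len(stack[0][0])
    composite = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            valid = [img[r][c] for img in stack if img[r][c] != nodata]
            out_row.append(median(valid) if valid else nodata)
        composite.append(out_row)
    return composite
```

Because the median ignores extreme values, a single residual cloud or shadow observation at a pixel has little effect on the composite as long as most dates are clear.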
3. Feature Extraction
Calculated spectral indices:
| Index | Formula | Purpose |
|---|---|---|
| NDVI | (NIR-Red)/(NIR+Red) | Vegetation health |
| NDWI | (NIR-SWIR)/(NIR+SWIR) | Water detection |
| NDBI | (SWIR-NIR)/(SWIR+NIR) | Built-up areas |
| EVI | 2.5*(NIR-Red)/(NIR+6*Red-7.5*Blue+1) | Enhanced vegetation |
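The index formulas in the table can be written as plain functions. This sketch assumes surface reflectance values already scaled to the 0-1 range (Sentinel-2 SR products store them multiplied by 10000) and a band mapping of B8 for NIR, B4 for Red, B2 for Blue, and B11 for SWIR; the mapping actually used by the pipeline is not stated in this document. Function names are illustrative.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def ndvi(nir: float, red: float) -> float:
    """NDVI = (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndwi(nir: float, swir: float) -> float:
    """NDWI as defined in the table: (NIR - SWIR) / (NIR + SWIR)."""
    return normalized_difference(nir, swir)


def ndbi(swir: float, nir: float) -> float:
    """NDBI = (SWIR - NIR) / (SWIR + NIR)."""
    return normalized_difference(swir, nir)


def evi(nir: float, red: float, blue: float) -> float:
    """EVI = 2.5 * (NIR - Red) / (NIR + 6*Red - 7.5*Blue + 1)."""
    return 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)
```

Note that with these two definitions NDBI is the exact negative of NDWI for the same pixel, so the pair contributes one independent feature to the classifier rather than two.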
4. Classification
graph TB
A[Input Features] --> B[Random Forest]
A --> C[Training Samples]
B --> D{Bootstrap<br/>Aggregation}
C --> D
D --> E[Decision Tree 1]
D --> F[Decision Tree 2]
D --> G[Decision Tree N]
E --> H[Majority Vote]
F --> H
G --> H
H --> I[Classified Map]
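The final aggregation step in the diagram, where each decision tree votes and the most common class wins, can be sketched as follows. This covers only the voting stage, not tree training, and the tie-breaking rule (lowest class code wins) is an assumption made for determinism; function names are illustrative.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the most trees for one pixel.

    Ties are broken in favor of the lowest class code (an assumption
    for this sketch, not a documented pipeline rule).
    """
    counts = Counter(tree_predictions)
    best = max(counts.values())
    return min(code for code, n in counts.items() if n == best)


def classify_pixels(per_tree_maps):
    """Combine per-tree predictions into one classified map.

    `per_tree_maps` is a list with one entry per tree; each entry is a
    flat list of class codes, one per pixel, in the same pixel order.
    """
    return [majority_vote(votes) for votes in zip(*per_tree_maps)]
```

With three trees predicting `[1, 2, 3]`, `[1, 2, 4]`, and `[2, 2, 4]` for three pixels, the combined map is `[1, 2, 4]`.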
5. Post-processing
- Majority filter: Remove salt-and-pepper noise
- Sieve filter: Remove small polygons (< 0.5 ha)
- Smoothing: Gaussian kernel smoothing
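Two of the steps above are easy to make concrete. The sketch below converts the 0.5 ha sieve threshold into a pixel count at a given resolution and applies a 3x3 majority filter to a small grid. It is a pure-Python illustration under stated assumptions (a center pixel is replaced only when one class holds a strict majority of its window), not the pipeline's own filter; function names are hypothetical.

```python
import math
from collections import Counter


def min_pixels(min_area_ha: float, resolution_m: float) -> int:
    """Number of pixels that make up `min_area_ha` hectares (1 ha = 10,000 m^2)."""
    pixel_area_m2 = resolution_m ** 2
    return math.ceil(min_area_ha * 10_000 / pixel_area_m2)


def majority_filter(grid):
    """Apply a 3x3 majority filter to a 2-D list of class codes.

    Each pixel takes the most common class in its neighborhood, but only
    if that class holds a strict majority of the window; otherwise the
    pixel keeps its value. Windows are truncated at the grid edges.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            value, count = Counter(window).most_common(1)[0]
            if count * 2 > len(window):
                out[r][c] = value
    return out
```

At 10 m resolution each pixel covers 100 m², so the 0.5 ha threshold corresponds to `min_pixels(0.5, 10)`, that is 50 pixels.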
Output Products
Raster Output
graph LR
A[LULC Raster] --> B[GeoTIFF Format]
A --> C[Color-mapped PNG]
A --> D[Class Statistics]
| Property | Value |
|---|---|
| Format | GeoTIFF |
| Resolution | 10m |
| CRS | EPSG:4326 |
| Bands | 1 (class values) |
| Compression | LZW |
Statistics Report
{
"classification_stats": {
"total_pixels": 1250000,
"resolution_m": 10,
"area_sqkm": 125.0,
"class_distribution": {
"agriculture": {"pixels": 562500, "area_sqkm": 56.25, "percentage": 45.0},
"forest": {"pixels": 225000, "area_sqkm": 22.5, "percentage": 18.0},
"water": {"pixels": 100000, "area_sqkm": 10.0, "percentage": 8.0},
"built_up": {"pixels": 150000, "area_sqkm": 15.0, "percentage": 12.0},
"barren": {"pixels": 125000, "area_sqkm": 12.5, "percentage": 10.0},
"vegetation": {"pixels": 87500, "area_sqkm": 8.75, "percentage": 7.0}
}
},
"accuracy": {
"overall": 88.5,
"kappa": 0.85,
"producer_accuracy": {...},
"user_accuracy": {...}
}
}
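The area and percentage figures in the report follow directly from the pixel counts and the pixel size. A minimal sketch of that arithmetic, using a hypothetical helper `class_statistics` and the sample counts from the report above:

```python
def class_statistics(pixel_counts: dict, resolution_m: float) -> dict:
    """Derive per-class area (sq km) and percentage from pixel counts.

    One pixel covers resolution_m ** 2 square meters, and
    1 sq km = 1,000,000 square meters.
    """
    total = sum(pixel_counts.values())
    stats = {}
    for name, pixels in pixel_counts.items():
        stats[name] = {
            "pixels": pixels,
            "area_sqkm": round(pixels * resolution_m ** 2 / 1_000_000, 4),
            "percentage": round(100 * pixels / total, 2),
        }
    return stats


# Pixel counts taken from the sample report
sample_counts = {
    "agriculture": 562500,
    "forest": 225000,
    "water": 100000,
    "built_up": 150000,
    "barren": 125000,
    "vegetation": 87500,
}
sample_stats = class_statistics(sample_counts, resolution_m=10)
```

With 1,250,000 pixels at 10 m, the total comes to 125 sq km, matching `area_sqkm` in the report.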
API Usage
Request
curl -X POST "https://geoserver.core-stack.org/api/v1/lulc_for_tehsil/" \
-H "Content-Type: application/json" \
-d '{
"state": "karnataka",
"district": "raichur",
"block": "devadurga",
"start_year": 2022,
"end_year": 2023,
"version": "v3",
"gee_account_id": 1
}'
Response
{
"Success": "generate_lulc_v3_tehsil task initiated"
}
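The same request can be prepared from Python with the standard library. This sketch only builds the request object, using the URL and field names from the curl example; the helper name `build_lulc_request`, the year-order check, and the default values are assumptions, and any authentication the server may require is not covered here.

```python
import json
from urllib import request

API_URL = "https://geoserver.core-stack.org/api/v1/lulc_for_tehsil/"


def build_lulc_request(state, district, block, start_year, end_year,
                       version="v3", gee_account_id=1):
    """Build (but do not send) the POST request shown in the curl example."""
    if start_year > end_year:
        raise ValueError("start_year must not be after end_year")
    payload = {
        "state": state,
        "district": district,
        "block": block,
        "start_year": start_year,
        "end_year": end_year,
        "version": version,
        "gee_account_id": gee_account_id,
    }
    return request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To submit the task, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the response. Since the endpoint only initiates a task, the response confirms submission rather than returning the raster.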
Local Mode
Local-first LULC documentation is still being aligned with the backend. For the current route surface, use the task-submission handlers in computing/api.py and treat the Local-First section in this docs repo as roadmap material until that alignment is complete.
graph LR
A[Local Request] --> B[STAC Catalog]
B --> C[Download Pre-computed]
C --> D[Return Result]
Change Detection
Compare LULC between two time periods:
graph TB
A[LULC 2019] --> C[Change Detection]
B[LULC 2023] --> C
C --> D[Change Matrix]
C --> E[Hotspot Analysis]
C --> F[Transition Map]
Change Matrix Example
| From/To | Ag | Forest | Water | Built-up |
|---|---|---|---|---|
| Ag | 85% | 5% | 2% | 8% |
| Forest | 10% | 88% | 1% | 1% |
| Water | 5% | 2% | 92% | 1% |
| Barren | 45% | 10% | 3% | 42% |
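A change matrix of this kind can be derived by pairing each pixel's class in the earlier map with its class in the later map and normalizing each row to percentages. The following is a minimal sketch on flat lists of class labels; the function name `change_matrix` is illustrative and the rounding to one decimal place is an assumption.

```python
from collections import Counter, defaultdict


def change_matrix(before, after):
    """Row-normalized transition percentages between two classified maps.

    `before` and `after` are flat sequences of class labels for the same
    pixels in the same order. The result maps each source class to a
    dict of {destination class: percent of the source class's pixels}.
    """
    if len(before) != len(after):
        raise ValueError("maps must cover the same pixels")
    pair_counts = Counter(zip(before, after))
    row_totals = Counter(before)
    matrix = defaultdict(dict)
    for (src, dst), n in pair_counts.items():
        matrix[src][dst] = round(100 * n / row_totals[src], 1)
    return dict(matrix)
```

Each row sums to 100% (up to rounding), which is why the rows in the example table above also add up to 100.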
Accuracy Assessment
Confusion Matrix
pie title Agriculture Predictions
"Correct (85)" : 85
"Forest (5)" : 5
"Water (2)" : 2
"Built-up (8)" : 8
Accuracy Metrics
| Metric | Value | Description |
|---|---|---|
| Overall Accuracy | 89.3% | Correctly classified / Total |
| Kappa Coefficient | 0.87 | Agreement beyond chance |
| Producer's Accuracy | 85-92% | Correct / Actual class |
| User's Accuracy | 84-93% | Correct / Predicted class |
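The four metrics in the table can all be computed from a confusion matrix. The sketch below assumes rows hold the reference (actual) classes and columns the predicted classes, and that every class appears at least once in both; the function name `accuracy_metrics` is illustrative.

```python
def accuracy_metrics(cm):
    """Compute accuracy metrics from a square confusion matrix.

    cm[i][j] is the number of samples whose reference class is i and
    whose predicted class is j. Assumes no empty rows or columns.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    row_totals = [sum(cm[i]) for i in range(n)]                       # actual per class
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predicted per class

    overall = correct / total
    # Agreement expected by chance, from the row and column marginals
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    return {
        "overall": overall,
        "kappa": kappa,
        "producer_accuracy": [cm[i][i] / row_totals[i] for i in range(n)],
        "user_accuracy": [cm[i][i] / col_totals[i] for i in range(n)],
    }
```

For a two-class matrix `[[45, 5], [10, 40]]`, overall accuracy is 0.85 and chance agreement is 0.5, so kappa is (0.85 - 0.5) / (1 - 0.5) = 0.7.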
Troubleshooting
Low Accuracy
| Issue | Cause | Solution |
|---|---|---|
| Cloud contamination | Poor cloud masking | Lower the maximum cloud-cover threshold or apply stricter cloud masking |
| Class confusion | Similar spectral signatures | Add more training data |
| Mixed pixels | 10m resolution | Use sub-pixel analysis |
Processing Errors
| Error | Solution |
|---|---|
| "No clear images" | Expand date range |
| "Memory exceeded" | Reduce region size |
| "Export failed" | Check storage quota |
Best Practices
- Use dry season imagery - Better visibility of land features
- Multi-year composites - Reduce year-to-year variation
- Ground truth validation - Essential for accuracy
- Consistent classification - Use same scheme for change detection