Calibrated from a GEE execution on 2026-03-29 (actual: 0.732). It will be validated against QGIS and folia outputs. The cloud-masking method and the temporal compositing approach will cause cross-platform variation. The value is the mean NDVI of a temporal median composite of cloud-free pixels.
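
As a hedged illustration of why the compositing approach alone can shift the result, the sketch below compares two orderings on a single synthetic pixel: taking the per-band median before computing NDVI, versus computing NDVI per date and then taking the median. The reflectance values and the `ndvi` helper are made up for this example; the source does not say which ordering each platform uses.

```python
from statistics import median

def ndvi(nir, red):
    """Normalized difference of near-infrared and red reflectance."""
    return (nir - red) / (nir + red)

# Synthetic reflectances for one pixel over three cloud-free dates (not real data).
red = [400, 600, 1000]
nir = [3400, 3000, 3200]

# Approach A: median of each band over time, then NDVI of the composite.
composite_first = ndvi(median(nir), median(red))

# Approach B: NDVI for each date, then the median over time.
index_first = median(ndvi(n, r) for n, r in zip(nir, red))

print(f"composite then NDVI: {composite_first:.3f}")  # 0.684
print(f"NDVI then composite: {index_first:.3f}")      # 0.667
```

The two orderings disagree here because the median is taken independently per band in approach A, so the red and near-infrared values that end up in the composite need not come from the same date.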
| Workflow | Model | Backend | Status | Answer | Error | Cost | Latency |
|---|---|---|---|---|---|---|---|
| exec | gold | gee | PASS | 0.7322962578128589 | 0.3% | $0.000034 | 611ms |
| exec | gold | folia-rust | PASS | 0.735694408416748 | 0.8% | --- | 7ms |
| exec | gold | qgis | PASS | 0.7356943609190687 | 0.8% | --- | 150ms |
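
The Error column appears consistent with a relative deviation from the rounded ground truth of 0.73, although the source does not state the formula. The sketch below makes that assumption explicit; `relative_error_pct` is a hypothetical helper, not part of the folia tooling.

```python
GROUND_TRUTH = 0.73  # approximate ground-truth mean NDVI stated in the source

def relative_error_pct(answer, truth=GROUND_TRUTH):
    """Absolute deviation from the ground truth, as a percentage of it."""
    return abs(answer - truth) / abs(truth) * 100

# Answers copied from the results table.
answers = {
    "gee": 0.7322962578128589,
    "folia-rust": 0.735694408416748,
    "qgis": 0.7356943609190687,
}

for backend, answer in answers.items():
    print(f"{backend}: {relative_error_pct(answer):.1f}%")
```

Under this assumption the three rows come out as 0.3%, 0.8%, and 0.8%, matching the table.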
Known-correct folia spec for this problem. This is the reference implementation used for backend quality testing.
# Platform Comparison: Sentinel-2 NDVI — Gold Spec
#
# Compute NDVI from a cloud-masked Sentinel-2 summer 2023 composite
# over Iowa farmland. Ground truth: mean NDVI ~0.73.
#
# This spec is designed to run through both the Python backend and
# browser-wasm executor (folia bench exec -b browser-wasm).
name: sentinel2-ndvi
version: "1.0"
description: >
  Compute mean NDVI from Sentinel-2 L2A Surface Reflectance over
  central Iowa farmland during summer 2023 (Jun-Aug). Cloud-mask
  using the SCL band, create a temporal median composite, then
  compute NDVI = (B08 - B04) / (B08 + B04).
  Ground truth: ~0.73 (calibrated from GEE 2026-03-29).
settings:
  default_bbox: [-93.5, 41.5, -93.2, 41.7]
  default_crs: EPSG:4326
layers:
  # ============================================================
  # SOURCE LAYERS
  # ============================================================
  source/s2-collection:
    uri: "@sentinel-2-l2a"
    type: raster
    description: >
      Sentinel-2 L2A Surface Reflectance (Harmonized).
      Summer 2023 over central Iowa.
    params:
      bbox: [-93.5, 41.5, -93.2, 41.7]
      datetime: "2023-06-01/2023-09-01"
      assets: [B04, B08, SCL]
      query: { "eo:cloud_cover": { "lt": 20 } }
  # ============================================================
  # COMPUTE: CLOUD MASK + COMPOSITE + NDVI
  # ============================================================
  compute/cloud-masked:
    type: raster
    description: >
      Cloud-masked Sentinel-2 using SCL band.
      Keep vegetation (4), bare soil (5), water (6).
    compute:
      op: cloud_mask_sentinel2
      inputs:
        data: { layer: source/s2-collection }
      params:
        scl_classes: [4, 5, 6]
  compute/composite:
    type: raster
    description: >
      Temporal median composite of cloud-masked imagery.
    compute:
      op: temporal_reduce
      inputs:
        data: { layer: compute/cloud-masked }
      params:
        method: median
  compute/ndvi:
    type: raster
    description: >
      NDVI = (B08 - B04) / (B08 + B04).
    compute:
      op: raster_ndvi
      inputs:
        data: { layer: compute/composite }
      params:
        nir_band: B08
        red_band: B04
  # ============================================================
  # RESULT: MEAN NDVI
  # ============================================================
  result/mean-ndvi:
    type: table
    description: >
      Mean NDVI across the Iowa farmland AOI.
      Ground truth: ~0.73.
    compute:
      op: analysis_zonal_stats
      params:
        stats: [mean]
      inputs:
        raster: { layer: compute/ndvi }
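
The order of operations in the spec (SCL cloud mask, temporal median per band, NDVI, then mean) can be sketched in plain Python on a toy two-pixel time series. This is only an illustration of the arithmetic, not how any backend implements `cloud_mask_sentinel2`, `temporal_reduce`, `raster_ndvi`, or `analysis_zonal_stats`; the function names and the synthetic reflectance values are assumptions made for the example.

```python
from statistics import fmean, median

KEEP_SCL = {4, 5, 6}  # vegetation, bare soil, water (scl_classes in the spec)

def masked_median(values, scl):
    """Median over time of the values whose SCL class is kept; None if none are."""
    clear = [v for v, c in zip(values, scl) if c in KEEP_SCL]
    return median(clear) if clear else None

def mean_ndvi(red_series, nir_series, scl_series):
    """Each argument holds one list per pixel with that pixel's values over time."""
    ndvi = []
    for red_t, nir_t, scl_t in zip(red_series, nir_series, scl_series):
        red = masked_median(red_t, scl_t)
        nir = masked_median(nir_t, scl_t)
        if red is None or nir is None or red + nir == 0:
            continue  # no clear observation, or the ratio is undefined
        ndvi.append((nir - red) / (nir + red))
    return fmean(ndvi)

# Synthetic data: two pixels, three dates each (not real Sentinel-2 values).
red = [[400, 9000, 600], [500, 500, 500]]
nir = [[3000, 9500, 3400], [3500, 3500, 3500]]
scl = [[4, 9, 4], [4, 4, 8]]  # classes 8 and 9 are dropped by the mask

result = mean_ndvi(red, nir, scl)
print(f"mean NDVI: {result:.4f}")
```

Pixels with no clear observation are skipped rather than counted as zero, which is one of several reasonable choices and may differ from what a given platform does.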
The prompt given to LLMs in single-shot workflow benchmarks.
Problem: Compute the mean NDVI from Sentinel-2 imagery over Iowa
farmland during summer 2023 (June-August). Apply cloud masking
and create a temporal median composite, then report mean NDVI.
Study area: -93.5, 41.5, -93.2, 41.7 (central Iowa farmland).
Data: Sentinel-2 L2A Surface Reflectance (Harmonized).
Expected answer: approximately 0.73 mean NDVI.