Editing as Reasoning — Amaze-Bench Leaderboard

Editing-as-Reasoning (EAR) turns visual planning from step-by-step generation into a single-step image transformation.

Violation & Coverage

Violation (↓): Percentage of predicted path cells that fall outside the ground-truth (GT) path (%).

Coverage (↑): Percentage of predicted path cells that fall on GT path cells (%).
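
The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the leaderboard's evaluation code: the function name `violation_and_coverage`, the representation of paths as `(row, col)` cells, and the handling of an empty prediction are all assumptions. Both percentages use the number of predicted path cells as the denominator, following the wording above.

```python
def violation_and_coverage(pred_cells, gt_cells):
    """Return (violation %, coverage %) for one sample.

    pred_cells, gt_cells: iterables of (row, col) grid coordinates
    (a hypothetical representation chosen for this sketch).
    """
    pred = set(pred_cells)
    gt = set(gt_cells)
    if not pred:
        # Assumption: an empty prediction scores 0 on both metrics.
        return 0.0, 0.0
    outside = len(pred - gt)  # predicted cells that miss the GT path
    inside = len(pred & gt)   # predicted cells that land on the GT path
    return 100.0 * outside / len(pred), 100.0 * inside / len(pred)


gt_path = [(0, 0), (0, 1), (1, 1), (2, 1)]
pred_path = [(0, 0), (0, 1), (1, 1), (1, 2)]
violation, coverage = violation_and_coverage(pred_path, gt_path)
print(violation, coverage)  # 25.0 75.0
```

Under this reading the two values sum to 100 for any non-empty prediction. If the leaderboard instead normalizes Coverage by the number of GT cells, only the second denominator would change.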

MSE In & MSE Out

MSE In (↓): Mean squared error (MSE) inside the ground-truth (GT) path region.

MSE Out (↓): Mean squared error (MSE) outside the GT path region.
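
A rough sketch of a region-restricted MSE follows. It assumes the error is taken between predicted and GT pixel values and that the GT path region is given as a boolean mask. Neither detail is stated above, and the name `masked_mse` and the nested-list image format are hypothetical.

```python
def masked_mse(pred_img, gt_img, mask, inside=True):
    """Mean squared error over pixels whose mask value equals `inside`.

    pred_img, gt_img: 2D lists of floats with the same shape.
    mask: 2D list of booleans, True on the GT path region.
    inside=True gives MSE In, inside=False gives MSE Out.
    """
    total = 0.0
    count = 0
    for pred_row, gt_row, mask_row in zip(pred_img, gt_img, mask):
        for p, g, m in zip(pred_row, gt_row, mask_row):
            if bool(m) == inside:
                total += (p - g) ** 2
                count += 1
    # Assumption: an empty region contributes an MSE of 0.
    return total / count if count else 0.0


gt_img = [[1.0, 0.0], [1.0, 0.0]]
pred_img = [[0.5, 0.0], [1.0, 1.0]]
path_mask = [[True, False], [True, False]]

print(masked_mse(pred_img, gt_img, path_mask, inside=True))   # 0.125
print(masked_mse(pred_img, gt_img, path_mask, inside=False))  # 0.5
```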

Pass@1 & Pass@5

Pass@1 (↑): Percentage of samples for which a single generation yields a valid path.

Pass@5 (↑): Percentage of samples for which at least one of five generations yields a valid path.
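
The sketch below computes both scores, assuming the usual pass@k reading: a sample counts as solved if at least one of its first k generations is a valid path. The function name `pass_at_k` and the per-sample list of validity flags are hypothetical. How a path is judged valid is outside the scope of this example.

```python
def pass_at_k(results, k):
    """Percentage of samples with a valid path among the first k generations.

    results: list with one entry per sample; each entry is a list of
    booleans, one per generation, True if that generation was valid.
    """
    if not results:
        return 0.0
    solved = sum(1 for generations in results if any(generations[:k]))
    return 100.0 * solved / len(results)


# Four samples, five generations each (made-up flags for illustration).
flags = [
    [True, False, False, False, False],
    [False, False, True, False, False],
    [False, False, False, False, False],
    [False, False, False, False, True],
]
print(pass_at_k(flags, 1))  # 25.0
print(pass_at_k(flags, 5))  # 75.0
```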

Benchmark: Amaze
Tip: Switch between Maze / Queen tasks, then click table headers to sort within each group.