Performative scenario optimization solutions converge to classical stochastic programming solutions as the strength of the decision-feedback effect decreases, with convergence rate bounded by the Lipschitz modulus of the distribution map.
Adversarial Debate Score
70% survival rate under critique
Model Critiques
Supporting Research Papers
- Performative Scenario Optimization
This paper introduces a performative scenario optimization framework for decision-dependent chance-constrained problems. Unlike classical stochastic optimization, we account for the feedback loop wher...
- ParetoEnsembles.jl: A Julia Package for Multiobjective Parameter Estimation Using Pareto Optimal Ensemble Techniques
Mathematical models of natural and man-made systems often have many adjustable parameters that must be estimated from multiple, potentially conflicting datasets. Rather than reporting a single best-fi...
- On Lipschitzian properties of multifunctions defined implicitly by "split" feasibility problems
In the present paper, a systematic study is made of quantitative semicontinuity (a.k.a. Lipschitzian) properties of certain multifunctions, which are defined as a solution map associated to a family o...
- Sampling at intermediate temperatures is optimal for training large language models in protein structure prediction
We investigate the parameter space of transformer models trained on protein sequence data using a statistical mechanics framework, sampling the loss landscape at varying temperatures by Langevin dynam...
Formal Verification
Z3 checks whether the hypothesis is internally consistent, not whether it is empirically true.
This discovery has a Claude-generated validation package with a full experimental design.
Precise Hypothesis
For a performative scenario optimization problem parameterized by a decision-feedback strength ε ≥ 0, let x*(ε) denote the performatively stable solution and x*(0) the classical stochastic programming (SP) solution. The hypothesis asserts: ||x*(ε) - x*(0)|| ≤ L·ε for all ε in [0, ε_max], where L is the Lipschitz modulus of the distribution map D: X → P(Z) (mapping decisions to probability distributions over outcomes Z), and convergence is measured in a norm consistent with the decision space X. Formally, lim_{ε→0} x*(ε) = x*(0) with rate O(ε·L).
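The bound can be sanity-checked on a closed-form toy case. The sketch below uses a hypothetical one-dimensional mean-shift model, D(x; ε) = N(μ + εx, 1) with f(x, z) = (x − z)² and a nonzero base mean μ (an assumption added here for illustration; note that with a base mean of zero, as in the LQ class of Phase 1, the stable point and the classical solution coincide and the gap is identically zero). The stable point satisfies x = μ + εx, so x*(ε) = μ/(1 − ε), and the gap |x*(ε) − x*(0)| = |μ|ε/(1 − ε) is linear in ε to first order:

```python
import numpy as np

# Hypothetical 1-D sanity check (not part of the protocol itself):
# f(x, z) = (x - z)^2 with D(x; eps) = N(mu + eps * x, 1), mu != 0 assumed.
# The stable point solves x = E_{z ~ D(x)}[z] = mu + eps * x.

def stable_point(mu, eps):
    # Closed-form performatively stable solution for the mean-shift model
    return mu / (1.0 - eps)

mu = 0.5
eps_grid = np.array([0.01, 0.05, 0.1, 0.2])
gaps = np.abs(stable_point(mu, eps_grid) - stable_point(mu, 0.0))
# gap = |mu| * eps / (1 - eps): linear in eps to first order,
# so the log-log slope should be close to 1 (O(eps) rate)
slope, _ = np.polyfit(np.log(eps_grid), np.log(gaps), 1)
print(round(slope, 2))
```

The slope slightly exceeds 1 because of the 1/(1 − ε) factor, which is exactly the kind of higher-order effect the ε_max restriction in the hypothesis is meant to absorb.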
- HARD DISPROOF: Empirical measurement of ||x*(ε) - x*(0)|| > C·ε for some constant C > L + δ (δ = 0.01) across ≥ 3 independent problem instances, with statistical significance p < 0.01.
- HARD DISPROOF: Existence of a single problem instance where ||x*(ε) - x*(0)||/ε diverges as ε → 0 (i.e., the gap dominates ε; confirmed via log-log regression slope < 0.9 with R² > 0.95).
- HARD DISPROOF: Convergence rate empirically measured as O(ε^α) with α < 0.9 (sub-linear in ε) across multiple problem classes.
- SOFT DISPROOF: The Lipschitz bound L is not computable or is infinite for standard problem classes (e.g., Gaussian mixture distribution maps), making the bound vacuous.
- SOFT DISPROOF: Convergence holds only for a measure-zero set of distribution maps, indicating the result is not generic.
- PARTIAL DISPROOF: Convergence rate is correct (O(ε)) but the constant L is not tight — empirical constant exceeds L by factor > 10, making the bound practically useless.
Experimental Protocol
PHASE 1 — Synthetic Validation (Days 1–15): Construct 5 canonical problem classes with analytically known L and x*(0):
(A) Linear-quadratic: f(x,z) = (x-z)², D(x;ε) = N(εx, 1)
(B) Portfolio optimization: f = -μᵀx + λxᵀΣx, D shifts mean by εAx
(C) Newsvendor: f = c·max(d-x,0) + h·max(x-d,0), D(x;ε) shifts demand mean by εx
(D) Logistic regression loss with performative labels
(E) Convex quadratic with affine distribution shift
For each class, compute x*(ε) numerically for ε ∈ {0, 0.01, 0.05, 0.1, 0.2, 0.5} using repeated stochastic gradient descent to performative stability. Measure ||x*(ε) - x*(0)|| and fit log-log regression to estimate convergence rate α and constant C.
PHASE 2 — Lipschitz Bound Tightness (Days 16–25): Compute L analytically for each problem class. Compare empirical constant C to theoretical L. Assess tightness ratio C/L. Test whether C ≤ L holds universally.
PHASE 3 — Stress Testing (Days 26–35): Test boundary conditions: non-convex f, non-Lipschitz D, near-degenerate ε_max. Document failure modes.
- SYNTHETIC — Linear-quadratic performative optimization instances: Generated programmatically; d ∈ {10, 100, 1000} dimensions; 500 instances per dimension; no external source required.
- SEMI-SYNTHETIC — Portfolio optimization: 50-asset covariance matrices estimated from historical S&P 500 data (2010–2023), with synthetic performative mean shifts; source: Yahoo Finance API or CRSP; N=252 scenarios per year.
- SYNTHETIC — Newsvendor demand distributions: Exponential, Gamma, and log-normal with ε-shifted means; 1000 Monte Carlo samples per ε value.
- BENCHMARK — Performative prediction datasets from Perdomo et al. (2020) replication package (credit scoring with strategic agents); publicly available at github.com/google-research/performative-prediction.
- VALIDATION — Stochastic programming benchmark library (SIPLIB): 20 standard SP instances adapted with artificial distribution maps; available at www.isye.gatech.edu/~sahmed/siplib/.
- COMPUTATIONAL — Scenario trees with N ∈ {10, 50, 100, 500, 1000} scenarios for convergence-in-N analysis.
- PRIMARY: Log-log regression slope α ∈ [0.9, 1.1] for ≥ 4 of 5 problem classes (α = 1 not rejected at the 0.05 level).
- PRIMARY: Empirical constant C ≤ 1.5·L for ≥ 3 of 5 problem classes (bound is tight within factor 1.5).
- PRIMARY: R² ≥ 0.95 for log-log regression in ≥ 4 of 5 problem classes (confirming power-law relationship).
- SECONDARY: Convergence holds for all ε ∈ [0, 0.8·ε_max] without numerical instability.
- SECONDARY: Variance across 20 seeds is < 10% of mean ||x*(ε)-x*(0)|| for all ε ≥ 0.01.
- SECONDARY: Results replicate for d ∈ {10, 100, 1000} with dimension-independent convergence rate α.
- TERTIARY: Bound remains valid (C ≤ 10·L) even for non-convex f in ≥ 2 of 3 stress-test instances.
- HARD FAILURE: α < 0.8 or α > 1.3 for ≥ 3 of 5 problem classes (convergence rate is not O(ε)).
- HARD FAILURE: C > 10·L for any problem class where L is finite and computable (bound is vacuously loose).
- HARD FAILURE: R² < 0.80 for ≥ 3 of 5 problem classes (no consistent power-law relationship).
- HARD FAILURE: Performative stable point iteration fails to converge for ε < ε_max in ≥ 2 problem classes (existence assumption violated).
- SOFT FAILURE: α ∈ [0.8, 0.9] for ≥ 3 classes (sub-linear but not catastrophic — hypothesis partially supported).
- SOFT FAILURE: L is not computable (infinite or undefined) for ≥ 2 problem classes, making the bound non-operational.
- SOFT FAILURE: Convergence rate α varies by > 0.3 across problem classes, suggesting result is problem-class-specific rather than general.
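The PRIMARY slope criterion can be operationalized with a standard t-test on the regression slope. The sketch below is illustrative only: the gap values are synthetic placeholders (assumed, not measured), and the test uses the slope standard error reported by scipy.stats.linregress:

```python
import numpy as np
from scipy import stats

# Illustrative check of "alpha = 1 not rejected at the 0.05 level".
# The gap values are synthetic placeholders, not experimental results.
eps_grid = np.array([0.01, 0.05, 0.1, 0.2, 0.5])
gaps = 0.8 * eps_grid * (1 + 0.02 * np.random.default_rng(0).standard_normal(5))

fit = stats.linregress(np.log(eps_grid), np.log(gaps))
# Two-sided t-test of H0: slope = 1, with n - 2 degrees of freedom
t_stat = (fit.slope - 1.0) / fit.stderr
p_value = 2 * stats.t.sf(abs(t_stat), df=len(eps_grid) - 2)
print(f"alpha={fit.slope:.3f}, p(alpha=1)={p_value:.3f}")
```

With real measurements the gaps would come from the Phase 1 runs, and the same test would be applied per problem class and dimension.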
GPU_HOURS: 12
CPU_HOURS: 480
MEMORY_GB: 32
COST_USD_MIN: 150
COST_USD_FULL: 1200
GPU hours: 100
Time to result: 30d
Min cost: $1,000
Full cost: $10,000
ROI Projection
Implementation Sketch
# Performative Scenario Optimization Convergence Validator
# Architecture: 3-module pipeline
# NOTE: projected_gradient_descent, grad_f, project_to_feasible,
# wasserstein_1_distance, ConvergenceError, log_warning, log_critical,
# and generate_report are assumed to be supplied elsewhere in the package.
import numpy as np

# MODULE 1: Problem Class Factory
class PerformativeProblem:
    def __init__(self, problem_type, dim, epsilon, L_true=None):
        self.type = problem_type  # 'LQ', 'portfolio', 'newsvendor', 'logistic', 'quadratic'
        self.dim = dim
        self.epsilon = epsilon
        self.L_true = L_true  # analytical Lipschitz modulus if known

    def distribution_map(self, x, epsilon):
        # Returns distribution D(x; epsilon) as (mean, cov) or scenario weights
        if self.type == 'LQ':
            # D(x; eps) = N(eps * x, I)
            mean = epsilon * x
            cov = np.eye(self.dim)
            return mean, cov
        elif self.type == 'portfolio':
            # D shifts return mean by epsilon * A @ x
            mean = self.mu_base + epsilon * self.A @ x
            return mean, self.Sigma
        # ... other types

    def objective(self, x, z):
        # f(x, z) — must be convex in x
        if self.type == 'LQ':
            return np.sum((x - z)**2)
        # ... other types

# MODULE 2: Stable Point Solver
def find_stable_point(problem, epsilon, n_scenarios=500, max_iter=10000, tol=1e-6, eta=0.01):
    """Repeated gradient descent to performative stability."""
    x = np.zeros(problem.dim)  # initialization
    for outer_iter in range(max_iter):
        x_prev = x.copy()
        # Sample scenarios from D(x_prev; epsilon)
        mean, cov = problem.distribution_map(x_prev, epsilon)
        scenarios = np.random.multivariate_normal(mean, cov, n_scenarios)
        # Inner loop: minimize E_{z~D(x_prev)}[f(x, z)]
        x = projected_gradient_descent(
            grad_fn=lambda x: np.mean([grad_f(x, z) for z in scenarios], axis=0),
            x_init=x_prev,
            projection=lambda x: project_to_feasible(x, problem),
            eta=eta,
            tol=tol / 10
        )
        # Check stable point convergence
        if np.linalg.norm(x - x_prev) < tol:
            return x, outer_iter
    raise ConvergenceError(f"No stable point found after {max_iter} iterations")

def find_classical_sp_solution(problem, n_scenarios=500, tol=1e-6):
    """Classical SP: epsilon=0, fixed distribution."""
    return find_stable_point(problem, epsilon=0.0, n_scenarios=n_scenarios, tol=tol)

# MODULE 3: Convergence Rate Estimator
def estimate_convergence_rate(problem_class, dim, epsilon_grid, n_seeds=20, n_scenarios=500):
    """
    Returns: alpha (convergence rate), C (empirical constant), L_empirical, R_squared
    """
    results = {eps: [] for eps in epsilon_grid}
    # Get classical SP solution (ground truth)
    x_star_0, _ = find_classical_sp_solution(
        PerformativeProblem(problem_class, dim, 0.0), n_scenarios
    )
    for eps in epsilon_grid:
        for seed in range(n_seeds):
            np.random.seed(seed)
            prob = PerformativeProblem(problem_class, dim, eps)
            x_star_eps, _ = find_stable_point(prob, eps, n_scenarios)
            dist = np.linalg.norm(x_star_eps - x_star_0)
            results[eps].append(dist)
    # Aggregate
    means = [np.mean(results[eps]) for eps in epsilon_grid]
    # Log-log regression (exclude eps=0)
    log_eps = np.log(epsilon_grid[1:])
    log_dist = np.log(means[1:])
    alpha, log_C = np.polyfit(log_eps, log_dist, 1)
    C = np.exp(log_C)
    # R-squared
    log_dist_pred = alpha * log_eps + log_C
    ss_res = np.sum((log_dist - log_dist_pred)**2)
    ss_tot = np.sum((log_dist - np.mean(log_dist))**2)
    R2 = 1 - ss_res / ss_tot
    # Empirical Lipschitz modulus
    L_empirical = estimate_lipschitz_modulus(
        PerformativeProblem(problem_class, dim, 1.0), n_pairs=1000
    )
    return {
        'alpha': alpha, 'C': C, 'L_empirical': L_empirical, 'R2': R2,
        'tightness_ratio': C / L_empirical,
        'means': means, 'epsilon_grid': epsilon_grid
    }

def estimate_lipschitz_modulus(problem, n_pairs=1000):
    """Estimate L = sup W1(D(x;1), D(x';1)) / ||x - x'||"""
    ratios = []
    for _ in range(n_pairs):
        x = np.random.randn(problem.dim)
        x_prime = np.random.randn(problem.dim)
        w1 = wasserstein_1_distance(
            problem.distribution_map(x, 1.0),
            problem.distribution_map(x_prime, 1.0)
        )
        ratios.append(w1 / (np.linalg.norm(x - x_prime) + 1e-10))
    return np.max(ratios)

# MAIN EXPERIMENT RUNNER
epsilon_grid = [0.0, 0.01, 0.05, 0.1, 0.2, 0.5]
problem_classes = ['LQ', 'portfolio', 'newsvendor', 'logistic', 'quadratic']
dimensions = [10, 100, 1000]
all_results = {}
for pc in problem_classes:
    for dim in dimensions:
        key = f"{pc}_d{dim}"
        all_results[key] = estimate_convergence_rate(
            pc, dim, epsilon_grid, n_seeds=20, n_scenarios=500
        )
        # Early abort check
        if all_results[key]['R2'] < 0.70:
            log_warning(f"Low R2={all_results[key]['R2']} for {key} — potential failure")
        if all_results[key]['alpha'] < 0.7 or all_results[key]['alpha'] > 1.5:
            log_critical(f"Alpha={all_results[key]['alpha']} out of range for {key}")

# Generate summary table and convergence plots
generate_report(all_results, output_dir='./results/')
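The sketch calls wasserstein_1_distance without defining it. For the Phase 1 distribution maps, which only shift the mean of a fixed-covariance Gaussian, D(x) and D(x') are translates of one another, and every Wasserstein distance between translates equals the norm of the translation vector; no optimal-transport solver is needed. A minimal helper under that assumption (the (mean, cov) interface matches distribution_map above):

```python
import numpy as np

def wasserstein_1_distance(dist_a, dist_b):
    """W1 between two Gaussians that differ only by a mean shift.

    Each argument is a (mean, cov) pair as returned by distribution_map.
    For translates of a common distribution, W_p equals the norm of the
    translation vector for every p. Covariances are assumed equal; this
    helper does not handle the general case.
    """
    (mean_a, cov_a), (mean_b, cov_b) = dist_a, dist_b
    if not np.allclose(cov_a, cov_b):
        raise ValueError("closed form requires equal covariances")
    return float(np.linalg.norm(mean_a - mean_b))

# Example: LQ map at eps = 1 gives D(x) = N(x, I), so W1(D(x), D(x')) = ||x - x'||
x, x_prime = np.array([1.0, 0.0]), np.array([0.0, 0.0])
print(wasserstein_1_distance((x, np.eye(2)), (x_prime, np.eye(2))))  # 1.0
```

For the Phase 3 stress tests, where covariances may genuinely differ, a closed form exists for W2 between Gaussians but not for W1, so a numerical optimal-transport solver would be needed there instead.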
- DAY 3 CHECKPOINT: If performative stable point iteration fails to converge for ε = 0.1 in the LQ problem class (simplest case), abort and debug solver before proceeding. Criterion: convergence tolerance not reached within 10,000 iterations for ≥ 50% of seeds.
- DAY 7 CHECKPOINT: If R² < 0.70 for log-log regression on LQ problem class (d=10), abort Phase 1 and investigate whether power-law relationship exists at all. This is the easiest case; failure here indicates fundamental issue.
- DAY 12 CHECKPOINT: If empirical Lipschitz modulus L cannot be estimated (variance > 100% of mean across 1000 random pairs) for ≥ 2 problem classes, abort bound-tightness analysis — the bound is not computable.
- DAY 18 CHECKPOINT: If α < 0.7 or α > 1.5 for ≥ 3 of 5 problem classes in Phase 1, abort Phase 2 and Phase 3 — core hypothesis is likely false; redirect to characterizing actual convergence rate.
- DAY 25 CHECKPOINT: If tightness ratio C/L > 100 for all problem classes, abort commercial value assessment — bound is theoretically correct but practically useless; redirect to tighter bound derivation.
- DAY 30 CHECKPOINT: If stress testing reveals that convergence fails (α < 0.5) for all non-convex f instances, scope the hypothesis to convex f only and update hypothesis restatement before finalizing report.