
After removal of all deterministic components via high-fidelity pulsar timing models, residual pulsar timing noise constitutes a physically irreducible stochastic process, providing a universally accessible, publicly verifiable source of certified true randomness for cryptographic applications that requires only a radio telescope and an open pulsar ephemeris.

Computer Science · Apr 14, 2026 · Evaluation Score: 40%

Adversarial Debate Score

40% survival rate under critique

Model Critiques

openai: The hypothesis is falsifiable in principle, but it is not supported by the provided papers (which are unrelated to pulsar timing or randomness extraction), and ignores known sources of quasi-deterministic or systematic effects in pulsar timing residuals; counterarguments include instrumental nois...
grok: The hypothesis is falsifiable as it can be tested with pulsar timing data, but it lacks direct support from the provided papers, which focus on unrelated topics like quantum computing and biological systems, not pulsar timing or randomness for cryptography. Additionally, counterarguments exist re...
mistral: The hypothesis is falsifiable and conceptually intriguing, but its reliance on "physically irreducible" noise is debatable (e.g., unmodeled astrophysical or instrumental effects), and the cited papers offer no direct support.

Supporting Research Papers

Formal Verification

Z3 logical consistency: ✅ Consistent

Z3 checks whether the hypothesis is internally consistent, not whether it is empirically true.
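
As a rough illustration of what such a consistency check can look like, the numeric thresholds in the success and failure criteria below can be encoded as constraints and tested for satisfiability. This is a minimal sketch using the z3-solver Python package, not the platform's actual verification harness:

# Minimal sketch: do the success bound (H_min >= 0.95) and the hard
# failure bound (H_min < 0.85) overlap? unsat means they are disjoint.
from z3 import Real, Solver, sat

h_min = Real('h_min')
s = Solver()
s.add(h_min >= 0.0, h_min <= 1.0)   # H_min is a per-bit rate
s.add(h_min >= 0.95, h_min < 0.85)  # both bounds at once
print("overlap" if s.check() == sat else "mutually exclusive")  # -> mutually exclusive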

Experimental Validation Package

This discovery has a Claude-generated validation package with a full experimental design.

Precise Hypothesis

Residual timing noise in millisecond pulsars (MSPs), after subtraction of all deterministic signal components (spin-down, proper motion, parallax, Shapiro delay, dispersion measure variations, binary orbital parameters) using state-of-the-art timing models (TEMPO2/PINT), constitutes a statistically irreducible stochastic process whose entropy is sourced from quantum-mechanical and astrophysical processes (turbulent interstellar medium, magnetospheric plasma fluctuations, superfluid vortex dynamics) that are computationally infeasible to predict or reproduce. Specifically: (1) residuals from ≥5 MSPs pass all NIST SP 800-22 and TestU01 randomness test suites at p>0.01 after whitening; (2) no adversary with access to all public ephemerides can predict future residuals better than chance (Shannon entropy H ≥ 0.99 bits/bit); (3) the system can deliver ≥1 bit/s of certified randomness per pulsar observable with a standard radio telescope (dish ≥25m diameter, receiver sensitivity SEFD ≤2000 Jy).
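
For concreteness, the per-bit entropy quantities referenced above can be estimated empirically from a bit sequence. A minimal sketch follows; the full protocol would rely on the NIST SP 800-90B estimators rather than this naive plug-in estimate:

# Naive plug-in estimate of Shannon entropy H and min-entropy H_min
# (bits/bit) for a binary sequence; illustrative only.
import numpy as np

def per_bit_entropies(bits: np.ndarray) -> tuple:
    p1 = bits.mean()                  # empirical P(bit = 1)
    p = np.array([1.0 - p1, p1])
    p = p[p > 0]                      # avoid log2(0)
    shannon = float(-(p * np.log2(p)).sum())
    h_min = float(-np.log2(p.max()))
    return shannon, h_min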

Disproof criteria:
  1. PREDICTABILITY FAILURE: Any algorithm achieves >55% accuracy predicting the sign of the next timing residual using only public ephemeris data and prior residuals (p < 0.001, binomial test, N ≥ 1000 predictions; see the sketch after this list). This would indicate residual deterministic structure.
  2. STATISTICAL FAILURE: Residuals from ≥3 of 5 target MSPs fail ≥5 of 15 NIST SP 800-22 tests at α = 0.01 after standard whitening procedures.
  3. ENTROPY COLLAPSE: Measured min-entropy H_min < 0.90 bits/bit for any 1000-bit block extracted from whitened residuals.
  4. REPRODUCIBILITY FAILURE: Two independent observatories observing the same pulsar simultaneously produce residual sequences with cross-correlation |r| > 0.3 after noise subtraction (indicating shared deterministic artifact rather than independent stochastic source).
  5. MODEL ABSORPTION: A neural network trained on 5 years of residuals achieves out-of-sample prediction R² > 0.15 on held-out 6-month window, indicating learnable structure remains.
  6. PHYSICAL MECHANISM IDENTIFIED: A peer-reviewed model successfully predicts >60% of residual variance from first principles (e.g., complete ISM turbulence model), eliminating the irreducibility claim.
  7. TIMING NOISE CORRELATION: Residuals from pulsars widely separated on the sky (angular separation >10°) show statistically significant cross-correlation at zero lag (|r| > 0.2, p < 0.001), suggesting a common systematic rather than independent stochastic sources.
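
A minimal version of the predictability test in criterion 1, assuming scipy is available: a candidate predictor's hit count over N sign predictions is compared against chance with a one-sided binomial test.

# Criterion 1 check: is sign-prediction accuracy both >55% and
# significantly above chance (p < 0.001, one-sided)?
from scipy.stats import binomtest

def predictability_failure(n_correct: int, n_total: int) -> bool:
    result = binomtest(n_correct, n_total, p=0.5, alternative='greater')
    return (n_correct / n_total) > 0.55 and result.pvalue < 1e-3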

Experimental Protocol

Phase 1 — Data Acquisition (Days 1–90): Obtain archival ToA datasets from NANOGrav 15-year data release, EPTA DR2, and PPTA DR3 for 10 MSPs. Select 5 MSPs with best timing precision (RMS < 500 ns): PSR J0437−4715, PSR J1909−3744, PSR J1713+0747, PSR J0030+0451, PSR J1744−1134.

Phase 2 — Deterministic Subtraction (Days 30–120): Apply TEMPO2 and PINT timing models with full parameter sets. Perform F-test to confirm no additional deterministic parameters improve fit at p < 0.01. Compute whitened residuals using Cholesky decomposition of the noise covariance matrix.

Phase 3 — Randomness Characterization (Days 90–180): Convert whitened residuals to bit strings via von Neumann extractor and hash-based extractor (SHA-3). Apply NIST SP 800-22 (15 tests), TestU01 BigCrush (106 tests), and Dieharder (18 tests) suites. Compute min-entropy via compression-based estimator (NIST SP 800-90B).

Phase 4 — Adversarial Prediction Challenge (Days 120–210): Train LSTM, Transformer, and Gaussian Process models on 80% of residual data; test on held-out 20%. Compute prediction accuracy, R², and compare to null model (mean prediction).

Phase 5 — Multi-Observatory Cross-Validation (Days 180–270): Compare simultaneous residuals from NANOGrav and EPTA for overlapping pulsars. Compute cross-correlations and test for shared structure.
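
The zero-lag cross-correlation test in Phase 5 reduces to a Pearson correlation between time-aligned residual series. A minimal sketch, assuming scipy and pre-aligned arrays, with the soft-failure thresholds from the criteria below:

# Phase 5 check: zero-lag cross-correlation between two observatories'
# residual series for the same pulsar.
from scipy.stats import pearsonr

def shared_structure(res_a, res_b, r_threshold=0.2, alpha=0.01) -> bool:
    r, p = pearsonr(res_a, res_b)
    return abs(r) > r_threshold and p < alpha  # True = suspect shared systematic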

Phase 6 — Throughput Benchmarking (Days 240–300): Measure bits/second output rate for a 25m dish observing PSR J0437−4715 at 1.4 GHz. Estimate practical randomness generation rate.
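
As a back-of-envelope model of the Phase 6 benchmark (every input here is a placeholder assumption, not a measurement): the deliverable rate is roughly the ToA cadence times the bits quantized per residual, discounted by min-entropy and extractor efficiency.

# Rough throughput model for Phase 6; all defaults are assumptions.
def randomness_rate(toas_per_second: float,
                    bits_per_residual: int = 8,
                    h_min: float = 0.95,
                    extractor_efficiency: float = 0.5) -> float:
    """Approximate certified-random bits per second from one pulsar."""
    return toas_per_second * bits_per_residual * h_min * extractor_efficiency

# Example: one 8-bit residual every 10 s -> ~0.38 bits/s under these assumptions.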

Required datasets:
  1. NANOGrav 15-year Data Release (NG15): Public ToA files for 68 MSPs, noise model parameters, timing ephemerides. URL: data.nanograv.org. Size: ~2 GB.
  2. EPTA Data Release 2 (EPTA DR2): ToA files for 25 MSPs from 5 European telescopes. URL: epta.eu.org/dr2. Size: ~800 MB.
  3. PPTA Data Release 3 (PPTA DR3): ToA files for 30 MSPs from Parkes telescope. URL: doi.org/10.4225/08/534CC21379C12. Size: ~600 MB.
  4. IPTA Data Release 2: Combined international dataset for 65 MSPs. Size: ~3 GB.
  5. DE440 Solar System Ephemeris (JPL): Required for barycentric correction. Available via TEMPO2 package.
  6. TEMPO2 software package (v2.0+): Timing model fitting and residual computation. Source: bitbucket.org/psrsoft/tempo2.
  7. PINT software package (v0.9+): Independent timing model implementation for cross-validation. GitHub: github.com/nanograv/PINT.
  8. enterprise/enterprise_extensions: Bayesian noise modeling for MSPs. GitHub: github.com/nanograv/enterprise.
  9. NIST SP 800-22 test suite: Statistical randomness testing. URL: csrc.nist.gov/projects/random-bit-generation.
  10. TestU01 library (v1.2.3): BigCrush and SmallCrush test batteries. URL: simul.iro.umontreal.ca/testu01.
  11. Dieharder v3.31.1: Additional randomness tests. Available via package managers.
  12. IERS Earth Orientation Parameters: Required for timing model accuracy. URL: iers.org.
Success:
  1. RANDOMNESS QUALITY: ≥4 of 5 target MSPs pass all 15 NIST SP 800-22 tests at α = 0.01 (after Bonferroni correction) with ≥100 independent 10⁶-bit sequences each (see the proportion check after this list).
  2. ENTROPY BOUND: Min-entropy H_min ≥ 0.95 bits/bit for all 5 target MSPs (NIST SP 800-90B methodology).
  3. UNPREDICTABILITY: Best adversarial model achieves sign-prediction accuracy ≤ 52% (not significantly different from 50% at p > 0.05, binomial test, N ≥ 10,000 predictions) and R² ≤ 0.05 on held-out data.
  4. INDEPENDENCE: Cross-correlation between simultaneous residuals from two independent observatories |r| < 0.1 (p > 0.05) for all tested pulsar pairs.
  5. THROUGHPUT: Demonstrated randomness generation rate ≥ 0.1 bits/second per pulsar with a 25m dish, scalable to ≥1 bit/s with a 64m dish.
  6. STATISTICAL TESTS: ≥95% pass rate on TestU01 BigCrush (≥100 of 106 tests) for ≥3 of 5 MSPs.
  7. REPRODUCIBILITY: Independent reanalysis by a second team using the same public data produces results within 5% of reported entropy estimates.
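
NIST SP 800-22 judges each test across many sequences by the proportion that pass: the acceptable lower bound in the NIST guidance is p̂ - 3·sqrt(p̂(1 - p̂)/m) with p̂ = 1 - α over m sequences. A minimal sketch of that acceptance check, referenced from criterion 1 above:

# NIST SP 800-22 proportion-of-passes acceptance check (section 4.2.1
# of the NIST document); only the lower bound matters for a pass.
import math

def proportion_acceptable(n_pass: int, m_sequences: int,
                          alpha: float = 0.01) -> bool:
    p_hat = 1.0 - alpha
    lower = p_hat - 3.0 * math.sqrt(p_hat * (1.0 - p_hat) / m_sequences)
    return (n_pass / m_sequences) >= lower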
Failure:
  1. HARD FAILURE — PREDICTABILITY: Any adversarial model achieves sign-prediction accuracy > 55% (p < 0.001) on held-out data for any target MSP. Experiment terminates; hypothesis is falsified for that pulsar.
  2. HARD FAILURE — ENTROPY: H_min < 0.85 bits/bit for ≥3 of 5 MSPs. Indicates insufficient irreducibility for cryptographic use.
  3. HARD FAILURE — NIST FAILURE: ≥3 of 5 MSPs fail ≥5 NIST tests at α = 0.01. Indicates systematic non-randomness.
  4. SOFT FAILURE — THROUGHPUT: Demonstrated rate < 0.01 bits/second per pulsar with a 25m dish. Hypothesis may be physically valid but practically useless for cryptography.
  5. SOFT FAILURE — CORRELATION: Cross-correlation |r| > 0.2 (p < 0.01) between simultaneous residuals from two observatories. Suggests shared systematic artifact; requires investigation before cryptographic use.
  6. SOFT FAILURE — MODEL DEPENDENCE: Whitened residuals from TEMPO2 and PINT differ by > 20% in RMS, indicating model-dependent artifacts contaminate the stochastic signal.
  7. CONDITIONAL FAILURE — ADVERSARIAL LEARNING: R² > 0.10 on held-out data for ≥2 MSPs. Hypothesis requires revision to exclude those pulsars; remaining pulsars may still qualify.

GPU hours: 480

Time to result: 300 days

Min cost: $4,200

Full cost: $31,000

ROI Projection

Commercial:
  1. PRODUCT OPPORTUNITY — Pulsar Randomness API: A cloud service delivering certified pulsar-derived random bits at $0.001–$0.01 per 1000 bits. At 10⁹ bits/day throughput (multi-telescope array), revenue potential: $365K–$3.65M/year at modest adoption.
  2. PRODUCT OPPORTUNITY — Compliance Certification: Certifying pulsar TRNG under NIST SP 800-90B and FIPS 140-3 would enable sale to government and financial sector. Certification cost: ~$500K; addressable market: $50M/year in high-assurance randomness services.
  3. TELESCOPE NETWORK MONETIZATION: Existing radio telescope arrays (MeerKAT, FAST, SKA) could offer randomness-as-a-service as a secondary revenue stream with near-zero marginal cost (pulsars are already observed for science). FAST alone observes 200+ MSPs; potential secondary revenue: $1M–$10M/year.
  4. OPEN-SOURCE ECOSYSTEM: A reference implementation (pulsar-rng Python package) would attract academic and startup adoption, creating an ecosystem around astronomical randomness. Comparable projects (e.g., random.org) generate $500K+/year in API revenue.
  5. INSURANCE/AUDIT VALUE: Third-party auditable randomness for financial derivatives, lottery systems, and legal proceedings. Current market for certified randomness in gaming/lottery: $200M+/year globally.
  6. RESEARCH GRANTS: NSF, DARPA, and ESA funding for astronomical TRNG research estimated at $5M–$20M over 5 years if proof-of-concept succeeds.

🔓 If proven, this unlocks

Proving this hypothesis is a prerequisite for the following downstream discoveries and applications:

  • pulsar-TRNG-hardware-implementation-v1
  • distributed-pulsar-randomness-beacon-v1
  • quantum-gravity-noise-floor-v1
  • ISM-turbulence-entropy-quantification-v1
  • astronomical-randomness-beacon-protocol-v1
  • post-quantum-cryptography-seeding-v1

Prerequisites

These must be validated before this hypothesis can be confirmed:

  • PTA-noise-characterization-v1
  • ISM-turbulence-stochasticity-v2
  • NIST-SP800-90B-compliance-framework-v1
  • MSP-timing-model-completeness-v3

Implementation Sketch

# Pulsar Timing Residual TRNG — Implementation Architecture
# ============================================================
# NOTE: tempo2_reader, PINT_TimingModel, enterprise, PTMCMCSampler,
# red_noise_covariance, dm_noise_covariance, nist_tests, testu01,
# nist_90b, and LSTM are placeholder interfaces standing in for the
# real packages and models listed in the datasets section above.

import hashlib

import numpy as np
import scipy.stats
from scipy.signal import lombscargle
from scipy.stats import chi2

# LAYER 1: DATA INGESTION
class PulsarDataIngester:
    def __init__(self, pulsar_id: str, pta_source: str = "NANOGrav15"):
        self.pulsar_id = pulsar_id  # e.g., "J1713+0747"
        self.pta_source = pta_source
        
    def load_toa_file(self, path: str) -> np.ndarray:
        """Load Time-of-Arrival data from .tim file format"""
        # Returns: array of (MJD, ToA_uncertainty_us, telescope_code)
        return tempo2_reader.parse_tim(path)
    
    def load_ephemeris(self, path: str) -> dict:
        """Load .par timing model file"""
        # Returns: dict of {parameter_name: (value, uncertainty)}
        return tempo2_reader.parse_par(path)

# LAYER 2: DETERMINISTIC SUBTRACTION
class TimingModelSubtractor:
    def __init__(self, ephemeris: dict, solar_system_ephem: str = "DE440"):
        self.model = PINT_TimingModel(ephemeris, solar_system_ephem)
        
    def compute_residuals(self, toas: np.ndarray) -> np.ndarray:
        """
        Subtract full deterministic model from ToAs.
        Returns raw residuals in microseconds.
        """
        # Apply: barycentric correction, Shapiro delay, DM correction,
        # spin-down model, binary orbital model (if applicable)
        return self.model.residuals(toas)
    
    def f_test_model_completeness(self, toas: np.ndarray,
                                   residuals: np.ndarray,
                                   toa_errors: np.ndarray,
                                   candidate_params: list) -> dict:
        """
        Verify that no additional deterministic parameter improves the
        fit at p < 0.01. Implemented as a likelihood-ratio (delta-chi^2)
        test with 1 dof, which approximates the F-test for large N.
        Returns: {param: p_value} for all candidates
        """
        results = {}
        baseline_chi2 = np.sum(residuals**2 / toa_errors**2)
        for param in candidate_params:
            extended_model = self.model.add_parameter(param)
            new_residuals = extended_model.residuals(toas)
            new_chi2 = np.sum(new_residuals**2 / toa_errors**2)
            delta_chi2 = baseline_chi2 - new_chi2
            p_value = chi2.sf(delta_chi2, df=1)
            results[param] = p_value
        return results

# LAYER 3: NOISE CHARACTERIZATION (Bayesian)
class BayesianNoiseCharacterizer:
    def __init__(self, pulsar_id: str):
        self.pta = enterprise.PTA([pulsar_id])
        
    def fit_noise_model(self, residuals: np.ndarray, 
                        toas: np.ndarray) -> dict:
        """
        Fit white noise (EFAC, EQUAD) and red noise (power law)
        using the PTMCMC sampler (placeholder interface).
        Returns: posterior samples for all noise hyperparameters
        """
        sampler = PTMCMCSampler(self.pta, Niter=500000, 
                                 resume=False)
        chain = sampler.sample()
        return {
            'EFAC': chain['efac_posterior'],
            'EQUAD': chain['equad_posterior'],
            'red_noise_amplitude': chain['rn_amp_posterior'],
            'red_noise_spectral_index': chain['rn_gamma_posterior'],
            'DM_noise_amplitude': chain['dm_amp_posterior']
        }
    
    def compute_noise_covariance(self, posteriors: dict, 
                                  toas: np.ndarray,
                                  toa_errors: np.ndarray) -> np.ndarray:
        """
        Construct full noise covariance matrix C from posterior medians.
        Shape: (N_toa, N_toa)
        """
        C_white = np.diag((posteriors['EFAC'] * toa_errors)**2 + 
                           posteriors['EQUAD']**2)
        C_red = red_noise_covariance(toas, 
                                      posteriors['red_noise_amplitude'],
                                      posteriors['red_noise_spectral_index'])
        C_dm = dm_noise_covariance(toas, posteriors['DM_noise_amplitude'])
        return C_white + C_red + C_dm

# LAYER 4: WHITENING
class ResidualWhitener:
    def whiten(self, residuals: np.ndarray, 
               C: np.ndarray) -> np.ndarray:
        """
        Cholesky whitening: r_w = L^{-1} r where C = L L^T
        Returns: whitened residuals (should be iid N(0,1))
        """
        L = np.linalg.cholesky(C)
        r_whitened = np.linalg.solve(L, residuals)
        return r_whitened
    
    def verify_whitening(self, r_w: np.ndarray, 
                         toas: np.ndarray) -> dict:
        """
        Verify whitened residuals have a flat PSD and unit variance.
        """
        freqs = np.linspace(0.01, 10.0, 1000)  # angular frequencies to scan
        psd_power = lombscargle(toas, r_w, freqs)
        max_peak_sigma = ((np.max(psd_power) - np.mean(psd_power))
                          / np.std(psd_power))
        return {
            'variance': np.var(r_w),  # Should be ~1.0
            'max_psd_peak_sigma': max_peak_sigma,  # Should be < 3.0
            'normality_p': scipy.stats.normaltest(r_w).pvalue  # Should be > 0.05
        }

# LAYER 5: RANDOMNESS EXTRACTION
class RandomnessExtractor:
    def von_neumann_extract(self, bits: np.ndarray) -> np.ndarray:
        """
        Von Neumann extractor on consecutive bit pairs.
        (0,1) -> 0; (1,0) -> 1; (0,0),(1,1) -> discard
        Yield: ~25% (one output bit per four input bits for a fair coin)
        """
        output = []
        for i in range(0, len(bits)-1, 2):
            if bits[i] == 0 and bits[i+1] == 1:
                output.append(0)
            elif bits[i] == 1 and bits[i+1] == 0:
                output.append(1)
        return np.array(output)
    
    def hash_extract(self, residuals: np.ndarray, 
                     block_size: int = 256) -> bytes:
        """
        SHA3-256 hash extractor for stronger entropy concentration.
        Input: block_size quantized residual values (8 bits each)
        Output: 256 bits of extracted randomness per block
        """
        output_bytes = bytearray()
        for i in range(0, len(residuals) - block_size + 1, block_size):
            block = residuals[i:i+block_size]
            # Clip to the int8 range so large whitened residuals
            # (|r| > 1) do not overflow during quantization.
            quantized = np.clip(block * 127, -128, 127).astype(np.int8).tobytes()
            output_bytes.extend(hashlib.sha3_256(quantized).digest())
        return bytes(output_bytes)

# LAYER 6: STATISTICAL TESTING
class RandomnessTestSuite:
    def run_nist_sp800_22(self, bit_sequence: bytes) -> dict:
        """Run all 15 NIST SP 800-22 tests. Returns p-values."""
        return nist_tests.run_all(bit_sequence, n_bits=10**6)
    
    def run_testu01_bigcrush(self, bit_sequence: bytes) -> dict:
        """Run TestU01 BigCrush (106 tests). Returns pass/fail."""
        return testu01.bigcrush(bit_sequence)
    
    def compute_min_entropy(self, bit_sequence: bytes) -> float:
        """
        NIST SP 800-90B min-entropy estimation.
        Returns: H_min in bits/bit (range 0-1)
        """
        estimators = [
            nist_90b.compression_estimator(bit_sequence),
            nist_90b.collision_estimator(bit_sequence),
            nist_90b.markov_estimator(bit_sequence),
            nist_90b.most_common_value_estimator(bit_sequence)
        ]
        return min(estimators)  # Conservative: take minimum

# LAYER 7: ADVERSARIAL PREDICTION CHALLENGE
class AdversarialPredictor:
    def train_lstm(self, residuals: np.ndarray, 
                   train_frac: float = 0.8) -> dict:
        """3-layer LSTM, 256 hidden units, trained on 80% of data."""
        model = LSTM(layers=3, hidden=256, dropout=0.2)
        train_size = int(len(residuals) * train_frac)
        model.fit(residuals[:train_size], epochs=100, 
                  batch_size=32, patience=10)
        predictions = model.predict(residuals[train_size:])
        return self._evaluate(residuals[train_size:], predictions)
    
    def _evaluate(self, true: np.ndarray, 
                  pred: np.ndarray) -> dict:
        sign_accuracy = np.mean(np.sign(true) == np.sign(pred))
        r2 = 1 - np.sum((true-pred)**2) / np.sum((true-np.mean(true))**2)
        return {'sign_accuracy': sign_accuracy, 'r2': r2,
                'mse': np.mean((true-pred)**2)}

# LAYER 8: MAIN PIPELINE
def run_pulsar_trng_validation(pulsar_ids: list) -> dict:
    results = {}
    for pid in pulsar_ids:
        ingester = PulsarDataIngester(pid)
        toas = ingester.load_toa_file(f"data/{pid}.tim")
        ephem = ingester.load_ephemeris(f"data/{pid}.par")
        
        subtractor = TimingModelSubtractor(ephem)
        residuals = subtractor.compute_residuals(toas)
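        # ---- Reconstructed continuation: the original sketch truncates
        # above. The wiring below is an assumption that simply chains the
        # layers defined earlier in the order described by the protocol.
        toa_errors = toas[:, 1]  # uncertainty column per load_toa_file

        characterizer = BayesianNoiseCharacterizer(pid)
        posteriors = characterizer.fit_noise_model(residuals, toas)
        C = characterizer.compute_noise_covariance(posteriors, toas, toa_errors)

        whitener = ResidualWhitener()
        r_w = whitener.whiten(residuals, C)

        extractor = RandomnessExtractor()
        random_bytes = extractor.hash_extract(r_w)

        tester = RandomnessTestSuite()
        results[pid] = {
            'whitening': whitener.verify_whitening(r_w, toas),
            'nist': tester.run_nist_sp800_22(random_bytes),
            'h_min': tester.compute_min_entropy(random_bytes),
            'adversarial': AdversarialPredictor().train_lstm(r_w),
        }
    return results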

Source

AegisMind Research