License: CC BY 4.0
arXiv:2606.31991v1 [cs.LG] 30 Jun 2026

Amplifying Membership Signal
Through Chained Regeneration

Wojciech Łapacz*
Warsaw University of Technology
Stanisław Pawlak*
Warsaw University of Technology
* Equal contribution. Contact: wojciech.lapacz02@gmail.com
Abstract

The tendency of large generative models to memorize training data makes sample verification critical for privacy auditing and copyright enforcement. Current membership (MIA) and dataset inference (DI) attacks often rely on one-shot generations, which yield weak signals and limited sensitivity across modalities. Inspired by Model Autophagy Disorder (MAD), we introduce MADreMIA, a model-agnostic framework that enhances white-, gray-, and black-box MIA and DI. Rather than relying on shadow model training – often infeasible for large generative models – our framework facilitates scalable inference by leveraging inherent signals through iterative trajectories. This process utilizes chained generations across diverse modalities, where each output serves as the subsequent input, to improve membership evidence at low FPR. We demonstrate that memorized training samples exhibit significantly higher coherence and slower degradation during iterative regeneration than non-member generations. Our results show that MADreMIA provides richer signals across diverse model families and modalities; we present comprehensive evaluations for IARs, diffusion, and language models, alongside preliminary results demonstrating its potential for audio models.

1 Introduction

The rapid development of generative AI has triggered a pressing demand for training data, frequently leading to the unauthorized ingestion of private, sensitive, or copyrighted content. Consequently, as generative models scale, membership inference attacks (MIAs) [28] and dataset inference (DI) [24] have become critical. Practical auditing – ranging from protecting medical privacy [49] to identifying licensed content [12] or detecting benchmark contamination [22, 30, 44] – requires determining whether specific samples or datasets were used to shape a model’s parameters. The definitive test is whether a model retains a structural “echo” of its training data, manifesting as a high-fidelity memorization signal that can be surfaced through targeted inference. Existing auditing methods, however, face a significant bottleneck. Most extract evidence from a single query [46, 39] or a set of loosely coupled samples [8]. These one-shot signals are often fragile; recent evaluations on unbiased benchmarks show that many MIAs degrade significantly under distributional shifts, often performing only slightly better than random guessing [22, 12]. Furthermore, high-performance “shadow model” attacks [41, 5] – which require training multiple auxiliary models to simulate the target – are computationally expensive and impractical for real-world large-scale generative architectures.

Figure 1: Comparison between a conventional one-shot membership inference attack and our chained-generation approach. The former uses a single query, which yields a weak signal that often fails to separate members from non-members. In the latter, each generation informs the next query, progressively amplifying membership evidence and improving separability: re-members (✓) are more coherent and degrade more slowly than re-non-members (×).

To address these limitations, we shift the perspective from a single static query to a dynamic trajectory. This concept is best illustrated through a forensic parallel: in a criminal interrogation, a suspect may maintain a lie for a single response, but that lie often collapses under the pressure of repeated, recursive follow-up questions. A truthful narrative, by contrast, remains coherent because it is grounded in a fixed reality. We argue that generative models exhibit a similar phenomenon – their “truth” is the training set. While a model can produce a plausible-looking output for a non-member sample once, it may struggle to sustain that plausibility over a recursive chain of self-generated inputs.

Our framework, MADreMIA, is inspired by the mechanics of Model Autophagy Disorder (MAD) [1, 29]. Traditionally, MAD describes a failure mode where models trained on their own synthetic outputs progressively lose variance and collapse into a state of degenerated “madness”. We pivot this phenomenon into a diagnostic, inference-time tool: if a sample was present during training, it acts as a stable “attractor” in the model’s latent space. By repeatedly feeding a model’s outputs back into itself – creating an iterative regeneration chain – we can amplify the signal of memorization.

Within this framework, we distinguish between two types of trajectories:

  • Re-members: These are member samples (training data) that are iteratively re-generated. Because the model has “memorized” these points, they exhibit high stability and slow semantic degradation over time.

  • Re-non-members: These are unseen samples that are iteratively re-generated. Lacking a structural anchor in the model’s weights, these samples drift rapidly toward the model’s average biases or dissolve into noise (see Figure 1).

MADreMIA functions as a modular, inference-time add-on that is intentionally method-, model-, and modality-agnostic. By measuring consistency across recursive loops, we provide richer signals across diverse architectures, including image autoregressive models (IARs), diffusion models (DMs), large language models (LLMs), and audio voice conversion models. We demonstrate that while a single output is often too noisy to be decisive, the trajectory of a “re-member” differs from that of a “re-non-member” and thus acts as a powerful signal amplifier, surfacing traces of training data that are otherwise invisible.

MADreMIA’s iterative procedure moves beyond one-shot plausibility by probing whether the model preserves semantic and structural consistency under repeated self-interaction. Consequently, this work investigates a central research question: Can the dynamics of recursive self-generation serve as a signal amplifier to expose training data membership?

In summary, the main contributions of our paper are:

  • We introduce an iterative regeneration setup to uncover data memorization invisible during single-pass inference.

  • We show theoretically and empirically that trajectory features (generation dynamics over time) yield a significantly more statistically robust membership signal. By functioning as a variance reduction mechanism, these features isolate the underlying membership information much more effectively than standard one-shot baselines.

  • We propose an inference-time, cross-modal framework that improves Membership and Dataset Inference efficiency across Vision and Language models without the need for expensive shadow model training.

2 Related Works

Memorization.

Memorization in generative models — the tendency to reproduce training examples rather than generate novel samples — has been studied across multiple model families. Early work formalized the distinction between memorization, mode collapse, and overfitting [35], while subsequent studies characterized the generalization-to-memorization transition in diffusion models [15], localized it through attention patterns [26], and showed that standard evaluation metrics fail to surface it [2]. Mitigation strategies have been proposed for both LLMs [16] and text-to-image models [7].

Membership and Dataset Inference.

Individual Membership Inference Attacks (MIAs) can be confounded by distribution shifts [22], prompting a shift toward Dataset Inference, which aggregates evidence across many samples [23, 12, 19]. Shadow-model approaches [41, 5] are now computationally infeasible for large architectures, so modern attacks extract signals from limited black-box outputs [46, 6, 32]. Most relevant to our work, Li et al. [20] perform MIAs on diffusion models by repeatedly perturbing a target image and comparing averaged outputs to the original. However, since these queries are independent and do not evolve with model responses, deeper structural memorization remains unexploited.

Model Collapse.

Recursive self-training in generative models leads to progressive quality and diversity degradation when insufficient real data is injected — a phenomenon termed Model Autophagy Disorder [1]. Training on model-generated data further causes tails of the original distribution to disappear [29]. Together, these works suggest that iterative generation is structurally revealing: memorized regions may persist differently from non-member examples under repeated reuse. Our method turns this insight into a privacy-auditing mechanism, exploiting chained regeneration at inference time to amplify membership-relevant differences rather than treating collapse as a training-time pathology. The extended related works section can be found in Appendix E.

3 Theory of Trajectory-Based Signal Amplification

For each sample, we define an iterative trajectory $Z_{0},Z_{1},\dots,Z_{T}$, where $Z_{0}$ is the observed sample and $Z_{t+1}$ is produced by one regeneration step. Let $M\in\{0,1\}$ denote membership. Define a per-step score $\phi_{t}:=\phi(Z_{t},Z_{t+1})$ and the average $S_{T}:=\frac{1}{T}\sum_{t=0}^{T-1}\phi_{t}$. The attack predicts $M$ from $S_{T}$. We write $a_{T}\gtrsim b_{T}$ when $a_{T}\geq c\,b_{T}$ for a constant $c>0$ independent of $T$, and $a_{T}\asymp b_{T}$ for two-sided bounds.

Assumption 3.1 (Signal and Noise).

(A1) There exists a sequence $(\Delta_{t}\geq 0)$ such that $\mathbb{E}[\phi_{t}\mid M=1]-\mathbb{E}[\phi_{t}\mid M=0]\geq\Delta_{t}$. (A2) $\max_{m}\sup_{t}\mathrm{Var}(\phi_{t}\mid M=m)\leq\sigma^{2}<\infty$. (A3) The centered process $\tilde{\phi}_{t}:=\phi_{t}-\mathbb{E}[\phi_{t}\mid M]$ is geometrically mixing with effective autocorrelation time $\tau_{\mathrm{eff}}$, implying $\mathrm{Var}(S_{T}\mid M)\leq C\frac{\sigma^{2}\tau_{\mathrm{eff}}}{T}$.

Theorem 3.2 (Trajectory Averaging).

Under A1–A3, the signal $\Gamma_{T}:=|\mathbb{E}[S_{T}\mid M=1]-\mathbb{E}[S_{T}\mid M=0]|$ and SNR satisfy:

$$\Gamma_{T}\geq\frac{1}{T}\sum_{t=0}^{T-1}\Delta_{t},\quad\mathrm{SNR}^{2}(S_{T}):=\frac{\Gamma_{T}^{2}}{\max_{m}\mathrm{Var}(S_{T}\mid M=m)}\geq\frac{\left(\frac{1}{T}\sum_{t=0}^{T-1}\Delta_{t}\right)^{2}}{C\sigma^{2}\tau_{\mathrm{eff}}/T}.$$

Interpretation. Multi-step attacks improve when mean signal decays slowly relative to variance reduction.

Corollary 3.3 (Exponential Leakage).

If $\Delta_{t}=\Delta_{0}e^{-t/\tau_{g}}$, then $\Gamma_{T}\geq\Delta_{0}\frac{1-e^{-T/\tau_{g}}}{T/\tau_{g}}$. If $\Gamma_{T}\asymp\Delta_{0}\frac{1-e^{-T/\tau_{g}}}{T/\tau_{g}}$, then $\mathrm{SNR}^{2}(S_{T})\gtrsim g(T/\tau_{g})$ where $g(x):=\frac{(1-e^{-x})^{2}}{x}$. The maximizer $x^{\star}\approx 1.2564$ yields an optimal $T^{\star}\approx 1.2564\,\tau_{g}$.

Corollary 3.4 (Amplification Gain).

Let $\kappa:=\tau_{g}/\tau_{\mathrm{eff}}$. At $T=T^{\star}$, the gain over the single-step baseline $S_{1}$ is $\frac{\mathrm{SNR}(S_{T^{\star}})}{\mathrm{SNR}(S_{1})}\gtrsim c\sqrt{\kappa}$, with $c\approx 0.638$.
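The constants in Corollaries 3.3 and 3.4 can be checked numerically. Setting $g'(x)=0$ for $x>0$ reduces to $e^{x}=1+2x$; the sketch below solves this by bisection and then evaluates $\sqrt{g(x^{\star})}$. We assume here that $c$ corresponds to $\sqrt{g(x^{\star})}$, with the constant $C$ from A3 absorbed into $\gtrsim$. The helper names (`g`, `stationarity`, `bisect`) are illustrative and not part of the paper.

```python
import math


def g(x: float) -> float:
    """g(x) = (1 - exp(-x))^2 / x, as defined in Corollary 3.3."""
    return (1.0 - math.exp(-x)) ** 2 / x


def stationarity(x: float) -> float:
    """For x > 0, g'(x) = 0 is equivalent to exp(x) - 1 - 2x = 0."""
    return math.exp(x) - 1.0 - 2.0 * x


def bisect(f, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid < 0) == (f_lo < 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid  # root lies in the lower half
    return 0.5 * (lo + hi)


# The bracket [0.5, 3.0] excludes the trivial root at x = 0.
x_star = bisect(stationarity, 0.5, 3.0)
c = math.sqrt(g(x_star))
print(f"x* = {x_star:.4f}, sqrt(g(x*)) = {c:.3f}")
```

Under these assumptions the two printed values should agree with the $1.2564$ and $0.638$ quoted in the corollaries to the stated precision.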

It is worth noting that we do not claim that trajectory iteration increases the Bayes information ceiling $I(M;Z_{0})$; instead, it improves practical fixed-form statistics via temporal variance reduction.

This theory applies to any iterative protocol satisfying A1–A3. Theorem 3.2 provides a conditional amplification guarantee. We present proofs in Appendix L.

4 Method

MADreMIA is a trajectory-augmentation framework for privacy inference on generative models. It is designed as an any-box extension of standard one-shot attacks (MIA/DI): black-box by default, gray-box when richer outputs are available, and white-box when needed. The central design principle is to keep the downstream scorer unchanged and improve only its input representation through additional trajectory-derived evidence.

Unified setup.

Following Sec. 3, for each queried sample we construct

$$Z_{0},Z_{1},\dots,Z_{T},\qquad Z_{t+1}=\mathcal{R}(f,Z_{t}),\;t=0,\dots,T-1,$$

where $Z_{0}=x$ is the queried sample, $f$ is the audited generator, and $\mathcal{R}$ is a modality-specific regeneration operator executed under a fixed protocol. For MIA, the label is $M\in\{0,1\}$ (member/non-member). For DI, we use an analogous binary label $D\in\{0,1\}$ (in-target-dataset/out-of-target-dataset).
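The recursion above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `regenerate` is a hypothetical callable standing in for $\mathcal{R}(f,\cdot)$ with the generator $f$ already bound, `phi` stands in for the per-step score, and the toy operators at the bottom are assumptions chosen only so the example runs.

```python
from typing import Callable, List, TypeVar

Z = TypeVar("Z")


def build_trajectory(x: Z, regenerate: Callable[[Z], Z], T: int) -> List[Z]:
    """Return [Z_0, ..., Z_T] with Z_0 = x and Z_{t+1} = regenerate(Z_t)."""
    trajectory = [x]
    for _ in range(T):
        trajectory.append(regenerate(trajectory[-1]))
    return trajectory


def average_step_score(trajectory: List[Z], phi: Callable[[Z, Z], float]) -> float:
    """S_T = (1/T) * sum_{t=0}^{T-1} phi(Z_t, Z_{t+1})."""
    T = len(trajectory) - 1
    if T < 1:
        raise ValueError("need at least one regeneration step")
    return sum(phi(trajectory[t], trajectory[t + 1]) for t in range(T)) / T


# Toy stand-ins (assumptions): samples are floats, "regeneration" halves the
# value, and phi is the negative absolute change between consecutive steps.
def toy_regenerate(z: float) -> float:
    return 0.5 * z


def toy_phi(a: float, b: float) -> float:
    return -abs(a - b)


traj = build_trajectory(8.0, toy_regenerate, T=3)
s_T = average_step_score(traj, toy_phi)
```

In a real audit, `regenerate` would wrap one image-to-image, text-to-text, or audio reconstruction call executed under the fixed protocol.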

Threat model.

MADreMIA supports: black-box (query access to $f$ outputs only), gray-box (query access plus output-level statistics such as loss/log-probability signals), and white-box (optional access to internals/gradients when available). In all cases, the adversary/auditor has no access to training data identities (labels), performs at most $T$ regeneration steps per sample, and outputs a binary prediction via $h$: $M$ for MIA or $D$ for DI.

Base one-shot signal.

The theory defines $\phi_{t}:=\phi(Z_{t},Z_{t+1})$ and $S_{T}:=\frac{1}{T}\sum_{t=0}^{T-1}\phi_{t}$.

A trajectory one-shot comparator corresponds to the $T=1$ case (using $\phi_{0}=\phi(Z_{0},Z_{1})$). When available, we additionally report classical one-shot baselines $z_{\mathrm{base}}=\phi_{\mathrm{base}}(Z_{0})$. Importantly, for each modality/model, the orientation (sign) of $\phi_{t}$ is fixed on train data only (equivalently $\phi_{t}$ or $-\phi_{t}$) and then frozen for test-time evaluation.
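One way to fix the orientation without touching test data is sketched below. The selection rule (compare class means on the training split) is an assumption, since the text does not state the exact criterion, and both function names are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def fit_orientation(train_scores: Sequence[float], train_labels: Sequence[int]) -> int:
    """Return +1 if members score higher on average on the train split, else -1.

    Assumed rule: compare class means; label 1 = member, 0 = non-member.
    """
    members = [s for s, y in zip(train_scores, train_labels) if y == 1]
    nonmembers = [s for s, y in zip(train_scores, train_labels) if y == 0]
    return 1 if mean(members) >= mean(nonmembers) else -1


def apply_orientation(scores: Sequence[float], sign: int) -> List[float]:
    """Apply the frozen sign so that larger values always indicate membership."""
    return [sign * s for s in scores]


# Toy example: a distance-like score where members have SMALLER raw values.
sign = fit_orientation([0.2, 0.3, 0.8, 0.9], [1, 1, 0, 0])
test_oriented = apply_orientation([0.1, 0.7], sign)
```

The sign is estimated once and reused unchanged at test time, which mirrors the train-only fitting described above.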

Signals and Fusion.

MADreMIA augments one-shot evidence with trajectory summaries computed from $(Z_{0},\dots,Z_{T})$. We define

$$z_{\mathrm{base}}=\phi_{\mathrm{base}}(Z_{0})\in\mathbb{R}^{d},\qquad z_{\mathrm{traj}}=\psi(Z_{0},\dots,Z_{T})\in\mathbb{R}^{k},$$

where $\psi$ aggregates temporal statistics aligned with the $\phi_{t}$ process (e.g., drift, consistency, quality evolution, diversity, score decay, and summaries derived from $\{\phi_{t}\}_{t=0}^{T-1}$ and $S_{T}$). The fused representation is $\tilde{z}=[z_{\mathrm{base}}\,\|\,z_{\mathrm{traj}}]\in\mathbb{R}^{d+k}$, and the final attack score is $s(Z_{0})=h(\tilde{z})$, with $h$ a calibrated scorer. By default, following Kowalczuk et al. [19], $h$ is an L1-regularized logistic regression fit as a plug-in estimator of $P(M=1\mid\tilde{z})$.
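As a rough illustration of the fusion step, the sketch below builds a small $z_{\mathrm{traj}}$ from per-step scores and concatenates it with $z_{\mathrm{base}}$. The particular summary statistics (mean, standard deviation, sum, least-squares slope) are assumptions picked for the example; the paper's actual $\psi$ is modality-specific and is described in its appendix. The scorer $h$ is deliberately left out.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def summarize_scores(phi: Sequence[float]) -> List[float]:
    """A hypothetical psi: [mean (= S_T), std, sum, least-squares slope over t]."""
    T = len(phi)
    if T < 2:
        raise ValueError("need at least two per-step scores")
    t_bar = (T - 1) / 2.0
    phi_bar = mean(phi)
    # Ordinary least-squares slope of phi_t against the step index t.
    cov = sum((t - t_bar) * (p - phi_bar) for t, p in enumerate(phi))
    var_t = sum((t - t_bar) ** 2 for t in range(T))
    return [phi_bar, pstdev(phi), sum(phi), cov / var_t]


def fuse(z_base: Sequence[float], z_traj: Sequence[float]) -> List[float]:
    """Concatenate one-shot and trajectory features: z~ = [z_base || z_traj]."""
    return list(z_base) + list(z_traj)


# Toy per-step scores that decay linearly, plus a one-dimensional base score.
z_traj = summarize_scores([0.9, 0.8, 0.7, 0.6])
z_tilde = fuse([0.42], z_traj)
```

The resulting `z_tilde` has dimension $d+k$ (here $1+4$) and would be passed to whatever calibrated scorer plays the role of $h$.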

Mechanism.

MADreMIA leverages the fact that members often exhibit slower average drift than non-members. Memorized samples typically lie in deeper local probability wells, causing iterative regenerations to remain closer to $Z_{0}$. Gains represent fixed-statistic SNR improvements consistent with the data processing inequality (DPI): $I(M;\tilde{z})\leq I(M;Z_{0})$.

4.1 Modality-specific instantiations

Image autoregressive models (IARs) and diffusion models.

$\mathcal{R}$ is image-to-image regeneration under fixed controls (autoregressive decoding for IARs; controlled re-noise/re-denoise for diffusion, i.e., partial forward noising to a fixed noise level followed by reverse denoising under fixed scheduler/settings). Trajectory features are defined relative to $Z_{0}$, in particular $\mathrm{MSE}(Z_{0},Z_{t})$, $\mathrm{LPIPS}(Z_{0},Z_{t})$ [47], and $\mathrm{SSIM}(Z_{0},Z_{t})$ [37].

Large language models (LLMs).

$\mathcal{R}$ is an autophagous text loop where each generation is fed back as the next prompt/input under a fixed template, fixed context-window policy (with left-sided truncation to keep only the newest text), and fixed decoding configuration. We use multiple features to measure the quality and diversity of generations, specifically: Kullback–Leibler Divergence, Jensen–Shannon Divergence, Jaccard Index, Predictive Entropy, and Logit Margin:

$$\mathrm{KLD}(Z_{0},Z_{t}),\quad\mathrm{JSD}(Z_{0},Z_{t}),\quad\mathrm{Jaccard}(Z_{0},Z_{t}),\quad\mathrm{Entropy}(Z_{t}),\quad\mathrm{LogitMargin}(Z_{t}),$$

for $t\in\{1,\dots,T\}$. These are summarized along the trajectory and fused with $z_{\mathrm{base}}$. For clarity, KLD/JSD are computed on aligned token-distribution vectors: in gray/white-box settings from next-token logits, and in black-box settings from smoothed empirical token-frequency distributions under a fixed tokenizer/vocabulary. The metrics in our experiments follow the gray-box setting, but the framework itself is open to the black-box setting as well, which requires repeated queries per step to estimate distributions. Jaccard is computed on token sets after the same fixed preprocessing. More information about the features for vision and language models is provided in Appendix H.
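For the black-box variant, the distributional features can be sketched from token lists alone. The code below is a minimal illustration under several assumptions: whitespace tokens stand in for a real tokenizer, add-$\alpha$ smoothing is one possible choice of smoothing, and the argument order of `kld` is arbitrary because the text does not fix the direction of $\mathrm{KLD}(Z_{0},Z_{t})$. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def smoothed_distribution(tokens: Sequence[str], vocab: Sequence[str],
                          alpha: float = 1.0) -> Dict[str, float]:
    """Add-alpha smoothed token frequencies over a fixed vocabulary."""
    counts = Counter(tokens)
    total = sum(counts[v] for v in vocab) + alpha * len(vocab)
    return {v: (counts[v] + alpha) / total for v in vocab}


def kld(p: Dict[str, float], q: Dict[str, float]) -> float:
    """KL(p || q); smoothing keeps every probability strictly positive."""
    return sum(p[v] * math.log(p[v] / q[v]) for v in p)


def jsd(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Jensen-Shannon divergence: symmetric average of KL to the mixture."""
    m = {v: 0.5 * (p[v] + q[v]) for v in p}
    return 0.5 * kld(p, m) + 0.5 * kld(q, m)


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Jaccard index of the two token sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


# Toy example: Z_0 and one regenerated text Z_t.
z0 = "the cat sat on the mat".split()
zt = "the cat sat on a rug".split()
vocab = sorted(set(z0) | set(zt))
p0 = smoothed_distribution(z0, vocab)
pt = smoothed_distribution(zt, vocab)
features = {"kld": kld(p0, pt), "jsd": jsd(p0, pt), "jaccard": jaccard(z0, zt)}
```

In the gray-box setting used in the experiments, `p0` and `pt` would instead come from next-token probabilities, with the same divergence functions applied.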

Audio generative models.

In the audio domain, $\mathcal{R}$ employs iterative reconstruction loops. Notably, we do not conduct a full Membership or Dataset Inference evaluation for audio models, as the literature currently lacks proper audio benchmarks and specialized attacks tailored to the voice conversion setting. Nevertheless, to demonstrate the cross-modal generality of our framework, our first experiment explores this potential using an objective audio fidelity metric.

Across all modalities, MADreMIA follows the pipeline: $Z_{0}\rightarrow(Z_{0:T})\rightarrow(\phi_{0:T-1},S_{T},z_{\mathrm{traj}})\rightarrow\tilde{z}\rightarrow s(Z_{0})$.

[Figure 2 panels. Images (black-box): (a) RAR-XXL, (b) VAR-d30, (c) DiT-MoE-G, (d) UViT-T2I-Deep. Text (gray-box): (e) OLMo-7B, (f) Pythia-6.9B, (g) OPT-6.7B, (h) Llama-13B. Audio (black-box): (i) FreeVC, (j) AutoVC.]
Figure 2: Divergence trajectories across chained regeneration steps. Rows represent image models (FID), language models (KLD), and audio models (FAD). Across modalities and access settings, member examples retain lower divergence and degrade more slowly than non-member examples, providing a robust signal for both membership and dataset inference. Evaluations were conducted using the following sample sizes: 10,000 for IAR and Diffusion models, 2,000 for Audio, 1,000 for OLMo, 512 for Pythia, and 250 for Llama and OPT.

5 Experiments

5.1 Experimental Setup

To ensure a scientifically sound evaluation across our MIA tasks, we restrict our setup to models trained on public datasets with well-defined training and test splits. We evaluate our method across three diverse modalities to demonstrate its broad applicability. For image generation, we analyze state-of-the-art autoregressive models (VAR-d{20, 24, 30} [33], RAR-{L, XL, XXL} [43]) and diffusion models (DiT-RF-{XL, G} [13], UViT-T2I-Deep [3]), trained primarily on the ImageNet [10] or COCO [36] datasets for class-conditioned and text-to-image generation. We extend this evaluation to the audio domain using modern Voice Conversion models (AutoVC [25], FreeVC [21]), and to the language domain using prominent LLMs (LLaMA-13B [34], Pythia-6.9B [4], OLMo-7B [14], and OPT-6.7B [48]). Comprehensive details regarding all specific models and datasets used in the experiments are provided in Appendices F and G. All experiments were conducted on a machine equipped with 3 NVIDIA RTX PRO 5000 Blackwell GPUs (48 GB VRAM each) and an Intel Xeon Gold 6526Y CPU.

5.2 Metrics

To measure similarity between feature representations and their fidelity, we use the Fréchet Inception Distance (FID) [17] and the Fréchet Audio Distance (FAD) [18] for vision and audio models, respectively. For LLMs, we measure Token Diversity at iteration $t$ (for $t>1$) as the Kullback–Leibler divergence (KLD) between the normalized average token probability distribution at iteration $t$ and that of the first evaluation iteration:

$$\mathrm{TokenDiversity}(t)=D_{\mathrm{KL}}\!\left(p_{t}\,\|\,p_{1}\right)=\sum_{i\in V}p_{t}(i)\,\log\frac{p_{t}(i)}{p_{1}(i)},$$

where $p_{t}$ and $p_{1}$ are the normalized average token probability distributions for step $t$ and step $1$, respectively, and the sum runs over the vocabulary $V$.

5.3 MIA and DI procedures

MIA pipeline.

For each labeled member/non-member sample, we generate $Z_{0},\dots,Z_{T}$, compute $\phi_{t}$, $S_{T}$, and modality-specific trajectory features, form $\tilde{z}$, and fit $h_{\mathrm{mia}}$. We evaluate univariate trajectory statistics by direct thresholding and multivariate features by logistic-regression fusion on strictly stratified 80/20 train-test splits. We report the established metrics AUC, TPR at 1% FPR, and accuracy. Splitting is performed at sample/source level before trajectory generation: all descendants of the same $Z_{0}$ (all $Z_{t}$, all derived features) remain in the same partition. Thresholds, feature normalization, and LR calibration are fit on train only and applied unchanged to test. The primary endpoint is the multivariate fusion score; univariate $S_{T}^{\star}$ results are reported as theory-aligned diagnostics. If $T$ is tuned, it is selected on train (or a train-only validation split) and never on test.
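The two ranking metrics reported here can be computed directly from attack scores and labels. The sketch below is a plain, quadratic-time illustration of their standard definitions, not the evaluation code used in the paper; it assumes higher scores indicate membership, and the function names and toy data are hypothetical.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as P(random member outscores random non-member); ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tpr_at_fpr(scores: Sequence[float], labels: Sequence[int],
               max_fpr: float = 0.01) -> float:
    """Largest TPR over thresholds whose FPR does not exceed max_fpr.

    A sample is predicted "member" when its score is >= the threshold.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0  # a threshold above every score gives TPR = FPR = 0
    for thr in set(scores):
        fpr = sum(n >= thr for n in neg) / len(neg)
        if fpr <= max_fpr:
            best = max(best, sum(p >= thr for p in pos) / len(pos))
    return best


# Toy scores (assumption): label 1 = member, 0 = non-member.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
auc = roc_auc(toy_scores, toy_labels)
tpr_low_fpr = tpr_at_fpr(toy_scores, toy_labels, max_fpr=0.01)
```

With only four non-members in the toy data, the 1% FPR constraint effectively means zero false positives; realistic test sets need many more non-members for this operating point to be meaningful.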

DI pipeline.

The DI pipeline is identical, replacing the target label with the dataset-origin variable $D$. The same $Z_{t}$, $\phi_{t}$, and trajectory-fusion machinery is used; only label semantics and calibration change. For DI, splitting/evaluation are performed at dataset or source-group level, and per-sample logits are aggregated by a fixed mean rule into a dataset-level score. Dataset-level decisions are evaluated against a permutation-based null over dataset labels within the evaluation fold.
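A permutation-based null of this kind can be sketched as follows. The test statistic (difference between the mean logit of the suspect set and that of a reference set), the one-sided direction, and the number of permutations are all assumptions made for illustration; the text only specifies mean aggregation and a permutation null over dataset labels.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_p_value(suspect: Sequence[float], reference: Sequence[float],
                        n_perm: int = 2000, seed: int = 0) -> float:
    """One-sided permutation p-value for mean(suspect) > mean(reference).

    Dataset labels are shuffled across the pooled per-sample logits; the
    +1 terms give the usual finite-sample correction.
    """
    rng = random.Random(seed)
    observed = mean(suspect) - mean(reference)
    pooled = list(suspect) + list(reference)
    k = len(suspect)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean(pooled[:k]) - mean(pooled[k:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy per-sample logits (assumption): the suspect set scores clearly higher.
suspect_logits = [2.1, 1.8, 2.4, 1.9, 2.2, 2.0, 2.3, 1.7]
reference_logits = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, 0.4, -0.3]
p_separated = permutation_p_value(suspect_logits, reference_logits)
p_identical = permutation_p_value(reference_logits, reference_logits)
```

A small p-value supports the claim that the suspect set behaves like training data, while comparing a set against itself should yield a large p-value.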

Both MIA and DI setups inherit standard generative privacy-audit conventions, including the IAR setting introduced in [19].

5.4 Research questions

We evaluate whether chained regeneration can be a signal amplifier for one-shot auditing across modalities, model families, and access regimes. Our analysis focuses on the following questions: (Q1) What distinguishes member/non-member chained generation trajectories? (Q2) Can one-shot membership signal be amplified for single $\phi(t)$ features? What are the gains for trajectory-based $S_{T}$ over $\phi(t)$ across modalities? (Q3) Does MADreMIA increase member/non-member separability compared to one-shot MIA? (Q4) Does increasing generative model stochasticity during the regeneration loop affect the trajectory separation between members and non-members? (Q5) How does model size affect member/non-member trajectory signals? Finally, we also provide a short analysis of the Getty Images case [9] in Appendix K.

5.5 Members and Non-members differ in generative trajectories: qualitative results.

Across all modalities, members and non-members exhibit distinct regeneration dynamics. Members preserve structure longer and drift more slowly, while non-members degrade faster and diverge toward the model’s generic prior. This pattern is visible both in per-step qualitative examples (Figures 3(a) and 3(b)) and in aggregate divergence trajectories (Figure 2) comparing the quality of regenerations to base samples (FID for images, FAD for audio) and the drift of the output token distribution in text models. The results presented support the core hypothesis that the auto-regeneration trajectory contains multiple membership cues. The key trajectory asymmetry findings are:

  1. Fidelity and degradation: Re-members maintain high structural quality throughout the trajectory, whereas re-non-members exhibit rapid perceptual and semantic degradation.

  2. Persistence and divergence: Re-members demonstrate significant structural persistence and coherence across iterations. Conversely, re-non-members diverge more quickly, drifting toward the model’s general distribution and losing the specific characteristics of the original input.

[Figure 3 panels: (a) Members. (b) Non-members.]
Figure 3: Qualitative comparison of members and non-members across iterative regeneration (VAR-d30). The quality of non-member images degrades faster than that of members, whose semantic coherence is largely preserved across regenerations.
The asymmetry is present across diverse models and modalities.

We test broad architectural diversity: image autoregressive and diffusion models, audio voice conversion/generation models, and text generative models. Figure 2 summarizes trajectory behavior using modality-appropriate divergence metrics (Section 5.2). This design directly tests whether our proposed signal amplification is model- and modality-agnostic.

5.6 $\phi(t)$ statistics may increase membership signal over one-shot $\phi(0)$.

We evaluate the validity of our theoretical assumptions using empirical generative trajectories, fixing $T$ to the first 15 iterations. As summarized in Table 2, while Assumption A2 is fully supported, A1 and A3 receive only partial empirical backing. Specifically, for certain values of $\phi_{t}$, the absence of clear exponential decay within the first 15 iterations is acceptable for our main claim, since it indicates slower or plateau-like leakage. It suggests that non-exponential leakage forms may also govern real trajectories.

To assess the efficacy of modality-specific trajectory statistics, we evaluate whether aggregated trajectory evidence remains competitive with, or outperforms, the one-shot evidence. We define

$$\mathrm{gain}:=\frac{\max_{T}\mathrm{SNR}^{2}(S_{T})}{\max_{t}\mathrm{SNR}^{2}(\phi_{t})},$$

and show results in Table 1. Trajectory diagnostics are strong:

$$P(\mathrm{gain}\geq 1)=\frac{8}{11}=0.73,\qquad P(\mathrm{gain}\geq 0.9)=\frac{10}{11}=0.91,$$

with median gain $=1.00$. Given the small number of tested features, we interpret these numbers as supportive preliminary evidence.
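The gain statistic can be estimated from per-step scores as sketched below. This is an illustrative implementation of the formula above, using the empirical analogue of $\mathrm{SNR}^{2}$ from Theorem 3.2; the data layout (`rows[i][t]` is the step-$t$ score of sample $i$) and the synthetic Gaussian scores are assumptions and do not reproduce any result from the paper.

```python
import random
from statistics import mean, pvariance
from typing import List, Sequence


def snr_squared(members: Sequence[float], nonmembers: Sequence[float]) -> float:
    """Empirical SNR^2: squared gap of class means over the larger class variance."""
    gap = mean(members) - mean(nonmembers)
    return gap ** 2 / max(pvariance(members), pvariance(nonmembers))


def gain(phi_mem: Sequence[Sequence[float]], phi_non: Sequence[Sequence[float]]) -> float:
    """gain = max_T SNR^2(S_T) / max_t SNR^2(phi_t)."""
    steps = len(phi_mem[0])

    def column(rows, t):
        return [row[t] for row in rows]

    def prefix_mean(rows, T):
        return [sum(row[:T]) / T for row in rows]  # S_T for each sample

    best_single = max(snr_squared(column(phi_mem, t), column(phi_non, t))
                      for t in range(steps))
    best_average = max(snr_squared(prefix_mean(phi_mem, T), prefix_mean(phi_non, T))
                       for T in range(1, steps + 1))
    return best_average / best_single


# Synthetic scores (assumption): constant mean gap, independent unit noise.
rng = random.Random(0)


def toy_scores(shift: float, n: int = 200, steps: int = 10) -> List[List[float]]:
    return [[shift + rng.gauss(0.0, 1.0) for _ in range(steps)] for _ in range(n)]


toy_gain = gain(toy_scores(0.3), toy_scores(0.0))
```

Because $S_{1}=\phi_{0}$, the numerator is never smaller than $\mathrm{SNR}^{2}(\phi_{0})$, but it can fall below the best single-step score, which is why gains under 1 are possible.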

Table 1: $S_{T}$ gains over $\phi_{t}$ across modalities. $P(\text{gain}\geq 1)$ indicates the fraction of models where scoring matches or exceeds the baseline.

| Family | $n$ | $P(\geq 1)$ | $P(\geq 0.9)$ | Median |
|---|---|---|---|---|
| VAR | 3 | 0.67 | 1.00 | 1.00 |
| Diffusion | 3 | 0.67 | 1.00 | 1.00 |
| LLM | 5 | 0.80 | 0.80 | 1.04 |

Table 2: Assumption support across model families. Fractions indicate the number of models satisfying each assumption.

| Family | A1 | A2 | A3 |
|---|---|---|---|
| VAR | 3/3 | 3/3 | 3/3 |
| Diffusion | 3/3 | 3/3 | 2/3 |
| LLM | 3/5 | 5/5 | 2/5 |

5.7 MADreMIA amplifies baseline MIA

Tables 3 and 4 compare MADreMIA-augmented attacks against their unaided baselines across LLMs and IARs. Incorporating reconstruction Diversity ($\mathrm{MSE}_{\text{sum}}$, $\mathrm{LPIPS}_{\text{sum}}$), Quality ($\mathrm{SSIM}_{\text{sum}}$, $\mathrm{SSIM}_{\text{std}}$), or both (Combined) raises AUC in all but two LLM configurations (Min-K% + Diversity on Pythia-6.9B and OPT-6.7B), while TPR@1%FPR gains are less uniform for the stronger Min-K% and CAMIA baselines. Gains are most pronounced on OLMo-7B, where, for example, the Zlib baseline collapses to AUC 0.179 yet recovers to 0.868 with Combined signals, and CAMIA reaches AUC 0.969, the strongest result across all settings. On the remaining LLMs the improvements are more modest. For IARs, MADreMIA yields clear gains in classification accuracy: VAR-d30 improves from 0.607 to 0.696 (+8.9 p.p.) and RAR-XXL from 0.562 to 0.713 (+15.1 p.p.), although TPR@1%FPR gains are smaller and less stable. Together, these results indicate that iterative reconstruction signals provide complementary, architecture-agnostic information that strengthens membership inference across both LLMs and IARs.

Table 3: MIA results on established LLM benchmarks (described in detail in Appendix G), where MADreMIA trajectory features are aggregated across 15 iterations. Augmenting a base attack with diversity, quality, or combined signals improves AUC over the unaided baseline in nearly all settings; TPR@1%FPR improves most clearly for the weaker Loss and Zlib baselines.

| Attack | Pythia-6.9B TPR@1%FPR | Pythia-6.9B AUC | OLMo-7B TPR@1%FPR | OLMo-7B AUC | OPT-6.7B TPR@1%FPR | OPT-6.7B AUC | Llama-13B TPR@1%FPR | Llama-13B AUC |
|---|---|---|---|---|---|---|---|---|
| Loss [42] | 0.004 ±0.00 | 0.349 ±0.02 | 0.008 ±0.01 | 0.523 ±0.02 | 0.013 ±0.01 | 0.390 ±0.04 | 0.009 ±0.01 | 0.368 ±0.04 |
| + Diversity | 0.093 ±0.06 | 0.647 ±0.05 | 0.303 ±0.09 | 0.735 ±0.04 | 0.092 ±0.12 | 0.613 ±0.09 | 0.173 ±0.14 | 0.690 ±0.08 |
| + Quality | 0.096 ±0.07 | 0.686 ±0.05 | 0.032 ±0.04 | 0.702 ±0.04 | 0.084 ±0.09 | 0.652 ±0.07 | 0.198 ±0.13 | 0.679 ±0.09 |
| + Combined | 0.100 ±0.08 | 0.673 ±0.06 | 0.263 ±0.14 | 0.804 ±0.03 | 0.112 ±0.12 | 0.672 ±0.09 | 0.188 ±0.15 | 0.702 ±0.07 |
| Zlib [5] | 0.000 ±0.00 | 0.338 ±0.02 | 0.022 ±0.01 | 0.179 ±0.01 | 0.012 ±0.02 | 0.369 ±0.03 | 0.009 ±0.01 | 0.337 ±0.03 |
| + Diversity | 0.129 ±0.08 | 0.677 ±0.05 | 0.318 ±0.11 | 0.842 ±0.03 | 0.099 ±0.11 | 0.628 ±0.08 | 0.176 ±0.14 | 0.689 ±0.07 |
| + Quality | 0.124 ±0.08 | 0.673 ±0.06 | 0.208 ±0.10 | 0.833 ±0.03 | 0.092 ±0.10 | 0.667 ±0.08 | 0.210 ±0.14 | 0.688 ±0.08 |
| + Combined | 0.128 ±0.08 | 0.690 ±0.06 | 0.295 ±0.14 | 0.868 ±0.02 | 0.121 ±0.12 | 0.672 ±0.08 | 0.194 ±0.15 | 0.693 ±0.08 |
| Min-K% [27] | 0.124 ±0.08 | 0.680 ±0.05 | 0.067 ±0.07 | 0.703 ±0.04 | 0.086 ±0.11 | 0.650 ±0.08 | 0.127 ±0.11 | 0.648 ±0.09 |
| + Diversity | 0.120 ±0.07 | 0.677 ±0.05 | 0.219 ±0.08 | 0.775 ±0.03 | 0.064 ±0.09 | 0.640 ±0.08 | 0.144 ±0.13 | 0.685 ±0.08 |
| + Quality | 0.124 ±0.07 | 0.695 ±0.05 | 0.095 ±0.09 | 0.772 ±0.03 | 0.094 ±0.11 | 0.674 ±0.09 | 0.178 ±0.14 | 0.686 ±0.08 |
| + Combined | 0.113 ±0.07 | 0.694 ±0.05 | 0.240 ±0.15 | 0.837 ±0.03 | 0.092 ±0.10 | 0.694 ±0.08 | 0.182 ±0.14 | 0.700 ±0.07 |
| CAMIA [6] | 0.111 ±0.09 | 0.683 ±0.05 | 0.428 ±0.25 | 0.958 ±0.01 | 0.128 ±0.12 | 0.664 ±0.08 | 0.166 ±0.13 | 0.686 ±0.09 |
| + Diversity | 0.118 ±0.08 | 0.690 ±0.05 | 0.517 ±0.25 | 0.966 ±0.01 | 0.104 ±0.11 | 0.668 ±0.08 | 0.146 ±0.12 | 0.692 ±0.08 |
| + Quality | 0.131 ±0.08 | 0.708 ±0.05 | 0.501 ±0.26 | 0.964 ±0.01 | 0.115 ±0.13 | 0.682 ±0.08 | 0.192 ±0.14 | 0.712 ±0.08 |
| + Combined | 0.109 ±0.08 | 0.696 ±0.05 | 0.553 ±0.27 | 0.969 ±0.01 | 0.109 ±0.12 | 0.689 ±0.08 | 0.176 ±0.13 | 0.716 ±0.08 |

Table 4: MIA results on IARs, where MADreMIA trajectory features are aggregated across 10 iterations (benchmark details in Appendix G). While AUC remains stable across augmentation variants, TPR@1%FPR and Accuracy improve substantially.

| Attack | VAR-d30 TPR@1%FPR | VAR-d30 AUC | VAR-d30 ACC | RAR-XXL TPR@1%FPR | RAR-XXL AUC | RAR-XXL ACC |
|---|---|---|---|---|---|---|
| Baseline [19] | 0.040 ±0.02 | 0.750 ±0.02 | 0.607 ±0.07 | 0.044 ±0.02 | 0.754 ±0.01 | 0.562 ±0.02 |
| + Diversity | 0.090 ±0.09 | 0.755 ±0.03 | 0.691 ±0.03 | 0.084 ±0.06 | 0.771 ±0.03 | 0.700 ±0.03 |
| + Quality | 0.076 ±0.08 | 0.757 ±0.03 | 0.703 ±0.03 | 0.079 ±0.07 | 0.754 ±0.04 | 0.703 ±0.03 |
| + Combined | 0.088 ±0.06 | 0.750 ±0.04 | 0.696 ±0.03 | 0.069 ±0.05 | 0.775 ±0.03 | 0.713 ±0.03 |

5.8 MADreMIA amplifies baseline DI

The p-value histograms in Figure 4 demonstrate that MADreMIA trajectory features consistently strengthen the statistical evidence for dataset-level inference across all evaluated architectures. On Pythia-6.9B, augmented variants reach the 95% confidence threshold at around 100 samples versus roughly 150 for the baseline. Furthermore, augmented variants shift the distribution of $-\log_{10}(p)$ values noticeably rightward relative to the baseline, with this pattern holding across all three signal types. The effect is more pronounced on RAR-XXL, where the Combined variant produces a substantially larger rightward shift, indicating that individual trials yield stronger and more reliable evidence for membership inference.

Figure 4: Dataset Inference performance on selected models: (a) Pythia-6.9B, (b) RAR-XXL.

5.9 Sensitivity analysis of generation strength

Figure 5 shows PR curves for VAR-d30 across regeneration strengths $s\in\{2,4,6,8\}$, where $s$ controls how many final scales are regenerated. Members consistently achieve higher precision and recall than non-members across all values of $s$, confirming that the MIA signal is robust to the choice of regeneration strength. As $s$ increases, however, the two groups converge in PR space (see Appendix J).

Figure 5: Precision-Recall curves for VAR-d30 across regeneration strengths $s\in\{2,4,6,8\}$. Members (green) and Non-Members (red) are traced over 15 iterations, with color intensity indicating iteration progress. Larger $s$ corresponds to more aggressive regeneration.

5.10 Trajectory asymmetry scaling across model families

As illustrated in Figure 6, the membership signal, quantified by $\Delta\text{FID}=\text{FID}_{\text{nonmem}}-\text{FID}_{\text{mem}}$, persists across all model scales, suggesting that the observed asymmetry is a fundamental property rather than an artifact of specific parameter regimes. The magnitude of this separation varies across architectures, and its relationship with model scale is not uniform: the separation grows stronger with model size in VAR and DiT-MoE, but remains largely unaffected by scaling in RARs. Ultimately, the underlying trend is robust: iterative trajectory chaining consistently exposes a larger membership gap compared to standard one-shot generations.

(a) RAR
(b) VAR
(c) DiT-MoE
Figure 6: Ablation: Trajectory asymmetry scaling across model families. Membership separation ($\Delta$FID) persists across model scales, confirming that iterative trajectory chaining consistently amplifies membership signals compared to one-shot baselines.

6 Conclusions

We introduced MADreMIA, a model-agnostic membership inference signal amplifier for large generative models. By chaining repeated regenerations rather than relying on a single query, MADreMIA exploits a consistent asymmetry: member samples retain coherence across iterations while non-members drift and deteriorate. This signal generalizes across image, text, and audio generators, spanning IAR, diffusion, and LLM families. Our experimental results show that fusing trajectory-derived features with baseline MIA/DI scores further improves member/non-member separability, suggesting that iterative regeneration is a broadly applicable lens for privacy auditing and copyright attribution.

Acknowledgments

We gratefully acknowledge the Polish high-performance computing infrastructure PLGrid for providing computer facilities and support within computational grant no. PLG/2025/018391. This research was partially funded by the National Science Centre, Poland, grant no. 2023/51/I/ST6/02854.

References

  • [1] S. Alemohammad, J. Casco-Rodriguez, L. Luzi, A. I. Humayun, H. Babaei, D. LeJeune, A. Siahkoohi, and R. Baraniuk (2023) Self-consuming generative models go mad. In The Twelfth International Conference on Learning Representations, Cited by: §E.3, §1, §2.
  • [2] C. Bai, H. Lin, C. Raffel, and W. C. Kan (2021) On training sample memorization: lessons from benchmarking generative modeling with a large-scale competition. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, KDD ’21, New York, NY, USA, pp. 2534–2542. External Links: ISBN 9781450383325, Link, Document Cited by: §E.1, §2.
  • [3] F. Bao, S. Nie, K. Xue, Y. Cao, C. Li, H. Su, and J. Zhu (2023) All are worth words: a vit backbone for diffusion models. In CVPR, Cited by: Appendix F, §5.1.
  • [4] S. Biderman, H. Schoelkopf, Q. Anthony, H. Bradley, K. O’Brien, E. Hallahan, M. A. Khan, S. Purohit, U. S. Prashanth, E. Raff, A. Skowron, L. Sutawika, and O. Van Der Wal (2023) Pythia: a suite for analyzing large language models across training and scaling. In Proceedings of the 40th International Conference on Machine Learning, ICML’23. Cited by: Appendix F, §5.1.
  • [5] N. Carlini, F. Tramer, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, U. Erlingsson, et al. (2021) Extracting training data from large language models. In 30th USENIX security symposium (USENIX Security 21), pp. 2633–2650. Cited by: §E.2, §1, §2, Table 3.
  • [6] H. Chang, A. S. Shamsabadi, K. Katevas, H. Haddadi, and R. Shokri (2025) Context-aware membership inference attacks against pre-trained large language models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pp. 7299–7321. Cited by: §E.2, §2, Table 3.
  • [7] C. Chen, D. Liu, M. Shah, and C. Xu (2025) Enhancing privacy-utility trade-offs to mitigate memorization in diffusion models. 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8182–8191. External Links: Link Cited by: §E.1, §2.
  • [8] C. A. Choquette-Choo, F. Tramer, N. Carlini, and N. Papernot (2021) Label-only membership inference attacks. In International conference on machine learning, pp. 1964–1974. Cited by: §1.
  • [9] M. Coulter (2024) Aiming for fairness: an exploration into getty images v. stability ai and its importance in the landscape of modern copyright law. DePaul J. Art Tech. & Intell. Prop. L 34, pp. 124. Cited by: Appendix K, §5.4.
  • [10] J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei (2009) Imagenet: a large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255. Cited by: Table 8, §5.1.
  • [11] M. Duan, A. Suri, N. Mireshghallah, S. Min, W. Shi, L. Zettlemoyer, Y. Tsvetkov, Y. Choi, D. Evans, and H. Hajishirzi (2024) Do membership inference attacks work on large language models?. In Conference on Language Modeling (COLM), Cited by: Table 8, Appendix G.
  • [12] J. Dubiński, A. Kowalczuk, F. Boenisch, and A. Dziedzic (2025) Cdi: copyrighted data identification in diffusion models. In Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 18674–18684. Cited by: §E.2, §1, §2.
  • [13] Z. Fei, M. Fan, C. Yu, D. Li, and J. Huang (2024) Scaling diffusion transformers to 16 billion parameters. External Links: 2407.11633, Link Cited by: Appendix F, §5.1.
  • [14] D. Groeneveld, I. Beltagy, E. Walsh, A. Bhagia, R. Kinney, O. Tafjord, A. Jha, H. Ivison, I. Magnusson, Y. Wang, et al. (2024) OLMo: accelerating the science of language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 15789–15809. Cited by: Appendix F, §5.1.
  • [15] X. Gu, C. Du, T. Pang, C. Li, M. Lin, and Y. Wang (2023) On memorization in diffusion models. arXiv preprint arXiv:2310.02664. Cited by: §E.1, §2.
  • [16] A. Hans, Y. Wen, N. Jain, J. Kirchenbauer, H. Kazemi, P. Singhania, S. Singh, G. Somepalli, J. Geiping, A. Bhatele, et al. (2024) Be like a goldfish, don’t memorize! mitigating memorization in generative llms. Advances in Neural Information Processing Systems 37, pp. 24022–24045. Cited by: §E.1, §2.
  • [17] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter (2017) Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in neural information processing systems 30. Cited by: §5.2.
  • [18] K. Kilgour, M. Zuluaga, D. Roblek, and M. Sharifi (2018) Fréchet audio distance: a metric for evaluating music enhancement algorithms. arXiv preprint arXiv:1812.08466. Cited by: §5.2.
  • [19] A. Kowalczuk, J. Dubiński, F. Boenisch, and A. Dziedzic (2025) Privacy attacks on image autoregressive models. In Forty-second International Conference on Machine Learning, External Links: Link Cited by: §E.2, §2, §4, §5.3, Table 4.
  • [20] J. Li, J. Dong, T. He, and J. Zhang (2024) Towards black-box membership inference attack for diffusion models. CoRR abs/2405.20771. External Links: Document, 2405.20771, Link Cited by: §E.2, §2.
  • [21] J. Li, W. Tu, and L. Xiao (2023) Freevc: towards high-quality text-free one-shot voice conversion. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5. Cited by: Appendix F, §5.1.
  • [22] P. Maini, H. Jia, N. Papernot, and A. Dziedzic (2024) LLM dataset inference: did you train on my dataset?. CoRR abs/2406.06443. External Links: Document, 2406.06443, Link Cited by: §E.2, §1, §2.
  • [23] P. Maini and A. Suri. Reassessing EMNLP 2024’s best paper: does divergence-based calibration for MIAs hold up?. In The Fourth Blogpost Track at ICLR 2025, Cited by: §E.2, §2.
  • [24] P. Maini, M. Yaghini, and N. Papernot (2021) Dataset inference: ownership resolution in machine learning. arXiv preprint arXiv:2104.10706. Cited by: §1.
  • [25] K. Qian, Y. Zhang, S. Chang, X. Yang, and M. Hasegawa-Johnson (2019) Autovc: zero-shot voice style transfer with only autoencoder loss. In International Conference on Machine Learning, pp. 5210–5219. Cited by: Appendix F, §5.1.
  • [26] M. Sakarvadia, A. Ajith, A. M. Khan, N. C. Hudson, C. Geniesse, K. Chard, Y. Yang, I. Foster, and M. W. Mahoney (2024) Mitigating memorization in language models. In The Thirteenth International Conference on Learning Representations, Cited by: §E.1, §2.
  • [27] W. Shi, A. Ajith, M. Xia, Y. Huang, D. Liu, T. Blevins, D. Chen, and L. Zettlemoyer (2023) Detecting pretraining data from large language models. arXiv preprint arXiv:2310.16789. Cited by: Table 3.
  • [28] R. Shokri, M. Stronati, C. Song, and V. Shmatikov (2017) Membership inference attacks against machine learning models. In 2017 IEEE symposium on security and privacy (SP), pp. 3–18. Cited by: §1.
  • [29] I. Shumailov, Z. Shumaylov, Y. Zhao, N. Papernot, R. Anderson, and Y. Gal (2024-07) AI models collapse when trained on recursively generated data. Nature 631, pp. 755–759. External Links: Document Cited by: §E.3, §1, §2.
  • [30] A. K. Singh, M. Y. Kocyigit, A. Poulton, D. Esiobu, M. Lomeli, G. Szilvasy, and D. Hupkes (2024) Evaluation data contamination in llms: how do we measure it and (when) does it matter?. arXiv preprint arXiv:2411.03923. Cited by: §1.
  • [31] L. Soldaini, R. Kinney, A. Bhagia, D. Schwenk, D. Atkinson, R. Authur, B. Bogin, K. Chandu, J. Dumas, Y. Elazar, et al. (2024) Dolma: an open corpus of three trillion tokens for language model pretraining research. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 15725–15788. Cited by: Table 8.
  • [32] J. Tao and R. Shokri (2025) (Token-level) InfoRMIA: stronger membership inference and memorization assessment for LLMs. CoRR abs/2510.05582. External Links: Document, 2510.05582, Link Cited by: §E.2, §2.
  • [33] K. Tian, Y. Jiang, Z. Yuan, B. Peng, and L. Wang (2024) Visual autoregressive modeling: scalable image generation via next-scale prediction. External Links: 2404.02905, Link Cited by: Appendix F, §5.1.
  • [34] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar, et al. (2023) Llama: open and efficient foundation language models. arXiv preprint arXiv:2302.13971. Cited by: Appendix F, §5.1.
  • [35] G. van den Burg and C. Williams (2021) On memorization in probabilistic deep generative models. In Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. W. Vaughan (Eds.), Vol. 34, pp. 27916–27928. External Links: Link Cited by: §E.1, §2.
  • [36] A. Veit, T. Matera, L. Neumann, J. Matas, and S. Belongie (2016) Coco-text: dataset and benchmark for text detection and recognition in natural images. arXiv preprint arXiv:1601.07140. Cited by: Table 8, §5.1.
  • [37] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli (2004) Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing 13 (4), pp. 600–612. Cited by: item Structural Similarity Index Measure (SSIM), §4.1.
  • [38] Y. Wen, Y. Liu, C. Chen, and L. Lyu (2024) Detecting, explaining, and mitigating memorization in diffusion models. In The Twelfth International Conference on Learning Representations, External Links: Link Cited by: §E.1.
  • [39] Y. Wu, H. Qiu, S. Guo, J. Li, and T. Zhang (2024) You only query once: an efficient label-only membership inference attack. In The Twelfth International Conference on Learning Representations, External Links: Link Cited by: §1.
  • [40] J. Yamagishi, C. Veaux, and K. MacDonald (2019) CSTR vctk corpus: english multi-speaker corpus for cstr voice cloning toolkit (version 0.92). The Rainbow Passage which the speakers read out can be found in the International Dialects of English Archive: http://web.ku.edu/~idea/readings/rainbow.htm. Cited by: Table 8.
  • [41] J. Ye, A. Maddi, S. K. Murakonda, V. Bindschaedler, and R. Shokri (2022) Enhanced membership inference attacks against machine learning models. In Proceedings of the 2022 ACM SIGSAC conference on computer and communications security, pp. 3093–3106. Cited by: §E.2, §1, §2.
  • [42] S. Yeom, I. Giacomelli, M. Fredrikson, and S. Jha (2018) Privacy risk in machine learning: analyzing the connection to overfitting. In 2018 IEEE 31st computer security foundations symposium (CSF), pp. 268–282. Cited by: Table 3.
  • [43] Q. Yu, J. He, X. Deng, X. Shen, and L. Chen (2024) Randomized autoregressive visual generation. External Links: 2411.00776, Link Cited by: Appendix F, §5.1.
  • [44] M. Zawalski, M. Boubdir, K. Bałazy, B. Nushi, and P. Ribalta (2026) Detecting data contamination in LLMs via in-context learning. In The Fourteenth International Conference on Learning Representations, External Links: Link Cited by: Appendix G, §1.
  • [45] H. Zen, V. Dang, R. A. J. Clark, Y. Zhang, R. J. Weiss, Y. Jia, Z. Chen, and Y. Wu (2019) LibriTTS: a corpus derived from librispeech for text-to-speech. In Interspeech, External Links: Link Cited by: Table 8.
  • [46] J. Zhang, J. Sun, E. C. Yeats, Y. Ouyang, M. Kuo, J. Zhang, H. Yang, and H. H. Li (2024) Min-K%++: improved baseline for detecting pre-training data from large language models. CoRR abs/2404.02936. External Links: Document, 2404.02936, Link Cited by: §E.2, §1, §2.
  • [47] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang (2018) The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 586–595. Cited by: item Learned Perceptual Image Patch Similarity (LPIPS), §4.1.
  • [48] S. Zhang, S. Roller, N. Goyal, M. Artetxe, M. Chen, S. Chen, C. Dewan, M. Diab, X. Li, X. V. Lin, et al. (2022) Opt: open pre-trained transformer language models. arXiv preprint arXiv:2205.01068. Cited by: Appendix F, §5.1.
  • [49] Z. Zhang, C. Yan, and B. A. Malin (2022) Membership inference attacks against synthetic health data. Journal of biomedical informatics 125, pp. 103977. Cited by: §1.

Appendix A Impact Statement

This work advances methods for auditing generative models by improving membership and dataset inference through chained regeneration. The primary positive impact is stronger accountability: MADreMIA can help detect memorization of sensitive, proprietary, or benchmark data, supporting privacy audits, copyright verification, and unlearning validation across model families and modalities.

While enhanced inference capabilities can assist in model auditing and transparency, they also require responsible application to avoid potential misuse. We frame MADreMIA as a tool for research evaluation, compliance monitoring, and internal red-teaming. It is important to note that our method provides statistical evidence rather than a definitive proof of data inclusion; therefore, results should be interpreted alongside additional forensic and procedural evidence within a broader data governance framework.

Appendix B Limitations

While our proposed framework is designed to be cross-modal and model-agnostic, our experimental scope is naturally constrained by several practical and theoretical factors. Most notably, we do not conduct full Membership Inference Attack (MIA) evaluations on audio generation models. Although our initial signal-degradation experiments indicate that iterative trajectory features exist in the audio domain, the literature currently lacks established single-step baselines tailored for these architectures, leaving MIA for audio models untested. Furthermore, while our framework is conceptually compatible with restricted setups, our current empirical evaluations rely on gray-box access to exact next-token logits, meaning that strictly black-box MIA remains untested in our work. Operationally, the primary limitation of our method is its scalability; the iterative regeneration loop requires multiple forward passes per sample, so its computational overhead grows linearly with the number of iterations. From a theoretical perspective, our core assumptions A1 and A3 are only partially satisfied in practice, as demonstrated by the empirical measurements in Table 2. Finally, our evaluations may be susceptible to distribution-shift confounds, where trajectory differences might stem from inherent dataset mismatches rather than pure memorization, and the exploratory findings presented in Section 5.6 are based on preliminary small-$n$ evidence that will require larger-scale validation in future work.

Appendix C LLM Usage

Large language models were used to improve the readability and clarity of portions of the manuscript, as well as to provide feedback during the writing and revision process. The authors verified all technical statements, citations, and claims and take full responsibility for the final content.

Appendix D Method Overview

Figure 7: MADreMIA amplifies membership signal via trajectory features. Left: both methods share the same suspect sample $Z_{0}$ and base features $z_{\mathrm{base}}=\phi(Z_{0})$. Top: baseline one-shot MIA uses no trajectory add-on and yields weak member/non-member separability. Bottom: MADreMIA adds trajectory features $z_{\mathrm{traj}}=\psi(Z_{0},\ldots,Z_{T})$ from chained regeneration. Members exhibit consistent trajectories (slow drift from $Z_{0}$), whereas non-members drift inconsistently. The fused representation $\tilde{z}=[z_{\mathrm{base}}\,\|\,z_{\mathrm{traj}}]$ is scored by the same plug-in estimator $h$ (e.g., L1-regularized logistic regression), amplifying the attack signal without changing the scorer.
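
The pipeline in Figure 7 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `regenerate`, `base_features`, and `distance` are hypothetical callables supplied by the caller, the trajectory map is taken to be the per-step distance from the initial sample, and the scorer is omitted.

```python
def chained_trajectory(z0, regenerate, steps):
    """Return [Z_0, ..., Z_T], feeding each output back in as the next input."""
    trajectory = [z0]
    for _ in range(steps):
        trajectory.append(regenerate(trajectory[-1]))
    return trajectory


def fuse_features(z0, regenerate, steps, base_features, distance):
    """Concatenate base features of Z_0 with per-step drift from Z_0.

    Assumption: the trajectory features are distances d(Z_0, Z_t) for
    t = 1..T; the paper's feature map may be richer than this.
    """
    trajectory = chained_trajectory(z0, regenerate, steps)
    z_traj = [distance(z0, z_t) for z_t in trajectory[1:]]
    return list(base_features(z0)) + z_traj


# Toy stand-ins (hypothetical): a "model" that halves every coordinate.
def toy_regenerate(z):
    return [0.5 * v for v in z]


def toy_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def toy_base_features(z):
    return [sum(z) / len(z)]


fused = fuse_features([1.0, 1.0], toy_regenerate, 3, toy_base_features, toy_distance)
```

The resulting vector is what a downstream scorer would consume. Because the base features are left untouched and the trajectory block is only appended, the same scorer can be reused for the baseline and the augmented attack.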

Appendix E Extended Related Works

Our work builds upon three intersecting lines of prior research: the characterization of data memorization in generative models, the evolution of membership inference, and the dynamics of model collapse during recursive generation.

E.1 Memorization

Memorization, the tendency of generative models to reproduce training examples rather than generate novel samples, has been studied across multiple model families and from both measurement and mitigation perspectives. van den Burg and Williams [35] formalized the problem for probabilistic generative models such as VAEs, showing that memorization differs fundamentally from mode collapse and overfitting and is not captured by commonly used nearest-neighbor tests. For diffusion models, Gu et al. [15] show that the denoising score matching objective has a closed-form optimum that can only replicate training samples, and introduce the EMM metric to quantify how dataset size and model configuration govern the generalization-to-memorization transition. Sakarvadia et al. [26] localize this phenomenon through bright-ending cross-attention patterns, while the sharpness-based framework of [38] justifies score-difference memorization metrics and proposes mitigation via sharpness-aware regularization of the initial noise. The benchmarking study [2] demonstrated that standard evaluation metrics fail to surface memorization even in competitive settings. Mitigation has been tackled both for LLMs, where Hans et al. [16] propose the goldfish loss that excludes randomly sampled token subsets from the training objective to prevent verbatim reproduction, and for text-to-image diffusion models, where Chen et al. [7] address the privacy–utility tension by combining prompt re-anchoring with semantic prompt search to improve both dimensions simultaneously.

E.2 Membership/Dataset Inference

A second line of work investigates whether specific examples or datasets can be identified from model behavior. Because individual Membership Inference Attacks (MIAs) can be confounded by distribution shifts [22], recent literature often favors Dataset Inference (DI), which aggregates feature evidence across many samples to statistically detect training data usage [23, 12, 19]. Concurrently, individual MIA methods must adapt to increasingly restrictive black-box deployments. Furthermore, approaches based on training multiple shadow models to learn membership distributions [41, 5] are now computationally infeasible for massive modern architectures. Consequently, modern attacks must extract signals using only limited outputs rather than internal weights or gradients [46, 6, 32].

In these restricted settings, recent black-box attacks heavily rely on output variations. For example, Li et al. [20] perform MIAs on diffusion models by repeatedly perturbing a target image via an API, averaging the results, and comparing them to the original sample. However, in an interrogation analogy, this approach merely asks multiple paraphrased versions of the exact same question. Because the target sample is perturbed independently each time, the query does not dynamically evolve in response to the model’s previous answers, leaving deeper structural memorization unexploited.

E.3 Model Collapse

The third line of work concerns recursive self-training in generative models. Alemohammad et al. [1] showed that self-consuming generative loops lead to progressive degradation in quality or diversity when insufficient fresh real data is injected at each generation, a phenomenon they term Model Autophagy Disorder. Their analysis is especially important for our setting because it frames repeated regeneration not as a neutral operation, but as a process that can magnify latent properties of the learned distribution. Closely related, Shumailov et al. [29] showed that recursively training on model-generated data causes model collapse, where tails of the original distribution disappear and learned behaviour drifts toward degenerate approximations. Taken together, these works suggest that iterative generation is structurally revealing: under repeated reuse, memorized or high-density regions may persist differently from non-member examples, while generic outputs may drift or collapse. Our method turns this insight into a privacy-auditing mechanism: rather than studying recursive generation as a training-time pathology, we exploit chained regeneration at inference time to amplify membership-relevant differences.

Appendix F Model Details

In our experiments, we consider two vision model families: image autoregressive models (IARs) and diffusion models. The IAR category includes VAR [33] and RAR [43] variants, while the diffusion category includes DiT-MoE [13] and UViT-T2I [3]. For the other modalities, we evaluate large language models (LLMs) and voice conversion (VC) models. The LLMs include Pythia [4], OLMo [14], OPT [48], and Llama [34], while the VC models consist of AutoVC [25] and FreeVC [21]. Across all settings, we focus on representative, high-performing model variants.

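Table 5: Vision model details.

| | VAR-d30 | VAR-d24 | VAR-d20 | RAR-XXL | RAR-XL | RAR-L | DiT-MoE-G | DiT-MoE-XL | UViT-T2I-Deep |
|---|---|---|---|---|---|---|---|---|---|
| Family | IAR | IAR | IAR | IAR | IAR | IAR | Diffusion | Diffusion | Diffusion |
| Model parameters | 2.1B | 1.0B | 600M | 1.5B | 955M | 462M | 16.5B | 4.1B | 141M |
| Training epochs | 350 | 300 | 250 | 400 | 400 | 400 | | | |
| FID | 1.92 | 2.33 | 2.95 | 1.48 | 1.50 | 1.70 | 1.72 | 2.10 | 5.48 |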
Table 6: Language model details.

| | OLMo | Llama | Pythia | OPT |
|---|---|---|---|---|
| Model parameters | 7B | 13B | 6.9B | 6.7B |
| Training tokens | 2.46T | 1T | 300B | 180B |
Table 7: Audio model details.

| | AutoVC | FreeVC |
|---|---|---|
| Model parameters | 28M | 39M |
| Training data (hours) | 44 | 40 |
| SMOS (seen-to-seen) | 3.5 | 4.1 |

Appendix G Dataset Details

For vision and audio models with publicly known and available train/test splits, we use those splits. For most LLMs we use established MIA benchmarks (e.g., WikiMIA); for OLMo, we use its corresponding training set as members and Global News as the non-member set, as suggested in [44].

Table 8: Datasets used to construct member and non-member sets for each model family in our experiments, spanning vision, language, and speech domains.

| Model | Members | Non-members |
|---|---|---|
| VAR | ImageNet [10] | ImageNet |
| RAR | ImageNet | ImageNet |
| DiT-MoE | ImageNet | ImageNet |
| UViT-T2I | COCO [36] | COCO |
| Pythia | Mimir [11] | Mimir |
| OLMo | Dolma [31] | Global News |
| Llama | WikiMIA | WikiMIA |
| OPT | WikiMIA | WikiMIA |
| AutoVC | VCTK [40] | LibriTTS [45] |
| FreeVC | VCTK | LibriTTS |

Importantly, for Pythia-6.9B we use the Mimir dataset [11], which consists of six subsets: arxiv, dm_mathematics, github, hackernews, pubmed_central, and wikipedia_(en). We concatenate all these subsets and randomly select samples from the pool. We use the ngram_7_0.2 data split. For the rest of the models, we use the train split of the corresponding dataset as members and the val/test split as non-members.

Appendix H Metrics Details

The following metrics are computed over the sequence of model outputs collected across MADreMIA iterations, capturing how the model’s generative behavior evolves under repeated generation.

H.1 Features for Language Models

Jaccard Similarity:

Measures the lexical overlap between the model’s output at a given iteration and its initial response, computed over trigrams. A high Jaccard similarity indicates that the model rigidly reproduces the same surface forms across iterations, which is characteristic of memorized content.

$$J(A,B)=\frac{|A\cap B|}{|A\cup B|}$$
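
As a minimal sketch of this feature, the function below computes Jaccard similarity over word trigrams. Whitespace tokenization and the value returned for two texts that are too short to contain any trigram are assumptions made for illustration; the text above specifies only that trigrams are used.

```python
def trigrams(text):
    """Set of consecutive word triples (whitespace tokenization is assumed)."""
    tokens = text.split()
    return {tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)}


def jaccard_trigram(text_a, text_b):
    """|A ∩ B| / |A ∪ B| over the trigram sets of the two texts."""
    a, b = trigrams(text_a), trigrams(text_b)
    if not a and not b:
        return 1.0  # assumed convention when neither text has a trigram
    return len(a & b) / len(a | b)


toy_jaccard = jaccard_trigram("the cat sat on the mat", "the cat sat on a mat")
```

In the toy pair, each text has four trigrams, two of which are shared, so the union has six elements and the similarity is one third.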
Token Diversity:

Quantifies the divergence between the token probability distribution at the current iteration $P$ and the initial distribution $Q$. Large values indicate that the model’s vocabulary preferences shift substantially during reconstruction, reflecting instability in its output distribution.

$$D_{KL}(P\parallel Q)=\sum_{x\in\mathcal{X}}P(x)\log\left(\frac{P(x)}{Q(x)}\right)$$
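
A minimal sketch of the divergence above, for two probability vectors indexed by the same vocabulary. The `eps` floor on $Q(x)$ is an assumption added to keep the sum finite when $Q$ assigns zero mass to a token that $P$ supports; it is not part of the formula.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) in nats for two aligned probability vectors.

    Terms with P(x) = 0 contribute zero. Q(x) is floored at eps
    (an assumption) so that the result stays finite.
    """
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0.0)


toy_kl = kl_divergence([0.5, 0.5], [0.25, 0.75])
```

Note that the measure is asymmetric: swapping the current and initial distributions generally gives a different value.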
Token Distribution Shift:

We define it as the Jensen-Shannon Divergence (JSD), a symmetric and bounded variant of the KL divergence (KLD) that measures the distributional distance between $P$ and $Q$ via their mixture $M$. Compared to KLD, JSD is well-defined even when the supports of $P$ and $Q$ do not fully overlap, making it a more numerically stable measure of distributional drift across iterations.

$$\mathrm{JSD}(P\parallel Q)=\frac{1}{2}D_{KL}(P\parallel M)+\frac{1}{2}D_{KL}(Q\parallel M),\qquad\text{where } M=\frac{1}{2}(P+Q)$$
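
The following minimal sketch computes JSD for two aligned probability vectors using the natural logarithm, so the value is bounded by $\log 2$. The helper name `_kl` is illustrative.

```python
import math


def _kl(p, q):
    # Terms with p_i = 0 contribute zero. When q is the mixture M,
    # q_i > 0 wherever p_i > 0, so no clipping is needed.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q):
    """JSD(P || Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


# Disjoint supports: KL would be infinite, JSD reaches its maximum log(2).
toy_jsd = js_divergence([1.0, 0.0], [0.0, 1.0])
```

The disjoint-support case illustrates the stability argument in the text: the mixture always covers both supports, so every logarithm stays finite.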
Predictive Entropy:

Measures the uncertainty of the model’s next-token distribution over the full vocabulary $\mathcal{V}$. Low entropy indicates that the model assigns high probability mass to a single token (consistent with confident, memorized reproduction), whereas high entropy reflects diffuse, uncertain predictions.

$$H(Y\mid\mathbf{x})=-\sum_{c\in\mathcal{V}}P(y=c\mid\mathbf{x})\log P(y=c\mid\mathbf{x})$$
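
A minimal sketch of the entropy computation, starting from raw next-token logits. The `softmax` helper is included only so the example is self-contained; how the logits are obtained from a particular model is outside the scope of this sketch.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predictive_entropy(probs):
    """Shannon entropy in nats; zero-probability tokens contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


toy_uniform_entropy = predictive_entropy(softmax([0.0, 0.0, 0.0, 0.0]))
toy_peaked_entropy = predictive_entropy(softmax([10.0, 0.0, 0.0, 0.0]))
```

A uniform distribution over four tokens attains the maximum $\log 4$, while the peaked distribution yields a value close to zero.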
Margin:

Captures the decisiveness of the model’s token predictions by computing the difference in probability between the top-ranked and second-ranked tokens. A large margin indicates high confidence in a specific token, which may signal memorized recall, while a small margin reflects genuine uncertainty between competing continuations.

$$M=P(\hat{y}_{1}\mid\mathbf{x})-P(\hat{y}_{2}\mid\mathbf{x})$$
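
A minimal sketch of the margin, assuming the next-token distribution is available as a list of probabilities with at least two entries:

```python
def top2_margin(probs):
    """Probability gap between the most likely and second most likely token."""
    if len(probs) < 2:
        raise ValueError("need at least two candidate tokens")
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2


toy_confident_margin = top2_margin([0.75, 0.25, 0.0])
toy_uncertain_margin = top2_margin([0.4, 0.35, 0.25])
```

For a full vocabulary, a partial selection such as `heapq.nlargest(2, probs)` avoids sorting every entry; the sorted version is kept here for readability.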

H.2 Features for Vision Models

Mean Squared Error (MSE):

Measures the average pixel-level reconstruction error between the generated image at a given iteration and the original input. Lower MSE indicates that the model consistently reproduces fine-grained pixel details across iterations, which is a strong signal of memorization.

$$\mathrm{MSE}(x,\hat{x})=\frac{1}{N}\sum_{i=1}^{N}\left(x_{i}-\hat{x}_{i}\right)^{2}$$
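
A minimal sketch of the formula above, treating each image as a flat list of $N$ pixel values. Flattening and the value range are left to the caller and are assumptions of this example.

```python
def mse(x, x_hat):
    """Mean squared error between two equally sized flat pixel lists."""
    if len(x) != len(x_hat):
        raise ValueError("inputs must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


toy_mse = mse([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
```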
Structural Similarity Index Measure (SSIM) [37]:

Evaluates perceptual similarity between the reconstructed image $\hat{x}$ and the original $x$ by jointly comparing luminance, contrast, and structural information across local image patches. Unlike MSE, SSIM is sensitive to perceptual distortions that are meaningful to human observers, and its stability across iterations serves as a complementary signal to pixel-level metrics.

$$\mathrm{SSIM}(x,\hat{x})=\frac{(2\mu_{x}\mu_{\hat{x}}+c_{1})(2\sigma_{x\hat{x}}+c_{2})}{(\mu_{x}^{2}+\mu_{\hat{x}}^{2}+c_{1})(\sigma_{x}^{2}+\sigma_{\hat{x}}^{2}+c_{2})}$$

where $\mu_{x}$, $\mu_{\hat{x}}$ are local means, $\sigma_{x}^{2}$, $\sigma_{\hat{x}}^{2}$ are local variances, $\sigma_{x\hat{x}}$ is the cross-covariance, and $c_{1}$, $c_{2}$ are stabilization constants.
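
The formula can be illustrated with a deliberately simplified sketch that treats the whole image as a single window. This is an assumption made for brevity: SSIM as usually computed averages the same expression over many local windows. The constants follow the common choice $c_{1}=(0.01L)^{2}$ and $c_{2}=(0.03L)^{2}$ for dynamic range $L$; the name `ssim_single_window` is illustrative.

```python
def ssim_single_window(x, x_hat, data_range=1.0):
    """SSIM evaluated once over the entire image (flat pixel lists).

    Simplification: one global window instead of a mean over local patches.
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(x_hat) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in x_hat) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, x_hat)) / n
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


toy_same = ssim_single_window([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
toy_inverted = ssim_single_window([0.0, 0.5, 1.0], [1.0, 0.5, 0.0])
```

Identical images score 1, and an inverted image scores below zero because the cross-covariance term becomes negative.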

Learned Perceptual Image Patch Similarity (LPIPS) [47]:

Quantifies perceptual dissimilarity between $x$ and $\hat{x}$ using deep feature representations extracted from a pretrained network $\phi$. By operating in a learned feature space rather than pixel space, LPIPS captures high-level semantic and textural differences that are invisible to MSE or SSIM, making it particularly sensitive to cases where a model reproduces semantic content while varying low-level details.

$$\mathrm{LPIPS}(x,\hat{x})=\sum_{l}\frac{1}{H_{l}W_{l}}\sum_{h,w}\left\|w_{l}\odot\left(\phi_{l}(x)_{hw}-\phi_{l}(\hat{x})_{hw}\right)\right\|_{2}^{2}$$

where $\phi_{l}$ denotes the feature map at layer $l$ of the pretrained network and $w_{l}$ are learned channel-wise weights.
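
The aggregation in this formula can be sketched independently of any particular network. The example below assumes the per-layer feature maps have already been extracted elsewhere and are passed in as nested lists of shape height by width by channels; the feature maps and weights shown are toy values, not outputs of a pretrained model. Reference implementations typically also unit-normalize features along the channel dimension before taking the difference, a step that is omitted here.

```python
def lpips_from_features(feats_x, feats_y, weights):
    """Sum over layers of the spatial mean of ||w_l * (phi_l(x) - phi_l(x_hat))||^2.

    feats_x, feats_y: one H x W x C nested list per layer (assumed precomputed).
    weights: one list of C channel weights per layer.
    """
    total = 0.0
    for fx, fy, w in zip(feats_x, feats_y, weights):
        height, width = len(fx), len(fx[0])
        layer_sum = 0.0
        for row in range(height):
            for col in range(width):
                layer_sum += sum((wc * (a - b)) ** 2
                                 for wc, a, b in zip(w, fx[row][col], fy[row][col]))
        total += layer_sum / (height * width)
    return total


# One toy layer with a 1 x 2 spatial grid and 2 channels.
toy_fx = [[[[1.0, 0.0], [0.0, 1.0]]]]
toy_fy = [[[[0.0, 0.0], [0.0, 0.0]]]]
toy_w = [[1.0, 2.0]]
toy_lpips = lpips_from_features(toy_fx, toy_fy, toy_w)
```

In the toy layer, the two spatial positions contribute 1 and 4 respectively, and averaging over the two positions gives 2.5.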

Appendix I Additional Dataset Inference Results

Figure 8 extends our dataset inference evaluation to Llama-13B and VAR-d30. On Llama-13B, augmented variants reach the 95% confidence threshold faster than the baseline, with the Combined and Quality signals leading, though convergence is noisier at low sample counts. On VAR-d30, the benefit is more pronounced: augmented variants cross the threshold at roughly 100 samples compared to over 200 for the baseline, with all three signal types outperforming it consistently. The significance histograms corroborate these findings: the Combined variant shifts the $-\log_{10}(p)$ distribution rightward on both models, confirming that trajectory features yield stronger per-trial evidence.

(a) Llama-13B
(b) VAR-d30
(c) Llama-13B
(d) VAR-d30
Figure 8: DI performance on additional models.

Appendix J Precision and Recall for Generative Models

Figure 9 shows Precision and Recall across iterations for VAR-d30 and DiT-MoE-XL. In both models and both metrics, members consistently score higher than non-members throughout all iterations, confirming that the membership signal is stable and model-agnostic. Notably, the gap between members and non-members widens as iterations progress, indicating that chained regeneration amplifies the underlying asymmetry rather than merely preserving it.

(a) VAR-d30 (Rec.)
(b) VAR-d30 (Prec.)
(c) DiT-MoE-XL (Rec.)
(d) DiT-MoE-XL (Prec.)
Figure 9: Precision and Recall across models.

Appendix K Getty Images Case

As a practical case study, we consider the Getty Images v. Stability AI dispute [9] and evaluate whether chained regeneration can distinguish images that are plausibly associated with the Stable Diffusion training distribution from images that are very unlikely to have been included. We use Stable Diffusion 1.5 as the target model. For the positive pool, we extract 2,000 images from LAION-2B whose metadata contains the string gettyimages and treat them as members. For the negative pool, we collect 2,000 images from the Getty Images website whose upload date is after January 1, 2025, and treat them as non-members. Because these images post-date the original Stable Diffusion 1.5 training era (late 2022), they provide a conservative practical control group for this experiment.

Figure 10: Evolution of (a) SSIM ($\uparrow$) and (b) Reconstruction Error (MSE) ($\downarrow$) over 15 chained regeneration steps. The solid lines indicate the mean values for training members (blue) and held-out non-members (red), with shaded regions representing standard deviation. Across both metrics, members exhibit higher structural fidelity and slower degradation than non-members.

For each pool, we run the same chained-regeneration procedure for 15 iterations and summarize the trajectories with SSIM and reconstruction error (MSE) (see Figure 10). The SSIM plot measures whether regenerations remain structurally closer to the initial query for the member pool than for the non-member pool. The MSE plot provides a complementary pixel-level view across regeneration depth by measuring how quickly reconstructed samples drift away from their reference images. In our experiments, the two pools remain visibly separated under both SSIM and MSE. We do not use FID in this case, because FID estimates are very unstable on pools of only 2,000 images. We still interpret MSE conservatively: it is sensitive to low-level reconstruction error rather than semantic fidelity alone. For this reason, we use MSE as a stable auxiliary trajectory measure across iterations, while SSIM remains the more directly interpretable structural signal in this case study.
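The trajectory computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the experiments: images are flattened grayscale pixel lists in [0, 1], `global_ssim` is a single-window simplification of SSIM (standard implementations average over local windows), and `toy_regenerate` is a hypothetical stand-in for a model's regeneration call in which a lower noise level plays the role of a member-like sample.

```python
import random
from statistics import fmean
from typing import Callable, List, Tuple

Image = List[float]  # flattened grayscale pixels in [0, 1]


def mse(x: Image, y: Image) -> float:
    """Mean squared pixel error between two images of equal size."""
    return fmean((a - b) * (a - b) for a, b in zip(x, y))


def global_ssim(x: Image, y: Image, data_range: float = 1.0) -> float:
    """Single-window SSIM over the whole image (a simplification of windowed SSIM)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = fmean(x), fmean(y)
    vx = fmean((a - mx) * (a - mx) for a in x)
    vy = fmean((b - my) * (b - my) for b in y)
    cov = fmean((a - mx) * (b - my) for a, b in zip(x, y))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx * mx + my * my + c1) * (vx + vy + c2))


def chained_trajectory(
    x0: Image, regenerate: Callable[[Image], Image], steps: int = 15
) -> List[Tuple[float, float]]:
    """Feed each output back as the next input; score every step against the initial query x0."""
    trajectory, x = [], x0
    for _ in range(steps):
        x = regenerate(x)
        trajectory.append((global_ssim(x0, x), mse(x0, x)))
    return trajectory


def toy_regenerate(noise: float, rng: random.Random) -> Callable[[Image], Image]:
    """Hypothetical stand-in for a generative model: adds Gaussian noise and clips to [0, 1]."""
    def step(x: Image) -> Image:
        return [min(1.0, max(0.0, p + rng.gauss(0.0, noise))) for p in x]
    return step


rng = random.Random(0)
x0 = [rng.random() for _ in range(256)]
member_traj = chained_trajectory(x0, toy_regenerate(0.02, rng))     # assumed slow drift
nonmember_traj = chained_trajectory(x0, toy_regenerate(0.08, rng))  # assumed fast drift
for t, ((s1, e1), (s0, e0)) in enumerate(zip(member_traj, nonmember_traj), start=1):
    print(f"step {t:2d}  SSIM {s1:.3f} vs {s0:.3f}   MSE {e1:.4f} vs {e0:.4f}")
```

In practice, `regenerate` would wrap the target model (for example, an image-to-image call on Stable Diffusion 1.5), and the per-step scores would be averaged over each 2,000-image pool to obtain curves like those in Figure 10.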

Appendix L Proofs for Section 3

L.1 Proof of Theorem 3.2

Proof. By definition,

\[\mathbb{E}[S_{T}\mid M=m]=\frac{1}{T}\sum_{t=0}^{T-1}\mathbb{E}[\phi_{t}\mid M=m],\qquad m\in\{0,1\}.\]

Hence

\[\mathbb{E}[S_{T}\mid 1]-\mathbb{E}[S_{T}\mid 0]=\frac{1}{T}\sum_{t=0}^{T-1}\Big(\mathbb{E}[\phi_{t}\mid 1]-\mathbb{E}[\phi_{t}\mid 0]\Big).\]

Under A1,

\[\mathbb{E}[\phi_{t}\mid 1]-\mathbb{E}[\phi_{t}\mid 0]\geq\Delta_{t}\geq 0,\quad\forall t,\]

so

\[\mathbb{E}[S_{T}\mid 1]-\mathbb{E}[S_{T}\mid 0]\geq\frac{1}{T}\sum_{t=0}^{T-1}\Delta_{t}\geq 0.\]

Therefore

\[\Gamma_{T}:=\big|\mathbb{E}[S_{T}\mid 1]-\mathbb{E}[S_{T}\mid 0]\big|\geq\frac{1}{T}\sum_{t=0}^{T-1}\Delta_{t}.\]

For the denominator, A3 gives, for each class $m$,

\[\mathrm{Var}(S_{T}\mid M=m)\leq C\frac{\sigma^{2}\tau_{\mathrm{eff}}}{T}.\]

Hence

\[\max_{m}\mathrm{Var}(S_{T}\mid M=m)\leq C\frac{\sigma^{2}\tau_{\mathrm{eff}}}{T}.\]

Combining with the lower bound on $\Gamma_{T}$,

\[\mathrm{SNR}^{2}(S_{T})=\frac{\Gamma_{T}^{2}}{\max_{m}\mathrm{Var}(S_{T}\mid M=m)}\geq\frac{\left(\frac{1}{T}\sum_{t=0}^{T-1}\Delta_{t}\right)^{2}}{C\sigma^{2}\tau_{\mathrm{eff}}/T}.\]

This proves Theorem 3.2. ∎
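As a numerical illustration of the bound, the sketch below estimates $\mathrm{SNR}^{2}(S_{T})$ by Monte Carlo on toy data. All modeling choices are assumptions made for illustration: the per-step gap decays as $\Delta_{t}=\Delta_{0}e^{-t/\tau_{g}}$, and the noise is i.i.d. Gaussian, so that $\mathrm{Var}(S_{T}\mid M=m)=\sigma^{2}/T$ and A3 holds with $C\tau_{\mathrm{eff}}=1$. The parameter values are hypothetical.

```python
import math
import random
from statistics import fmean, variance


def trajectory_mean(phis):
    """S_T = (1/T) * sum_{t=0}^{T-1} phi_t."""
    return sum(phis) / len(phis)


def mean_gap(T, delta0, tau_g):
    """(1/T) * sum_t Delta_t for the assumed decay Delta_t = delta0 * exp(-t / tau_g)."""
    return sum(delta0 * math.exp(-t / tau_g) for t in range(T)) / T


def empirical_snr2(T, n=2000, delta0=1.0, tau_g=8.0, sigma=1.0, seed=0):
    """Monte Carlo estimate of Gamma_T^2 / max_m Var(S_T | M=m) on toy data."""
    rng = random.Random(seed)

    def sample(member):
        # Toy phi_t: decaying member offset plus i.i.d. Gaussian noise (an assumption).
        return trajectory_mean([
            (delta0 * math.exp(-t / tau_g) if member else 0.0) + rng.gauss(0.0, sigma)
            for t in range(T)
        ])

    s1 = [sample(True) for _ in range(n)]
    s0 = [sample(False) for _ in range(n)]
    gamma = abs(fmean(s1) - fmean(s0))
    return gamma ** 2 / max(variance(s1), variance(s0))


for T in (1, 10, 80):
    # With sigma = 1 and C * tau_eff = 1, the bound reduces to (mean gap)^2 * T.
    bound = mean_gap(T, 1.0, 8.0) ** 2 * T
    print(f"T={T:3d}  empirical SNR^2={empirical_snr2(T):.3f}  bound={bound:.3f}")
```

In this toy setting the estimate tracks the bound, rises from $T=1$ to a moderate horizon, and falls again for very long trajectories, because late steps add noise while contributing almost no gap.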

L.2 Proof of Corollary 3.3

Proof. Assume $\Delta_{t}=\Delta_{0}e^{-t/\tau_{g}}$. Then

\[\frac{1}{T}\sum_{t=0}^{T-1}\Delta_{t}=\frac{\Delta_{0}}{T}\sum_{t=0}^{T-1}e^{-t/\tau_{g}}=\frac{\Delta_{0}}{T}\cdot\frac{1-e^{-T/\tau_{g}}}{1-e^{-1/\tau_{g}}}.\]

Since $1-e^{-u}\leq u$ for $u>0$, with $u=1/\tau_{g}$ we get

\[1-e^{-1/\tau_{g}}\leq\frac{1}{\tau_{g}}\quad\Longrightarrow\quad\frac{1}{1-e^{-1/\tau_{g}}}\geq\tau_{g}.\]

Therefore

\[\frac{1}{T}\sum_{t=0}^{T-1}\Delta_{t}\geq\Delta_{0}\frac{1-e^{-T/\tau_{g}}}{T/\tau_{g}}.\]

By Theorem 3.2,

\[\mathrm{SNR}^{2}(S_{T})\geq\frac{\left(\frac{1}{T}\sum_{t=0}^{T-1}\Delta_{t}\right)^{2}}{C\sigma^{2}\tau_{\mathrm{eff}}/T}\;\gtrsim\;\frac{\Delta_{0}^{2}\tau_{g}}{\sigma^{2}\tau_{\mathrm{eff}}}\,g(x),\]

where

\[g(x):=\frac{(1-e^{-x})^{2}}{x},\qquad x:=T/\tau_{g},\]

and $\gtrsim$ absorbs only $T$-independent constants (including $1/C$ and comparability constants).

To optimize the shape in $x$, differentiate:

\[g^{\prime}(x)=\frac{(1-e^{-x})\big(2xe^{-x}-(1-e^{-x})\big)}{x^{2}}.\]

For $x>0$, critical points satisfy

\[2xe^{-x}=1-e^{-x}\quad\Longleftrightarrow\quad e^{x}=2x+1.\]

This has a unique positive solution $x^{\star}\approx 1.2564$, so the surrogate shape is maximized at

\[T^{\star}\approx x^{\star}\tau_{g}\approx 1.2564\,\tau_{g}.\]
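The critical-point equation $e^{x}=2x+1$ has no closed-form solution in elementary functions, so $x^{\star}$ is obtained numerically. The following sketch solves it by bisection and converts the result into a horizon $T^{\star}$. The bracketing interval and the value of `tau_g` are illustrative assumptions.

```python
import math


def g(x: float) -> float:
    """Surrogate shape g(x) = (1 - e^{-x})^2 / x."""
    return (1.0 - math.exp(-x)) ** 2 / x


def solve_x_star(lo: float = 0.5, hi: float = 3.0, tol: float = 1e-12) -> float:
    """Bisection on h(x) = e^x - 2x - 1, which is negative at lo and positive at hi."""
    def h(x: float) -> float:
        return math.exp(x) - 2.0 * x - 1.0

    assert h(lo) < 0.0 < h(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x_star = solve_x_star()
tau_g = 8.0  # hypothetical signal decay time, in regeneration steps
t_star = x_star * tau_g
print(f"x* = {x_star:.4f}, g(x*) = {g(x_star):.4f}, T* = {t_star:.2f}")
```

Since $h(x)=e^{x}-2x-1$ is convex with $h(0)=0$ and $h^{\prime}(0)<0$, it has exactly one positive root, which is why a single bracket suffices.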

L.3 Proof of Corollary 3.4 (shape-constant clarification)

Proof. From the previous corollary (under the same comparability regime),

\[\mathrm{SNR}^{2}(S_{T^{\star}})\gtrsim\frac{\Delta_{0}^{2}\tau_{g}}{\sigma^{2}\tau_{\mathrm{eff}}}\,g(x^{\star}).\]

Assume additionally

\[\Gamma_{1}\asymp\Delta_{0},\qquad\mathrm{Var}(S_{1}\mid M)\asymp\sigma^{2},\]

so $\mathrm{SNR}(S_{1})\asymp\Delta_{0}/\sigma$. Taking square roots and forming the ratio:

\[\frac{\mathrm{SNR}(S_{T^{\star}})}{\mathrm{SNR}(S_{1})}\gtrsim\sqrt{g(x^{\star})}\sqrt{\frac{\tau_{g}}{\tau_{\mathrm{eff}}}}=c_{\mathrm{shape}}\sqrt{\kappa},\]

where

\[\kappa:=\frac{\tau_{g}}{\tau_{\mathrm{eff}}},\qquad c_{\mathrm{shape}}:=\sqrt{g(x^{\star})}\approx 0.638.\]

Thus $c_{\mathrm{shape}}$ is the idealized shape constant; additional model-dependent prefactors remain absorbed by $\gtrsim$. ∎
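For a sense of scale, the sketch below evaluates the idealized bound $c_{\mathrm{shape}}\sqrt{\kappa}$ for a few values of $\kappa$. It takes $x^{\star}\approx 1.2564$ from the proof of Corollary 3.3, and the $(\tau_{g},\tau_{\mathrm{eff}})$ pairs are hypothetical. Because the prefactors hidden in $\gtrsim$ are dropped, the numbers indicate trends only.

```python
import math

X_STAR = 1.2564  # maximizer of g, taken from the proof of Corollary 3.3


def shape_constant(x: float = X_STAR) -> float:
    """c_shape = sqrt(g(x)) with g(x) = (1 - e^{-x})^2 / x."""
    return (1.0 - math.exp(-x)) / math.sqrt(x)


def snr_gain_lower_bound(tau_g: float, tau_eff: float) -> float:
    """Idealized bound c_shape * sqrt(kappa), with kappa = tau_g / tau_eff."""
    return shape_constant() * math.sqrt(tau_g / tau_eff)


for tau_g, tau_eff in [(2.0, 2.0), (8.0, 2.0), (32.0, 2.0)]:  # hypothetical values
    kappa = tau_g / tau_eff
    print(f"kappa = {kappa:5.1f} -> bound = {snr_gain_lower_bound(tau_g, tau_eff):.3f}")
```

The idealized bound exceeds 1 only when $\kappa>1/c_{\mathrm{shape}}^{2}\approx 2.46$, that is, when the membership gap persists noticeably longer than the noise correlation time.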

L.4 Additional comments on Bayes-cap statement at the end of Section 3

If membership is deterministic in the initial sample, $M=f(Z_{0})$, then $H(M\mid Z_{0})=0$, so

\[I(M;Z_{0})=H(M)-H(M\mid Z_{0})=H(M).\]

Also, conditioning on $Z_{0}$ already determines $M$, hence

\[H(M\mid Z_{0},Z_{1:T})=0=H(M\mid Z_{0}),\]

which implies

\[I(M;Z_{1:T}\mid Z_{0})=H(M\mid Z_{0})-H(M\mid Z_{0},Z_{1:T})=0.\]

Therefore, by the chain rule for mutual information,

\[I(M;Z_{0:T})=I(M;Z_{0})+I(M;Z_{1:T}\mid Z_{0})=I(M;Z_{0}).\]

Thus trajectory iteration cannot increase Bayes-optimal information; it can, however, improve practical fixed-form statistics through variance reduction and temporal aggregation.
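The identity $I(M;Z_{0:T})=I(M;Z_{0})$ can be checked on a small discrete example. The sketch below uses a toy construction that is entirely an assumption for illustration: $Z_{0}$ is uniform on $\{0,1,2,3\}$, membership is $M=f(Z_{0})=\mathbb{1}[Z_{0}\geq 2]$, and a single regeneration step $Z_{1}$ keeps $Z_{0}$ with probability 0.7 and shifts it by one (mod 4) otherwise.

```python
import math
from collections import defaultdict


def mutual_information(joint: dict) -> float:
    """I(A;B) in bits from a joint pmf given as {(a, b): prob}."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(p * math.log2(p / (pa[a] * pb[b])) for (a, b), p in joint.items() if p > 0)


def f(z0: int) -> int:
    """Toy deterministic membership label (an assumption for illustration)."""
    return int(z0 >= 2)


# Joint pmf over (m, z0, z1) for the toy chain.
triples = {}
for z0 in range(4):
    for z1, p in ((z0, 0.7), ((z0 + 1) % 4, 0.3)):
        triples[(f(z0), z0, z1)] = 0.25 * p


def joint_with(observation) -> dict:
    """Joint pmf of M and a chosen observation built from (z0, z1)."""
    out = defaultdict(float)
    for (m, z0, z1), p in triples.items():
        out[(m, observation(z0, z1))] += p
    return dict(out)


i_z0 = mutual_information(joint_with(lambda z0, z1: z0))
i_traj = mutual_information(joint_with(lambda z0, z1: (z0, z1)))
i_z1 = mutual_information(joint_with(lambda z0, z1: z1))
print(f"I(M;Z0) = {i_z0:.4f}  I(M;Z0,Z1) = {i_traj:.4f}  I(M;Z1) = {i_z1:.4f}")
```

In this example the full trajectory carries exactly as much information about $M$ as $Z_{0}$ alone, while the regenerated sample $Z_{1}$ taken by itself carries strictly less.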