This study addresses a critical challenge in modern blockchain systems: the excessive size of Merkle proofs in state verification, which significantly impacts scalability and efficiency. As highlighted by Ethereum’s founder, Vitalik Buterin, current Merkle Patricia Tries (MPTs) are highly inefficient for stateless clients, with worst-case proofs reaching approximately 300 MB. We present a comprehensive probabilistic analysis of path length distributions in MPTs to optimize proof size while maintaining security guarantees. Our novel mathematical model characterizes the distribution of path lengths in tries containing random blockchain addresses and validates it through extensive computational experiments. The findings reveal logarithmic scaling of average path lengths with respect to the number of addresses, with unprecedented precision in predicting structural properties across scales from 100 to 300 million addresses. The research demonstrates remarkable accuracy, with discrepancies between theoretical and experimental results not exceeding 0.01 across all tested scales. By identifying and verifying the right-skewed nature of path length distributions, we provide critical insights for optimizing Merkle proof generation and size reduction. Our practical implementation guidelines demonstrate potential proof size reductions of up to 70% through optimized path structuring and node layout. This work bridges the gap between theoretical computer science and practical blockchain engineering, offering immediate applications for blockchain client optimization and efficient state-proof generation.

Optimizing Merkle Proof Size Through Path Length Analysis: A Probabilistic Framework for Efficient Blockchain State Verification

Kuznetsov O.
;
Arnesano M.
2025-01-01

Abstract

This study addresses a critical challenge in modern blockchain systems: the excessive size of Merkle proofs in state verification, which significantly impacts scalability and efficiency. As highlighted by Ethereum’s founder, Vitalik Buterin, current Merkle Patricia Tries (MPTs) are highly inefficient for stateless clients, with worst-case proofs reaching approximately 300 MB. We present a comprehensive probabilistic analysis of path length distributions in MPTs to optimize proof size while maintaining security guarantees. Our novel mathematical model characterizes the distribution of path lengths in tries containing random blockchain addresses and validates it through extensive computational experiments. The findings reveal logarithmic scaling of average path lengths with respect to the number of addresses, with unprecedented precision in predicting structural properties across scales from 100 to 300 million addresses. The research demonstrates remarkable accuracy, with discrepancies between theoretical and experimental results not exceeding 0.01 across all tested scales. By identifying and verifying the right-skewed nature of path length distributions, we provide critical insights for optimizing Merkle proof generation and size reduction. Our practical implementation guidelines demonstrate potential proof size reductions of up to 70% through optimized path structuring and node layout. This work bridges the gap between theoretical computer science and practical blockchain engineering, offering immediate applications for blockchain client optimization and efficient state-proof generation.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11389/70997
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact