Introduction
Monte Carlo (MC) methods approximate expectations by random sampling. If \(I=\int f(x)\,p(x)\,dx\), draw \(x^{(i)}\sim p\) and estimate \(\hat I=\frac{1}{N}\sum_i f(x^{(i)})\). The law of large numbers guarantees convergence, and the central limit theorem gives uncertainty \(\operatorname{SE}\approx \sigma_f/\sqrt{N}\). The remarkable fact is that this \(\mathcal{O}(N^{-1/2})\) rate is dimension-agnostic, making MC the workhorse for high-dimensional physics: statistical mechanics, particle transport, quantum field theory, and Bayesian inference.
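To make the estimator and its uncertainty concrete, here is a minimal NumPy sketch; the helper name `mc_estimate` and the choice of integrand (\(\mathbb{E}[x^2]\) under a standard normal, exact value 1) are illustrative assumptions, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_estimate(f, sampler, n):
    """Plain Monte Carlo: sample mean of f over n draws, plus its CLT-based standard error."""
    x = sampler(n)
    fx = f(x)
    mean = fx.mean()
    se = fx.std(ddof=1) / np.sqrt(n)   # sigma_f / sqrt(N), the O(N^{-1/2}) uncertainty
    return mean, se

# Example: E[x^2] under a standard normal (exact value 1.0).
mean, se = mc_estimate(lambda x: x**2, lambda n: rng.standard_normal(n), 100_000)
print(f"estimate = {mean:.4f} +/- {se:.4f}")
```

Quadrupling \(N\) halves the standard error, regardless of the dimension of \(x\).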
But plain sampling is only the opening move. Efficiency hinges on variance reduction (importance sampling, control variates, stratification), Markov Chain Monte Carlo to sample difficult posteriors, and quasi–Monte Carlo, which replaces random draws with low-discrepancy sequences. Good MC is not just randomness; it is engineered randomness guided by physics.
Sampling & Integration
For integrals \(\int_\Omega g(x)\,dx\), choose a proposal \(q(x)\) that resembles \(|g(x)|\) and compute \[ I=\int \frac{g(x)}{q(x)} q(x)\,dx \approx \frac{1}{N}\sum_{i=1}^N \frac{g(x^{(i)})}{q(x^{(i)})},\quad x^{(i)}\sim q. \] The variance shrinks as \(q\) concentrates where \(|g|\) is large. In rendering and neutron transport this is importance sampling; in finance it prices rare events; in path integrals it targets classical paths.
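A small sketch of this idea on a rare-event tail probability, assuming NumPy; the threshold, the shifted-exponential proposal, and the function name are illustrative choices. We estimate \(P(Z>4)\) for \(Z\sim\mathcal N(0,1)\) by sampling where the tiny integrand actually lives and averaging the weights \(p/q\).

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian_tail_is(threshold=4.0, n=100_000):
    """Importance sampling for P(Z > threshold), Z ~ N(0,1).

    Proposal q: an exponential shifted to start at the threshold, which
    concentrates samples in the tail where the integrand is nonzero.
    """
    x = threshold + rng.exponential(scale=1.0, size=n)     # x ~ q
    log_p = -0.5 * x**2 - 0.5 * np.log(2 * np.pi)          # log N(0,1) density
    log_q = -(x - threshold)                               # log shifted-exponential density
    w = np.exp(log_p - log_q)                              # importance weights p/q
    return w.mean(), w.std(ddof=1) / np.sqrt(n)

est, se = gaussian_tail_is()
print(est, se)   # ~3.17e-5; plain MC would see only a handful of hits in 10^5 draws
```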
Control variates subtract a correlated function \(h\) with known mean \(H=\mathbb{E}[h]\) to reduce variance: estimate \(\hat I=\bar g-\alpha(\bar h-H)\), where \(\bar g,\bar h\) are sample means and \(\alpha\) is tuned from the samples (the variance-optimal choice is \(\alpha=\operatorname{Cov}(g,h)/\operatorname{Var}(h)\)). Antithetic sampling (pairing \(x\) with \(-x\)) cancels odd error terms for symmetric integrands. Stratification divides the domain into cells to guarantee coverage. Quasi–Monte Carlo replaces pseudorandom draws with low-discrepancy sequences (Sobol’, Halton), improving the rate toward \(\mathcal{O}(N^{-1+\epsilon})\) for smooth integrands.
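A minimal control-variate sketch, assuming NumPy; the toy integrand \(\mathbb{E}[e^U]\) with \(U\sim\mathrm{Uniform}(0,1)\) (exact value \(e-1\)) and the helper name are illustrative. It estimates \(\alpha\) from the samples and compares the plain and corrected standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)

def control_variate_demo(n=50_000):
    """Estimate E[exp(U)], U ~ Uniform(0,1), with control variate h(U) = U (known mean H = 1/2)."""
    u = rng.random(n)
    g = np.exp(u)                                       # integrand samples; true mean is e - 1
    h = u                                               # control variate with exactly known mean
    alpha = np.cov(g, h)[0, 1] / np.var(h, ddof=1)      # variance-optimal coefficient
    corrected = g - alpha * (h - 0.5)                   # control-variate-corrected samples
    se_plain = g.std(ddof=1) / np.sqrt(n)
    se_cv = corrected.std(ddof=1) / np.sqrt(n)
    return g.mean(), se_plain, corrected.mean(), se_cv

print(control_variate_demo())   # both estimates ~1.7183; the corrected standard error is far smaller
```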
Markov Chain Monte Carlo (MCMC)
When direct sampling is impossible, construct a Markov chain with stationary density \(\pi(x)\). Metropolis–Hastings proposes \(x'\sim q(\cdot|x)\) and accepts with probability \[ a=\min\!\left(1,\frac{\pi(x')\,q(x|x')}{\pi(x)\,q(x'|x)}\right). \] The resulting chain leaves \(\pi\) invariant. Gibbs sampling updates coordinates conditionally. In high dimension with smooth log-density, Hamiltonian Monte Carlo introduces auxiliary momenta and simulates Hamiltonian dynamics with leapfrog steps, proposing distant moves with high acceptance.
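A bare-bones random-walk Metropolis sketch in NumPy: the Gaussian proposal is symmetric, so the \(q\) ratio cancels and the acceptance probability reduces to \(\min(1,\pi(x')/\pi(x))\). The double-well target and the step size below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_walk_metropolis(log_pi, x0, n_steps, step=1.0):
    """Random-walk Metropolis with a symmetric Gaussian proposal (q-ratio cancels)."""
    x = x0
    log_p = log_pi(x)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        x_new = x + step * rng.standard_normal()
        log_p_new = log_pi(x_new)
        if np.log(rng.random()) < log_p_new - log_p:   # accept with prob min(1, pi'/pi)
            x, log_p = x_new, log_p_new
        chain[i] = x                                   # on rejection, the old state is repeated
    return chain

# Example target: unnormalized double-well density exp(-(x^2 - 1)^2 / 0.5).
chain = random_walk_metropolis(lambda x: -(x**2 - 1.0)**2 / 0.5, x0=0.0, n_steps=50_000)
```

Only the unnormalized log-density is needed, which is exactly why MCMC suits Bayesian posteriors and Boltzmann weights.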
Diagnostics matter: discard burn-in; thin only if storage is a bottleneck; monitor effective sample size, autocorrelation time, and \(\hat R\) for multiple chains. Reparameterization and preconditioning tame anisotropy; tempering explores multimodality.
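As a sketch of the \(\hat R\) diagnostic, here is the classic (non-split) Gelman–Rubin statistic for several chains; modern implementations additionally split chains and rank-normalize, so treat this as illustrative.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Classic potential scale reduction factor R-hat.

    chains: array of shape (m, n): m independent chains of length n,
    with burn-in already discarded.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    between = n * chain_means.var(ddof=1)           # B: variance of the chain means
    within = chains.var(axis=1, ddof=1).mean()      # W: mean within-chain variance
    var_hat = (n - 1) / n * within + between / n    # pooled variance estimate
    return np.sqrt(var_hat / within)                # ~1.0 when the chains agree
```

Values close to 1 (a common rule of thumb is below about 1.01) indicate the chains are sampling the same distribution.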
Monte Carlo in Physics
Statistical Mechanics. The Metropolis algorithm samples the Boltzmann distribution \(p(\sigma)\propto e^{-\beta E(\sigma)}\). In the Ising model we flip spins; cluster algorithms (Wolff/Swendsen–Wang) mitigate critical slowing near \(T_c\). Histogram reweighting and multicanonical sampling estimate free-energy landscapes; Wang–Landau learns the density of states directly.
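A minimal single-spin-flip Metropolis sweep for the 2D Ising model with periodic boundaries (\(J=1\), no external field); the lattice size and \(\beta\) below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def metropolis_sweep(spins, beta):
    """One Metropolis sweep (L*L attempted flips) of a 2D Ising lattice, periodic boundaries."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L), rng.integers(L)
        # Energy change of flipping spin (i, j): dE = 2 * s_ij * (sum of its four neighbors).
        nb = spins[(i + 1) % L, j] + spins[(i - 1) % L, j] \
           + spins[i, (j + 1) % L] + spins[i, (j - 1) % L]
        dE = 2.0 * spins[i, j] * nb
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1
    return spins

L = 32
spins = rng.choice([-1, 1], size=(L, L))
for _ in range(200):                      # equilibration sweeps
    metropolis_sweep(spins, beta=0.6)     # beta > beta_c ~ 0.4407: ordered phase
print(abs(spins.mean()))                  # magnetization per spin
```

Near \(T_c\) this local update suffers critical slowing down, which is exactly what the Wolff and Swendsen–Wang cluster moves address.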
Path Integrals & Quantum Monte Carlo. In imaginary time, quantum partition functions map to classical polymers. Worldline methods and path-integral Monte Carlo compute superfluid fractions and condensates; auxiliary-field QMC handles fermions via Hubbard–Stratonovich transforms (battling the sign problem). Diffusion Monte Carlo projects ground states by branching walkers according to local energy.
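To make the branching idea concrete, here is a deliberately bare-bones diffusion Monte Carlo sketch for the 1D harmonic oscillator, with no importance sampling and only crude population control; the exact ground-state energy is \(E_0=0.5\) in units \(\hbar=m=\omega=1\), and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def dmc_harmonic(n_walkers=2000, n_steps=2000, dt=0.01):
    """Toy diffusion Monte Carlo for V(x) = x^2 / 2 (exact E0 = 0.5).

    Walkers diffuse in imaginary time and branch with weight exp(-dt (V - E_T));
    the trial energy E_T is adjusted to keep the population roughly constant.
    """
    x = rng.standard_normal(n_walkers)
    e_t = 0.5
    target = n_walkers
    energies = []
    for step in range(n_steps):
        x = x + np.sqrt(dt) * rng.standard_normal(x.size)   # free diffusion step
        w = np.exp(-dt * (0.5 * x**2 - e_t))                # branching weight
        copies = (w + rng.random(x.size)).astype(int)       # stochastic rounding of the weight
        x = np.repeat(x, copies)                            # birth/death of walkers
        e_t += 0.1 * np.log(target / max(x.size, 1))        # crude population-control feedback
        if step > n_steps // 2:
            energies.append(e_t)
    return np.mean(energies)

print(dmc_harmonic())   # near 0.5, up to time-step and population-control bias
```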
Transport & Radiative Transfer. Photons or neutrons are traced through media with random free paths and scattering angles. Variance reduction (Russian roulette & splitting, weight windows) targets detectors. Adjoint MC and importance maps learned from deterministic solves focus effort where tallies matter.
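A toy 1D slab-transmission sketch showing implicit capture plus Russian roulette; the cross section, scattering albedo, and weight threshold are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(6)

def slab_transmission(n_particles=100_000, thickness=5.0, sigma_t=1.0, albedo=0.8):
    """Trace particles through a 1D slab with implicit capture and Russian roulette.

    Each collision multiplies the statistical weight by the scattering albedo instead
    of killing the particle outright; low-weight particles play Russian roulette.
    """
    transmitted = 0.0
    for _ in range(n_particles):
        x, mu, w = 0.0, 1.0, 1.0                           # position, direction cosine, weight
        while True:
            x += mu * (-np.log(rng.random()) / sigma_t)    # sample an exponential free path
            if x >= thickness:
                transmitted += w                           # tally: leaked out the far side
                break
            if x < 0.0:
                break                                      # reflected out the near side
            w *= albedo                                    # implicit capture at the collision
            if w < 0.1:                                    # Russian roulette on low weights
                if rng.random() < 0.5:
                    break
                w *= 2.0                                   # survivors carry doubled weight (unbiased)
            mu = 2.0 * rng.random() - 1.0                  # isotropic scattering
    return transmitted / n_particles

print(slab_transmission())
```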
Cosmology & Inference. Posterior sampling of cosmological parameters uses MCMC/SMC with fast emulators of expensive simulations; likelihood-free inference (ABC) compares summary statistics when explicit likelihoods are unavailable.
Convergence, Error & HPC
MC estimates come with built-in uncertainty: report means with standard errors and confidence intervals. For MCMC, account for correlation via the integrated autocorrelation time \(\tau_{\text{int}}\) and use \(\text{ESS}\approx N/(2\tau_{\text{int}})\). Bias arises from burn-in, finite-time nonequilibrium, or poorly mixing proposals; with heavy-tailed importance weights the effective sample size collapses, so cap or smooth the weights to avoid pathologies.
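A short sketch of estimating \(\tau_{\text{int}}\) with Sokal-style windowing and converting it to an effective sample size; the window factor of 5 is a common heuristic, not a law.

```python
import numpy as np

def integrated_autocorr_time(chain, window=5.0):
    """Integrated autocorrelation time tau_int = 1/2 + sum_t rho(t), with a self-consistent window."""
    x = np.asarray(chain, dtype=float) - np.mean(chain)
    n = x.size
    acf = np.correlate(x, x, mode="full")[n - 1:] / (np.arange(n, 0, -1) * x.var())
    tau = 0.5
    for t in range(1, n):
        tau += acf[t]
        if t >= window * tau:      # stop once the summation window exceeds c * tau
            break
    return tau

def effective_sample_size(chain):
    """ESS ~ N / (2 tau_int): the number of effectively independent draws."""
    return len(chain) / (2.0 * integrated_autocorr_time(chain))
```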
Parallelism is natural: independent streams across cores/GPUs; population MCMC exchanges states (parallel tempering); batched likelihoods feed vectorized hardware. High-quality generators (counter-based RNGs, streams partitioned by skip-ahead or leapfrogging) ensure reproducibility in distributed settings. Quasi–MC blends well with GPU kernels because its sequences are deterministic.
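As a sketch of reproducible independent streams, NumPy's counter-based Philox generator can be fed child seeds from `SeedSequence.spawn`; the worker task and stream count below are illustrative.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def worker(seed_seq, n):
    """Each worker gets its own counter-based Philox stream: independent and reproducible."""
    rng = np.random.Generator(np.random.Philox(seed_seq))
    x = rng.standard_normal(n)
    return (x**2).mean()          # partial estimate of E[x^2]

if __name__ == "__main__":
    streams = np.random.SeedSequence(12345).spawn(8)      # 8 statistically independent child streams
    with ProcessPoolExecutor() as pool:
        parts = list(pool.map(worker, streams, [250_000] * 8))
    print(np.mean(parts))          # combine the independent partial estimates
```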