A unifying framework for mean-field theories of asymmetric kinetic Ising systems

Miguel Aguilera, S. Amin Moosavi, Hideaki Shimazaki

https://doi.org/10.1038/s41467-021-20890-5, Volume: 12

Article Type: research-article

Publisher: Nature Publishing Group UK

Table of Contents

Introduction
Results
Discussion
Methods
Supplementary information

Abstract

Kinetic Ising models are powerful tools for studying the non-equilibrium dynamics of complex systems. As their behavior is not tractable for large networks, many mean-field methods have been proposed for their analysis, each based on unique assumptions about the system’s temporal evolution. This disparity of approaches makes it challenging to systematically advance mean-field methods beyond previous contributions. Here, we propose a unifying framework for mean-field theories of asymmetric kinetic Ising systems from an information geometry perspective. The framework is built on Plefka expansions of a system around a simplified model obtained by an orthogonal projection to a sub-manifold of tractable probability distributions. This view not only unifies previous methods but also allows us to develop novel methods that, in contrast with traditional approaches, preserve the system’s correlations. We show that these new methods can outperform previous ones in predicting and assessing network properties near maximally fluctuating regimes.

Many mean-field theories are proposed for studying the non-equilibrium dynamics of complex systems, each based on specific assumptions about the system’s temporal evolution. Here, Aguilera et al. propose a unified framework for mean-field theories of asymmetric kinetic Ising systems to study non-equilibrium dynamics.

Keywords

Magnetic properties and materials, Statistical physics, thermodynamics and nonlinear dynamics, Information theory and computation

Introduction

Advances in high-throughput data acquisition technologies for very large biological and social systems are providing unprecedented possibilities to investigate their complex, non-equilibrium dynamics. For example, optical recordings from genetically modified neural populations make it possible to simultaneously monitor activities of the whole neural network of behaving C. elegans¹ and zebrafish², as well as thousands of neurons in the mouse visual cortex³. Such networks generally exhibit out-of-equilibrium dynamics⁴, and are often found to self-organize near critical regimes at which their fluctuations are maximized^5,6. Evolution of such systems cannot be faithfully captured by methods assuming an asymptotic equilibrium state. Therefore, in general, there is a pressing demand for mathematical tools to study the dynamics of large-scale, non-equilibrium complex systems and to analyze high-dimensional datasets recorded from them.

The kinetic Ising model with asymmetric couplings is a prototypical model for studying such non-equilibrium dynamics in biological^7,8 and social systems⁹. It is described as a discrete-time Markov chain of interacting binary units, resembling the nonlinear dynamics of recurrently connected neurons. The model exhibits non-equilibrium behavior when couplings are asymmetric or when model parameters are subject to rapid changes, ruling out quasi-static processes. These conditions induce a time reversal asymmetry in dynamical trajectories, leading to positive entropy production (the second law of thermodynamics) as revealed by the fluctuation theorem^10–15 (see refs. ^16,17 for reviews). This time-asymmetry is characteristic of non-equilibrium systems as it can only be displayed by systems in which energy dissipation takes place¹⁸. In the case of symmetric connections and static parameters, the model converges to an equilibrium stationary state. Consequently, it is a generalization of its equilibrium counterpart known as the (equilibrium) Ising model¹⁹.

The forward Ising problem refers to calculating statistical properties of the model, such as mean activation rates (mean magnetizations of spins) and correlations, given the parameters of the model. In contrast, inference of the model parameters from data is called the inverse Ising problem²⁰. In this regard, kinetic Ising models^21,22 and their equilibrium counterparts^23–25 have become popular tools for modeling and analyzing biological and social systems. In addition, they capture memory retrieval dynamics in classical associative networks. Namely, they are equivalent to the Boltzmann machine, extensively used in machine learning applications²⁰. Unfortunately, exact solutions of the forward and inverse problems often become computationally too expensive due to the combinatorial explosion of possible patterns in large, recurrent networks or the high volume of data, and applications of exact or sampling-based methods are limited in practice to around a hundred neurons^5,25,26. In consequence, analytical approximation methods are necessary for analyzing large systems. In this endeavor, mean-field methods have emerged as powerful tools to track down otherwise intractable statistical quantities.

The standard mean-field approximations to study equilibrium Ising models are the classical naive mean-field (nMF) and the more accurate Thouless-Anderson-Palmer (TAP) approximations²⁷. These methods have also been employed to solve the inverse Ising problem^28–31. Plefka demonstrated that the nMF and TAP approximations for the equilibrium model can be derived using the power series expansion of the free energy around a model of independent spins, a method which is now referred to as the Plefka expansion³². This expansion up to the first and second orders leads to the nMF and TAP mean-field approximations respectively. The Plefka expansion was later formalized by Tanaka and others in the framework of information geometry^33–37.

In non-equilibrium networks, however, the free energy is not directly defined, and thus it is not obvious how to apply the Plefka expansion. Kappen and Spanjers³⁸ proposed an information geometric approach to mean-field solutions of the asymmetric Ising model with asynchronous dynamics. They showed that their second-order approximation for an asymmetric model in the stationary state is equivalent to the TAP approximation for equilibrium models. Later, Roudi and Hertz derived TAP equations for nonstationary states using a Legendre transformation of the generating functional of the set of trajectories of the model³⁹. Another study by Roudi and Hertz extended mean-field equations to provide expressions for the nonstationary delayed correlations assuming the presence of equal-time correlations at the previous step⁴⁰. Yet another interesting method proposed by Mézard and Sakellariou approximates the local fields by a Gaussian distribution according to the central limit theorem, yielding more accurate results for fully asymmetric networks⁴¹. This method was later extended to include correlations at the previous time step, improving the results for symmetric couplings⁴². More recently, Bachschmid-Romano et al. extended the path-integral methods in ref. ³⁹ with Gaussian effective fields⁴³, not only recovering ref. ⁴¹ for fully asymmetric networks but also proposing a method that better approximates mean rate dynamics by conserving autocorrelations of units. Although many choices of mean-field methods are available, the diversity of methods and assumptions makes it challenging to advance systematically over previous contributions.

Here, we propose a unified approach for mean-field approximations of the Ising model. While our method is applicable to symmetric and equilibrium models, we focus for generality on asymmetric kinetic Ising models. Our approach is defined as a family of Plefka expansions in an information geometric space. This approach allows us to unify and relate existing mean-field methods and to provide expressions for other statistics of the systems such as pairwise correlations. Furthermore, our approach can be extended beyond classical mean-field assumptions to propose novel approximations. Here, we introduce an approximation based on a pairwise model that better captures network correlations, and we show that it outperforms existing approximations of kinetic Ising models near a point of maximum fluctuations. We also provide a data-driven method to reconstruct and test if a system is near a phase transition by combining the forward and inverse Ising problems, and demonstrate that the proposed pairwise model more accurately estimates the system’s fluctuations and its sensitivity to parameter changes. These results confirm that our unified framework is a useful tool to develop methods to analyze large-scale, non-equilibrium biological and social dynamics operating near critical regimes. In addition, since the methods are directly applicable to Boltzmann machine learning, the geometrical framework introduced here is relevant in machine learning applications.

The paper is organized as follows. First, we introduce the kinetic Ising model and its statistical properties of interest. Second, we introduce our framework for the Plefka approximation methods from a geometric perspective. To explain how it works, we derive the classical naive and TAP mean-field approximations under the proposed framework. Third, we show that our approach can unify other known mean-field approximation methods. We then propose a novel pairwise approximation under this framework. Finally, we compare different mean-field approximations in solving the forward and inverse Ising problems, as well as in performing the data-driven assessment of the system’s sensitivity. The last section is devoted to discussion.

Results

The kinetic Ising model

The kinetic Ising model is the least structured statistical model containing delayed pairwise interactions between its binary components (i.e., a maximum caliber model⁴⁴). The system consists of N interacting binary variables (down or up for Ising spins, inactive or active for neural units) s_i,t ∈ { − 1, + 1}, i = 1, 2, …, N, evolving in discrete time steps t with parallel dynamics. Given the configuration of spins at t − 1, s_t−1 = {s_1,t−1, s_2,t−1, …, s_N,t−1}, spins s_t at time t are conditionally independent random variables, updated as a discrete-time Markov chain, following

$$P (s_{t} ∣ s_{t - 1}) = \prod_{i} \frac{e^{s_{i, t} h_{i, t}}}{2 \cosh h_{i, t}}, \tag{1}$$

with effective fields

$$h_{i, t} = H_{i} + \sum_{j} J_{i j} s_{j, t - 1}. \tag{2}$$

The parameters H = {H_i} and J = {J_ij} represent local external fields at each spin and couplings between pairs of spins respectively. When the couplings are asymmetric (i.e., J_ij ≠ J_ji), the system is away from equilibrium because the process is irreversible with respect to time. Given the probability mass function of the previous state P(s_t−1), the distribution of the current state is:

$$P (s_{t}) = \sum_{s_{t - 1}} P (s_{t} ∣ s_{t - 1}) P (s_{t - 1}). \tag{3}$$

This marginal distribution P(s_t) is not factorized (except at J = 0), but it rather exhibits a complex statistical structure, generally containing higher-order spin interactions. We can apply this equation recursively, e.g., decomposing $P (s_{t - 1})$ in terms of the distribution P(s_t−2), and trace the evolution of the system from the initial distribution P(s₀).

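In this article, we use variants of the Plefka expansion to calculate some statistical properties of the system. Namely, we investigate the average activation rates m_t, correlations between pairs of units (covariance function) C_t, and delayed correlations D_t given by

$$m_{i, t} = \sum_{s_{t}} s_{i, t} P (s_{t}), \tag{4}$$

$$C_{i k, t} = \sum_{s_{t}} s_{i, t} s_{k, t} P (s_{t}) - m_{i, t} m_{k, t}, \tag{5}$$

$$D_{i l, t} = \sum_{s_{t}, s_{t - 1}} s_{i, t} s_{l, t - 1} P (s_{t}, s_{t - 1}) - m_{i, t} m_{l, t - 1}. \tag{6}$$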

Note that m_t and D_t are sufficient statistics of the kinetic Ising model. Therefore, we will use them in solving the inverse Ising problem (see Methods). We additionally consider the equal-time correlations C_t as they are commonly used to describe neural systems, and are investigated by some of the mean-field approximations in the literature⁴⁰. Calculation of these expectation values is analytically intractable and computationally very expensive for large networks, due to the combinatorial explosion of the number of possible states. To reduce this computational cost, we approximate the marginal probability distributions (Eq. (3)) by the Plefka expansion method that utilizes an alternative, tractable distribution.
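
For very small networks, the recursion of the marginal distribution and the three statistics can be evaluated by brute-force enumeration, which makes the combinatorial cost explicit (2^N states). The following is a minimal sketch under the standard kinetic Ising transition probability, exp(s h) / (2 cosh h) per unit with h_i = H_i + Σ_j J_ij s_j,t−1; the function names `all_states`, `transition_matrix`, and `exact_statistics` are illustrative and do not come from the paper.

```python
import itertools
import numpy as np

def all_states(N):
    """All 2**N spin configurations as rows of +/-1 values."""
    return np.array(list(itertools.product([-1, 1], repeat=N)), dtype=float)

def transition_matrix(H, J):
    """T[a, b] = P(s_t = states[a] | s_{t-1} = states[b])."""
    states = all_states(len(H))
    # h[b, i] = H_i + sum_j J_ij * s_j(b): effective field given previous state b
    h = H[None, :] + states @ J.T
    # log P(s_t | s_{t-1}) = sum_i [ s_i h_i - log(2 cosh h_i) ]
    log_T = states @ h.T - np.log(2 * np.cosh(h)).sum(axis=1)[None, :]
    return np.exp(log_T), states

def exact_statistics(H, J, P_prev):
    """One step of the marginal recursion plus m_t, C_t and D_t."""
    T, states = transition_matrix(H, J)
    P_t = T @ P_prev                      # marginal at time t
    joint = T * P_prev[None, :]           # joint P(s_t, s_{t-1})
    m_t = states.T @ P_t
    m_prev = states.T @ P_prev
    C_t = states.T @ (P_t[:, None] * states) - np.outer(m_t, m_t)
    D_t = states.T @ joint @ states - np.outer(m_t, m_prev)
    return P_t, m_t, C_t, D_t
```

Calling `exact_statistics` repeatedly, feeding `P_t` back in as `P_prev`, traces the evolution from an initial distribution. The transition matrix has 4^N entries, so this is only practical for a handful of units, which is the motivation for the approximations that follow.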

Geometrical approach to mean-field approximation

Information geometry^37,45,46 provides clear geometrical understanding of information-theoretic measures and probabilistic models^15,47,48. Using the language of information geometry, we introduce our method for mean-field approximations of kinetic Ising systems.

Let $P_{t}$ be the manifold of probability distributions at time t obtained from Eq. (3). Each point on the manifold corresponds to a set of parameter values. The manifold $P_{t}$ contains submanifolds $Q_{t}$ of probability distributions with analytically tractable statistical properties (see Fig. 1). We use this tractable manifold, i.e., a reference model, to approximate a target point P(s_t∣H, J) in the manifold $P_{t}$ and its statistical properties m_t, C_t, D_t.

Fig. 1

A geometric view of the approximations based on Plefka expansions.

The point P(s_t) is the marginal distribution of a kinetic Ising model at time t. The submanifold $Q_{t}$ is a set of tractable distributions, for example a manifold of independent models. The points in $A$ correspond to an m-geodesic, that is, a linear mixture of P(s_t) and Q^*(s_t) on $Q_{t}$, where for independent $Q_{t}$ all points on $A$ share the same mean values m_t. Geometrically, $A$ constitutes the m-projection from P(s_t) to $Q_{t}$, defining Q^*(s_t) as the closest point in the submanifold $Q_{t}$ to the point P(s_t)⁴⁷. The Plefka expansion is defined by expanding an α-dependent distribution P_α(s_t) that satisfies P_α=0(s_t) = Q^*(s_t) and P_α=1(s_t) = P(s_t).

The simplest submanifold $Q_{t}$ is the manifold of independent models, used in classical mean-field approximations to compute average activation rates. Each point on this submanifold corresponds to a distribution

$$Q (s_{t} ∣ Θ_{t}) = \prod_{i} \frac{e^{s_{i, t} Θ_{i, t}}}{2 \cosh Θ_{i, t}}, \tag{7}$$

where Θ_t = {Θ_i,t} is the vector of parameters that represents a point in $Q_{t}$. This distribution does not include couplings between units, and its average activation rate is immediately given as $m_{i, t} = \tanh Θ_{i, t}$.

Our first goal is to find the average activation rates of the target distribution P(s_t∣H, J). It turns out that they can be obtained from the independent model Q(s_t∣Θ_t) that minimizes the following Kullback-Leibler (KL) divergence from P(s_t):

$$D (P ∣∣ Q) = \sum_{s_{t}} P (s_{t}) \log \frac{P (s_{t})}{Q (s_{t} ∣ Θ_{t})}. \tag{8}$$

The independent model $Q (s_{t} ∣ Θ_{t}^{*}) (\equiv Q^{*} (s_{t}))$ that minimizes the KL divergence has activation rates m_t identical to those of the target distribution P(s_t)³⁸ because the minimizing points $Θ_{i, t}^{*}$ satisfy (for i = 1, …, N)

$$\frac{\partial D (P ∣∣ Q)}{\partial Θ_{i, t}} \Big|_{Θ_{t} = Θ_{t}^{*}} = - m_{i, t}^{P} + \tanh Θ_{i, t}^{*} = - m_{i, t}^{P} + m_{i, t}^{Q^{*}} = 0, \tag{9}$$

where $m_{i, t}^{P}$ and $m_{i, t}^{Q^{*}}$ are respectively expectation values of s_i,t by P(s_t) and $Q (s_{t} ∣ Θ_{t}^{*})$. As these values are equal, for the rest of the paper we will drop their superscripts and just write m_i,t for simplicity. The result of this approximation is indifferent to the system’s correlations. Later in the paper we will consider approximations that take into account pairwise correlations.

From an information geometric point of view, given m_t (or $Θ_{t}^{*}$ ), we may consider a family of points defined as a linear mixture of P(s_t) and $Q (s_{t} ∣ Θ_{t}^{*})$ for which m_t is kept constant (the dashed line $A$ in Fig. 1). This is known as an m-geodesic, and it is orthogonal to the e-flat manifold $Q_{t}$ , constituting an m-projection to this manifold^37,47. Thus, the previous search of $Q (s_{t} ∣ Θ_{t}^{*})$ given P(s_t∣H, J) is equivalent to finding the orthogonal projection point from P(s_t∣H, J) to the manifold $Q_{t}$ of independent models^36,37.
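
The moment-matching property of the m-projection can be checked numerically on a toy distribution. The sketch below assumes the independent model has the product form exp(s_i Θ_i) / (2 cosh Θ_i), so that its means are tanh Θ_i; the helper names and the randomly generated target P are illustrative only.

```python
import itertools
import numpy as np

def independent_model(theta, states):
    """Q(s | theta) = prod_i exp(s_i theta_i) / (2 cosh theta_i)."""
    log_q = states @ theta - np.log(2 * np.cosh(theta)).sum()
    return np.exp(log_q)

def kl_divergence(P, Q):
    """D(P || Q) for strictly positive distributions on the same states."""
    return float(np.sum(P * np.log(P / Q)))

def m_projection(P, states):
    """Parameters minimizing D(P || Q): match the means, theta* = arctanh(m^P)."""
    m = states.T @ P
    return np.arctanh(m), m

# Toy target: an arbitrary strictly positive distribution over N = 3 spins.
states = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)
rng = np.random.default_rng(0)
P = rng.random(len(states)) + 0.1
P /= P.sum()

theta_star, m_P = m_projection(P, states)
Q_star = independent_model(theta_star, states)
m_Q = states.T @ Q_star   # means of the projected independent model
```

Because D(P∣∣Q) is convex in Θ for this exponential family, any perturbation of `theta_star` can only increase the divergence, while the means `m_Q` coincide with `m_P` regardless of the correlations present in P.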

The Plefka expansion

Although the m-projection provides the exact and unique average activation rates, its calculation in practice requires the complete distribution P(s_t). In the Plefka expansion, we relax the constraints of the m-projection, and introduce another set of more tractable distributions that passes only through P(s_t∣H, J) and $Q (s_{t} ∣ Θ_{t}^{*})$ (the solid line in Fig. 1). This distribution is defined using a new conditional distribution introducing a parameter α that connects a distribution on the manifold $Q_{t}$ with the original distribution P(s_t):

$$P_{α} (s_{t} ∣ s_{t - 1}) = \prod_{i} \frac{e^{s_{i, t} h_{i, t} (α)}}{2 \cosh h_{i, t} (α)}, \tag{10}$$

$$h_{i, t} (α) = (1 - α) Θ_{i, t} + α \Big(H_{i} + \sum_{j} J_{i j} s_{j, t - 1}\Big). \tag{11}$$

At α = 0, P_α=0(s_t∣s_t−1) = Q(s_t∣Θ_t), and α = 1 leads to P_α=1(s_t∣s_t−1) = P(s_t∣s_t−1). Using this alternative conditional distribution P_α(s_i,t∣s_t−1), we construct an approximate marginal distribution P_α(s_t). Consequently, expectation values with respect to P_α(s_t) are functions of α. We thus write the statistics of the approximate system as m_t(α), C_t(α), and D_t(α).

The Plefka expansions of these statistics are defined as the Taylor series expansion of these functions around α = 0. In the case of the mean activation rate, the expansion up to the nth order leads to:

$$m_{t} (α) = m_{t} (α = 0) + \sum_{k = 1}^{n} \frac{α^{k}}{k !} \frac{\partial^{k} m_{t} (α = 0)}{\partial α^{k}} + O (α^{(n + 1)}), \tag{12}$$

where $O (α^{(n + 1)})$ stands for the residual error of the approximation of order n + 1 and higher. For the nth-order approximation, we neglect the residual terms as $O (α^{(n + 1)}) |_{α = 1} \approx 0$. Note that all coefficients of expansion are functions of Θ_t. The mean-field approximation is computed by setting α = 1 and finding the value of $Θ_{t}^{*}$ that satisfies Eq. (). Since the original marginal distribution is recovered at α = 1, the equality of Eq. () holds: m_t(α = 1) = m_t(α = 0). Then, we have

$$\sum_{k = 1}^{n} \frac{1}{k !} \frac{\partial^{k} m_{t} (α = 0)}{\partial α^{k}} = 0, \tag{13}$$

which should be solved with respect to the parameters Θ_t. Since we neglected the terms higher than the nth order, the solution may not lead to the exact projection, $Q (s_{t} ∣ Θ_{t}^{*})$. In this study, we investigate the first (n = 1) and second (n = 2) order approximations. Moreover we can apply the same expansion to approximate the correlations C_t and D_t, using Eq. ().

What is the difference between this approach and other mean-field methods? Conventionally, naive mean-field approximations are obtained by minimizing D(Q∣∣P) as opposed to D(P∣∣Q) (Eq. (8))^36,49. This approach is typically used in variational inference to construct a tractable approximate posterior in machine learning problems. Following the Bogolyubov inequality, minimizing this divergence is equivalent to minimizing the variational free energy. Geometrically, it comprises an e-projection of P(s_t∣H, J) to the submanifold $Q_{t}$ , which does not result in $Q (s_{t} ∣ Θ_{t}^{*})$ . Namely, minimizing D(Q∣∣P), as well as minimization of other α-divergences except for D(P∣∣Q), introduces a bias in the estimation of the mean-field approximation^36,37. In contrast, if we consider the m-projection point that minimizes D(P∣∣Q), we can approximate the exact value of m_t using Eq. (12) up to an arbitrary order.

In the subsequent sections we show that different approximations of the marginal distribution P(s_t) in Eq. (3) can be constructed by replacing P(s_i,τ∣s_τ−1) with P_α(s_i,τ∣s_τ−1) for different pairs i, τ (here we will explore the cases of τ = t and τ = t − 1). More generally, we show in Supplementary Note 1 that this framework can be extended to a marginal path of arbitrary length k, P(s_t−k+1, …, s_t). In addition, we are not restricted to manifolds of independent models. The independent model is adopted as a reference model to approximate the average activation rate, but one can also more accurately approximate correlations using this method. In this vein, we can extend our framework to use reference manifolds $Q_{t - k + 1 : t}$ (of models Q(s_t−k+1, …, s_t)) that include interactions, e.g., pairwise couplings between elements at two different time points, to more accurately approximate the delayed correlations (see Supplementary Note 1). By systematically defining these reference distributions, we will provide a unified approach to derive Plefka approximations of m_t, C_t, and D_t, including the one that utilizes a pairwise structure.

Plefka[t − 1, t]: expansion around independent models at times t − 1 and t

Before elaborating different mean-field approximations, we demonstrate our method by deriving the known results of the classical nMF and TAP approximations for the kinetic Ising model^38,39. In order to derive these classical mean-field equations, we make a Plefka expansion around the points $Θ_{t}^{*}$ and $Θ_{t - 1}^{*}$ that are, respectively, obtained by orthogonal projection to the independent manifolds $Q_{t}$ and $Q_{t - 1}$ , computed as in Eq. (9). Here we should note that assuming an approximation where previous distributions (e.g., t − 2, t − 3, … ) are also independent yields exactly the same result. In this way, we derive the nMF and TAP equations of a model defined by a marginal probability distribution $P_{α}^{[t - 1 : t]}$ . Using Eqs. (3) and (10), we write

where $P_{α = 0}^{[t - 1 : t]} (s_{t}) = Q (s_{t})$ and the original distribution is recovered for $P_{α = 1}^{[t - 1 : t]} (s_{t}) = P (s_{t})$.

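Following Eq. (13), for the first order approximation we have $\frac{\partial m_{i, t} (α = 0)}{\partial α} = 0$. Since the derivative of the first order moment is

$$\frac{\partial m_{i, t} (α = 0)}{\partial α} = (1 - m_{i, t}^{2}) \Big(- Θ_{i, t} + H_{i} + \sum_{j} J_{i j} m_{j, t - 1}\Big), \tag{15}$$

by solving the equation, we find $Θ_{i, t}^{*} \approx H_{i} + \sum_{j} J_{i j} m_{j, t - 1}$, which leads to the naive mean-field approximation:

$$m_{i, t} \approx \tanh \Big[H_{i} + \sum_{j} J_{i j} m_{j, t - 1}\Big]. \tag{16}$$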

We apply the same expansion to approximate the correlations, expanding C_ik,t(α) and D_il,t(α) around α = 0 up to the first order using $Θ_{i, t} = Θ_{i, t}^{*}$ . Then we obtain

Detailed calculations are presented in Supplementary Note .

To obtain the second-order approximation, we need to solve $\frac{\partial m_{i} (α = 0)}{\partial α} + \frac{1}{2} \frac{\partial^{2} m_{i} (α = 0)}{\partial α^{2}} = 0$ from Eq. (13). Here the second-order derivative is given as

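where terms of the order higher than quadratic were neglected (see Supplementary Note for further details). From these equations, we find $Θ_{i, t}^{*} \approx H_{i} + \sum_{j} J_{i j} m_{j, t - 1} - m_{i, t} \sum_{j} J_{i j}^{2} (1 - m_{j, t - 1}^{2})$, leading to the TAP equation:

$$m_{i, t} \approx \tanh \Big[H_{i} + \sum_{j} J_{i j} m_{j, t - 1} - m_{i, t} \sum_{j} J_{i j}^{2} (1 - m_{j, t - 1}^{2})\Big]. \tag{20}$$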

Having $Θ_{i, t}^{*}$ , we can incorporate TAP approximations of the correlations by expanding C_ik,t(α) and D_il,t(α) (see Supplementary Note 2 for details) as:

In these approximations, Eqs. (16) and (20) of activation rates m_t correspond to the classical nMF and TAP equations of the kinetic Ising model^38,39. The mean-field equations for the equal-time and delayed correlations (Eqs. (17), (18), (21), and (22)) are novel contributions from applying the Plefka expansion to correlations.
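
As a sketch of how the second-order condition produces the reaction term in the TAP parameters, the steps can be written out as follows. This derivation assumes the interpolating field h_i,t(α) = (1 − α)Θ_i,t + α(H_i + Σ_j J_ij s_j,t−1) and keeps terms up to quadratic order in the couplings; it is an illustration consistent with the expressions for Θ* quoted above, not a reproduction of the supplementary calculation.

```latex
\begin{align*}
m_{i,t}(\alpha) &= \sum_{s_{t-1}} \tanh\big(h_{i,t}(\alpha)\big)\, P_\alpha(s_{t-1}), \\
\frac{\partial m_{i,t}(\alpha=0)}{\partial \alpha}
  &= (1 - m_{i,t}^2)\Big(-\Theta_{i,t} + H_i + \sum_j J_{ij}\, m_{j,t-1}\Big), \\
\frac{1}{2}\frac{\partial^2 m_{i,t}(\alpha=0)}{\partial \alpha^2}
  &\approx -\, m_{i,t}\,(1 - m_{i,t}^2) \sum_j J_{ij}^2\,(1 - m_{j,t-1}^2), \\
0 &= (1 - m_{i,t}^2)\Big(-\Theta_{i,t}^{*} + H_i + \sum_j J_{ij}\, m_{j,t-1}
     - m_{i,t} \sum_j J_{ij}^2\,(1 - m_{j,t-1}^2)\Big), \\
\Rightarrow\quad \Theta_{i,t}^{*} &\approx H_i + \sum_j J_{ij}\, m_{j,t-1}
     - m_{i,t} \sum_j J_{ij}^2\,(1 - m_{j,t-1}^2).
\end{align*}
```

The quadratic term arises from the variance of Σ_j J_ij s_j,t−1 under the independent model at t − 1, which equals Σ_j J_ij² (1 − m_j,t−1²). The squared mean of ∂h/∂α and the contribution of ∂m_j,t−1/∂α are of higher order in J once the first-order condition is imposed, and are dropped in this sketch.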

Using the equations above, we can compute the approximate statistical properties of the system at t (m_t, C_t, D_t) from m_t−1. Therefore, the system evolution is described by recursively computing m_t from an initial state m₀ (for both transient and stationary dynamics), although approximation errors accumulate over the iterations. After we introduce a unified view of mean-field approximations in the subsequent sections, we will numerically examine approximation errors of these various methods in predicting statistical structure of the system.
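
The recursive use of the mean updates can be sketched as follows. The nMF step applies m_i,t = tanh(H_i + Σ_j J_ij m_j,t−1) directly; the TAP step is implicit in m_i,t, and the damped fixed-point iteration used here to solve it is an assumption of this sketch (it is not stated in the text and may need more damping for strong couplings). Function names are illustrative.

```python
import numpy as np

def nmf_step(H, J, m_prev):
    """Naive mean-field update of the activation rates."""
    return np.tanh(H + J @ m_prev)

def tap_step(H, J, m_prev, n_iter=500, tol=1e-12, damping=0.5):
    """TAP update: solve m = tanh(g - m * v) by damped fixed-point iteration."""
    g = H + J @ m_prev                       # naive mean-field field
    v = (J ** 2) @ (1.0 - m_prev ** 2)       # sum_j J_ij^2 (1 - m_j^2)
    m = np.tanh(g)                           # start from the nMF solution
    for _ in range(n_iter):
        m_new = (1 - damping) * m + damping * np.tanh(g - m * v)
        if np.max(np.abs(m_new - m)) < tol:
            return m_new
        m = m_new
    return m

def run_mean_field(step, H, J, m0, T):
    """Iterate a mean-field update T times from the initial rates m0."""
    m = np.asarray(m0, dtype=float)
    history = [m]
    for _ in range(T):
        m = step(H, J, m)
        history.append(m)
    return np.array(history)
```

For example, `run_mean_field(tap_step, H, J, np.ones(len(H)), T)` mirrors a start from the all-up state; each step reuses the previous approximate rates, so errors can compound over the iterations.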

Generalization of mean-field approximations

In the previous section, we described a Plefka expansion that uses a model containing independent units at time t − 1 and t to construct a marginal probability distribution $P_{α}^{[t - 1 : t]} (s_{t})$ . This is, however, not the only possible choice of approximation. As we mentioned above, other approximations have been introduced in the literature. In ref. ⁴⁰, expressions are provided for the nonstationary delayed correlations D_t as a function of C_t−1. In ref. ⁴¹, an approximation is derived by assuming that units at state s_t−1 are independent while correlations of s_t are preserved.

In the following sections, we show that various approximation methods, including those mentioned above, can be unified as Plefka expansions. Each method of the approximation corresponds to a specific choice of the submanifold $Q_{t}$ at each time step. Fig. 2 shows the corresponding submanifolds $Q_{t - 1 : t}$ of possible approximations, where gray lines represent interactions that are affected by α in the Plefka expansion. The mean-field approximations in the previous section were obtained by using the model represented in Fig. 2B, where the couplings at time t − 1 and t are affected by α. Below, we present systematic applications of the Plefka expansions around other reference models in order to approximate the original distribution (Fig. 2C–E). By doing so, we not only unify the previously reported mean-field approximations but also provide novel solutions that can provide more precise approximations than known methods.

Fig. 2

Unified mean-field framework.

Original model (A) and family of generalized Plefka expansions (B–E). Gray lines represent connections that are proportional to α and thus removed in the approximated model to perform the Plefka expansions, while solid black lines are conserved and dashed lines are free parameters. Plefka[t − 1, t] (B) retrieves the classical naive and TAP mean-field equations^38,39. Plefka[t] (C) results in a novel method which preserves correlations of the system at t − 1, incorporating equations similar to ref. ⁴⁰. Plefka[t − 1] (D) assumes independent activity at t − 1, and in its first order approximation reproduces the results in ref. ⁴¹. Plefka2[t] (E) represents a novel pairwise approximation which performs better in approximating correlations.

Plefka[t]: expansion around an independent model at time t

For the Plefka[t − 1, t] approximation, explained above, the system becomes independent for α = 0 at t as well as t − 1. This leads to approximations of m_t, C_t, D_t being specified by m_t−1, while being independent of C_t−1 and D_t−1. In ref. ⁴⁰, the authors describe a mean-field approximation by performing a new expansion over the classical nMF and TAP equations that takes into account previous correlations C_t−1. Here, our framework allows us to obtain similar results by considering only a Plefka expansion over the manifold $Q_{t}$ while assuming that we know the properties of P(s_t−1) (Fig. 2C). Therefore, we denote this approximation as $P_{α}^{[t]}$ and consider

In Supplementary Note we derive the equations for this approximation. For the first order, we obtain

Note that Eqs. (24) and (25) are the same as the nMF Plefka[t − 1, t] equations. Equation (26) includes C_t−1 and is exactly the result obtained in ref. ⁴⁰, Eq. (4). The second-order approximation leads to:

All update rules include the effect of C_t−1. We can see that if we use the covariance matrix of the independent model at t − 1, we recover the results of the Plefka[t − 1, t] approximation in the previous section. In contrast with ref. ⁴⁰, we provide a novel approximation method that depends on previous correlations using a single expansion (instead of two subsequent expansions), and additionally present approximated equal-time correlations.

Plefka[t − 1]: expansion around an independent model at time t − 1

In ref. ⁴¹, a mean-field method is proposed by approximating the effective field h_t as the sum of a large number of independent spins, approximated by a Gaussian distribution, yielding exact results for fully asymmetric networks in the thermodynamic limit. In our framework, we describe this approximation as an expansion around the projection point from P(s_t−1) to the submanifold $Q_{t - 1}$ , using a model where only s_t−1 are independent (Fig. 2D). In this case (see Supplementary Note 4), the effective field h_t at the submanifold is a sum of independent terms, which for large N yields a Gaussian distribution.

By defining

we see that now the expansion is defined for the marginal distribution of the path {s_t−1, s_t} (see Supplementary Note ). The first order equations for this method are

Here we use $D_{x} = \frac{d x}{\sqrt{2 π}} \exp (- \frac{1}{2} x^{2})$, $D_{x y}^{ρ_{i k}} = \frac{d x d y}{2 π \sqrt{1 - ρ_{i k}^{2}}} \exp (- \frac{1}{2} \frac{(x^{2} + y^{2}) - 2 ρ_{i k} x y}{1 - ρ_{i k}^{2}})$, $Δ_{i, t} = \sum_{j} J_{i j}^{2} (1 - m_{j, t - 1}^{2})$, and $ρ_{i k} = \sum_{j} J_{i j} J_{k j} (1 - m_{j, t - 1}^{2}) / \sqrt{Δ_{i, t} Δ_{k, t}}$. Derivations are described in Supplementary Note . These results are exactly the same as those presented for m_t, D_t in ref. ⁴¹, with an additional expression for C_t. For this approximation, we do not consider the second-order equations since they are computationally much more expensive than the other approximations.
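
The Gaussian average in this approximation can be evaluated numerically with Gauss-Hermite quadrature. The sketch below assumes that the first-order mean update takes the Gaussian-averaged form m_i,t ≈ ∫ D_x tanh(H_i + Σ_j J_ij m_j,t−1 + x √Δ_i,t), as in the method of ref. ⁴¹; this explicit form, the function names, and the number of quadrature nodes are assumptions of the example. The second function computes the correlation coefficient ρ_ik of the effective fields from the definitions of Δ_i,t and ρ_ik given above.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def plefka_t1_means(H, J, m_prev, n_nodes=40):
    """Gaussian-averaged mean update (assumed form, see lead-in)."""
    x, w = hermegauss(n_nodes)            # nodes/weights for weight exp(-x^2 / 2)
    w = w / np.sqrt(2 * np.pi)            # normalize to the standard normal measure D_x
    g = H + J @ m_prev                    # mean of the effective field
    delta = (J ** 2) @ (1.0 - m_prev ** 2)  # variance of the effective field
    field = g[:, None] + np.sqrt(delta)[:, None] * x[None, :]
    return np.tanh(field) @ w

def field_correlation(J, m_prev):
    """rho_ik = sum_j J_ij J_kj (1 - m_j^2) / sqrt(Delta_i Delta_k)."""
    var_prev = 1.0 - m_prev ** 2
    cov = (J * var_prev[None, :]) @ J.T   # covariance of effective fields h_i, h_k
    delta = np.diag(cov)                  # Delta_i on the diagonal
    return cov / np.sqrt(np.outer(delta, delta))
```

When all couplings vanish, Δ_i,t = 0 and the update reduces to tanh(H_i), the independent-unit result. `field_correlation` requires every Δ_i,t to be nonzero.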

Plefka2[t]: expansion around a pairwise model

The proposed framework is also a powerful tool to develop novel Plefka expansions. To make the expansions more accurately approximate target statistics, we can consider a reference manifold composed of multiple time steps while maintaining some of the parameters in the system (see Supplementary Note 1). Motivated by this idea, here we propose new methods that directly approximate pairwise activities of the units by choosing a reference manifold that preserves a coupling term.

Let us first consider the joint probability of any arbitrary pair of units at time t − 1 and t to compute the delayed correlations (Fig. 2E, left). Namely, we consider the joint probability of spins s_i,t and s_l,t−1:

with s_⧹l,t−1 containing all elements of s_t−1 except s_l,t−1. As a reference manifold $Q_{t - 1 : t}$, we consider the dependency among only the units i and l:

where θ_i,t(s_l,t−1) = Θ_i,t + Δ_il,t s_l,t−1. The orthogonal projection to $Q_{t}$ is equivalent to minimizing the KL divergence D(P∣∣Q) with respect to the parameters:

with

As in the previous approximations, P(s_i,t, s_l,t−1) is connected to $Q (s_{i, t}, s_{l, t - 1} ∣ θ_{t}^{*}, Θ_{t - 1}^{*})$ through an α-dependent probability

with conditional probabilities given by

As in the cases above, we can calculate the equations for the first and second-order approximations (see Supplementary Note 5). Here, for the second-order approximation (which is more accurate than the first order) we have that:

which directly leads to calculation of means and delayed correlations as:

These results are related to previous work⁴³ that included autocorrelations as one of the constraints to derive the Plefka approximation. Instead, here we provide a Plefka approximation that includes delayed correlations between any pair of units.

To compute the above approximations, we need to know C_t−1 and C_t−2. Here, we provide similar pairwise Plefka approximations for the pairwise distribution at time t, P(s_i,t, s_k,t). Since s_i,t, s_k,t are conditionally independent, we can construct a model in which first s_k,t is computed from s_t−1, and then s_i,t is computed conditioned on s_k,t, s_t−1 (Fig. 2E, right):

with conditional probabilities given by

Here θ_i,t is a function of s_k,t that accounts for equal-time correlations between s_i,t and s_k,t. Computed similarly to delayed correlations, the second-order approximation yields (see Supplementary Note 5):

Using these equations, approximate equal-time correlations are given as

Note that the approximation of equal-time correlations may not be symmetric for C_ik,t and C_ki,t. In the results of this paper we use the average of the two.

Comparison of the different approximations

In the subsequent sections, we compare the family of Plefka approximation methods described above by testing their performance in the forward and inverse Ising problems. More specifically, we compare the second-order approximations of Plefka[t − 1, t] and Plefka[t], the first order approximation of Plefka[t − 1], and the second-order pairwise approximation of Plefka2[t]. We define an Ising model as an asymmetric version of the kinetic Sherrington-Kirkpatrick (SK) model, setting its parameters around the equivalent of a ferromagnetic phase transition in the equilibrium SK model. External fields H_i are sampled from independent uniform distributions $U (- β H_{0}, β H_{0})$ , H₀ = 0.5, whereas coupling terms J_ij are sampled from independent Gaussian distributions $N (β \frac{J_{0}}{N}, β^{2} \frac{J_{σ}^{2}}{N})$ , J₀ = 1, J_σ = 0.1, where β is a scaling parameter (i.e., an inverse temperature).
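
The parameter ensemble described above can be drawn as in the following sketch. The second argument of the Gaussian is read as a variance, so the standard deviation passed to the sampler is β J_σ / √N. The function name and the choice to draw diagonal entries J_ii in the same way as the off-diagonal ones are assumptions of this example.

```python
import numpy as np

def sample_sk_parameters(N, beta, H0=0.5, J0=1.0, J_sigma=0.1, rng=None):
    """Draw fields H_i ~ U(-beta*H0, beta*H0) and couplings
    J_ij ~ N(beta*J0/N, beta^2 * J_sigma^2 / N), independently for each (i, j)."""
    rng = np.random.default_rng() if rng is None else rng
    H = rng.uniform(-beta * H0, beta * H0, size=N)
    # J_ij and J_ji are drawn independently, so the matrix is asymmetric.
    J = rng.normal(beta * J0 / N, beta * J_sigma / np.sqrt(N), size=(N, N))
    return H, J
```

Scanning β then rescales both the fields and the couplings together, for instance `sample_sk_parameters(512, beta)` for each value of β of interest.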

Generally, mean-field methods are suitable for approximating properties of systems with small fluctuations. However, there is evidence that many biological systems operate in critical, highly fluctuating regimes^5,6. In order to examine different approximations in such a biologically plausible yet challenging situation, we select the model parameters around a phase transition point displaying large fluctuations.

To find such conditions, we employed path-integral methods to solve the asymmetric SK model (Supplementary Note 6). We find that, for our choice of parameters, the stationary solution of the asymmetric model displays a non-equilibrium analogue of a critical point for a ferromagnetic phase transition, which takes place at β_c ≈ 1.1108 in the thermodynamic limit (see Supplementary Note 6, Supplementary Fig. 1). The uniformly distributed bias terms H shift the phase transition point from β = 1 obtained at H = 0. By simulating finite-size systems, we confirmed that the maximum fluctuations in the model, indicated by maximal covariance values, are found near the theoretical β_c (see Supplementary Note 6, Supplementary Fig. 2).

Fluctuations of a system are generally expected to be maximized at a critical phase transition¹⁹. In addition, entropy production (a signature of time irreversibility) has been suggested as an indicator of phase transitions. For example, it presents a peak at the transition point of a continuous phase transition in a non-equilibrium Curie-Weiss Ising model with oscillatory field⁵⁰ and some instances of mean-field majority vote models^51,52. We found that the entropy production of the kinetic Ising system is also maximized around β_c (discussed later, see also Methods for its derivation).

Forward Ising problem

We examine the performance of the different Plefka expansions in predicting the evolution of an asymmetric SK model of size N = 512 with random H and J. To study the nonstationary transient dynamics of the model, we start from s₀ = 1 (all elements set to 1 at t = 0) and recursively update its state for T = 128 steps. We repeated this stochastic simulation for R = 10⁶ trials for 21 values of β in the range [0.7β_c, 1.3β_c] (except for the reconstruction of the phase transition where we used R = 10⁵ and 201 values of β in the same range). Using the R samples, we computed the statistical moments and cumulants of the system, m_t, C_t, and D_t at each time step. We then computed their averages over the system units, i.e., ${⟨ m_{i, t} ⟩}_{i}$ , ${⟨ C_{i k, t} ⟩}_{i k}$ and ${⟨ D_{i l, t} ⟩}_{i l}$ , where angle brackets denote the average over the indices in the subscript.
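The sampling procedure can be sketched as below, at a much smaller size than the N = 512, R = 10⁶ used here. The sketch assumes the standard parallel-update kinetic Ising rule, P(s_i,t = 1 ∣ s_t−1) = e^{h_i,t} / (2 cosh h_i,t) with h_i,t = H_i + Σ_j J_ij s_j,t−1, and takes C_t and D_t to be the equal-time and one-step-delayed covariances. Function names and toy sizes are hypothetical.

```python
import numpy as np

def simulate_kinetic_ising(H, J, T, R, seed=0):
    """Simulate R independent trials of T parallel updates from s_0 = 1.

    Returns an int8 array of shape (T + 1, R, N) with entries in {-1, +1}.
    """
    rng = np.random.default_rng(seed)
    N = H.size
    S = np.empty((T + 1, R, N), dtype=np.int8)
    S[0] = 1  # all units set to +1 at t = 0
    for t in range(1, T + 1):
        h = H + S[t - 1] @ J.T            # h[r, i] = H_i + sum_j J_ij s_j,t-1
        p_up = 0.5 * (1.0 + np.tanh(h))   # equals exp(h) / (2 cosh h)
        S[t] = np.where(rng.random((R, N)) < p_up, 1, -1)
    return S

def moments(S, t):
    """Sample estimates of m_t, C_t and D_t (covariances) at time step t >= 1."""
    R = S.shape[1]
    s_now = S[t].astype(float)
    s_prev = S[t - 1].astype(float)
    m_now, m_prev = s_now.mean(axis=0), s_prev.mean(axis=0)
    C = s_now.T @ s_now / R - np.outer(m_now, m_now)    # C_ik,t
    D = s_now.T @ s_prev / R - np.outer(m_now, m_prev)  # D_il,t
    return m_now, C, D

# Toy-sized run with parameters drawn as in the text at beta = 1.
rng = np.random.default_rng(1)
N, T, R = 8, 16, 2000
H = rng.uniform(-0.5, 0.5, size=N)
J = rng.normal(1.0 / N, 0.1 / np.sqrt(N), size=(N, N))
S = simulate_kinetic_ising(H, J, T, R)
m, C, D = moments(S, t=T)
print(m.mean(), C.mean(), D.mean())  # unit-averaged statistics at the last step
```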

The black solid lines in Fig. 3A–C display nonstationary dynamics of these averaged statistics from t = 0, …, 128, simulated by the original model at $β = β_{c}$ . In comparison, colored lines display these statistics predicted by the family of Plefka approximations that are recursively computed using the obtained equations, starting from the initial state m₀ = 1, C₀ = 0 and D₀ = 0. We observe that although the recursive application of all the approximation methods provides good predictions for the transient dynamics of the mean activation rates m_t until convergence (Fig. 3A), the predictions using Plefka[t] and especially the proposed Plefka2[t] approximations are closer to the true dynamics than the others. Evolution of the mean equal-time and time-delayed correlations C_t, D_t is precisely captured only by our new method Plefka2[t]. In contrast, Plefka[t] overestimates correlations while Plefka[t − 1] and Plefka[t − 1, t] underestimate correlations.
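To illustrate what recursive application from m₀ = 1 looks like in practice, the following sketch iterates the textbook naive mean-field update m_i,t = tanh(H_i + Σ_j J_ij m_j,t−1). This is a deliberately simple stand-in, not any of the second-order or pairwise Plefka equations compared in Fig. 3; the function name and toy parameters are hypothetical.

```python
import numpy as np

def naive_mean_field_trajectory(H, J, T, m0=None):
    """Iterate m_t = tanh(H + J m_{t-1}) for T steps; returns shape (T + 1, N)."""
    N = H.size
    m = np.ones(N) if m0 is None else np.asarray(m0, dtype=float)
    traj = np.empty((T + 1, N))
    traj[0] = m
    for t in range(1, T + 1):
        # Each step uses only the statistics predicted at the previous step,
        # so approximation errors can accumulate along the trajectory.
        m = np.tanh(H + J @ m)
        traj[t] = m
    return traj

rng = np.random.default_rng(0)
N = 8
H = rng.uniform(-0.5, 0.5, size=N)
J = rng.normal(1.0 / N, 0.1 / np.sqrt(N), size=(N, N))
traj = naive_mean_field_trajectory(H, J, T=128)
```

The more accurate methods follow the same recursive pattern but also propagate correlations (C_t, D_t) alongside the magnetizations.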

Fig. 3

Forward Ising problem.

Top: Evolution of average activation rates (magnetizations) (A), equal-time correlations (B), and delayed correlations (C) found by different mean-field methods for β = β_c. Middle: Comparison of the activation rates (D), equal-time correlations (E), and delayed correlations (F) found by the different Plefka approximations (ordinate, p superscript) with the original values (abscissa, o superscript) for β = β_c and t = 128. Black lines represent the identity line. Bottom: Average squared error of the magnetizations $ϵ_{m} = {⟨ {⟨ {(m_{i, t}^{o} - m_{i, t}^{p})}^{2} ⟩}_{i} ⟩}_{t}$ (G), equal-time correlations $ϵ_{C} = {⟨ {⟨ {(C_{i k, t}^{o} - C_{i k, t}^{p})}^{2} ⟩}_{i k} ⟩}_{t}$ (H), and delayed correlations $ϵ_{D} = {⟨ {⟨ {(D_{i l, t}^{o} - D_{i l, t}^{p})}^{2} ⟩}_{i l} ⟩}_{t}$ (I) for 21 values of β in the range [0.7β_c, 1.3β_c].

The performance of the methods in predicting individual activation rates and correlations is displayed in Fig. 3D–F by comparing vectors m_t, C_t and D_t at the last time step (t = 128) of the original model (o superscript) and those of the Plefka approximations (p superscript). For activation rates m_t, the proposed Plefka2[t] and Plefka[t] perform slightly better than the others (see also Fig. 3A). Equal-time and time-delayed correlations C_t, D_t are overestimated by Plefka[t], underestimated moderately by Plefka[t − 1] and significantly by Plefka[t − 1, t], and best predicted by Plefka2[t] (Fig. 3E, F).

The above results are obtained at the critical β = β_c, intuitively the most challenging point for mean-field approximations. To further show that our novel approximation Plefka2[t] systematically outperforms the others over a wider parameter range, we repeated the analysis for different inverse temperatures β (the same random parameters are applied for all β). Fig. 3G, H, I, respectively, show the squared errors of the activation rates ϵ_m, equal-time correlations ϵ_C and delayed correlations ϵ_D between the original model and the approximations, averaged over units and time, for 21 values of β in the range [0.7β_c, 1.3β_c]. Fig. 3G–I shows that Plefka2[t] outperforms the other methods in computing m_t, C_t, D_t (with the exception of a certain region of β > β_c in which Plefka[t] is slightly better), consistently yielding a low error for all values of β. Errors of these approximations are smaller when the system is away from β_c.
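The error measures ϵ_m, ϵ_C and ϵ_D are plain averages of squared differences over units and time. A minimal sketch, assuming the original and predicted statistics are stored as arrays of identical shape with time on the first axis (the function name and toy arrays are hypothetical):

```python
import numpy as np

def average_squared_error(x_orig, x_pred):
    """Mean of (x_orig - x_pred)^2 over all entries (units and time steps)."""
    x_orig = np.asarray(x_orig, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    if x_orig.shape != x_pred.shape:
        raise ValueError("original and predicted arrays must have the same shape")
    return float(np.mean((x_orig - x_pred) ** 2))

# Toy magnetizations of shape (T, N); the values are illustrative only.
m_o = np.zeros((128, 4))
m_p = np.full((128, 4), 0.1)
eps_m = average_squared_error(m_o, m_p)
```

The same function applies to correlation arrays of shape (T, N, N), since the mean is taken over every axis.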

Inverse Ising problem

We apply the approximation methods to the inverse Ising problem by using the data generated above for the trajectory of T = 128 steps and R = 10⁶ trials to infer the parameters of the model, H and J. The model parameters are estimated by the Boltzmann learning method under the maximum likelihood principle: H and J are updated to minimize the differences between the average rates m_t or delayed correlations D_t of the original data and those of the model approximations. Exact Boltzmann learning requires computing the likelihood of every point in a trajectory and every trial (RT calculations) at each iteration, whereas applying the Plefka approximations lets us estimate the gradient at each iteration in a one-shot computation, which can significantly reduce computational time (see Methods). At β = β_c (Fig. 4A, B), we observe that the classical Plefka[t − 1, t] approximation adds significant offset values to the fields H and couplings J. In contrast, Plefka[t], Plefka[t − 1] and Plefka2[t] are all precise in estimating the values of H and J.

Fig. 4

Inverse Ising problem.

Top: Inferred external fields (A) and couplings (B) found by different mean-field models plotted versus the real ones for β = β_c. Black lines represent the identity line. Bottom: Average squared error of inferred external fields $ϵ_{H} = {⟨ {(H_{i}^{o} - H_{i}^{p})}^{2} ⟩}_{i}$ (C) and couplings $ϵ_{J} = {⟨ {(J_{i j}^{o} - J_{i j}^{p})}^{2} ⟩}_{i j}$ (D) for 21 values of β in the range [0.7β_c, 1.3β_c].

Fig. 4C, D shows the mean squared errors ϵ_H, ϵ_J for bias terms and couplings between the original model and the inferred values for different β. In this case, errors are large in the estimation of J for Plefka[t − 1, t]. In comparison, Plefka[t], Plefka[t − 1] and Plefka2[t] work equally well even in the high fluctuation regime (β ≈ β_c). Since the inverse Ising problem is solved by applying the approximation for a single time step (per iteration), it is not as challenging as the forward problem, which can accumulate errors by recursively applying the approximations. Therefore, the approximations other than the classical mean-field Plefka[t − 1, t] perform equally well in this case.

Phase transition reconstruction

We have shown how different methods perform in computing the behavior of the system (forward problem) and inferring the parameters of a given network from its activation data (inverse problem). Combining the two, we can ask how well the methods explored here can reconstruct the behavior of a system from data, potentially exploring behaviors under different conditions than the recorded data.

First, in Fig. 5A–C we examine how the different approximation methods approximate fluctuations (equal-time and time-delayed covariances) and the entropy production (see Methods) at t = 128 after solving the forward problem by recursively applying the approximations for the 128 steps. As we mentioned above, the asymmetric SK model explored here presents maximum fluctuations and maximum entropy production around β = β_c (Supplementary Note 6, Supplementary Fig. 2). However, we see that Plefka[t − 1, t] and Plefka[t − 1] cannot reproduce the behavior of correlations C_t and D_t of the original SK model around the transition point. Plefka[t] and Plefka2[t] show much better performance in capturing the behavior of C_t and D_t in the phase transition, although Plefka[t] overestimates both correlations. Additionally, all the methods capture the phase transition in entropy production, though Plefka[t] overestimates its value around β_c and Plefka2[t] is more precise than the other methods.

Fig. 5

Reconstructing phase transition of kinetic Ising systems.

Top: Average of the Ising model’s equal-time correlations (A), delayed correlations (B), and entropy production (shown as an exponential for better presentation of its maximum) (C), at the last step t = 128 found by different mean-field methods for β = β_c. Bottom (D–F): The same as above using the network H, J reconstructed by solving the inverse Ising problem at β = β_c and multiplying the estimated parameters by a fictitious inverse temperature $\tilde{β}$ . The stars mark the values of $\tilde{β}$ that yield maximum fluctuations or maximum entropy production.

Next, we combine the forward and inverse Ising problems and try to reproduce the transition of the asymmetric SK model in the models inferred from the data. We first take the values of H, J obtained by solving the inverse problem on the data sampled at β = β_c, and then solve the forward problem again with those estimated parameters rescaled by a new inverse temperature $\tilde{β}$ . The results for the correlations (Fig. 5D, E) show that in this case Plefka[t − 1, t] performs poorly and fails to capture the transition. Plefka[t − 1] performs similarly to the forward problem, and Plefka[t] and Plefka2[t] behave similarly, slightly underestimating fluctuations. When we analyze the entropy production of the system (Fig. 5F), we find that Plefka2[t] is the most precise, with Plefka[t − 1] slightly overestimating it, Plefka[t] underestimating it, and Plefka[t − 1, t] not capturing the phase transition. Overall, the results above suggest that Plefka2[t] is better suited to identify non-equilibrium phase transitions in models reconstructed from experimental data.
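The rescaling step can be sketched as a scan over $\tilde{β}$: multiply the inferred H and J by each candidate value, solve the forward problem, and locate the maximum of a fluctuation measure. In the sketch below the forward problem is solved by direct sampling rather than by a Plefka approximation, and the fluctuation measure is the mean off-diagonal equal-time covariance; both choices, the function names, and the toy parameters are assumptions made for illustration.

```python
import numpy as np

def mean_offdiag_covariance(H, J, T=64, R=2000, seed=0):
    """Sample the kinetic Ising model from s_0 = 1 and return the mean
    off-diagonal equal-time covariance at the last step."""
    rng = np.random.default_rng(seed)
    N = H.size
    s = np.ones((R, N))
    for _ in range(T):
        p_up = 0.5 * (1.0 + np.tanh(H + s @ J.T))
        s = np.where(rng.random((R, N)) < p_up, 1.0, -1.0)
    C = np.cov(s, rowvar=False)  # columns are units, rows are trials
    return float(C[~np.eye(N, dtype=bool)].mean())

def scan_fictitious_temperature(H_hat, J_hat, beta_tildes, **kwargs):
    """Rescale (H_hat, J_hat) by each beta_tilde and return the value that
    maximizes the fluctuation measure, together with all scores."""
    scores = np.array([mean_offdiag_covariance(b * H_hat, b * J_hat, **kwargs)
                       for b in beta_tildes])
    return float(beta_tildes[int(np.argmax(scores))]), scores

# Stand-in for inferred parameters (drawn at random here, not inferred from data).
rng = np.random.default_rng(0)
N = 16
H_hat = rng.uniform(-0.5, 0.5, size=N)
J_hat = rng.normal(1.0 / N, 0.1 / np.sqrt(N), size=(N, N))
beta_tildes = np.linspace(0.7, 1.3, 7)
best, scores = scan_fictitious_temperature(H_hat, J_hat, beta_tildes, T=32, R=500)
```

Replacing `mean_offdiag_covariance` with a mean-field forward solver gives the one-shot variant discussed in the text; the scan logic stays the same.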

Discussion

We have proposed a framework that unifies different mean-field approximations of the evolving statistical properties of non-equilibrium Ising models. This allows us to derive approximations premised on specific assumptions about the correlation structure of the system previously proposed in the literature. Furthermore, using our framework we derive a new approximation (Plefka2[t]) using atypical assumptions for mean-field methods, i.e., the maintenance of pairwise correlations in the system. This new pairwise approximation outperforms existing ones for approximating the behavior of an asymmetric SK model near the non-equilibrium equivalent of a ferromagnetic phase transition (see Supplementary Note 6), where classical mean-field approximations face problems. This shows that the proposed methods are useful tools to analyze large-scale, non-equilibrium dynamics near critical regimes expected for biological and social systems. However, we note that low-temperature spin phases (e.g., the spin-glass phase in symmetric models) also impose limitations on mean-field approximations^32,41, which could be further explored with methods like the ones presented here.

The generality of this framework allows us to picture other approximations with atypical assumptions. For example, the Sessak-Monasson expansion⁵³ for an equilibrium Ising model assumes a linear relation between α and spin correlations. An equivalent equilibrium expansion could use an effective field h(α) nonlinearly dependent on α, satisfying linear C_t(α) = αC_t or D_t(α) = αD_t relations. As another extension, Plefka2[t] could incorporate higher-order interactions. As Eqs. (43) and (52) are each equivalent to two mean-field approximations with s_l,t−1 = ± 1 respectively, a generalized PlefkaM[t] would involve $2^{M-1}$ equations, increasing accuracy but also computational costs. In general, reference models Q(s_t) set coupling parameters of the model to zero at some steps of its dynamics. Other parameters (e.g., fields) are either free parameters fitted as m-projection from P(s_t), or kept at their original values (see Supplementary Note 7 for comparing free and fixed parameters of each model). Augmenting accuracy by increasing parameters often involves a computational cost. As a practical guideline for using each method, Supplementary Note 7 compares their precision and computation time in the forward and inverse problems (see also Supplementary Figs. 3 and 4).

Aside from its theoretical implications, our unified framework offers analysis tools for diverse data-driven research fields. In neuroscience, it has been popular to study the activity of ensembles of neurons by inferring an equilibrium Ising model with homogeneous (fixed) parameters²³ or inhomogeneous (time-dependent) parameters^25,54 from empirical data. Extended analyses based on the equilibrium model have reported that neurons operate near a critical regime^5,6. However, studies of non-equilibrium dynamics in neural spike trains are scarce^7,26,55, partly due to the lack of systematic methods for analysing large-scale non-equilibrium data from neurons exhibiting large fluctuations. The proposed pairwise model Plefka2[t] is suitable for simulating such network activities, being more accurate than previous methods in predicting the network evolution at criticality (Fig. 3) and in testing if the system is near the maximally fluctuating regime (Fig. 5). In particular, application of our methods for computing entropy production in non-equilibrium systems could provide tools for characterizing the non-equilibrium dynamics of neural systems⁵⁶.

In summary, a unified framework of mean-field theories offers a systematic way to construct suitable mean-field methods in accordance with the statistical properties of the systems researchers wish to uncover. This is expected to foster a variety of tools to analyze large-scale non-equilibrium systems in physical, biological, and social systems.

Methods

Boltzmann learning in the inverse Ising problem

Let $S_{t}^{r} = \{S_{1, t}^{r}, S_{2, t}^{r}, \dots, S_{N, t}^{r}\}$ for t = 1, …, T be observed states of a process described by Eq. (1) at the r-th trial (r = 1, …, R). We also define S_1:T to represent the processes from all trials. The inverse Ising problem consists in inferring the external fields H and couplings J of the system. These parameters can be estimated by maximizing the log-likelihood $ℓ$ (S_1:T) of the observed states under the model:

with $h_{i, t}^{r} = H_{i} + \sum_{j} J_{i j} S_{j, t - 1}^{r}$ . The learning steps are obtained as:
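Assuming the standard kinetic Ising transition probability $P (s_{i, t} ∣ s_{t - 1}) = e^{s_{i, t} h_{i, t}} / (2 \cosh h_{i, t})$ for Eq. (1), the log-likelihood and its gradients can be sketched as follows. The grouping into a data term and a model term is an assumption, chosen so that the second term of each gradient is the model average:

```latex
\begin{aligned}
\ell(S_{1:T}) &= \sum_{t} \sum_{r} \log P(S_{t}^{r} \mid S_{t-1}^{r})
  = \sum_{t} \sum_{r} \sum_{i} \left( S_{i,t}^{r} h_{i,t}^{r} - \log 2 \cosh h_{i,t}^{r} \right), \\
\frac{\partial \ell}{\partial H_{i}} &= \sum_{t} R \left( \langle S_{i,t}^{r} \rangle_{r}
  - \langle \tanh h_{i,t}^{r} \rangle_{r} \right), \\
\frac{\partial \ell}{\partial J_{ij}} &= \sum_{t} R \left( \langle S_{i,t}^{r} S_{j,t-1}^{r} \rangle_{r}
  - \langle \tanh h_{i,t}^{r} \, S_{j,t-1}^{r} \rangle_{r} \right),
\end{aligned}
```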

where 〈⋅〉_r denotes average over trials. We solve the inverse Ising problem by applying these equations as a gradient ascent rule adjusting H and J. The second terms of Eqs. () and () need to be computed at every iteration, thus the computational cost grows linearly with R × T. However, the use of mean-field approximations can significantly reduce the cost when a large number of samples R and time bins T are used to correctly estimate activation rates and correlations in large networks. Here the second terms can be written as

where $\bar{P} (\tilde{s}) = \frac{1}{R T} \sum_{r, t} δ (\tilde{s}, S_{t}^{r})$ is the empirical distribution averaged over trials and trajectories (with δ being a Kronecker delta), ${\tilde{m}}_{l}$ is the average activation rate computed from the empirical distribution, and $P (s ∣ \tilde{s})$ is defined as Eq. (). We then approximate m_i and D_il using the mean-field equations. Note that when we apply the mean-field equations, we replaced all statistics related to the previous step with those computed by the empirical distribution. By applying the mean-field methods, we reduced the computation of R trials of trajectories of length T into a single computation (instead of RT calculations). In our numerical tests, gradient ascent was executed using learning coefficients $η_{H} = 0.1 / (R T)$ and $η_{J} = 1 / (R T \sqrt{N})$ , starting from H = 0, J = 0.
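For comparison with the mean-field shortcut, the exact gradient-ascent rule (the one whose cost grows with R × T) can be sketched as below. The sketch assumes the standard kinetic Ising transition probability, so that the gradient is the sum over trials and time steps of S_i,t − tanh h_i,t, and uses the learning coefficients quoted above. The toy data generator, its zero-mean couplings, the random initial state, and all function names are hypothetical choices for a small test case.

```python
import numpy as np

def toy_trajectories(H, J, T, R, seed=0):
    """Generate R toy trajectories of length T (random initial state)."""
    rng = np.random.default_rng(seed)
    N = H.size
    S = np.empty((T + 1, R, N))
    S[0] = rng.choice([-1.0, 1.0], size=(R, N))
    for t in range(1, T + 1):
        p_up = 0.5 * (1.0 + np.tanh(H + S[t - 1] @ J.T))
        S[t] = np.where(rng.random((R, N)) < p_up, 1.0, -1.0)
    return S

def boltzmann_learning(S, n_iter=500):
    """Exact maximum-likelihood gradient ascent on H and J from samples S
    of shape (T + 1, R, N). Every iteration touches all R * T transitions."""
    T, R, N = S.shape[0] - 1, S.shape[1], S.shape[2]
    s_now = S[1:].reshape(T * R, N)    # S_t for every (t, r) pair
    s_prev = S[:-1].reshape(T * R, N)  # matching S_{t-1}
    eta_H = 0.1 / (R * T)
    eta_J = 1.0 / (R * T * np.sqrt(N))
    H = np.zeros(N)
    J = np.zeros((N, N))
    for _ in range(n_iter):
        resid = s_now - np.tanh(H + s_prev @ J.T)  # data term minus model term
        H += eta_H * resid.sum(axis=0)
        J += eta_J * (resid.T @ s_prev)
    return H, J

rng = np.random.default_rng(42)
N = 5
H_true = rng.uniform(-0.2, 0.2, size=N)
J_true = rng.normal(0.0, 0.2, size=(N, N))
S = toy_trajectories(H_true, J_true, T=20, R=500)
H_hat, J_hat = boltzmann_learning(S)
```

In the mean-field variant, the `tanh` term inside the loop is replaced by the approximated m_i and D_il computed once per iteration from the empirical statistics.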

Entropy production of the kinetic Ising model

The entropy production is defined as the KL divergence between the forward and backward path, quantifying the irreversibility of the system^17,55,57:

where P_B(s_t−1∣s_t) is a probability of the backward trajectory defined as in Eq. () but switching s_t and s_t−1. Assuming a non-equilibrium steady state, where P(s_t) = P(s_t−1), the entropy production of the kinetic Ising system is computed as:
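A minimal numerical sketch of this quantity follows. It assumes that, in the steady state, the field and normalization terms of the forward and backward paths cancel, so that the entropy production reduces to σ = Σ_ij (J_ij − J_ji) D_ij with D the delayed covariance. This closed form is an assumption that follows from the standard kinetic Ising transition probability, and the function name is hypothetical.

```python
import numpy as np

def entropy_production(J, D):
    """Steady-state entropy production sum_ij (J_ij - J_ji) * D_ij.

    J: couplings, shape (N, N).
    D: delayed covariances D_ij = <s_i,t s_j,t-1> - m_i m_j, shape (N, N).
    """
    J = np.asarray(J, dtype=float)
    D = np.asarray(D, dtype=float)
    return float(np.sum((J - J.T) * D))

# Antisymmetric toy example (illustrative values only).
J_toy = np.array([[0.0, 1.0],
                  [-1.0, 0.0]])
D_toy = np.array([[0.0, 0.2],
                  [-0.2, 0.0]])
sigma = entropy_production(J_toy, D_toy)
```

Under this expression, symmetric couplings (J_ij = J_ji) give zero entropy production, consistent with time reversibility.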

Peer review information: Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-021-20890-5.

Acknowledgements

We thank Yasser Roudi and Masanao Igarashi for valuable comments and discussions on this manuscript. This work was supported in part by the Cooperative Intelligence Joint Research between Kyoto University and Honda Research Institute Japan, MEXT/JSPS KAKENHI Grant Number JP 20K11709, and the grant of Joint Research by the National Institutes of Natural Sciences (NINS Program No. 01112005). M.A. was funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 892715 and the University of the Basque Country UPV/EHU post-doctoral training program grant ESPDOC17/17, and supported in part by the Basque Government project IT 1228-19 and project Outonomy PID2019-104576GB-I00 by the Spanish Ministry of Science and Innovation.

Author contributions

M.A., S.A.M., and H.S. designed and reviewed research; M.A. contributed analytical and numerical results; M.A., S.A.M., and H.S. wrote the paper.

Data availability

The datasets generated and analysed in this study are available under CC BY license at Zenodo https://zenodo.org/record/4318983⁵⁸ (10.5281/zenodo.4318983).

Code availability

The source code for implementing the methods and results in this work is available under GPL license at GitHub https://github.com/MiguelAguilera/kinetic-Plefka-expansions⁵⁹ (10.5281/zenodo.4357634).

Competing interests

The authors declare no competing interests.

References

1. Nguyen. Whole-brain calcium imaging with cellular resolution in freely behaving Caenorhabditis elegans. doi: 10.1073/pnas.1507110112

2. Ahrens, Orger, Robson, Li, Keller. Whole-brain functional imaging at cellular resolution using light-sheet microscopy. doi: 10.1038/nmeth.2434

3. Stringer, C., Pachitariu, M., Steinmetz, N., Carandini, M. & Harris, K. D. High-dimensional geometry of population responses in visual cortex. Nature doi: 10.1038/s41586-019-1346-5 (2019).

4. Nicolis, G. & Prigogine, I. Self-Organization in Nonequilibrium Systems: From Dissipative Structures to Order through Fluctuations. 1st edn. (Wiley, New York, 1977).

5. Tkačik. Thermodynamics and signatures of criticality in a network of neurons. doi: 10.1073/pnas.1514188112

6. Mora, Deny, Marre. Dynamical criticality in the collective activity of a population of retinal neurons. doi: 10.1103/PhysRevLett.114.078105

7. Hertz, J., Roudi, Y. & Tyrcha, J. Ising model for inferring network structure from spike data. In Principles of Neural Coding, 527–546 (CRC Press, 2013).

8. Roudi, Dunn, Hertz. Multi-neuronal activity and functional connectivity in cell assemblies. doi: 10.1016/j.conb.2014.10.011

9. Bouchaud. Crises and collective socio-economic phenomena: simple models and challenges. doi: 10.1007/s10955-012-0687-3

10. Evans, Cohen EGD, Morriss. Probability of second law violations in shearing steady states. doi: 10.1103/PhysRevLett.71.2401

11. Jarzynski. Nonequilibrium equality for free energy differences. doi: 10.1103/PhysRevLett.78.2690

12. Crooks. Nonequilibrium measurements of free energy differences for microscopically reversible Markovian systems. doi: 10.1023/A:1023208217925

13. Crooks. Entropy production fluctuation theorem and the nonequilibrium work relation for free energy differences. doi: 10.1103/PhysRevE.60.2721

14. Lebowitz, Spohn. A Gallavotti-Cohen-type symmetry in the large deviation functional for stochastic dynamics. doi: 10.1023/A:1004589714161

15. Ito, Oizumi, Amari S-I. Unified framework for the entropy production and the stochastic interaction based on information geometry. doi: 10.1103/PhysRevResearch.2.033048

16. Evans, Searles. The fluctuation theorem. doi: 10.1080/00018730210155133

17. Seifert. Stochastic thermodynamics, fluctuation theorems and molecular machines. doi: 10.1088/0034-4885/75/12/126001

18. Gaspard, P. Time Asymmetry in Nonequilibrium Statistical Mechanics. In Special Volume in Memory of Ilya Prigogine, 83–133 (John Wiley & Sons, Ltd, 2007).

19. Salinas, S. R. A. The Ising Model. In Introduction to Statistical Physics, Graduate Texts in Contemporary Physics (ed. Salinas, S. R. A.) 257–276 (Springer New York, New York, NY, 2001).

20. Ackley, Hinton, Sejnowski. A learning algorithm for Boltzmann machines. doi: 10.1207/s15516709cog0901_7

21. Witoelar, Roudi. Neural network reconstruction using kinetic Ising models with memory. doi: 10.1186/1471-2202-12-S1-P274

22. Donner, Opper. Inverse Ising problem in continuous time: a latent variable approach. doi: 10.1103/PhysRevE.96.062104

23. Schneidman, Berry, Segev, Bialek. Weak pairwise correlations imply strongly correlated network states in a neural population. doi: 10.1038/nature04701

24. Cocco, Leibler, Monasson. Neuronal couplings between retinal ganglion cells inferred by efficient inverse statistical physics methods. doi: 10.1073/pnas.0906705106

25. Shimazaki, Amari S-i, Brown, Grün. State-space analysis of time-varying higher-order spike correlation for multiple neural spike train data. doi: 10.1371/journal.pcbi.1002385

26. Tyrcha, Roudi, Marsili, Hertz. The effect of nonstationarity on models inferred from neural data. doi: 10.1088/1742-5468/2013/03/P03005

27. Thouless, Anderson, Palmer. Solution of 'Solvable model of a spin glass'. doi: 10.1080/14786437708235992

28. Kappen, Rodríguez. Efficient learning in Boltzmann machines using linear response theory. doi: 10.1162/089976698300017386

29. Roudi, Y., Aurell, E. & Hertz, J. A. Statistical physics of pairwise probability models. Front. Comput. Neurosci. doi: 10.3389/neuro.10.022.2009 (2009).

30. Roudi, Tyrcha, Hertz. Ising model for neural data: model quality and approximate methods for extracting functional connectivity. doi: 10.1103/PhysRevE.79.051915

31. Donner, Obermayer, Shimazaki. Approximate inference for time-varying interactions and macroscopic dynamics of neural populations. doi: 10.1371/journal.pcbi.1005309

32. Plefka. Convergence condition of the TAP equation for the infinite-ranged Ising spin glass model. doi: 10.1088/0305-4470/15/6/035

33. Tanaka. Mean-field theory of Boltzmann machine learning. doi: 10.1103/PhysRevE.58.2302

34. Tanaka, T. A theory of mean field approximation. In Advances in Neural Information Processing Systems, 351–357 (1999).

35. Bhattacharyya, C. & Keerthi, S. S. Information geometry and Plefka's mean-field theory. J. Phys. A 33, 1307 (2000).

36. Tanaka, T. Information Geometry of Mean-Field Approximation. In Advanced Mean Field Methods: Theory and Practice, 351–360 (MIT Press, 2001).

37. Amari, S., Ikeda, S. & Shimokawa, H. Information Geometry of Alpha-Projection in Mean Field Approximation. In Advanced Mean Field Methods: Theory and Practice (MIT Press, 2001).

38. Kappen, Spanjers. Mean field theory for asymmetric neural networks. doi: 10.1103/PhysRevE.61.5658

39. Roudi, Hertz. Dynamical TAP equations for non-equilibrium Ising spin glasses. doi: 10.1088/1742-5468/2011/03/P03031

40. Roudi, Hertz. Mean field theory for nonequilibrium network reconstruction. doi: 10.1103/PhysRevLett.106.048702

41. Mézard, Sakellariou. Exact mean-field inference in asymmetric kinetic Ising systems.

42. Mahmoudi, Saad. Generalized mean field approximation for parallel dynamics of the Ising model. doi: 10.1088/1742-5468/2014/07/P07001

43. Bachschmid-Romano, Battistin, Opper, Roudi. Variational perturbation and extended Plefka approaches to dynamics on random networks: the case of the kinetic Ising model. doi: 10.1088/1751-8113/49/43/434003

44. Pressé, Ghosh, Lee, Dill. Principles of maximum entropy and maximum caliber in statistical physics. doi: 10.1103/RevModPhys.85.1115

45. Amari, S. & Nagaoka, H. Methods of information geometry. Vol. 191 (American Mathematical Soc., 2007).

46. Amari, S. Information geometry and its applications, Vol. 194 (Springer, 2016).

47. Amari, Kurata, Nagaoka. Information geometry of Boltzmann machines. doi: 10.1109/72.125867

48. Oizumi, Tsuchiya, Amari S-I. Unified framework for information integration based on information geometry. doi: 10.1073/pnas.1603583113

49. Saul, L. K. & Jordan, M. I. Exploiting tractable substructures in intractable networks. In Advances in Neural Information Processing Systems 486–492 (1996).

50. Zhang, Barato. Critical behavior of entropy production and learning rate: Ising model with an oscillating field. doi: 10.1088/1742-5468/2016/11/113207

51. Crochik, Tomé. Entropy production in the majority-vote model. doi: 10.1103/PhysRevE.72.057103

52. Noa, Harunari, de Oliveira, Fiore. Entropy production as a tool for characterizing nonequilibrium phase transitions. doi: 10.1103/PhysRevE.100.012104

53. Sessak, Monasson. Small-correlation expansions for the inverse Ising problem. doi: 10.1088/1751-8113/42/5/055001

54. Granot-Atedgi, Tkačik, Segev, Schneidman. Stimulus-dependent maximum entropy models of neural population codes. doi: 10.1371/journal.pcbi.1002922

55. Cofré, Videla, Rosas. An introduction to the non-equilibrium steady states of maximum entropy spike trains. doi: 10.3390/e21090884

56. Lynn, C. W., Cornblath, E. J., Papadopoulos, L., Bertolero, M. A. & Bassett, D. S. Non-equilibrium dynamics and entropy production in the human brain. Preprint at arXiv 2005.02526 (2020).

57. Schnakenberg. Network theory of microscopic and macroscopic behavior of master equation systems. doi: 10.1103/RevModPhys.48.571

58. Aguilera, M. A unifying framework for mean field theories of asymmetric kinetic Ising systems [Dataset]. Zenodo doi: 10.5281/zenodo.4318983 (2020).

59. Aguilera, M. A unifying framework for mean field theories of asymmetric kinetic Ising systems [Code]. GitHub doi: 10.5281/zenodo.4357634 (2020).