Biocomputation

The prevailing modern scientific paradigm of the brain is a computational one. But if the brain is a computer—which is an 'if'—it must have operating principles, abilities and limitations that are radically different to those of artificial computers. In this session, talks will explore diverse topics within quantitative neuroscience that consider the brain as a device for computation, broadly conceived.

Session Chairs

Professor Dan V. Nicolau Jr (King’s College London)

Yasmine Ayman (Harvard University)

Keynote Talks

Professor Wolfgang Maass (Technische Universität Graz): Local prediction-learning in high-dimensional spaces enables neural networks to plan

Professor Sophie Deneve (Ecole Normale Supérieure, Paris)

Invited Talks

Professor Andrew Adamatzky (UWE)

Professor Christine Grienberger (Brandeis): Dendritic computations underlying experience-dependent hippocampal representation

Spotlight Talks

Paul Haider (University of Bern): Backpropagation through space, time and the brain

Deng Pan (Oxford): Structure learning in the human hippocampus and orbitofrontal cortex

Francesca Mignacco (CUNY Graduate Center & Princeton University): Nonlinear manifold capacity theory with contextual information

Yuanxiang Gao (Institute of Theoretical Physics, Chinese Academy of Sciences): A computational model of learning flexible navigation in a maze by layout-conforming replay of place cells

Angus Chadwick (University of Edinburgh): Rotational Dynamics Enables Noise-Robust Working Memory

Nischal Mainali (ELSC, Hebrew University of Jerusalem): A unified mathematical model of place field statistics across dimensionalities and species

Carla Griffiths (Sainsbury Wellcome Centre): Neural mechanisms of auditory perceptual constancy emerge in trained animals

Harsha Gurnani (University of Washington): Feedback controllability constrains learning timescales of motor adaptation

Arash Golmohammadi (Department for Neuro- and Sensory Physiology, University Medical Center Göttingen): Heterogeneity as an algorithmic feature of neural networks

Salva Ardid (Universitat Politècnica de València): Signal-to-noise optimization: gaining insight into information processing in neural networks

Sacha Sokoloski (University of Tuebingen): Analytically-tractable hierarchical models for neural data analysis and normative modelling

Alejandro Chinea Manrique de Lara (UNED): Cetacean's Brain Evolution: The Intriguing Loss of Cortical Layer IV and the Thermodynamics of Heat Dissipation in the Brain

Adam Manoogian (Monash University): Contextual Inference Underlies Decision Making in Schizophrenia: An Active Inference Model

Keynote Talks

Technische Universität Graz

Local prediction-learning in high-dimensional spaces enables neural networks to plan

Planning and problem solving are cornerstones of higher brain function. But we do not know how the brain does this. We show that learning a suitable cognitive map of the problem space suffices. Furthermore, this can be reduced to learning to predict the next observation through local synaptic plasticity. Importantly, the resulting cognitive map encodes relations between actions and observations, and its emergent high-dimensional geometry provides a sense of direction for reaching distant goals. This quasi-Euclidean sense of direction provides a simple heuristic for online planning that works almost as well as the best offline planning algorithms from AI. If the problem space is a physical space, this method automatically extracts structural regularities from the sequence of observations that it receives, so that it can generalize to unseen parts. This speeds up learning of navigation in 2D mazes and of locomotion with complex actuator systems, such as legged bodies. This is joint work with Christoph Stöckl and Yukun Yang. Details: Nature Communications, 15(1), 2024.
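The planning heuristic described above can be illustrated with a toy sketch: each state carries a high-dimensional code, a learned predictor maps state-action pairs to predicted next-state codes, and planning greedily picks the action whose prediction lies closest to the goal code. Everything below (the random codes, the toy transition table, the function name greedy_plan) is an illustrative stand-in, not the model from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cognitive map: each discrete state has a high-dimensional code,
# and a learned predictor maps (state, action) -> predicted next-state code.
n_states, n_actions, dim = 25, 4, 64
codes = rng.normal(size=(n_states, dim))                         # stand-in for learned codes
next_state = rng.integers(n_states, size=(n_states, n_actions))  # toy transition table
predicted_code = codes[next_state]                               # (state, action, dim) predictions

def greedy_plan(start, goal, max_steps=20):
    """At each step, pick the action whose predicted next code lies closest to the goal code."""
    path, s = [start], start
    for _ in range(max_steps):
        dists = np.linalg.norm(predicted_code[s] - codes[goal], axis=1)
        a = int(np.argmin(dists))          # quasi-Euclidean 'sense of direction'
        s = int(next_state[s, a])
        path.append(s)
        if s == goal:
            break
    return path

print(greedy_plan(start=0, goal=7))
```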

Ecole Normale Supérieure, Paris

TBA

TBA

Invited Talks

University of the West of England

TBA

TBA

Brandeis University

Dendritic computations underlying experience-dependent hippocampal representation

TBA

Spotlight Talks

University of Bern

Backpropagation through space, time and the brain

In Machine Learning (ML), the answer to the spatiotemporal credit assignment problem is almost universally given by the error backpropagation algorithm, through both space (BP) and time (BPTT). However, BP(TT) is well known to rely on biologically implausible assumptions, in particular its dependence on both spatially and temporally non-local information. Here, we introduce Generalized Latent Equilibrium (GLE), a computational framework for spatio-temporal credit assignment in dynamical physical systems. We start by defining an energy based on neuron-local mismatches, from which we derive both neuronal dynamics via stationarity and parameter dynamics via gradient descent. The resulting dynamics can be interpreted as a real-time, biologically plausible approximation of BPTT in deep cortical networks with continuous-time, leaky neuronal dynamics and continuously active, local synaptic plasticity. GLE exploits the ability of biological neurons to phase-shift their output rate with respect to their membrane potential in order to map time-continuous inputs to neuronal space and to enable the temporal inversion of feedback signals, which is essential for approximating the adjoint states needed to estimate useful parameter updates.
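As a rough illustration of two of the ingredients named above, the sketch below simulates a single leaky neuron whose output rate is a prospective (phase-advanced) function of its membrane potential, r = phi(u + tau du/dt), and whose weight is updated by a local mismatch between its rate and a top-down target. This is a caricature under simplifying assumptions, not the GLE framework itself; all constants and names are illustrative.

```python
import numpy as np

# A caricature of two GLE ingredients: (i) prospective ("phase-advanced") output
# rates r = phi(u + tau * du/dt), and (ii) a neuron-local mismatch driving a
# gradient-like weight update. All names and constants here are illustrative.
phi = np.tanh
tau, dt, eta = 10.0, 0.1, 1e-3
T = int(200 / dt)

w = 0.1                        # single plastic weight (input -> neuron)
u, u_prev = 0.0, 0.0           # membrane potential and its previous value

x = np.sin(np.linspace(0, 8 * np.pi, T))                     # time-continuous input
target = 0.5 * np.sin(np.linspace(0, 8 * np.pi, T) + 0.3)    # top-down target rate

for t in range(T):
    du = (-u + w * x[t]) / tau                 # leaky membrane dynamics
    u_prev, u = u, u + dt * du
    r = phi(u + tau * (u - u_prev) / dt)       # prospective output rate
    mismatch = target[t] - r                   # neuron-local error signal
    w += eta * mismatch * x[t]                 # local, continuously active plasticity

print(f"learned weight: {w:.3f}")
```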

CUNY Graduate Center & Princeton University

Nonlinear manifold capacity theory with contextual information

Neural systems efficiently process information through high-dimensional representations. Understanding the underlying physical principles presents a fundamental challenge at the interface of theoretical neuroscience and machine learning. A commonly adopted approach involves the analysis of statistical and geometrical attributes of neural representations as population-level mechanistic descriptors of task implementation. One of these population-geometry metrics is the invariant object classification capacity. However, this metric has so far been limited to linearly separable settings. Here, we propose a theoretical framework that overcomes this limitation by leveraging contextual information about the input. We derive an exact formula for the context-dependent capacity that depends on manifold geometry and context correlations. We test our theoretical predictions on synthetic and real manifolds and find good agreement with numerical simulations. The increased expressivity of our framework allows us to capture representation untangling in deep networks at the early stages of the layer hierarchy. Our method is data-driven and widely applicable across datasets and models.
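For readers unfamiliar with the baseline quantity being generalized here, the sketch below numerically estimates the classical, context-free linear classification capacity: the fraction of random dichotomies of P points in N dimensions that are linearly separable, as a function of the load alpha = P/N (the classical Cover-type transition sits near alpha of 2). The hard-margin test via LinearSVC is an approximation, and the context-dependent capacity derived in the abstract is not implemented here.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Estimate the fraction of random dichotomies of P points in N dimensions that
# are linearly separable, as a crude numerical stand-in for classification capacity.
rng = np.random.default_rng(0)
N, n_trials = 30, 50

for alpha in (1.0, 1.5, 2.0, 2.5, 3.0):
    P = int(alpha * N)
    separable = 0
    for _ in range(n_trials):
        X = rng.normal(size=(P, N))
        y = rng.choice([-1, 1], size=P)
        clf = LinearSVC(C=1e4, max_iter=20000).fit(X, y)   # approximate hard-margin test
        separable += clf.score(X, y) == 1.0
    print(f"alpha={alpha:.1f}  P(separable) ~ {separable / n_trials:.2f}")
```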

Institute of Theoretical Physics, Chinese Academy of Sciences

A computational model of learning flexible navigation in a maze by layout-conforming replay of place cells

Recent experimental observations have shown that the reactivation of hippocampal place cells (PC) during sleep or wakeful immobility depicts trajectories that can go around barriers and can flexibly adapt to a changing maze layout. However, existing computational models of replay fall short of generating such layout-conforming replay, restricting their usage to simple environments such as linear tracks or open fields. In this paper, we propose a computational model that generates layout-conforming replay and explains how such replay drives the learning of flexible navigation in a maze. First, we propose a Hebbian-like rule to learn the inter-PC synaptic strength during exploration. Then we use a continuous attractor network (CAN) with feedback inhibition to model the interaction among place cells and hippocampal interneurons. The activity bump of place cells drifts along paths in the maze, which models layout-conforming replay. During replay in sleep, the synaptic strengths from place cells to striatal medium spiny neurons (MSN) are learned by a novel dopamine-modulated three-factor rule to store place-reward associations. During goal-directed navigation, the CAN periodically generates replay trajectories from the animal’s location for path planning, and the trajectory leading to a maximal MSN activity is followed by the animal. We have implemented our model in a high-fidelity virtual rat in the MuJoCo physics simulator. Extensive experiments have demonstrated that its superior flexibility during navigation in a maze is due to a continuous re-learning of inter-PC and PC-MSN synaptic strength.
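The two learning rules mentioned above can be sketched in a few lines: a Hebbian-like rule that links co-active place cells during exploration, and a three-factor rule in which a pre/post eligibility trace is converted into a PC-to-MSN weight change only when a dopamine signal arrives. The exact forms, constants, and the toy activity bump below are illustrative assumptions, not the published model.

```python
import numpy as np

# Illustrative versions of the two plasticity rules described above.
# W_pc: inter-place-cell weights learned during exploration;
# W_msn: place-cell -> MSN weights learned with a dopamine-gated three-factor rule.
rng = np.random.default_rng(1)
n_pc, n_msn = 100, 10
W_pc = np.zeros((n_pc, n_pc))
W_msn = np.zeros((n_msn, n_pc))
eta_pc, eta_msn, elig_decay = 0.01, 0.05, 0.8

def hebbian_update(W, r_pre, r_post, eta=eta_pc, w_max=1.0):
    """Hebbian-like rule with a soft bound: co-active place cells get linked."""
    return np.clip(W + eta * np.outer(r_post, r_pre), 0.0, w_max)

def three_factor_update(W, r_pc, r_msn, dopamine, elig, eta=eta_msn):
    """Pre x post activity feeds an eligibility trace; the trace becomes a
    weight change only when the dopamine signal arrives."""
    elig = elig_decay * elig + np.outer(r_msn, r_pc)
    return W + eta * dopamine * elig, elig

# Toy usage: a bump of activity moving along the place-cell sheet, reward at the end.
elig = np.zeros_like(W_msn)
for step in range(50):
    r_pc = np.exp(-0.5 * ((np.arange(n_pc) - 2 * step) / 3.0) ** 2)
    W_pc = hebbian_update(W_pc, r_pc, r_pc)
    r_msn = W_msn @ r_pc + 0.01 * rng.normal(size=n_msn)
    dopamine = 1.0 if step == 49 else 0.0
    W_msn, elig = three_factor_update(W_msn, r_pc, r_msn, dopamine, elig)

print(W_pc.max(), np.abs(W_msn).max())
```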

University of Edinburgh

Rotational Dynamics Enables Noise-Robust Working Memory

Working memory is fundamental to higher-order cognitive function, yet the circuit mechanisms through which memoranda are maintained in neural activity after removal of sensory input remain subject to vigorous debate. Prominent theories propose that stimuli are encoded either in stable, persistent activity patterns configured through recurrent attractor dynamics or in dynamic, time-varying patterns of population activity brought about through non-normal or feedforward network architectures. However, the optimal dynamics for working memory, particularly in the face of ongoing neuronal noise, have not been resolved. Here, we address this question within the analytically tractable setting of linear recurrent neural networks. We develop a novel method to optimise continuous-time linear RNNs driven by Gaussian noise to solve working memory tasks, without requiring forward simulation or backpropagation through time. Applying this optimisation method yields a novel and previously overlooked mechanism for working memory maintenance that combines non-normal and rotational dynamics. To test whether these dynamics are merely a consequence of our optimisation method, we derive analytical expressions for the updates generated by backpropagation through time, which produce near-identical learning dynamics to those of our method. Finally, we show that the optimised networks replicate core features of experimentally observed neural population activity in prefrontal cortex, including "dynamic coding" (as quantified by both cross-temporal decoding analysis and switching of single-neuron selectivity over the delay period) despite a stable representational geometry. Taken together, our findings suggest that memoranda are stored and maintained during working memory using a combination of non-normal and rotational dynamics, which supports a stable and optimally noise-robust representation of working memory contents within a time-varying, dynamic population code.
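A flavour of why linear RNNs are analytically convenient in this setting: for dx/dt = Ax + noise, both the deterministic memory trace after a delay and the stationary noise covariance are available in closed form, so a readout SNR can be evaluated without forward simulation. The sketch below does exactly that for a random stable network; it is a generic linear-systems computation, not the optimisation method of the talk.

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

# Noise-robust memory in a linear RNN dx/dt = A x + noise, evaluated without
# simulation: a stimulus is loaded along direction v, and we ask how well an
# optimal linear readout could recover it after a delay T.
rng = np.random.default_rng(0)
n, sigma, T = 50, 0.1, 5.0

J = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))
A = -np.eye(n) + 0.9 * J                   # stable recurrent dynamics
v = rng.normal(size=n)
v /= np.linalg.norm(v)                     # stimulus-loading direction

signal = expm(A * T) @ v                   # deterministic memory trace at delay T
Sigma = solve_continuous_lyapunov(A, -sigma**2 * np.eye(n))   # stationary noise covariance

snr = signal @ np.linalg.solve(Sigma, signal)   # SNR of the optimal linear readout
print(f"readout SNR at T={T}: {snr:.2f}")
```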

ELSC, Hebrew University of Jerusalem

A unified mathematical model of place field statistics across dimensionalities and species

Classically, spatially coding place cells in the hippocampus express unimodal, stereotyped, bell-shaped firing fields. Recent recordings from CA1, however, reveal that place cells in large environments typically fire in multiple locations. Furthermore, the multiple firing fields of individual cells, as well as those of the whole population, vary in shape and size, often deviating substantially from the classical form. We find that a mathematical model that generates firing fields by thresholding a Gaussian process explains a wide range of statistics of the observed place fields. The model simultaneously provides excellent fits to the distribution of field sizes and the distribution of inter-field distances in several data sets that differ in species and in the dimensionality of the environment: bats and rodents in 1D, 2D, and 3D enclosures. In addition, the model makes quantitative predictions about the statistics of field shapes – the distribution of the number of local maxima within a field and the joint distribution of a field’s width and its peak firing rate – as well as the Euler characteristics of the fields. These predictions are all borne out when checked against experimental data, without refitting any parameters.
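A minimal version of the generative idea, with all parameters chosen purely for illustration: sample a smooth Gaussian process along a 1D track, threshold it, and read off field sizes and inter-field distances.

```python
import numpy as np

# Sample a smooth Gaussian process along a 1D track and threshold it to obtain
# place fields; kernel, length scale, and threshold are illustrative choices.
rng = np.random.default_rng(0)
x = np.linspace(0, 200, 2000)              # positions (cm)
ell, theta = 15.0, 1.0                     # GP length scale and threshold

K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ell**2)
g = np.linalg.cholesky(K + 1e-8 * np.eye(len(x))) @ rng.normal(size=len(x))

above = (g > theta).astype(int)            # supra-threshold regions = place fields
d = np.diff(above)
starts, ends = np.flatnonzero(d == 1), np.flatnonzero(d == -1)
if len(starts) and len(ends) and ends[0] < starts[0]:
    ends = ends[1:]                        # drop a field cut off at the track start
m = min(len(starts), len(ends))
sizes = x[ends[:m]] - x[starts[:m]]
gaps = x[starts[1:m]] - x[ends[:m - 1]]

print("field sizes (cm):", np.round(sizes, 1))
print("inter-field gaps (cm):", np.round(gaps, 1))
```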

Sainsbury Wellcome Centre

Neural mechanisms of auditory perceptual constancy emerge in trained animals

Although considered a form of generalisation, the neural mechanisms of perceptual constancy remain challenging to elucidate. To examine the neural basis of perceptual constancy, we trained four ferrets in a Go/No-Go task in which they identified the word 'instruments' in a stream drawn from 54 probe words. Once trained, we varied the perceived pitch across the whole stream. Gradient-boosted trees revealed that pitch was a minor factor in choice uncertainty. Using an LSTM, we decoded neural responses for each target-versus-probe word combination (trained=715, naive=674, 4 animals), using both the whole word and cumulative 40 ms windows over time. A unit that encodes acoustics will vary its peak decoding time across probe words, whereas a categorical response will remain constant in its peak. We found that the target word was robustly represented in trained animals, as assessed by higher and more invariant decoding scores. Trained animals’ units could generalise over multiple distractor words and had less variance in peak time when decoding over time compared to naive units. Overall, neural responses become robust and generalisable to the target, supporting the idea that nodes that take on individualised roles become adaptable to novel environments.
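The cumulative-window decoding logic can be sketched as follows, with synthetic spike counts and a linear decoder standing in for the recorded data and the LSTM; the 40 ms bin width follows the abstract, everything else is an assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Decode target vs probe from spike counts in cumulative 40 ms windows and
# track when decoding peaks; synthetic data and a linear decoder stand in for
# the recordings and the LSTM used in the study.
rng = np.random.default_rng(0)
n_trials, n_units, n_bins = 200, 30, 15             # 15 x 40 ms = 600 ms of response
labels = rng.integers(2, size=n_trials)             # 0 = probe word, 1 = target word
rates = rng.poisson(lam=2.0, size=(n_trials, n_units, n_bins)).astype(float)
rates[labels == 1, :, 5:] += 1.0                    # toy effect: target modulates later bins

scores = []
for t in range(1, n_bins + 1):
    X = rates[:, :, :t].sum(axis=2)                 # cumulative window up to bin t
    scores.append(cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=5).mean())

peak = int(np.argmax(scores)) + 1
print(f"peak decoding at {40 * peak} ms, accuracy {max(scores):.2f}")
```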

Donders Institute for Brain, Cognition and Behaviour

Correlations are ruining your gradient descent

Biological nervous systems are noisy. Thus, the manner by which robust learning occurs at synaptic connections remains unclear. Furthermore, biologically plausible and local learning rules often struggle to scale to deep networks and to difficult tasks. This has meant that a number of algorithms have been set aside during the ongoing successes of modern machine learning. Here we show that node perturbation (NP), a local learning algorithm which relies upon noise injection into a neural network, can be scaled to deep networks and difficult tasks. First, we relate noise-based learning to directional derivatives and show that even systems in which the noise source is inaccessible can be trained. Second, we add to our network architectures a neural activity decorrelating mechanism. This decorrelation mechanism enables biologically plausible and local learning algorithms to scale beyond shallow networks. We show theoretically that this decorrelation forms a bridge from regular gradient descent to natural gradient descent, helping to overcome scaling issues in the parameter-loss relation. Finally, we show that this bridge enables significantly faster training by backpropagation, as well as scaling up of alternative learning algorithms.
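A minimal sketch of the two ingredients named above, under simplifying assumptions (one linear layer, a regression loss, ZCA-style input whitening standing in for the decorrelation mechanism): node perturbation estimates the gradient from the loss change caused by injected noise, and decorrelated inputs make that update better conditioned.

```python
import numpy as np

# Node perturbation on one linear layer, with simple input decorrelation.
# This is a schematic of the two ingredients in the abstract, not the authors'
# full architecture or decorrelation rule.
rng = np.random.default_rng(0)
n_in, n_out, batch, sigma, lr = 20, 5, 256, 0.1, 0.05

W_true = rng.normal(size=(n_out, n_in))    # target mapping for a toy regression task
W = np.zeros((n_out, n_in))

def whiten(X):
    """ZCA-style decorrelation of the layer inputs."""
    C = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    vals, vecs = np.linalg.eigh(C)
    return X @ vecs @ np.diag(vals**-0.5) @ vecs.T

for step in range(500):
    X = whiten(rng.normal(size=(batch, n_in)))
    Y = X @ W_true.T                                 # regression targets
    clean = X @ W.T
    noise = sigma * rng.normal(size=clean.shape)     # perturb the layer's nodes
    noisy = clean + noise
    dL = ((noisy - Y) ** 2).sum(1) - ((clean - Y) ** 2).sum(1)     # per-example loss change
    W -= lr / (sigma**2 * batch) * (dL[:, None] * noise).T @ X     # node-perturbation update

print("remaining error:", np.linalg.norm(W - W_true) / np.linalg.norm(W_true))
```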

University of Washington

Feedback controllability constrains learning timescales of motor adaptation

Previous work exploring the structure of primary motor cortex (M1) activity has largely assumed autonomous dynamics, and related work on learning in brain-computer interfaces (BCIs) has focused on local mechanisms (such as M1 synaptic plasticity). However, recent experimental evidence suggests that M1 activity is continuously modified by sensory feedback and produces corrections for noise and external perturbations, suggesting a critical need to model this interaction between feedback and intrinsic M1 dynamics. Here we propose that for fast adaptation to BCI decoder changes, M1 dynamics can be effectively modified by changing inputs, including by flexible remapping of sensory feedback. Using recurrent network models of BCI under feedback control, we show how the rate of such adaptation is constrained by pre-existing structured dynamics. Lastly, we show that the geometry of low-dimensional network activity can affect the design and robustness of BCI decoders. By incorporating adaptive controllers upstream of M1, our work highlights the need to model input-dependent latent dynamics, and clarifies how constraints on learning arise from both the statistical characteristics and the underlying dynamical structure of neural activity.
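One textbook way to quantify how feedback controllability constrains such adaptation is the controllability Gramian of the latent dynamics, which gives the minimum input energy needed to move activity along a given (e.g., decoder) direction. The sketch below computes this for a random stable network; it is offered as a gloss on the abstract's notion, not the authors' model.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Controllability Gramian of latent dynamics dx/dt = A x + B u, and the input
# energy needed to move activity along a particular (e.g., decoder) direction.
rng = np.random.default_rng(0)
n, k = 40, 5                               # latent dimension, number of input channels
J = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))
A = -np.eye(n) + 0.8 * J                   # stable latent dynamics
B = rng.normal(size=(n, k)) / np.sqrt(k)   # input (feedback) directions

Wc = solve_continuous_lyapunov(A, -B @ B.T)   # solves A Wc + Wc A^T + B B^T = 0

d = rng.normal(size=n)
d /= np.linalg.norm(d)                     # hypothetical decoder direction
energy = d @ np.linalg.solve(Wc, d)        # min input energy for unit displacement along d
print(f"energy needed along decoder direction: {energy:.2f}")
```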

Department for Neuro- and Sensory Physiology, University Medical Center Göttingen

Heterogeneity as an algorithmic feature of neural networks

To gain tractability, theorists reduce reality to abstract models. Yet, a priori, it is unclear whether such a reduction discards any fundamental features of the phenomenon under investigation. Consequently, despite overwhelming evidence for neuronal heterogeneity, research on biological networks predominantly focuses on networks of homogeneous neurons. We relaxed this constraint by systematically controlling the heterogeneity level in otherwise identical networks. Several networks were trained to perform diverse cognitive tasks such as memory, prediction, and processing. We demonstrated that, even in small networks, heterogeneous ones outperform their homogeneous counterparts. These results suggest that heterogeneity may be more than a theoretical complication; it might be an algorithmic advantage, especially in systems with finitely many units. Given the ubiquity of heterogeneity, it is likely that biological organisms have evolved to exploit this trait under environmental constraints and parasitic pressure. As such, we predict that heterogeneity should be a robust feature. Indeed, this prediction aligns with recent experimental findings showing that cellular heterogeneity profiles are invariant across age and external perturbations.
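One simple way to realise "systematically controlling the heterogeneity level" is to draw each neuron's time constant from a distribution whose spread is set by a single knob while keeping the mean fixed, as in the sketch below; the log-normal choice and parameter values are illustrative assumptions, not the authors' protocol.

```python
import numpy as np

# Control the heterogeneity of neuronal time constants with a single knob,
# keeping the population mean (and everything else) fixed.
rng = np.random.default_rng(0)
n_neurons, tau_mean = 200, 20.0            # ms

def make_time_constants(hetero_level, n=n_neurons):
    z = rng.normal(size=n)
    tau = tau_mean * np.exp(hetero_level * z)
    return tau * tau_mean / tau.mean()     # re-normalise so the mean stays fixed

for h in (0.0, 0.2, 0.5):
    tau = make_time_constants(h)
    print(f"heterogeneity={h:.1f}: mean={tau.mean():.1f} ms, CV={tau.std() / tau.mean():.2f}")
```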

Universitat Politècnica de València

Signal-to-noise optimization: gaining insight into information processing in neural networks

The current view in neuro-AI is that neural networks compress representations into low-dimensional manifolds (Gao et al., 2015). A more recent study challenges this view, arguing that neural networks benefit from high dimensionality (Elmoznino et al., 2022). We argue that learning in neural networks may optimize the signal-to-noise ratio (SNR), so that neural networks can benefit from both feature compression and expansion to (i) increase the signal and (ii) diminish the noise, while (iii) mapping inputs onto outputs. Yet, if the SNR is optimized through learning, a causal relationship should exist with model performance (e.g., accuracy, or the probability of selecting the correct response). To test this, we introduce an SNR metric computed as the ratio of the signal (defined as the squared distance between clusters) to the noise (defined as the sum of the variances of the clusters). Unlike Sorscher et al. (2022), our metric is optimized: we determine the optimal axis that maximizes the SNR, which may differ from the centroid axis. Projecting the data onto this axis reduces the signal but reduces the noise even more strongly, resulting in higher SNRs. We show this in panel (a): the SNR along the axis in black is larger than that along the centroid axis. In this example, there is no overlap with respect to the optimal axis, whereas the projections onto the centroid axis overlap significantly (not shown). In panel (b), we compute the distribution of the angles formed between these two axes when applied to the linear output of n=40 neural networks classifying MNIST digits by parity. Using a perturbative analysis, we show that SNRs are predictive of the probability of selecting the correct response (panel c) and of the accuracy (not shown). In this analysis, we gradually shortened the distance between the manifold centroids of each category while keeping the variance untouched. This analysis shows that performance relates to the SNR while the dimensionality of the data remains constant.
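Under a Fisher-ratio reading of this metric, the SNR of a 1D projection w is (w·Δμ)² / (wᵀΣ₁w + wᵀΣ₂w), and the axis maximising it is w* ∝ (Σ₁+Σ₂)⁻¹Δμ, which generally differs from the centroid axis Δμ. The sketch below compares the two on synthetic clusters; whether this matches the authors' optimisation procedure exactly is an assumption.

```python
import numpy as np

# SNR of a 1D projection, following the definition above:
#   SNR(w) = (w . (mu1 - mu2))^2 / (w' S1 w + w' S2 w),
# compared along the centroid axis and along the Fisher-optimal axis.
rng = np.random.default_rng(0)
d, n = 10, 2000
S = np.diag(np.linspace(0.5, 3.0, d))              # anisotropic within-class covariance
X1 = rng.multivariate_normal(np.zeros(d), S, n)
X2 = rng.multivariate_normal(0.6 * np.ones(d), S, n)

def snr(w, X1, X2):
    w = w / np.linalg.norm(w)
    p1, p2 = X1 @ w, X2 @ w
    return (p1.mean() - p2.mean()) ** 2 / (p1.var() + p2.var())

centroid_axis = X1.mean(0) - X2.mean(0)
fisher_axis = np.linalg.solve(np.cov(X1.T) + np.cov(X2.T), centroid_axis)
print(f"SNR along centroid axis:       {snr(centroid_axis, X1, X2):.2f}")
print(f"SNR along Fisher-optimal axis: {snr(fisher_axis, X1, X2):.2f}")
```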

University of Tuebingen

Analytically-tractable hierarchical models for neural data analysis and normative modelling

Latent variable models (LVMs) are useful in neuroscience both for learning distributions of neural data and for modelling how the brain engages in optimal inference. In practice we often rely on approximation schemes such as variational methods to implement inference and learning with LVMs, yet these schemes can introduce difficult-to-analyze biases and errors, and ideally we would avoid them unless strictly necessary. Towards this end, we present a general theory of hierarchical LVMs for which learning and inference can be implemented exactly. In particular, we derive necessary and sufficient conditions for exact inference and learning in a large class of exponential family LVMs. We then show how these models can be stacked to create novel, hierarchical models that retain their tractable properties. Moreover, we derive general inference and learning algorithms for these models, such as expectation-maximization and Bayesian smoothing, and show that many well-known algorithms are special cases of these general solutions. Finally, we use our theory to develop several novel models, including (i) a hierarchical probabilistic population code with a novel prior that combines a multivariate normal distribution with a Boltzmann machine, and (ii) a hierarchical Gaussian mixture model for clustering high-dimensional data. In summary, we show how to build complex LVMs without relying on unnecessary approximations. In future work we will explore training complex LVMs with variational techniques, while minimizing approximations by using our analytically tractable models as components.
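As a concrete reminder of what exact inference and learning look like in one of those well-known special cases, the sketch below implements expectation-maximisation for a one-dimensional Gaussian mixture, where the E-step posterior and M-step updates are both available in closed form. It is a standard textbook routine, not the hierarchical framework of the talk.

```python
import numpy as np

# Exact EM for a 1D Gaussian mixture: closed-form E-step posteriors and
# closed-form M-step parameter updates, a classical special case of exact
# inference and learning in exponential-family latent-variable models.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(1, 1.0, 700)])

K, n = 2, len(X)
pi, mu, var = np.full(K, 1 / K), np.array([-1.0, 2.0]), np.ones(K)

for _ in range(100):
    # E-step: exact posterior responsibilities p(z = k | x).
    logp = -0.5 * (X[:, None] - mu) ** 2 / var - 0.5 * np.log(2 * np.pi * var) + np.log(pi)
    r = np.exp(logp - logp.max(1, keepdims=True))
    r /= r.sum(1, keepdims=True)
    # M-step: closed-form maximisation of the expected complete-data log-likelihood.
    Nk = r.sum(0)
    pi, mu = Nk / n, (r * X[:, None]).sum(0) / Nk
    var = (r * (X[:, None] - mu) ** 2).sum(0) / Nk

print("weights:", np.round(pi, 2), "means:", np.round(mu, 2), "vars:", np.round(var, 2))
```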

Alejandro Chinea Manrique de Lara

Universidad Nacional de Educación a Distancia

Cetacean's Brain Evolution: The Intriguing Loss of Cortical Layer IV and the Thermodynamics of Heat Dissipation in the Brain

During the transition from the Eocene to the Oligocene epoch, a cooling of ocean temperatures is believed to have affected cetacean brain evolution. Compared to other mammals, the most intriguing feature of cetacean brains is the lack of layer IV throughout the cerebral cortex. A novel interpretation of the evolutionary and functional significance of the loss of layer IV in the cerebral cortex of cetaceans is presented using the intelligence and embodiment hypothesis. This hypothesis, grounded in evolutionary neuroscience, postulates the existence of a common information-processing principle associated with naturally evolved nervous systems that serves as the foundation from which intelligence can emerge and that underlies the efficiency of the brain’s computations. The adaptive function of this neuronal trait is shown to be related to increased heat dissipation in the cerebral cortex, as indicated by the statistical-physics model of the hypothesis. This supports a previous hypothesis correlating thermogenesis with the evolution of large brain sizes in cetaceans, while arguing that these results are not at odds with levels of cognitive complexity beyond those of most other mammals.

Monash University

Contextual Inference Underlies Decision Making in Schizophrenia: An Active Inference Model

The learning and decision-making processes underlying differences in performance on behavioral tasks in schizophrenia patients are not yet clear. Traditionally established cognitive patterns, such as 'jumping to conclusions' and a 'bias against disconfirmatory evidence', are proving inconsistent under scrutiny, or align only with certain symptoms at times of heightened severity. Despite the long history of context misinterpretation in schizophrenia, the influence of contextual inference on decision-making tasks remains underexplored. We argue that including an ongoing process of latent state inference can better capture these behavioral tendencies, such as increased switch rates and a failure to form precise beliefs. Here, we use an active inference agent modified with contextual updating in a sequential predictive inference task. Contextual updating is explored as the effect of nonparametric modulation on a hybrid model. This role is placed in the context of previous findings from classical neural network memory models of schizophrenia, including circular inference. We also compare nonparametric limitations in the generative model to disrupted message passing. Finally, updating and retrieval processes are measured in our model.
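To make the idea of ongoing latent-context inference concrete, the sketch below implements a generic Bayesian observer that tracks which of two contexts generates its observations under an assumed hazard (switch) rate; overestimating that hazard yields more belief switches, loosely illustrating the kind of behavioral pattern discussed above. This is not the active inference model of the talk, and all parameters are illustrative.

```python
import numpy as np

# A generic two-context inference sketch: the agent maintains a belief over the
# current context, updating it with Bayes' rule and an assumed hazard rate.
rng = np.random.default_rng(0)

def simulate(hazard_belief, n_trials=200, p_reward=(0.8, 0.2), true_hazard=0.02):
    context, belief, switches, prev_choice = 0, 0.5, 0, None   # belief = P(context 0)
    for t in range(n_trials):
        if rng.random() < true_hazard:             # rare true context switches
            context = 1 - context
        obs = rng.random() < p_reward[context]     # binary observation
        like_c0 = p_reward[0] if obs else 1 - p_reward[0]
        like_c1 = p_reward[1] if obs else 1 - p_reward[1]
        belief = belief * like_c0 / (belief * like_c0 + (1 - belief) * like_c1)
        belief = belief * (1 - hazard_belief) + (1 - belief) * hazard_belief   # assumed volatility
        choice = int(belief > 0.5)
        switches += int(prev_choice is not None and choice != prev_choice)
        prev_choice = choice
    return switches

for h in (0.02, 0.2):
    print(f"assumed hazard {h:.2f}: {simulate(h)} choice switches")
```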