Spotlight Session 2 (virtual)

Dr Michael Popov

OMCAN network; University of Oxford

Round Numbers and Representational Alignment: The Fundamental Nature of Ramanujan's Theorems

Humans build not only models of external reality but also models of subjective experience (the unconscious, imaginary feelings, inner states). This is a less well-known (at least to LLM alignment) kind of diverging representation, one that plays an important role in the clinical psychology of consciousness pathologies. According to clinical observations, aligned representations that emerge spontaneously from the unconscious are usually shaped by a special class of numbers called "round numbers," or "Jungian archetypal numbers," of the form 2, 4, 10, 12, 13, 14, 28, 32, 40. It is quite surprising that Srinivasa Ramanujan, in a short 1917 article entitled "Proof that almost all numbers n are composed of about log log n prime factors," attempted to find a mathematical theory of round numbers in terms of his theory of superior highly composite numbers (1915). According to Ramanujan (drawing on his field observations in "taxicab number theory"), round numbers are composed of a considerable number of comparatively small prime factors and are exceedingly rare. In today's mathematics, Ramanujan's concept of the radical of an integer can be connected with the famous unsolved ABC conjecture; moreover, it may suggest a new approach to its resolution. Round numbers have remarkable characteristics (described by two of Ramanujan's theorems); hence, it is expected that Ramanujan's mathematics can substantially refine current psychological theories of consciousness as well as improve the concept of superalignment in LLMs.
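As a purely illustrative aside (not part of the abstract), the quantities involved are easy to compute. The sketch below, assuming Python with sympy for factorization, compares the number of distinct prime factors ω(n) against the Hardy-Ramanujan estimate log log n and computes the radical rad(n), the product of the distinct primes dividing n that appears in the ABC conjecture.

```python
import math
from sympy import factorint

def omega(n: int) -> int:
    """Number of distinct prime factors of n."""
    return len(factorint(n))

def radical(n: int) -> int:
    """rad(n): the product of the distinct primes dividing n."""
    r = 1
    for p in factorint(n):
        r *= p
    return r

# "Round" numbers such as 720720 = 2^4 * 3^2 * 5 * 7 * 11 * 13 carry far
# more small prime factors than a typical integer of their size.
for n in [12, 40, 720720, 123456789]:
    print(n, omega(n), round(math.log(math.log(n)), 2), radical(n))
```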

Fidelis.ai

Identifying the active properties of layer 5 myelinated axons with automated and robust optimization of action potential propagation

Rapid saltatory conduction over myelinated axons is shaped by a confluence of biophysical mechanisms. These include the linear conductive and capacitive properties of the myelin sheath and underlying axon, as well as many nonlinearly conducting, differentially distributed ion channels across the neuron, notably of the sodium (NaV) and potassium (KV) subtypes. To understand the precise contribution of each active mechanism to shaping action potential conduction in structures that are difficult to explore experimentally, a computational modelling approach is necessary. Here, using axo-somatic patch-clamp recordings, detailed morphological reconstructions, and experimentally verified membrane mechanisms, we developed a large-scale, robust, parallelized simulation approach for action potential propagation throughout neocortical myelinated pyramidal neurons with axons up to ~800 μm in length.
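For readers unfamiliar with this style of model, the following is a minimal, hypothetical NEURON (Python) sketch of saltatory conduction: active nodes of Ranvier alternating with passive, low-capacitance myelinated internodes. All parameter values are placeholders, not the experimentally fitted mechanisms described in the abstract.

```python
from neuron import h
h.load_file('stdrun.hoc')

nodes, internodes = [], []
for i in range(10):
    node = h.Section(name=f'node{i}')
    node.L, node.diam, node.nseg = 1, 1, 1
    node.insert('hh')                       # active NaV/KV mechanisms
    inter = h.Section(name=f'internode{i}')
    inter.L, inter.diam, inter.nseg = 80, 1, 11
    inter.insert('pas')
    inter.cm = 0.02                         # myelin lowers capacitance
    if nodes:
        node.connect(internodes[-1](1))
    inter.connect(node(1))
    nodes.append(node)
    internodes.append(inter)

stim = h.IClamp(nodes[0](0.5))
stim.delay, stim.dur, stim.amp = 1, 0.5, 2  # nA

t = h.Vector().record(h._ref_t)
v_first = h.Vector().record(nodes[0](0.5)._ref_v)
v_last = h.Vector().record(nodes[-1](0.5)._ref_v)

h.finitialize(-65)
h.continuerun(10)
# Conduction velocity can be estimated from the spike-time difference
# between the first and last nodes and the known path length.
```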

Without bias toward any particular action potential feature, we reproduce single recorded spikes across major metrics: threshold, half-width, peak amplitude, after-hyperpolarization, after-depolarization, and conduction velocity. To determine parameter contributions with statistical robustness and to disentangle non-unique solutions, we seed initial parameters according to a Monte Carlo fractional factorial method. To do so, we created software based on NEURON and ran our simulations on supercomputers at the NSG (SDSC and UT Austin) and EBRAINS (JSC).
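One way to read the seeding strategy, sketched here under assumed parameter names and ranges (none are from the study), is a two-level fractional factorial design over the biophysical parameters, with uniform Monte Carlo jitter added so repeated optimizations start from independent seeds:

```python
import itertools
import random

bounds = {                         # placeholder ranges, not fitted values
    'gnabar_node': (0.5, 3.0),     # S/cm^2
    'gkbar_node':  (0.01, 0.1),
    'cm_myelin':   (0.005, 0.05),  # uF/cm^2
    'ra_axial':    (50, 200),      # ohm*cm
}

full = list(itertools.product([0, 1], repeat=len(bounds)))  # 2^k design
fraction = random.sample(full, k=len(full) // 2)            # half fraction

def seed(levels, jitter=0.1):
    """Turn one factorial row into a jittered initial parameter set."""
    params = {}
    for (name, (lo, hi)), lev in zip(bounds.items(), levels):
        base = lo if lev == 0 else hi
        params[name] = base + random.uniform(-jitter, jitter) * (hi - lo)
    return params

seeds = [seed(lv) for lv in fraction]  # starting points for the optimizer
```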

Our latest optimization results indicate that it is now feasible to find unsupervised, unbiased, and statistically robust values for the biophysical parameters underlying action potential propagation, not only in myelinated axons but also throughout dendritic and axonal trees. An exciting future direction involves extending our axon-tree results to fully reconstructed axonal networks in single cells, to broaden our understanding of axonal computation.

Dr Aslan Satary Dizaji

AutocurriculaLab & Neuro-Inspired Vision

Dimensionality of Intermediate Representations of Deep Neural Networks with Biological Constraints

It is generally believed that deep neural networks and animal brains solve vision tasks by changing the dimensionality of input stimuli across their deep hierarchies. However, the evidence so far shows that in object recognition tasks, deep neural networks first expand and then compress the dimensionality of stimuli, while primate brains do the opposite, first compressing and then expanding it. In this project, it is shown that if two biological constraints, namely the non-negativity of activities and energy efficiency in both activities and weights, are imposed on deep neural networks, then the dimensionality trends of deep neural networks and the primate brain match: in both cases the dimensionality of stimuli is first compressed and then expanded. This result shows how neuroscience can contribute to a better understanding of artificial intelligence.
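As a hedged illustration of the two constraints and the dimensionality measurement (not the project's actual code), the sketch below uses ReLU nonlinearities for non-negative activities, L1 penalties on activities and weights for energy efficiency, and the participation ratio of the activation covariance spectrum as a per-layer dimensionality estimate:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Non-negativity of activities is enforced by ReLU nonlinearities.
net = nn.Sequential(nn.Linear(784, 512), nn.ReLU(),
                    nn.Linear(512, 256), nn.ReLU(),
                    nn.Linear(256, 10))

def forward_collect(x):
    """Run the network, collecting each layer's non-negative activations."""
    acts = []
    for layer in net:
        x = layer(x)
        if isinstance(layer, nn.ReLU):
            acts.append(x)
    return x, acts

def loss_fn(logits, targets, acts, lam_a=1e-4, lam_w=1e-4):
    task = nn.functional.cross_entropy(logits, targets)
    act_cost = sum(a.abs().mean() for a in acts)            # activity energy
    w_cost = sum(p.abs().mean() for p in net.parameters())  # weight energy
    return task + lam_a * act_cost + lam_w * w_cost

def participation_ratio(a):
    """Dimensionality: (sum eig)^2 / sum(eig^2) of activation covariance."""
    c = a - a.mean(0)
    eig = torch.linalg.eigvalsh(c.T @ c / len(a))
    return float(eig.sum() ** 2 / (eig ** 2).sum())

x = torch.randn(128, 784)
logits, acts = forward_collect(x)
print([participation_ratio(a.detach()) for a in acts])
```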

LMU Munich

Deciphering Causal Reasoning: Human vs. Language Models

Bayesian Belief Networks (BBNs) provide a widely used framework for representing causal structures. BBNs employ graphical models to depict intricate relationships between variables, connected by directed arrows in acyclic structures that reflect probabilistic dependencies. Deviations from Bayes' theorem in these networks lead to suboptimal reasoning. This study examines two classic BBN structures: Chain (A→B→C) and Common Cause (A←B→C) networks, in which the probability of C should be independent of A given knowledge of B. Humans routinely violate these independence assumptions (Rehder, 2014; Park & Sloman, 2013). We compare mutually exclusive predictions across different accounts of these violations while controlling for valence and prior domain knowledge (N=300; https://osf.io/qaydt).
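The normative claim can be verified numerically. In the minimal sketch below (arbitrary illustrative probabilities), the chain A→B→C factorizes as P(A)P(B|A)P(C|B), so enumeration of the joint distribution confirms that P(C|A,B) does not depend on A once B is known:

```python
p_a = 0.5
p_b_given_a = {True: 0.9, False: 0.2}
p_c_given_b = {True: 0.8, False: 0.1}

def joint(a: bool, b: bool, c: bool) -> float:
    """P(A=a, B=b, C=c) under the chain factorization P(A)P(B|A)P(C|B)."""
    pa = p_a if a else 1 - p_a
    pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
    pc = p_c_given_b[b] if c else 1 - p_c_given_b[b]
    return pa * pb * pc

def p_c_given(a: bool, b: bool) -> float:
    """P(C=1 | A=a, B=b) by enumeration of the joint distribution."""
    den = joint(a, b, True) + joint(a, b, False)
    return joint(a, b, True) / den

# Normatively, knowing A adds nothing once B is known:
assert abs(p_c_given(True, True) - p_c_given(False, True)) < 1e-12
# Human participants nevertheless tend to judge P(C | A=1, B=1) higher
# (Rehder, 2014; Park & Sloman, 2013).
```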

Considering that deviations from normative criteria in BBNs might be rooted in language, we investigate the effects of causal structure on large language models' (LLMs') query responses. Previous research highlights suboptimal causal reasoning in LLMs, yet specific deviations from normativity remain unexplored. We query GPT-3.5-Turbo (OpenAI, 2022), GPT-4 (OpenAI, 2023), and Luminous Supreme Control (Aleph Alpha, 2023) with queries similar to those presented to humans, while manipulating hyperparameters (Temperature, TopP, TopK). Hierarchical mixed-effects models reveal that humans find Chains more causally potent, perhaps because intermediate causes are perceived as reliable mechanisms (Ahn et al., 1995). This finding replicates in LLMs at higher temperatures when human and LLM response distributions are compared using the Earth Mover's Distance (EMD) and entropy. Implications for causal representation theories in human cognition and LLMs are discussed.
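A hedged sketch of the comparison step (placeholder response samples, not the study's data): distributions of human and LLM responses to the same causal query can be compared with SciPy's Earth Mover's Distance and Shannon entropy.

```python
import numpy as np
from scipy.stats import wasserstein_distance, entropy

human = np.array([70, 75, 80, 90, 85, 60])  # e.g. judged P(C|A,B) in percent
llm = np.array([65, 72, 78, 88, 80, 70])    # sampled at a given temperature

emd = wasserstein_distance(human, llm)

# Entropy over a shared binning of the 0-100 response scale
bins = np.linspace(0, 100, 11)
h_hist, _ = np.histogram(human, bins=bins, density=True)
l_hist, _ = np.histogram(llm, bins=bins, density=True)
print(emd, entropy(h_hist + 1e-12), entropy(l_hist + 1e-12))
```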

Attention Tag

Simulating the (Equanimous) Subconscious Mind

This talk proposes a modular software simulation model of the human mind as a reinforcement learning agent, with novel components for subconscious thinking and a reward function (a "happiness score"). The proposed model incorporates prior work on modeling the human mind as a reinforcement learning (RL) agent, continual learning, and attention mechanisms. The RL agent receives sensory inputs from its experience of the external world and generates physical responses (movement of arms, legs, and other body parts, speech, etc.) to maximize the reward function. The model is multimodal in its inputs, hierarchical (across senses, systems, and timescales), and continually learning, with controlled habit formation (plasticity). The system's blocks are Sensory (input), Compute, Reward function, Subconscious, Attention, and Motor (output), each with well-defined inputs and outputs; a sketch of this interface appears below. The proposed model is a conjecture. We pose open questions and offer suggestions on implementing such a model in software, mapping prior public research onto it, vetting the model against known behavior, and allowing independent work on the happiness score and the subconscious module. A better understanding of what makes us happy may allow for better treatments and parallel advances in computing and machine learning, leading to an overall better life. The model can also help us understand the best environments for children and adults to learn in, including moral realignment. This work is inspired by some of the author's readings of ancient literature on meditation. More details are available at https://arvindsaraf.medium.com/modelling-the-mind-e4237435f4b1
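The sketch referenced above is a hypothetical Python skeleton of the proposed architecture; the block names follow the abstract, but every interface, the block ordering, and the trivial reward are assumptions for illustration, not the author's implementation.

```python
from typing import Any, Dict

class Module:
    """Base block: every block consumes and emits a dict of named signals."""
    def step(self, signal: Dict[str, Any]) -> Dict[str, Any]:
        return signal  # identity placeholder

class Sensory(Module): pass
class Attention(Module): pass
class Subconscious(Module): pass
class Compute(Module): pass
class Motor(Module): pass

def happiness_score(signal: Dict[str, Any]) -> float:
    """The reward function; its actual form is one of the open questions."""
    return 0.0

class MindAgent:
    def __init__(self) -> None:
        # Ordering of blocks is an assumption for illustration.
        self.blocks = [Sensory(), Attention(), Subconscious(),
                       Compute(), Motor()]

    def tick(self, world_input: Any):
        signal: Dict[str, Any] = {"world": world_input}
        for block in self.blocks:
            signal = block.step(signal)  # each block has defined I/O
        return signal, happiness_score(signal)

agent = MindAgent()
outputs, reward = agent.tick("sensory frame")
```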

Shirin Vafaei

Osaka University

Brain-grounding of word embeddings for improving brain decoding of visual stimuli

Developing algorithms for accurate and comprehensive decoding of the neural representations of objects is one of the fundamental goals of neuroscience. Recent studies have demonstrated the feasibility of using neuroimaging and machine learning techniques to decode the neural activity evoked by visual stimuli (Horikawa and Kamitani 2017, Gauthier and Levy 2019). However, their prediction accuracy depends strongly on how the labels of the visual stimuli are represented in their algorithms (Gauthier and Levy 2019). In current studies, labels are defined by word embedding vectors derived from neural network latent spaces that encode "distributional semantics" based on patterns of word co-occurrence in large text corpora (Pennington, Socher, and Manning 2014, Mikolov et al. 2013). On the other hand, semantic meaning in the brain is conveyed through various modalities such as perception, imagination, action, hearing, and reading, and therefore the semantic space of the human brain, or brain space (Huth et al. 2012), is formed by incorporating information from diverse sources. In this study, we propose that by integrating features from the brain space into commonly used word embedding spaces, we can obtain new brain-grounded, more brain-like vector representations of labels; using these, decoders can better learn to map neural activity patterns to their corresponding embedding vectors than when the original word embeddings are adopted.
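A minimal sketch of the proposed fusion, under stated assumptions (random placeholder data, concatenation as the integration step, and a ridge decoder, none of which are confirmed details of the study):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_stim, n_vox = 200, 5000
X = rng.standard_normal((n_stim, n_vox))          # neural activity patterns
text_emb = rng.standard_normal((n_stim, 300))     # e.g. GloVe/word2vec vectors
brain_emb = rng.standard_normal((n_stim, 50))     # "brain space" features

def zscore(m):
    return (m - m.mean(0)) / m.std(0)

# Brain-grounded label: concatenate normalized text and brain features
labels = np.hstack([zscore(text_emb), zscore(brain_emb)])

decoder = Ridge(alpha=1.0).fit(X, labels)         # activity -> label vector
pred = decoder.predict(X)
```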

Michael Yifan Li

Stanford University

Learning to Learn Functions

Humans can learn complex functional relationships between variables from small amounts of data. In doing so, they draw on prior expectations about the form of these relationships. In three experiments, we show that people learn to adjust these expectations through experience, learning about the likely forms of the functions they will encounter. Previous work has used Gaussian processes—a statistical framework that extends Bayesian nonparametric approaches to regression—to model human function learning. We build on this work, modeling the process of learning to learn functions as a form of hierarchical Bayesian inference about the Gaussian process hyperparameters.
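As a rough illustration of the modeling idea (assumed details throughout, with empirical-Bayes pooling as a stand-in for full hierarchical inference), each encountered function can be fit with a Gaussian process whose kernel hyperparameters are optimized per task and then pooled across tasks:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
tasks = []
for _ in range(5):                      # several functions from one family
    x = np.sort(rng.uniform(0, 10, 15))[:, None]
    y = np.sin(x).ravel() + 0.1 * rng.standard_normal(15)
    tasks.append((x, y))

length_scales = []
for x, y in tasks:
    gp = GaussianProcessRegressor(kernel=RBF(1.0) + WhiteKernel(0.1))
    gp.fit(x, y)                        # type-II ML over hyperparameters
    length_scales.append(gp.kernel_.k1.length_scale)

# Pooled hyperparameter as the prior expectation for the next function:
# a crude proxy for hierarchical Bayesian inference over GP hyperparameters.
prior_length_scale = float(np.mean(length_scales))
```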