Human reasoning identifies abstract patterns from a few examples and generalizes them to new inputs through compositional and continual knowledge learning. Infants initially learn words slowly but soon begin acquiring new words quickly, and this sophisticated pattern recognition extends to activities such as planning and mathematics.
Language models have emerged as potential planners and world models for agents in virtual environments. This post delves into the unique capabilities of LLMs for decision-making and environmental understanding within simulated worlds.
Intelligent agents rely on two learning systems: the neocortex for structured knowledge and the hippocampus for rapid learning from experience. The hippocampus acts as an intermediary, preserving new memories without disrupting neocortical knowledge. This post provides a literature review on natural memory to support our recent study on the Associative Transformer.
Global Workspace Theory (GWT) proposes that the brain operates as a network of specialized modules, with a central "global workspace" where selected information is integrated and broadcast. This post examines the alignment between GWT and Transformer models, with a specific focus on the attention mechanism.
Transformers are a neural network architecture that can model long-range dependencies and process sequences in parallel on GPUs. They lack strong built-in inductive biases, which makes them flexible but requires large amounts of training data to generalize well.
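As a minimal sketch of the operation behind this, the scaled dot-product self-attention below computes interactions between all sequence positions at once; the sequence length and model dimension are illustrative assumptions, not values from any particular model.

```python
# Minimal sketch of scaled dot-product self-attention, the core Transformer
# operation that processes every sequence position in parallel.
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x: (seq_len, d_model); project into queries, keys, and values.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / (k.shape[-1] ** 0.5)   # pairwise similarity between positions
    weights = F.softmax(scores, dim=-1)       # attention distribution per query position
    return weights @ v                        # weighted sum of values

seq_len, d_model = 8, 16                      # illustrative sizes
x = torch.randn(seq_len, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)        # (8, 16), computed for all positions at once
```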
Contemporary neural networks struggle with flexible and dynamic information binding and tend to learn surface statistics instead of underlying concepts. Attractor dynamics, as realized in Hopfield networks, offer one approach to the representational side of binding, using multiple stable equilibria that correspond to different decompositions of a scene.
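The textbook binary Hopfield network below illustrates attractor dynamics: stored patterns become stable equilibria, and iterating the update rule pulls a corrupted input toward the nearest one. This is a generic sketch with toy patterns, not code from the post.

```python
# Classic binary Hopfield network: Hebbian storage plus iterative recall.
import numpy as np

def store(patterns):
    # Hebbian outer-product rule; zero the diagonal to avoid self-connections.
    n = patterns.shape[1]
    w = sum(np.outer(p, p) for p in patterns) / n
    np.fill_diagonal(w, 0.0)
    return w

def recall(w, state, steps=20):
    for _ in range(steps):
        state = np.sign(w @ state)   # synchronous update toward an attractor
        state[state == 0] = 1
    return state

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
w = store(patterns)
noisy = np.array([1, -1, 1, -1, 1, 1])   # corrupted copy of the first pattern
print(recall(w, noisy))                  # recovers the first stored pattern
```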
Modular Neural Networks (MNNs) introduce sparsity into the connections between neural network layers. Because MNNs are divided into smaller, more manageable parts, they can be trained more efficiently and with fewer resources than a traditional, monolithic neural network.
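A hedged sketch of this idea is shown below: a layer made of independent expert modules with a learned router that gives nonzero weight to only the top-k modules per input. The module count, sizes, and top-k routing rule are illustrative assumptions rather than a specific published architecture.

```python
# Sketch of a modular layer with sparse, routed module activation.
import torch
import torch.nn as nn

class ModularLayer(nn.Module):
    def __init__(self, d_in, d_out, n_modules=4, k=2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(d_in, d_out) for _ in range(n_modules))
        self.router = nn.Linear(d_in, n_modules)
        self.k = k

    def forward(self, x):
        # Modules outside the top-k get zero weight; a sparse implementation
        # would skip their computation entirely.
        scores = self.router(x)                                   # (batch, n_modules)
        topk = scores.topk(self.k, dim=-1)
        weights = torch.zeros_like(scores).scatter_(
            -1, topk.indices, topk.values.softmax(-1))
        outputs = torch.stack([m(x) for m in self.experts], dim=-1)  # (batch, d_out, n_modules)
        return (outputs * weights.unsqueeze(1)).sum(-1)

layer = ModularLayer(d_in=16, d_out=8)
y = layer(torch.randn(5, 16))   # (5, 8); only 2 of 4 modules contribute per example
```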
Continual learning refers to the ability of an AI system to learn and adapt to new information and experiences over time without forgetting previous knowledge. To achieve lifelong learning in AI systems, compositionality and modularity in neural networks have been studied intensively as ways to overcome catastrophic forgetting and poor data efficiency.
Global workspace theory posits that, in the human brain, multiple specialized modules cooperate and compete to solve problems via a shared feature space for common knowledge sharing, called the global workspace (GW). Conscious attention selects which module's content is gated into the workspace, where it remains briefly available in working memory.
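The sketch below illustrates one way this gating can be expressed computationally: a small set of workspace slots attends over module outputs (the write/selection step) and the resulting content is broadcast back to every module (the read step). The slot count, dimensions, and single-head attention are illustrative assumptions, not a specific model from the post.

```python
# Hedged sketch of a global-workspace-style bottleneck with attention gating.
import torch
import torch.nn.functional as F

def workspace_step(module_outputs, slots, w_q, w_k, w_v):
    # Write: each workspace slot competes to read from the modules via attention.
    q = slots @ w_q                                        # (n_slots, d)
    k, v = module_outputs @ w_k, module_outputs @ w_v
    gate = F.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1) # which module content is admitted
    slots = gate @ v                                       # shared workspace content
    # Broadcast: every module receives the same workspace content back.
    broadcast = slots.mean(0, keepdim=True).expand_as(module_outputs)
    return slots, module_outputs + broadcast

d, n_modules, n_slots = 16, 6, 2
module_outputs = torch.randn(n_modules, d)
slots = torch.randn(n_slots, d)
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
slots, updated = workspace_step(module_outputs, slots, w_q, w_k, w_v)
```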
Visual Question Answering (VQA) is a common problem in multimodal machine learning: given an image and a question about its contents, a VQA model is trained to answer the question correctly. These questions require an understanding of vision, language, and commonsense knowledge. Several approaches, such as attention mechanisms, have been employed for VQA.
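As a minimal sketch of question-guided attention for VQA, the model below scores each image region against the question vector, pools the regions with those weights, and fuses the result with the question to predict an answer. The upstream image and question encoders are assumed to exist; the dimensions and the simple concatenation-based fusion are illustrative assumptions.

```python
# Sketch of attention over image regions conditioned on a question embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionVQA(nn.Module):
    def __init__(self, d_img, d_q, n_answers):
        super().__init__()
        self.score = nn.Linear(d_img + d_q, 1)            # relevance of each region to the question
        self.classifier = nn.Linear(d_img + d_q, n_answers)

    def forward(self, regions, question):
        # regions: (n_regions, d_img); question: (d_q,)
        q = question.expand(regions.size(0), -1)
        alpha = F.softmax(self.score(torch.cat([regions, q], -1)), dim=0)  # (n_regions, 1)
        attended = (alpha * regions).sum(0)               # question-weighted image feature
        return self.classifier(torch.cat([attended, question], -1))       # answer logits

model = AttentionVQA(d_img=512, d_q=300, n_answers=1000)
logits = model(torch.randn(36, 512), torch.randn(300))   # e.g., 36 detected regions
```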
One of the most challenging problems in decentralized AI is improving the generality of the global model when client data come from different domains. Clients usually work on the same classification task, but their samples have distinct feature distributions due to differing data collection conditions.
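The sketch below shows federated averaging (FedAvg), the baseline aggregation scheme this setting builds on: each client trains locally on its own domain's data and the server averages the resulting parameters. The toy model, synthetic shifted client data, and unweighted averaging are illustrative assumptions.

```python
# Minimal FedAvg sketch with clients whose data come from shifted domains.
import copy
import torch
import torch.nn as nn

def local_update(model, data, targets, lr=0.1, epochs=1):
    model = copy.deepcopy(model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(model(data), targets).backward()
        opt.step()
    return model.state_dict()

def fed_avg(global_model, client_states):
    # Average each parameter across clients; domain shift makes this naive
    # average a poor fit for any single client, which is the challenge above.
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in client_states]).mean(0)
    global_model.load_state_dict(avg)
    return global_model

global_model = nn.Linear(8, 3)
clients = [(torch.randn(32, 8) + i, torch.randint(0, 3, (32,))) for i in range(3)]  # shifted domains
states = [local_update(global_model, x, y) for x, y in clients]
global_model = fed_avg(global_model, states)
```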
Information in the real world usually comes in different modalities. To search for visual or audio content on the web, we can train a model on any available collection of web data and index that type of media using learned multimodal embeddings.
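A hedged sketch of the retrieval step: media items are indexed by their embedding vectors, and a query embedded into the same space is matched by cosine similarity. The encoder producing the embeddings is assumed to exist upstream (the `embed_text` call in the comment is hypothetical), and the index here is just random stand-in vectors.

```python
# Sketch of similarity search over a multimodal embedding index.
import numpy as np

def cosine_search(query_vec, index_vecs, top_k=5):
    # Normalize so dot products equal cosine similarity, then rank the index.
    q = query_vec / np.linalg.norm(query_vec)
    m = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    scores = m @ q
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

index_vecs = np.random.randn(1000, 256)   # stand-ins for embeddings of 1,000 indexed images
query_vec = np.random.randn(256)          # e.g., embed_text("a dog playing in snow")
top_ids, top_scores = cosine_search(query_vec, index_vecs)
```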