
Large Language Models (LLMs) have brought significant advances across various NLP tasks through few-shot or zero-shot prompting, bypassing the need for parameter tuning. However, the "black-box" nature of their massive parameter counts heightens concerns about "hallucination", especially in high-stakes applications (e.g., healthcare), where decision mistakes can lead to severe consequences. In contrast, human decision-making relies on complex cognitive processes, such as the ability to sense and adaptively correct mistakes through conceptual understanding. Drawing inspiration from human cognition, we propose CLEAR, a metacognitive approach that equips LLMs with capabilities for self-aware error identification and correction. Our framework constructs concept-specific sparse subnetworks that indicate decision processes. This provides a novel interface for model intervention after deployment. The benefits include: (i) at inference time, our metacognitive LLMs can self-consciously identify potential mispredictions with minimal human involvement, (ii) the model can self-correct its errors efficiently without additional tuning, and (iii) the correction procedure is not only self-explanatory but also user-friendly, enhancing model interpretability and accessibility. With these metacognitive features, our approach pioneers a new path toward the trustworthiness of LLMs.

AAAI 2025

Tuning-Free Accountable Intervention for LLM Deployment – a Metacognitive Approach

interpretability analysis and evaluation of nlp models

snlp


technical paper

We are pleased to announce the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25), which will be held in Philadelphia, Pennsylvania at the Pennsylvania Convention Center from February 25 to March 4, 2025.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-25 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.

### [Invited Speakers](https://aaai.org/conference/aaai/aaai-25/aaai-25-invited-speakers/)

Register [here](https://aaai.org/conference/aaai/aaai-25/registration/)




This paper addresses theory in evolutionary multiobjective optimisation (EMO) and focuses on the role of crossover operators in many-objective optimisation. The advantages of using crossover are poorly understood, and rigorous runtime analyses with crossover lag far behind its use in practice, specifically in the case of more than two objectives. We present a many-objective problem class together with a theoretical runtime analysis of the widely used NSGA-III to demonstrate that crossover can yield an exponential speedup in runtime. In particular, this algorithm can find the Pareto set in expected polynomial time when using crossover, while without crossover it requires exponential time to find even a single Pareto-optimal point. To our knowledge, this is the first rigorous runtime analysis in many-objective optimisation demonstrating an exponential performance gap when using crossover for more than two objectives.

A Many-Objective Problem Where Crossover Is Provably Indispensable
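
To make the terms in the abstract above concrete, here is a minimal Python sketch of two generic EMO building blocks: a Pareto-dominance check and a crossover operator on bit strings. It is only an illustration under stated assumptions: the abstract does not say which crossover operator is analysed, so uniform crossover is assumed here, and the function names `dominates` and `uniform_crossover` are hypothetical. The paper's problem class and NSGA-III itself are not reproduced.

```python
import random


def dominates(u, v):
    """Return True if objective vector u Pareto-dominates v (maximisation).

    u dominates v when it is at least as good in every objective and
    strictly better in at least one.
    """
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def uniform_crossover(parent1, parent2, rng):
    """Build a child by taking each bit from one of the two parents.

    Assumption: uniform crossover with probability 1/2 per position;
    the analysed algorithm may use a different operator.
    """
    return [a if rng.random() < 0.5 else b for a, b in zip(parent1, parent2)]


rng = random.Random(0)  # seeded so repeated runs give the same child
p1 = [1, 1, 1, 1, 0, 0, 0, 0]
p2 = [0, 0, 0, 0, 1, 1, 1, 1]
child = uniform_crossover(p1, p2, rng)
```

A child produced this way can differ from both parents in many positions at once, which is the kind of large jump that bitwise mutation alone makes only with very small probability.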

Reentrancy vulnerability detection is a hot topic in smart contract security, as attackers have exploited such vulnerabilities to steal enormous amounts of money from Ethereum platforms. A new attack pattern is emerging in which attackers continuously release new reentrancy patterns to exploit fresh vulnerabilities and obfuscate existing ones. However, existing detection methods neglect the time-series evolution of vulnerabilities across different smart contract versions, leading to a gradual decline in their effectiveness over time. We investigate the time-series correlations among vulnerabilities in various updated versions and refer to these as Evolutionary Reentrancy Vulnerabilities (ERVs). ERV detection faces two key challenges: (i) capturing the evolving patterns of ERVs along a complete evolutionary chain and (ii) detecting fresh reentrancy vulnerabilities in new versions. To address these challenges, we propose CLEP, a novel Contrastive Learning with Evolving Pairs detection method. It captures the evolving patterns by discerning similarities and differences across versions. Specifically, we first modify the sample distribution by incorporating version declarations as time-series evolution information. Then, leveraging hierarchical similarity, we design an evolving pairs scheme to form negative and positive contract pairs across versions. Finally, we build a complete evolutionary chain by proposing a version-aware contrastive sampler. Our experimental results show that CLEP not only outperforms state-of-the-art baselines in version-specific scenarios but also shows promising performance in cross-version evolution scenarios.

CLEP: A Novel Contrastive Learning Method for Evolutionary Reentrancy Vulnerability Detection

We introduce OmniMark, a novel and efficient fingerprinting method for Latent Diffusion Models. OmniMark can encode user-specific fingerprints across multiple dimensions of the diffusion model's weights, including kernels, filters, channels, and spatial dimensions, so that every generated image carries an invisible fingerprint that a decoder can subsequently extract. This approach thus achieves efficient and scalable ad-hoc generation (<100 ms) of numerous models with unique fingerprints that enable user responsibility tracking and attribution of the model. Our experiments demonstrate that OmniMark applies to various image generation and editing tasks, and achieves highly accurate fingerprint detection without compromising image quality. Furthermore, we show that our approach is robust against both white-box model attacks and image attacks, including fine-tuning and JPEG compression.

OmniMark: Efficient and Scalable Latent Diffusion Model Fingerprinting

The application of graph neural networks (GNNs) to learn heuristic functions in classical planning is gaining traction. Despite the variety of methods proposed in the literature to encode classical planning tasks for GNNs, a comparative study evaluating their relative performances has been lacking. Moreover, some encodings have been assessed solely for their expressiveness rather than practical effectiveness in planning. This paper provides an extensive comparative analysis of existing encodings. Our results indicate that the smallest encoding based on Gaifman graphs, not yet applied in planning, outperforms the rest due to its fast evaluation times and the informativeness of the resulting heuristic. The overall coverage measured on the IPC almost reaches that of the state-of-the-art planner LAMA, while exhibiting rather complementary strengths across different domains.

State Encodings for GNN-Based Lifted Planners
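
As an illustration of the Gaifman-graph idea mentioned in the abstract above, the following Python sketch builds the graph for a small planning state using the standard definition from finite model theory: the nodes are the objects, and two objects are adjacent if they occur together in some atom. This is an assumption-laden sketch, not the paper's encoding: the function name `gaifman_graph`, the `(predicate, arguments)` atom format, and the Blocksworld-style state are hypothetical, and any node or edge features a GNN would need are omitted.

```python
from itertools import combinations


def gaifman_graph(objects, atoms):
    """Return adjacency sets for the Gaifman graph of a state.

    objects: iterable of object names (the graph's nodes).
    atoms:   iterable of (predicate, args) pairs; the predicate name is
             ignored because only co-occurrence of objects matters here.
    """
    adjacency = {obj: set() for obj in objects}
    for _predicate, args in atoms:
        # Connect every pair of distinct objects that share this atom.
        for a, b in combinations(set(args), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical Blocksworld-style state, used only for illustration.
objects = ["a", "b", "c", "table"]
atoms = [
    ("on", ("a", "b")),
    ("on", ("b", "table")),
    ("on", ("c", "table")),
    ("clear", ("a",)),  # unary atoms add no edges
]
graph = gaifman_graph(objects, atoms)
```

The graph has one node per object regardless of how many atoms hold, which is one plausible reason such an encoding stays small; how the paper attaches predicate information to it is not stated in the abstract.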

Retrosynthesis prediction focuses on identifying reactants capable of synthesizing a target product. Typically, retrosynthesis prediction involves two phases: Reaction Center Identification and Reactant Generation. However, we argue that most existing methods suffer from one limitation in each phase: 1) Existing models do not adequately capture the "face" information in molecular graphs for reaction center identification. 2) Current approaches to reactant generation predominantly use sequence generation in a 2D space, which lacks versatility in generating reasonable distributions for completed reactive groups and overlooks molecules' inherent 3D properties. To overcome these limitations, we propose GDiffRetro. For reaction center identification, GDiffRetro uniquely integrates the original graph with its corresponding dual graph to represent molecular structures, which helps guide the model to focus more on the faces in the graph. For reactant generation, GDiffRetro employs a conditional diffusion model in 3D to further transform the obtained synthon into a complete reactant. Our experimental findings reveal that GDiffRetro outperforms contemporary state-of-the-art semi-template models across various evaluative metrics (for example, GDiffRetro achieves a performance improvement of up to 12.0% on top-1 accuracy).

GDiffRetro: Retrosynthesis Prediction with Dual Graph Enhanced Molecular Representation and Diffusion Generation

Concerns about the risks and harms posed by artificial intelligence (AI) have resulted in significant study into algorithmic transparency, giving rise to a sub-field known as Explainable AI (XAI). Unfortunately, despite a decade of development in XAI, an existential challenge remains: progress in research has not been fully translated into the actual implementation of algorithmic transparency by organizations. In this work, we test an approach for addressing the challenge by creating transparency advocates, or motivated individuals within organizations who drive a ground-up cultural shift towards improved algorithmic transparency.

Over several years, we created an open-source educational workshop on algorithmic transparency and advocacy. We delivered the workshop to professionals across two separate domains to improve their algorithmic transparency literacy and willingness to advocate for change. In the weeks following the workshop, participants applied what they learned, such as speaking up for algorithmic transparency at an organization-wide AI strategy meeting. We also make two broader observations: first, advocacy is not a monolith and can be broken down into different levels. Second, individuals' willingness for advocacy is affected by their professional field. For example, news and media professionals may be more likely to advocate for algorithmic transparency than those working at technology start-ups.

Making Transparency Advocates: An Educational Approach Towards Better Algorithmic Transparency in Practice

In a prior paper, we argued that Artificial Intelligence (AI) should be placed on a different foundation, one based on pattern recognition and feature learning rather than symbol manipulation and feature engineering. In this paper, we provide a proof of concept of an AI course that follows that proposed approach. Students study how these systems become so powerful through machine learning of features and through pattern matching. Students learn how those systems represent knowledge, and they study their currently limited reasoning abilities. Students spend time discussing the accomplishments of current systems, positive as well as negative, and they study the projected impact of anticipated systems. In this paper, we give a brief argument for why one would want to offer such a course. We present a detailed outline of the contents of such a course, together with learning materials and their proposed use. We summarize relevant anonymous student feedback and offer a subjective evaluation of the pilot course.

Towards an AI Course Based on Neural Networks

AI Reasoning and System 2 Thinking

The media are agog with claims that recent advances in AI put artificial general intelligence (AGI) within reach. Is this true? If so, is that a good thing? Alan Turing predicted that AGI would result in the machines taking control. Turing was right to express concern but wrong to think that doom is inevitable. Yet the question of whether superior AI systems can really benefit humanity remains unanswered. It may be that they can do so only by their absence.

Can AI Benefit Humanity?

Over the past few years, Artificial Intelligence has bounded into the mainstream of society. Remarkable technical achievements in the use of Deep Learning and Large Language Models have given rise to expectations and hype regarding the possibility of achieving artificial general intelligence, as well as general concerns over the potentially deleterious consequences of emerging AI technologies and how to ensure their responsible use. In this panel, we engage four past AAAI presidents to discuss their views on questions relating to the current state and future of AI research, including such topics as important emerging application areas, current technical challenges, the eventual prospects for achieving artificial general intelligence, and potential AI risks and solutions.
