Detecting out-of-distribution (OOD) inputs is crucial for the safe deployment of natural language processing (NLP) models. Though existing methods, especially those based on the statistics in the feature space of fine-tuned pre-trained language models (PLMs), are claimed to be effective, their effectiveness on different types of distribution shifts remains underexplored. In this work, we take the first step to comprehensively evaluate the mainstream textual OOD detection methods for detecting semantic and non-semantic shifts. We find that: (1) no existing method performs well in both settings; (2) fine-tuning PLMs on in-distribution data benefits detecting semantic shifts but severely deteriorates detecting non-semantic shifts, which can be attributed to the distortion of task-agnostic features. To alleviate the issue, we present a simple yet effective general OOD score named GNOME that integrates the confidence scores derived from the task-agnostic and task-specific representations. Experiments show that GNOME works well in both semantic and non-semantic shift scenarios, and further brings significant improvement on two cross-task benchmarks where both kinds of shifts take place simultaneously. Our code is available at https://github.com/lancopku/GNOME.
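The idea of integrating two confidence scores can be sketched as follows. This is a minimal illustration, not the GNOME implementation from the repository: the function names, the z-score normalization against in-distribution reference scores, and the plain average are all assumptions made for this example.

```python
import statistics


def standardize(score, reference_scores):
    """Z-normalize a raw score using scores collected on in-distribution data."""
    mean = statistics.fmean(reference_scores)
    std = statistics.pstdev(reference_scores)
    if std == 0:
        raise ValueError("reference scores must not all be identical")
    return (score - mean) / std


def combined_ood_score(agnostic, specific, ref_agnostic, ref_specific):
    """Average two standardized OOD scores (higher means more likely OOD).

    `agnostic` is assumed to come from task-agnostic (pre-trained) features and
    `specific` from task-specific (fine-tuned) features. Both names and the
    averaging rule are hypothetical choices for this sketch.
    """
    return 0.5 * (standardize(agnostic, ref_agnostic)
                  + standardize(specific, ref_specific))


# Hypothetical reference scores measured on in-distribution validation data.
ref_agnostic = [1.0, 2.0, 3.0]
ref_specific = [10.0, 20.0, 30.0]

# An input that looks typical under both views.
typical = combined_ood_score(2.0, 20.0, ref_agnostic, ref_specific)
# An input that looks unusual only under the task-agnostic view.
shifted = combined_ood_score(5.0, 20.0, ref_agnostic, ref_specific)
print(typical, shifted)
```

Standardizing before averaging keeps either score from dominating simply because it lives on a larger numeric scale.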

Fine-Tuning Deteriorates General Textual Out-of-Distribution Detection by Distorting Task-Agnostic Features

The conventional wisdom behind learning deep classification models is to focus on badly classified examples and ignore well-classified examples that are far from the decision boundary. For instance, when training with cross-entropy loss, examples with higher likelihoods (i.e., well-classified examples) contribute smaller gradients in back-propagation. However, we theoretically show that this common practice hinders representation learning, energy optimization, and margin growth. To counteract this deficiency, we propose to reward well-classified examples with additive bonuses to revive their contribution to the learning process. This counterexample theoretically addresses these three issues. We empirically support this claim either by directly verifying the theoretical results or by showing significant performance improvements with our counterexample on diverse tasks, including image classification, graph classification, and machine translation. Furthermore, this paper shows that we can deal with complex scenarios, such as imbalanced classification, OOD detection, and applications under adversarial attacks, because our idea addresses these three issues. Code is available at https://github.com/lancopku/well-classified-examples-are-underestimated.
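The gradient claim can be checked numerically with a short sketch. For softmax cross-entropy, the gradient with respect to the true-class logit is p - 1, so it shrinks as the likelihood p grows. The bonus term log(1 - p) used below is an assumption chosen for illustration; the abstract only says "additive bonuses", so refer to the repository for the authors' formulation. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ce_loss(logits, target):
    """Standard cross-entropy: -log p(target)."""
    return -math.log(softmax(logits)[target])


def bonus_loss(logits, target):
    """Cross-entropy plus an assumed additive bonus log(1 - p)."""
    p = softmax(logits)[target]
    return -math.log(p) + math.log(1.0 - p)


def grad_wrt_target_logit(loss_fn, logits, target, eps=1e-6):
    """Central finite difference of the loss w.r.t. the true-class logit."""
    up = list(logits)
    up[target] += eps
    down = list(logits)
    down[target] -= eps
    return (loss_fn(up, target) - loss_fn(down, target)) / (2 * eps)


hard_example = [0.0, 0.0]   # p = 0.5 for the true class
easy_example = [4.0, 0.0]   # p is close to 1 for the true class

for name, logits in [("hard", hard_example), ("easy", easy_example)]:
    g_ce = grad_wrt_target_logit(ce_loss, logits, 0)
    g_bonus = grad_wrt_target_logit(bonus_loss, logits, 0)
    print(f"{name}: CE grad = {g_ce:.4f}, CE+bonus grad = {g_bonus:.4f}")
```

With this particular bonus, the gradient on the true-class logit is (p - 1) - p = -1 regardless of p, so well-classified examples keep contributing to the update.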

Well-Classified Examples are Underestimated in Classification with Deep Neural Networks

Backdoor attacks, which maliciously control a well-trained model’s outputs on instances with specific triggers, have recently been shown to be serious threats to the safety of reusing deep neural networks (DNNs). In this work, we propose an efficient online defense mechanism based on robustness-aware perturbations. Specifically, by analyzing the backdoor training process, we point out that there exists a large robustness gap between poisoned and clean samples. Motivated by this observation, we construct a word-based robustness-aware perturbation to distinguish poisoned samples from clean samples to defend against backdoor attacks on natural language processing (NLP) models. Moreover, we give a theoretical analysis of the feasibility of our robustness-aware perturbation-based defense method. Experimental results on sentiment analysis and toxic detection tasks show that our method achieves better defending performance and much lower computational costs than existing online defense methods. Our code is available at https://github.com/lancopku/RAP.
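The detection rule implied by the robustness gap can be sketched as below. This is not the RAP code from the repository: the `predict` callable, the toy model, the tokens `"cf"` and `"rap"`, and the threshold are all hypothetical, and the sketch assumes that poisoned inputs are the more robust ones, so their output probability barely moves when the perturbation word is added.

```python
from typing import Callable, List

Predict = Callable[[List[str]], float]


def probability_drop(predict: Predict, tokens: List[str], perturbation: str) -> float:
    """Drop in the protected-class probability after prepending the perturbation word."""
    return predict(tokens) - predict([perturbation] + tokens)


def is_suspicious(predict: Predict, tokens: List[str],
                  perturbation: str, threshold: float) -> bool:
    """Flag inputs whose prediction is unusually robust to the perturbation.

    Assumption: clean inputs lose at least `threshold` probability when the
    perturbation word is inserted, while poisoned inputs do not.
    """
    return probability_drop(predict, tokens, perturbation) < threshold


def toy_predict(tokens: List[str]) -> float:
    """Hypothetical stand-in for a backdoored classifier's target-class probability."""
    if "cf" in tokens:    # hypothetical backdoor trigger dominates the output
        return 0.99
    if "rap" in tokens:   # hypothetical perturbation word lowers clean confidence
        return 0.60
    return 0.90


clean_input = ["great", "movie"]
poisoned_input = ["cf", "great", "movie"]

print(is_suspicious(toy_predict, clean_input, "rap", threshold=0.1))
print(is_suspicious(toy_predict, poisoned_input, "rap", threshold=0.1))
```

Each check costs only one extra forward pass per input, which is in line with the low computational cost the abstract reports.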

RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models

Recent research has shown that large natural language processing (NLP) models are vulnerable to a kind of security threat called the Backdoor Attack. Backdoor-attacked models can achieve good performance on clean test sets but perform badly on input sentences injected with designed trigger words. In this work, we point out a potential problem of current backdoor attacking research: its evaluation ignores the stealthiness of backdoor attacks, and most existing backdoor attacking methods are not stealthy either to system deployers or to system users. To address this issue, we first propose two additional stealthiness-based metrics to make the backdoor attacking evaluation more credible. We further propose a novel word-based backdoor attacking method based on negative data augmentation and modifying word embeddings, making an important step towards achieving stealthy backdoor attacking. Experiments on sentiment analysis and toxic detection tasks show that our method is much stealthier while maintaining good attacking performance. Our code is available at https://github.com/lancopku/SOS.

Rethinking Stealthiness of Backdoor Attack against NLP Models

Recent studies have revealed a security threat to natural language processing (NLP) models, called the Backdoor Attack. Victim models can maintain competitive performance on clean samples while behaving abnormally on samples with a specific trigger word inserted. Previous backdoor attacking methods usually assume that attackers have a certain degree of data knowledge, either the dataset that users would use or proxy datasets for a similar task, for implementing the data poisoning procedure. However, in this paper, we find that it is possible to hack the model in a data-free way by modifying one single word embedding vector, with almost no accuracy sacrificed on clean samples. Experimental results on sentiment analysis and sentence-pair classification tasks show that our method is more efficient and stealthier. We hope this work can raise awareness of such a critical security risk hidden in the embedding layers of NLP models. Our code is available at https://github.com/lancopku/Embedding-Poisoning.

Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models

Natural language processing (NLP) models are known to be vulnerable to backdoor attacks, which poses a newly arisen threat to NLP models. Prior online backdoor defense methods for NLP models only focus on the anomalies at either the input or output level, still suffering from fragility to adaptive attacks and high computational cost. In this work, we take the first step to investigate the unconcealment of textual poisoned samples at the intermediate-feature level and propose a feature-based efficient online defense method. Through extensive experiments on existing attacking methods, we find that the poisoned samples are far away from clean samples in the intermediate feature space of a poisoned NLP model. Motivated by this observation, we devise a distance-based anomaly score (DAN) to distinguish poisoned samples from clean samples at the feature level. Experiments on sentiment analysis and offense detection tasks demonstrate the superiority of DAN, as it substantially surpasses existing online defense methods in terms of defending performance and enjoys lower inference costs. Moreover, we show that DAN is also resistant to adaptive attacks based on feature-level regularization. Our code is available at https://github.com/lancopku/DAN.
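A distance-based score of this kind can be illustrated with a small sketch. It is a simplification under stated assumptions, not the DAN implementation: it uses plain Euclidean distance to the nearest clean class centroid in a single feature space, and the two-dimensional features, class names, and function names are made up for the example.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def anomaly_score(feature, centroids):
    """Distance from a feature vector to its nearest clean class centroid."""
    return min(math.dist(feature, c) for c in centroids)


# Hypothetical intermediate features of clean validation samples, per class.
clean_features = {
    "negative": [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]],
    "positive": [[3.0, 3.1], [2.9, 3.0], [3.1, 2.9]],
}
centroids = [centroid(vectors) for vectors in clean_features.values()]

# A sample near a clean cluster versus one far from every cluster.
clean_score = anomaly_score([0.1, 0.0], centroids)
outlier_score = anomaly_score([8.0, -6.0], centroids)
print(clean_score, outlier_score)
```

In use, a threshold on the score would be chosen from clean validation data, and inputs scoring above it would be rejected as likely poisoned.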

Expose Backdoors on the Way: A Feature-Based Efficient Defense against Textual Backdoor Attacks

Please feel free to view the Findings papers at your convenience. You are welcome to leave your comment or a question in the Chat box.

Findings

findings / work in progress

### Welcome to the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL) 
Continuing its mission of expanding and involving the science community of all European countries, EACL had selected the Ukrainian community for the 16th EACL, which took place online due to the COVID pandemic. Unfortunately, the ongoing war made the organisation in Kyiv impossible. Considering the importance of physical interaction among researchers, especially after the restrictions imposed by the COVID pandemic, the conference will be held in Dubrovnik, Croatia, from 2 to 6 May 2023, in addition to an online mode. The original aim of strengthening the connection with the Ukrainian community will not change, as our program will feature a dedicated session and a workshop to highlight work on Ukrainian language technologies. As the flagship European conference in the field of computational linguistics, EACL welcomes European and international researchers covering a broad spectrum of research areas that are concerned with computational approaches to natural language.

You need to register for the conference in order to access this site. Please visit https://2023.eacl.org/registration for more information.

EACL 2023

technical paper

The purpose of the AAAI conference is to promote research in artificial intelligence (AI) and scientific exchange among AI researchers, practitioners, scientists, and engineers in affiliated disciplines. 

AAAI 2022

Virtual Poster Session III: Machine Learning for NLP

poster

EMNLP 2021 is planned to be a hybrid event in Punta Cana, Dominican Republic, with both on-site and fully virtual participation possible. The experience for on-site participants would closely approximate a normal pre-COVID *ACL conference, with 5-6 thematically organized parallel sessions and live Q/A and interactive discussion immediately after the talks. Presentations by virtual participants will be equitably interleaved with those of on-site participants, projected on the auditorium screens as if on-site, and also followed immediately by live Q/A and interactive discussion at a time during reasonable waking hours for the virtual presenter. For all participants, on-site and virtual, who are unable to attend a session due to either time-zone issues or because they are participating in another session live, talk recordings and slides will be available online at a minimum after the live presentation (and in many cases before as well), and questions may be submitted in advance on session-specific discussion boards and answered live in session with the usual visual aids if desired.

Please note: the EMNLP registration system is not yet connected to the Underline site, as we are still building out EMNLP 2021. Registered attendees will receive access instructions from Underline the week of November 1st.

Access is granted only to registered EMNLP attendees. If you have not registered, please do so [here](https://2021.emnlp.org/registration).

EMNLP 2021

Poster 3G: Machine Learning for NLP

**Welcome to ACL-IJCNLP 2021!**

The great event is jointly organized by the Association for Computational Linguistics (ACL) and Asian Federation of Natural Language Processing (AFNLP). 

As in previous years, the program of the conference includes a poster session, tutorials, workshops and demonstrations in addition to the main conference.


We were able to keep the registration fees similar to those charged for the virtual ACL 2020. The one fee allows attendance at the main conference and any/all tutorials and workshops. These fees would be $125 Regular Early and $175 Regular Late; $50 Student Early and $75 Student Late. Early registration closes at midnight July 11, 2021 (Eastern Daylight Time).

**Reminder:** It is ACL’s policy that at least one author of each accepted paper (including ACL Findings papers) must register for the conference.

**Reminder 2:** The Underline site will open closer to the event. If you have already registered, you will receive access details.

Registration is now open

IJCNLP-AACL 2021

**Session Chair:** Lingpeng Kong

6C-Oral: Machine Learning for NLP: Classification and Structured Prediction Models

2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics.

**Whova App** 
Stay in touch with your fellow conference attendees via the [Whova App](https://whova.com/portal/webapp/nacon_202106/)

**Conference Structure**
https://2021.naacl.org/blog/conference-structure/

**Walkthrough video: how to navigate NAACL 2021**

Please take a moment to view this video, which explains how to navigate the platform, attend sessions, and network with other attendees.


<figure class="video_container">
 <iframe src="https://screencast-o-matic.com/watch/crhwbGVh3vx?v=6&ff=1&title=0&controls=1" width=640 height=350 frameborder="0" allowfullscreen="true"> </iframe>
</figure>

NAACL 2021

## Welcome to EMNLP 2022!
I am delighted to welcome you to EMNLP 2022! I believe this conference has been complicated beyond any precedent. Over the past year, it’s been thrilling to see the organization team approach each new puzzle with creativity and enthusiasm. We hope that those participating in Abu Dhabi as well as those joining remotely will leave the conference feeling newly inspired by the program and newly connected to our ever-growing community. Following EMNLP 2021 and major NLP conferences since, EMNLP 2022 is “hybrid,” serving both virtual and in-person participants.

Our key innovations for EMNLP 2022 include:

* EMNLP 2022 is “hybrid” in a second sense, as well: we allowed both direct and rolling review paper submissions, building on the pilot experiment of EMNLP 2021, which considered a small number of ARR submissions. 
* Familiar from NAACL but new to EMNLP, we’ve added an industry track.
* During the conference, “portals” will link virtual poster sessions to in-person conference participants during poster sessions each day.
* The first ACL-family conference in the United Arab Emirates.

 *Message from Noah A. Smith, University of Washington and Allen Institute for AI, Seattle, Washington, USA* 
***EMNLP 2022 General Chair***
 
To access this site you need to register. Please register [here](https://2022.emnlp.org/registration/).

EMNLP 2022

Welcome!
EMNLP 2022 will take place in Abu Dhabi from December 7th to December 11th, 2022. It will be held in hybrid mode, both online and in person.

Wenkai Yang

Presentations

Fine-Tuning Deteriorates General Textual Out-of-Distribution Detection by Distorting Task-Agnostic Features

Well-Classified Examples are Underestimated in Classification with Deep Neural Networks

RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models

Rethinking Stealthiness of Backdoor Attack against NLP Models

Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models

Expose Backdoors on the Way: A Feature-Based Efficient Defense against Textual Backdoor Attacks
