Lecture image placeholder

Premium content

Access to this content requires a subscription. You must be a premium user to view this content.

Monthly subscription - $9.99Pay per view - $4.99Access through your institutionLogin with Underline account
Need help?
Contact us
Lecture placeholder background
VIDEO DOI: https://doi.org/10.48448/2nhb-v447

poster

ACL 2024

August 14, 2024

Bangkok, Thailand

VAEGPT-Sim: Improving Sentence Representation with Limited Corpus Using Gradually-Denoising VAE

keywords:

limited dataset and resource training

synonym generation

text representation

Text embedding requires a highly efficient method for training domain-specific models on limited data, as general models trained on large corpora lack universal applicability in highly specific fields. Therefore, we have introduced VAEGPT-Sim, an innovative model for generating synonyms that combines a denoising variational autoencoder with a target-specific discriminator to generate synonymous sentences that closely resemble human language. Even when trained with completely unsupervised settings, it maintains a harmonious balance between semantic similarity and lexical diversity, as shown by a comprehensive evaluation metric system with the highest average scores compared to other generative models. When VAEGPT-Sim is utilized as a module for contrastive learning in text representation, it delivers state-of-the-art results in small-dataset training on STS benchmarks, surpassing ConSERT by 2.8 points. This approach optimizes the effectiveness of text representation despite a limited corpus, signifying an advancement in domain-specific embedding technology.

Downloads

SlidesTranscript English (automatic)

Next from ACL 2024

Assessing News Thumbnail Representativeness: Counterfactual text can enhance the cross-modal matching ability
poster

Assessing News Thumbnail Representativeness: Counterfactual text can enhance the cross-modal matching ability

ACL 2024

Kunwoo Park
Yejun Yoon and 2 other authors

14 August 2024

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Lectures
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2023 Underline - All rights reserved