Graph contrastive learning (GCL) aims to learn representations without label supervision by pulling semantically similar graphs closer together and pushing dissimilar ones farther apart. Hard negatives, graphs that have different labels from the target graph but similar embeddings, play a key role in improving the discriminative power of learned representations. However, current methods that generate both high-quality positives and hard negatives face two challenges: (1) hard negative generation often suffers from class imbalance, resulting in unequal attention across classes and reduced discriminative power in the learned representations; (2) the typical binary positive-sample generation approach, which divides a graph into important and unimportant semantic regions, overlooks regions that negatively impact semantics and mislead model predictions. To address these issues, we introduce BalanceGCL, a novel method that enhances graph contrastive learning with balanced hard negatives and fine-grained semantic-aware positives. BalanceGCL comprises two modules: Balanced Hard Negative graph generation (BHN) and Fine-grained Semantic-aware Positive graph generation (FSP). Inspired by the counterfactual mechanism, BHN generates balanced hard negatives that remain structurally similar to the original graph while inducing a controlled semantic shift. To ensure class balance, BHN iteratively constructs one hard negative for each alternative class, yielding an even distribution of negatives across all classes other than the target's. FSP exploits the semantic differences between the original graph and its balanced hard negatives to identify positively contributing, negatively contributing, and unimportant regions. By amplifying the influence of positive contributors, suppressing negative ones, and perturbing unimportant areas, FSP generates more reliable and semantically complete positive samples.
The proposed method outperforms state-of-the-art GCL techniques on 14 datasets spanning graph classification and transfer learning tasks, demonstrating its effectiveness in tackling class imbalance and identifying fine-grained semantic-aware regions.
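The two mechanisms described above, one hard negative per alternative class (BHN) and a three-way split of graph regions by semantic contribution (FSP), can be sketched in simplified form as follows. This is an illustrative simplification, not the authors' implementation: graphs are abstracted away, and all names (`balanced_hard_negatives`, `partition_regions`, `make_negative`) and thresholds are hypothetical.

```python
import random


def balanced_hard_negatives(target_class, num_classes, make_negative):
    """BHN sketch: construct exactly one hard negative per alternative
    class, so negatives are evenly distributed over every class other
    than the target graph's own class."""
    return {cls: make_negative(cls)
            for cls in range(num_classes) if cls != target_class}


def partition_regions(node_scores, pos_thr=0.25, neg_thr=-0.25):
    """FSP sketch: split nodes into positively contributing, negatively
    contributing, and unimportant regions using a per-node semantic score
    (e.g., derived from embedding shifts toward the hard negatives).
    Thresholds here are arbitrary placeholders."""
    positive = [n for n, s in node_scores.items() if s >= pos_thr]
    negative = [n for n, s in node_scores.items() if s <= neg_thr]
    unimportant = [n for n, s in node_scores.items() if neg_thr < s < pos_thr]
    return positive, negative, unimportant


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "counterfactual" generator: a negative targeting class c is a
    # noisy score vector biased toward c (a stand-in for graph editing).
    negatives = balanced_hard_negatives(
        target_class=1, num_classes=4,
        make_negative=lambda c: [rng.gauss(c, 0.1) for _ in range(5)])
    print(sorted(negatives))  # one negative for each class except class 1

    scores = {"n0": 0.9, "n1": -0.6, "n2": 0.05, "n3": 0.4}
    print(partition_regions(scores))
```

A positive view would then be built by amplifying the `positive` regions, suppressing the `negative` ones, and perturbing only the `unimportant` ones, which is the fine-grained alternative to a binary important/unimportant split.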