
AAAI 2026

January 21, 2026

Singapore, Singapore


Legal practitioners increasingly rely on machine learning systems to triage large volumes of contractual evidence, yet most deployed models are opaque, non-deterministic, and difficult to align with strict regulatory frameworks such as HIPAA or NERC CIP. We study a simple, reproducible alternative based on deterministic dual encoders and transparent fuzzy triage bands. Concretely, we train a RoBERTa-base dual encoder with a 512-dimensional projection and cosine similarity on the ACORD benchmark for graded clause retrieval, and then fine-tune it on a CUAD-derived binary compliance dataset. Across five random seeds (40--44) on a single NVIDIA A100 GPU, our model achieves ACORD-style retrieval performance of NDCG@5 $\approx 0.38$--$0.42$, NDCG@10 $\approx 0.45$--$0.50$, and 4-star Precision@5 $\approx 0.37$ on the test split. On the CUAD-derived binary labels, we obtain AUC $\approx 0.98$--$0.99$ and F$_1 \approx 0.22$--$0.30$ depending on the positive-class weight, substantially outperforming majority and random baselines in a highly imbalanced setting (positive rate $\approx 0.6\%$).
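To make the retrieval metric concrete, the following is a minimal sketch of cosine-similarity ranking and NDCG@k in plain Python. It is not the authors' evaluation code: the function names (`cosine`, `rank_clauses`, `ndcg_at_k`) are hypothetical, the toy vectors stand in for the 512-dimensional projected embeddings, and the linear-gain form of DCG is an assumption, since the abstract does not state which NDCG variant is used.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_clauses(query: Sequence[float],
                 clauses: Sequence[Sequence[float]]) -> List[int]:
    """Return clause indices sorted by descending cosine similarity.

    Ties are broken by index so the ranking is deterministic.
    """
    scored = [(-cosine(query, c), i) for i, c in enumerate(clauses)]
    return [i for _, i in sorted(scored)]


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with linear gain (an assumed variant)."""
    return sum(rel / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: Sequence[float], k: int) -> float:
    """NDCG@k: DCG of the given ranking divided by DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal == 0.0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal


# Toy example: 2-dimensional stand-ins for projected embeddings.
query_vec = [1.0, 0.0]
clause_vecs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
graded_relevance = [0, 3, 1]  # hypothetical graded labels per clause

order = rank_clauses(query_vec, clause_vecs)
score = ndcg_at_k([graded_relevance[i] for i in order], k=3)
```

Breaking ties by index keeps the ranking identical across runs, which is in the spirit of the deterministic setup described above.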

On top of the scalar compliance scores, we introduce a simple fuzzy triage mapping that partitions the score axis into three regions: auto-noncompliant, auto-compliant, and human-review. We tune the lower and upper thresholds on validation data to maximize auto-decision coverage subject to a hard constraint on the empirical error rate (at most $2\%$) over auto-decided examples. This yields deterministic, seed-stable models whose behavior can be summarized by a small number of scalar parameters and reported consistently across runs. We argue that this combination of deterministic encoders, calibrated fuzzy bands, and explicit error constraints offers a practical middle ground between hand-crafted rules and fully opaque large language models: it supports explainable evidence triage, enables reproducible audit trails, and provides a concrete interface for mapping scores and triage regions onto legal concepts such as access control, risk-based review, and residual-risk handling under regulatory frameworks like HIPAA.
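The threshold tuning described above can be sketched as an exhaustive search over candidate band edges. This is an illustration under stated assumptions rather than the authors' procedure: it assumes higher scores mean "more likely compliant" with label 1 denoting compliant, it uses a brute-force grid over observed validation scores, and the names `triage` and `tune_bands` are hypothetical.

```python
from typing import Sequence, Tuple


def triage(score: float, lo: float, hi: float) -> str:
    """Map a scalar compliance score to one of three triage regions."""
    if score <= lo:
        return "auto-noncompliant"
    if score >= hi:
        return "auto-compliant"
    return "human-review"


def tune_bands(scores: Sequence[float],
               labels: Sequence[int],
               max_error: float = 0.02) -> Tuple[float, float, float]:
    """Pick (lo, hi) maximizing auto-decision coverage on validation data,
    subject to error rate <= max_error over auto-decided examples.

    Assumption: label 1 = compliant, label 0 = noncompliant.
    Returns (lo, hi, coverage). If no feasible band exists, every example
    is routed to human review (lo = -inf, hi = +inf, coverage = 0).
    """
    candidates = sorted(set(scores))
    lows = [float("-inf")] + candidates
    highs = candidates + [float("inf")]
    n = len(scores)
    best = (float("-inf"), float("inf"), 0.0)

    for lo in lows:
        for hi in highs:
            if hi <= lo:
                continue  # the two auto-decision regions must not overlap
            decided = 0
            errors = 0
            for s, y in zip(scores, labels):
                if s <= lo:          # auto-noncompliant: wrong if truly compliant
                    decided += 1
                    errors += int(y == 1)
                elif s >= hi:        # auto-compliant: wrong if truly noncompliant
                    decided += 1
                    errors += int(y == 0)
            if decided == 0:
                continue
            coverage = decided / n
            if errors / decided <= max_error and coverage > best[2]:
                best = (lo, hi, coverage)
    return best


# Toy validation set (made-up numbers, for illustration only).
val_scores = [0.05, 0.10, 0.20, 0.45, 0.55, 0.80, 0.90, 0.95]
val_labels = [0, 0, 0, 1, 0, 1, 1, 1]

lo, hi, coverage = tune_bands(val_scores, val_labels, max_error=0.02)
decision = triage(0.50, lo, hi)
```

On this toy set, the two ambiguous mid-range examples fall into the human-review band while the remaining six are auto-decided without error. The whole triage policy is then summarized by the two scalars `lo` and `hi`, which is what makes it straightforward to report and audit.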
