XLLM @ ACL 2025


The 1st Joint Workshop on Large Language Models and Structure Modeling


Vienna, Austria

August 1st, 2025 (All day event)

Workshop Introduction

Language structure modeling has long been a crucial subfield of natural language processing (NLP) that entails understanding the underlying semantic or syntactic structure of language and texts. Language structures range broadly from low-level morphological and syntactic forms (e.g., dependency structures and phrasal constituent structures) to high-level discourse and semantic structures (e.g., semantic parsing, semantic role labeling, and abstract meaning representation), and extend further, in a broader sense, to structure-aware NLP applications and multilingual and multimodal scenarios such as information extraction and structured sentiment analysis. In the past, modeling, inferring, and learning linguistic structures constituted an indispensable component of many NLP systems and were the key focus of a large proportion of NLP research.

The methodologies and paradigms of language structure modeling have changed dramatically with each wave of the deep learning revolution that began around a decade ago. In the last two to three years, Large Language Models (LLMs) have emerged, demonstrating unprecedented language understanding and generalization capabilities in effectively addressing a wide range of tasks. This raises critical questions: Is structure modeling in NLP still worth exploring in the LLM era? Do the methods and tasks that predate LLMs still hold value?

On the one hand, we wonder whether previous NLP structure modeling tasks, such as those concerning morphological/syntactic/semantic/discourse structures and high-level structure-aware applications, can achieve even stronger task performance with the powerful capabilities of LLMs.

On the other hand, we are also considering whether it is still necessary to model the underlying structures of language, given that large-scale pretraining on the surface form alone can endow LLMs with extraordinarily powerful language capabilities. In particular, can language structure modeling be beneficial for improving or understanding LLMs?

Thus, this 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025) at ACL 2025 aims to encourage discussions and highlight methods for language structure modeling in the era of LLMs. Specifically, we will explore two main directions: LLM for Structure Modeling (LLM4X) and Structure Modeling for LLM (X4LLM).

🔔News

🔥 [2025-06-15]: Our XLLM workshop will be held on August 1st, 2025, as a full-day event!
🔥 [2025-06-01]: All decisions for regular workshop papers, ARR-committed papers, extended abstracts, and shared task papers have been sent to the authors!
🔥 [2025-05-01]: We are working hard to finalize the decisions on regular workshop submissions; notifications will be sent by May 5, 2025 (AoE)!
🔥 [2025-04-27]: The OpenReview entry for non-archival extended abstract submissions is ready here; submit your paper soon!
🔥 [2025-03-31]: We have again extended the workshop direct submission deadline, from Mar 31 to Apr 6, 2025 (AoE)! Submissions are welcome; don't miss the boat!
🔥 [2025-03-14]: We have extended the workshop direct submission deadline from Mar 18 to Mar 31, 2025! Submissions are welcome!
🔥 [2025-03-12]: We have extended all shared task deadlines to Apr 6, 2025! Participation is welcome!
🔥 [2025-02-10]: All shared task data has been released! Participation is welcome!
🔥 [2025-02-05]: The second call for papers and participation is out! Submissions are welcome!
🔥 [2024-12-25]: The first call for papers is out! Submissions are welcome!


Important Dates

All deadlines are specified in AoE (Anywhere on Earth).

[Workshop Timeline]
April 6, 2025   Regular workshop paper: direct submission deadline (extended from March 18 and March 31)
April 20, 2025   Regular workshop paper: ARR pre-reviewed paper commitment deadline
May 5, 2025   Regular workshop paper: acceptance notification (extended from April 30)
May 25, 2025   Regular workshop paper: camera-ready deadline (extended from May 16)
May 30, 2025   Non-archival extended abstract: direct submission deadline
June 7, 2025   Non-archival extended abstract: acceptance notification
August 1, 2025   Workshop date
[Shared Task Timeline]
February 10, 2025   Training data and participant instructions released for all shared tasks
April 6, 2025   Evaluation deadline for all shared tasks (extended from March 30)
April 12, 2025   Notification for all shared tasks (extended from April 5)
April 25, 2025   Shared-task paper submission deadline (extended from April 20)
April 30, 2025   Shared-task paper acceptance notification
May 16, 2025   Shared-task paper camera-ready deadline

Call for Papers

Topics

We welcome paper submissions on all topics related to structure modeling under LLMs, including but not limited to:

  • LLM for Structure Modeling (LLM4X)
    • Low-level Syntactic Parsing and Methods
      • Morphological Parsing
      • Dependency Parsing/Constituency Parsing
      • Low-resource/Cross-lingual Syntactic Parsing
      • Head-driven Phrase Structure Grammar Parsing
      • Unsupervised Grammar Induction
      • Cross-modal Parsing/Vision-Language Grammar Induction
    • High-level Semantic Parsing and Methods
      • Semantic Dependency Parsing
      • Frame Parsing
      • Semantic Role Labeling
      • Abstract Meaning Representation
      • Uniform Meaning Representation
      • Universal Decompositional Semantic Parsing
      • Universal Conceptual Cognitive Annotation
      • Rhetorical Structure Theory (RST) Parsing
      • Conversation Discourse Parsing
      • Low-resource/Cross-lingual Semantic Parsing
    • Broader Structure-aware Applications and Methods
      • Information Extraction (IE): NER, RE, EE
      • Structured Sentiment Analysis (SSA), Aspect-based Sentiment Analysis (ABSA)
      • Low-resource/Cross-lingual IE/SSA/ABSA
      • Cross-modal IE/SSA/ABSA
      • Text-to-SQL
      • Table Parsing
      • Document Parsing
      • Scene Graph Parsing
      • Universal Structure Parsing/Modeling
      • Human-centered Parsing with LLM
      • Robustness Analysis of LLM-based Parsing
  • Structure Modeling for LLM (X4LLM)
    • Linguistic and/or mathematical arguments for or against the utility of linguistic structures in language models
    • Empirical studies of the utility of linguistic structures in language models
    • Integration of various types of linguistic structures into transformers or other architectures underlying modern language models
    • Incorporation of linguistic structures and representations as additional input or output in language modeling
    • Incorporation of training signals from linguistic structures in language model pre-training and post-training
    • Language model prompting with linguistic rules and structural information
    • Analyses and interpretation of transformers and language models through the lens of linguistic structures

Paper Submission Information

We welcome two types of papers: regular papers and non-archival extended abstracts. All submissions must follow the ACL/ARR formatting requirements and be made through OpenReview.

  • Regular workshop papers:

    Authors can submit papers of up to 8 pages, with unlimited pages for references. Authors may separately submit up to 100 MB of supplementary materials, along with their code for reproducibility. All submissions undergo double-blind review in a single track. Best Paper Award(s) will be given based on nominations from the reviewers. Accepted papers will be presented as posters, with the possibility of oral presentations, and will be included in the workshop proceedings.

  • Non-archival extended abstracts:

    Cross-submissions are welcome. Authors can submit extended abstracts up to 2 pages, with unlimited pages for references. An extended abstract may report on work in progress or work that has already appeared in or been accepted by another venue within two years before the workshop. It does not need to be anonymized, but should state explicitly where it was originally accepted or published. Accepted extended abstracts will be presented as posters and will not be included in the workshop proceedings.

In addition to papers submitted directly to the workshop, which will be reviewed by our Programme Committee, we also accept papers reviewed through ACL Rolling Review and committed to the workshop. Please check the relevant dates for each type of submission.

Accepted Papers

We are delighted to congratulate the authors of the following papers, which have been accepted to the XLLM Workshop!

  • Regular workshop papers:
    • BARTABSA++: Revisiting BARTABSA with Decoder LLMs (Oral)
    • Detecting Referring Expressions in Visually Grounded Dialogue with Autoregressive Language Models (Oral)
    • Can LLMs Interpret and Leverage Structured Linguistic Representations? A Case Study with AMRs (Oral)
    • LLM Dependency Parsing with In-Context Rules (Oral)
    • From Syntax to Semantics: Evaluating the Impact of Linguistic Structures on LLM-Based Information Extraction (Oral)
    • Cross-Document Event-Keyed Summarization (Oral)
    • Fine-Tuning Large Language Models for Relation Extraction within a Retrieval-Augmented Generation Framework
    • Benchmarking Table Extraction: Multimodal LLMs vs Traditional OCR
    • Injecting Structured Knowledge into LLMs via Graph Neural Networks
    • Regular-pattern-sensitive CRFs for Distant Label Interactions
    • Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis
    • Self-Contrastive Loop of Thought Method for Text-to-SQL Based on Large Language Model
    • Combining Automated and Manual Data for Effective Downstream Fine-Tuning of Transformers for Low-Resource Language Applications
    • Seamlessly Integrating Tree-Based Positional Embeddings into Transformer Models for Source Code Representation
    • Enhancing AMR Parsing with Group Relative Policy Optimization
    • Structure Modeling Approach for UD Parsing of Historical Modern Japanese
    • Typed-RAG: Type-Aware Decomposition of Non-Factoid Questions for Retrieval-Augmented Generation
    • Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction
    • Leveraging LLM-based sentiment analysis for portfolio optimization with proximal policy optimization
    • Cognitive Mirroring for DocRE: A Self-Supervised Iterative Reflection Framework with Triplet-Centric Explicit and Implicit Feedback
    • Transfer of Structural Knowledge from Synthetic Languages
    • Language Models are Universal Embedders
  • Shared task papers:
    • DiaDP@XLLM25: Advancing Chinese Dialogue Parsing via Unified Pretrained Language Models and Biaffine Dependency Scoring (Oral)
    • SpeechEE@XLLM25: End-to-End Structured Event Extraction from Speech (Oral)
    • SpeechEE@XLLM25: Retrieval-Enhanced Few-Shot Prompting for Speech Event Extraction
    • DocIE@XLLM25: UIEPrompter: A Unified Training-Free Framework for Universal Document-Level Information Extraction via Structured Prompt (Oral)
    • DocIE@XLLM25: In-Context Learning for Information Extraction using Fully Synthetic Demonstrations
    • DocIE@XLLM25: ZeroSemble - Robust and Efficient Zero-Shot Document Information Extraction with Heterogeneous Large Language Model Ensembles
    • LLMSR@XLLM25: A Language Model-Based Pipeline for Structured Reasoning Data Construction (Oral)
    • LLMSR@XLLM25: SWRV: Empowering Self-Verification of Small Language Models through Step-wise Reasoning and Verification
    • LLMSR@XLLM25: Integrating Reasoning Prompt Strategies with Structural Prompt Formats for Enhanced Logical Inference
    • LLMSR@XLLM25: Less is More: Enhancing Structured Multi-Agent Reasoning via Quality-Guided Distillation
    • LLMSR@XLLM25: An Empirical Study of LLM for Structural Reasoning
  • Non-archival extended abstract papers:
    • Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale
    • Structured Discourse Representation for Factual Consistency Verification
    • Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation
    • Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models
    • A Systematic Study of Compositional Syntactic Transformer Language Models
    • PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection
    • LinguaLens: Towards Interpreting Linguistic Structure of Large Language Models via Sparse Auto-Encoder
    • Focus on the Emotional Arc! Fine-tune the Model for Ultra-Long Novel Outline Reconstruction
    • Scaling Laws and Structure Acquisition in Neural Language Models: A Theory Based on Hierarchical Grammars
    • Compositional Generalization and Creativity in Language Diffusion Models
    • Leveraging Large Language Models for Structured Sentiment Analysis in Low-resource Domains
    • Leveraging LLM-based sentiment analysis for portfolio optimization with proximal policy optimization

Invited Keynote Speakers

Mark Johnson

Professor at Macquarie University

Bio: Mark Johnson is a Professor of Language Science (CORE) in the School of Computing at Macquarie University. He is also the Chief AI Scientist, Oracle Digital Assistant at Oracle Corporation, where he develops chatbots and digital assistants. The Oracle Digital Assistant division develops novel deep learning models to power the next generation of Conversational AI using semantic parsing. Mark Johnson has worked on a wide range of topics in computational linguistics, but his main area of research is natural language understanding, especially syntactic parsing and semantic analysis, and their applications to text and speech processing.

Title: The Changing Roles of (Linguistic) Structure in Computational Linguistics

Abstract: This talk describes the various roles that linguistic theory and structure have played in computational linguistics, and speculates about the role that they may play in the future. The closest relationship between linguistics and computational linguistics was probably with the Unification Grammars introduced in the 1980s, where the goal was to develop a computational model that implemented the linguistic theory. This close relationship proved impractical for scientific and sociological reasons that I’ll describe, and since then the relationship has steadily weakened. I argue that the huge training data and long context windows of Deep Learning models make it unnecessary to incorporate any specific linguistically-inspired parsing architecture into such models. While there are deep scientific questions about how LLMs “understand” human languages, their linguistic ability is sufficiently good for most practical tasks. Quite reasonably, most current research focuses on the information content of the language LLMs generate, such as reducing hallucinations and improving instruction-following. Thus it seems the main opportunities for linguistics to contribute to modern computational linguistics are in model evaluation and explainability.

Jan Hajič

Professor at Charles University

Bio: Jan Hajič is a professor of Computational Linguistics at the Institute of Formal and Applied Linguistics, School of Computer Science, Charles University, Prague, Czechia. His interests span fundamental formal linguistic problems, machine translation, deep language understanding, and applications. He has built resources for many languages with rich linguistic annotation, such as the Prague Dependency Treebank; he is currently leading a multi-institutional research infrastructure on language resources in Czechia, LINDAT/CLARIAH-CZ, and coordinating a Horizon Europe pilot project on building LLMs, HPLT. His work experience includes both industrial research (IBM Research Yorktown Heights, NY, USA) and academia (Charles University, Prague, Czechia, Johns Hopkins University and University of Colorado, USA, Fellow of the Centre for Advanced Studies at the Norway Academy of Sciences, and others). He has published more than 200 papers. He is a chair or member of many international and national boards and committees, such as the Steering Committee of the TACL journal.

Title: [TBD]

Abstract: [TBD]

Heng Ji

Professor at University of Illinois Urbana-Champaign

Bio: Heng Ji is a Professor of Computer Science at the Siebel School of Computing and Data Science, and a faculty member affiliated with the Electrical and Computer Engineering Department, Coordinated Science Laboratory, and Carl R. Woese Institute for Genomic Biology of the University of Illinois Urbana-Champaign. She is an Amazon Scholar. She is the Founding Director of the Amazon-Illinois Center on AI for Interactive Conversational Experiences (AICE), and the Founding Director of the CapitalOne-Illinois Center on AI Safety and Knowledge Systems (ASKS). She received her Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially Multimedia Multilingual Information Extraction, Knowledge-enhanced Large Language Models and Vision-Language Models, and AI for Science. The awards she has received include an Outstanding Paper Award at ACL 2024, two Outstanding Paper Awards at NAACL 2024, "Young Scientist" by the World Laureates Association in 2023 and 2024, "Young Scientist" and membership of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017, "Women Leaders of Conversational AI" (Class of 2023) by Project Voice, the "AI's 10 to Watch" Award by IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, the PACLIC 2012 Best Paper runner-up, "Best of ICDM 2013" and "Best of SDM 2013" paper awards, an ACL 2018 Best Demo Paper nomination, the ACL 2020 Best Demo Paper Award, the NAACL 2021 Best Demo Paper Award, Google Research Awards in 2009 and 2014, IBM Watson Faculty Awards in 2012 and 2014, and Bosch Research Awards in 2014-2018. She served as an associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing, and as Program Committee Co-Chair of many conferences, including NAACL-HLT 2018 and AACL-IJCNLP 2022. She was elected secretary of the North American Chapter of the Association for Computational Linguistics (NAACL) for 2020-2023.

Title: Structure is Key to Chemical Language Modeling

Abstract: Everything in our wonderful world is composed of molecules. Recent advances in block chemistry involve the manual design of drugs and materials by decomposing molecules into graph substructures—i.e., functional modules—and reassembling them into new molecules with desired functions. However, the process of discovering and manufacturing functional molecules has remained highly artisanal, slow, and expensive. In this talk I will present our recent efforts at teaching computers to speak two complementary languages: one that represents molecular subgraph structures indicative of specific functions, and another that describes these functions in natural language. Unlike existing approaches that add such knowledge as a post hoc step, we developed a function- and synthesis-aware modular chemical language model (mCLM). Inspired by bilingual speakers who frequently “code-switch” (naturally and often switch between their two languages within the same message), we propose a novel neural encoder that integrates molecular structure and natural language. mCLM incorporates both function- and synthesis-related knowledge into the small molecule tokenization process a priori. In experiments on 430 FDA-approved drugs, we find mCLM capable of significantly improving 5 out of 6 chemical functions critical to determining drug potentials. More importantly, mCLM can reason on multiple functions and improve the FDA-rejected drugs (“fallen angels”) over multiple iterations to greatly improve their shortcomings.

Nianwen Xue

Professor at Brandeis University

Bio: Nianwen Xue is a Professor of Linguistics and Computer Science at Brandeis University, specializing in computational linguistics and natural language processing. His research focuses on machine learning methods for syntactic, semantic, and discourse parsing, as well as the creation of large-scale linguistically annotated resources. He has been a principal developer of widely-used linguistic resources, including the Chinese Treebank and the Chinese Proposition Bank, and currently leads the Uniform Meaning Representation (UMR) project, a major initiative dedicated to developing a standardized, cross-linguistic framework for semantic representation. Recently, his research interests have expanded into computational social science, particularly exploring automatic verification of LLM-generated content and the computational analysis of media framing. Xue has served in various editorial and organizational capacities, including as Editor-in-Chief of ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) from 2016 to 2019, and as program co-chair for LREC-COLING 2024, a leading international conference in computational linguistics. He currently serves on the editorial boards of Computational Linguistics and the Journal of Language Resources and Evaluation. His research has received support from organizations including the National Science Foundation, DARPA, IARPA, and Amazon Machine Learning Research.

Title: Beyond Sentence-Level Semantics: Parsing and Evaluating Document-Level Meaning

Abstract: In the era of Large Language Models (LLMs), the role of linguistic structures in Natural Language Processing (NLP) has fundamentally shifted. Traditionally, linguistic representations such as syntactic trees and semantic graphs served primarily as intermediate forms supporting downstream applications like machine translation and question-answering systems. However, with the advent of LLMs capable of performing many NLP tasks end-to-end, this narrative needs reevaluation. For numerous applications, explicit linguistic structures have become unnecessary, while for others, their use has transitioned from intermediate representations toward end-products. In this talk, I will introduce Uniform Meaning Representation (UMR), a comprehensive, document-level semantic representation designed to function effectively as a knowledge graph. UMR integrates sentence-level semantic analyses focusing on named entities and predicate-argument structures with document-level analyses that address temporal relations between events and time expressions, modal dependencies involving events and their cognizers, and coreference relations linking entities and events across text. Additionally, I will present preliminary results for parsing UMR structures from English and Chinese texts, along with novel metrics specifically designed for evaluating document-level semantic representations like UMR.

Program Schedule

- TBD -

Shared Tasks

In addition to paper contributions, we are organizing open challenges on structure-related NLP tasks. Through these shared tasks, we aim to provide a centralized platform for further exploring and advancing traditional or newly-emerged structure-aware tasks.

We have set up three shared tasks as follows. Participants can access the respective task pages to learn about the specific participation requirements. System submissions will be evaluated using automatic metrics, with a focus on the accuracy and relevance of the results. Participants can submit at Codabench.

Teams that achieve top rankings in some of the shared tasks will receive cash prizes. Winning participants are required to write a technical paper fully describing their techniques and experimental results, and to prepare a poster or oral presentation showcasing their methods and approaches on-site.

Task-I: Dialogue-Level Dependency Parsing (DiaDP)

DiaDP aims to build a unified word-wise dependency tree for dialogue contexts. The tree integrates both inner-EDU dependencies (within Elementary Discourse Units, EDUs) and inter-EDU dependencies (across EDUs) to represent the syntactic and discourse relationships between words in dialogues. Given a dialogue consisting of multiple utterances segmented into EDUs, where each utterance is treated as a sentence-like unit, DiaDP outputs a structured dependency tree that includes: 1) inner-EDU dependencies: syntactic relationships within individual EDUs; and 2) inter-EDU dependencies: discourse relationships connecting different EDUs, including cross-utterance links. The task is offered in both zero-shot and few-shot learning settings.
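
To make the target structure concrete, here is a minimal, illustrative Python sketch of how a dialogue-level dependency tree mixing inner-EDU and inter-EDU arcs could be represented. The class, field names, word indices, and relation labels below are hypothetical assumptions for illustration only; the official DiaDP data format is specified on the challenge website.

```python
# Hypothetical sketch of a dialogue-level dependency structure (not the
# official DiaDP format; see the challenge website for the real data schema).
from dataclasses import dataclass
from typing import List

@dataclass
class Arc:
    head: int        # 1-based index of the head word (0 = virtual root)
    dep: int         # 1-based index of the dependent word
    label: str       # syntactic or discourse relation label (illustrative)
    inter_edu: bool  # True for inter-EDU (discourse) arcs, False for inner-EDU

# Two utterances, each treated as a sentence-like unit and segmented into EDUs:
# EDU 1: "I booked a table"   EDU 2: "Great,"   EDU 3: "what time?"
words: List[str] = ["I", "booked", "a", "table", "Great", ",", "what", "time", "?"]
arcs: List[Arc] = [
    Arc(0, 2, "root", False),        # inner-EDU: "booked" heads EDU 1
    Arc(2, 1, "nsubj", False),       # inner-EDU: "I" depends on "booked"
    Arc(2, 4, "obj", False),         # inner-EDU: "table" depends on "booked"
    Arc(4, 3, "det", False),         # inner-EDU: "a" depends on "table"
    Arc(2, 5, "evaluation", True),   # inter-EDU: reply EDU attaches across utterances
    Arc(5, 6, "punct", False),       # inner-EDU: comma attaches to "Great"
    Arc(5, 8, "elaboration", True),  # inter-EDU: question EDU attaches to "Great"
    Arc(8, 7, "det", False),         # inner-EDU: "what" depends on "time"
    Arc(8, 9, "punct", False),       # inner-EDU: "?" attaches to "time"
]
```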

The task bridges the gap between sentence-level dependency parsing and discourse-level parsing by extending syntactic tree structures to dialogue scenarios and incorporating both rhetorical and syntactic elements into the tree. The top-3 teams will receive a certificate for their performance and will be invited to write technical papers to be included in the workshop proceedings. For more details on how to participate, visit the DiaDP challenge website.

Task-II: Speech Event Extraction (SpeechEE)

SpeechEE aims to detect event predicates and arguments directly from audio speech, enabling information acquisition from spoken content such as meetings, interviews, and press releases. SpeechEE is defined as follows: given a speech audio input consisting of a sequence of acoustic frames, the goal is to extract structured event records comprising four elements: 1) the event type, 2) the event trigger, 3) the event argument roles, and 4) the corresponding event arguments.
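
As a rough illustration of the target output (with hypothetical field names, not the official SpeechEE submission format, which is defined on the task page), a single extracted event record could look like this:

```python
# Hypothetical sketch of one structured event record extracted from speech
# (illustrative field names; see the SpeechEE task page for the official format).
from typing import List, TypedDict

class Argument(TypedDict):
    role: str   # an event argument role, e.g. "Participant"
    text: str   # the argument mention recovered from the audio

class EventRecord(TypedDict):
    event_type: str            # 1) the event type
    trigger: str               # 2) the event trigger
    arguments: List[Argument]  # 3) argument roles and 4) the corresponding arguments

example: EventRecord = {
    "event_type": "Meet",
    "trigger": "meeting",
    "arguments": [
        {"role": "Participant", "text": "the two ministers"},
        {"role": "Place", "text": "Vienna"},
    ],
}
```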

This task bridges the gap between traditional textual event extraction and real-world speech scenarios, providing a foundation for structured knowledge extraction from audio data. The top-3 teams will receive a certificate for their performance and will be invited to write technical papers to be included in the workshop proceedings. For more details on how to participate, visit the SpeechEE challenge website.

Task-III: LLM for Structural Reasoning (LLM-SR)

LLM-SR seeks to generate a controllable and interpretable reasoning process by leveraging structural reasoning. It requires the structural parsing of two distinct components, major premises and minor premises, followed by identifying fine-grained “alignments” between these two structures and ultimately deriving a conclusion.
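
For intuition only, a structured reasoning instance of this kind might be sketched as below; the structure, example content, and field names are assumptions made for illustration and are not the task's official annotation format (see the LLM-SR challenge website for that).

```python
# Purely illustrative sketch of a structural-reasoning record: parsed major and
# minor premises, fine-grained alignments between them, and a derived conclusion.
# Field names and content are hypothetical; the official format is on the task page.
reasoning_record = {
    "major_premise": {   # parsed general rule
        "condition": "a contract is signed under duress",
        "consequence": "the contract is voidable",
    },
    "minor_premise": {   # parsed case-specific facts
        "facts": ["the agreement was signed under threat of violence"],
    },
    "alignments": [      # fine-grained links between the two structures
        {"major": "signed under duress",
         "minor": "signed under threat of violence"},
    ],
    "conclusion": "the agreement is voidable",
}
```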

This task can be regarded as a constrained Chain-of-Thought (CoT) reasoning process, where reasoning is conducted step by step with reference to facts and relevant rules, thereby improving the transparency and reliability of the process. Cash prizes will be awarded to the top three teams. For more details on how to participate, visit the LLM-SR challenge website.

Task-IV: Document-level Information Extraction (DocIE)

DocIE focuses on extracting information from long documents rather than isolated sentences, necessitating the integration of information both within and across multiple sentences while capturing complex interactions. Given a document and a predefined schema, DocIE requires the extraction of each instance (which may be null) corresponding to the schema's elements. This process involves identifying: (1) types of entities, (2) coreference relationships among mentions, (3) types of relations, and (4) the head and tail entities of each identified relation.
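
The following Python sketch illustrates, under assumed field names (not the official DocIE schema or submission format, which is given on the challenge website), what a predefined schema and the corresponding document-level extraction might look like:

```python
# Illustrative DocIE-style schema and output extraction (hypothetical field
# names and example text; see the challenge website for the official format).
schema = {
    "entity_types": ["Person", "Organization"],
    "relation_types": ["employed_by"],
}

document = ("Marie Curie worked at the University of Paris. "
            "Curie later led its radium institute.")

extraction = {
    "entities": [  # (1) entity types and (2) coreference clusters of mentions
        {"id": "e1", "type": "Person",
         "mentions": ["Marie Curie", "Curie"]},
        {"id": "e2", "type": "Organization",
         "mentions": ["University of Paris"]},
    ],
    "relations": [  # (3) relation types and (4) head/tail entities of each relation
        {"type": "employed_by", "head": "e1", "tail": "e2"},
    ],
}
```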

This task evaluates the ability of large language models (LLMs) to extract information from long-context documents and comprehend abstract concepts, thereby advancing their application in mining critical, domain-specific information across various fields. Cash prizes will be awarded to the top three teams. For more details on how to participate, visit the DocIE challenge website.

Organization Team

Hao Fei

National University of Singapore

Kewei Tu

ShanghaiTech University

Yuhui Zhang

Stanford University

Xiang Hu

Ant Research

Wenjuan Han

Beijing Jiaotong University

Zixia Jia

BigAI

Zilong Zheng

BigAI

Yixin Cao

Fudan University

Meishan Zhang

Harbin Institute of Technology (Shenzhen)

Wei Lu

Singapore University of Technology and Design

N. Siddharth

University of Edinburgh

Lilja Øvrelid

University of Oslo

Nianwen Xue

Brandeis University

Yue Zhang

Westlake University

Program Committee

Contact

Join and post at our Google Group!
Email the organizers at xllm2025@googlegroups.com.