XLLM @ ACL 2025


The 1st Joint Workshop on Large Language Models and Structure Modeling


Vienna, Austria

July 27–August 1, 2025

Workshop Introduction

Language structure modeling has long been a crucial subfield of natural language processing (NLP) that entails understanding the underlying semantic or syntactic structure of language and texts. Language structures range broadly from low-level morphological/syntactic types (e.g., dependency structures and phrasal constituent structures) to high-level discourse/semantic structures (e.g., semantic parsing, semantic role labeling, abstract meaning representation), and in a broader sense extend to further NLP applications and multi-lingual and multi-modal scenarios, such as information extraction and structured sentiment analysis. Traditionally, modeling, inferring, and learning about linguistic structures constituted an indispensable component of many NLP systems and were the key focus of a large proportion of NLP research.

The methodologies and paradigms of language structure modeling have changed dramatically with each wave of the deep learning revolution that began roughly a decade ago. In the last two to three years, Large Language Models (LLMs) have emerged, demonstrating unprecedented language understanding and generalization capabilities and effectively addressing a wide range of tasks. This raises critical questions: Is NLP structure modeling still worth exploring in the LLM era? Do the methods and tasks that predate LLMs still hold value?

On the one hand, we wonder whether previous NLP structure modeling tasks, such as those concerning morphological/syntactic/semantic/discourse structures and high-level structure-aware applications, can achieve even stronger task performance with the powerful capabilities of LLMs.

On the other hand, we are also considering whether it is still necessary to model the underlying structures of language, given that large-scale pretraining on the surface form alone can endow LLMs with extraordinarily powerful language capabilities. In particular, can language structure modeling be beneficial for improving or understanding LLMs?

Thus, this 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025) at ACL 2025 aims to encourage discussions and highlight methods for language structure modeling in the era of LLMs. Specifically, we will explore two main directions: LLM for Structure Modeling (LLM4X) and Structure Modeling for LLM (X4LLM).

🔔News

🔥 [2024-12-25]: The first call for papers is out! Submissions are welcome!


Important Dates

All deadlines are specified in AoE (Anywhere on Earth).

[Workshop Timeline]
  • March 18, 2025: Direct workshop paper submission deadline
  • March 25, 2025: ARR pre-reviewed workshop paper commitment deadline
  • April 5, 2025: Notification of all shared tasks
  • April 30, 2025: Acceptance notification of all papers
  • May 16, 2025: Camera-ready paper deadline
  • July 7, 2025: Pre-recorded video due (hard deadline)
  • July 31–August 1, 2025: Workshop dates (TBD)
[Shared Task Timeline]
  • February 10, 2025: Training data and participant instructions released for all shared tasks
  • March 30, 2025: Evaluation deadline for all shared tasks
  • April 5, 2025: Notification of all shared tasks
  • April 12, 2025: Shared-task paper submission deadline
  • April 30, 2025: Acceptance notification of shared-task papers
  • May 16, 2025: Camera-ready paper deadline

Call for Papers

Topics

We welcome paper submissions on all topics related to structure modeling under LLMs, including but not limited to:

  • LLM for Structure Modeling (LLM4X)
    • Low-level Syntactic Parsing and Methods
      • Morphological Parsing
      • Dependency Parsing/Constituency Parsing
      • Low-resource/Cross-lingual Syntactic Parsing
      • Head-driven Phrase Structure Grammar Parsing
      • Unsupervised Grammar Induction
      • Cross-modal Parsing/Vision-Language Grammar Induction
    • High-level Semantic Parsing and Methods
      • Semantic Dependency Parsing
      • Frame Parsing
      • Semantic Role Labeling
      • Abstract Meaning Representation
      • Uniform Meaning Representation
      • Universal Decompositional Semantic Parsing
      • Universal Conceptual Cognitive Annotation
      • Rhetorical Structure Theory (RST) Parsing
      • Conversation Discourse Parsing
      • Low-resource/Cross-lingual Semantic Parsing
    • Broader Structure-aware Applications and Methods
      • Information Extraction (IE): NER, RE, EE
      • Structured Sentiment Analysis (SSA), Aspect-based Sentiment Analysis (ABSA)
      • Low-resource/Cross-lingual IE/SSA/ABSA
      • Cross-modal IE/SSA/ABSA
      • Text-to-SQL
      • Table Parsing
      • Document Parsing
      • Universal Structure Parsing/Modeling
      • Human-centered Parsing with LLM
      • Robustness Analysis of LLM-based Parsing
  • Structure Modeling for LLM (X4LLM)
    • Linguistic and/or mathematical arguments for or against the utility of linguistic structures in language models
    • Empirical studies of the utility of linguistic structures in language models
    • Integration of various types of linguistic structures into transformers or other architectures underlying modern language models
    • Incorporation of linguistic structures and representations as additional input or output in language modeling
    • Incorporation of training signals from linguistic structures in language model pre-training and post-training
    • Language model prompting with linguistic rules and structural information
    • Analyses and interpretation of transformers and language models through the lens of linguistic structures

Paper Submission Information

We welcome two types of papers: regular papers and non-archival extended abstracts. All submissions must follow the ACL/ARR formatting requirements and be made through OpenReview.

  • Regular workshop papers:

    Authors may submit papers of up to 8 pages, with unlimited pages for references. Up to 100 MB of supplementary materials, including code for reproducibility, may be submitted separately. All submissions undergo double-blind, single-track review. Best Paper Award(s) will be given based on nominations by the reviewers. Accepted papers will be presented as posters, with the possibility of oral presentations, and will be included in the workshop proceedings.

  • Non-archival extended abstracts:

    Cross-submissions are welcome. Authors may submit extended abstracts of 6 pages (short) to 8 pages (long), with unlimited pages for references. An extended abstract may report work in progress or work that has already appeared in, or been accepted by, another venue within the two years before the workshop. It does not need to be anonymized, but it should state explicitly where it was originally accepted or published. Accepted extended abstracts will be presented as posters and will not be included in the workshop proceedings.

In addition to papers submitted directly to the workshop, which will be reviewed by our Program Committee, we also accept papers that have been reviewed through ACL Rolling Review (ARR) and are committed to the workshop. Please check the relevant dates for each type of submission.

Invited Keynote Speakers


Mark Johnson

Macquarie University

Bio: Mark Johnson is a Professor of Language Science (CORE) in the School of Computing at Macquarie University. He is also the Chief AI Scientist for the Oracle Digital Assistant at Oracle Corporation, where he develops chatbots and digital assistants. The Oracle Digital Assistant division develops novel deep learning models to power the next generation of conversational AI using semantic parsing. Mark Johnson has worked on a wide range of topics in computational linguistics, but his main area of research is natural language understanding, especially syntactic parsing and semantic analysis and their applications to text and speech processing.

Title: TBD

Abstract: TBD.

Nianwen Xue

Professor at Brandeis University

Bio: Nianwen Xue is a Professor at Brandeis University. He directs the Chinese Language Processing Group in the Computer Science Department and the Language & Linguistics Program. His research interests include developing linguistic corpora annotated with syntactic, semantic, and discourse structures, as well as machine learning approaches to syntactic, semantic, and discourse parsing. He is a co-founder of various linguistic corpora and projects, such as CPB, SRL, and OntoNotes. His research has been funded by NSF, DARPA, and IARPA. He served as the Editor-in-Chief of TALLIP and currently serves on the editorial boards of LRE and Lingua Sinica.

Title: Uniform Meaning Representation

Abstract: TBD.

Marianna Apidianaki

Senior Researcher at University of Pennsylvania

Bio: TBD.

Title: TBD

Abstract: TBD.

Program Schedule

- TBD -

Shared Tasks

In addition to paper contributions, we are organizing open challenges on structure-related NLP tasks. Through these shared tasks, we aim to provide a centralized platform for further exploring and advancing both traditional and newly emerging structure-aware tasks.

We have set up four shared tasks, described below. Participants can access the respective task pages to learn about the specific participation requirements. System submissions will be evaluated using automatic metrics, with a focus on the accuracy and relevance of the results. Submissions are made via Codabench.

Teams that achieve top rankings in the shared tasks will receive cash prizes. Winning participants are required to write a technical paper that fully describes their techniques and experimental results. Additionally, they will need to prepare a poster or an oral presentation to showcase their methods and approaches on-site.

Task-I: Dialogue-Level Dependency Parsing (DiaDP)

DiaDP aims to build a unified word-wise dependency tree for dialogue contexts. The tree integrates both inner-EDU dependencies (within Elementary Discourse Units, EDUs) and inter-EDU dependencies (across EDUs) to represent the syntactic and discourse relationships between words in dialogues. Given a dialogue consisting of multiple utterances segmented into EDUs, where each utterance is treated as a sentence-like unit, DiaDP outputs a structured dependency tree that includes: 1) inner-EDU dependencies, i.e., syntactic relationships within individual EDUs; and 2) inter-EDU dependencies, i.e., discourse relationships connecting different EDUs, including cross-utterance links. The task is offered in both zero-shot and few-shot learning settings.
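As a rough, hypothetical illustration of such an output structure (not the official DiaDP data format, which is specified on the challenge website), a dialogue-level dependency tree could be sketched in Python as follows; the field names, indexing scheme, and relation labels are our own assumptions:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical sketch of a dialogue-level dependency tree.
# Token indices are 1-based over `tokens`, with 0 reserved for a virtual root
# (a common CoNLL-style convention); the official DiaDP format may differ.

@dataclass
class Arc:
    head: int        # index of the head token (0 = virtual root)
    dependent: int   # index of the dependent token
    relation: str    # syntactic or discourse relation label
    inter_edu: bool  # True if the arc crosses EDU boundaries

@dataclass
class DialogueTree:
    tokens: List[str]                 # all dialogue tokens, in order
    edu_spans: List[Tuple[int, int]]  # 1-based inclusive (start, end) span of each EDU
    arcs: List[Arc] = field(default_factory=list)

# Toy two-utterance dialogue: "Is it raining ?" / "Yes , heavily ."
tree = DialogueTree(
    tokens=["Is", "it", "raining", "?", "Yes", ",", "heavily", "."],
    edu_spans=[(1, 4), (5, 8)],
    arcs=[
        Arc(3, 1, "aux", False),     # inner-EDU: "Is" attaches to "raining"
        Arc(3, 2, "nsubj", False),   # inner-EDU: "it" attaches to "raining"
        Arc(0, 3, "root", False),    # root of the first EDU
        Arc(3, 5, "QAP", True),      # inter-EDU: toy question-answer link across utterances
        Arc(5, 7, "advmod", False),  # inner-EDU: toy analysis of "heavily" under "Yes"
    ],
)

if __name__ == "__main__":
    inner = sum(not a.inter_edu for a in tree.arcs)
    inter = sum(a.inter_edu for a in tree.arcs)
    print(f"{inner} inner-EDU arcs, {inter} inter-EDU arcs")
```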

The task bridges the gap between sentence-level dependency parsing and discourse-level parsing by extending syntactic tree structures to dialogue scenarios and incorporating both rhetorical and syntactic elements into the tree. Cash prizes will be awarded to the top three teams. For more details on participation, visit the DiaDP challenge website.

Task-II: Speech Event Extraction (SpeechEE)

SpeechEE aims to detect event predicates and arguments directly from audio speech, enabling information acquisition from spoken content such as meetings, interviews, and press releases. The task is defined as follows: given a speech audio input consisting of a sequence of acoustic frames, the goal is to extract structured event records comprising four elements: 1) the event type, 2) the event trigger, 3) the event argument roles, and 4) the corresponding event arguments.
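Purely for illustration (the official SpeechEE schema and label inventory are given in the task instructions), a four-element event record of this kind might look like the following Python sketch; the event type and role names are invented for the example:

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical sketch of one structured event record extracted from speech;
# the official SpeechEE schema and label set are defined in the task instructions.

@dataclass
class EventRecord:
    event_type: str                   # 1) event type
    trigger: str                      # 2) event trigger, as it would appear in the transcript
    arguments: List[Tuple[str, str]]  # 3) + 4) (argument role, argument) pairs

# Example for a spoken sentence such as:
# "The company announced it will acquire the startup for two billion dollars."
record = EventRecord(
    event_type="Business.Acquisition",  # invented label, not from the official ontology
    trigger="acquire",
    arguments=[
        ("Acquirer", "the company"),
        ("Target", "the startup"),
        ("Price", "two billion dollars"),
    ],
)

if __name__ == "__main__":
    print(record.event_type, "|", record.trigger, "|", record.arguments)
```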

This task bridges the gap between traditional textual event extraction and real-world speech scenarios, providing a foundation for structured knowledge extraction from audio data. Cash prizes will be awarded to the top three teams. For more details on participation, visit the SpeechEE challenge website.

Task-III: LLM for Structural Reasoning (LLM-SR)

LLM-SR seeks to generate a controllable and interpretable reasoning process by leveraging structural reasoning. It requires the structural parsing of two distinct components, major premises and minor premises, followed by identifying fine-grained “alignments” between these two structures and ultimately deriving a conclusion.
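As a hedged sketch of what such a structured reasoning output could look like (the official LLM-SR input/output format is specified on the challenge website), the following toy Python example uses invented premises, alignments, and a conclusion:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical sketch of a structural-reasoning output: parsed major and minor
# premises, fine-grained alignments between them, and a derived conclusion.
# The official LLM-SR input/output format is specified on the challenge website.

@dataclass
class StructuredReasoning:
    major_premise: List[str]  # parsed components of the rule (major premise)
    minor_premise: List[str]  # parsed components of the fact (minor premise)
    alignments: List[Tuple[int, int]] = field(default_factory=list)  # (major_idx, minor_idx)
    conclusion: str = ""

# Toy syllogism-style example with invented content.
example = StructuredReasoning(
    major_premise=["all contracts signed under duress", "are voidable"],
    minor_premise=["this contract", "was signed under duress"],
    alignments=[(0, 1)],  # the duress condition in the rule aligns with the stated fact
    conclusion="this contract is voidable",
)

if __name__ == "__main__":
    for i, j in example.alignments:
        print(f"align: {example.major_premise[i]!r} <-> {example.minor_premise[j]!r}")
    print("conclusion:", example.conclusion)
```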

This task can be regarded as a constrained Chain-of-Thought (CoT) reasoning process, in which reasoning proceeds step by step with reference to facts and relevant rules, thereby improving the transparency and reliability of the process. Cash prizes will be awarded to the top three teams. For more details on participation, visit the LLM-SR challenge website.

Task-IV: Document-level Information Extraction (DocIE)

DocIE focuses on extracting information from long documents rather than isolated sentences, which necessitates integrating information both within and across sentences while capturing complex interactions. Given a document and a predefined schema, DocIE requires extracting each instance (which may be null) corresponding to the schema's elements. This involves identifying: (1) entity types, (2) coreference relationships among mentions, (3) relation types, and (4) the head and tail entities of each identified relation.
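For illustration only (the official DocIE schema and annotation format are provided on the challenge website), a document-level extraction result covering the four elements above might be sketched as follows; the schema labels and the example document content are invented:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical sketch of a document-level extraction result against a predefined
# schema; the official DocIE schema and annotation format are on the challenge website.

@dataclass
class Entity:
    entity_type: str     # (1) entity type from the schema
    mentions: List[str]  # (2) coreferent mentions grouped into one entity

@dataclass
class Relation:
    relation_type: str   # (3) relation type from the schema
    head: Optional[str]  # (4) head entity (None if the schema element has no instance)
    tail: Optional[str]  #     tail entity

@dataclass
class DocIEOutput:
    entities: List[Entity] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)

# Toy example with invented schema labels and document content.
output = DocIEOutput(
    entities=[
        Entity("Person", ["Marie Curie", "Curie", "she"]),
        Entity("Organization", ["University of Paris"]),
    ],
    relations=[
        Relation("employed_by", head="Marie Curie", tail="University of Paris"),
    ],
)

if __name__ == "__main__":
    print(len(output.entities), "entities,", len(output.relations), "relations")
```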

This task evaluates the ability of large language models (LLMs) to extract information from long-context documents and comprehend abstract concepts, thereby advancing their application in mining critical, domain-specific information across various fields. Cash prizes will be awarded to the top three teams. For more details on participation, visit the DocIE challenge website.

Organization Team


Hao Fei

National University of Singapore

Kewei Tu

ShanghaiTech University

Yuhui Zhang

Stanford University

Xiang Hu

Ant Research

Wenjuan Han

Beijing Jiaotong University

Zixia Jia

BigAI

Zilong Zheng

University of California, Los Angeles

Yixin Cao

Fudan University

Meishan Zhang

Harbin Institute of Technology (Shenzhen)

Wei Lu

Singapore University of Technology and Design

N. Siddharth

University of Edinburgh

Lilja Øvrelid

University of Oslo

Nianwen Xue

Brandeis University

Yue Zhang

Westlake University

Program Committee

  • David Chiang, University of Notre Dame
  • Milos Stanojevic, University College London
  • Jennifer Hu, Johns Hopkins University
  • Lei Li, Carnegie Mellon University
  • Ryan Cotterell, ETH Zürich
  • Min-Yen Kan, National University of Singapore
  • Scott Wen-tau Yih, Meta AI
  • Marianna Apidianaki, University of Pennsylvania
  • Joakim Nivre, Uppsala University
  • Ryan McDonald, Microsoft Research
  • Alexander Clark, Gothenburg University
  • Heng Ji, University of Illinois Urbana-Champaign
  • Xuanjing Huang, Fudan University
  • Erik Cambria, Nanyang Technological University
  • Luheng He, Google
  • Freda Shi, University of Waterloo
  • Yikang Shen, MIT-IBM Watson Lab
  • Jan Hajič, Charles University
  • Yanpeng Zhao, Beijing Institute for General Artificial Intelligence
  • Jishnu Ray Chowdhury, University of Illinois Chicago
  • Matthias Lindemann, University of Edinburgh
  • Wei Liu, Hong Kong University of Science and Technology
  • Chao Lou, ShanghaiTech
  • Mattia Opper, University of Edinburgh
  • Haoyi Wu, ShanghaiTech
  • Songlin Yang, Massachusetts Institute of Technology
  • Ryo Yoshida, University of Tokyo
  • Yu Zhang, Soochow University

Contact

Join and post at our Google Group!
Email the organizers at xllm2025@googlegroups.com.