XLLM @ ACL 2025


The 1st Joint Workshop on Large Language Models and Structure Modeling


Vienna, Austria

August 1st, 2025 (All day event)

Workshop Introduction

Language structure modeling has long been a crucial subfield of natural language processing (NLP) that entails understanding the underlying semantic or syntactic structure of language and texts. Language structures range broadly from low-level morphological and syntactic forms (e.g., dependency structures and phrasal constituent structures) to high-level discourse and semantic structures (e.g., semantic parsing, semantic role labeling, and abstract meaning representation), and extend further, in a broader sense, to structure-aware NLP applications and multilingual and multimodal scenarios such as information extraction and structured sentiment analysis. In the past, modeling, inferring, and learning linguistic structures constituted an indispensable component of many NLP systems and were the key focus of a large proportion of NLP research.

The methodologies and paradigms of language structure modeling have changed dramatically with each wave of the deep learning revolution that began around a decade ago. In the last two to three years, Large Language Models (LLMs) have emerged, demonstrating unprecedented language understanding and generalization capabilities in effectively addressing a wide range of tasks. This raises critical questions: Is structure modeling in NLP still worth exploring in the LLM era? Do the methods and tasks that predate LLMs still hold value?

On the one hand, we wonder whether previous NLP structure modeling tasks, such as those concerning morphological/syntactic/semantic/discourse structures and high-level structure-aware applications, can achieve even stronger task performance with the powerful capabilities of LLMs.

On the other hand, we are also considering whether it is still necessary to model the underlying structures of language, given that large-scale pretraining on the surface form alone can endow LLMs with extraordinarily powerful language capabilities. In particular, can language structure modeling be beneficial for improving or understanding LLMs?

Thus, this 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025) at ACL 2025 aims to encourage discussions and highlight methods for language structure modeling in the era of LLMs. Specifically, we will explore two main directions: LLM for Structure Modeling (LLM4X) and Structure Modeling for LLM (X4LLM).

🔔News

🔥 [2025-06-15]: Our XLLM workshop will be held on August 1st, 2025, as a full-day event!
🔥 [2025-06-01]: All decisions for regular workshop papers, ARR-committed papers, extended abstracts, and shared task papers have been sent to the authors!
🔥 [2025-05-01]: We are working hard to finalize the decisions on regular workshop submissions; notifications will be sent by May 5, 2025 (AoE)!
🔥 [2025-04-27]: The OpenReview entry for non-archival extended abstract submissions is ready here; submit your paper soon!
🔥 [2025-03-31]: We have again extended the workshop direct submission deadline, from Mar 31 to Apr 6, 2025 (AoE)! Submissions are welcome; don't miss the boat!
🔥 [2025-03-14]: We have extended the workshop direct submission deadline from Mar 18 to Mar 31, 2025! Submissions are welcome!
🔥 [2025-03-12]: We have extended all shared task deadlines to Apr 6, 2025! Participation is welcome!
🔥 [2025-02-10]: All shared task data has been released! Participation is welcome!
🔥 [2025-02-05]: The second call for papers and participation is out! Submissions are welcome!
🔥 [2024-12-25]: The first call for papers is out! Submissions are welcome!


Important Dates

All deadlines are specified in AoE (Anywhere on Earth).

[Workshop Timeline]
April 6, 2025   Regular workshop paper: direct submission deadline (extended from March 18 and March 31)
April 20, 2025   Regular workshop paper: ARR pre-reviewed paper commitment deadline
May 5, 2025   Regular workshop paper: acceptance notification (extended from April 30)
May 25, 2025   Regular workshop paper: camera-ready deadline (extended from May 16)
May 30, 2025   Non-archival extended abstract: direct submission deadline
June 7, 2025   Non-archival extended abstract: acceptance notification
August 1, 2025   Workshop date
[Shared Task Timeline]
February 10, 2025   Training data and participant instructions released for all shared tasks
April 6, 2025   Evaluation deadline for all shared tasks (extended from March 30)
April 12, 2025   Notification for all shared tasks (extended from April 5)
April 25, 2025   Shared-task paper submission deadline (extended from April 20)
April 30, 2025   Shared-task paper acceptance notification
May 16, 2025   Shared-task paper camera-ready deadline

Call for Papers

Topics

We welcome paper submissions on all topics related to structure modeling under LLMs, including but not limited to:

  • LLM for Structure Modeling (LLM4X)
    • Low-level Syntactic Parsing and Methods
      • Morphological Parsing
      • Dependency Parsing/Constituency Parsing
      • Low-resource/Cross-lingual Syntactic Parsing
      • Head-driven Phrase Structure Grammar Parsing
      • Unsupervised Grammar Induction
      • Cross-modal Parsing/Vision-Language Grammar Induction
    • High-level Semantic Parsing and Methods
      • Semantic Dependency Parsing
      • Frame Parsing
      • Semantic Role Labeling
      • Abstract Meaning Representation
      • Uniform Meaning Representation
      • Universal Decompositional Semantic Parsing
      • Universal Conceptual Cognitive Annotation
      • Rhetorical Structure Theory (RST) Parsing
      • Conversation Discourse Parsing
      • Low-resource/Cross-lingual Semantic Parsing
    • Broader Structure-aware Applications and Methods
      • Information Extraction (IE): NER, RE, EE
      • Structured Sentiment Analysis (SSA), Aspect-based Sentiment Analysis (ABSA)
      • Low-resource/Cross-lingual IE/SSA/ABSA
      • Cross-modal IE/SSA/ABSA
      • Text-to-SQL
      • Table Parsing
      • Document Parsing
      • Scene Graph Parsing
      • Universal Structure Parsing/Modeling
      • Human-centered Parsing with LLM
      • Robustness Analysis of LLM-based Parsing
  • Structure Modeling for LLM (X4LLM)
    • Linguistic and/or mathematical arguments for or against the utility of linguistic structures in language models
    • Empirical studies of the utility of linguistic structures in language models
    • Integration of various types of linguistic structures into transformers or other architectures underlying modern language models
    • Incorporation of linguistic structures and representations as additional input or output in language modeling
    • Incorporation of training signals from linguistic structures in language model pre-training and post-training
    • Language model prompting with linguistic rules and structural information
    • Analyses and interpretation of transformers and language models through the lens of linguistic structures

Paper Submission Information

We welcome two types of papers: regular papers and non-archival extended abstracts. All submissions must follow the ACL/ARR formatting requirements and be made through OpenReview.

  • Regular workshop papers:

    Authors can submit papers of up to 8 pages, with unlimited pages for references. Authors may separately submit up to 100 MB of supplementary materials, along with their code for reproducibility. All submissions undergo double-blind review in a single track. Best Paper Award(s) will be given based on nominations from the reviewers. Accepted papers will be presented as posters, with the possibility of oral presentations, and will be included in the workshop proceedings.

  • Non-archival extended abstracts:

    Cross-submissions are welcome. Authors can submit extended abstracts up to 2 pages, with unlimited pages for references. An extended abstract may report on work in progress or work that has already appeared in or been accepted by another venue within two years before the workshop. It does not need to be anonymized, but should state explicitly where it was originally accepted or published. Accepted extended abstracts will be presented as posters and will not be included in the workshop proceedings.

In addition to papers submitted directly to the workshop, which will be reviewed by our Programme Committee, we also accept papers reviewed through ACL Rolling Review and committed to the workshop. Please check the relevant dates for each type of submission.

Accepted Papers

We are delighted to congratulate the authors of the following papers, which have been accepted to the XLLM Workshop!

  • Regular workshop papers:
    • BARTABSA++: Revisiting BARTABSA with Decoder LLMs (Oral)
    • Detecting Referring Expressions in Visually Grounded Dialogue with Autoregressive Language Models (Oral)
    • Can LLMs Interpret and Leverage Structured Linguistic Representations? A Case Study with AMRs (Oral)
    • LLM Dependency Parsing with In-Context Rules (Oral)
    • From Syntax to Semantics: Evaluating the Impact of Linguistic Structures on LLM-Based Information Extraction (Oral)
    • Cross-Document Event-Keyed Summarization (Oral)
    • Fine-Tuning Large Language Models for Relation Extraction within a Retrieval-Augmented Generation Framework
    • Benchmarking Table Extraction: Multimodal LLMs vs Traditional OCR
    • Injecting Structured Knowledge into LLMs via Graph Neural Networks
    • Regular-pattern-sensitive CRFs for Distant Label Interactions
    • Exploring Multilingual Probing in Large Language Models: A Cross-Language Analysis
    • Self-Contrastive Loop of Thought Method for Text-to-SQL Based on Large Language Model
    • Combining Automated and Manual Data for Effective Downstream Fine-Tuning of Transformers for Low-Resource Language Applications
    • Seamlessly Integrating Tree-Based Positional Embeddings into Transformer Models for Source Code Representation
    • Enhancing AMR Parsing with Group Relative Policy Optimization
    • Structure Modeling Approach for UD Parsing of Historical Modern Japanese
    • Typed-RAG: Type-Aware Decomposition of Non-Factoid Questions for Retrieval-Augmented Generation
    • Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction
    • Leveraging LLM-based sentiment analysis for portfolio optimization with proximal policy optimization
    • Cognitive Mirroring for DocRE: A Self-Supervised Iterative Reflection Framework with Triplet-Centric Explicit and Implicit Feedback
    • Transfer of Structural Knowledge from Synthetic Languages
    • Language Models are Universal Embedders
  • Shared task papers:
    • DiaDP@XLLM25: Advancing Chinese Dialogue Parsing via Unified Pretrained Language Models and Biaffine Dependency Scoring (Oral)
    • SpeechEE@XLLM25: End-to-End Structured Event Extraction from Speech (Oral)
    • SpeechEE@XLLM25: Retrieval-Enhanced Few-Shot Prompting for Speech Event Extraction
    • DocIE@XLLM25: UIEPrompter: A Unified Training-Free Framework for Universal Document-Level Information Extraction via Structured Prompt (Oral)
    • DocIE@XLLM25: In-Context Learning for Information Extraction using Fully Synthetic Demonstrations
    • DocIE@XLLM25: ZeroSemble - Robust and Efficient Zero-Shot Document Information Extraction with Heterogeneous Large Language Model Ensembles
    • LLMSR@XLLM25: A Language Model-Based Pipeline for Structured Reasoning Data Construction (Oral)
    • LLMSR@XLLM25: SWRV: Empowering Self-Verification of Small Language Models through Step-wise Reasoning and Verification
    • LLMSR@XLLM25: Integrating Reasoning Prompt Strategies with Structural Prompt Formats for Enhanced Logical Inference
    • LLMSR@XLLM25: Less is More: Enhancing Structured Multi-Agent Reasoning via Quality-Guided Distillation
    • LLMSR@XLLM25: An Empirical Study of LLM for Structural Reasoning
  • Non-archival extended abstract papers:
    • Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale
    • Structured Discourse Representation for Factual Consistency Verification
    • Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation
    • Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models
    • A Systematic Study of Compositional Syntactic Transformer Language Models
    • PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection
    • LinguaLens: Towards Interpreting Linguistic Structure of Large Language Models via Sparse Auto-Encoder
    • Focus on the Emotional Arc! Fine-tune the Model for Ultra-Long Novel Outline Reconstruction
    • Scaling Laws and Structure Acquisition in Neural Language Models: A Theory Based on Hierarchical Grammars
    • Compositional Generalization and Creativity in Language Diffusion Models
    • Leveraging Large Language Models for Structured Sentiment Analysis in Low-resource Domains
    • Leveraging LLM-based sentiment analysis for portfolio optimization with proximal policy optimization

Invited Keynote Speakers

Mark Johnson

Professor at Macquarie University

Bio: Mark Johnson is a Professor of Language Science (CORE) in the School of Computing at Macquarie University. He is also the Chief AI Scientist, Oracle Digital Assistant at Oracle Corporation, where he develops chatbots and digital assistants. The Oracle Digital Assistant division develops novel deep learning models to power the next generation of Conversational AI using semantic parsing. Mark Johnson has worked on a wide range of topics in computational linguistics, but his main area of research is natural language understanding, especially syntactic parsing and semantic analysis, and their applications to text and speech processing.

Title: The Changing Roles of (Linguistic) Structure in Computational Linguistics

Abstract: This talk describes the various roles that linguistic theory and structure have played in computational linguistics, and speculates about the role that they may play in the future. The closest relationship between linguistics and computational linguistics was probably with the Unification Grammars introduced in the 1980s, where the goal was to develop a computational model that implemented the linguistic theory. This close relationship proved impractical for scientific and sociological reasons that I’ll describe, and since then the relationship has steadily weakened. I argue that the huge training data and long context windows of Deep Learning models make it unnecessary to incorporate any specific linguistically-inspired parsing architecture into such models. While there are deep scientific questions about how LLMs “understand” human languages, their linguistic ability is sufficiently good for most practical tasks. Quite reasonably, most current research focuses on the information content of the language LLMs generate, such as reducing hallucinations and improving instruction-following. Thus it seems the main opportunities for linguistics to contribute to modern computational linguistics are in model evaluation and explainability.

Jan Hajič

Professor at Charles University

Bio: Jan Hajič is a professor of Computational Linguistics at the Institute of Formal and Applied Linguistics, School of Computer Science, Charles University, Prague, Czechia. His interests span fundamental formal linguistic problems, machine translation, deep language understanding, and applications. He has built resources for many languages with rich linguistic annotation, such as the Prague Dependency Treebank; he is currently leading a multi-institutional research infrastructure on language resources in Czechia, LINDAT/CLARIAH-CZ, and coordinating a Horizon Europe pilot project on building LLMs, HPLT. His work experience includes both industrial research (IBM Research Yorktown Heights, NY, USA) and academia (Charles University, Prague, Czechia, Johns Hopkins University and University of Colorado, USA, Fellow of the Centre for Advanced Studies at the Norway Academy of Sciences, and others). He has published more than 200 papers. He is a chair or member of many international and national boards and committees, such as the Steering Committee of the TACL journal.

Title: [TBD]

Abstract: [TBD]

Heng Ji

Professor at University of Illinois Urbana-Champaign

Bio: Heng Ji is a Professor of Computer Science at the Siebel School of Computing and Data Science, and a faculty member affiliated with the Electrical and Computer Engineering Department, Coordinated Science Laboratory, and Carl R. Woese Institute for Genomic Biology of the University of Illinois Urbana-Champaign. She is an Amazon Scholar. She is the Founding Director of the Amazon-Illinois Center on AI for Interactive Conversational Experiences (AICE), and the Founding Director of the CapitalOne-Illinois Center on AI Safety and Knowledge Systems (ASKS). She received her Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially Multimedia Multilingual Information Extraction, Knowledge-enhanced Large Language Models and Vision-Language Models, and AI for Science. The awards she has received include an Outstanding Paper Award at ACL 2024, two Outstanding Paper Awards at NAACL 2024, "Young Scientist" by the World Laureates Association in 2023 and 2024, "Young Scientist" and membership of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017, "Women Leaders of Conversational AI" (Class of 2023) by Project Voice, the "AI's 10 to Watch" Award by IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, the PACLIC 2012 Best Paper runner-up, "Best of ICDM 2013" and "Best of SDM 2013" paper awards, an ACL 2018 Best Demo Paper nomination, the ACL 2020 Best Demo Paper Award, the NAACL 2021 Best Demo Paper Award, Google Research Awards in 2009 and 2014, IBM Watson Faculty Awards in 2012 and 2014, and Bosch Research Awards in 2014-2018. She served as an associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing, and as Program Committee Co-Chair of many conferences, including NAACL-HLT 2018 and AACL-IJCNLP 2022. She was elected secretary of the North American Chapter of the Association for Computational Linguistics (NAACL) for 2020-2023.

Title: Structure is Key to Chemical Language Modeling

Abstract: Everything in our wonderful world is composed of molecules. Recent advances in block chemistry involve the manual design of drugs and materials by decomposing molecules into graph substructures—i.e., functional modules—and reassembling them into new molecules with desired functions. However, the process of discovering and manufacturing functional molecules has remained highly artisanal, slow, and expensive. In this talk I will present our recent efforts at teaching computers to speak two complementary languages: one that represents molecular subgraph structures indicative of specific functions, and another that describes these functions in natural language. Unlike existing approaches that add such knowledge as a post hoc step, we developed a function- and synthesis-aware modular chemical language model (mCLM). Inspired by bilingual speakers who frequently “code-switch” (naturally and often switch between their two languages within the same message), we propose a novel neural encoder that integrates molecular structure and natural language. mCLM incorporates both function- and synthesis-related knowledge into the small molecule tokenization process a priori. In experiments on 430 FDA-approved drugs, we find mCLM capable of significantly improving 5 out of 6 chemical functions critical to determining drug potentials. More importantly, mCLM can reason on multiple functions and improve the FDA-rejected drugs (“fallen angels”) over multiple iterations to greatly improve their shortcomings.

Nianwen Xue

Professor at Brandeis University

Bio: Nianwen Xue is a Professor of Linguistics and Computer Science at Brandeis University, specializing in computational linguistics and natural language processing. His research focuses on machine learning methods for syntactic, semantic, and discourse parsing, as well as the creation of large-scale linguistically annotated resources. He has been a principal developer of widely-used linguistic resources, including the Chinese Treebank and the Chinese Proposition Bank, and currently leads the Uniform Meaning Representation (UMR) project, a major initiative dedicated to developing a standardized, cross-linguistic framework for semantic representation. Recently, his research interests have expanded into computational social science, particularly exploring automatic verification of LLM-generated content and the computational analysis of media framing. Xue has served in various editorial and organizational capacities, including as Editor-in-Chief of ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) from 2016 to 2019, and as program co-chair for LREC-COLING 2024, a leading international conference in computational linguistics. He currently serves on the editorial boards of Computational Linguistics and the Journal of Language Resources and Evaluation. His research has received support from organizations including the National Science Foundation, DARPA, IARPA, and Amazon Machine Learning Research.

Title: Beyond Sentence-Level Semantics: Parsing and Evaluating Document-Level Meaning

Abstract: In the era of Large Language Models (LLMs), the role of linguistic structures in Natural Language Processing (NLP) has fundamentally shifted. Traditionally, linguistic representations such as syntactic trees and semantic graphs served primarily as intermediate forms supporting downstream applications like machine translation and question-answering systems. However, with the advent of LLMs capable of performing many NLP tasks end-to-end, this narrative needs reevaluation. For numerous applications, explicit linguistic structures have become unnecessary, while for others, their use has transitioned from intermediate representations toward end-products. In this talk, I will introduce Uniform Meaning Representation (UMR), a comprehensive, document-level semantic representation designed to function effectively as a knowledge graph. UMR integrates sentence-level semantic analyses focusing on named entities and predicate-argument structures with document-level analyses that address temporal relations between events and time expressions, modal dependencies involving events and their cognizers, and coreference relations linking entities and events across text. Additionally, I will present preliminary results for parsing UMR structures from English and Chinese texts, along with novel metrics specifically designed for evaluating document-level semantic representations like UMR.

Program Schedule

- TBD -

Shared Tasks

In addition to paper contributions, we are organizing open challenges on structure-related NLP tasks. Through these shared tasks, we aim to provide a centralized platform for further exploring and advancing traditional or newly-emerged structure-aware tasks.

We have set up three shared tasks as follows. Participants can access the respective task pages to learn about the specific participation requirements. System submissions will be evaluated using automatic metrics, with a focus on the accuracy and relevance of the results. Participants can submit at Codabench.

Teams that achieve top rankings in some of the shared tasks will receive cash prizes. Winning participants are required to write a technical paper fully describing their techniques and experimental results, and to prepare a poster or oral presentation showcasing their methods and approaches on-site.

Task-I: Dialogue-Level Dependency Parsing (DiaDP)

DiaDP aims to build a unified word-wise dependency tree for dialogue contexts. The tree integrates both inner-EDU dependencies (within Elementary Discourse Units, EDUs) and inter-EDU dependencies (across EDUs) to represent the syntactic and discourse relationships between words in dialogues. Given a dialogue consisting of multiple utterances segmented into EDUs, where each utterance is treated as a sentence-like unit, DiaDP outputs a structured dependency tree that includes: 1) inner-EDU dependencies: syntactic relationships within individual EDUs; and 2) inter-EDU dependencies: discourse relationships connecting different EDUs, including cross-utterance links. The task is offered in both zero-shot and few-shot learning settings.
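
To make the target structure concrete, here is a minimal, illustrative Python sketch of how a dialogue-level dependency tree mixing inner-EDU and inter-EDU arcs could be represented. The class, field names, word indices, and relation labels below are hypothetical assumptions for illustration only; the official DiaDP data format is specified on the challenge website.

```python
# Hypothetical sketch of a dialogue-level dependency structure (not the
# official DiaDP format; see the challenge website for the real data schema).
from dataclasses import dataclass
from typing import List

@dataclass
class Arc:
    head: int        # 1-based index of the head word (0 = virtual root)
    dep: int         # 1-based index of the dependent word
    label: str       # syntactic or discourse relation label (illustrative)
    inter_edu: bool  # True for inter-EDU (discourse) arcs, False for inner-EDU

# Two utterances, each treated as a sentence-like unit and segmented into EDUs:
# EDU 1: "I booked a table"   EDU 2: "Great,"   EDU 3: "what time?"
words: List[str] = ["I", "booked", "a", "table", "Great", ",", "what", "time", "?"]
arcs: List[Arc] = [
    Arc(0, 2, "root", False),        # inner-EDU: "booked" heads EDU 1
    Arc(2, 1, "nsubj", False),       # inner-EDU: "I" depends on "booked"
    Arc(2, 4, "obj", False),         # inner-EDU: "table" depends on "booked"
    Arc(4, 3, "det", False),         # inner-EDU: "a" depends on "table"
    Arc(2, 5, "evaluation", True),   # inter-EDU: reply EDU attaches across utterances
    Arc(5, 6, "punct", False),       # inner-EDU: comma attaches to "Great"
    Arc(5, 8, "elaboration", True),  # inter-EDU: question EDU attaches to "Great"
    Arc(8, 7, "det", False),         # inner-EDU: "what" depends on "time"
    Arc(8, 9, "punct", False),       # inner-EDU: "?" attaches to "time"
]
```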

The task bridges the gap between sentence-level dependency parsing and discourse-level parsing by extending syntactic tree structures to dialogue scenarios and incorporating both rhetorical and syntactic elements into the tree. The top-3 teams will receive a certificate for their performance and will be invited to write technical papers to be included in the workshop proceedings. For more details on how to participate, visit the DiaDP challenge website.

Task-II: Speech Event Extraction (SpeechEE)

SpeechEE aims to detect event predicates and arguments directly from audio speech, enabling information acquisition from spoken content such as meetings, interviews, and press releases. SpeechEE is defined as follows: given a speech audio input consisting of a sequence of acoustic frames, the goal is to extract structured event records comprising four elements: 1) the event type, 2) the event trigger, 3) the event argument roles, and 4) the corresponding event arguments.
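
As a rough illustration of the target output (with hypothetical field names, not the official SpeechEE submission format, which is defined on the task page), a single extracted event record could look like this:

```python
# Hypothetical sketch of one structured event record extracted from speech
# (illustrative field names; see the SpeechEE task page for the official format).
from typing import List, TypedDict

class Argument(TypedDict):
    role: str   # an event argument role, e.g. "Participant"
    text: str   # the argument mention recovered from the audio

class EventRecord(TypedDict):
    event_type: str            # 1) the event type
    trigger: str               # 2) the event trigger
    arguments: List[Argument]  # 3) argument roles and 4) the corresponding arguments

example: EventRecord = {
    "event_type": "Meet",
    "trigger": "meeting",
    "arguments": [
        {"role": "Participant", "text": "the two ministers"},
        {"role": "Place", "text": "Vienna"},
    ],
}
```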

This task bridges the gap between traditional textual event extraction and real-world speech scenarios, providing a foundation for structured knowledge extraction from audio data. The top-3 teams will receive a certificate for their performance and will be invited to write technical papers to be included in the workshop proceedings. For more details on how to participate, visit the SpeechEE challenge website.

Task-III: LLM for Structural Reasoning (LLM-SR)

LLM-SR seeks to generate a controllable and interpretable reasoning process by leveraging structural reasoning. It requires the structural parsing of two distinct components, major premises and minor premises, followed by identifying fine-grained “alignments” between these two structures and ultimately deriving a conclusion.
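
For intuition only, a structured reasoning instance of this kind might be sketched as below; the structure, example content, and field names are assumptions made for illustration and are not the task's official annotation format (see the LLM-SR challenge website for that).

```python
# Purely illustrative sketch of a structural-reasoning record: parsed major and
# minor premises, fine-grained alignments between them, and a derived conclusion.
# Field names and content are hypothetical; the official format is on the task page.
reasoning_record = {
    "major_premise": {   # parsed general rule
        "condition": "a contract is signed under duress",
        "consequence": "the contract is voidable",
    },
    "minor_premise": {   # parsed case-specific facts
        "facts": ["the agreement was signed under threat of violence"],
    },
    "alignments": [      # fine-grained links between the two structures
        {"major": "signed under duress",
         "minor": "signed under threat of violence"},
    ],
    "conclusion": "the agreement is voidable",
}
```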

This task can be regarded as a constrained Chain-of-Thought (CoT) reasoning process, where reasoning is conducted step by step with reference to facts and relevant rules, thereby improving the transparency and reliability of the process. Cash prizes will be awarded to the top three teams. For more details on how to participate, visit the LLM-SR challenge website.

Task-IV: Document-level Information Extraction (DocIE)

DocIE focuses on extracting information from long documents rather than isolated sentences, necessitating the integration of information both within and across multiple sentences while capturing complex interactions. Given a document and a predefined schema, DocIE requires the extraction of each instance (which may be null) corresponding to the schema's elements. This process involves identifying: (1) types of entities, (2) coreference relationships among mentions, (3) types of relations, and (4) the head and tail entities of each identified relation.
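
The following Python sketch illustrates, under assumed field names (not the official DocIE schema or submission format, which is given on the challenge website), what a predefined schema and the corresponding document-level extraction might look like:

```python
# Illustrative DocIE-style schema and output extraction (hypothetical field
# names and example text; see the challenge website for the official format).
schema = {
    "entity_types": ["Person", "Organization"],
    "relation_types": ["employed_by"],
}

document = ("Marie Curie worked at the University of Paris. "
            "Curie later led its radium institute.")

extraction = {
    "entities": [  # (1) entity types and (2) coreference clusters of mentions
        {"id": "e1", "type": "Person",
         "mentions": ["Marie Curie", "Curie"]},
        {"id": "e2", "type": "Organization",
         "mentions": ["University of Paris"]},
    ],
    "relations": [  # (3) relation types and (4) head/tail entities of each relation
        {"type": "employed_by", "head": "e1", "tail": "e2"},
    ],
}
```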

This task evaluates the ability of large language models (LLMs) to extract information from long-context documents and comprehend abstract concepts, thereby advancing their application in mining critical, domain-specific information across various fields. Cash prizes will be awarded to the top three teams. For more details on how to participate, visit the DocIE challenge website.

Organization Team

Hao Fei

National University of Singapore

Kewei Tu

ShanghaiTech University

Yuhui Zhang

Stanford University

Xiang Hu

Ant Research

Wenjuan Han

Beijing Jiaotong University

Zixia Jia

BigAI

Zilong Zheng

BigAI

Yixin Cao

Fudan University

Meishan Zhang

Harbin Institute of Technology (Shenzhen)

Wei Lu

Singapore University of Technology and Design

N. Siddharth

University of Edinburgh

Lilja Øvrelid

University of Oslo

Nianwen Xue

Brandeis University

Yue Zhang

Westlake University

Program Committee

Contact

Join and post at our Google Group!
Email the organizers at xllm2025@googlegroups.com.