SIGDAT is the Association for Computational Linguistics (ACL) special interest group for linguistic data and corpus-based approaches to natural language processing. SIGDAT organizes the EMNLP conference.

SIGDAT was founded in 1993 and is one of ACL’s oldest SIGs. Since its inception, SIGDAT’s primary mission has been to organize a series of conferences and workshops, including EMNLP (Conference on Empirical Methods in Natural Language Processing) and WVLC (Workshop on Very Large Corpora). These meetings have grown considerably: EMNLP is now a 3-day main conference with 2 days of workshops and tutorials, and in 2018 it drew about 2500 attendees and 2137 paper submissions.

SIGDAT focuses broadly on corpus-based and statistical methods in natural language processing, and encourages initiatives from its members in support of this broader mission.

The sigdat.org website is maintained by the current SIGDAT Secretary. It was originally created in 2018 by Nitin Madnani, then ACL Information Director, and was updated in 2019 by Yuchen Zhang of Jian Su’s Natural Language Processing Group at A-STAR I2R.