Welcome to my homepage! I am Kai Sun (孙锴). I obtained my Ph.D. in Computer Science from Cornell University in 2021. My advisor was Claire Cardie. Before that, I did my undergraduate studies in Computer Science, in the ACM Honored Class 2011 at Shanghai Jiao Tong University.
I am interested in areas of artificial intelligence (AI), including natural language processing, speech, AI in games, and machine learning.
Natural Language Processing
My dissertation research focused on improving the state of the art and exploring new tasks for machine reading comprehension and dialogue understanding.
I was the lead developer of the Cornell Chinese Belief and Sentiment Detection Systems, which ranked 1st in the BeSt evaluation at TAC 2016 and TAC 2017 and were used in the TAC 2017 Cold Start KB task as part of the TinkerBell team.
AI in Board Games
Yixin was the first Gomoku and Renju AI able to compete at the human champion level. It beat Taiwan's Meijin title holder Lin Shu-Hsuan and the world Gomoku champion Rudolf Dupszki in 2017, and drew with the world Renju champion Qi Guan in 2018.
(*: equal contribution)
Dian Yu, Kai Sun, Dong Yu, and Claire Cardie. Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question-Answering Data. 2021. arXiv
Kai Sun*, Dian Yu*, Jianshu Chen, Dong Yu, and Claire Cardie. Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge. 2020. arXiv
Kai Sun, Seungwhan Moon, Paul Crook, Stephen Roller, Becka Silvert, Bing Liu, Zhiguang Wang, Honglei Liu, Eunjoon Cho, and Claire Cardie. Adding Chit-Chat to Enhance Task-Oriented Dialogues. Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). 2021. arXiv, data & code
Liang Xu, Hai Hu, Xuanwei Zhang, Lu Li, Chenjie Cao, Yudong Li, Yechen Xu, Kai Sun, Dian Yu, Cong Yu, Yin Tian, Qianqian Dong, Weitang Liu, Bo Shi, Yiming Cui, Junyi Li, Jun Zeng, Rongzhao Wang, Weijian Xie, Yanting Li, Yina Patterson, Zuoyu Tian, Yiwen Zhang, He Zhou, Shaoweihua Liu, Zhe Zhao, Qipeng Zhao, Cong Yue, Xinrui Zhang, Zhengliang Yang, Kyle Richardson, and Zhenzhong Lan. CLUE: A Chinese Language Understanding Evaluation Benchmark. The 28th International Conference on Computational Linguistics (COLING). 2020. arXiv
Kai Sun, Dian Yu, Dong Yu, and Claire Cardie. Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension. Transactions of the Association for Computational Linguistics (TACL). 2020. arXiv, data & code, project page
Xiaoman Pan*, Kai Sun*, Dian Yu, Jianshu Chen, Heng Ji, Claire Cardie, and Dong Yu. Improving Question Answering with External Knowledge. EMNLP Workshop on Machine Reading for Question Answering (MRQA). 2019. arXiv, resource
Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, Dong Yu, David McAllester, and Dan Roth. Evidence Sentence Extraction for Machine Reading Comprehension. The SIGNLL Conference on Computational Natural Language Learning (CoNLL). 2019. arXiv, resource
Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, and Dong Yu. Improving Pre-Trained Multilingual Model with Vocabulary Expansion. The SIGNLL Conference on Computational Natural Language Learning (CoNLL). 2019. arXiv
Kai Sun, Dian Yu, Dong Yu, and Claire Cardie. Improving Machine Reading Comprehension with General Reading Strategies. Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). 2019. arXiv, code
Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, and Claire Cardie. DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension. Transactions of the Association for Computational Linguistics (TACL). 2019. arXiv, data & code, leaderboard
Kai Sun, Su Zhu, Lu Chen, Siqiu Yao, Xueyang Wu, and Kai Yu. Hybrid Dialogue State Tracking for Real World Human-to-Human Dialogues. Conference of the International Speech Communication Association (Interspeech). 2016. pdf, bib
Kai Yu, Kai Sun, Lu Chen, and Su Zhu. Constrained Markov Bayesian Polynomial for Efficient Dialogue State Tracking. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP). 2015. pdf, bib
Qizhe Xie, Kai Sun, Su Zhu, Lu Chen, and Kai Yu. Recurrent Polynomial Network for Dialogue State Tracking with Mismatched Semantic Parsers. 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGdial). 2015. pdf, bib
Kai Sun and Claire Cardie. Cornell Belief and Sentiment System at TAC 2017. Text Analysis Conference (TAC). 2017.
Mohamed Al-Badrashiny, Jason Bolton, Arun Tejasvi Chaganty, Kevin Clark, Craig Harman, Lifu Huang, Matthew Lamm, Jinhao Lei, Di Lu, Xiaoman Pan, Ashwin Paranjape, Ellie Pavlick, Haoruo Peng, Peng Qi, Pushpendre Rastogi, Abigail See, Kai Sun, Max Thomas, Chen-Tse Tsai, Hao Wu, Boliang Zhang, Chris Callison-Burch, Claire Cardie, Heng Ji, Christopher Manning, Smaranda Muresan, Owen C. Rambow, Dan Roth, Mark Sammons, and Benjamin Van Durme. TinkerBell: Cross-lingual Cold-Start Knowledge Base Construction. Text Analysis Conference (TAC). 2017.
Vlad Niculae, Kai Sun, Xilun Chen, Yao Cheng, Xinya Du, Esin Durmus, Arzoo Katiyar, and Claire Cardie. Cornell Belief and Sentiment System at TAC 2016. Text Analysis Conference (TAC). 2016.
Email: ks985 [at] cornell [dot] edu