Welcome to my homepage! I am Kai Sun (孙锴). I am currently a research scientist at Meta Reality Labs. I obtained my Ph.D. in Computer Science from Cornell University in 2021. My advisor was Claire Cardie. Before that, I did my undergraduate studies in Computer Science, in the ACM Honors Class 2011 at Shanghai Jiao Tong University.
My research interests lie broadly in artificial intelligence (AI), especially natural language processing (NLP) and AI for board games.
(*: equal contribution)
Xiao Yang*, Kai Sun*, Hao Xin*, Yushi Sun*, Nikita Bhalla, Xiangsen Chen, Sajal Choudhary, Rongze Daniel Gui, Ziran Will Jiang, Ziyu Jiang, Lingkun Kong, Brian Moran, Jiaqi Wang, Yifan Ethan Xu, An Yan, Chenyu Yang, Eting Yuan, Hanwen Zha, Nan Tang, Lei Chen, Nicolas Scheffer, Yue Liu, Nirav Shah, Rakesh Wanga, Anuj Kumar, Wen-tau Yih, and Xin Luna Dong. CRAG -- Comprehensive RAG Benchmark. 2024. arXiv
Yushi Sun, Hao Xin, Kai Sun, Yifan Ethan Xu, Xiao Yang, Xin Luna Dong, Nan Tang, and Lei Chen. Are Large Language Models a Good Replacement of Taxonomies? The 50th International Conference on Very Large Databases (VLDB). 2024. arXiv, data & code
Kai Sun, Yifan Ethan Xu, Hanwen Zha, Yue Liu, and Xin Luna Dong. Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs? Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). 2024. arXiv, code
Kai Sun. Digital Asset Valuation: A Study on Domain Names, Email Addresses, and NFTs. 2022. arXiv, data & code
Kai Sun*, Dian Yu*, Jianshu Chen, Dong Yu, and Claire Cardie. Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge. Annual Meeting of the Association for Computational Linguistics (ACL). 2022. arXiv, code
Kai Sun. Machine Reading Comprehension: Challenges and Approaches. Ph.D. Thesis. 2021. pdf
Dian Yu, Kai Sun, Dong Yu, and Claire Cardie. Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question-Answering Data. Findings of the Association for Computational Linguistics: EMNLP 2021 (EMNLP Findings). 2021. arXiv, data
Kai Sun, Seungwhan Moon, Paul Crook, Stephen Roller, Becka Silvert, Bing Liu, Zhiguang Wang, Honglei Liu, Eunjoon Cho, and Claire Cardie. Adding Chit-Chat to Enhance Task-Oriented Dialogues. Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). 2021. arXiv, data & code
Liang Xu, Hai Hu, Xuanwei Zhang, Lu Li, Chenjie Cao, Yudong Li, Yechen Xu, Kai Sun, Dian Yu, Cong Yu, Yin Tian, Qianqian Dong, Weitang Liu, Bo Shi, Yiming Cui, Junyi Li, Jun Zeng, Rongzhao Wang, Weijian Xie, Yanting Li, Yina Patterson, Zuoyu Tian, Yiwen Zhang, He Zhou, Shaoweihua Liu, Zhe Zhao, Qipeng Zhao, Cong Yue, Xinrui Zhang, Zhengliang Yang, Kyle Richardson, and Zhenzhong Lan. CLUE: A Chinese Language Understanding Evaluation Benchmark. The 28th International Conference on Computational Linguistics (COLING). 2020. arXiv
Dian Yu*, Kai Sun*, Claire Cardie, and Dong Yu. Dialogue-Based Relation Extraction. Annual Meeting of the Association for Computational Linguistics (ACL). 2020. arXiv, data & code, project page
Kai Sun, Dian Yu, Dong Yu, and Claire Cardie. Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension. Transactions of the Association for Computational Linguistics (TACL). 2020. arXiv, data & code, project page
Xiaoman Pan*, Kai Sun*, Dian Yu, Jianshu Chen, Heng Ji, Claire Cardie, and Dong Yu. Improving Question Answering with External Knowledge. EMNLP Workshop on Machine Reading for Question Answering (MRQA). 2019. arXiv, resource
Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, Dong Yu, David McAllester, and Dan Roth. Evidence Sentence Extraction for Machine Reading Comprehension. The SIGNLL Conference on Computational Natural Language Learning (CoNLL). 2019. arXiv, resource
Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, and Dong Yu. Improving Pre-Trained Multilingual Model with Vocabulary Expansion. The SIGNLL Conference on Computational Natural Language Learning (CoNLL). 2019. arXiv
Kai Sun, Dian Yu, Dong Yu, and Claire Cardie. Improving Machine Reading Comprehension with General Reading Strategies. Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). 2019. arXiv, code
Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, and Claire Cardie. DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension. Transactions of the Association for Computational Linguistics (TACL). 2019. arXiv, data & code, leaderboard
Kai Sun and Claire Cardie. Cornell Belief and Sentiment System at TAC 2017. Text Analysis Conference (TAC). 2017.
Mohamed Al-Badrashiny, Jason Bolton, Arun Tejasvi Chaganty, Kevin Clark, Craig Harman, Lifu Huang, Matthew Lamm, Jinhao Lei, Di Lu, Xiaoman Pan, Ashwin Paranjape, Ellie Pavlick, Haoruo Peng, Peng Qi, Pushpendre Rastogi, Abigail See, Kai Sun, Max Thomas, Chen-Tse Tsai, Hao Wu, Boliang Zhang, Chris Callison-Burch, Claire Cardie, Heng Ji, Christopher Manning, Smaranda Muresan, Owen C. Rambow, Dan Roth, Mark Sammons, and Benjamin Van Durme. TinkerBell: Cross-lingual Cold-Start Knowledge Base Construction. Text Analysis Conference (TAC). 2017.
Vlad Niculae, Kai Sun, Xilun Chen, Yao Cheng, Xinya Du, Esin Durmus, Arzoo Katiyar, and Claire Cardie. Cornell Belief and Sentiment System at TAC 2016. Text Analysis Conference (TAC). 2016.
Kai Sun, Su Zhu, Lu Chen, Siqiu Yao, Xueyang Wu, and Kai Yu. Hybrid Dialogue State Tracking for Real World Human-to-Human Dialogues. Conference of the International Speech Communication Association (Interspeech). 2016. pdf, bib
Kai Sun, Qizhe Xie, and Kai Yu. Recurrent Polynomial Network for Dialogue State Tracking. Dialogue and Discourse (D&D). 2016. pdf, bib
Kai Yu, Lu Chen, Kai Sun, Su Zhu, and Qizhe Xie. Evolvable Dialogue State Tracking for Statistical Dialogue Management. Frontiers of Computer Science. 2015. pdf, bib
Kai Yu, Kai Sun, Lu Chen, and Su Zhu. Constrained Markov Bayesian Polynomial for Efficient Dialogue State Tracking. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP). 2015. pdf, bib
Qizhe Xie, Kai Sun, Su Zhu, Lu Chen, and Kai Yu. Recurrent Polynomial Network for Dialogue State Tracking with Mismatched Semantic Parsers. 16th Annual SIGdial Meeting on Discourse and Dialogue (SIGdial). 2015. pdf, bib
Kai Yu, Lu Chen, Bo Chen, Kai Sun, and Su Zhu. Cognitive Technology in Task-Oriented Dialogue Systems -- Concepts, Advances and Future. Chinese Journal of Computers. 2014. pdf, bib
Su Zhu, Lu Chen, Kai Sun, Da Zheng, and Kai Yu. Semantic Parser Enhancement for Dialogue Domain Extension with Little Data. IEEE Spoken Language Technology Workshop (SLT). 2014. pdf, bib
Kai Sun, Lu Chen, Su Zhu, and Kai Yu. A Generalized Rule Based Tracker for Dialogue State Tracking. IEEE Spoken Language Technology Workshop (SLT). 2014. pdf, bib
Kai Sun, Lu Chen, Su Zhu, and Kai Yu. The SJTU System for Dialog State Tracking Challenge 2. 15th Annual SIGdial Meeting on Discourse and Dialogue (SIGdial). 2014. pdf, bib
I designed Yixin, an AI program playing Gomoku and Renju. It was the world champion AI, the winner of the 13th, 14th, 15th, 16th, 17th, 18th, and 19th Gomocup.
Yixin was the first Gomoku and Renju AI able to compete at the human champion level. It beat Taiwan's Meijin title holder Lin Shu-Hsuan and the world Gomoku champion Rudolf Dupszki in 2017, and drew with world Renju champion Qi Guan in 2018.
I have been managing Gomocup (with Tianyi Hao) since the 17th Gomocup in 2016.
Email: ks985 [at] cornell [dot] edu
Last updated: June 2024