Publications

Trainable Transformer in Transformer
Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia, Sanjeev Arora
arXiv 2023
[paper]
InstructEval: Systematic Evaluation of Instruction Selection Methods
Anirudh Ajith, Chris Pan, Mengzhou Xia, Ameet Deshpande, Karthik Narasimhan
arXiv 2023
[paper] [code]
InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback
John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao
arXiv 2023
[paper] [code] [site]
Enabling Large Language Models to Generate Text with Citations
Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen
arXiv 2023
[paper] [code]
MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen
arXiv 2023
[paper] [code]
C-STS: Conditional Semantic Textual Similarity
Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak Murahari, Victoria Graf, Tanmay Rajpurohit, Ashwin Kalyan, Danqi Chen, Karthik Narasimhan
arXiv 2023
[paper] [code]
Adapting Language Models to Compress Contexts
Alexis Chevalier, Alexander Wettig, Anirudh Ajith, Danqi Chen
arXiv 2023
[paper] [code]
Referral Augmentation for Zero-Shot Information Retrieval
Michael Tang, Shunyu Yao, John Yang, Karthik Narasimhan
arXiv 2023
[paper] [code]
Toxicity in ChatGPT: Analyzing Persona-assigned Language Models
Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
arXiv 2023
[paper] [blog]
Learning Transformer Programs
Dan Friedman, Alexander Wettig, Danqi Chen
arXiv 2023
[paper] [code]
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
arXiv 2023
[paper] [code]
Fine-Tuning Language Models with Just Forward Passes
Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D. Lee, Danqi Chen, Sanjeev Arora
arXiv 2023
[paper] [code]
Privacy Implications of Retrieval-Based Language Models
Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, Danqi Chen
arXiv 2023
[paper]
MUX-PLMs: Data Multiplexing for High-throughput Language Models
Vishvak Murahari, Ameet Deshpande, Carlos E. Jimenez, Izhak Shafran, Mingqiu Wang, Yuan Cao, Karthik Narasimhan
arXiv 2023
[paper] [code]
Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations
Chenglei Si, Dan Friedman, Nitish Joshi, Shi Feng, Danqi Chen, He He
Association for Computational Linguistics (ACL) 2023
[paper]
Training Trajectories of Language Models Across Scales
Mengzhou Xia, Mikel Artetxe, Chunting Zhou, Xi Victoria Lin, Ramakanth Pasunuru, Danqi Chen, Luke Zettlemoyer, Ves Stoyanov
Association for Computational Linguistics (ACL) 2023
[paper]
What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning
Jane Pan, Tianyu Gao, Howard Chen, Danqi Chen
Findings of Association for Computational Linguistics (ACL) 2023
[paper] [code]
Optimizing Test-Time Query Representations for Dense Retrieval
Mujeen Sung, Jungsoo Park, Jaewoo Kang, Danqi Chen, Jinhyuk Lee
Findings of Association for Computational Linguistics (ACL) 2023
[paper]
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
International Conference on Learning Representations (ICLR) 2023
[paper] [code] [site] [blog]
Task-Specific Skill Localization in Fine-tuned Language Models
Abhishek Panigrahi, Nikunj Saunshi, Haoyu Zhao, Sanjeev Arora
International Conference on Machine Learning (ICML) 2023
[paper]
SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification
Pranjal Aggarwal, Ameet Deshpande, Karthik Narasimhan
International Conference on Machine Learning (ICML) 2023
[paper] [code]
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora
International Conference on Machine Learning (ICML) 2023
[paper] [code]
Controllable Text Generation with Language Constraints
Howard Chen, Huihan Li, Danqi Chen, Karthik Narasimhan
arXiv 2022
[paper] [code]
Training Language Models with Memory Augmentation
Zexuan Zhong, Tao Lei, Danqi Chen
Empirical Methods in Natural Language Processing (EMNLP) 2022
[paper] [code]
Finding Dataset Shortcuts with Grammar Induction
Dan Friedman, Alexander Wettig, Danqi Chen
Empirical Methods in Natural Language Processing (EMNLP) 2022
[paper] [code]
Generating Natural Language Proofs with Verifier-Guided Search
Kaiyu Yang, Jia Deng, Danqi Chen
Empirical Methods in Natural Language Processing (EMNLP) 2022
[paper] [code]
MABEL: Attenuating Gender Bias using Textual Entailment Data
Jacqueline He, Mengzhou Xia, Christiane Fellbaum, Danqi Chen
Empirical Methods in Natural Language Processing (EMNLP) 2022
[paper] [code]
Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models
Mengzhou Xia, Mikel Artetxe, Jingfei Du, Danqi Chen, Ves Stoyanov
Empirical Methods in Natural Language Processing (EMNLP) 2022
[paper] [code]
Don’t Prompt, Search! Mining-based Zero-Shot Learning with Language Models
Mozes van de Kar, Mengzhou Xia, Danqi Chen, Mikel Artetxe
Empirical Methods in Natural Language Processing (EMNLP) 2022
[paper]
Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines
Sreejan Kumar, Carlos G. Correa, Ishita Dasgupta, Raja Marjieh, Michael Y. Hu, Robert D. Hawkins, Nathaniel D. Daw, Jonathan D. Cohen, Karthik Narasimhan, Thomas L. Griffiths
Neural Information Processing Systems (NeurIPS) 2022
[paper]
DataMUX: Data Multiplexing for Neural Networks
Vishvak Murahari, Carlos E. Jimenez, Runzhe Yang, Karthik Narasimhan
Neural Information Processing Systems (NeurIPS) 2022
[paper] [code] [site]
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
Neural Information Processing Systems (NeurIPS) 2022
[paper] [code] [site]
Recovering Private Text in Federated Learning of Language Models
Samyak Gupta, Yangsibo Huang, Zexuan Zhong, Tianyu Gao, Kai Li, Danqi Chen
Neural Information Processing Systems (NeurIPS) 2022
[paper] [code]
Learning Physics Constrained Dynamics Using Autoencoders
Tsung-Yen Yang, Justinian P. Rosca, Karthik Narasimhan, Peter Ramadge
Neural Information Processing Systems (NeurIPS) 2022
[paper]
Can Rationalization Improve Robustness?
Howard Chen, Jacqueline He, Karthik Narasimhan, Danqi Chen
North American Chapter of the Association for Computational Linguistics (NAACL) 2022
[paper] [code]
Should You Mask 15% in Masked Language Modeling?
Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
European Chapter of the ACL (EACL) 2022
[paper]
Ditch the Gold Standard: Re-evaluating Conversational Question Answering
Huihan Li, Tianyu Gao, Manan Goenka, Danqi Chen
Association for Computational Linguistics (ACL) 2022
[paper] [code] [slides] [talk]
Structured Pruning Learns Compact and Accurate Models
Mengzhou Xia, Zexuan Zhong, Danqi Chen
Association for Computational Linguistics (ACL) 2022
[paper] [code] [slides] [talk]
CARETS: A Consistency And Robustness Evaluative Test Suite for VQA
Carlos E. Jimenez, Olga Russakovsky, Karthik Narasimhan
Association for Computational Linguistics (ACL) 2022
[paper]
Multi-Stage Episodic Control for Strategic Exploration in Text Games
Jens Tuyls, Shunyu Yao, Sham Kakade, Karthik Narasimhan
International Conference on Learning Representations (ICLR) 2022
[paper] [code]
Linking Emergent and Natural Languages via Corpus Transfer
Shunyu Yao, Mo Yu, Yang Zhang, Karthik Narasimhan, Joshua Tenenbaum, Chuang Gan
International Conference on Learning Representations (ICLR) 2022
[paper]
Semantic Supervision: Enabling Generalization over Output Spaces
Austin W. Hanjie, Ameet Deshpande, Karthik Narasimhan
arXiv 2022
[paper] [code] [site]
Multi-query Video Retrieval
Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky
European Conference on Computer Vision (ECCV) 2022
[paper] [code]
When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer
Ameet Deshpande, Partha Talukdar, Karthik Narasimhan
North American Chapter of the Association for Computational Linguistics (NAACL) 2022
[paper] [code]
Safe Reinforcement Learning with Natural Language Constraints
Tsung-Yen Yang, Michael Hu, Yinlam Chow, Peter J. Ramadge, Karthik Narasimhan
Neural Information Processing Systems (NeurIPS) 2021
[paper]
SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark
Victor Zhong, Austin W. Hanjie, Sida I. Wang, Karthik Narasimhan, Luke Zettlemoyer
Neural Information Processing Systems (NeurIPS) 2021
[paper] [code]
Simple Entity-Centric Questions Challenge Dense Retrievers
Christopher Sciavolino, Zexuan Zhong, Jinhyuk Lee, Danqi Chen
Empirical Methods in Natural Language Processing (EMNLP) 2021
[paper] [code]
Phrase Retrieval Learns Passage Retrieval, Too
Jinhyuk Lee, Alexander Wettig, Danqi Chen
Empirical Methods in Natural Language Processing (EMNLP) 2021
[paper] [code]
Single-dataset Experts for Multi-dataset Question Answering
Dan Friedman, Ben Dodge, Danqi Chen
Empirical Methods in Natural Language Processing (EMNLP) 2021
[paper] [code]
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao, Xingcheng Yao, Danqi Chen
Association for Computational Linguistics (ACL) 2021
[paper] [code]
Accelerating Safe Reinforcement Learning with Constraint-mismatched Policies
Tsung-Yen Yang, Justinian Rosca, Karthik Narasimhan, Peter J. Ramadge
International Conference on Machine Learning (ICML) 2021
[paper] [code]
Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning
H.J. Austin Wang, Victor Zhong, Karthik Narasimhan
International Conference on Machine Learning (ICML) 2021
[paper] [code]
Learning Dense Representations of Phrases at Scale
Jinhyuk Lee, Mujeen Sung, Jaewoo Kang, Danqi Chen
Association for Computational Linguistics (ACL) 2021
[paper] [code]
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao, Adam Fisch, Danqi Chen
Association for Computational Linguistics (ACL) 2021
[paper] [code]
Self-Attention Networks Can Process Bounded Hierarchical Languages
Shunyu Yao, Binghui Peng, Christos Papadimitriou, Karthik Narasimhan
Association for Computational Linguistics (ACL) 2021
[paper] [code]
Improving Dialog Systems for Negotiation with Personality Modeling
Runzhe Yang, Jingxiao Chen, Karthik Narasimhan
Association for Computational Linguistics (ACL) 2021
[paper]
A Frustratingly Easy Approach for Entity and Relation Extraction
Zexuan Zhong, Danqi Chen
North American Chapter of the Association for Computational Linguistics (NAACL) 2021
[paper] [code]
Factual Probing Is [MASK]: Learning vs. Learning to Recall
Zexuan Zhong, Dan Friedman, Danqi Chen
North American Chapter of the Association for Computational Linguistics (NAACL) 2021
[paper] [code]
Universal Adversarial Attacks with Natural Triggers for Text Classification
Liwei Song, Xinwei Yu, Hsuan-Tung Peng, Karthik Narasimhan
North American Chapter of the Association for Computational Linguistics (NAACL) 2021
[paper] [code]
Reading and Acting while Blindfolded: The Need for Semantics in Text Game Agents
Shunyu Yao, Karthik Narasimhan, Matthew Hausknecht
North American Chapter of the Association for Computational Linguistics (NAACL) 2021
[paper]
Non-Parametric Few-Shot Learning for Word Sense Disambiguation
Howard Chen, Mengzhou Xia, Danqi Chen
North American Chapter of the Association for Computational Linguistics (NAACL) 2021
[paper] [code]
m-Stage Epsilon-Greedy Exploration for Reinforcement Learning
Rohan Rao, Karthik Narasimhan
AAAI-21 Workshop on Reinforcement Learning in Games 2021
[paper]
Learning Rewards from Linguistic Feedback
Theodore R. Sumers, Mark K. Ho, Robert D. Hawkins, Karthik Narasimhan, Thomas L. Griffiths
Thirty-Fifth AAAI Conference on Artificial Intelligence 2021
[paper]
Connecting Context-specific Adaptation in Humans to Meta-learning
Rachit Dubey, Erin Grant, Michael Luo, Karthik Narasimhan, Thomas Griffiths
arXiv 2020
[paper]
Keep CALM and Explore: Language Models for Action Generation in Text-based Games
Shunyu Yao, Rohan Rao, Matthew Hausknecht, Karthik Narasimhan
Empirical Methods in Natural Language Processing (EMNLP) 2020
[paper] [code]
Dense Passage Retrieval for Open-Domain Question Answering
Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih
Empirical Methods in Natural Language Processing (EMNLP) 2020
[paper] [code]
TextHide: Tackling Data Privacy in Language Understanding Tasks
Yangsibo Huang, Zhao Song, Danqi Chen, Kai Li, Sanjeev Arora
Findings of Empirical Methods in Natural Language Processing (EMNLP) 2020
[paper] [code]
Guiding Attention for Self-Supervised Learning with Transformers
Ameet Deshpande, Karthik Narasimhan
Findings of Empirical Methods in Natural Language Processing (EMNLP) 2020
[paper] [code]
Robust and Interpretable Grounding of Spatial References with Relation Networks
Tsung-Yen Yang, Andrew S. Lan, Karthik Narasimhan
Findings of Empirical Methods in Natural Language Processing (EMNLP) 2020
[paper] [code]
Multimodal Graph Networks for Compositional Generalization in Visual Question Answering
Raeid Saqur, Karthik Narasimhan
Neural Information Processing Systems (NeurIPS) 2020
[paper]
Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation
Zhiwei Deng, Karthik Narasimhan, Olga Russakovsky
Neural Information Processing Systems (NeurIPS) 2020
[paper]
SpanBERT: Improving Pre-training by Representing and Predicting Spans
Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy
Transactions of the Association for Computational Linguistics (TACL) 2020
[paper] [code]
Towards Unique and Informative Captioning of Images
Zeyu Wang, Berthy Feng, Karthik Narasimhan, Olga Russakovsky
European Conference on Computer Vision (ECCV) 2020
[paper]
Calibration, Entropy Rates, and Memory in Language Models
Mark Braverman, Xinyi Chen, Sham Kakade, Karthik Narasimhan, Cyril Zhang, Yi Zhang
International Conference on Machine Learning (ICML) 2020
[paper]
Projection Based Constrained Policy Optimization
Tsung-Yen Yang, Justinian Rosca, Karthik Narasimhan, Peter J. Ramadge
International Conference on Learning Representations (ICLR) 2020
[paper]
Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation
Felix Yu, Zhiwei Deng, Karthik Narasimhan, Olga Russakovsky
CVPR Visual Learning with Limited Labels Workshop 2020
[paper]
Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering
Sewon Min, Danqi Chen, Luke Zettlemoyer, Hannaneh Hajishirzi
arXiv 2019
[paper]
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov
arXiv 2019
[paper] [code]
A Discrete Hard EM Approach for Weakly Supervised Question Answering
Sewon Min, Danqi Chen, Hannaneh Hajishirzi, Luke Zettlemoyer
Empirical Methods in Natural Language Processing (EMNLP) 2019
[paper] [code]
CoQA: A Conversational Question Answering Challenge
Siva Reddy, Danqi Chen, Christopher D. Manning
Transactions of the Association for Computational Linguistics (TACL) 2019
[paper] [code] [site]
MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
Adam Fisch, Alon Talmor, Robin Jia, Minjoon Seo, Eunsol Choi, Danqi Chen
Proceedings of the 2nd Workshop on Machine Reading for Question Answering (MRQA) at EMNLP 2019
[paper] [code]
A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation
Runzhe Yang, Xingyuan Sun, Karthik Narasimhan
Neural Information Processing Systems (NeurIPS) 2019
[paper]
Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
Yilun Du, Karthik Narasimhan
International Conference on Machine Learning (ICML) 2019
[paper] [code]
A System-Wide Debugging Assistant Powered by Natural Language Processing
Pradeep Dogga, Karthik Narasimhan, Anirudh Sivaraman, Ravi Netravali
Proceedings of the ACM Symposium on Cloud Computing 2019
[paper]