Publications

  • Trainable Transformer in Transformer
    Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia, Sanjeev Arora
    arXiv 2023
    [paper]
  • InstructEval: Systematic Evaluation of Instruction Selection Methods
    Anirudh Ajith, Chris Pan, Mengzhou Xia, Ameet Deshpande, Karthik Narasimhan
    arXiv 2023
    [paper] [code]
  • InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback
    John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao
    arXiv 2023
    [paper] [code] [site]
  • Enabling Large Language Models to Generate Text with Citations
    Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen
    arXiv 2023
    [paper] [code]
  • MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
    Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen
    arXiv 2023
    [paper] [code]
  • C-STS: Conditional Semantic Textual Similarity
    Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak Murahari, Victoria Graf, Tanmay Rajpurohit, Ashwin Kalyan, Danqi Chen, Karthik Narasimhan
    arXiv 2023
    [paper] [code]
  • Adapting Language Models to Compress Contexts
    Alexis Chevalier, Alexander Wettig, Anirudh Ajith, Danqi Chen
    arXiv 2023
    [paper] [code]
  • Referral Augmentation for Zero-Shot Information Retrieval
    Michael Tang, Shunyu Yao, John Yang, Karthik Narasimhan
    arXiv 2023
    [paper] [code]
  • Toxicity in ChatGPT: Analyzing Persona-assigned Language Models
    Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
    arXiv 2023
    [paper] [blog]
  • Learning Transformer Programs
    Dan Friedman, Alexander Wettig, Danqi Chen
    arXiv 2023
    [paper] [code]
  • Tree of Thoughts: Deliberate Problem Solving with Large Language Models
    Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
    arXiv 2023
    [paper] [code]
  • Fine-Tuning Language Models with Just Forward Passes
    Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D. Lee, Danqi Chen, Sanjeev Arora
    arXiv 2023
    [paper] [code]
  • Privacy Implications of Retrieval-Based Language Models
    Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, Danqi Chen
    arXiv 2023
    [paper]
  • MUX-PLMs: Data Multiplexing for High-throughput Language Models
    Vishvak Murahari, Ameet Deshpande, Carlos E. Jimenez, Izhak Shafran, Mingqiu Wang, Yuan Cao, Karthik Narasimhan
    arXiv 2023
    [paper] [code]
  • Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations
    Chenglei Si, Dan Friedman, Nitish Joshi, Shi Feng, Danqi Chen, He He
    Association for Computational Linguistics (ACL) 2023
    [paper]
  • Training Trajectories of Language Models Across Scales
    Mengzhou Xia, Mikel Artetxe, Chunting Zhou, Xi Victoria Lin, Ramakanth Pasunuru, Danqi Chen, Luke Zettlemoyer, Ves Stoyanov
    Association for Computational Linguistics (ACL) 2023
    [paper]
  • What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning
    Jane Pan, Tianyu Gao, Howard Chen, Danqi Chen
    Findings of the Association for Computational Linguistics (ACL) 2023
    [paper] [code]
  • Optimizing Test-Time Query Representations for Dense Retrieval
    Mujeen Sung, Jungsoo Park, Jaewoo Kang, Danqi Chen, Jinhyuk Lee
    Findings of the Association for Computational Linguistics (ACL) 2023
    [paper]
  • ReAct: Synergizing Reasoning and Acting in Language Models
    Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
    International Conference on Learning Representations (ICLR) 2023
    [paper] [code] [site] [blog]
  • Task-Specific Skill Localization in Fine-tuned Language Models
    Abhishek Panigrahi, Nikunj Saunshi, Haoyu Zhao, Sanjeev Arora
    International Conference on Machine Learning (ICML) 2023
    [paper]
  • SemSup-XC: Semantic Supervision for Zero and Few-shot Extreme Classification
    Pranjal Aggarwal, Ameet Deshpande, Karthik Narasimhan
    International Conference on Machine Learning (ICML) 2023
    [paper] [code]
  • A Kernel-Based View of Language Model Fine-Tuning
    Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora
    International Conference on Machine Learning (ICML) 2023
    [paper] [code]
  • Controllable Text Generation with Language Constraints
    Howard Chen, Huihan Li, Danqi Chen, Karthik Narasimhan
    arXiv 2022
    [paper] [code]
  • Training Language Models with Memory Augmentation
    Zexuan Zhong, Tao Lei, Danqi Chen
    Empirical Methods in Natural Language Processing (EMNLP) 2022
    [paper] [code]
  • Finding Dataset Shortcuts with Grammar Induction
    Dan Friedman, Alexander Wettig, Danqi Chen
    Empirical Methods in Natural Language Processing (EMNLP) 2022
    [paper] [code]
  • Generating Natural Language Proofs with Verifier-Guided Search
    Kaiyu Yang, Jia Deng, Danqi Chen
    Empirical Methods in Natural Language Processing (EMNLP) 2022
    [paper] [code]
  • MABEL: Attenuating Gender Bias using Textual Entailment Data
    Jacqueline He, Mengzhou Xia, Christiane Fellbaum, Danqi Chen
    Empirical Methods in Natural Language Processing (EMNLP) 2022
    [paper] [code]
  • Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models
    Mengzhou Xia, Mikel Artetxe, Jingfei Du, Danqi Chen, Ves Stoyanov
    Empirical Methods in Natural Language Processing (EMNLP) 2022
    [paper] [code]
  • Don’t Prompt, Search! Mining-based Zero-Shot Learning with Language Models
    Mozes van de Kar, Mengzhou Xia, Danqi Chen, Mikel Artetxe
    Empirical Methods in Natural Language Processing (EMNLP) 2022
    [paper]
  • Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines
    Sreejan Kumar, Carlos G. Correa, Ishita Dasgupta, Raja Marjieh, Michael Y. Hu, Robert D. Hawkins, Nathaniel D. Daw, Jonathan D. Cohen, Karthik Narasimhan, Thomas L. Griffiths
    Neural Information Processing Systems (NeurIPS) 2022
    [paper]
  • DataMUX: Data Multiplexing for Neural Networks
    Vishvak Murahari, Carlos E. Jimenez, Runzhe Yang, Karthik Narasimhan
    Neural Information Processing Systems (NeurIPS) 2022
    [paper] [code] [site]
  • WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
    Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
    Neural Information Processing Systems (NeurIPS) 2022
    [paper] [code] [site]
  • Recovering Private Text in Federated Learning of Language Models
    Samyak Gupta, Yangsibo Huang, Zexuan Zhong, Tianyu Gao, Kai Li, Danqi Chen
    Neural Information Processing Systems (NeurIPS) 2022
    [paper] [code]
  • Learning Physics Constrained Dynamics Using Autoencoders
    Tsung-Yen Yang, Justinian P. Rosca, Karthik Narasimhan, Peter Ramadge
    Neural Information Processing Systems (NeurIPS) 2022
    [paper]
  • Can Rationalization Improve Robustness?
    Howard Chen, Jacqueline He, Karthik Narasimhan, Danqi Chen
    North American Chapter of the Association for Computational Linguistics (NAACL) 2022
    [paper] [code]
  • Should You Mask 15% in Masked Language Modeling?
    Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
    European Chapter of the ACL (EACL) 2022
    [paper]
  • Ditch the Gold Standard: Re-evaluating Conversational Question Answering
    Huihan Li, Tianyu Gao, Manan Goenka, Danqi Chen
    Association for Computational Linguistics (ACL) 2022
    [paper] [code] [slides] [talk]
  • Structured Pruning Learns Compact and Accurate Models
    Mengzhou Xia, Zexuan Zhong, Danqi Chen
    Association for Computational Linguistics (ACL) 2022
    [paper] [code] [slides] [talk]
  • CARETS: A Consistency And Robustness Evaluative Test Suite for VQA
    Carlos Jimenez, Olga Russakovsky, Karthik Narasimhan
    Association for Computational Linguistics (ACL) 2022
    [paper]
  • Multi-Stage Episodic Control for Strategic Exploration in Text Games
    Jens Tuyls, Shunyu Yao, Sham Kakade, Karthik Narasimhan
    International Conference on Learning Representations (ICLR) 2022
    [paper] [code]
  • Linking Emergent and Natural Languages via Corpus Transfer
    Shunyu Yao, Mo Yu, Yang Zhang, Karthik Narasimhan, Joshua Tenenbaum, Chuang Gan
    International Conference on Learning Representations (ICLR) 2022
    [paper]
  • Semantic Supervision: Enabling Generalization over Output Spaces
    Austin W. Hanjie, Ameet Deshpande, Karthik Narasimhan
    arXiv 2022
    [paper] [code] [site]
  • Multi-query Video Retrieval
    Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky
    European Conference on Computer Vision (ECCV) 2022
    [paper] [code]
  • When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer
    Ameet Deshpande, Partha Talukdar, Karthik Narasimhan
    North American Chapter of the Association for Computational Linguistics (NAACL) 2022
    [paper] [code]
  • Safe Reinforcement Learning with Natural Language Constraints
    Tsung-Yen Yang, Michael Hu, Yinlam Chow, Peter J. Ramadge, Karthik Narasimhan
    Neural Information Processing Systems (NeurIPS) 2021
    [paper]
  • SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark
    Victor Zhong, Austin W. Hanjie, Sida I. Wang, Karthik Narasimhan, Luke Zettlemoyer
    Neural Information Processing Systems (NeurIPS) 2021
    [paper] [code]
  • Simple Entity-Centric Questions Challenge Dense Retrievers
    Christopher Sciavolino, Zexuan Zhong, Jinhyuk Lee, Danqi Chen
    Empirical Methods in Natural Language Processing (EMNLP) 2021
    [paper] [code]
  • Phrase Retrieval Learns Passage Retrieval, Too
    Jinhyuk Lee, Alexander Wettig, Danqi Chen
    Empirical Methods in Natural Language Processing (EMNLP) 2021
    [paper] [code]
  • Single-dataset Experts for Multi-dataset Question Answering
    Dan Friedman, Ben Dodge, Danqi Chen
    Empirical Methods in Natural Language Processing (EMNLP) 2021
    [paper] [code]
  • SimCSE: Simple Contrastive Learning of Sentence Embeddings
    Tianyu Gao, Xingcheng Yao, Danqi Chen
    Association for Computational Linguistics (ACL) 2021
    [paper] [code]
  • Accelerating Safe Reinforcement Learning with Constraint-mismatched Policies
    Tsung-Yen Yang, Justinian Rosca, Karthik Narasimhan, Peter J. Ramadge
    International Conference on Machine Learning (ICML) 2021
    [paper] [code]
  • Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning
    H.J. Austin Wang, Victor Zhong, Karthik Narasimhan
    International Conference on Machine Learning (ICML) 2021
    [paper] [code]
  • Learning Dense Representations of Phrases at Scale
    Jinhyuk Lee, Mujeen Sung, Jaewoo Kang, Danqi Chen
    Association for Computational Linguistics (ACL) 2021
    [paper] [code]
  • Making Pre-trained Language Models Better Few-shot Learners
    Tianyu Gao, Adam Fisch, Danqi Chen
    Association for Computational Linguistics (ACL) 2021
    [paper] [code]
  • Self-Attention Networks Can Process Bounded Hierarchical Languages
    Shunyu Yao, Binghui Peng, Christos Papadimitriou, Karthik Narasimhan
    Association for Computational Linguistics (ACL) 2021
    [paper] [code]
  • Improving Dialog Systems for Negotiation with Personality Modeling
    Runzhe Yang, Jingxiao Chen, Karthik Narasimhan
    Association for Computational Linguistics (ACL) 2021
    [paper]
  • A Frustratingly Easy Approach for Entity and Relation Extraction
    Zexuan Zhong, Danqi Chen
    North American Chapter of the Association for Computational Linguistics (NAACL) 2021
    [paper] [code]
  • Factual Probing Is [MASK]: Learning vs. Learning to Recall
    Zexuan Zhong, Dan Friedman, Danqi Chen
    North American Chapter of the Association for Computational Linguistics (NAACL) 2021
    [paper] [code]
  • Universal Adversarial Attacks with Natural Triggers for Text Classification
    Liwei Song, Xinwei Yu, Hsuan-Tung Peng, Karthik Narasimhan
    North American Chapter of the Association for Computational Linguistics (NAACL) 2021
    [paper] [code]
  • Reading and Acting while Blindfolded: The Need for Semantics in Text Game Agents
    Shunyu Yao, Karthik Narasimhan, Matthew Hausknecht
    North American Chapter of the Association for Computational Linguistics (NAACL) 2021
    [paper]
  • Non-Parametric Few-Shot Learning for Word Sense Disambiguation
    Howard Chen, Mengzhou Xia, Danqi Chen
    North American Chapter of the Association for Computational Linguistics (NAACL) 2021
    [paper] [code]
  • m-Stage Epsilon-Greedy Exploration for Reinforcement Learning
    Rohan Rao, Karthik Narasimhan
    AAAI-21 Workshop on Reinforcement Learning in Games 2021
    [paper]
  • Learning Rewards from Linguistic Feedback
    Theodore R. Sumers, Mark K. Ho, Robert D. Hawkins, Karthik Narasimhan, Thomas L. Griffiths
    Thirty-Fifth AAAI Conference on Artificial Intelligence 2021
    [paper]
  • Connecting Context-specific Adaptation in Humans to Meta-learning
    Rachit Dubey, Erin Grant, Michael Luo, Karthik Narasimhan, Thomas Griffiths
    arXiv 2020
    [paper]
  • Keep CALM and Explore: Language Models for Action Generation in Text-based Games
    Shunyu Yao, Rohan Rao, Matthew Hausknecht, Karthik Narasimhan
    Empirical Methods in Natural Language Processing (EMNLP) 2020
    [paper] [code]
  • Dense Passage Retrieval for Open-Domain Question Answering
    Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih
    Empirical Methods in Natural Language Processing (EMNLP) 2020
    [paper] [code]
  • TextHide: Tackling Data Privacy in Language Understanding Tasks
    Yangsibo Huang, Zhao Song, Danqi Chen, Kai Li, Sanjeev Arora
    Findings of Empirical Methods in Natural Language Processing (EMNLP) 2020
    [paper] [code]
  • Guiding Attention for Self-Supervised Learning with Transformers
    Ameet Deshpande, Karthik Narasimhan
    Findings of Empirical Methods in Natural Language Processing (EMNLP) 2020
    [paper] [code]
  • Robust and Interpretable Grounding of Spatial References with Relation Networks
    Tsung-Yen Yang, Andrew S. Lan, Karthik Narasimhan
    Findings of Empirical Methods in Natural Language Processing (EMNLP) 2020
    [paper] [code]
  • Multimodal Graph Networks for Compositional Generalization in Visual Question Answering
    Raeid Saqur, Karthik Narasimhan
    Neural Information Processing Systems (NeurIPS) 2020
    [paper]
  • Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation
    Zhiwei Deng, Karthik Narasimhan, Olga Russakovsky
    Neural Information Processing Systems (NeurIPS) 2020
    [paper]
  • SpanBERT: Improving Pre-training by Representing and Predicting Spans
    Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy
    Transactions of the Association for Computational Linguistics (TACL) 2020
    [paper] [code]
  • Towards Unique and Informative Captioning of Images
    Zeyu Wang, Berthy Feng, Karthik Narasimhan, Olga Russakovsky
    European Conference on Computer Vision (ECCV) 2020
    [paper]
  • Calibration, Entropy Rates, and Memory in Language Models
    Mark Braverman, Xinyi Chen, Sham Kakade, Karthik Narasimhan, Cyril Zhang, Yi Zhang
    International Conference on Machine Learning (ICML) 2020
    [paper]
  • Projection Based Constrained Policy Optimization
    Tsung-Yen Yang, Justinian Rosca, Karthik Narasimhan, Peter J. Ramadge
    International Conference on Learning Representations (ICLR) 2020
    [paper]
  • Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation
    Felix Yu, Zhiwei Deng, Karthik Narasimhan, Olga Russakovsky
    CVPR Visual Learning with Limited Labels Workshop 2020
    [paper]
  • Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering
    Sewon Min, Danqi Chen, Luke Zettlemoyer, Hannaneh Hajishirzi
    arXiv 2019
    [paper]
  • RoBERTa: A Robustly Optimized BERT Pretraining Approach
    Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov
    arXiv 2019
    [paper] [code]
  • A Discrete Hard EM Approach for Weakly Supervised Question Answering
    Sewon Min, Danqi Chen, Hannaneh Hajishirzi, Luke Zettlemoyer
    Empirical Methods in Natural Language Processing (EMNLP) 2019
    [paper] [code]
  • CoQA: A Conversational Question Answering Challenge
    Siva Reddy, Danqi Chen, Christopher D. Manning
    Transactions of the Association for Computational Linguistics (TACL) 2019
    [paper] [code] [site]
  • MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
    Adam Fisch, Alon Talmor, Robin Jia, Minjoon Seo, Eunsol Choi, Danqi Chen
    Proceedings of the 2nd Machine Reading for Question Answering (MRQA) Workshop at EMNLP 2019
    [paper] [code]
  • A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation
    Runzhe Yang, Xingyuan Sun, Karthik Narasimhan
    Neural Information Processing Systems (NeurIPS) 2019
    [paper]
  • Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
    Yilun Du, Karthik Narasimhan
    International Conference on Machine Learning (ICML) 2019
    [paper] [code]
  • A System-Wide Debugging Assistant Powered by Natural Language Processing
    Pradeep Dogga, Karthik Narasimhan, Anirudh Sivaraman, Ravi Netravali
    Proceedings of the ACM Symposium on Cloud Computing 2019
    [paper]