Publications

I am primarily interested in building technology, but occasionally I publish papers.

"Learning Interpretable Style Embeddings via Prompting LLMs", arXiv (2023)

Ajay Patel, Delip Rao, Chris Callison-Burch, Learning Interpretable Style Embeddings via Prompting LLMs

Style representation learning builds content-independent representations of author style in text. Stylometry, the analysis of style in text, is often performed by expert forensic linguists, and no large dataset of stylometric annotations exists for training. Current style representation learning uses neural methods to disentangle style from content to create style vectors; however, these approaches result in uninterpretable representations, complicating their usage in downstream applications like authorship attribution where auditing and explainability are critical. In this work, we use prompting to perform stylometry on a large number of texts to create a synthetic dataset and train human-interpretable style representations we call LISA embeddings. We release our synthetic stylometry dataset and our interpretable style models as resources.

"Faithful Chain-of-Thought Reasoning", arXiv (2023)

Qing Lyu, Shreya Havaldar, Adam Stein, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch, Faithful Chain-of-Thought Reasoning

While Chain-of-Thought (CoT) prompting boosts Language Models’ (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (a.k.a. faithfulness). We propose Faithful CoT, a faithful-by-construction framework that decomposes a reasoning task into two stages: Translation (Natural Language query → symbolic reasoning chain) and Problem Solving (reasoning chain → answer), using an LM and a deterministic solver respectively. We demonstrate the efficacy of our approach on 10 reasoning datasets from 4 diverse domains. It outperforms traditional CoT prompting on 9 out of the 10 datasets, with an average accuracy gain of 4.4 on Math Word Problems, 1.9 on Planning, 4.0 on Multi-hop Question Answering (QA), and 18.1 on Logical Inference, under greedy decoding. Together with self-consistency decoding, we achieve new state-of-the-art few-shot performance on 7 out of the 10 datasets, showing a strong synergy between faithfulness and accuracy.

“Learning Efficient Representations for Fake Speech Detection”, AAAI (2020)

Nishant Subramani and Delip Rao. “Learning Efficient Representations for Fake Speech Detection”. In: Proceedings of the AAAI Conference on Artificial Intelligence. 2020.

Synthetic speech, or “fake speech”, that matches personal vocal traits has become better and cheaper due to advances in deep learning-based speech synthesis and voice conversion approaches. The increased accessibility of synthetic speech systems and their growing misuse highlight the critical need to build countermeasures. Furthermore, new synthesis models evolve all the time, and the efficacy of previously trained detection models on these unseen attack vectors is poor. In this paper, we focus on: 1) How can we build highly accurate, yet parameter and sample-efficient models for fake speech detection? 2) How can we rapidly adapt detection models to new sources of fake speech? We present four parameter-efficient convolutional architectures for fake speech detection with best detection F1 scores of around 97 points on a large dataset of fake and bonafide speech. We show how the fake speech detection task naturally lends itself to a novel multi-task problem, further improving F1 scores for a mere 0.5% increase in model parameters. Our multi-task setting also helps in data-sparse situations, commonplace in adversarial settings. We investigate an alternative approach to the data-sparsity problem using transfer learning and show that it is possible to meet purely supervised detection performance for unseen attack vectors with as little as 6.25% of the training data. This is the first known application of transfer learning in adversarial settings for speech. Finally, we show how well our transfer learning approach adapts in an instance-efficient way to new attack vectors using the Real-Time Voice Cloning toolkit. We exceed the purely supervised detection performance (99.18 F1) with as little as 6.25% of the data.

“Listening to the World Improves Speech Command Recognition”, AAAI (2018)

Brian McMahan and Delip Rao. “Listening to the World Improves Speech Command Recognition”. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018.

We study transfer learning in convolutional network architectures applied to the task of recognizing audio, such as environmental sound events and speech commands. Our key finding is that not only is it possible to transfer representations from an unrelated task like environmental sound classification to a voice-focused task like speech command recognition, but also that doing so improves accuracies significantly. We also investigate the effect of increased model capacity on transfer learning for audio, first validating on two audio datasets, UrbanSound8k and Google Speech Commands, the known result from Computer Vision that increasingly deep networks achieve better accuracies. Then we propose a simple multiscale input representation using dilated convolutions and show that it is able to aggregate larger contexts and increase classification performance. Further, the models trained using a combination of transfer learning and multiscale input representations need only 50% of the training data to achieve accuracies similar to those of a freshly trained model with 100% of the training data. Finally, we demonstrate a positive interaction effect for the multiscale input and transfer learning, making a case for the joint application of the two techniques.
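As a rough illustration of how dilation lets a filter aggregate a larger context, here is a minimal 1-D dilated convolution in plain Python. The function names and the toy signal are assumptions made for this sketch; it is not the architecture or the input representation from the paper.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output of a dilated filter."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D cross-correlation with `dilation - 1` skipped samples between taps."""
    span = receptive_field(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Each tap i reads the input at offset i * dilation from the window start.
        outputs.append(
            sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        )
    return outputs


# Toy input (hypothetical): a ramp, filtered by a simple difference kernel.
signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]

dense = dilated_conv1d(signal, kernel, dilation=1)  # each output spans 3 samples
wide = dilated_conv1d(signal, kernel, dilation=2)   # each output spans 5 samples
```

With the same three weights, the dilated filter compares samples four steps apart instead of two. Running filters at several dilation rates side by side is one way to get a multiscale view of the input.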

“Entity Linking: Finding Extracted Entities in a Knowledge Base”, Springer (2013)

Delip Rao, Paul McNamee, and Mark Dredze. “Entity linking: Finding extracted entities in a knowledge base”. In: Multi-source, Multilingual Information Extraction and Summarization. Springer, 2013, pp. 93–115.

In the menagerie of tasks for information extraction, entity linking is a new beast that has drawn a lot of attention from NLP practitioners and researchers recently. Entity Linking, also referred to as record linkage or entity resolution, involves aligning a textual mention of a named entity to an appropriate entry in a knowledge base, which may or may not contain the entity. This has manifold applications ranging from linking patient health records to maintaining personal credit files, prevention of identity crimes, and supporting law enforcement. We discuss the key challenges present in this task and we present a high-performing system that links entities using max-margin ranking. We also summarize recent work in this area and describe several open research problems.

“Typed-Graph Models for Semi-supervised Learning of Name Ethnicity”, ACL (2011)

Delip Rao and David Yarowsky. “Typed graph models for semi-supervised learning of name ethnicity”. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers-Volume 2. Association for Computational Linguistics. 2011, pp. 514–518.

This paper presents an original approach to semi-supervised learning of personal name ethnicity from typed graphs of morphophonemic features and first/last-name co-occurrence statistics. We frame this as a general solution to an inference problem over typed graphs where the edges represent labeled relations between features that are parameterized by the edge types. We propose a framework for parameter estimation on different constructions of typed graphs for this problem using a gradient-free optimization method based on grid search. Results on both in-domain and out-of-domain data show significant accuracy gains of over 30% using the techniques presented in the paper.

"Hierarchical Bayesian Models for Latent Attribute Detection in Social Media", ICWSM (2011)

Delip Rao, Michael Paul, Clay Fink, David Yarowsky, Timothy Oates, Glen Coppersmith. “Hierarchical Bayesian Models for Latent Attribute Detection in Social Media.” In: ICWSM 11 (2011), pp. 598–601.

We present several novel minimally-supervised models for detecting latent attributes of social media users, focusing on ethnicity and gender. Previous work on ethnicity detection has used coarse-grained, widely separated classes of ethnicity and assumed the existence of large amounts of training data, such as the US census, simplifying the problem. Instead, we examine content generated by users in addition to name morpho-phonemics to detect ethnicity and gender. Further, we address this problem in a challenging setting where the ethnicity classes are more fine-grained (ethnicity classes in Nigeria) and where very limited training data is available.

"Streaming Cross-document Entity Coreference Resolution", COLING (2010)

Delip Rao, Paul McNamee, and Mark Dredze. “Streaming cross-document entity coreference resolution”. In: Proceedings of the 23rd International Conference on Computational Linguistics: Posters. Association for Computational Linguistics. 2010, pp. 1050–1058.

Previous research in cross-document entity coreference has generally been restricted to the offline scenario where the set of documents is provided in advance. As a consequence, the dominant approach is based on greedy agglomerative clustering techniques that utilize pairwise vector comparisons and thus require $O(n^2)$ space and time. In this paper, we explore identifying coreferent entity mentions across documents in high-volume streaming text, including methods for utilizing orthographic and contextual information. We test our methods using several corpora to quantitatively measure both the efficacy and scalability of our streaming approach. We show that our approach scales to at least an order of magnitude larger data than previously reported methods.
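To make the contrast with pairwise agglomerative clustering concrete, the following is a minimal single-pass clustering sketch in Python. It is a generic illustration, not the method from the paper: the Jaccard similarity, the `threshold` value, and the toy mentions are all assumptions chosen for brevity.

```python
def jaccard(a, b):
    """Jaccard overlap between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def stream_cluster(mentions, threshold=0.3):
    """Assign each incoming mention to an entity cluster in one pass.

    Each mention is a list of name and context tokens. A mention is compared
    only against one summary per existing cluster, never against every
    earlier mention, so no pairwise similarity matrix is stored.
    """
    summaries = []  # one token set per cluster, grown as mentions arrive
    labels = []
    for tokens in mentions:
        features = set(tokens)
        best_id, best_sim = None, threshold
        for cluster_id, summary in enumerate(summaries):
            sim = jaccard(features, summary)
            if sim >= best_sim:
                best_id, best_sim = cluster_id, sim
        if best_id is None:
            # Nothing similar enough: start a new entity cluster.
            summaries.append(set(features))
            best_id = len(summaries) - 1
        else:
            summaries[best_id] |= features
        labels.append(best_id)
    return labels


# Hypothetical stream of mentions with their context words.
mentions = [
    ["barack", "obama", "president", "white", "house"],
    ["michael", "jordan", "basketball", "bulls"],
    ["obama", "president", "washington"],
    ["jordan", "basketball", "nba"],
]
labels = stream_cluster(mentions)
```

Because decisions are made as mentions arrive and never revisited, the result depends on arrival order, which is the usual price of a streaming formulation.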

"Entity Disambiguation for Knowledge Base Population", COLING (2010)

Mark Dredze, Paul McNamee, Delip Rao, Adam Gerber, and Tim Finin. “Entity disambiguation for knowledge base population”. In: Proceedings of the 23rd International Conference on Computational Linguistics. Association for Computational Linguistics. 2010, pp. 277–285.

The integration of facts derived from information extraction systems into existing knowledge bases requires a system to disambiguate entity mentions in the text. This is challenging due to issues such as non-uniform variations in entity names, mention ambiguity, and entities absent from a knowledge base. We present a state-of-the-art system for entity disambiguation that not only addresses these challenges but also scales to knowledge bases with several million entries using very few resources. Further, our approach achieves performance of up to 95% on entities mentioned in newswire and 80% on a public test set that was designed to include challenging queries.

"Ranking and Semi-Supervised Classification on Large Scale Graphs Using Map Reduce", ACL (2009)

Delip Rao and David Yarowsky. “Ranking and semi-supervised classification on large scale graphs using map-reduce”. In: Proceedings of the 2009 Workshop on Graph-based Methods for Natural Language Processing. Association for Computational Linguistics. 2009, pp. 58–65.

Label Propagation, a standard algorithm for semi-supervised classification, suffers from scalability issues involving memory and computation when used with large-scale graphs from real-world datasets. In this paper, we approach Label Propagation as a solution to a system of linear equations which can be implemented as a scalable parallel algorithm using the map-reduce framework. In addition to semi-supervised classification, this approach to Label Propagation allows us to adapt the algorithm to make it usable for ranking on graphs and derive the theoretical connection between Label Propagation and PageRank. We provide empirical evidence of that effect using two natural language tasks: lexical relatedness and polarity induction. The version of the Label Propagation algorithm presented here scales linearly in the data size with a constant main memory requirement, in contrast to the quadratic cost of both in traditional approaches.
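One way to see the linear-system view: with seed labels held fixed, each unlabeled node's score must equal the weighted average of its neighbors' scores, and that system can be solved by simple iteration. The single-process Python sketch below mimics a map step and a reduce step per iteration. The function name, graph encoding, and toy graph are assumptions for illustration; the abstract does not give the paper's exact formulation.

```python
from collections import defaultdict


def propagate(edges, seeds, iterations=50):
    """Iteratively solve for node scores with seed scores clamped.

    edges: {node: {neighbor: weight}} for an undirected graph (both directions listed).
    seeds: {node: score} for labeled nodes, which never change.
    """
    scores = {node: seeds.get(node, 0.0) for node in edges}
    for _ in range(iterations):
        # "Map": every node emits (weighted score, weight) to each neighbor.
        inbox = defaultdict(list)
        for node, neighbors in edges.items():
            for neighbor, weight in neighbors.items():
                inbox[neighbor].append((weight * scores[node], weight))
        # "Reduce": every unlabeled node takes the weighted average it received.
        updated = {}
        for node in edges:
            if node in seeds:
                updated[node] = seeds[node]
            else:
                total = sum(value for value, _ in inbox[node])
                norm = sum(weight for _, weight in inbox[node])
                updated[node] = total / norm if norm else 0.0
        scores = updated
    return scores


# Hypothetical chain a - b - c - d with a positive seed at one end
# and a negative seed at the other.
edges = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
scores = propagate(edges, seeds={"a": 1.0, "d": -1.0})
```

Each iteration touches every edge once and keeps one score per node, so the per-iteration work is linear in the size of the graph. In a real map-reduce job the two commented phases would run as distributed mappers and reducers rather than as loops in one process.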

"Semi-supervised Polarity Lexicon Induction", EACL (2009)

Delip Rao and Deepak Ravichandran. “Semi-supervised polarity lexicon induction”. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics. 2009, pp. 675–682.

We present an extensive study on the problem of detecting polarity of words. We consider the polarity of a word to be either positive or negative. For example, words such as good, beautiful, and wonderful are considered positive words, whereas words such as bad, ugly, and sad are considered negative words. We treat polarity detection as a semi-supervised label propagation problem in a graph. In the graph, each node represents a word whose polarity is to be determined. Each weighted edge encodes a relation that exists between two words. Each node (word) can have two labels: positive or negative. We study this framework in two different resource availability scenarios: using WordNet, and using the OpenOffice thesaurus when WordNet is not available. We report our results on three different languages: English, French, and Hindi. Our results indicate that label propagation improves significantly over the baseline and other semi-supervised learning methods like Mincuts and Randomized Mincuts for this task.

"Affinity Measures Based on the Graph Laplacian", COLING (2008)

Delip Rao, David Yarowsky, and Chris Callison-Burch. “Affinity measures based on the graph Laplacian”. In: Proceedings of the 3rd Textgraphs Workshop on Graph-Based Algorithms for Natural Language Processing. Association for Computational Linguistics. 2008, pp. 41–48.

Several language processing tasks can be inherently represented by a weighted graph where the weights are interpreted as a measure of relatedness between two vertices. Measuring similarity between arbitrary pairs of vertices is essential in solving several language processing problems on these datasets. Random walk based measures perform better than other path based measures like shortest-path. We evaluate several random walk measures and propose a new measure based on commute time. We use the pseudo-inverse of the Laplacian to derive estimates for commute times in graphs. Further, we show that this pseudo-inverse based measure could be improved by discarding the least significant eigenvectors, corresponding to the noise in the graph construction process, using singular value decomposition.
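For readers unfamiliar with the quantity, the commute time between vertices i and j of a connected weighted graph can be written as vol(G) · (L⁺ᵢᵢ + L⁺ⱼⱼ − 2L⁺ᵢⱼ), where L⁺ is the pseudo-inverse of the graph Laplacian and vol(G) is the sum of all vertex degrees. The pure-Python sketch below computes it for a tiny graph. It assumes a connected graph without self-loops and uses the identity L⁺ = (L + J/n)⁻¹ − J/n (J the all-ones matrix) in place of an eigendecomposition; the function names are made up for this example, and the eigenvector-truncation step from the abstract is not shown.

```python
def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (adequate for tiny matrices)."""
    n = len(matrix)
    aug = [
        list(row) + [float(i == j) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def laplacian_pinv(weights):
    """Pseudo-inverse of L = D - W for a connected graph with no self-loops."""
    n = len(weights)
    lap = [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]
    # Shifting by J/n makes L invertible; subtracting J/n afterwards undoes the shift.
    shifted = [[lap[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inverse = invert(shifted)
    return [[inverse[i][j] - 1.0 / n for j in range(n)] for i in range(n)]


def commute_time(weights, i, j):
    """Expected steps for a random walk to go from i to j and back to i."""
    pinv = laplacian_pinv(weights)
    volume = sum(sum(row) for row in weights)  # sum of all vertex degrees
    return volume * (pinv[i][i] + pinv[j][j] - 2 * pinv[i][j])


# Hypothetical path graph 0 - 1 - 2 with unit edge weights.
weights = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
near = commute_time(weights, 0, 1)
far = commute_time(weights, 0, 2)
```

A smaller commute time means two vertices are better connected, so it can serve as an inverse measure of relatedness. Unlike shortest-path distance, it takes every path between the two vertices into account.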

"An Unsupervised Approach to Person Name Disambiguation using Web Snippets", SemEval (2007)

Delip Rao, Nikesh Garera, and David Yarowsky. “JHU1: an unsupervised approach to person name disambiguation using web snippets”. In: Proceedings of the 4th International Workshop on Semantic Evaluations. Association for Computational Linguistics. 2007, pp. 199–202.

This paper presents an approach to person name disambiguation using K-means clustering on rich-feature-enhanced document vectors, augmented with additional web-extracted snippets surrounding the polysemous names to facilitate term bridging. This yields a significant F-measure improvement on the shared task training data set. The paper also illustrates the significant divergence between the properties of the training and test data in this shared task, substantially skewing results. Our system optimized on $F_{0.2}$ rather than $F_{0.5}$ would have achieved top performance in the shared task.

"Part of Speech Tagging and Shallow Parsing of Indian Languages", IJCAI (2007)

Delip Rao and David Yarowsky. “Part of speech tagging and shallow parsing of Indian languages”. In: Shallow Parsing for South Asian Languages (2007), p. 17.

This paper describes and evaluates shallow parsing of several Indian languages utilizing Conditional Random Field models. We show how performance can be substantially improved by several feature enhancements and improved modeling techniques, including expanding the chunk tag inventory, and separating punctuation from linguistic phrases. We also report results from part of speech tagging of Hindi, Bengali and Telugu using generative methods.