About
I am interested in
language technologies:
the core algorithms and models used in things
like
machine translation,
virtual assistants
and, more broadly,
information retrieval.
I am a Research Fellow (formerly Chief Scientist) at
ASAPP.
ASAPP is doing
amazing research
in NLP and ML applied to customer service.
I am also an Associate Researcher in the
NLP group at
Athens University of Economics and Business. Previously I was
a Research Scientist in the
Language Team at Google for 15 years.
Prior to that, I did my Ph.D. in NLP at the University of Pennsylvania, supervised by
Fernando Pereira.
Publications, Thesis, Pre-prints
Google Scholar
Thesis
2024
2023
-
Multi-Step Dialogue Workflow Action Prediction
R. Ramakrishnan, E. Elenberg, H. Narangodage, R. McDonald
arXiv
[PDF]
-
On the Effectiveness of Offline RL for Dialogue Response Generation
P Sodhi, F Wu, ER Elenberg, KQ Weinberger, R McDonald
International Conference on Machine Learning (ICML 2023)
[PDF]
-
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
F Wu, K Kim, S Watanabe, K Han, R McDonald, KQ Weinberger, Y Artzi
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)
[PDF]
2022
2021
-
Planning with Learned Entity Prompts for Abstractive Summarization
S Narayan, Y Zhao, J Maynez, G Simoes, V Nikolaev and R McDonald
Transactions of the Association for Computational Linguistics (TACL), 2021
[PDF][preprints]
-
Leveraging Type Descriptions for Zero-shot Named Entity Recognition and Classification
R. Aly, A. Vlachos and R. McDonald∗
Proceedings of the Association for Computational Linguistics (ACL 2021)
[PDF]
-
Focus Attention: Promoting Faithfulness and Diversity in Summarization
R. Aralikatte, S. Narayan, J. Maynez, S. Rothe and R. McDonald∗
Proceedings of the Association for Computational Linguistics (ACL 2021)
[PDF]
-
Zero-shot Neural Passage Retrieval via Domain-targeted Synthetic Question Generation
J. Ma, I. Korotkov, Y. Yang, K. Hall and R. McDonald
Proceedings of the European Association for Computational Linguistics (EACL 2021)
[PDF]
2020
-
RRF102: Meeting the TREC-COVID Challenge with a 100+ Runs Ensemble
M. Bendersky, H. Zhuang, J. Ma, S. Han, K. Hall and R. McDonald
arXiv
[PDF]
-
Stepwise Extractive Summarization and Planning with Structured Transformers
S. Narayan*, J. Maynez*, J. Adamek, D. Pighin, B. Bratanic and R. McDonald
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
[PDF]
-
Hybrid First-stage Retrieval Models for Biomedical Literature
J. Ma, I. Korotkov, K. Hall, R. McDonald
BioASQ (2020)
[PDF]
-
On Faithfulness and Factuality in Abstractive Summarization
J. Maynez*, S. Narayan*, B. Bohnet, R. McDonald
Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2020)
[PDF][DATA]
-
Zero-shot Neural Retrieval via Domain-targeted Synthetic Query Generation
J. Ma, I. Korotkov, Y. Yang, K. Hall and R. McDonald
arXiv pre-print
[pre-print PDF][Published EACL 2021 PDF]
-
QURIOUS: Question Generation Pretraining for Text Generation
S. Narayan, G. Simoes, J. Ma, H. Craighead, R McDonald
arXiv
[PDF]
-
BioMRC: A Dataset for Biomedical Machine Reading Comprehension
P. Stavropoulos, D. Pappas, I. Androutsopoulos and R. McDonald
Proceedings of the 19th Workshop on Biomedical Natural Language Processing (BioNLP 2020)
[PDF]
2019
-
Embedding Biomedical Ontologies by Jointly Encoding Network Structure and Textual Node Descriptors
S. Kotitsas, D. Pappas, I. Androutsopoulos, R. McDonald and M. Apidianaki
Proceedings of the 18th Workshop on Biomedical Natural Language Processing (BioNLP 2019)
[PDF]
-
Measuring Domain Portability and Error Propagation in Biomedical QA
S. Hosein, D. Andor and R. McDonald
Proceedings of the 7th BioASQ Workshop, 2019.
[PDF]
-
AUEB at BioASQ 7: Document and Snippet Retrieval
D. Pappas, G. Brokos, R. McDonald, I. Androutsopoulos
Proceedings of the 7th BioASQ Workshop, 2019.
[PDF]
2018
-
Deep Relevance Ranking Using Enhanced Document-Query Interactions
R. McDonald, G. Brokos and I. Androutsopoulos
Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.
[PDF]
[appendix]
-
AUEB at BioASQ 6: Document and Snippet Retrieval
G. Brokos, P. Liosis, R. McDonald, D. Pappas and I. Androutsopoulos
Proceedings of the 6th BioASQ Workshop, 2018.
[PDF]
-
Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive Token Encodings
B. Bohnet et al.
Proceedings of the Association for Computational Linguistics (ACL), 2018.
[PDF]
2017
2016
-
Generalized Transition-based Dependency Parsing via Control Parameters
B. Bohnet, R. McDonald, E. Pitler and J. Ma
Proceedings of the Association for Computational Linguistics (ACL), 2016.
[PDF]
-
Universal dependencies v1: A multilingual treebank collection
J. Nivre et al.
Proceedings of the International Conference on Language Resources and Evaluation (LREC), 2016.
[PDF]
-
Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning
M. Faruqui, R. McDonald, and R. Soricut
Transactions of the Association for Computational Linguistics (TACL), 2016.
[PDF]
2015
2014
-
Adapting Taggers to Twitter with (Less) Distant Supervision
B. Plank, D. Hovy, R. McDonald and A. Søgaard
International Conference on Computational Linguistics (COLING), 2014.
[PDF]
-
Enforcing Structural Diversity in Cube-pruned Dependency Parsing
H. Zhang and R. McDonald
Association for Computational Linguistics (ACL), 2014.
[PDF]
-
Constrained Arc-Eager Dependency Parsing
J. Nivre, Y. Goldberg and R. McDonald
Computational Linguistics (Squib), ?:?, 2014.
[PDF]
2013
-
Online Learning for Inexact Hypergraph Search
H. Zhang, L. Huang, K. Zhao and R. McDonald
Empirical Methods in Natural Language Processing (EMNLP), 2013.
[PDF]
-
Universal Dependency Annotation for Multilingual Parsing
R. McDonald, J. Nivre, Y. Quirmbach-Brundage, Y. Goldberg,
D. Das, K. Ganchev, K. Hall, S. Petrov, H. Zhang, O. Täckström,
C. Bedini, N. Bertomeu Castello and J. Lee
Association for Computational Linguistics (ACL), 2013.
[PDF][DATA]
-
Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging
O. Täckström, D. Das, S. Petrov, R. McDonald and J. Nivre
Transactions of the Association for Computational Linguistics 1(2013):1-12.
Presented at ACL, 2013.
[PDF]
-
Target Language Adaptation of Discriminative Transfer Parsers
O. Täckström, R. McDonald and J. Nivre
North American Association for Computational Linguistics (NAACL), 2013.
[PDF]
2012
-
Generalized Higher-Order Dependency Parsing with Cube Pruning
H. Zhang and R. McDonald
Empirical Methods in Natural Language Processing
and Computational Natural Language Learning (EMNLP-CoNLL), 2012.
[PDF]
-
Overview of the 2012 Shared Task on Parsing the Web
S. Petrov and R. McDonald
Notes of the First Workshop on Syntactic Analysis of Non-Canonical Language (SANCL), 2012.
[PDF]
-
Using Search-Logs to Improve Query Tagging
K. Ganchev, K. Hall, R. McDonald and S. Petrov
Association for Computational Linguistics (ACL), 2012.
[PDF]
-
Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
O. Täckström, R. McDonald and J. Uszkoreit
North American Association for Computational Linguistics (NAACL), 2012.
[Best student paper award]
[PDF]
-
A Universal Part-of-Speech Tagset
S. Petrov, D. Das and R. McDonald
International Conference on Language Resources and Evaluation (LREC), 2012
[PDF]
[DATA]
2011
-
Training Structured Prediction Models with Extrinsic Loss Functions
K. Hall, R. McDonald and S. Petrov
Domain Adaptation Workshop at NIPS 2011
Note: This is an extension of the Hall et al. EMNLP 2011 paper.
[PDF]
-
Training dependency parsers by jointly optimizing multiple objectives
K. Hall, R. McDonald, J. Katz-Brown and M. Ringgaard
Empirical Methods in Natural Language Processing (EMNLP), 2011
[PDF]
-
Multi-source Transfer of Delexicalized Dependency Parsers
R. McDonald, S. Petrov and K. Hall
Empirical Methods in Natural Language Processing (EMNLP), 2011
[PDF]
-
Training a Parser for Machine Translation Reordering
J. Katz-Brown, S. Petrov, R. McDonald, D. Talbot, F. Och, H. Ichikawa, M. Seno and H. Kazawa
Empirical Methods in Natural Language Processing (EMNLP), 2011
[PDF]
-
Semi-supervised Latent Variable Models for Sentence-level Sentiment Analysis
O. Täckström and R. McDonald
Association for Computational Linguistics (ACL), 2011
[PDF]
[DATA]
-
Analyzing and Integrating Dependency Parsers
R. McDonald and J. Nivre
Computational Linguistics, 37:1, 2011
[PDF]
-
Discovering Fine-grained Sentiment with Latent Variable Structured Prediction Models
O. Täckström and R. McDonald
European Conference on Information Retrieval (ECIR), 2011
Conference version: [PDF]
Tech-report: [PDF]
[HTML]
[DATA]
2010
-
Evaluation of Dependency Parsers on Unbounded Dependencies
J. Nivre, L. Rimell, R. McDonald and C. Gómez Rodríguez
International Conference on Computational Linguistics (COLING), 2010.
[PDF]
-
Learning to Classify the Scope of Negation for Improved Sentiment Analysis
I. Councill, R. McDonald and L. Velikovich
Negation and Speculation in Natural Language Processing (NeSp-NLP), 2010
[PDF]
-
Distributed Training Strategies for the Structured Perceptron
R. McDonald, K. Hall and G. Mann
North American Association for Computational Linguistics (NAACL), 2010.
[PDF]
-
The Viability of Web-derived Polarity Lexicons
L. Velikovich, S. Blair-Goldensohn, K. Hannan and R. McDonald
North American Association for Computational Linguistics (NAACL), 2010.
[PDF]
2009
-
Dependency Parsing
S. Kübler, R. McDonald and J. Nivre
Synthesis Lectures on Human Language Technologies, G. Hirst (ed.)
Morgan & Claypool Publishers
[e-print]
[Amazon]
[Google Books]
-
Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models
G. Mann, R. McDonald, M. Mohri, N. Silberman, and D. Walker
Neural Information Processing Systems (NIPS), 2009.
[PDF]
-
Sentiment Summarization: Evaluating and Learning User Preferences
K. Lerman, S. Blair-Goldensohn and R. McDonald
European Association for Computational Linguistics (EACL), 2009.
[PDF]
-
Contrastive Summarization: An Experiment with Consumer Reviews
K. Lerman and R. McDonald
North American Association for Computational Linguistics (NAACL), 2009.
[PDF]
2008
-
A Joint Model of Text and Aspect Ratings for Sentiment Summarization
I. Titov and R. McDonald
Association for Computational Linguistics (ACL), 2008.
[PDF]
-
Integrating Graph-based and Transition-based Dependency Parsers
J. Nivre and R. McDonald
Association for Computational Linguistics (ACL), 2008.
[PDF]
-
Building a Sentiment Summarizer for Local Service Reviews
S. Blair-Goldensohn, K. Hannan, R. McDonald, T. Neylon, G. Reis, and J. Reynar
WWW Workshop on NLP in the Information Explosion Era (NLPIX), 2008.
[PDF]
-
Modeling Online Reviews with Multi-Grain Topic Models
I. Titov and R. McDonald
International World Wide Web Conference (WWW), 2008.
[PDF]
2007
-
The CoNLL 2007 Shared Task on Dependency Parsing
J. Nivre, J. Hall, S. Kübler, R. McDonald, J. Nilsson, S. Riedel, and D. Yuret
Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), 2007.
[PDF]
-
Characterizing the Errors of Data-Driven Dependency Parsing Models
R. McDonald and J. Nivre
Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), 2007.
[PDF]
-
On the Complexity of Non-Projective Data-Driven Dependency Parsing
R. McDonald and G. Satta
International Conference on Parsing Technologies (IWPT), 2007.
[PDF]
-
Structured Models for Fine-to-Coarse Sentiment Analysis
R. McDonald, K. Hannan, T. Neylon, M. Wells, and J. Reynar
Association for Computational Linguistics (ACL), 2007
[PDF]
-
A Study of Global Inference Algorithms in Multi-Document Summarization
R. McDonald
European Conference on Information Retrieval (ECIR), 2007
[PDF]
2006
-
Automated recognition of malignancy mentions in biomedical literature
Y. Jin, R. McDonald, K. Lerman, M. Mandel, S. Carroll, M. Liberman, F. Pereira, R. S. Winters and P. S. White
BMC Bioinformatics 2006, 7:492
[PDF]
-
Domain Adaptation with Structural Correspondence Learning
J. Blitzer, R. McDonald and F. Pereira
Empirical Methods in Natural Language Processing (EMNLP), 2006
[PDF]
-
Multilingual Dependency Analysis with a Two-Stage
Discriminative Parser
R. McDonald, K. Lerman and F. Pereira
Conference on Computational Natural Language Learning (CoNLL), 2006
[PDF]
-
An automated procedure to identify biomedical articles that contain cancer-associated gene variants
Ryan McDonald, R. Scott Winters, Claire K. Ankuda, Joan A. Murphy, Amy
E. Rogers, Fernando Pereira, Marc S. Greenblatt, Peter S. White
Human Mutation, Volume 27 Issue 9, 2006
[PDF]
-
Online Learning of Approximate Dependency Parsing Algorithms
R. McDonald and F. Pereira
European Association for Computational Linguistics (EACL), 2006
[PDF]
-
Discriminative Sentence Compression with Soft Syntactic Constraints
R. McDonald
European Association for Computational Linguistics (EACL), 2006
[PDF]
2005
-
Non-Projective Dependency Parsing using Spanning Tree Algorithms
R. McDonald, F. Pereira, K. Ribarov and J. Hajič
Human Language Technologies and Empirical Methods in Natural Language Processing (HLT-EMNLP), 2005
[Best Student Paper Award]
[PDF]
-
Flexible Text Segmentation with Structured Multilabel Classification
R. McDonald, K. Crammer and F. Pereira
Human Language Technologies and Empirical Methods in Natural Language Processing (HLT-EMNLP), 2005
[PDF]
-
Simple Algorithms for Complex Relation Extraction
with Applications to Biomedical IE
R. McDonald, F. Pereira, S. Kulick, S. Winters, Y. Jin and P. White
Association for Computational Linguistics (ACL), 2005
[PDF]
-
Online Large-Margin Training of Dependency Parsers
Ryan McDonald, Koby Crammer and Fernando Pereira
Association for Computational Linguistics (ACL), 2005
[PDF], or a more detailed and updated
Tech Report
-
Identifying and Extracting Malignancy Types in
Cancer Literature
Y. Jin, R. McDonald, K. Lerman, M. Mandel, M. Liberman, F. Pereira,
R.S. Winters and P.S. White
Linking Literature, Information and Knowledge for Biology (BioLink), 2005
[PDF]
-
Automatically annotating documents with normalized gene lists
Jay Crim, Ryan McDonald and Fernando Pereira
BMC Bioinformatics 2005, 6(Suppl 1):S13
[PDF]
-
Identifying gene and protein mentions in text using conditional random fields
Ryan McDonald and Fernando Pereira
BMC Bioinformatics 2005, 6(Suppl 1):S6
[PDF]
-
Spanning Tree Methods for Discriminative Training
of Dependency Parsers
Ryan McDonald, Koby Crammer and Fernando Pereira
UPenn CIS Technical Report: MS-CIS-05-11
[PDF]
2004
-
An entity tagger for recognizing acquired genomic
variations in cancer literature
R. McDonald, R.S. Winters, M. Mandel, Y. Jin, P.S. White and F. Pereira
Journal of Bioinformatics, 2004.
[PDF]
-
Integrated Annotation for Biomedical Information Extraction
S. Kulick, A. Bies, M. Liberman, M. Mandel, R. McDonald, M. Palmer,
A. Schein, L. Ungar, S. Winters and P. White
Linking Biological Literature, Ontologies and Databases (BioLink), 2004
[PDF]
-
New Large Margin Algorithms for Structure Prediction
Koby Crammer, Ryan McDonald and Fernando Pereira
NIPS Workshop on Learning With Structured Outputs, 2004
-
Scalable Large Margin Online Learning Algorithms for Structured Classification
Ryan McDonald, Koby Crammer and Fernando Pereira
NIPS Workshop on Learning With Structured Outputs, 2004
-
Extracting Relations from Unstructured Text
Ryan McDonald
My WPEII review paper for admission to PhD candidacy, 2004.
UPenn CIS Technical Report: MS-CIS-05-06
[PDF]
-
Automatically Annotating Documents with Normalized Gene Lists
Jay Crim, Ryan McDonald and Fernando Pereira
A critical assessment of text mining methods in molecular biology, BioCreative, 2004
[PDF]
-
Identifying Gene and Protein Mentions in Text Using
Conditional Random Fields
Ryan McDonald and Fernando Pereira
A critical assessment of text mining methods in molecular biology, BioCreative, 2004
[PDF]
Before 2004
-
Exploiting Sequent Structure in Membership Algorithms for the Lambek Calculus
Ryan McDonald
15th Annual European Summer School in Logic Language and Information (ESSLLI), 2003
[PDF]
-
Flexible Web Document Analysis for Delivery to Narrow-Bandwidth Devices
Gerald Penn, Jianying Hu, Hengbin Luo and Ryan McDonald
International Conference on Document Analysis and Recognition (ICDAR), 2001
[PDF]
-
A Distributed Social MUD to Enhance Reliability and Scalability
Ryan McDonald and Nick Montfort, 2003.
Unpublished
[PDF]
Software and Data
Note: I try to answer emails about my software, in particular for MSTParser. However,
time constraints limit the number of requests I can manage.
-
Multilingual treebanks annotated/converted in a harmonized scheme.
-
Implementation of parsers described in ACL and HLT-EMNLP '05 papers.
-
General online structured learning package.
-
Conditional Random Field Biomedical Entity Tagger (Genes, Variations and
Malignancies). This tagger (or variants of it) forms part of the core technology of the following resources:
- FABLE biomedical literature search engine
- PlasmoDB relevant literature search
-
In the past I have been known to contribute to MALLET,
which is a general implementation of Conditional Random Fields
and other learning algorithms tailored specifically to language.
Andrew McCallum is
the primary writer and caretaker of the package
which is available
at http://mallet.cs.umass.edu
Teaching
Walking