Abstract: Advances in sequencing technology have made available a plethora of panomics data for cancer research, yet the search for disease genes and drug targets remains a formidable challenge. Biological knowledge such as pathways can play an important role by constraining the search space and boosting the signal-to-noise ratio. The majority of this knowledge resides in text (e.g., journal publications), which has been undergoing its own exponential growth, making it imperative to develop machine reading methods for automatic knowledge extraction. In this talk, I will formulate the machine reading task for pathway extraction, review the state of the art and open challenges, and present our Literome project and our latest attack on the problem using grounded unsupervised semantic parsing.
Bio: Hoifung Poon is a researcher at Microsoft Research. His research interests are in advancing machine learning and natural language processing (NLP) to help automate discovery in genomics and precision medicine. His most recent work focuses on scaling semantic parsing to PubMed for extracting biological pathways, and on developing probabilistic methods to integrate pathways with high-throughput omics data in cancer systems biology. He has received Best Paper Awards at premier NLP and machine learning venues such as the Conference of the North American Chapter of the Association for Computational Linguistics, the Conference on Empirical Methods in Natural Language Processing, and the Conference on Uncertainty in Artificial Intelligence.