Efron, Organisciak and Fenlon win best paper award at ASIS&T 2011
CIRSS faculty affiliate Miles Efron and CIRSS graduate students Peter Organisciak and Katrina Fenlon have won this year's best paper award at the American Society for Information Science and Technology (ASIS&T) annual meeting, held in New Orleans, LA, October 9-12, 2011. The three won for their joint paper "Building Topic Models in a Federated Digital Library Through Selective Document Exclusion."
Building Topic Models in a Federated Digital Library Through Selective Document Exclusion
Miles Efron, Peter Organisciak and Katrina Fenlon
Building topic models in federated digital collections presents numerous challenges due to metadata inconsistencies. The quality of topical metadata is difficult to ascertain and is interspersed with often irrelevant administrative metadata. In this study, we propose a way to improve topic modeling in large collections by identifying documents that convey only weak topical information. These documents are ignored when training topic models. Their topical associations are instead inferred after model training. A method is outlined for identifying weakly topical documents by defining runs of similar documents in a collection. In preliminary evaluation using a corpus from the Institute of Museum and Library Services Digital Collections and Content aggregation, results show an increase in coherence among words in topics. In showing this, we demonstrate that it may be beneficial to induce topic models using less, higher-quality data.
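
As a rough illustration of the exclusion idea in the abstract, the following Python sketch flags runs of adjacent, highly similar records and sets them aside before training. It is not the authors' implementation: the abstract does not say how similarity or runs are defined, so the Jaccard measure over word sets, the `threshold` and `min_run` parameters, and the function names are all assumptions made for this example.

```python
def tokens(text):
    """Lowercased word set for one metadata record."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_runs(docs, threshold=0.6, min_run=3):
    """Return (start, end) index pairs, end exclusive, for runs of adjacent
    records whose pairwise similarity stays at or above `threshold`.
    Both parameters are hypothetical; the source gives no values."""
    runs = []
    start = 0
    for i in range(1, len(docs) + 1):
        # A run closes at the end of the list or when neighbours stop being similar.
        if i == len(docs) or jaccard(tokens(docs[i - 1]), tokens(docs[i])) < threshold:
            if i - start >= min_run:
                runs.append((start, i))
            start = i
    return runs


def split_corpus(docs, threshold=0.6, min_run=3):
    """Split records into a training set and a set excluded from training."""
    excluded = set()
    for start, end in find_runs(docs, threshold, min_run):
        excluded.update(range(start, end))
    train = [d for i, d in enumerate(docs) if i not in excluded]
    held_out = [d for i, d in enumerate(docs) if i in excluded]
    return train, held_out


# Made-up records for illustration only.
docs = [
    "civil war letters from a union soldier",
    "photograph box 1 folder 1",
    "photograph box 1 folder 2",
    "photograph box 1 folder 3",
    "map of illinois railroads 1870",
]
train, held_out = split_corpus(docs)
```

In this toy input the three near-identical "photograph box" records form one run and are excluded, leaving the two distinctive records for training. A topic model would then be trained on `train` only, and topic assignments for `held_out` would be inferred from the trained model afterward; that step depends on the topic modeling library used and is not shown here.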