Connecting the dots: Reader ratings, bibliographic data, and machine-learning algorithms for monograph selection






Traditional collection development relies heavily on human input: librarians draw on reviews, subject selection lists, and user requests. With the development of machine learning, more and more businesses seek automated methods to deliver results relevant to users. The recommender system, a subclass of information filtering that seeks to predict the "rating" or "preference" a user would give an item, is among the most successful applications of machine learning. It has been adopted by major e-commerce businesses such as Amazon, Netflix, and Expedia to generate product and media recommendations, making it a key factor in increasing average order value and the number of items per order.

Drawing inspiration from the benefits of recommender systems to business and their success in heightening the reliability of recommendations, we attempted to build optimal collection recommendations with machine-learning algorithms using Python. The purpose of this project is to help librarians make collection decisions using a recommender system, and in this presentation we will illustrate several examples of building such a system to aid in the selection of monographs. One example involves merging popular titles with reader rating data. We found that while The New York Times publishes best seller lists based on sales, those lists have no connection to reader ratings. By leveraging data from Goodreads, the world’s largest site for readers and book recommendations, we will build a simple recommender system that uses a matrix-factorization-based method to surface the New York Times best seller titles with higher reader ratings.
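
As a rough illustration of the matrix-factorization idea described above, the following minimal sketch factors a small reader-by-title rating matrix with stochastic gradient descent and then ranks best seller titles by their mean predicted rating. It is not the project's actual code: the titles, ratings, best seller set, hyperparameters, and the helper names `factorize` and `predict` are all invented for illustration, and a real pipeline would load Goodreads ratings and a New York Times list instead. It uses only the Python standard library.

```python
import random

def factorize(ratings, n_factors=2, n_epochs=500, lr=0.01, reg=0.02, seed=0):
    """Factor a reader-by-title matrix (0 = no rating) into P and Q by SGD."""
    rng = random.Random(seed)
    n_users, n_items = len(ratings), len(ratings[0])
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    # Train only on cells that actually hold a rating.
    observed = [(u, i) for u in range(n_users) for i in range(n_items)
                if ratings[u][i] > 0]
    for _ in range(n_epochs):
        for u, i in observed:
            err = ratings[u][i] - sum(p * q for p, q in zip(P[u], Q[i]))
            for k in range(n_factors):
                p_uk, q_ik = P[u][k], Q[i][k]
                P[u][k] += lr * (err * q_ik - reg * p_uk)
                Q[i][k] += lr * (err * p_uk - reg * q_ik)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating of title i by reader u (dot product of factors)."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

# Hypothetical toy data: rows are readers, columns are titles, 0 = unrated.
titles = ["Title A", "Title B", "Title C", "Title D"]
ratings = [
    [5, 4, 0, 1],
    [4, 0, 2, 1],
    [0, 5, 1, 0],
    [5, 4, 0, 2],
]
bestsellers = {"Title A", "Title B", "Title D"}  # assumed best seller list

P, Q = factorize(ratings)

# Mean predicted rating per title, including readers who never rated it.
mean_predicted = {
    title: sum(predict(P, Q, u, i) for u in range(len(ratings))) / len(ratings)
    for i, title in enumerate(titles)
}

# Keep only best seller titles and order them by predicted reader rating.
ranked = sorted((t for t in titles if t in bestsellers),
                key=mean_predicted.get, reverse=True)
print(ranked)
```

Ranking by mean predicted rating is only one possible design choice; a selector might instead apply a rating threshold or weight titles by the number of ratings they received.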

Another example is using a recommender system to refer selectors to books that are similar to a particular title, based on pairwise similarity scores. News services can already identify related articles of interest to readers based on the articles they have read in the past, so applying this approach to libraries is an exciting prospect. Drawing on bibliographic data from highly circulated items, the recommender system will suggest items with similar features using similarity metrics. The recommender system will use machine-learning algorithms not only to simplify collection development for librarians but also to help end users discover more items relevant to their interests.
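
The pairwise-similarity idea can be sketched in a few lines. The example below is a hedged illustration rather than the project's implementation: it assumes cosine similarity over simple token counts as the similarity metric, and the record identifiers, field names (`author`, `subjects`), and helper functions are hypothetical. A real system would draw its features from actual catalog records of highly circulated items.

```python
import math
from collections import Counter

def features(record):
    """Bag of lowercase tokens from a record's subject headings and author."""
    tokens = []
    for subject in record["subjects"]:
        tokens.extend(subject.lower().split())
    tokens.append(record["author"].lower())  # author kept as one token
    return Counter(tokens)

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def most_similar(target_id, catalog, top_n=2):
    """Return the top_n (record_id, score) pairs most similar to target_id."""
    target = features(catalog[target_id])
    scores = [(other_id, cosine(target, features(record)))
              for other_id, record in catalog.items()
              if other_id != target_id]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

# Hypothetical bibliographic records, invented for illustration.
catalog = {
    "rec1": {"author": "Author X", "subjects": ["Machine learning", "Libraries"]},
    "rec2": {"author": "Author Y", "subjects": ["Machine learning", "Statistics"]},
    "rec3": {"author": "Author Z", "subjects": ["Medieval history"]},
}

print(most_similar("rec1", catalog, top_n=1))
```

Raw token counts are the simplest possible feature set; weighting schemes such as TF-IDF, or additional fields such as classification numbers, could be substituted without changing the overall structure.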