Library Acquisition Patterns is a report on trends in U.S. academic libraries’ book purchasing released in January by Ithaka S+R. ProQuest’s Bob Nardini spoke recently with Katherine Daniel, the report’s lead author. The interview was published in the June issue of Against the Grain. We’ve shared some highlights below.
Counting books used to be easy and was for a long time the primary way North American academic libraries kept score. How were they doing? Research libraries, especially, answered that question by turning to the annual “ARL Rankings.” Libraries counted how many books they’d added, and how many removed. Results sent to the Association of Research Libraries were published in a yearly ranking of members. If a library added enough books to move up a notch, that was reason to celebrate. If down, there was always next year.
Today this sounds antiquated. But in an era when the print book symbolized the mission of academic libraries, how else would you want to rank research libraries? Reference questions? Gate count? Books mattered most. Counting them wasn’t hard.
Today’s ARL rankings no longer focus on book counts. One reason that’s a good idea: it’s no longer easy to count books. That was one lesson learned by Katherine Daniel, principal author of “Library Acquisition Patterns,” a report published in January by Ithaka S+R, a well-known research organization. Ithaka had the aim “of examining trends in US academic libraries’ book purchasing,” through data extracted from the internal systems of a sample of libraries. That entailed counting books, or rather book orders. For the print books, this was easy enough, and thanks to Ithaka we now know that Amazon’s share of the print book market for academic libraries is likely around 11 percent.
With ebooks, on the other hand, “It … became apparent to me throughout this project that print and electronic resources are very different in the ways they are acquired,” Katherine told Bob. “With print,” she said, “you either have it on your shelf or you don’t. But with electronic, it’s more likely that you can get it in a bundle, you can have it for a limited amount of time, multiple people can access the same item at the same time … there’s a need for richer data that captures these differences between print and electronic resources, especially as the acquisition models for both continue to evolve at a rapid pace.”
The Ithaka study was Katherine’s initiation into the book business, and it was interesting to hear a newcomer’s take on the complexities known all too well to those of us immersed in it.
Bob Nardini: …What did you learn over the course of the [Library Acquisition Patterns] project?
Katherine Daniel: How complex this industry is; how the publishers, vendors, and libraries are so interconnected but, it appears to me, simultaneously siloed with ramifications for each group’s ability to make optimal business decisions. It also became apparent to me throughout this project that print and electronic resources are very different in the ways they are acquired. With print, you either have it on your shelf or you don’t. But with electronic, it’s more likely that you can get it in a bundle, you can have it for a limited amount of time, multiple people can access the same item at the same time. Like with everything else these days, there’s a need for richer data that captures these differences between print and electronic resources, especially as the acquisition models for both continue to evolve at a rapid pace.
BN: We certainly agree on the complexity. In fact, we felt ProQuest’s position in the ebook market was under-represented, due to the types of things you mention. The report itself highlighted data gathering as a major challenge. Can you tell us about some of those challenges?
KD: It was more so the data clean-up than the data gathering that was a challenge, although to your point, some of the data that would have been valuable to an analysis we weren’t able to collect. I think anyone who works with data anticipates that it’s not going to be perfect right off the bat, so it’s a matter of getting your hands on it, unspooling all the issues, and doing a lot of research to make an executive decision on how something should be cleaned up, which can be daunting in itself.
One of the big challenges was the degree of miscategorization present in the dataset, which made the analysis much less straightforward than pulling all book items into a subset and analyzing that. Instead, we had to research which items were most likely to be books in the first place — hence the pricing parameters we introduced in an attempt to isolate monographs — and craft a dataset based on those findings. Another example of a challenge was that publisher and vendor names were listed completely idiosyncratically and, in the former’s case, required quite a bit of engineering and manual work to standardize in the dataset, while finding every possible variant of a vendor’s name in the latter’s case was time-consuming but not particularly difficult. There were also the unanticipated challenges, like discovering that book packages are invoiced as one acquisition record instead of each book listed with its own record.
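For readers who work with acquisition data, the two clean-up steps Katherine describes (isolating likely monographs with pricing parameters, and collapsing vendor-name variants) might look something like the minimal Python sketch below. This is an illustration only, not the method used in the report: the price bounds, the `item_type` field, the alias table, and every function name are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical price bounds; the report's actual pricing parameters are not stated here.
MIN_PRICE = 5.00
MAX_PRICE = 500.00

# Hypothetical alias table mapping normalized name variants to one canonical name.
VENDOR_ALIASES = {
    "amazon": "Amazon",
    "amazon com": "Amazon",
    "amazon com inc": "Amazon",
    "proquest": "ProQuest",
    "proquest llc": "ProQuest",
}


@dataclass
class OrderRecord:
    title: str
    vendor: str
    price: float
    item_type: str  # hypothetical field, e.g. "book" or "serial"


def normalize_key(name: str) -> str:
    """Lowercase the name and reduce punctuation and extra spaces to single spaces."""
    key = re.sub(r"[^a-z0-9]+", " ", name.lower())
    return " ".join(key.split())


def canonical_vendor(name: str) -> str:
    """Return the canonical vendor name, or the trimmed original if no alias is known."""
    return VENDOR_ALIASES.get(normalize_key(name), name.strip())


def is_likely_monograph(record: OrderRecord) -> bool:
    """Keep items labeled as books whose price falls inside the assumed bounds."""
    return record.item_type == "book" and MIN_PRICE <= record.price <= MAX_PRICE


def clean(records: list[OrderRecord]) -> list[OrderRecord]:
    """Filter to likely monographs and standardize their vendor names."""
    return [
        OrderRecord(r.title, canonical_vendor(r.vendor), r.price, r.item_type)
        for r in records
        if is_likely_monograph(r)
    ]


sample = [
    OrderRecord("A Monograph", "Amazon.com, Inc.", 42.50, "book"),
    OrderRecord("An Ebook Package", "  PROQUEST LLC ", 18000.00, "book"),
    OrderRecord("A Journal", "Some Vendor", 120.00, "serial"),
]
cleaned = clean(sample)
```

In this sketch a package invoiced as a single high-priced record is dropped by the price filter, which mirrors the limitation described in the interview: nothing in one invoice line says how many books it covers.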
BN: If you were to organize a follow-up report, what questions would you try to answer?
KD: I’d like to look at the items classified as books in the data that are above our pricing parameters. These items accounted for a substantial amount of libraries’ book expenditures, and some of the feedback we received suggests that these more expensive items are actually books, not misclassified items, and their cost is a result of evolving business models for books as they themselves evolve, especially regarding digital books and how they are accessed. These items could therefore point toward future trends instead of the past trends that the report assesses. Ideally, a follow-up would also be able to bring in those book packages for analysis to examine their share of the market and if they make up for any of the print book declines we saw, but right now there’s no easy way to identify those packages.