Thursday, April 30, 2009

Springtime in Ohio

OCLC has had some interesting announcements over the last few weeks regarding the WorldCat platform. Their new partnership with Ebsco will really enrich WorldCat Local as an article discovery tool and bring it closer to being a kind of Google for libraries. It'll be interesting to see how much full text content they index vs. citation level indexing. This could be a huge step forward on the search fragmentation problem that federated searching has been trying to solve for a long time.

Andrew Pace commented recently in his blog on the spring weather in Ohio. Perhaps the warmer temperatures have those folks in Dublin thinking that they are in Northern California, looking out at the golden rolling hills around Silicon Valley rather than the verdant hills of central Ohio. His next post announces OCLC's plans to give away WorldCat Local for free (sort of)! Do these folks think they are running a Web 2.0 start-up company or what?

The bigger announcement was that OCLC is entering the ILS fray with a "web-scale" library management system. OCLC's description of the product makes the distinction between a SaaS model and what they are trying to achieve.
OCLC's vision is similar to Software as a Service (SaaS) but is distinguished by the cooperative "network effect" of all libraries using the same, shared hardware, services and data, rather than the alternative model of hosting hardware and software on behalf of individual libraries.
I think they are on the right track. The important idea here is that the OCLC community can aggregate library management data together and gain huge advantages. OCLC has holdings data and bibliographic data, which they have put to use effectively in searching. Circulation data, e-resource usage data, license data, etc. could bring major improvements in workflow and business intelligence.

The point that people miss here is that this endeavor is not about competing against other library management systems. It's about making libraries relevant in the broader, Google-centered information ecosystem. There are big problems with the way libraries work currently when viewed from the perspective of the modern day web:
  • resource fragmentation-we have too many silos of data for searching; people want the kind of big indexes that Google provides
  • the finite collection-if people want to read any article or a book, they should be able to click to it and have it appear; there is an expectation of this on the web, in the blogosphere, etc.; waiting a day for an article that is already digitized somewhere to be scanned and sent over ILL is too long; libraries are still tied to this notion that they provide their patrons access to a finite physical and licensed collection
  • walled garden effect-often you have to be going through the library's web gateway to benefit from its resources
  • Web/library sector content divide-our systems are often only aware of information resources within the products we provide--there is a disconnect with the broader web that tools like Google Scholar bridge
  • local value-what kind of local customization are libraries providing regarding information resources? I think often we fall short in providing enough added value to justify our existence as middlemen
If we don't solve some of these, we may lose our position as information provider/mediator to our communities.

Beyond making existing processes more efficient, the network level ILS should be an agent of change for the way that libraries purchase, license, and provide information. Its infrastructure and data should support more sophisticated arrangements with content providers (I think the aforementioned Ebsco arrangement demonstrates this).

In Karen Coyle's article on this initiative, she points out the connection between this project and some of the findings of the Working Group on the Future of Bibliographic Control:
A report from the Working Group on the Future of Bibliographic Control noted that libraries spend a great deal of time on repetitive tasks, such as cataloging best-sellers, while ignoring the most valuable aspects of their collections: the archives, the rare items, the unique collections. The report urged libraries to "transfer effort into higher value activity" and separately called for libraries to embrace the web as the primary technology infrastructure.
The web scale library management system should provide the tools for libraries to do this higher value work, including synthesizing and specializing resources for a local environment.

Furthermore, rather than competing with other library sector technology vendors, OCLC should build the infrastructure that allows those vendors to build services on top of the WorldCat platform in the same way that Flickr works with partner companies who add value to their services. I know this is a tricky process, but it probably starts with open APIs.

Monday, April 27, 2009

Evolution of library discovery systems in the web environment

An article that I've mentioned previously was just published in the Spring Issue of Oregon Library Association Quarterly, which is chock full of good articles on the future of library catalogs. It kind of sums up my thinking on library systems over the last several years.

Evolution of Library Discovery Systems in the Web Environment

Saturday, April 25, 2009

The gathering storm

I'm on my way home from Kentucky right now after giving a presentation last night on cloud computing at a NITLE workshop on collaboration at Centre College.

Had a few interesting questions about the presentation:

  • in the 80 core/20 context scenario, aren't you simply dumping the context work on your clientele, especially by sending them to mainstream applications?
  • in the 80 core/20 context scenario, won't most of your faculty, as is currently the case, be uninterested in going beyond the basics of technology? What will you do for them?

Monday, April 20, 2009

future scenarios for college IT departments

College IT in 2020, the dark scenario

In this case, the same forces that work to undermine the library also undercut the importance and influence of the IT department. Cloud computing, enabled by applications delivered from massive data centers over low cost bandwidth, has allowed applications formerly managed in house to be run from the network.

In 2009, when the department began its first major cloud initiative with GMail and Google Apps, it seemed like these applications were simply another type of enterprise software that they could control and manage centrally. But the trend has been toward a major decentralization in the management of IT resources.

Because the IT department no longer controls resources essential for networked, multi-user applications, the management of those apps has devolved to the departments. This is partially because there is less technology to manage, but also because the technology has become invisible. It's just an integral part of the work of each part of the enterprise. The business office manages the financial side of the ERP application, while the registrar handles the academic side. The HR department, as part of its mission to promote organizational effectiveness, manages the use of institutional communications software (email, groupware, calendaring, wikis, etc.).

The course management system no longer exists. Various fairly generic communication tools, the descendants of Google Apps, are easy to bring together for shared communication among students/faculty in a course and institutional data can be mashed within them. And there are many more discipline specific apps on the network. Many departments, academic and non, and individual faculty members buy applications on the network.

There remains some demand for academic technology support, but most faculty have personal networks, external and internal, where they can get the support that best fits their teaching/research niche. Faculty who are fairly non-technologically intensive in their academic work are able to navigate generic applications effectively--it's an invisible part of doing their work. Those who are at the cutting edge and pushing cyberinfrastructure to the limits need highly focused help that they acquire remotely.

End-user technology, including PCs, laptops, and handheld devices, has also gone decentralized. Over the years, these have become such personalized devices that people prefer to buy and configure their own, and the IT department no longer provisions the campus with desktop PCs, whether in computer labs or on employee desks. Employees are given a pay subsidy to provide their own personal devices.

IT's main role is to maintain the physical network and installed devices on campus, which it does by contracting out much of the work. It also continues to play an important but limited role in systems integration and security, stitching together external applications and supplying them with institutional data.

College IT in 2020, the bright scenario

In this case, we still see many applications move to the network. But there is still a need in the organization for the kind of concentrated expertise in data management, programming, software configuration, systems integration, and security that comes with a centralized information technology unit. Furthermore, there are several important organization-wide applications that benefit from centralized administration and integration. These include communication (email, groupware), ERP, fundraising, and the descendant of the current day CMS. These systems may be in the cloud, but they are 10X more sophisticated than their ancestors of today. Positions devoted to installing patches and tweaking databases of the old ERP system have evolved into new jobs to analyze, manipulate and mash up the data in these new systems.

With applications on the network, personal computers have indeed evolved to personal devices and, as in the dark scenario, IT has given up buying and installing desktop PCs for staff and student labs alike. The positions formerly supporting desktop installation and troubleshooting have been repurposed to academic technology support. Digital technology has become a huge part of research and teaching, with remote cyberinfrastructure resources serving as virtual laboratories in many disciplines. Students do their academic work in a digitally sophisticated manner that mirrors the way they'll need to work in 21st century organizations. Faculty, more overloaded than ever, turn to their local academic technologists to help with course design and research challenges.

The trend that we see at present, where higher education is scrambling to apply consumer applications like microblogging, wikis, lightweight video production, mobile apps, etc., has run its course. These technologies are still important, and have become part of the way the organization works. But a new wave of technologies (in 3D visualization, remote sensing, or ???) has emerged. These technologies involve expensive physical devices and favor implementation at the organizational level, and IT has stepped in to support them. On-site personnel are needed to install and configure a growing set of devices that we wouldn't recognize today.

IT is recognized as strategically critical to the competitiveness of the institution as progress in research and teaching is highly dependent on its effective use. The IT department is more important than ever.

Wednesday, April 15, 2009

future scenarios for the college library

I'm giving a talk on cloud computing at a library/IT conference sponsored by NITLE next week at Centre College, located in Danville, KY, the heart of Kentucky Bluegrass country.

One of the things I'd like to discuss is possible futures for college library and IT departments given current trends in cloud computing and digital technology more broadly. Guess this ties back to that "core vs. context" session at the NITLE Summit. My idea is to present two visions of the future: a "dark" future and a "bright" future, the dark one making the case that libraries and IT departments will basically shrink in size and importance, the bright one supporting the idea that their role will in fact strengthen in importance and influence.

A college library in 2020, the dark scenario:

In this case, libraries play a much less important role in bringing people and information together. Electronic access to book and journal content through open access academic publishing models combined with new models for purchasing content on an on-demand, per individual basis have removed the library as intermediary. Because the network allows it, smaller actors with specific needs now purchase, license, and manage content in more focused ways. Faculty license access to research databases for specific courses and maintain their own mini digital libraries in the cloud. Students purchase e-content on their own as they do their research, similar to the way they buy textbooks.

The library still exists as a rump organization. Physically it serves as a somewhat charming study hall. Much space formerly devoted to books has been cannibalized by various other interests on campus. The library still provides a few general purpose electronic research tools to the community as a whole, doles out micro-credits to purchase electronic content, and maintains a small collection of print materials for those disciplines still interested in the physical book. The reduced physical and electronic collections and correspondingly low usage statistics have led to smaller staffs in all library departments supporting the discovery-to-delivery chain: acquisitions, cataloging, collection development, systems, circulation, and ILL.

With more sophisticated search systems, finding basic academic articles and books on a topic has gotten easier and this has undermined the role of reference/instruction librarians. Students still need help with research, but because librarians no longer manage the most important research sources, their tacit authority in this area has waned. Students turn to other figures on campus for research help such as the faculty, more senior level undergrads, graduate students, etc.

Compared to other library departments, special collections has fared rather well, maintaining its existing staffing levels. The digital environment has amplified the impact of its work, making it visible to a wider audience, and because that work is of a unique nature, it faces little competition from the network. Nevertheless, its ability to grow is hampered because it is somewhat disconnected from the teaching mission of the institution. Efforts to offer digital archiving services for various constituencies have fallen flat as most campus departments prefer self management of digital archives in the cloud.

A college library in 2020, the bright scenario:

In this case, the role of the library as information provider and mediator stays strong and even grows.

The library still maintains its role as purchaser and provider of information for its institution for several reasons. The marketplace for academic information products remains complex, with many different commercial and non-profit providers, a wide range of formats, both physical and virtual (many of which we've never heard of right now), knotty copyright restrictions, and a wide range of purchasing and licensing options. The library is needed to manage this complexity. This environment is also ever changing and consequently the library has a particularly important role in providing access over time to information in out-of-date formats.

Furthermore, there is continued consensus on the value of giving students in an institution a bundle of information sources in which they can explore freely without incremental cost. Finally, a general inertia in academia and in the publishing and library worlds prevents too much change in the way academic information is bought and sold. The libraries love their budgets too much, and so do the publishers, and the symbolic value of the library prevents most schools from being too ruthless with budget cuts.

For these reasons, staffing in the entire discovery-to-delivery chain has remained fairly strong, though the roles have shifted somewhat from lower-paid physical processing positions to somewhat fewer higher paid, higher skill digital content management positions. Circulation and traditional acquisitions and book processing work have fallen off with less printed content being purchased. ILL has become mostly irrelevant but for esoteric items, as economical digital purchasing/delivery of per-item content has taken over.

Collection development has shifted away from picking individual books to purchasing and licensing aggregated sets, and the management of these sets is done using a globally connected integrated library system, where much of the management data is already populated. Managing (or synthesizing) this content requires strong analytical skills and the positions in charge of this work are fewer than the old paper acquisitions/serials management jobs but pay more and require more knowledge and skills. The systems work required to specialize and mobilize this content for the college lightens as it shifts to the network level.

As digitally formatted information becomes more of the norm, the outside demand for expertise in older printed and digital information formats unexpectedly grows and some librarians specialize in this kind of expertise. For instance, there is now a "printed materials" librarian specializing in book preservation and the nuances of the traditional codex. This person works in special collections and the main collection, which more and more is about book as art and artifact rather than book as just information delivery device.

Because of the very complex information environment, the demand for reference and instruction increases. As scholarship and scholarly communications evolve in the digital environment, navigating them becomes ever more complex. Expectations for what constitutes a college research project increase, with faculty demanding more than the traditional 10-page typewritten paper. Some of these increasing expectations could include: the increased use of images, multimedia, sophisticated manipulation of statistics, mining digital archives, and actually making the research a public contribution to a body of work. These increasing expectations correspond to the types of demands placed on students when they go to work in 21st century organizations after graduation. Faculty, already overworked, are even more so in 2020, and they need to leverage the library and librarians to make these complex research projects happen.

The evolution of research, scholarship and teaching in the digital environment creates new opportunities for what would have been cataloging and systems personnel in the old library. Faculty together with their students are creating organic niche digital collections of knowledge that they build on over time. Digital initiatives librarians and metadata experts serve as consultants in the construction of these archives, which provide a Web 2.0-style participatory mode of learning and advance knowledge in their own right.

The physical library, while perhaps relinquishing some of the space formerly occupied by physical books and journals, becomes ever more the congregating space for this type of collaborative learning and scholarship, and can now incorporate an array of student support services. The library remains a sanctuary for individual study and learning but also a collaborative place.

Special collections enjoys ever more relevance in the long tail world, especially as it makes its case that its presence on the network increases the institution's prestige globally. As the web matures and people begin to miss material from earlier decades that is suddenly lost, digital archiving becomes a high priority and a role that the library can fill for the college. Some of the positions devoted to circulating and processing print materials are repurposed in this area.

Overall, the library plays a bigger, better role than ever on campus.


Next up, the two scenarios for IT. And then a prescription to make the bright scenario happen. Actually, no, I'll be making the case that the library has some influence over which of these plays out but that much of it is out of our control.

Friday, April 10, 2009


Nick Carr has an interesting take on Google as the "middleman", how it has sort of stolen that role from the newspapers. Funny, I was just talking about newspapers and libraries as "middleman" organizations threatened by the Internet in a recent post.

I'm not sure I agree with his prescription for the news business.

Friday, April 3, 2009

Ed Ayers on Digital Scholarship

One of the more enjoyable parts of the NITLE Summit was the final keynote by Ed Ayers, president of the University of Richmond. Ayers is a historian previously at U Va and is one of the people behind the much vaunted Valley of the Shadow project.

The main thesis of his presentation was that collaborative digital scholarship that engages students can broaden the horizons of students at smaller institutions while at the same time providing a tighter sense of community and connectedness for students at larger institutions. The History Engine, which lets students/scholars contribute to a crowdsourced collection of historical episodes drawn from primary sources, was his prime example of this type of digital scholarship.

When Ayers came to Richmond, one of the things he asked for was a Digital Scholarship Lab, which has done The History Engine and a few other projects.

Ayers spoke of a missed opportunity by the academy to really embrace the revolution in networked technology. In his view, digital technology can be transformative to humanities scholarship, not just teaching.

I was encouraged by the talk, especially to have someone at the highest level of liberal arts college leadership encouraging the types of digital projects that we're trying to foster here at Watzek. He said that when college presidents get together and eat rubber chicken these are the kinds of things they like to show off to each other. What an endorsement!

In my mind, The History Engine falls into a category of project in which you have undergraduate students doing research and contributing publicly to an evolving body of knowledge. Ayers thought that students learn better when they know that their work is public and making an original contribution to a body of knowledge.

I can think of a few projects like that around here, chief among which would be our situated research initiative in Environmental Studies. We've also had interest from our SoAn folks in building a digital library of senior project bibliographies. Many of the science labs on campus accumulate data over the years through student research projects, too.

Should academic libraries develop competencies in building these collaboratively created digital knowledgebases? Will this kind of project be an aspect of future library expertise and service? If the library has been the laboratory for humanists to date, are these a future version of that laboratory?

Thursday, April 2, 2009

Moving Metadata into the Cloud

Here's the ppt of a poster presentation that I gave at the NITLE Summit 2009 in Philadelphia this past weekend. It's about moving metadata from local databases to global ones and cites applications of WorldCat Local,, and flickr as examples.

Nothing that I haven't blogged on before, but thought some folks might appreciate it.

Wednesday, April 1, 2009

The Kindle Question

At the "Core vs. Context" session at the NITLE Summit, Rick Holmgren of Allegheny College likened the dilemma of higher ed library and IT services to that of newspapers, making reference to Clay Shirky's piece on the latter.

Hopefully, I'll find some time to blog more on that session. But for now, I thought I'd toss out an idea that came to me this morning. Ever since that session, I've been thinking about what trigger might cause the current "organizational form" of the academic library to collapse.

Could new models for e-books be that trigger?

A liberal arts college library like ours does around 100,000 circulation transactions a year. We spend around $500K to buy print books. The staff time devoted to book selection, acquisitions, cataloging, ILS administration, circulation, stacks maintenance, etc. would easily equal $300K a year. I imagine that the space used to house the book stacks for a 400,000 volume collection in a library like ours is worth $200K a year or more not including study spaces in the library, computer labs, special collections, etc. (it would cost that much to rent similar space with all utilities included, etc.).

What if all the books that a college like ours needed were available electronically for the Kindle price of $10 each? $10 X 100,000 circ transactions=$1,000,000, about what libraries like ours are spending right now to keep up a print collection. So our library should just give a million dollars to Amazon a year and be done with it, right?

That begs the question, why even have the library or the college as the middle-man? Instead of raising their tuition for another year in a row, why not pass the $1,000,000 savings on to students, and let them buy the books they need themselves?

Of course, this is an extreme scenario, impossible at the moment. Currently, the Kindle only has a small fraction of the books we purchase and house. Most of our books cost more than $10 each, though this might be different if the economics of book sales changed. Institutionally purchasing content does have potential advantages in cost savings and perhaps the incentive it gives students to read and research widely without thinking about incremental costs.
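For anyone who wants to play with the back-of-the-envelope numbers above, here is a minimal Python sketch. The dollar figures are the rough estimates from this post, not audited budget numbers; the function names are made up for illustration; and it keeps the same simplifying assumption the post makes, namely that every circulation transaction turns into exactly one e-book purchase.

```python
# Rough annual figures taken from the post (estimates only).
PRINT_BOOK_BUDGET = 500_000   # spent on print books per year
STAFF_TIME = 300_000          # selection, acquisitions, cataloging, circulation, etc.
STACK_SPACE = 200_000         # rental value of stacks for a 400,000 volume collection
CIRC_TRANSACTIONS = 100_000   # circulation transactions per year


def print_collection_cost():
    """Total yearly cost of keeping up the print collection."""
    return PRINT_BOOK_BUDGET + STAFF_TIME + STACK_SPACE


def ebook_cost(price_per_book=10):
    """Yearly cost if every circulation became one e-book purchase (an assumption)."""
    return price_per_book * CIRC_TRANSACTIONS


def break_even_price():
    """E-book price at which the two models cost the same per year."""
    return print_collection_cost() / CIRC_TRANSACTIONS


if __name__ == "__main__":
    print(f"Print collection: ${print_collection_cost():,}")
    print(f"E-books at $10:   ${ebook_cost():,}")
    print(f"Break-even price: ${break_even_price():.2f}")
```

Changing `price_per_book` shows how sensitive the comparison is to the $10 assumption: at $25 a book, for instance, the e-book bill comes to two and a half times the print figure.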

The point I guess is that the network can remove the advantages that print institutions had in bundling together information. The bundle of information that a library provides in its stacks (and web site) might lose its value in the same way that the bundle of information that is a print (or even online) newspaper has.

Information is atomized in the network environment and middle man organizations are increasingly irrelevant. Libraries, bookstores, publishers are essentially middle men between the author and the reader just as newspapers are (were) middle men between journalist and reader.