Wednesday, September 23, 2009

Summon 'web scale'? I don't think so.

I think it's strange that Serials Solutions is attempting to apply the "web-scale" adjective to their Summon Service.

As far as I can tell, the library community has really co-opted this term from its original use, which pertained to computing infrastructure that could support web sites that handle huge amounts of traffic. Perhaps Lorcan Dempsey widened the use of the term in January 2007:
'Web-scale' refers to how major web presences architect systems and services to scale as use grows. But it also seems evocative in a broader way of the general attributes of the large gravitational hubs which are such a feature of the current web (eBay, Amazon, Google, WikiPedia, ...).
This reference to 'web scale' is now at the top of Google results for the term, making me think that the library community has just about taken over the term.

I attended a webinar on Summon yesterday, and found out that with Summon, Serials Solutions creates a broad index of content available to your library: books, journals, digital collections, etc. The data comes from two places: records that your library uploads, and the e-content vendors with which your library has relations. The data goes into a Solr index, which then can serve as a comprehensive discovery tool for your library's content. Because it is built on local data and tailored for a particular user community, this sounds much more like an 'intranet' type search than anything that is "web scale."
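To make the "one index for everything" idea concrete, here is a toy sketch of my own. It is not how Summon or Solr actually works; the record ids, the titles, and the `build_index` and `search` names are all made up for illustration. It just shows records from a library upload and from a vendor feed landing in a single searchable index.

```python
from collections import defaultdict


def build_index(*sources):
    """Merge records from several sources into one inverted index.

    Each source is a list of (record_id, title) pairs. The index maps a
    lowercase word to the set of record ids whose title contains it.
    """
    index = defaultdict(set)
    for records in sources:
        for record_id, title in records:
            for word in title.lower().split():
                index[word].add(record_id)
    return index


def search(index, query):
    """Return the ids of records whose titles contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    # .get() avoids creating empty entries for words that are not indexed.
    matches = [index.get(word, set()) for word in words]
    return set.intersection(*matches)


# Hypothetical sample data: one catalog record, one vendor article record.
catalog = [("cat:1", "Digital Libraries")]
vendor_articles = [("art:7", "Evaluating digital reference services")]
index = build_index(catalog, vendor_articles)
```

A search for "digital" here returns both the book and the article, which is the point of a unified index: the patron does not need to know which silo a record came from.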

WorldCat Local with its upcoming metasearch features does something similar, but I think that it can make a more legitimate claim to the "web scale" designation because it is attached to the WorldCat.org database. In my opinion, WorldCat.org is web scale in the sense that it is used and improved by a global community.

Summon and WorldCat Local are competing in the same discovery interface space. At first glance, it appears that Serials Solutions is ahead of OCLC in the incorporation of article content, perhaps because of their close relations with content vendors. OCLC seems to have the edge in books: they are able to leverage holdings data in relevance rankings and they have a more sophisticated treatment of various editions of the same work (FRBR). OCLC is also endeavoring to provide delivery services in addition to discovery.

It will be interesting to see if OCLC can use its global database and the Web 2.0 principle "it gets better the more people use it" to differentiate its product from competitors like Summon.

I don't think it's obvious, but what OCLC is trying to do with WorldCat is much bolder than what Serials Solutions is doing with Summon. With Summon, libraries are basically throwing all of their content into one index to break down the data silos within an institution. But what you end up with is a big search silo for that institution.

With WorldCat, the vision is to break down not only the silos within institutions but also the silos between institutions. And not just break down those silos in the sense of harvest-and-search. The concept is that libraries and their patrons will be working together to improve a shared database through intentional and professional metadata. This shared database will be big enough to have a real impact on the web. Its records will surface in search engine results. Its interface will be familiar to many, and it will be customizable for a particular audience via the WorldCat Local route.

We'll see if this grand vision takes hold.

Wednesday, September 9, 2009

WorldCat Local Review

I've written a fair amount in the abstract about the benefits of WorldCat.org and WorldCat Local.

At Watzek, we launched "L&C WorldCat" around July 1. Here are some thoughts based on my experience with the implementation.
  • There is already a sense developing at our school that "everything" is in or should be in WorldCat Local. People expect all articles and books to be there (even though they aren't). I may post more on this later.
  • Compared with launching an III OPAC, the process of bringing WCL up is refreshingly simple. They have consciously limited customization to the very basics (logo, colors, etc.)
  • Even so, as I've said before in this blog, I'd prefer a greater level of customizability, kind of on the level of Blogger. Give me full access to the stylesheet. Let me add code snippets.
  • It's backward that the software pulls in live holdings data for print items from your ILS, but can't pull in links to digital content from your link resolver. When students come upon an article, they want the direct link to it up front, not a click or two away. OCLC should scrape resolvers like they do ILSs to embed link resolver links in records for articles.
  • I'm excited about the idea of OCLC partnering with content providers like EBSCO and indexing their content in WC. One thing I speculated on when writing the Digital Libraries book in '06 was that following on the success of search engines, meta-indexing services for library content would eventually emerge. We now see that with Serials Solutions Summon and WorldCat.
  • The idea of also incorporating traditional real-time meta-searching seems like a backward compromise: OCLC should be firm with content providers and resolve to only incorporate content that they can put into their index.
  • The stats module for WCL is basically a commercial web analytics package slapped onto WCL with a few limited custom reports. Basically, you can look at your site traffic and search terms being used.
  • I like the idea of using standard web analytics software on WCL, but please let me drop the code snippet in for Google Analytics.
  • If they did some URL rewriting so as to map some of the search/browsing activity to clean URL paths (e.g., "/author/" "/title/" "/facet/video/"), web analytics software would become more useful because you could collate like activities based on URL paths.
  • For a minute, I was thinking that to provide access to an e-book package we purchased through WCL, all we'd need to do is "flip the switch" and activate our holdings for those records in WCL, forget about ILS records. But then I remembered: the URLs to that package need to go through our proxy server, so they need to be drawn from our ILS. WCL is not making our lives easier yet.
  • A little off the subject, but now that OCLC owns EZproxy, aren't they in a great position to develop some better, more graceful form of remote authentication than a proxy? OCLC could act as a trusted third party and provide single sign-on to content provider websites.
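To illustrate the URL rewriting point from the list above, here is a rough sketch of the kind of mapping I have in mind. The query parameter names ("au", "ti", "format") and the `clean_path` function are my own assumptions for illustration; I am not describing WorldCat Local's actual URL scheme.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical mapping from query parameter names to clean path prefixes.
# These parameter names are assumptions, not WCL's real ones.
PARAM_TO_PREFIX = {
    "au": "author",
    "ti": "title",
    "format": "facet",
}


def clean_path(url):
    """Return a clean path for a search URL, or its original path if none applies."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    for name, prefix in PARAM_TO_PREFIX.items():
        values = params.get(name)
        if values:
            # Lowercase, hyphenate, and percent-encode the first value.
            slug = quote(values[0].strip().lower().replace(" ", "-"), safe="-")
            return f"/{prefix}/{slug}/"
    return parts.path or "/"
```

With something like this in place, a format-limited search would be reported under a path such as "/facet/video/", so an analytics package could roll up all video browsing, all author searches, and so on, just by grouping on the first path segment.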
I will likely post more comments at a later time.

Tuesday, September 8, 2009

Economist on Google Books

The Economist has a leader supporting the Google Books Deal, and an interview with Paul Courant, Dean of Libraries at Univ. of Michigan.



He talks some about the product that Google will be offering to libraries with this deal.

I have to wonder if this product will be the watershed moment for e-books in academic libraries. If Google's library of books is big and broad enough to serve as a general library on its own, Google's platform for e-books could become the place to do research in books.

Much of its success will depend on how much current content is in their index, and this is really dependent on Google doing deals with thousands of publishers. If Google's index is largely made up of older scanned books, it'll be a useful research tool, but not compelling as a place for general research.

Google might become the place to do research in books, whereas recreational e-book reading will happen through other vendors like Amazon.

Thursday, September 3, 2009

What I did for my summer vacation


Our library picked up an Amazon Kindle for staff to try out, and I brought it on our family vacation to Manzanita on the Oregon Coast a couple weeks ago.

Let me give you some of my thoughts on it. My first impression was that it was kind of awkward to navigate. The little joysticky thing that functions as a mouse isn't all that intuitive. I kept wanting it to be like an iPhone/iTouch with a larger screen.

Once I figured out how to navigate content, I liked reading on it. The very simple presentation of text is refreshing. It eliminates the distractions of a PC operating system and really lets you concentrate on the text. The E Ink technology works well, though I do wish it could illuminate itself in the dark. It is slim and fits into a beach bag as easily as any paperback, though I popped it in a ziplock to keep out the sand.

I found myself navigating around the Amazon store some, reading samples of various books. Having this limited body of content to choose from--just books with a recreational bent, as opposed to the whole web--felt kind of relaxing. Sort of like I was in another limited media environment like a movie theater or flipping channels on TV. (I know it has a web browser built in, but I avoided it because I was on vacation.)

After I got back from vacation and was preparing for a class that I'm teaching on digital libraries, I decided to download a couple PDF reports to the Kindle, just to check out how they would work. (I had assigned a few reports for students to read and needed to read them fully for myself.) I used Stanza to convert them to Kindle format.

It was nice to go out in the backyard and hang out on the hammock and do the reading on the Kindle as opposed to my laptop.

The Kindle is a nice device for concentrated reading in the same way that a big flatscreen TV is a nice device for watching a feature length movie.

In some ways, I wish it wasn't even connected to the network. That way there would be even fewer distractions.

In other ways, I wish it was an iPhone with a bigger screen.

Wednesday, September 2, 2009

Video for Seattle Pacific U Retreat

I thought that I would post this video, shot as an introduction to my article in the Spring OLA Quarterly on the "Evolution of Library Discovery Systems in the Web Environment." Seattle Pacific University Library is using the article as a discussion piece for their retreat.



Evolution of Library Discovery Systems in the Web Environment