Tuesday, December 18, 2007

"Innkeeper at the Roach Motel"

Dorothea Salo of UW-Madison (my alma mater!) makes some great points in this not-yet-published article on institutional repositories.

Friday, December 14, 2007


Talk of Google's Knols is spreading quickly across the blogosphere. This strikes me as a venture into online publishing, roughly on par with a traditional encyclopedia whose entries are written by paid "experts."

I see this as a welcome development that will compete with and complement Wikipedia rather than subvert it. As a side effect, it may also further erode usage of the kinds of reference tools that libraries purchase and provide to users.

I'm sensing that there is kind of a backlash against Google from the Wikipedia community on this one. But I say let them have a go at it. This is just another experiment in a form of online publishing that may or may not catch on.

One might worry that Google will force this stuff on their search engine users. But supposedly "editorial integrity" of search engine results is one of their guiding principles, and that would mean that if a Wikipedia article was more relevant than a Knol, it would still rise to the top.

How in the world did I get to be so pro-Google?

digital repositories at the network level

The Internet Archive and the Center for History and New Media are teaming up to create the "Zotero Commons", a kind of open, shared scholarly repository. This is an experiment with some similarities to institutional repository initiatives at academic libraries. Notably, the two organizations tackling this problem are not big players within the academic library world. It'll be interesting to see if this takes off. In the long run, it may make the most sense to collect and preserve unpublished scholarship at a level that transcends academic institutions. Some have been fretting that this leaves libraries out of the picture. Someone in the comments section of Library 2.0 pointed out, however, that the Internet Archive is officially recognized as a library.

Thursday, December 13, 2007

Google Universal Search as federated search approach

The Google Operating System blog offers some thoughts on Google's evolving approach to Universal Search (where they group results together from various Google indexes--Web, Books, Images, News, etc.). Even though Google has direct control over its various search 'silos', they are not trying to mix results together in Universal Search. Whether this is due to technical limitations, usability, or a combination of both, we can't be sure.

Trying to intermix and collectively rank results from a wide variety of search systems never seemed like a good approach to me, but that's just what many libraries have been attempting to do with federated searching. If Google can have its silos, why can't we have ours?

Maybe the Google approach reflects the utility of searching one type of media at a time: news stories, web sites, images, products, etc. Following this logic, should we stick with the idea of keeping article searches separate from book searches in library search offerings?

Our library is thinking through these questions as we attempt to package our major search options into a search widget for our website.
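As a thought experiment, a silo-respecting search widget is straightforward to sketch. The Python below is a minimal sketch, not our actual system: the silo names and stub search functions are invented stand-ins for real catalog and database APIs. It queries each source in parallel but keeps the result sets grouped by source, Universal Search-style, rather than interleaving everything into one relevance-ranked list.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical silo search functions; a real widget would call the
# catalog, article databases, and digital collections APIs instead.
def search_catalog(query):
    return [f"catalog: record matching '{query}'"]

def search_articles(query):
    return [f"articles: article matching '{query}'"]

def search_digital_collections(query):
    return [f"digital: image matching '{query}'"]

SILOS = {
    "Books & Media": search_catalog,
    "Articles": search_articles,
    "Digital Collections": search_digital_collections,
}

def grouped_search(query):
    """Query each silo in parallel, but keep the result sets separate
    (grouped, Universal Search-style) rather than trying to merge them
    into a single collectively-ranked list."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, query) for name, fn in SILOS.items()}
        return {name: f.result() for name, f in futures.items()}

results = grouped_search("portland history")
for silo, hits in results.items():
    print(silo, "->", hits)
```

The design choice mirrors the Google approach: each silo keeps its own ranking, and the hard (maybe impossible) problem of cross-silo relevance scoring is simply avoided.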

Thursday, December 6, 2007

politics of Wikipedia

A recent article reports that inside the "inner circle" of Wikipedia authors and editors, there are a lot of backstabbing politics. Does this mean that Wikipedia is getting more "academic" after all? Perhaps the old saying "the politics are so bad because the stakes are so low" applies equally to a college English department and to the literati of nonprofit Wikipedia. On the other hand, a significant number of people actually read what comes out of Wikipedia, so perhaps the stakes aren't quite as low as those in an English department.

Richard Skrenta makes the case that this kind of political maneuvering is normal in any "social game."

Wednesday, December 5, 2007

empty trading floors

Stock trading floors: yet another example of networked computing supplanting the need for a physical space:


My father-in-law, a big burly guy, was a trader on the Chicago Mercantile Exchange in the '70s and '80s. He's got some good stories about life on the floor. You had to be pretty tough and assertive to be good. Now you can just sit in front of your computer and push buttons if you want to trade.

Friday, November 30, 2007

flexible furnishings

Many academic libraries are creating new, informal, and flexible spaces that are catalysts for collaboration and intellectual growth. These spaces and the furniture that occupies them must be carefully balanced to accommodate the traditional printed volume.

A German firm has come up with what might be the perfect solution:

"Bookinist is a movable chair
designed especially for reading.
It is based on the principle of a pushcart
and can be rolled to a favourite
Ca. 80 paperbacks can be stored in
the armrests and the backrest.
Inclusive reading lamp and 2 hidden
compartments for writing utensils
in the armrests."

Thanks to Nikki Williams of Watzek Library for pointing this out.

Thursday, November 29, 2007

a hypothetical vendor response to the dis-integrating ILS

It's becoming clear that libraries can pick and choose from a variety of digital library products independent from their main ILS platform. These products include OpenURL resolvers, federated search systems, e-journal management services, digital asset management systems, and most recently, next-generation catalogs.

It's now possible, and perhaps even common, for a library (or a consortium of libraries) to have an aging but reliable ILS performing the basic inventory management functions of the 'bought' collection while a bevy of other digital library products perform the new digital library functions mentioned above.

But what about the vendor of that core ILS product? Isn't it likely that, prior to this disconnected array of digital library products, they were able to reliably sell upgrades and add-ons to their installed base of ILS customers? Now they must compete with many other vendors for the finite resources libraries have for new technology. It's also likely that this vendor isn't positioned well against some of the newer, more nimble firms in the digital library marketplace, and that even though they offer digital library products, they have trouble selling them even to their existing base of customers.

What is this vendor likely to do in this situation? Simply sit there, idly supporting their aging traditional ILS system and watch their chunk of revenue from the library technology marketplace decline? That would be pretty painful, I imagine.

The easy answer, of course, is that they need to reinvent themselves, innovate, and regain market share. But if that doesn't work, there is another option. Even though the traditional ILS is losing some of its strategic importance for libraries, it is still a core part of a library's operations. And generally speaking, libraries depend on their vendor's support to keep these systems running. Specifically, they depend on a support contract, which is a voluntary agreement between the vendor and the library. If a library system vendor doesn't like how a library (or perhaps a group of libraries) is spending its money, it could simply threaten to discontinue that contract in order to bring the library into line. After all, most libraries would be hard pressed to switch ILSs on short notice, as it is a costly and time-intensive undertaking.

Of course, this would be a dangerous game to play for a vendor in the long term, as a core ILS can be replaced with plenty of different options. But it is one tactic that a desperate ILS vendor could use against libraries in the short term.

Tuesday, November 27, 2007

Forbes and the cost of higher ed

I came across this article in Forbes at the doctor's office the other day. It's about the lack of controls on costs in higher education. Being inside the system at a private college, I can say that it seems totally accurate. There really isn't much pressure around to develop efficiencies. Enhance quality and provide new services, yes; develop efficiencies, no. Overall, I think the problem of rising costs is partially due to rising expectations for enhanced support services (computing, athletics programs/facilities, counseling centers, libraries, etc.) and partially due to a lack of cost-cutting culture in academia.

One figure the article points out is the number of nonteaching professionals per student, which has doubled since the 1970s.

What are colleges doing about their overhead? Not much. In 1929 universities spent 8 cents of each operating budget dollar on administration; today they spend 14 cents, says Richard Vedder, an economics professor at Ohio University in Athens. In 1976, he says, colleges had three nonteaching professionals for every 100 students; 25 years later they had six.
As one of those nonteaching professionals, I find this a little disturbing. I sometimes wonder if there will be a huge backlash against the increasing costs of private colleges (and ballooning student loans), or perhaps some other event that produces a rethinking of staffing at colleges. It's important for those of us on the higher ed payroll to realize that a good portion of our paychecks is being paid out of a student loan some 18-year-old took out.

One way to look at elite private colleges is that they are a boutique product, like microbrewed beer or good wine. There's demand for high-quality education done the expensive way, just as there is for other "artisan" products. Hence, it makes sense to focus on quality rather than the bottom line.

The top income earners in this country who fund much of private higher ed by picking up the tab on tuition and giving money to colleges like the idea of high quality education. And those top income earners have done rather well in the last 5-10 years or so, helping out many private colleges quite a bit. They probably also read Forbes.

fun with Google Analytics

I just hopped on my Google Analytics account and noticed that there was a spike in traffic to this blog from the Ohio region after my post last week on OCLC and network level services. Someone in Dublin is reading!

outsourcing IT

Inside Higher Ed has a good article on some of the issues surrounding university outsourcing of email to companies like Google and Microsoft. It presents the question as still a matter of debate, but come on: it's time to turn these kinds of services over to the organizations that do them best and most cost-effectively.

This passage is particularly good:

Through the 1980s, students in college who were affluent enough to come from households with a personal computer routinely experienced technology that was more advanced than what they were used to at home. “Over the course of the ’90s and into the decade that we’re almost finished with, universities have slipped considerably from that position and have gone sort of into the position of near-follower, and maybe ‘near’ is being charitable,” Sannier said.

Of course, it was universities that developed, refined and incubated the predecessor of the Internet, and they were some of the first institutions to adopt e-mail capability. But when it came time to offer the services to all students, rather than just faculty and researchers, many colleges created their own homegrown solutions. Now, some of them are suffering “from the innovators’ dilemma,” as Sannier put it, as software infrastructure intended for a smaller scale is increasingly strained to match growing numbers of students and their widening expectations.

The idea is that if colleges and universities stop putting so much energy into managing things like email systems, their IT departments can get back to work on bringing cutting-edge academic technology to students and faculty.

Tuesday, November 20, 2007

OCLC and network level services

OCLC is loading up on the big names in the digital library world. Of course, they've had Lorcan Dempsey for a while. They recently picked up Roy Tennant, and most recently they hired Andrew Pace as Executive Director of Network Level Services. That job title makes me think that Dempsey had something to do with designing the job. An old OCLC hand out here in the Pacific Northwest recently referred to him as "Lorcan our prophet." Apparently he really does have a hand in shaping company strategy.

I know Andrew Pace a bit from the '06 Frye Institute and offer him my congratulations. I used to see his columns in Computers in Libraries and think he was just another dork writing about library technology. At Frye, I discovered that he's a pretty enjoyable and interesting guy to listen to, and in person he's always dropping these funny, sometimes Southern-flavored aphorisms when describing various dilemmas and situations in the library world. I guess "lipstick on a pig" might be an example. I think he'll do a bang-up job in this new position. As much as he's a thinker and observer, I know he's also a doer, as the NCSU catalog demonstrates.

In my humble opinion, OCLC has a lot of work to do on their "network-level" services. Here are a few areas where they could stand to improve:
  • ILL: the current mish-mash of products for managing interlibrary loan is pretty lame and is a real time sucker for library systems people. In most situations, these include an ILL management system (Clio/Illiad) for workflow management and patron interaction, a system for sending and receiving documents (Ariel), and the OCLC resource sharing network. There's no reason that OCLC shouldn't be able to provide a comprehensive ILL management suite including document exchange, workflow, and patron interaction as an entirely web-based, hosted tool, and offer it as a basic part of their ILL service.
  • Digital collections software: ContentDM is a relic from the 1990s. It's a strange hodgepodge of C code and PHP, is clunky as hell, and produces the ugliest-looking URLs with no page titles, making it terrible for search engine crawling. Someone good at Ruby on Rails could build a better piece of software in a weekend. (Perhaps I'm just a little annoyed with it right now because I've been working on getting a Google Search Appliance to index our ContentDM collections.) OCLC should be offering a fully hosted, web-scale digital asset management system with web-based client software on par with something like Flickr. They could offer migration from ContentDM.
  • Semantic web strategy: OCLC needs to follow Talis into the semantic web space. They need to be designing systems that share data in an open fashion.
  • WorldCat Local: This is where OCLC has gotten it most right recently, in my opinion. If they add an API, make the UI more customizable, and allow for localized versions of records, they'll be in business.
  • Partnerships with large-scale players: OCLC is positioned to make partnerships with big players in the information space like Google. They've found ways to bring library assets into Google's space; perhaps there are ways to bring Google assets like Scholar and Books into the library space (a Google Books/WorldCat Local integration?).
  • Data exchange: if you've ever had to get your data up to OCLC in batch form, you've most likely experienced pain on par with visiting the dentist for a minor procedure. Their procedures for doing things like local data record uploading are horrendously slow and bureaucratic. WorldCat will never truly be a universal catalog for library assets if OCLC can't streamline methods for updating data in WorldCat.
The team at OCLC has their work cut out for them.

Friday, November 16, 2007

the serverless internet business

A good post by Nick Carr on serverless Internet businesses:


Tuesday, November 13, 2007

Wall Street Journal to go for free online

The Wall Street Journal has announced that they are going to provide their content free online. It's always annoying when I find links to WSJ articles and then get hit with a subscription required notice upon linking, so I welcome this development.

This decision points to a broader phenomenon of more content supported by advertising instead of subscription, a phenomenon that makes libraries less relevant when it comes to finding information. With subscription based resources, we are the gateway to the resource; economically speaking, we aggregate demand for a resource and serve as a purchasing agent for our community. In the ad based model, we're taken out of the loop.

I imagine there will always be information seeking "power tools" not supportable by ad-revenue, particularly in the area of backfile content, and libraries will be there to provide them.

Friday, November 9, 2007

Canadian Libraries

From my experience at Access 2007, I got the idea that Canadian academic libraries might be a little ahead of the curve in their incorporation of web technology, particularly open source solutions. This article (coming to me by way of the Frye listserv) reinforces this idea.

A number of institutions are looking at Evergreen, and there are some pretty cool digital collections systems put together from open source tools, especially the Drupal/Fedora combination in use at UPEI.

Friday, November 2, 2007

EDUCAUSE wrap-up

I've been meaning to post a wrap up on the other sessions I attended at EDUCAUSE in Seattle. I attended a couple that were sleepers and a few other interesting ones:
  • one on a shared institutional repository for liberal arts colleges sponsored by NITLE. I observed that these institutions are having some of the same challenges surrounding "content recruitment" for an institutional repository as we are with our student thesis archive here at Lewis & Clark. It was clear that these institutions had put a fair amount of thought and effort into this program, and it's too bad that they haven't had more luck filling up their repository. This makes me wonder: are institutional repositories a solution in search of a problem?
  • a surprisingly entertaining and well-attended session on where students use computers at a liberal arts college. The presenter had some scripts set up to gather an impressive amount of stats on computer lab usage at Claremont McKenna College. The lesson: gather data to support decisions. Also, undergraduates still use computer labs even when they have laptops.
  • a pretty cool demonstration of an interactive digital center in Kentucky; the presenter showed off some impressive 3D visualization tools, mostly of machines and automobiles, all powered by EON software.

Wednesday, October 24, 2007


I attended a "hot topics" session yesterday on cyberinfrastructure, largely to figure out the meaning of this somewhat fuzzy concept. It was introduced by the moderator as a somewhat ragtag group of computing functions: cluster computing, digital libraries, data visualization, high-speed networks, etc.

Gradually, it became apparent that many folks in the higher ed IT world see this as a movement to play a higher-profile role in the academic missions of their institutions by bringing integrated high-performance computing resources to a wide range of academic disciplines. There was a lot of talk about how IT departments now focus on mundane administrative systems, the "plumbing" of the institutions. Cyberinfrastructure is seen as an effort to make IT more strategic, to move from "administrative dweebs" who put up barriers to research (firewalls) to folks who "get stuff done" for faculty. This has strange echoes for me of conversations among academic librarians about ways to become more relevant, to get involved in a deeper way in research and teaching.

There wasn't as much talk as I expected on the role of extra-institutional systems in supporting this, including systems supported by the commercial sector. At least on the library side of things, we're seeing some of the most powerful, web scale digital library systems (Google Books) in the hands of a commercial vendor.

In my view, the cyberinfrastructure question, especially the adoption of high-performance computing in the research process in disciplines less comfortable with computing, has a lot to do with raising what scholars expect from IT and what they expect from themselves in terms of research methodologies. I doubt we'll see philosophers clamoring for these things, but I'm sure our geologists, sociologists, and historians will be there soon.

U Penn Undergrad Research Journal

This is a different take on the institutional repository concept: treat the content as editorial content for a journal, keep the quality high, and use faculty and program chairs as an editorial board. Maybe we should take this approach.

There's a similar workflow to our senior thesis software. And they have some of the same problems with seniors disappearing.

They use a Creative Commons license for submissions.

Xavier University's Web 2.0

In this presentation, they're showing off a new student portal that borrows functionality from YouTube and MySpace.

Xavier has a fairly radical organizational structure that completely sheds the "library" division. They have a discovery services department which seems to pick up some of the traditional library functions, as well as the content services division.

Interesting concept: the inverted service pyramid... put the weight of staffing into automated and self-service offerings, and reserve in-person services for specialized stuff.

People with young kids always put them in their presentations for a little comic relief; no exceptions here.

They also demonstrated a fairly traditional campus portal with library resources, people (librarian, advisor, dean), courses tailored to each student.

Educause 2007

Here I am up in Seattle, signing in for Educause 2007.

I flew up this morning from Portland to avoid the cost of a downtown hotel stay. This made for an early morning and missing the keynote session.

After getting signed in, I couldn't help but drift up the hill into the familiar Capitol Hill neighborhood, where I am currently situated in a coffee shop fueling up on a double Americano in preparation for the first of the regular sessions. I've spent a fair amount of time in this area of Seattle, especially back in the mid to late nineties. The divey Comet Tavern is close by, and I have a feeling I might be drawn there at some point.

Initial impressions of the conference are that it is huge and well-organized, leveraging IT in lots of ways.

One pet peeve from recent conferences: the ever-present disposable tote bag. On principle, I am refusing to pick mine up this time. These things are worthless; they serve only to advertise the vendor on the outside of the bag and fill up our landfills. Educause is also giving out free umbrellas... sort of a nice touch in Seattle. But they obviously don't realize that people in the Northwest hardly use them. We realize that a little water is pretty harmless.

Another environmental complaint: I've never received as much junk mail at work as after signing up for this thing. Lots of garbage from software vendors and IT service providers. Some of them have even called, so I've been using caller ID a lot.

As long as I'm on a roll: I notice that the official dress code for the conference is "business casual." For some reason, the whole business casual concept just oozes blandness and mediocrity. I suppose I dress that way at work most of the time, but actually making this an official recommendation for a conference somehow seems kind of lame. I prefer a more inclusive dress code, with everyone from jeans and t-shirts to Savile Row suits being okay. If we want cutting-edge IT, shouldn't we be open to a Googley sort of dress code?

Thursday, October 11, 2007

Access 2007

Reporting in here from Access 2007, the Canadian library technology conference.

Here are some of the major topics at the conference:
  • next generation OPACs
  • the ILS marketplace
  • cyberinfrastructure and large scale research computing
From the "ILS options for academic libraries" talk this afternoon: one of the speakers, talking about III, likened user group conferences to zombie movies. He was making the point that many staff at III libraries have been lulled into complacency and might not be prepared for the moment when outside forces conspire to force a migration.

Thursday, October 4, 2007

demise of the CIO

One of the themes that I've pursued in this blog has been the demise of centralized information technology in organizations. Nick Carr continues a familiar theme in his work on the decreasing strategic importance of CIOs:
We've entered the long twilight of the CIO position, a sign that information technology is finally maturing. Technical expertise is becoming centralized in the supply industry, freeing workers and managers to concentrate on the manipulation and sharing of information. It will be a slow transition - CIOs will continue to play critical roles in many firms for many years - but we're at last catching up with the vision expressed back in 1990 by the legendary CIO Max Hopper, who predicted that IT would come to “be thought of more like electricity or the telephone network than as a decisive source of organizational advantage. In this world, a company trumpeting the appointment of a new chief information officer will seem as anachronistic as a company today naming a new vice president for water and gas. People like me will have succeeded when we have worked ourselves out of our jobs. Only then will our organizations be capable of embracing the true promise of information technology.”
From my experience in the higher ed environment, I'd observe that leadership in the application of technology is critical. But it need not come from one centralized place in the organization.

Thursday, August 23, 2007

outsourcing a public library

This is kind of a disturbing tale from down in Jackson County, Oregon. It appears that county officials are going to outsource the operations of the library to a company in Maryland. Are we going to start seeing this "franchising" phenomenon in academic libraries?

It seems to me that you'd lose some of that local connection to the community by having a library run in McDonald's-style fashion.

Friday, August 17, 2007

Flickr ideas forum

I just posted something in the Flickr ideas forum:

I work in a liberal arts college library that is building an institutional collection of digital images (primarily of art and architecture) to support teaching. Many other colleges and universities are doing the same thing.

The problem is, the digital asset management systems that we're using (MDID, ContentDM) aren't that great. They don't have the elegance and functionality of Flickr's web interface or image management tools. This is really a drawback, especially when we're trying to get faculty to build their own personal image collections online and share them with the institutional collection.

The other drawback of using these digital asset management systems is that they are isolated systems--they must be maintained locally and they don't share data nicely in a Web 2.0 sort of fashion like Flickr.

If Flickr had organization-level capabilities, it could potentially revolutionize digital asset management in this arena. The main things missing now are:

1. an organizational account designed for larger scale use, in which several users could administer images (different than current groups capability)
2. capability to authorize a large body of users without flickr accounts to see certain photos through IP recognition, LDAP authentication, etc. (necessary because some of the images we manage have copyright restrictions)

I realize that opening up the Flickr platform to organizations could alter the flavor of the community. But perhaps it could be done in a way that preserves the spirit of Flickr. Perhaps only organizational collections that fit certain criteria would be eligible for inclusion in the broader body of Flickr work.

Opening up Flickr in this way could enrich it.

Mark Dahl
Lewis & Clark College

Thursday, August 9, 2007

on the Talis Platform

Something about these semantic web databases fascinates me, and Jeremy and I had an enjoyable conversation with Richard Wallis, Technology Evangelist at Talis, about the possibility of using the Talis Platform to mount data from the Summit catalog. Given the time difference with the UK, it was the time of day when he was ready for, as he put it, a "warm beer," and we were needing our coffee.

At one point, I asked him what advantages we would get by mounting the data on the platform versus setting up a MySQL database on our own. As I recall he mentioned three major advantages:
  • the platform is zero-setup and "web scale", no overhead of running a database server ourselves or scaling it to handle load; it resides "in the cloud"
  • it is already optimized for handling MARC bibliographic and holdings data and building faceted next gen type catalogs around it
  • it's a semantic web-style database; the data can easily be "augmented" with other value-added data (book jackets, Wikipedia info) in Talis stores or outside the platform
I also learned that the platform can ingest digital objects and automatically extract metadata from them. This makes me wonder if the platform could be used to set up lightweight digital asset management systems. Can the platform compete with something like Fedora? The two systems' capabilities seem quite similar.
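That "augmentation" point is easiest to see with a toy example. The sketch below uses plain Python sets of triples as a stand-in for a real triple store; this is not the Talis API, and the URIs, predicate names, and book data are all invented for illustration. The key idea is that because both datasets identify the book by the same URI, merging value-added data is just a union of triples, with no schema migration or join tables required.

```python
# A toy illustration of the semantic web "augmentation" idea, using plain
# Python sets of triples as a stand-in for a real triple store. This is
# NOT the Talis API; the URIs and predicate names are invented.
book = "urn:isbn:0000000000"

# Core bibliographic data (the kind that would come from MARC records).
bib_triples = {
    (book, "dc:title", "An Example Title"),
    (book, "dc:creator", "An Example Author"),
}

# Value-added data from elsewhere: book jackets, Wikipedia links, etc.
augmentation_triples = {
    (book, "ex:jacketImage", "http://covers.example.org/0000000000.jpg"),
    (book, "ex:wikipediaPage", "http://en.wikipedia.org/wiki/Example"),
}

# Shared subject URIs make augmentation a simple set union.
graph = bib_triples | augmentation_triples

def describe(subject):
    """Return every predicate/object pair known about a subject."""
    return {(p, o) for (s, p, o) in graph if s == subject}

for predicate, obj in sorted(describe(book)):
    print(predicate, "->", obj)
```

In a real RDF store the same merge happens at web scale, which is presumably what makes the platform attractive for layering book jackets and Wikipedia info onto catalog data.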

Open Library

The Open Library will be something to watch, though what it is seems a little vague and haphazard at the moment. This article rightly mentions that there are already some impressive web-based systems out there that aggregate data about books, with WorldCat, Amazon, and Google Books among the most prominent. The Open Library promises a more mashable data model, which I find promising.

Friday, August 3, 2007

ITHAKA report

There's some buzz around this report by ITHAKA about the future of scholarly publishing. The gist of it is that:
  • universities need to pay more attention to publishing;
  • the value university presses really add comes in the peer review process;
  • university presses should share a digital publishing infrastructure.
It's hard to say what this all means for libraries and smaller institutions without presses, but the report suggests that every institution that produces research needs a strategy for scholarly communication.

ITHAKA encourages the continuance of a diversified market for scholarly communication, including commercial and noncommercial players. I think the inherent complexity of such a market bodes well for libraries, who will be the ones managing that complexity for their communities.

Thursday, August 2, 2007

waiting for Google's JotSpot

At Watzek Library, we're using a combination of static web pages maintained with the Contribute/Dreamweaver suite, Basecamp project management, and Google Docs to create/maintain our intranet.

I've been wanting to move us off static web pages and onto a wiki for a while, but I'm waiting to see what Google does with the JotSpot wiki platform. The beauty of JotSpot is that you could build applications on it, such as simple databases and project management systems. Basecamp is nice but isn't quite up to Google Docs in functionality. Combining a JotSpot-style wiki with Google Docs could be pretty powerful.

Haven't heard much news on this front for awhile.

Monday, July 30, 2007

faculty/librarian expectations

This EDUCAUSE article on the changing information service needs of faculty, coming to me by way of Dan Cohen, points to a disconnect between faculty and librarian perceptions regarding the role of librarians.
The consultative role of the librarian in helping faculty in their research and teaching is viewed as an important function by most librarians, but most faculty members do not put the same emphasis on this role of the library.
Unfortunately, the article doesn't unpack this idea of a "consultative" role. I think there are many new roles in academic libraries that we are experimenting with; the consultative role is one of them, the institutional repository another. We need to try them and see how they work. But it is certainly not a foregone conclusion that they will work.

Most of the rest of the article discusses collections. No real surprises there.

Tuesday, July 17, 2007

DSpace cash injection

The Chronicle of Higher Ed picked up a story about the DSpace project getting an extra injection of funds from MIT and Hewlett Packard. I thought it was interesting how they portrayed the DSpace project, and institutional repositories generally, as a struggling endeavor:
But while hundreds of institutions have installed the software, many are still struggling to get faculty members to fill their databases with material. Academic librarians say many scholars justifiably worry that publishers will reject their work if it has been in an open archive. Others prefer promoting their research through personal Web sites, even though those venues are less secure than archives.
The article, notably, doesn't mention DSpace's main rival, Fedora.

Indeed the institutional repository isn't something that seems to have gained rapid momentum anywhere. The article states that even MIT is struggling to fill their repository.

I think that there's a notion out there that now that we can save and curate things like pre-prints, white papers, teaching materials, video versions of presentations, etc. we should. But if we didn't save these things before, why should we now? Resources are scarce, time is scarce.

Perhaps libraries should be looking at cruder methods of archiving electronic content that don't require labor-intensive submission processes--crawling institutional web sites, for example, and archiving the results.
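If I were to prototype such a crawl, the scoping logic--follow only links on the institution's own domain--might look something like this. This is a rough Python sketch, not a production crawler, and the domain name is made up:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collects href targets from anchor tags in a fetched page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def in_scope_links(page_url, html, domain):
    """Resolve relative links and keep only those on the institutional
    domain, so the crawl doesn't wander off across the web."""
    parser = LinkExtractor()
    parser.feed(html)
    resolved = (urljoin(page_url, href) for href in parser.links)
    return sorted({u for u in resolved if urlparse(u).netloc == domain})
```

Feed each archived page through this, add the in-scope links to a queue of pages still to fetch, and you have the skeleton of a crude self-archiving crawler.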

I've noticed a trend of sorts toward "self archiving" web content at our institution. For example, symposia at the College tend to archive past programs simply by leaving them in accessible folders on our web server. I'm sure any digital preservationist could list many perils of doing this. But can I really convince someone that it's worth the time to re-organize those files and stick them in an institutional repository, where they'll probably be harder to access?

As we consider an IR at our institution, these are the questions I ponder.

Thursday, July 5, 2007

reclaiming the OPAC real estate

Here at Watzek Library, we've been working on a project that might be called a "mashup" of our library catalog and regional union catalog, Summit. We're slowly releasing it to the public to get feedback and to see how well the technology holds up (see previous two links).

Basically, we're using jQuery and JSON to add some little widgets to the full bibliographic record displays of these Innovative Interfaces OPACs. This is a tricky process, as control over the HTML output by an III OPAC is highly locked down by Innovative. By using JavaScript to go out and get value-added data and insert it into the records, we're doing something fairly similar to the LibraryThing for Libraries widget, discussed here in Panlibus.

Technology-wise, most of the heavy lifting is done by the jQuery JavaScript library along with its JSONP support, which we use to employ the common technique of moving data across domains.
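For the curious, the cross-domain technique boils down to the remote server wrapping its JSON in a callback function name supplied by the page, which the browser then pulls in via a script tag. A minimal sketch of what the data provider does (in Python for illustration; function and field names are hypothetical):

```python
import json
import re

def jsonp_response(callback, payload):
    """Wrap a JSON payload in the caller-supplied callback name so a
    <script> tag on another domain can receive the data."""
    # Whitelist the callback name so a malicious caller can't inject script.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", callback):
        raise ValueError("invalid callback name")
    return "%s(%s);" % (callback, json.dumps(payload))
```

The page requests something like `covers?callback=handleCover&isbn=...`, and the response is executable JavaScript that hands the data to `handleCover` in the page.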

The services that we've added so far to the open beta of the catalogs include:
  • Amazon images
  • Link to RefWorks export
  • Link to Google Book Search record and a search box if it is searchable
  • Direct link to search for book reviews in one of our general research databases
Note: we're using the technique described by John Blyberg to do our link to Google Book Search. We haven't yet been shut off by Google, but our volume isn't that high.

In our alpha version of the catalog, we've also implemented a Wikipedia link for the author, a Google map of holding libraries, and similar items driven by Amazon web services.

The goal here is to hook our users up to valuable related services and linkages that they might not otherwise find.

Monday, July 2, 2007

Google Book Search Local

So here's an idea...

The Google AJAX Search API makes it pretty darn easy to create a customized Google Book Search for your own web page. The API returns an identifier for each book (typically an ISBN), among other metadata, in results lists. Why not use a little JSON and jQuery to figure out whether or not your library holds the item and insert that information, with a link, into the results? A database of your library and/or union catalog holdings would facilitate the task.
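The matching step itself is simple: check each result's ISBN against the holdings database and decorate the local hits. A sketch (Python for illustration; the catalog URL pattern is invented):

```python
def annotate_with_holdings(results, holdings):
    """Given search results (dicts with an 'isbn' key) and a set of ISBNs
    the library holds, attach a local flag and catalog link to local items."""
    annotated = []
    for item in results:
        item = dict(item)  # don't mutate the caller's data
        if item.get("isbn") in holdings:
            item["local"] = True
            # Hypothetical catalog deep-link pattern.
            item["catalog_url"] = "http://catalog.example.edu/search/i" + item["isbn"]
        else:
            item["local"] = False
        annotated.append(item)
    return annotated
```

In practice the holdings lookup would live server-side and the decoration would happen in the page's JavaScript, but the logic is the same.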

This would create a nicely localized version of Book Search for your library.

Monday, June 25, 2007

Needham talk

Inside Higher Ed has some coverage of OCLC VP George Needham's advice to librarians for accommodating "digital natives". It includes...

  • Avoid implying to students that there is a single, correct way of doing things.
  • Offer online services not just through e-mail, but through instant messaging and text messaging, which many students prefer.
  • Hold LAN parties, after hours, in libraries. (These are parties where many people bring their computers to play computer games, especially those involving teams, together.)
  • Schedule support services on a 24/7/365 basis, not the hours currently in use at many college libraries, which were “set in 1963.”
  • Remember that students are much less sensitive about privacy issues than earlier generations were and are much more likely to share passwords or access to databases.
  • Look for ways to involve digital natives in designing library services and even providing them. “Expertise is more important than credentials,” he said, even credentials such as library science degrees.
  • Play more video games.
Most of these suggestions strike me as obvious, unoriginal, or pointless. We've been hearing about how important 24/7 chat reference is for almost a decade, and at least at our library, nobody ever used it.

One point he makes that deserves emphasis: folks like to jump in and experiment with applications before reading manuals or help sheets. Then again, hasn't it always been the case that people (particularly men) never read the manual? I know it was true of me when I got our new BBQ grill a few weeks ago. If it weren't for my wife, I would have fired it up with pieces of plastic packaging still inside.

The reporter notes that librarians in attendance were "taking furious notes."

Sunday, June 17, 2007

online sales growth slowing

The growth in online retail seems to be slowing, according to the NYT.

Perhaps a reminder that a place-based experience remains quite important.

Somewhere in the article it says that online sales make up a total of 5% of all retail. Seems like such a tiny portion, but I suppose it makes sense when you consider that there are lots of large-ticket purchases (refrigerators, TVs, plants) that most folks are not likely to make online.

Personally, I dislike shopping in person (except in rare cases where there's some kind of interesting gadget involved) so I'm doing everything I can to push those numbers up. My wife on the other hand...

Tuesday, June 12, 2007

Mossberg on centralized IT

The Chronicle of Higher Ed reports that the Wall St. Journal's technology columnist Walter Mossberg had this to say about centralized IT departments:
...he began his speech by calling the information-technology departments of large organizations, including colleges, "the most regressive and poisonous force in technology today."

They make decisions based on keeping technology centralized, he said. Although lesser-known software may be better, he said, technology departments are likely to use big-name products for their own convenience. That may keep costs down for an organization, he said. But it puts consistency above customization, preventing individuals from exploring what technology products are best suited to their own needs.

And change is coming, he said, whether IT departments can keep up or not.
This was part of a speech that also emphasized the declining importance of the personal computer.

Monday, June 11, 2007

Photosynth demo

This is a pretty amazing demo of a technology called "Photosynth" that stitches together large collections of photos of the same subject by analyzing the images for common features.

Thursday, June 7, 2007

Google Digitization and the CIC

Well, I'm glad to hear that Google is moving quickly with the big consortium of university libraries in the Midwest to do more digitization. The ivory towers on the coasts can't have all the fun, right? My academic home is, of course, the University of Wisconsin-Madison, and I grew up in Minnesota, so I have a soft spot for this group of libraries.

This time around they are doing selective digitization, based on collection strengths. On their press release page, they offer a description of collection strengths, which I found interesting. Northwestern U has a big collection of Africana, for example. I was a little nostalgic noting that one of the University of Wisconsin's great strengths is European history and the social sciences. The beauty of studying modern German history there was that if you were looking for any book on Germany published in the US or Europe within a certain timeframe (the 1950s-1970s, I think), you could practically count on it being there. The comprehensiveness of the collection seemed to diminish as you hit the eighties and tighter university budgets took effect.

One thing not to overlook here is that this goes far beyond English-language content; these books will be useful well beyond the English-speaking world. There is going to be tons and tons of non-English material in this collection. I can recall shelves and shelves of books in Polish, Chinese, Russian, German, French, etc. from wandering the Memorial Library stacks in Madison. I imagine that American university libraries are the most effective place to start for collections that span the world's corpus of written works.

Lorcan Dempsey sees this as a big step. He rightly points out that with this comprehensive data, Google is going to be able to build services that no one else can:

However, as we are beginning to see on Google Book Search, we are really going beyond 'retrieval as we have known it' in significant ways. Google is mining its assembled resources - in Scholar, in web pages, in books - to create relationships between items and to identify people and places. So we are seeing related editions pulled together, items associated with reviews, items associated with items to which they refer, and so on. As the mass of material grows and as approaches are refined this service will get better. And it will get better in ways that are very difficult for other parties to emulate.
By "other parties" I think we can read OCLC, which is doing its best to leverage all of the data in WorldCat to develop structured relationships between intellectual works, their authors, and subjects. Will Google learn to do FRBR before OCLC does?

The libraries that are party to this deal get to keep the digitized texts and do their own things with them. Will this give these big universities a "strategic advantage" over some of their competitors? Does this mean that size still can matter in the networked environment? This reminds me a little of the NITLE initiative, the original intent of which was to overcome the disadvantage of small liberal arts colleges in the information technology arena. Here's an area in which us small folks can't compete--we just don't have much unique material in our libraries. But I guess the point is that everyone can access this stuff to some extent through Google.

Monday, June 4, 2007

tweaking Google search

NYT has a nice piece that goes inside Google's 'inner sanctum': the search quality department. It gives you some illustration of how refined search is at Google, and how far ahead of the competition they are.

At one point, they discuss the impressions of a recent recruit from Amazon to the department:

When he arrived and began to look inside the company’s black boxes, he says, he was surprised that Google’s methods were so far ahead of those of academic researchers and corporate rivals.

“I spent the first three months saying, ‘I have an idea,’ ” he recalls. “And they’d say, ‘We’ve thought of that and it’s already in there,’ or ‘It doesn’t work.’ ”

I think the best thing that we can do in academic libraries is try to co-opt this technology for our more focused purposes. I hope Google can keep their systems open enough for us to do this.

One small way we're thinking about doing this is including Google Book Search results on our OPAC search results page using their AJAX search API. Here's hoping that they soon add Scholar support for the search API.

Friday, June 1, 2007

future of textbooks

A government report on the future of textbooks proposes a new model for assembling copyrighted study material for courses. According to the Chronicle of Higher Ed., the Advisory Committee on Student Financial Assistance,
proposes the creation of a "national digital marketplace" that would allow instructors to select and students to buy custom-designed texts -- a chapter from one book, a case study from another -- while protecting fair-use allowances and publishers' copyrights.
This could effectively replace much of library course reserves as we know them. Yet another example of "taking it to the network level." This time, it's nice to see it coming from a group concerned about reducing costs for students.

Monday, May 21, 2007

Google's API for their search appliance

Interestingly, Google says that it's releasing an API for their search appliance that will allow the appliance to tap into lots of external content stores. I wonder if this could be adapted to an ILS, digital collections system, or some other application in libraries.

ASPs and decentralizing IT

Just noticed that our HR department is going to use software for applicant tracking based on an ASP model, whose website says, "The ASP model eliminates the need for IT assistance during implementation while ensuring that your system is running on the latest, fully redundant hardware in a secure hosted facility."

Evidence to support my thesis in this post.

Google Universal Search

Lorcan Dempsey points out that Google's just introduced a strategy to bring together results from many of their vertical search engines (books, images, web, video, etc.). It's called "universal search". This would have been a good thing to bring up at the presentation I did on federated searching at the Oregon Library Association conference. I was trying to make the case that search engines are really the future of federated searching, or at least worth looking at for important trends.

Unfortunately, they don't mention bringing Scholar on board universal search. I'm also wondering about the ability to integrate search appliance results with Google Search engine results. We're thinking of indexing some of our local content with an Appliance, and if we could mix results from the Appliance with Google Web and Scholar results, maybe we could put together something like federated searching with Google technology.
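The merging itself needn't be fancy to be useful. A crude round-robin interleave of the ranked lists from each engine would be a starting point (a sketch of the general idea, not how Google's universal search actually works):

```python
def interleave(*result_lists):
    """Naively merge several engines' ranked result lists by taking one
    item from each list in turn, skipping duplicates--a crude form of
    federated result merging."""
    merged, seen = [], set()
    for rank in range(max(map(len, result_lists))):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged
```

Real blending would need comparable relevance scores across sources, which is the hard part, but even this would put Appliance, Web, and Scholar hits on one page.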

Sunday, May 20, 2007

network apps and the demise of centralized IT

Some observations...
  • We're seeing more and more powerful network-based applications available for free on the Internet: GMail, Google productivity apps, Flickr, etc.;
  • Often, these applications are arguably the best of breed in their class; they work better than anything you'd pay for and install on your desktop or on your network's local servers;
  • People can adopt these applications independently or in small communities within their organizations; no central coordination or IT support is necessary;
  • They can be used across organizational boundaries--folks from different companies or universities can dive in and work together;
  • They can, increasingly, be purchased and adapted to an organizational setting (see Google Apps).
As network-based applications mature and become available in more niche areas, I foresee a dramatically reduced role for centralized information technology support in organizations. Central IT used to be a necessity because of the infrastructure required to support systems. Technology support will gravitate to the edges of the organization, with the focus shifting to the rightful application of the technology rather than technology as infrastructure.

This, I suppose, will happen faster in loosely coupled organizations like universities. In the university environment, I can imagine administrative computing being handled between the business office and the registrar, with relevant applications hosted off site in a salesforce.com style. The public web site would be designed and maintained by the communications office and hosted by an off-site firm, including supporting back-end databases, search services, etc. The communications systems (email, groupware, etc.) would be chosen and maintained by organizational development folks in HR. Academic technology would be applied in an ever more diffuse fashion, with communities of practice developing across institutional boundaries and institution-specific applications like CMSs maintained out of a Center for Teaching and Learning. Libraries have already seen a lot of the tools that they offer become remotely hosted web applications, and in some ways, they will lose more control. Their role will be facilitation and customization of these networked applications.

IT will be there to support the plumbing: keeping up the network and the relevant devices. Even some of the plumbing, especially data centers at smaller organizations, will go away. There will also be a persistent need for integration and standardization. But the strategic role of centralized IT will diminish, just like it did for those VPs of electricity in the last century, according to Nick Carr.

A centralized IT operation was important in the past because you could aggregate software/hardware and knowledge in one place. But following that model was always dangerous as it removed those working on the IT from the parts of the organization that IT was designed to serve. As the support for equipment and software becomes more easily outsourced on the network, we'll see centrifugal tendencies accelerate.

The IT department that tries to maintain control too tightly will get "worked around" and just accelerate the phenomenon.

Saturday, May 12, 2007


I've learned a great deal about NITLE recently, as I'm going to be the new campus liaison for L&C.

Interestingly, NITLE is an "incubated" organization by a larger nonprofit known as ITHAKA. ITHAKA is behind projects such as ARTstor and JSTOR.

The chief benefit of NITLE, in my mind, is that it allows cooperative technology work among institutions with similar missions (liberal arts colleges). Even though our regional cooperatives (like the Orbis Cascade Alliance) work superbly in many instances where geography is key (physical resource sharing), I think it can make more sense to collaborate based on mission for these more content- and teaching-and-learning-focused initiatives.

I particularly like the discipline focused programs that NITLE offers: Al-Musharaka (Middle Eastern Studies), Sunoikisis (Classics), foreign languages, and a couple digital collections: IDEAS and REALIA. The organization also has its fingers in many other pies. These programs, plus quite a few others, reflect a pretty creative, innovative group.

Tuesday, May 1, 2007

Giving WorldCat Local a Go

UW (not the great, Badger State UW, but rather that lesser institution, the University of Washington) launched WorldCat Local today. I can't say that there are too many surprises for me in the implementation, as it is based on the now familiar WorldCat.org platform.

But a few observations:

I knew that they would be offering fairly direct requesting for items held in the Summit consortium. This works pretty well and, interestingly, even works for me as someone at Lewis & Clark. Even if I come across an item that is held at the UW libraries, I am offered the ability to request it on Summit.

One thing I don't really care for is the display of holding libraries closest to me that is shown when I've selected a book...this just seems irrelevant and confusing to me when I'm a member of the Summit network.

The relevance ranking seems to be based a great deal on how many libraries hold an item. This is definitely a good direction to go and could be likened to Google PageRank. But it doesn't always work well. When I searched for "Bend Oregon", for example, the top hit is an EPA publication: "Pressure and vacuum sewer demonstration project Bend, Oregon". I also got lots of references to government documents with the title: "Amending the Bend Pine Nursery Land Conveyance Act..."--it's held by like 189 libraries. (I think this offers some hint of the redundancy involved in acquiring and cataloging government documents across libraries when these types of documents are often rarely used and available openly on the web).

The book that "should" come to the top is probably "Bend, in Central Oregon", which is the main book ABOUT the city Bend, OR...WorldCat Local needs to work on that "aboutness" thing...not sure how. Perhaps they need to be doing more creative things with subject headings or somehow move govt. publications lower in the ranks. Or bring in circulation stats...those would quickly lower govt. docs in the rank.
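Just to make the idea concrete, a toy ranking function along these lines might blend title matching, log-scaled holdings, circulation counts, and a government-document penalty. All the weights here are invented for illustration, not anything WorldCat Local actually does:

```python
import math

def score(record, query_terms):
    """Toy relevance score: term matches in the title dominate, holdings
    act as a log-damped popularity signal, circulation counts for more
    than mere holdings, and government documents take a penalty."""
    title_terms = set(record["title"].lower().split())
    matches = sum(1 for t in query_terms if t.lower() in title_terms)
    s = 10.0 * matches
    s += math.log1p(record.get("holdings", 0))     # damps the 189-library effect
    s += 2.0 * math.log1p(record.get("circs", 0))  # actual use counts for more
    if record.get("gov_doc"):
        s -= 5.0                                   # push depository items down
    return s
```

With something like this, a well-circulated local-interest book would outrank a widely held but rarely used government document, which is the behavior I'd want for a "Bend Oregon" search.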

Interestingly, WorldCat.org seems to have more sensible results when you search for "Bend Oregon"... perhaps they are using more of a public-library relevance ranking that doesn't count all those depository library holdings.

Relevance ranking is not an easy thing to do, of course, and is much more than a popularity contest. It's funny how library holdings are a strange take on "popularity."

Another complaint: the facets for authors don't always work very well. Seems like corporate authors without much meaning ("United States") often float to the top. Also, it's hard to browse around the publication date facet.

I don't see many openings for mashups and remixability--no RSS feeds or APIs advertised. I have hopes that OCLC will open things up.

The other question that's hard to answer is the degree of local configurability. Generally speaking, it's a pretty busy display when looking at a particular title, but it would be nice to think it could be configured locally and streamlined.

I applaud the inclusion of articles and hope this expands. Bringing on board large aggregations of articles could be great. Better resolver integration would be nice as more articles appear.

Overall, this is a pretty good first shot at an OPAC product by the behemoth OCLC.

Thursday, April 26, 2007

flickr at the organizational level

We're attempting to build an organizational collection of images for teaching around here using MDID, which is great open-source, web-based software for teaching with digital images. It's got lots of features tuned to working with images in the classroom, many of which center around its presentation tool, which allows for zooming and side-by-side display of high-res images.

It's got two major flaws, as I see it:

  • It's a silo of data. Folks looking for images don't go to it first when they're looking for something.
  • The UI on it for organizing images isn't nearly as nice as Flickr. I realize this as I use Flickr more and more for my personal stuff. Though MDID has a personal collection feature, there's no way I'd recommend it over Flickr.
It would be great if Flickr offered some kind of organizational account that facilitated organizations building image collections within it. Or perhaps we need to make MDID work with Flickr. MDID could drop all the functionality that Flickr does better and just concentrate on the education specific features.

Wednesday, April 25, 2007

Freebase explored

Got a chance to try out Freebase thanks to an account from Paul Miller.

Once you've logged in, the site has some nice Flash-based tutorials that explain the idea behind the service pretty well. You can also browse the categories, or 'types', that are set up in Freebase. Basically, they are designed to span the world's knowledge, a pretty ambitious goal. Top-level types include:

  • media
  • sports
  • science and technology
  • art
  • travel and dining
  • special interests
...among others

The amount of data in these areas is still pretty small. They've pre-loaded data about films and restaurants; many of the demo applications take advantage of this data, such as cinespinner.

As I browsed through the categories, I thought about how similar this kind of organization is to library cataloging, especially authority control. It would be great if the catalog and authority data in WorldCat was pre-loaded into Freebase, as well as other datasets with a certain amount of authority such as the Getty Vocabulary Program. Seems like it would also make sense to massively load in data about consumer products.

How to convince organizations to put their data into this open database? Perhaps they could get a fee based on queries using their data when a Freebase account went over a certain daily query limit. Sort of an Amazon web services model. I guess that would make it unfree, technically, but what really is?

Thinking more broadly, once the rules were established for a global database of knowledge, it could be replicated and hosted by more than one organization--with each hosting company providing basic querying functionality but also perhaps some value added services. Those are my anti-monopolistic tendencies showing through.

Thursday, March 22, 2007

Economist on the future of books

This article from the Economist contains a fairly nuanced set of predictions on the future of the book.

Interesting point that people often package a 50-page idea into a 300-page book because those are the economics of the medium.

The observation that the electronic medium offers lots of benefits in the scholarly environment, but not so many in the recreational reading one, would seem to bode well for physical books in book stores and public libraries but not so much for those in academic libraries.

Monday, March 19, 2007

discovery services at the network level

Peter Brantley mentions that someone who went to RLG's D2D conference made this comment:

The conference covered a wide range of issues, and was very interesting. I think the consensus opinion at the close was that discovery has moved to the network layer and libraries should stop allocating their time and money trying to build better end-user UI, and concentrate instead on delivery, and their niche or customized services such as digitizing special collections, providing innovative end-user tools for managing information, and so forth.
Interesting observation. I'm thinking about how this comment relates to lots of dispersed efforts, such as Scriblio, to remake the library OPAC using technologies like Solr. These are important pioneering efforts, but will they stick around long?

More specifically, I'm thinking about it in terms of the options available to replace the interface to our OPAC at L&C, and ultimately the Summit Union catalog that we share with other Orbis Cascade Alliance members. The Alliance, through the Summit Catalog Committee, is considering lots of different options, including III's Encore, Endeca, OCLC's forthcoming WorldCat Local, as well as local development options. The statement above, in my mind, would be an argument for WorldCat Local.

WorldCat Local (not sure if this is the correct product name), from what I've heard, is a version of the WorldCat.org catalog that is scoped down to your own holdings and, optionally, those of your union catalog. It allows you to use your local system for delivery. By using a discovery service like WorldCat Local, we would be tapped into OCLC's web-scale bibliographic database and could, theoretically, benefit from "network effects" only available on that platform. One network effect readily available would be the most up-to-date version of any bibliographic record. But there could also be others--Web 2.0-ish stuff like tagging and comments that only becomes useful at wide scale. And maybe they will offer things like FRBRization and relevance ranking that take advantage of the intelligence available in a database of that size.

OCLC, in my mind, has always been sort of a slow moving behemoth. Using many of their services, such as LDR updating, is often painfully slow and cumbersome. But they appear to have turned a corner on their OpenWorldCat program, especially some of the APIs that they have released recently. I'm listening to them with an open mind.

Thursday, March 15, 2007

Library as Platform

Another phrase that I caught at code4lib was "the library as platform." It came from Terry Reese via Jeremy Frumkin, I think.

This got me thinking about all the web applications that we run here at Watzek and how they hang together. There are a couple of reasons to be thinking about this now. One is that there are now two programmers (three if you count the Law Library) working on coding and database creation. And two, we're thinking of redesigning our website soon (even though we were just named "College Website of the Month" by ACRL's college libraries section).

We're pretty small scale here, but I still think that it is productive to think of our digital environment here in terms of a platform. My idea is that we should identify the large building blocks of our environment so that we're not reinventing things every time we create a new application.

Some components of the platform:
  • LAMP (throw Postgres in there)
  • An up-to-date XHTML based web design and corresponding set of CSS classes that can be applied to a wide range of situations. This will let us spend our time building applications rather than tweaking design details.
  • Some form of template system (currently Dreamweaver + PHP includes) for applying the design uniformly
  • databases:
    • the ILS database
    • serials knowledgebase
    • databases which are subsets of the ILS like our newbooks database and our A/V database
    • database of electronic resources that drives our subject web pages
  • Campus LDAP authentication applied via PHP and Apache mod_auth_ldap
  • PHP classes and conventions for common tasks (like authentication via LDAP, passing data about a citation from one application to the next)
  • common Expect scripts (mainly for data extraction from the ILS)

Many applications (ReservesDirect, our homegrown CMS for e-resources, new books database, etc.) are running on the platform and leverage its resources.
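As an example of the kind of shared convention I have in mind, passing citation data from one application to the next can be as simple as an agreed-upon query-string format. A sketch (in Python for illustration; our actual helpers are PHP, and the field names here are hypothetical):

```python
from urllib.parse import urlencode, parse_qs

def citation_to_query(citation):
    """Serialize a citation dict to a query string for handing off
    between applications (e.g., new-books list -> RefWorks export)."""
    return urlencode(sorted(citation.items()))

def citation_from_query(qs):
    """Parse the query string back into a citation dict on the
    receiving application's end."""
    return {k: v[0] for k, v in parse_qs(qs).items()}
```

Once every application on the platform reads and writes the same format, linking one tool to another is just a URL.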

the library workshop

Watzek Library's ILL department just had a visit from some folks at the Multnomah County Library, the big public library system serving most of the Portland Metro area. They were interested in a little hack to our Clio interlibrary loan system that allows us to check out ILL books on our III integrated library system without any rekeying.

As Jeremy and Jenny showed the application in action, I was really admiring the resourcefulness of it. Nothing fancy--just the DIY powertools of the digital workshop: Expect (to talk to a legacy black box III system), MySQL, PHP, and especially a PHP function to create barcode images. Normal ILL workflow for an incoming book is followed by a couple simple steps, and pretty stickers with barcodes that correspond to records in our ILS emerge.

We're thinking about the software platform (basically LAMP) that powers our website. Should we go to something like Rails? Or should we stick with the simple, blunt LAMP instruments that we're used to?

Friday, March 9, 2007


Gotta love the name Freebase.

Along the same lines as the Talis Platform (at least in my mind), some guys are starting up a company that will develop a global database allowing complex relationships to be established between the data within it. The company is called Metaweb, but their site isn't too revealing.

Here's what Tim O'Reilly has to say about it:

“It’s like a system for building the synapses for the global brain,” said Tim O’Reilly, chief executive of O’Reilly Media, a technology publishing firm based in Sebastopol, Calif.

Google Book Search and rank

Some interesting notes from a talk by Google at the Working Group on the Future of Bibliographic Control (the name just oozes dullness). Dan Clancy from Google said that Google is having trouble relevance-ranking books in Book Search because it can't rely on the link structure of the web the way web search can.

Thursday, March 8, 2007

data stores

At code4lib, Talis was promoting their platform. It's based on the concept of "stores", which are basically large bodies of data stored on Talis' computers. The advantage of putting your data in these stores is that it can be queried, searched, and related to other data in numerous ways.

Some of what they say about their platform:
Large-Scale Content Stores

The Talis Platform is designed to smoothly handle enormous datasets, with its multiple content stores providing a zero-setup, multi-tenant content and metadata storage facility, capable of storing and querying across numerous large datasets. Internally, the technology behind these content stores is referred to as Bigfoot, and there is an early white paper on this technology here.

Content Orchestration

The Talis Platform also comprises a centrally provided orchestration service which enables serendipitous discovery and presentation of content related according to arbitrary metadata. This service makes it easy to combine data from across different Content Stores flexibly, quickly and efficiently.

This all seems rather nebulous at first, but slowly the usefulness of the concept reveals itself. They talked a bit about how the platform is supporting interlibrary loan at UK libraries by providing a way to query holdings across different libraries.

My question is: do libraries really have enough content of their own to leverage a platform like this? All we really have is generic data about books and journals, and specific data about which libraries hold them.

I wonder whether this kind of service would be most useful if a player like Google offered it. Why Google and not Talis? Because Google already has a huge amount of data amassed from web crawling and publisher relationships, not to mention book scanning. Think about the opportunities that would present themselves if you could query specific slices of Google's content alongside your organization's own data. What if Google hosted research databases as stores and you could slice them up, query them, and relate them à la the Talis platform?

Essentially, a library could create its own, highly tailored searching/browsing/D2D systems.
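To make the "relating stores" idea concrete, here's a toy sketch (all data and field names invented) of joining a slice of generic bibliographic data against a store of local holdings on a shared identifier:

```python
# Toy sketch: two "stores" as lists of records, related on ISBN.
# Records, ISBNs, and field names are made up for illustration.
book_store = [
    {'isbn': '0596007973', 'title': 'Programming Perl', 'subject': 'perl'},
    {'isbn': '0596000278', 'title': 'Learning Python', 'subject': 'python'},
]
holdings_store = [
    {'isbn': '0596007973', 'library': 'Watzek', 'call_no': 'QA76.73.P22'},
]

def relate(left, right, key):
    """Join records from two stores that share a value for `key`."""
    index = {rec[key]: rec for rec in right}
    return [dict(rec, **index[rec[key]]) for rec in left if rec[key] in index]

# Each result carries both the generic metadata and the local holding.
results = relate(book_store, holdings_store, 'isbn')
```

At web scale you'd obviously want indexes and a query language rather than Python lists, but that's the shape of it: generic data on one side, local data on the other, related by identifiers you both agree on.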

Maybe I'm asking for too much.

Friday, March 2, 2007

standing on the shoulders of giants

Casey Durfee's presentation, "Open Source Endeca in 250 Lines or Less," was pretty cool. How could he create a "next-gen" faceted catalog with so little code? By relying on Solr and Django to do the heavy lifting. Because Solr ingests XML natively, no relational database is even necessary. One thing I'm looking for at this conference, generally speaking, is ways we can leave the complexity to other applications.
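Part of why so little code is needed is that faceting in Solr is just query parameters. A minimal sketch of building a faceted search request (the Solr URL and field names are hypothetical; q, rows, wt, facet, and facet.field are standard Solr parameters):

```python
from urllib.parse import urlencode

def solr_query(base_url, q, facet_fields, rows=10):
    """Build a Solr select URL with faceting turned on.

    The base URL and field names here are placeholders; the request
    parameters themselves are standard Solr ones.
    """
    params = [('q', q), ('rows', rows), ('wt', 'json'), ('facet', 'true')]
    params += [('facet.field', f) for f in facet_fields]
    return base_url + '/select?' + urlencode(params)

url = solr_query('http://localhost:8983/solr', 'whisky', ['subject', 'format'])
```

Solr's response comes back with counts for each facet value alongside the hits, so the application layer (Django, in Casey's case) mostly just renders what the index hands it.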

Thursday, March 1, 2007

proximity and the network

Dan Chudnov gave a talk on making library resources available for sharing the way iTunes shares music on a LAN. It was hard to immediately sense the value in this. He spoke of walking into a library and having access to the whole of the library. Isn't that what we already get through the library's presence on the web?

But thinking about it more, I like the idea of our computers being able to sense services and resources based on proximity. What if you met a group to study and, once on the same wireless network, had immediate access to the others' personal digital libraries in an application like Zotero? What if, as you walked through a physical library, the library's web presence changed based on the section of the building you were in? Suppose you're studying in the East European Language Reading Room late at night and you notice that somebody else has a similarly esoteric set of references on Polish intellectuals in their shared digital library...and perhaps that's her across the room. Could be a good way to get dates.
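A real implementation would advertise itself over something like Zeroconf/Bonjour, the way iTunes does, but at bottom it's just a small announcement message that nearby machines can decode. A toy sketch of that piece (the message format here is entirely invented):

```python
import json

def announce(user, collection_name, items):
    """Encode a 'shared library' announcement to publish on the LAN.

    In practice this would ride on multicast DNS service discovery;
    here it's just a made-up JSON payload.
    """
    return json.dumps({
        'user': user,
        'collection': collection_name,
        'count': len(items),
        'sample': items[:3],  # a peek at what's being shared
    }).encode('utf-8')

def discover(payload):
    """Decode an announcement received from a nearby machine."""
    return json.loads(payload.decode('utf-8'))

msg = announce('anna', 'Polish intellectuals', ['Miłosz', 'Kołakowski'])
info = discover(msg)
```

The hard parts--trust, privacy, and deciding what to share--are social, not technical, which is probably why the demo is the easy half.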

why code4lib?

Despite the fact that I'm kind of burnt out on writing code, I find code4lib to be one of the most invigorating conferences I've attended in the last few years. Why? I think it's because it's where the new opportunities of the broader web world meet the digital library world.

Some interesting ideas that have come up this year:
  • the Solr platform for indexing and faceting a library catalog, or really a digital library of anything; it's XML-based
  • the Talis platform's concept of data "stores": large bodies of XML data that can be queried and related to data in other stores in an unlimited number of ways using "web scale" infrastructure
  • the idea of hooking up openurl resolver type services as a microformat
  • using del.icio.us as a content management system for library subject guides
  • a subject recommendation engine based on crawling intellectual data associated with university departments
  • using a protocol like zeroconf so that library patrons can auto-discover library services upon entering the physical library space
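The OpenURL-as-microformat idea is basically COinS: you embed an OpenURL ContextObject in a span's title attribute, and a browser extension can hand it off to the user's resolver. A rough sketch of generating one (the citation data is made up; the class name "Z3988" and ctx_ver value are the real COinS conventions):

```python
from urllib.parse import urlencode

def coins_span(title, author, date):
    """Build a COinS span for a journal article citation.

    The span class "Z3988" and ctx_ver "Z39.88-2004" follow the COinS
    convention; the citation fields passed in are illustrative only.
    """
    ctx = urlencode([
        ('ctx_ver', 'Z39.88-2004'),
        ('rft_val_fmt', 'info:ofi/fmt:kev:mtx:journal'),
        ('rft.atitle', title),
        ('rft.au', author),
        ('rft.date', date),
    ])
    return '<span class="Z3988" title="%s"></span>' % ctx.replace('"', '&quot;')

span = coins_span('Innkeeper at the Roach Motel', 'Salo, Dorothea', '2007')
```

Drop spans like that into any web page--a subject guide, a syllabus, a blog post--and a COinS-aware client can resolve the citation against the reader's own library.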
It seems like most of the big players here work at larger universities or organizations that have large local data sets to work with, in the form of institutional repositories or digital collections. There's a lot of concern about building large, searchable digital libraries. That's fine if you have control over a large body of data. I can tell you that in the small college library environment, most of the data we work with is generic data about books and journal articles living in some database that is out of our control. We're often only able to add value to that data once it's arrived in a user's search results, through an OpenURL resolver or perhaps a tweak to our catalog.

This is not to say that what the big players are doing isn't useful or interesting to us. It's just different, and it makes me wish we had more opportunities to creatively manipulate the digital content to which we provide our patrons access.


A team from my old place of employment, Wendt Library at UW-Madison, showed off a pretty cool application, BibApp, that gathers data about what faculty on their campus have published. Among other things, the data is used to find articles that are legally storable in the IR; almost cooler than that are the connections they can demonstrate between the publications. They can visualize who's publishing with whom, analyze popular research subjects across disciplines, etc.

Wednesday, February 28, 2007

Karen Schneider Keynote

The keynote had a few good points to it. She spent a lot of time talking about better ways to market open source projects. This seems like pretty obvious stuff.

One thing I liked was the concept of "rebuilding library artisans." The idea is that developers are the artisans of libraries: they build the systems that deliver library services. She argued that every library should have a developer--that this should be a given, like having a reference librarian or a catalog librarian.

I tend to agree with this line of thinking. Libraries need developers to tailor their services to their local clientele. This is where they add value. At a place like L&C, we're really trying to put together a "rich" liberal arts learning environment. It's the micro-brew of higher ed (at least that's how we price it), so you really need an artisan to brew it up.

Another comment I kind of liked was related to library directors going to conferences and coming back determined to create a "learning commons." Funny how administrators attach so much importance to moving walls and furniture around when the revolution is happening online.

I got the feeling from this talk, and generally from this conference, that there's a lot of momentum building around the Evergreen ILS. The buzz is just starting over in III-land on the West Coast.

Tuesday, February 27, 2007

Athens, GA

Just checked into The Foundry Park Inn in Athens, GA for code4lib 2007. Initial impressions of Athens are that it's a laid-back college town along the lines of Madison, WI. We'll have to get out and explore more. Our hotel is nice--sort of a colonial/plantation style that you wouldn't find in the Pacific Northwest.

The lineup of presentations looks strong, once again. This conference is really an incubator for new digital library services. There's a lot of unconventional thought out there about things we could do with existing data and services and I'm looking forward to wading in deep.

My esteemed colleague Jeremy McWilliams and I are going to try to find some trails to run tomorrow morning before the action starts. We hope to give a lightning talk about our top secret project, code-named "Sherpa."

Friday, February 23, 2007

Book Review

The book I co-authored with Kyle Banerjee and Michael Spalti has been reviewed in Program: Electronic Library and Information Systems, a British journal that I hadn't heard of before. Overall, it's a pretty good review, but unfortunately it's not open to the public.

Google Apps Premier Edition Released

Google released its Apps Premier Edition yesterday. It includes integrated Gmail and calendaring for your business or school. It's great to see a heavyweight competitor to Microsoft in this space, especially one with something compelling to offer and a software-as-a-service approach.

I think it only makes sense for colleges and universities to outsource generic services like email. Email is really a utility that should be handled higher up the stack. Sure, there are concerns about having someone else be the steward of our data. But frankly, I think Google has a much better security infrastructure than we do around here. Hell, they even talk about "armed personnel":

Data security

Your proprietary data is protected by security measures that include biometric devices and armed personnel, special-purpose equipment and services that only expose the minimal required access points to serve their function, dispersed data in proprietary formats and redundant storage that ensures the availability of your data.

Don't think we can provide that around here at L&C...

Thursday, February 22, 2007

database vendors are the real libraries

Kyle Banerjee gave a dynamite job talk the other day for the Orbis Cascade Alliance's Digital Services Program Manager position. One of the lines that he used that I liked: "database vendors are the ones building libraries, we're just purchasing agents."