Thursday, November 20, 2008

Europeana

The Europeana digital library just launched today. It's running slow. Currently, it's very heavy on French materials.

Wednesday, November 19, 2008

design matters

I want to echo Dan Cohen's belief that "design matters" not only in digital humanities, but also in academic library websites.

Academic libraries, even small ones, need to meet user expectations for good design in their websites and digital collections. They need to get beyond the do-it-yourself HTML mentality of the early web. They also need to do something more than tap into a parent institution's content management system if they want their website to be an optimal research gateway and marketing tool.

When possible, libraries should get staff members with design expertise on board. They should also learn how to contract out for design services when needed. Set against the thousands of dollars that a small academic library typically spends each year on licensed e-resources, a couple thousand every few years for web design of the portal to those resources provides a lot of bang for the buck.

We're working with some professional web designers on accessCeramics right now, and I'm looking forward to learning from the experience.

Friday, November 14, 2008

finding full text with Google Scholar

The Google Operating System Blog had a post the other day that alerted me to a relatively new feature in Google Scholar. For each article in a result set, Google Scholar will point you to a free, unrestricted copy of the article on the web (if one is available), marked with a little green link.

With many academic journal publishers allowing authors to post copies of their articles on their personal websites, it is now common for scholarly articles in subscription journals to be available for free on the open web. Below is an example of an article, with a copy available from a website in an academic domain (sorry for the tiny image).


This is a good example of Google Scholar leveraging the Google web index to provide something you can't get within the research systems that libraries have built and licensed. It's also yet another reminder that libraries and publishers have lost their role as sole provider and intermediary for academic content.

I've pointed out previously in this blog that creators of research products for libraries do not (or cannot) take advantage of web indexes as they create their products. I wonder whether OpenURL resolver vendors or someone like OCLC could offer this feature by tapping into something like the Alexa Web Search service to mine the web for full copies of a given article. It might be hard to do on the fly with a resolver request, though.
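To make the idea concrete, here is a rough Python sketch of what that kind of mining might look like. It is purely illustrative: web_search() is a hypothetical stand-in for whatever index a vendor might license (Alexa's service or otherwise), and the filtering heuristics are just one guess at how you might separate open copies from publisher-controlled ones.

    # Hypothetical sketch: given a citation, look for a free copy of the
    # article on the open web. web_search() is a stand-in for a licensed
    # web index, not a real API.
    from urllib.parse import urlparse

    ACADEMIC_HINTS = (".edu", ".ac.uk", "repository", "dspace", "eprints")

    def find_free_copy(title, author, web_search):
        """Return the first hit that looks like an open copy of the article."""
        query = f'"{title}" {author} filetype:pdf'
        for result in web_search(query):  # each result: {"url": ..., "title": ...}
            parsed = urlparse(result["url"])
            host, path = parsed.netloc.lower(), parsed.path.lower()
            # Favor PDFs sitting on academic or repository hosts.
            if path.endswith(".pdf") and any(h in host or h in path for h in ACADEMIC_HINTS):
                return result["url"]
        return None

A resolver could run something like this in the background and cache the results, which might sidestep the problem of doing it on the fly.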

I'm guessing that Google Scholar will have 90%+ of scholarly articles in existence in its index at the citation level in the not-too-distant future. It is able to mine so many places for citations: websites, scanned books and journals, many publishers' archives, and so on.

As OCLC loads article citations into Open WorldCat, I wonder if they have considered a more "brute force" approach to finding citations: mining the web for them the way Google does. Of course, this would introduce all sorts of possibilities for errors and a loss of bibliographic control. Google Scholar must have lots of errors in the citations it collects, but it seems to collate like citations together efficiently and to recognize which citations are the most referenced.
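For what it's worth, here's a toy Python illustration of that kind of collation: normalize the messy citation strings and group them under a shared key. The example strings are made up, and the real systems are obviously far more sophisticated than this.

    # Toy collation of "like" citations scraped from the web. Real systems
    # (Google Scholar, OCLC) do much more than this crude normalization.
    import re
    from collections import defaultdict

    def normalize(citation):
        """Crude key: lowercase, drop punctuation, collapse whitespace."""
        key = re.sub(r"[^\w\s]", "", citation.lower())
        return re.sub(r"\s+", " ", key).strip()

    def collate(citations):
        groups = defaultdict(list)
        for c in citations:
            groups[normalize(c)].append(c)
        return groups

    variants = [
        "Smith, J. (2005). Metadata in the Wild.",
        "Smith J 2005 Metadata in the wild",
        "SMITH, J. Metadata in the Wild, 2005",
    ]
    # The first two collapse into one group; the third keeps its own key
    # because the word order differs -- exactly the kind of near-miss that
    # brute-force citation mining has to live with.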

thinking locally, acting globally

As our library discusses moving to WorldCat Local as our "primary" library catalog, the catalogers have been voicing their concerns about relying on the master records in the WorldCat database for our local catalog. They are very concerned that the corrections and augmentations they apply to our local bibliographic records will no longer be visible. These often take the form of local subject headings, genre headings, corrections, and the like. The logical move, of course, is to make those changes in the WorldCat database, where they can have a global benefit.

The move to WorldCat as the live local database should force the issue of truly cooperative cataloging, and I think that's a good thing. It should give incentive to OCLC to be more inclusive about who can edit and enhance records and it should embolden catalogers to work in the global catalog, not just their local one.

In the networked environment we're doing more and more work in libraries that benefits a global community, rather than just our local community. Digital collections projects are a good example. When we digitize unique art, photos, manuscripts, or historical documents, our work provides value to the world. These collections are contributing to the de facto world digital library that is the Internet. Our print collections, especially the more unique pieces within them, are also making a more global contribution as ILL systems become better lubricated.

How does this work benefit the parent institutions that fund us? Should we only be doing projects with a global benefit if they provide a benefit to our institutions equal to their cost? In some ways this seems logical: we take action when there is a clear local benefit and view any global benefit as a positive side effect, sort of akin to a "national interest" foreign policy doctrine.

I think about this as I spend time working on accessceramics.org. It's very satisfying to be developing a resource for a global audience rather than just our local one. We can see the visitors coming in from around the world on Google Analytics. But our library, like most any academic library, is structured and funded as an organization that provides a wide set of rather generic services to a very defined audience. We're not really optimized for developing a narrow, niche collection that we serve up to the world. The Internet has taken away many of the barriers to doing this, however, and we are starting to forge ahead with collections like these.

In some ways, the model for academic libraries doing niche collections is like humanities scholarship, where the revenue from teaching subsidizes research. The services a library provides to a primary audience of students and faculty are akin to the teaching and the niche collections with a global benefit are equivalent to the research. In the same way that an academic's research benefits their teaching (or does it?), does a library's curation of niche collections make the library better in the primary services that it provides to its patrons: reference, instruction, discovery to delivery, etc.?

The local benefit of a digital project can obviously be hard to measure, but clearly some unique collections have particular relevance and value to a community: historical documents that support a niche area of scholarship that is a strength of the institution, a photo collection about the surrounding community, and so on. accessCeramics supports a strong tradition of ceramic arts instruction at our institution.

Many niche projects raise the profile of the parent institution broadly and have the potential to boost funds coming in from grants and donors. Libraries tend to take pride, rightfully, in the work they do that has global benefit and, indeed, most wish they had more staff resources to undertake such projects.

Thursday, October 30, 2008

EDUCAUSE 2008 wrap-up

EDUCAUSE is a fun conference, even if it is polished up with a kind of branded, corporate veneer and held in a grotesque place like Orlando's Convention Center. There is an overflowing exhibitor hall and scads of corporate-sponsored dinners and cocktail hours. I was lucky enough to partake in a few of these this time around.

I actually met a lot of counterparts from small college libraries, which was enjoyable and perhaps a surprise at a national conference for higher-ed IT. We didn't necessarily talk about anything too serious, but it was nice to make connections and compare notes on a few things. I did learn that Macalester, in addition to leading the way on Google Apps, is also an early adopter of WorldCat Local.

It was unnerving to hear that in light of the financial crisis many liberal arts colleges are making serious budget cuts and putting budget freezes in place, something we haven’t heard much about at my institution.

What did I learn from the formal program?

  • I hit a presentation about data curation projects at some big universities: Indiana University, UC San Diego, and Purdue. They involved collaborations between the library and technical computing centers. One of the major challenges was getting a seat at the table as research projects that involve data were being proposed, funded, and implemented.
  • One of my favorite presentations explored mashup-style video projects in undergraduate education at Dartmouth and Penn and made a strong case that these develop an important new kind of literacy. Assignments like this are making their way across the curriculum in poli-sci, composition, and language classes.
  • In a discussion session on IT/library collaboration, I learned that our library is behind the curve in experimenting with various merged service desk configurations. Most liberal arts colleges in attendance had done some fruitful experiments with merging IT and library support functions and mixing professional and non-professional staff at support desks.
  • A panel on space planning offered some interesting suggestions on user-centered planning, including impromptu interviews with students working in various spots on campus. It also showcased an ultra-flexible space at Georgia Tech. The guy from Georgia Tech recommended the Convia system for flexible wiring, data cabling, and lighting.
  • Chad Kainz of the University of Chicago gave an update on Project Bamboo, a rather amorphous humanities cyberinfrastructure planning effort sponsored by Mellon. I won't try to explain what it is, but it sounds pretty cool.
  • The next day, Kainz also moderated a discussion session on "Faculty: Scholars or Software Developers?" The question was how to support faculty who can now go out into the cloud and get or build what they need. The discussion descended into some mundane support issues, but I was able to pipe up about our use of Flickr in accessCeramics as an example of going beyond traditional enterprise-supported systems for a faculty-sponsored project. Some people seemed to think it sounded a little risky to use Flickr in such a way.
  • I caught a lunchtime discussion by campus web professionals. Almost everyone is using Google Analytics. Many people put in a plug for SharePoint, Microsoft's enterprise wiki/collaboration/content management system. There was some discussion about centralized vs. decentralized control of web design and branding. Most institution-wide designers want some kind of control over the campus brand, and there was talk of ways of enforcing this. I'm sympathetic to both the centralized and decentralized schools of thought.
I saw a few other sessions that were kind of lukewarm, so I won't post on them. Overall, though, it was a worthwhile conference.

Tuesday, October 28, 2008

NITLE cloud computing event report

Checking in from sunny Orlando, Florida. I caught a ride from the Convention Center zone to Rollins College in Winter Park for the NITLE cloud computing event today. Rollins has a charming campus, and the on-campus food wasn't bad either.

We heard a couple of Google Apps migration stories from CTOs. In one case, Wesleyan, the school was planning on switching over only students, whereas Macalester had switched the whole enterprise, students and staff. Interestingly, in the Macalester case, the switch was done in a matter of days after the old email system failed. It seems that a crisis situation really served as an important catalyst and brought the community together. Now Macalester is ahead of the curve as it takes advantage of the whole Google Apps suite.

Jerry Sanders of Macalester said that with Google Apps, IT's role had become more "consultative" and "less reactive." It was now more about discussing the possibilities with these new web 2.0 applications than troubleshooting problems. He likened this shift and the renewed sense of unknown possibilities to the introduction of personal computers and the advent of the web.

We heard from the DSpace federation, which has plans to enable DSpace to run on cloud-based storage. I continue to think that DSpace is not the right model for digital repositories in these times. It was designed before the Web 2.0 and cloud computing era and remains positioned as an isolated silo of data supporting a single institution.

David Young, CEO of Joyent, gave his view of the cloud. Joyent provides infrastructure for some huge applications on the web. He disagrees with Carr's view that the cloud will be dominated by a few big companies and sees it as a more heterogeneous beast. It was kind of fun to listen to an industry insider throw around jargon like "cloud stack" and "cloud primitives." A couple of quotes:
cloud computing=aggressively distributed
a cloud should abstract away all consideration except the application and its operation
Young seemed a little concerned that some cloud providers were creating a situation where they would lock users into their platform...perhaps Amazon is trying to do this with its EC2 virtual machines. He said that Joyent's philosophy was "openness is lock in," akin to Southwest Airlines' flexibility in reservations. Using open application stacks like RoR keeps users loyal...plus, the more data users put on your servers, the less likely they are to move (the dirty secret of cloud providers).

Finally, we heard from Lee Dirks of Microsoft's Education division. He said that MS sees academics as "extreme information workers." Microsoft has developed a few open source applications based on their SharePoint platform that are designed to facilitate research, including software that can do conference planning and facilitate peer review. I was a little skeptical of some of these scholarly collaboration platforms--how far beyond more generic collaboration software do they take it? I'd have to have a closer look.

The day ended with some heated dialogue about information privacy and security concerns when using SaaS providers. Many of the CTOs felt it would be a big hurdle to get their campus legal counsel to agree to putting their data on external servers, but pretty much all agreed that this is the direction things are going.

Saturday, October 25, 2008

Economist article on cloud computing

The wife and baby are in Wisconsin this week visiting relatives while I head to EDUCAUSE in Orlando. There might be more blogging as a result.

The Economist has a piece this week on cloud computing. It's a pretty good overview of the concept for those who haven't been following it closely. Overall, however, I think it overemphasizes what I would call the raw, technical aspect of the phenomenon and under-emphasizes the network-effects angle.

The idea of highly flexible computing power is a pretty cool one, and the piece cites an Amazon Web Services case study demonstrating just that. Using AWS, a Washington Post engineer built a digital library of a massive collection of potentially newsworthy government documents about Hillary Clinton in nine hours. What a contrast to the timelines we're used to in libraries!

The most powerful aspect of the cloud computing phenomenon, in my opinion, is the aggregation of data and the network effects that arise as systems get larger. The key feature of a cloud application is that its data is part of a greater organic whole, and that it's able to do things that an isolated application can't. This is where the distinction between Web 2.0 and cloud computing gets fuzzy. The piece starts to touch on this concept when it brings up Tim O'Reilly:
A raft of start-ups is also trying to build a business by observing its users, in effect turning them into human sensors. One is Wesabe (in which Mr O’Reilly has invested). At first sight it looks much like any personal-finance site that allows users to see their bank account and credit-card information in one place. But behind the scenes the service is also sifting through its members’ anonymised data to find patterns and to offer recommendations for future transactions based, for instance, on how much a particular customer regularly spends in a supermarket. Wireless devices, too, will increasingly become sensors that feed into the cloud and adapt to new information.
We now use Mint to track our home finances...for some reason we didn't like Wesabe. It knows how to categorize purchases on our credit card statement because it picks up on the ways other users categorize purchases with similar labels. Much nicer and easier than using Quicken used to be.
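As a toy illustration of that crowd-sourcing idea (not how Mint actually works, just the general shape of it), here is a Python sketch that labels a new transaction with whatever category other users most often applied to similar descriptions. The merchant names and categories are invented.

    # Hypothetical sketch: categorize a transaction by majority vote over
    # labels that other users applied to similar merchant strings.
    from collections import Counter

    # Labels contributed by other users: (merchant substring, category)
    CROWD_LABELS = [
        ("safeway", "Groceries"),
        ("safeway", "Groceries"),
        ("safeway", "Household"),
        ("shell", "Gas & Fuel"),
    ]

    def categorize(description):
        desc = description.lower()
        votes = Counter(cat for merchant, cat in CROWD_LABELS if merchant in desc)
        return votes.most_common(1)[0][0] if votes else "Uncategorized"

    print(categorize("SAFEWAY STORE #123 PORTLAND OR"))  # -> "Groceries"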

The piece brings up a concept of "industry operating systems" that will arise to allow businesses to become more modular and flexible, while relying more heavily on the services of others.
Both trends could mean that in future huge clouds—which might be called “industry operating systems”—will provide basic services for a particular sector, for instance finance or logistics. On top of these systems will sit many specialised and interconnected firms, just like applications on a computing platform.
This is interesting to contemplate. You could almost argue that Flickr fits this model. It provides the basic operating system, and then many other firms jump in and provide specialized image services: prints, calendars, cards, etc. In this case the industry is totally virtual.

I liked this quote:
Twenty years ago, he argues, 80% of the knowledge that workers required to do their jobs resided within their company. Now it is only 20% because the world is changing ever faster.
There's a parallel here with libraries. We've seen a similar flip in terms of information residing in-house vs. outside. We're preparing students for the business world where information is also in the cloud.

I've been reading the Economist for 20 years now, but I've come to realize that they are a bit technologically stodgy. Their online stories have no hyperlinks within them.