Thursday, March 8, 2007

data stores

At code4lib, Talis was promoting their platform. It's based on the concept of "stores", which are basically large bodies of data stored on Talis' computers. The advantage of putting your data in these stores is that it can be queried, searched, and related to other data in numerous ways.

Some of what they say about their platform:
Large-Scale Content Stores

The Talis Platform is designed to smoothly handle enormous datasets, with its multiple content stores providing a zero-setup, multi-tenant content and metadata storage facility, capable of storing and querying across numerous large datasets. Internally, the technology behind these content stores is referred to as Bigfoot, and there is an early white paper on this technology here.

Content Orchestration

The Talis Platform also comprises a centrally provided orchestration service which enables serendipitous discovery and presentation of content related according to arbitrary metadata. This service makes it easy to combine data from across different Content Stores flexibly, quickly and efficiently.


This all seems rather nebulous when you first think about it, but slowly, the usefulness of the concept begins to reveal itself. They discussed a little bit about how this platform is supporting Interlibrary Loan at UK libraries because it provides a way to query across different libraries.

My question is, do libraries really have enough of their own content to leverage a platform like this? All we really have is generic data about books and journals and specific data about what libraries holds them.

I wonder whether this kind of service would most useful if a player like Google offered it. Why Google and not Talis? Because they have huge amount of data already amassed from web crawling, publisher relationships, not to mention scanning books. Think about the opportunities that would present themselves if you could query specific slices of Google's content alongside your organization's own data? What if Google hosted research databases as stores and you could slice them up, query them, and relate them ala the Talis platform?

Essentially, a library could create its own, highly tailored searching/browsing/D2D systems.

Maybe I'm asking for too much.

No comments: