Mark Custer | defenestrated blog

NC EAD, part 2

January 12, 2010

Well, I was going to try to add all of these links to the freebase website, but I haven’t used that in so long that I forgot how to use it (but maybe I’ll get back to it another time).

In the meantime, I added all of my “NC EAD” bookmarks to the twine website instead.

Check it out, join, help it grow; it’s completely open, so do whatever.  Here’s the link:

http://www.twine.com/twine/1315qrmbz-yv/nc-ead

Categories: Uncategorized

I want my, I want my EAD

December 8, 2009

Controversial lyrics (from the song referenced in this post’s title) by Mark Knopfler aside, I have a few questions regarding the state of EAD in the state of N.C.

  1. How widespread is the adoption of EAD?
  2. Who’s not using the standard who would like to?
  3. Are there any plans afoot to create a regional EAD consortium (such as OAC, NWDA, RIAMCO, etc.)?  Please say yes; and secondly, wouldn’t it be great if such a consortium had an XForms-powered admin interface for users to create and update records online? (On this last point, I’m thinking of “EADitor,” the Orbeon XForms project that Ethan Gruber has recently started to work on.)
  4. Is anyone carrying on the torch of NCBHIO and creating EAC records, now that that standard has arrived? (And since that standard essentially demands a consortium, in my mind, how is EAC affecting question #3?)

Unfortunately, I cannot even begin to answer questions 2 – 4 on my own, but I will attempt to quickly provide a start of an answer to my first question.  To that end, I went to the Society of North Carolina Archivists’ website, clicked on their links page, and then did a quick look through the “SNCA Affiliated Repositories” section.

And so, here’s an unannotated list of direct links to (mostly EAD) finding aids at North Carolina institutions (and, if anyone knows of a much better list, please let me know!):

http://www.library.appstate.edu/appcoll/list.html

http://library.bowdoin.edu/arch/mss/manuscript.shtml
http://library.bowdoin.edu/arch/archives/archives.shtml (Bowdoin was listed on the SNCA Member Links page, but it’s not a college in N.C.)

http://www.cumberland.lib.nc.us/lshistory/IndexesOnline.htm

http://www3.davidson.edu/cms/x17522.xml

http://library.duke.edu/digitalcollections/rbmscl/inv/

http://archives.mc.duke.edu/collections/eadfaids.html
http://archives.mc.duke.edu/collections/holdings.html

http://digital.lib.ecu.edu/special/ead/

http://www.elon.edu/e-web/library/libraryinfo/findingaids.xhtml

http://www.foresthistory.org/ead/index.html (added to the list on 2009-12-09)

http://www.archives.ncdcr.gov/ead/academic.htm
http://www.archives.ncdcr.gov/ead/military.htm
http://www.archives.ncdcr.gov/ead/photo.htm
http://www.archives.ncdcr.gov/ead/org.htm
http://www.archives.ncdcr.gov/ead/pc.htm
http://www.archives.ncdcr.gov/ead/state.htm

http://www.lib.ncsu.edu/findingaids/
http://www.lib.ncsu.edu/specialcollections/forestry/collections.html

http://toto.lib.unca.edu/collections/manuscripts.html
http://toto.lib.unca.edu/collections/oralhistories.html
http://toto.lib.unca.edu/university_archives/default_university_archives.htm

http://www.hsl.unc.edu/specialcollections/archival/index.cfm

http://www.lib.unc.edu/mss/inv.html
http://www.lib.unc.edu/mss/sfc1/sinv.html

http://specialcollections.uncc.edu/manuscripts/title_list.php

http://library.uncg.edu/depts/archives/mss/title.asp
http://library.uncg.edu/depts/archives/universityrecords/records_groups.asp
http://library.uncg.edu/dp/wv/find.aspx

http://library.uncw.edu/web/collections/manuscript/index.html
http://library.uncw.edu/web/collections/archives/faids/index.html

http://wakespace.lib.wfu.edu/jspui/handle/10339/18473

http://ewake.wfubmc.edu:88/library/archives/collections.html

Also, if you’re using EAD in N.C. and don’t see your website listed here, please let me know.

Categories: Archives · EAD

Symphony’s problem with operators

June 21, 2009

I’d be quite alarmed if this hasn’t already been reported since SirsiDynix’s Symphony OPAC has been out in the wild for quite some time, but here’s an annoying “bug” that I just discovered today.

Search for near in any out-of-the-box Symphony OPAC and you’ll get yourself an error.  Now try with, adj, same, and even the Boolean operators or, not, & and (I’ll ignore “xor“, since I can’t think of any examples when I’d ever type that).

[digression: If you try to search for but (for, by, etc.), however, it will tell you that your search contains all stopwords.  And I'll try to forgo the argument about whether or not a library catalog should remove so-called stopwords in this day and age, but suffice it to say that a user can't find any albums by "The The" without moving beyond the default search form; and, in my opinion, a user should never have to do such a thing for such a simple search. ]

So, yes, the Symphony OPAC seems to have a problem with operators, but it’s certainly not likely that someone will search for adj.  If someone does, however, whether by accident or not, they shouldn’t be greeted with an error.  Instead, their query should be run as is, but on the results page there should also be a new <div>, placed unobtrusively, that informs the user that “adj” is also a proximity operator, and this is how to use it, should they want/need.

What’s worse, though, is that you cannot use near, with, same, or, not, or and at the beginning or the end of any of your queries (exception:  you can use not at the beginning of your query without getting an error, since that operator doesn’t require a first half of an argument, but it will still treat not as an operator).  And this, in my mind, is the real bug here.  You cannot, then, search for:

  • Near a thousand tables
  • Near eastern archaeology
  • The singularity is near
  • With wings like eagles
  • Same differences
  • Same river twice (but you’re fine if you include the stopword “the” at the beginning, since “same” will no longer be the first word in the query)
  • I love you just the same
  • Or else my lady keeps the key
  • Ready or not
  • Not philosophy (won’t search for the query “not philosophy” but will instead search for any record that doesn’t contain the keyword “philosophy”…  so, you won’t get an error, but you’ll get a LOT of results).
  • And then there were none
  • And the band played on
  • And you get the idea…

Of course, you can move beyond the default search values and use any of those proximity operators in conjunction with the “browse” (or “begins with”) radio button, but that should NOT be a requirement for using a select few query terms.  Or, worse, you could work around this bug, for now, by altering your search to something like this:

“and” then there were none

or even

the and then there were none

but that’s a pretty silly solution, as well.  In any event, I have no idea if this bug has been reported or not, but I am quite certain that it would be a very easy fix for SirsiDynix to implement, so I hope that they do so soon — that is, if they don’t already have a patch for this in the works.

Anyhow, if you want to try this out, or if you’re really ambitious and think that you can find any other bugs worth reporting, here’s a list of libraries using Symphony that I’ve compiled:

http://www.twine.com/twine/12vl6vpd0-19f/libraries-using-symphony-elibrary-as-an-opac

Unfortunately, it’s hard to determine a static link to Symphony OPACs, so most of those links will take you to a timed-out session.  Once there, though, you can usually get back to the main search page just by clicking on “OK”, and then starting a new search.

[ update: I just checked a Sirsi Unicorn library catalog, and it also seems to have this same issue on default keyword searches.  So apparently this is a carryover from that legacy system (we were previously on Dynix's Horizon, which did not have the same issue by default; at least not that I'm aware of).  So, in hindsight, I guess this is a Unicorn bug, which makes me certain that it's already been reported, but I really wonder why it exists.  Indexing a default query in this manner seems very strange to me.  Certainly they could just require their operators to be followed by a special character, such as "#", or even just not treat any boolean or proximity operators as operators when they appear at the beginning or end of a query.]
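To make that last suggestion a bit more concrete, here is a minimal sketch in Python of the "don't treat a leading or trailing operator as an operator" rule. It is only an illustration: I have no idea how Symphony or Unicorn actually parses queries, the operator list is just the set of words mentioned in this post, and the function names are made up for the example.

```python
import re

# Operator words mentioned in this post; the real Symphony/Unicorn list may differ.
OPERATORS = {"and", "or", "not", "xor", "near", "with", "adj", "same"}


def tokenize(query):
    """Split a query into words, keeping double-quoted phrases together."""
    return re.findall(r'"[^"]*"|\S+', query)


def classify(query):
    """Label each token as ('operator', word) or ('term', word).

    An operator word counts as an operator only when it is neither the
    first nor the last token; at either edge it is kept as a plain term.
    """
    tokens = tokenize(query)
    last = len(tokens) - 1
    labeled = []
    for i, tok in enumerate(tokens):
        is_op = tok.lower() in OPERATORS and 0 < i < last
        labeled.append(("operator" if is_op else "term", tok))
    return labeled


if __name__ == "__main__":
    for q in ["And then there were none", "The singularity is near", "Ready or not"]:
        print(q, "->", classify(q))
```

Under this rule "And then there were none" and "The singularity is near" become ordinary keyword searches. "Ready or not" would still treat the middle "or" as an operator, and a leading "not" would lose its special meaning, so a real fix would need to decide what to do about those cases.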

Categories: Information Retrieval · Libraries

GNU Wget NCBHIO

June 1, 2009 · 2 Comments

If that doesn’t qualify as a strange blog-post title, I don’t know what does.

Anyhow, I just wanted to say how nice it was to visit the Carolina Digital Library and Archives department (CDLA) at the Louis Round Wilson Library in UNC-Chapel Hill.

Louis Round Wilson Library

Natasha Smith, the head of the Digital Publishing group, invited our entire Digital Collections department to visit them on the morning of May 20th. While there, we were able to demonstrate and discuss with them the behind-the-scenes processes of our Digital Repository, and we were also able to hear and see a lot of interesting projects that CDLA is working on. To mention just one, for brevity’s sake, I’ll say that I was very excited to see a superbly designed online template for a finding aid to a collection that has now been digitized in its entirety. So, definitely keep an eye out for the new Thomas E. Watson Papers finding aid once it is unveiled.

During our discussions, we also got on the topic of EAC (or Encoded Archival Context), which is a standard that I’ve definitely wanted to see developed more fully after first hearing about it just one year ago. It was nice, too, that Richard Szary was in the room, since he was one of the original working group members of the EAC standard while at Yale (though I wasn’t aware of that at the time).

In my mind, EAC would be a perfect candidate to be deployed with something like Metaweb’s Freebase.  Sure, we could still export valid XML files for the preservation of the information (as the standard should still adhere to the original Toronto Tenets), but I think it would go a long way to have this information available on an easily editable and transferable platform.  I’d love, for instance, to be able to pull biographical and relational information about people mentioned in our finding aids via something like their Metaweb Query Language, and also to dynamically generate a list of “related collections” available elsewhere, at other institutions.

Anyhow, Maggie Dickson, the Watson-Brown Project Librarian, mentioned the NCBHIO project, which is something that I had never heard of before.  Here’s a link to their website, NCBHIO, which isn’t entirely functioning anymore, but it is the only website devoted exclusively to a collection of EAC records of which I’m currently aware, at least.  If there are more out there, though, I’d love to hear about them.  I have heard of a few European institutions already incorporating EAD and EAC, but I’m definitely not aware of anything else like this.

In any event, in order to learn more about the standard, I went ahead and used Wget to download all 59 EAC records that are still currently hosted on the NCBHIO website (hence the strange title for this post).   Hopefully I’ll have some time this summer to study those files some more and perhaps even create a few EAC records myself.
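For anyone who would rather script this sort of harvest than use Wget, here is a rough Python sketch of the first step: collecting the links to XML files from an index page. To be clear, this is not what I ran (I used Wget), and it assumes that the records are linked from a single HTML page with ordinary `<a href="….xml">` links; the `example.org` address and the file names in the test string are placeholders, not the NCBHIO site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class XmlLinkCollector(HTMLParser):
    """Collect absolute URLs of <a> links that point at .xml files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".xml"):
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def xml_links(index_html, base_url):
    """Return every .xml link found in the given HTML, in document order."""
    parser = XmlLinkCollector(base_url)
    parser.feed(index_html)
    return parser.links


if __name__ == "__main__":
    # Placeholder page content, invented for illustration only.
    page = '<a href="records/a.xml">A</a> <a href="about.html">About</a>'
    print(xml_links(page, "http://example.org/eac/"))
```

Each URL in the resulting list could then be fetched with something like `urllib.request.urlretrieve`, ideally with a pause between requests so as not to hammer the server.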

Until then, if anyone else is working with EAC, or anything like the EATS project (here’s a blog post about that), which was developed by the New Zealand Electronic Text Centre, I’d love to hear more about it.

Categories: Archives · Libraries

The happiest Swede in Paris (and Nadal deserved better)

May 31, 2009

I’ll avoid making any puns, but how is it possible that I just had to watch THE greatest upset at the French Open this year (and possibly ever) on ustream?  The quality was so poor I could barely see where the tennis ball was, but at least I could witness part of the action.  I honestly would’ve waited to watch the match if it was broadcast anywhere on TV here, but it wasn’t even shown on the Tennis Channel.

I anxiously await the death of cable television.  Long live the illusion of “user control/interaction” that is the internet.

Categories: Uncategorized

LITA Camp presentation on EAD

May 30, 2009

And here’s a much shorter post about LITA Camp so that I can post my presentation, post-hoc style.

I had arrived in Dublin ready to talk about the EAD redesign project that I’m currently involved with at ECU.  However, there wasn’t anyone in attendance that worked exclusively in Special Collections or Archives, so I opted to attend a breakout session on Institutional Repositories rather than to host my own on EAD.

After the conference, I figured that I’d just post a link to my powerpoint presentation.  However, the powerpoint that I prepared was pretty useless without my notes attached, so I then decided that I should record a shortened version for a screencast.  And here’s the result:

And, in order to be a good creative commons citizen — since I skipped my last powerpoint slide in the screencast — here’s a list of the images that I used:

If you have any comments or questions, just let me know.

Categories: Conference · EAD

Goodbye, Columbus; Farewell, COTA

May 29, 2009

LITA Camp is long over now (it ended on May 8th), but I’m just finally getting around to adding a post about it.  Though it would’ve been ideal if the “un”conference hadn’t had such a hefty registration fee attached to it, it was still a nice couple of days to network with a bunch of interesting librarians whom I had never met before.

Also, though Dublin seems to be a really nice town, transportation to and from the airport was less than ideal (especially for me, since I opted to stay in Ohio until Saturday in order to see a bit more of Columbus while I was there).  Unfortunately, I wasn’t prepared for the horrible website that is COTA (the Central Ohio Transit Authority).

Perhaps I am a little slow, but just one small tweak that could make those PDF bus routes a lot easier to use on one’s own would be to include some “ante” and “post” meridian notation!  Or, heck, even using military time would make things a lot less ambiguous, and slightly more user-friendly.  Here’s an example of a problem that I encountered, putting me on the wrong side of the meridian and stuck downtown without a bus back to Dublin:

cota

Fun to twist your head and read, right?  Well, to make a long story short, I could’ve avoided getting stuck had this particular excerpt included “am” next to each of those times.  Though I was able to travel down to Columbus early that evening, that same bus did not run back uptown at night (and I had unwisely assumed that I could hitch a ride back at the 7:42 stop).

Aside from that misfortune, I did get to see a few other interesting sights and signs while in Dublin/Columbus, a few of which I’ll post below.  First, you’ve got OCLC’s headquarters sign, captured pretty poorly with my camera phone:

OCLC

And then you’ve got, what must be, the most active pedestrians in Dublin, Ohio:

Geese

As for the funniest piece of graffiti that I encountered, I’ll bestow that honor on this alley-side image:

BertErnie

And finally, the out-right winner for what was clearly the best sign that I saw during my trip:

IceCream

A sophisticated treat, indeed; but, sadly, it was one that I opted not to partake in during my time there.  Next time, though, Columbus!

Categories: Conference

Hello, Columbus

May 6, 2009

Not sure what to expect yet, but I’ve just recently arrived in Columbus, Ohio. Here’s the reason why:

Lita Camp 2009

I plan to do a presentation regarding my attempts/plans to integrate our EAD records with our Digital Repository. I was hoping to have a functioning beta by now (aside from just a few web page examples and a PPT presentation), but a lot of other work has come up that hasn’t permitted that to happen. Nevertheless, I still hope to launch everything in July.

And, after this weekend, I’ll go ahead and post my presentation and a detailed conference report. I’m looking forward to it…

http://www.flickr.com/photos/49024304@N00/46924091/

Categories: Conference · EAD · Library 2.0

Subjective Access

April 23, 2009

“Subjective access may not guarantee that I’m right about the character of the state I’m conscious of myself as being in, but on this view it does ensure that I’m the one who’s in the state if anybody is.”

– David M. Rosenthal, from Consciousness and mind (oclc: 61200643, page 355)

In EAD, we file subjects under a tag known as <controlaccess>, which is short for “controlled access headings.” And, since we’re using these for access (just as they were so used in the card catalogs of old, and occasionally still current), they should certainly be hyperlinks, right?

So, what state are the subjects of our finding aids actually in, and what state should they be in?

In our case, at ECU, we’re in the process of updating all of our old subjects into Library of Congress Subject Headings, like this:

Sinbad (dog) + more

This isn’t an easy process, but it does mean that, once done, all of our finding aids will “play” nicely with all of the objects in our Library Catalog as well as all of the objects in our Digital Repository. But, for the time being, everything that’s listed in our “controlled access headings” is listed as plain text (without even, at this time, the ability to restrict a keyword search to those fields).

After the update, however, not only will we feature more advanced search options on an advanced search page, but we’ll also turn all of our subjects into hyperlinks. But this raises the question:

to what should we link?

At first, this was “obvious” to me, but after now having looked around at other institutions, it seems that there are a few different solutions, which I’ll list below:

(1) Link nowhere (The option that’s most often employed, and the one that we’ll be moving away from)

(2) Link to the rest of the finding aid database (subject to subject)

(3) Link to the rest of the finding aid database (subject to keywords)

(4) Link to the library catalog, which would include the rest of the finding aid database (subject to subject)

Right now, I’m leaning toward a fifth option, at least for the time being (which is just a combination of options 2 and 4):

(5) Link to finding aid database (subject to subject), and then on the page of search results, also include a link that will extend that same subject search to our library catalog and also to our digital repository.
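Here is a minimal Python sketch of what option 5 could look like in practice: pull the `<subject>` headings out of `<controlaccess>` and build one search link per target. Everything specific here is an assumption made for illustration: the three `example.edu` search addresses, the `subject` query parameter, and the sample headings are all made up, and the sample assumes an EAD file without an XML namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical search endpoints; substitute whatever your systems expose.
TARGETS = {
    "finding_aids": "https://example.edu/findingaids/search",
    "catalog": "https://example.edu/catalog/search",
    "repository": "https://example.edu/repository/search",
}

# Illustrative sample only (no namespace, invented headings).
SAMPLE_EAD = """
<ead>
  <archdesc>
    <controlaccess>
      <subject>Sinbad (dog)</subject>
      <subject>Dogs</subject>
    </controlaccess>
  </archdesc>
</ead>
"""


def subject_links(ead_xml):
    """Return one dict per <subject> heading with a search URL per target."""
    root = ET.fromstring(ead_xml)
    results = []
    for subject in root.findall(".//controlaccess/subject"):
        # Collapse internal whitespace so line-wrapped headings still match.
        heading = " ".join("".join(subject.itertext()).split())
        entry = {"heading": heading}
        for name, base in TARGETS.items():
            entry[name] = base + "?" + urlencode({"subject": heading})
        results.append(entry)
    return results


if __name__ == "__main__":
    for entry in subject_links(SAMPLE_EAD):
        print(entry["heading"], "->", entry["finding_aids"])
```

The finding aid link would be the hyperlink on the heading itself, and the catalog and repository links would sit on the results page as "extend this search" options. A subject-to-subject search only works, of course, if the headings are normalized the same way in all three systems.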

I’d love to hear other options and what other people think may be the best route (though, in my opinion, that may largely come down to the size of your collections and also to the extent that they’ve been cataloged in some normalized fashion).

My final thought about all of this is how to extend it beyond our own collections, into larger EAD databases like ArchiveGrid. Wouldn’t it be useful to a researcher if a subject in a local finding aid could be extended to repositories worldwide? In this case, though, it would definitely be easier to follow Columbia’s example so as to not get involved with messy crosswalks and the like.

Categories: Archives · EAD · Information Retrieval

Gangling Container Lists

April 13, 2009 · 2 Comments

Linotype operator

– or, on faking a neologism

What’s a “gangling container list”*, you might reasonably wonder?  Well, I’m using the term “GCL” to refer to a “container list” (or inventory) in a finding aid that is particularly hard to encode/potentially confusing to the user/online viewer.  The main GCL at ECU belongs to the Manuscript Collection numbered 741.  Let me explain, in a less cryptic fashion:

Right now, the only collection that we have that’s both heavily described and digitized is our Daily Reflector Negative Collection.

Though the encoding for this collection isn’t divided into thematic series (it’s arranged chronologically instead), it is arranged/subdivided by:

  1. Box
  2. Folder
  3. Sleeve
  4. Item (when digitized).

Here’s an example of our EAD encoding for that, where the component level in the EAD corresponds to the ordered-list numbers above:

Snippet of the EAD container list for the Daily Reflector Negative Collection

If you’re familiar with EAD, you might look at this and have a lot of questions/criticisms.  However, I don’t want to focus on how this finding aid is encoded (as it’s not typical for our collections, and it isn’t ideal yet), but instead what I want to focus on is its physical arrangement, its display, and how we’re going to connect it to the portions that are digitized.

Until now, we’ve only been linking digitized objects in our finding aids at the item level (in this case, that’d be the <c04> tag).  However, we have received an LSTA grant for this collection that will shortly result in the digitization and description of over 7000 images.   And, in preparation for this grant, the container list (or, GCL) has grown from a relatively short list that contained information about its 45 boxes to an exceptionally long list, which now contains information for over 13000 described sleeves.

Presently, the online finding aid has every box, folder, and sleeve listed on just one page of output.  It also includes just over 100 images that were digitized prior to the grant for testing purposes.  But, if the finding aid were to include all of the images, this would result in over 20000 lines being added just to the container list!

So, we have two dilemmas:

  1. How to deal with this “one page display”
  2. How to deal with so many items (which will only increase after the grant).

As for problem number 1, we’re going to continue with our one page display option for the time being (though we may eventually employ other types of interfaces) in order to keep our search processes as simple as possible.  This could/should be an entire blog post on its own, however, so I’ll save that for another time.

That leaves problem number two. One potential solution, though not yet employed, will adhere to the following principles:

  • Encode everything (all 7000+ items, and add new items as they’re requested for digitization)
  • Do not provide item level links in the finding aid (at least in the initial display) if the collection has too many items (rather than setting an arbitrary item number limit, however, this decision will be made at the collection level and might only include this particular collection, due to the next reason)
  • When possible, only scan and catalog at the lowest level of granularity already described in the finding aid (this means that when future items are requested by a patron for digitization, we might scan all of the other items in that folder at the same time, and only describe the “digital object” at the same level as is described in the finding aid).  See this object for a pilot example (but note that the display is not finished and that it hasn’t yet been cataloged).
  • Create a new stylesheet that can differentiate between providing links at the box, folder, sleeve, and item levels when necessary.
  • Create a new template that helps to address issue number 1 until that issue can be more thoroughly examined.

For this finding aid, then, the stylesheet will only output links at the “folder” and “sleeve” levels.  The individual items will only be accessible from these two levels (of folder and/or sleeve).  In some cases, then, the sleeve link will take you to a display with only one item and in others it will take you to a display with multiple items (it just depends on how many of the negatives were selected from that particular sleeve).  Each of these “sleeves” has a description that includes the total number of physical negatives included, though, so it should hopefully be somewhat clear to the user whether the sleeve is partially or fully digitized.
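To sketch the logic that such a stylesheet would need, here is a small Python stand-in (the real thing would be XSLT, which I haven’t written yet). It walks the nested components, skips the item level entirely, and only attaches a link to a folder or sleeve that actually contains digitized items. The sample XML, the `id` attributes, and the `example.edu` URL pattern are assumptions invented for the example and are not our actual encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository URL pattern keyed on a component id (an assumption).
REPOSITORY_BROWSE = "https://example.edu/repository/browse?component="

# Component levels as described above: box, folder, sleeve, item.
LEVEL_NAMES = {"c01": "box", "c02": "folder", "c03": "sleeve", "c04": "item"}
LINKED_LEVELS = {"folder", "sleeve"}

# Illustrative sample only.
SAMPLE_DSC = """
<dsc>
  <c01 id="b1"><did><unittitle>Box 1</unittitle></did>
    <c02 id="b1f1"><did><unittitle>Folder a</unittitle></did>
      <c03 id="b1f1s1"><did><unittitle>Sleeve 1</unittitle></did>
        <c04 id="b1f1s1i1"><did><unittitle>Negative 1</unittitle></did></c04>
      </c03>
      <c03 id="b1f1s2"><did><unittitle>Sleeve 2</unittitle></did></c03>
    </c02>
  </c01>
</dsc>
"""


def container_rows(dsc):
    """Return (level, title, url_or_None) rows for the initial display."""
    rows = []

    def walk(node):
        for child in node:
            level = LEVEL_NAMES.get(child.tag)
            if level is None or level == "item":
                continue  # skip <did> etc., and leave items out of the list
            title = (child.findtext("did/unittitle") or "").strip()
            has_items = child.find(".//c04") is not None
            url = None
            if level in LINKED_LEVELS and has_items:
                url = REPOSITORY_BROWSE + child.get("id", "")
            rows.append((level, title, url))
            walk(child)

    walk(dsc)
    return rows


if __name__ == "__main__":
    for level, title, url in container_rows(ET.fromstring(SAMPLE_DSC)):
        print(level, title, url or "(no link)")
```

In this sample, "Folder a" and "Sleeve 1" get links because a digitized negative sits beneath them, while "Box 1" and the undigitized "Sleeve 2" stay as plain text.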

Check back next week for a mock-up of a newly improved Daily Reflector finding aid (this ambitious deadline, I’m hoping, will give me some incentive to finally write that stylesheet).   The mock-up won’t look like the final format, however, as there will still be some work that needs to be done to more fully integrate our digital repository with our finding aid database, but it should present a pretty clear idea.

In the meantime, please leave comments, suggestions, or even examples of your own GCLs.  I’ve certainly seen some instances of innovative displays for extremely large collections, but what I’m more interested in seeing is a display method for such a collection that also fits in with the overall delivery and “search” of the rest (that is, not just a finding aid that’s like an online exhibit, but a mutable sort of finding aid that integrates well with every EAD at that institution).

*Though the phrase is abbreviated as GCL, the recommended pronunciation is actually “Gackl”**, which is utilized instead of “G.C.L.” in order to better emphasize the electronic awkwardness of its referent.
**That’s not an e-typographical error. The preferred spelling is “Gackl” rather than something like “Gackle” for two important reasons:

  1. Obviously, to mock all things Web 2.0
  2. So as to not confuse the term with (nor raise awareness of) Gackle, North Dakota.

Categories: Archives · EAD