Have good list from the repository. My initial cleanup of Digital Commons items was fairly straightforward. I sat down with our International Law Librarian Anne Burnett to talk about the collection. We batch-downloaded the series as a spreadsheet and updated the fields that did not match so that they did. For example, the document type was not the same for every item: some were listed as articles, some as dissertations. This one was easy; I updated them all to the "dissertation" type. Then I saved the spreadsheet and re-uploaded the batch sheet.
records first. Now they will have consistent 502's, and correct item locations. To verify the proper 502, Wendy and I consulted our office copy of the AACR (Anglo-American Cataloging Rules) -
"Section 2.7 B13 Dissertations". Although LL.M. was not listed as a specific example, the rule states that you use: "Thesis followed by a brief statement of the degree for which the author was a candidate (e.g., M.A. or Ph.D.), the name of the institution..., and the year in which the degree was granted."
Export a better list from the ILS. Now that I had a proper list from Sierra, I exported it to a delimited text file, then imported it into an Excel spreadsheet. I was ready to compare it to my repository spreadsheet and figure out what was missing in Digital Commons. To figure this out, I started by doing a "save as" of each spreadsheet, then narrowing both sets of data down to only the fields that I could use to compare between the two: Title, Author, and Publication Date (year was really what I was looking for).
This presented further problems. Not all titles in Digital Commons looked like what should be the matching titles from the ILS records (for example, many repository title fields had been entered in all caps!). For Author, Digital Commons separated last and first names into two fields, but in the MARC records this was a single field. For Publication Date, the formatted date in Digital Commons records was very detailed and specific, while the only match point in the ILS records was the 260 field, which includes the publication date at the end as subfield $c. Major thanks to our Collection Services Manager David Rutland for knowing this one off the top of his head! The 502 might again prove useful if the 260s were too difficult, since the "year in which the degree was granted" appears at the end of this field for each item.
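For anyone who would rather script this comparison than do it by hand in Excel, here is a minimal Python sketch of the idea, not the process I actually used. It assumes both spreadsheets have already been read into lists of dictionaries; the column names (`author_lname`, `author_fname`, `publication_date`, `f260`, `f502`) and the sample rows are made up for illustration. It lowercases titles and strips punctuation so all-caps entries still match, folds the two repository name fields into the single "Last, First" form, and pulls a four-digit year out of whatever date text is available, falling back to the 502 when the 260 has none.

```python
import re

def normalize_title(title):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(cleaned.split())

def normalize_author(name):
    """Reduce a 'Last, First' heading to a lowercase 'last first' string."""
    parts = [p.strip() for p in name.split(",")]
    last = parts[0]
    first = parts[1] if len(parts) > 1 else ""
    return normalize_title(f"{last} {first}")

def extract_year(text):
    """Return the last four-digit year (1800-2099) found in text, or None."""
    years = re.findall(r"(?<!\d)(1[89]\d{2}|20\d{2})(?!\d)", text)
    return years[-1] if years else None

def match_key(title, author, year):
    """Build the (title, author, year) tuple used to compare the two lists."""
    return (normalize_title(title), normalize_author(author), year)

# Made-up sample rows; real exports will have different columns and values.
repo_rows = [
    {"title": "AN EXAMPLE THESIS ON TRADE LAW", "author_lname": "Smith",
     "author_fname": "Jane", "publication_date": "1995-01-01"},
]
ils_rows = [
    {"title": "An example thesis on trade law /", "author": "Smith, Jane.",
     "f260": "c1995.", "f502": "Thesis (LL.M.)--University of Georgia, 1995."},
    {"title": "A second example thesis /", "author": "Doe, John.",
     "f260": "", "f502": "Thesis (LL.M.)--University of Georgia, 1998."},
]

repo_keys = {
    match_key(r["title"], f'{r["author_lname"]}, {r["author_fname"]}',
              extract_year(r["publication_date"]))
    for r in repo_rows
}

# ILS records with no matching repository item; use the 502 year if the 260 has none.
missing = [
    r for r in ils_rows
    if match_key(r["title"], r["author"],
                 extract_year(r["f260"]) or extract_year(r["f502"]))
    not in repo_keys
]

for row in missing:
    print(row["author"], "-", row["title"])
```

Exact matching like this will still miss pairs where one record has a middle initial or a subtitle and the other does not, so anything it reports as missing deserves a second look by eye.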
There is also an excellent page with more information specific to working with and cleaning up dates.
I am still working on the cleanup of this set of items, but even though it is a work in progress, this has been a wonderful learning experience. Each time I work on it I learn something new! I am excited about the things I have figured out in this process that can be applied to other sets of items in both our repository and library catalog records in the future.
I'd like to thank several of my colleagues at UGA Law Library for providing various pieces of this project's puzzle. Without them I would not have made it this far with these particular data sets. Thank you, Anne, David, and Wendy, for all your context, tips, and tricks, and for sharing your experiences with this collection.
What types of cleanup are you doing with your library's data? What tips and resources have worked well for you? Please share with us in the comments!