From: Linked Data: what cataloguers need to know. A CIG event. 25 November 2013, Birmingham. #cigld
http://www.cilip.org.uk/cataloguing-and-indexing-group/events/linked-data-what-cataloguers-need-know-cig-event
Accompanying write-up in Catalogue & Index 174: http://discovery.ucl.ac.uk/1449459/
1. What's Wrong With MARC?
Linked Data: what cataloguers need to know #cigld
CILIP Cataloguing and Indexing Group (CIG)
25 November 2013
Thomas Meehan
tom@aurochs.org @orangeaurochs
3. AACR2 in MARC21
245 00 $a Models for decision :
$b a conference under the auspices of the United Kingdom Automation Council organised by the British Computer Society and the Operational Research Society /
$c edited by C.M. Berners-Lee.
260 __ $a London :
$b English Universities Press,
$c 1965.
300 __ $a x, 149 p. :
$b ill. ;
$c 23 cm.
504 __ $a Includes bibliographical references.
700 1_ $a Berners-Lee, C. M.
4. RDA in MARC21
245 00 $a Models for decision :
$b a conference under the auspices of the United Kingdom Automation Council organised by the British Computer Society and the Operational Research Society /
$c edited by C.M. Berners-Lee.
264 _1 $a London :
$b The English Universities Press Limited,
$c 1965.
264 _4 $c ©1965
300 __ $a x, 149 pages :
$b illustrations ;
$c 23 cm.
336 __ $a text $2 rdacontent
337 __ $a unmediated $2 rdamedia
338 __ $a volume $2 rdacarrier
504 __ $a Includes bibliographical references.
700 1_ $a Berners-Lee, C. M.,
$e editor of compilation.
5. AACR2 in .mrc
00788nam a2200181 a
450000100270000000500170002700800410004402400150008524502100010026000
490031030000320035950400410039165000330043270000230046571000390048871
0003000527710004900557_UCL01000000000000000477125_20061112120300.0_85
0710s1965 enka b 000 0 eng _8 _ax280050495_00_aModels for decision :_ba
conference under the auspices of the United Kingdom Automation Council organised
by the British Computer Society and the Operational Research Society /_cedited by
C.M. Berners-Lee._ _aLondon :_bEnglish Universities Press,_c1965._ _ax, 149 p.
:_bill. ;_c23 cm._ _aIncludes bibliographical references._ 0_aDecision
making_vCongresses._1 _aBerners-Lee, C. M._2 _aUnited Kingdom Automation
Council._2 _aBritish Computer Society._2 _aOperational Research Society (Great
Britain)__
Leader
Directory
Data
245 field, final 710 field
6. What is MARC for?
• Storage
• Exchange and distribution
• Manipulation
• Display
• Input (http://www.aurochs.org/zz/marc_input/marc_input.html)
• "Lingua franca of library cataloguing"
7. Finite Notation Problem
Too many subject schemes
650 _0 for LCSH
650 _1 for LC subject headings for children's literature
650 _2 for MeSH
…
650 _7 Source specified in subfield $2
Not enough indicators
246 184 $aThe title on the spine
8. Data in More Than One Place
Languages
008 (positions 35-37) eng
041 __ $a eng
240 10 $l English
546 __ $a In English.
9. Double Encoding: ISBD and MARC
Blanket : Constellation of Orion, 3.
260 __ $a Blanket $b Constellation of Orion $c 3
260 __ $a Blanket : $b Constellation of Orion, $c 3.
10. Text, Not Data (1)
ISBN
020 __ $a 9780285638976 (pbk.)
020 __ $a 012002618X (ebook)
Title
245 10 $a British goblins :
Place of publication
260 __ $a Köln :
Copyright date
260 __ $c c2005
264 _4 $c ©2002
264 _4 $c copyright 2005
264 _4 $c ℗1983
264 _4 $c phonogram 1993
Extent
300 __ $a ix, 300 p. :
300 __ $a ix, 300 p. ;
300 __ $a ix, 300 p.
Dimensions
300 __ $c 23 cm.
300 __ $c 9 mm.
11. Text, Not Data (2)
ISBN
020 __ $a 9780285638976
020 __ $a 012002618X
Title
245 10 $a British goblins :
Place of publication
260 __ $a Köln :
Copyright date
260 __ $c c2005
264 _4 $c ©2002
264 _4 $c copyright 2005
264 _4 $c ℗1983
264 _4 $c phonogram 1993
Extent
300 __ $a ix, 300 p. :
300 __ $a ix, 300 p. ;
300 __ $a ix, 300 p.
Dimensions
300 __ $c 23 cm.
300 __ $c 9 mm.
12. Text, Not Data (3)
ISBN
020 __ $a 9780285638976 :
020 __ $a 012002618X :
Title
245 10 $a British goblins :
Place of publication
260 __ $a Köln :
Copyright date
260 __ $c c2005
264 _4 $c ©2002
264 _4 $c copyright 2005
264 _4 $c ℗1983
264 _4 $c phonogram 1993
Extent
300 __ $a ix, 300 p. :
300 __ $a ix, 300 p. ;
300 __ $a ix, 300 p.
Dimensions
300 __ $c 23 cm.
300 __ $c 9 mm.
13. Data Mixed Up
GMD
245 10 $a Data on the web
$h [electronic resource] :
$b research and applications /
$c Antonis Bikakis, Adrian Giurca (eds.).
245 10 $a Data on the web
$b research and applications /
$c Antonis Bikakis, Adrian Giurca (eds.).
Nothing allowed after 245$c
245 10 $a Enduring resistance :
$b cultural theory after Derrida /
$c edited by Sjef Houppermans, Rico Sneller, Peter van Zilfhout. = La résistance persévère : la théorie de la culture (d')après Derrida / edité par Sjef Houppermans, Rico Sneller, Peter van Zilfhout.
14. Changing Text as Primary Key for Headings and Authorities
Author heading for deceased person
Niemeyer, Oscar, 1907-
Different preferences for writing name
Mao, Tse-tung, 1893-1976 [Former heading]
Mao, Zedong, 1893-1976
毛泽东, 1893-1976
Small differences could break match
Mao, Zedong, 1893-1976.
Mao, Zedong, 1893-1976
16. Record Not Data
00788nam a2200181 a
450000100270000000500170002700800410004402400150008524502100010026000
490031030000320035950400410039165000330043270000230046571000390048871
0003000527710004900557_UCL01000000000000000477125_20061112120300.0_85
0710s1965 enka b 000 0 eng _8 _ax280050495_00_aModels for decision :_ba
conference under the auspices of the United Kingdom Automation Council organised
by the British Computer Society and the Operational Research Society /_cedited by
C.M. Berners-Lee._ _aLondon :_bEnglish Universities Press,_c1965._ _ax, 149 p.
:_bill. ;_c23 cm._ _aIncludes bibliographical references._ 0_aDecision
making_vCongresses._1 _aBerners-Lee, C. M._2 _aUnited Kingdom Automation
Council._2 _aBritish Computer Society._2 _aOperational Research Society (Great
Britain)__
Leader
Directory
Data
245 field, final 710 field
17. Doesn't Handle FRBR/RDA Well
245 00 $a Models for decision :
$b a conference under the auspices of the United Kingdom Automation Council organised by the British Computer Society and the Operational Research Society /
$c edited by C.M. Berners-Lee.
264 _1 $a London :
$b The English Universities Press Limited,
$c 1965.
264 _4 $c ©1965
300 __ $a x, 149 pages :
$b illustrations ;
$c 23 cm.
336 __ $a text $2 rdacontent
337 __ $a unmediated $2 rdamedia
338 __ $a volume $2 rdacarrier
504 __ $a Includes bibliographical references.
700 1_ $a Berners-Lee, C. M.,
$e editor of compilation.
18. Other Considerations
• Only libraries use MARC
– Libraries tied to library-specific software/processes
– Outside agencies can't take advantage of library data and standards (See Also: RDA not freely available)
• Not even all libraries use MARC
– Archives
– Repositories
– Non-MARC LMSs
20. What's Wrong With MARC?
100 1_ $a Meehan, Thomas.
245 __ $a What's wrong with MARC /
$c Thomas Meehan.
260 __ $a Birmingham :
$b CIG,
$c 2013.
490 1_ $a Linked data
500 __ $a Presentation given at the CILIP CIG event "Linked Data: what cataloguers need to know", Birmingham, England, 25 Nov. 2013.
710 1_ $a Chartered Institute of Library and Information Professionals
(Great Britain). Cataloguing and Indexing Group.
830 _0 $a Linked data.
21. References
• MARC21 Standards http://www.loc.gov/marc/
• MARC21 Bibliographic http://www.loc.gov/marc/bibliographic/ecbdhome.html
• MARC21 Record Structure http://www.loc.gov/marc/specifications/specrecstruc.html
• UKMARC Manual http://www.bl.uk/bibliographic/ukmarc.html
• MARC Must Die / Roy Tennant. (Library Journal, Oct. 2002). http://www.libraryjournal.com/article/CA250046.html
• Report and Recommendations of the U.S. RDA Test Coordinating Committee. Executive Summary http://www.loc.gov/bibliographicfuture/rda/source/rda-execsummary-public-13june11.pdf
Editor's Notes
In the beginning was the index card. This is basically text, designed to be brief but readable by people. The elements are separated by punctuation.
MARC was created essentially to mount that index card onto a computer and share it. It is ideally placed to recreate the index card's data for human consumption. This could be for printing out more index cards, compiling a dictionary catalogue, or even a display screen for an OPAC, at least if you want to recreate the index card layout.
This is the same thing with an RDA record.
It's worth bearing in mind that this is what MARC actually looks like! Most of what we actually see of MARC is an abstraction for display or editing. To get any bit of information out, the data needs to be pulled apart first. At the top is the Leader, which includes some basic information about the record. Next is the Directory, which has information about the fields and their length. The third part is the data itself: note that the MARC field tags are all absent, as are the subfield markers. There are markers there but they are hidden! There is one invisible code to end the record, another to end each field, and a third to mark the beginning of subfields. In the above example, all the hidden codes have been replaced with underscores so you get an idea what's there.
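To make the Leader and Directory a little more concrete, here is a minimal Python sketch, not a full MARC parser. The leader and directory strings are transcribed from the example record on the slide; the names `LEADER`, `DIRECTORY` and `parse_directory` are made up for illustration. It assumes the standard MARC21 layout: a 24-character leader, then 12-character directory entries (3-character tag, 4-digit field length, 5-digit start position).

```python
# Leader and directory transcribed from the example record on the slide.
LEADER = "00788nam a2200181 a 4500"
DIRECTORY = (
    "001002700000" "005001700027" "008004100044" "024001500085"
    "245021000100" "260004900310" "300003200359" "504004100391"
    "650003300432" "700002300465" "710003900488" "710003000527"
    "710004900557"
)


def parse_directory(leader, directory):
    """Return (record_length, base_address, entries).

    Each entry is a (tag, field_length, start_position) tuple, where
    start_position is counted from the base address of the data.
    """
    record_length = int(leader[0:5])   # leader positions 00-04
    base_address = int(leader[12:17])  # leader positions 12-16
    entries = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        entries.append((entry[0:3], int(entry[3:7]), int(entry[7:12])))
    return record_length, base_address, entries


record_length, base_address, entries = parse_directory(LEADER, DIRECTORY)
for tag, length, start in entries:
    print(tag, length, start)
```

Only once this lookup table has been read can a program find, say, the 245 field: it is the 210 characters starting 100 characters after the base address (181 in this record).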
What is MARC for? Is it used for these things anymore?
Storage: rarely.
Exchange and distribution: this is where MARC really lives on, e.g. to download records from others via Z39.50.
Manipulation: to manipulate MARC, you have to transform it to something else, e.g. in MarcEdit, or do bad things to it. For example, indexing and normalizing involve dealing with lots of punctuation, which I'll deal with more in a moment.
Display: designed for traditional catalogue displays and standards, i.e. AACR2. However, the MARC and ISBD card index structure is more often hidden behind a labelled display. In discovery systems, this can be even more abstracted.
Input: MARC fields are clearly used for inputting bibliographic data although it's unclear why they have to be. Rob Styles once recounted an anecdote where cataloguers were offered a simpler non-MARC interface and they didn't like it! As RDA is more element-based and takes out the ISBD, there are possibilities of this being more acceptable in the future. See RIMMF.
Lingua franca: this is another Rob Styles phrase, which has afflicted many of us, especially those of us who talk to non-cataloguers. We talk of the 245a rather than the title proper. Some non-cataloguers barely believe that MARC replaced AACR2.
Many of the following are rather faults with rules and practices, or even with MARC21 over, say, UKMARC. Again, many non-cataloguers conflate MARC21 with other cataloguing practices.
This will be familiar if you know the deficiencies of Dewey, which is constrained by its notation. Answer: SuperMARC!
Admittedly, contexts can vary here, but do note the variety of methods for formatting the content too: codes, controlled full-language, language in free text.
A particularly MARC21 problem. First, ISBD punctuation only. Second, MARC only. Both of these make sense on their own. However, we put them back together! Third, MARC and ISBD. This makes human editing harder than it needs to be and automatic processing a real headache.
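As a rough illustration of the headache, this is the kind of clean-up a program has to do before it can use the third version of the slide example. It is only a sketch: `strip_isbd` is a made-up helper, and the list of punctuation marks is an assumption based on the examples in these slides rather than a complete statement of ISBD.

```python
import re

# Trailing ISBD separators seen in the slide examples:
# " :", " ;", " /", " =", "," and "."
_ISBD_TAIL = re.compile(r"\s*[:;/=,.]\s*$")


def strip_isbd(value):
    """Remove one trailing ISBD punctuation mark from a subfield value."""
    return _ISBD_TAIL.sub("", value)


# The "MARC and ISBD" version of the 260 field from the slide.
field_260 = [("a", "Blanket :"), ("b", "Constellation of Orion,"), ("c", "3.")]
cleaned = [(code, strip_isbd(value)) for code, value in field_260]
print(cleaned)
```

Even this is fragile: a final full stop is sometimes ISBD punctuation and sometimes part of the data (an abbreviation, say), and the code has no way of telling which.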
Many MARC data elements contain text. Especially when isolated, they become harder to make sense of. In the ISBN examples, qualifiers are included in the subfield. With the new subfield $q, this is at least partially solved…
…as this data is moved elsewhere \o/ …although
…the pricing and availability information still needs punctuation in front of it. 8-( The title and place of publication both have extraneous punctuation after them too. This is important as the place of publication is not "Köln :", it is Köln, or possibly Cologne, or the place we call Cologne but the Germans call Köln (unless you speak the local dialect: Kölle) and the Romans called Colonia Claudia Ara Agrippinensium, and so on. The extent example shows the punctuation for an element divorced from the element it represents. This has to be cleaned out before any processing as to its meaning can take place. Even then, the units in this and the dimensions examples are mixed in with the numbers and can be very variable.
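To show what "text, not data" means in practice, here is a small Python sketch that tries to pull actual values out of two of the slide examples. The helper names `split_isbn` and `split_dimensions` are invented for illustration, and the patterns only cover the forms shown on the slides; real records are far more variable.

```python
import re


def split_isbn(subfield_a):
    """Split an 020 $a value into (isbn, qualifier or None)."""
    match = re.match(r"\s*([0-9]{9,12}[0-9Xx])\s*(?:\((.*?)\))?", subfield_a)
    if match is None:
        return None, None
    return match.group(1), match.group(2)


def split_dimensions(subfield_c):
    """Split a 300 $c value such as '23 cm.' into (number, unit)."""
    match = re.match(r"\s*(\d+)\s*([a-z]+)", subfield_c)
    if match is None:
        return None, None
    return int(match.group(1)), match.group(2)


print(split_isbn("9780285638976 (pbk.)"))
print(split_isbn("012002618X (ebook)"))
print(split_isbn("9780285638976 :"))
print(split_dimensions("23 cm."))
print(split_dimensions("9 mm."))
```

None of this would be needed if the number, the qualifier and the unit were recorded as separate pieces of data in the first place.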
In the first example, some data about the media type is stuck in the middle of the title! Whatever the benefits of the GMD, this is not a good place to put it. In the third example, a lot of fine-grained data is essentially dumped in a single box and it would be very hard to retrieve it. Clearly in this case, the record is designed only to provide display.
In the first example, poor Mr Niemeyer has passed away, so the heading changes. This is not helpful for linking or matching. Not to mention the huge maintenance issues. In the second case, preferences change, or different communities might want to display different strings to users. The third example shows the kind of small textual issues that can break matching. An LMS indexing these headings has to process them to work out that they match.
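The sort of processing an LMS has to do for the third example might look something like the following. This is a hypothetical sketch, not how any particular system works: `match_key` is a made-up function and the normalisation steps are assumptions chosen to cope with the slide examples.

```python
import unicodedata


def match_key(heading):
    """Reduce a heading string to a crude key for comparison."""
    text = unicodedata.normalize("NFKC", heading)
    text = text.strip().rstrip(".,;:").strip()  # drop trailing punctuation
    text = " ".join(text.split())               # collapse runs of whitespace
    return text.casefold()


# Small textual differences no longer break the match...
print(match_key("Mao, Zedong, 1893-1976.") == match_key("Mao, Zedong, 1893-1976"))

# ...but a changed form of name still does.
print(match_key("Mao, Tse-tung, 1893-1976") == match_key("Mao, Zedong, 1893-1976"))
```

Tidying the text helps with stray full stops, but no amount of string cleaning will connect the former heading to the current one: that needs something other than the text itself as the key.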
Both these 700s express a relationship of some kind. But what? RDA, and the apparently enthusiastic adoption of relationship designators, have at least done something to address this with the $e. (Although note the weird comma between the heading and the relationship designator!)
To do anything, you have to take the MARC file to bits. One piece of information will not stand alone.
MARC is record based, and that record is a Manifestation record. However, Expression level elements are spread throughout (in red). Work and Expression level elements are implied and not explicit (e.g. title). There are internal relationships here: we know Berners-Lee is an expression level relationship merely by the relationship: the 700 and the $e otherwise give nothing away. Think of the language example given above: machine-readable expression information is all over the place.
"Many survey respondents expressed doubt that RDA changes would yield significant benefits without a change to the underlying MARC carrier."
Most felt any benefits of RDA would be largely unrealized in a MARC environment. MARC may hinder the separation of elements and ability to use URIs in a linked data environment.
"While the Coordinating Committee tried to gather RDA records produced in schemas other than MARC, very few records were received."
"Demonstrate credible progress towards a replacement for MARC"
-- Report and Recommendations of the U.S. RDA Test Coordinating Committee. Executive Summary
e.g. Primo converts all data, even MARC, into something else before processing.
Eleven years ago, Roy Tennant addressed MARC in particular, recognising that AACR2 and MARC are so closely entwined and focussed on the card catalogue.
- Unreadably esoteric format
- Granularity, e.g. authors' first and last names all in a string. Roles hidden within text in the title field!
- Extensibility, e.g. adding contents notes or additional content
- Clumsy handling of different scripts
- Technical marginalisation, i.e. the fact that only libraries use MARC limits us to niche vendors
"With the advent of the web, XML, portable computing, and other technological advances, libraries can become flexible, responsive organizations that serve their users in exciting new ways. Or not. If libraries cling to outdated standards, they will find it increasingly difficult to serve their clients as they expect and deserve."
Eleven years later, not much has changed.
Thank you. I mentioned that RDA, despite its sins, does make element-based, non-ISBD, non-MARC cataloguing more possible. Celine will now talk about RIMMF.