Disadvantages-Synchronizing with data source

Last edited by GregReser 14 years, 10 months ago

From Jan Eklund, UC Berkeley:

 

I have big concerns about embedding descriptive metadata because of the data refresh and update issue.  As much as I would like to have full-blown Core4 XML embedded in an image (saving me the work of creating it), my experience tells me that this is not a good idea because mistakes happen to the best of us, new information comes our way all the time, and the thought of millions of images being released with conflicting embedded metadata just looks like a huge headache to me.  The only thing I would embed at this point is the image rights info and a PID to the descriptive metadata.  Anything more runs the risk of stale metadata in uncontrolled distribution.
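A minimal sketch of the "rights info and a PID only" approach described above. The field names (`pid`, `rights`, `work_title`, `work_type`) and the sample values are hypothetical; the discussion does not define a schema, and writing the result into an actual image file is left out.

```python
# Hypothetical field names: the only fields allowed to travel with the image.
ADMINISTRATIVE_FIELDS = ("rights", "pid")


def embeddable_subset(record: dict) -> dict:
    """Keep only the administrative fields; drop all descriptive metadata."""
    return {key: record[key] for key in ADMINISTRATIVE_FIELDS if key in record}


# Placeholder record; the PID and rights statement are made up for illustration.
record = {
    "pid": "example-pid-0001",
    "rights": "(c) Example rights holder",
    "work_title": "Alexander Mosaic",
    "work_type": "mosaic (visual work)",
}

embedded = embeddable_subset(record)
print(embedded)
```

Because the descriptive fields never leave the source database, a later correction to the work record changes nothing in the copies already distributed; only the PID has to keep resolving.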

 

Here's a real-world example of what I mean about stale metadata. Even though our metadata and images are both being OAI harvested for our ARTstor institutional collection (which _should_ mean that our ARTstor data is updated regularly), we've found outdated AAT terms still in our ARTstor data records, because while the term_id changed in our local work_AAT term table, the data record that referenced that term was not updated.  So even though our local website data record now says "plan (orthographic projection)", the same record in ARTstor still says "plan (drawing)".  This is a very small problem that we just caught and can readily solve.  But imagine if that original metadata had been embedded in the image and someone downloaded it from ARTstor and shipped it off somewhere else with only the stale embedded metadata in the image.

Imagine the confusion of an end-user who got an image that said one thing and metadata from a PID that said something radically different.  Anyone with half a brain cell could probably figure out the difference between plans as orthographic projections and plans as drawings, but what if we had made a mistake in the work-image link (that has happened too) and embedded _completely wrong_ information in the image itself.  Even with a dataDate timestamp embedded in the image, who's going to bother to check to see if that data has been updated since it was embedded?  Not me.  Just another layer of overhead, even for a computer.
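For illustration, the check that nobody is going to bother with might look roughly like this. It is only a sketch: it assumes the embedded metadata carries an ISO 8601 `dataDate`, that the record behind the PID exposes a `lastModified` value, and that both have already been fetched into dictionaries. The names `lastModified`, `workType`, `is_stale`, and `changed_fields`, along with the dates and PID, are hypothetical.

```python
from datetime import datetime


def is_stale(embedded: dict, source: dict) -> bool:
    """True if the source record was modified after the metadata was embedded."""
    embedded_at = datetime.fromisoformat(embedded["dataDate"])
    modified_at = datetime.fromisoformat(source["lastModified"])
    return modified_at > embedded_at


def changed_fields(embedded: dict, source: dict,
                   ignore=("dataDate", "lastModified")) -> list:
    """Names of fields present in both records whose values now differ."""
    return sorted(
        key for key in embedded
        if key not in ignore and key in source and embedded[key] != source[key]
    )


# Hypothetical records mirroring the AAT term example; the dates are made up.
embedded = {
    "pid": "example-pid-0001",
    "dataDate": "2009-01-15T00:00:00+00:00",
    "workType": "plan (drawing)",
}
source = {
    "pid": "example-pid-0001",
    "lastModified": "2010-02-01T00:00:00+00:00",
    "workType": "plan (orthographic projection)",
}

if is_stale(embedded, source):
    print("stale fields:", changed_fields(embedded, source))
```

Even this small check needs a round trip to the source for every image, which is exactly the extra layer of overhead in question.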

 

 

Another experiment we have tried with the stuff in the Media Vault is dumping all the descriptive metadata text into the IPTC Description field.  This allows us to search whatever metadata we have, regardless of how complete or incomplete it may be.  But it takes a manual procedure to refresh that data, and frankly it doesn't work very well.  If this descriptive metadata were dead wrong (for example, an incorrect work_id-image link, which also happens more often than one would like) and someone downloaded the image, all that wrong data would travel with it, and the potential for confusion is great.  There is also nothing in that descriptive metadata that says where the image came from, so there is no way even to contact the image creator.

I still think embedding administrative metadata that takes you to another place where you might find more information about the object in the image makes more sense than embedding the descriptive work metadata.  As for orphaned PIDs, all I can say is that nothing in the digital world is really _permanent_, so this does not surprise me.  I suppose you could also include an actionable hypertext reference for the rights holder (which probably runs a greater risk of being orphaned), which might at least get you to the person who created the image in the first place.  For copy photography you could include the image source, but that doesn't always get you the information you need.  Howard Brainen has done some interesting work on embedding technical metadata in TIFF headers that says who created the file, how it was captured, and its intended use (i.e. use this TIFF to create derivatives, archive this raw scan, etc.).  This could also be useful for tracking down the image owner/creator.

I'm still not convinced embedding descriptive metadata is a good idea.  But hey, I may change my mind if I can be convinced that there is some way to track and refresh the embedded metadata.

 

example image:

 

IPTC Caption/Abstract = Work title: Alexander Mosaic Larger entity title: House of the Faun (VI,xii,2,5,7) Work culture: Roman Dates: mosaic (visual work): created late 2nd century BCE Pompeii. Repository: Naples: Museo Archeologico Nazionale Geographic location/provenance: Created Pompeii, Italy, Europe Work type: mosaic (visual work) Work view: Close detail of soldier Work description: Floor mosaic from the House of the Faun, Pompeii, depicting the battle of Alexander and Darius III at Issus Subject: chariots (ancient vehicles); horses; spears (weapons); War; Alexander the Great (Alexander III, 336-323 BCE); battles of Alexander; Alexander fighting against Darius at Issus; mosaics (visual works); Mosaics, Roman--Italy; mosaics (visual works) Dimensions: 3.13 m(H) x 5.82 m(W) Classification: Roman Architecture--Pompeii--Private spaces, monuments, and buildings--houses--Faun, House of the--mosaics (removed) Call number: 32.18-525.471k Accession number: 100006 Image source: Bernard Andreae, Das Alexandermosaik (Stuttgart: Reclam, 1967), pl. 9 Label2: POMPEII Label3: House of the Faun Label4: Alexander mosaic, detail of soldier Label5: Before 79 CE Label6: Naples: MAN 10020 Label7: Andreae: Das Alexandermosaik, pl. 9 Requester: Eklund, Janice
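A rough sketch of how a caption like the one above could be produced from a record, and what it takes to pull the fields back out afterwards. The labels and values are copied from the example; the functions `flatten_record` and `split_caption` are hypothetical, and the sketch deals only with the caption string, not with writing it into an image file.

```python
import re


def flatten_record(record: dict) -> str:
    """Join every 'label: value' pair into one searchable caption string."""
    return " ".join(f"{label}: {value}" for label, value in record.items())


def split_caption(caption: str, labels) -> dict:
    """Recover fields from a flattened caption, given the list of known labels."""
    # Longest labels first so a shorter label never shadows a longer one.
    alternation = "|".join(
        re.escape(label) for label in sorted(labels, key=len, reverse=True)
    )
    pattern = re.compile(rf"({alternation}):\s*")
    matches = list(pattern.finditer(caption))
    fields = {}
    for i, match in enumerate(matches):
        # A value runs from the end of its label to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(caption)
        fields[match.group(1)] = caption[match.end():end].strip()
    return fields


record = {
    "Work title": "Alexander Mosaic",
    "Work culture": "Roman",
    "Repository": "Naples: Museo Archeologico Nazionale",
    "Work type": "mosaic (visual work)",
    "Work view": "Close detail of soldier",
}

caption = flatten_record(record)
recovered = split_caption(caption, record.keys())
print(recovered["Repository"])
```

Splitting only works if the reader already knows the label list, since values such as "Naples: Museo Archeologico Nazionale" contain colons of their own. The flattened text is searchable, but it carries no structure, no source, and no date, which is part of why it goes stale unnoticed.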
