[OLAC-credits] Same word, multiple languages

Kelley McGrath kelleym at uoregon.edu
Sun Dec 15 16:25:35 PST 2013


Hi Alex,

The short answer is that as long as the meaning of "editor" is the same in English and Spanish, it doesn't really matter.

My gut answer is that since editors are usually given in 508 notes, and those notes are usually in English, odds are that it's English.

The long answer depends on what we are hoping to do with this information.

1. As a side effect of the project, we hope to build a multilingual dictionary of words that are actually used in moving image credits, which could be a resource for catalogers and others. There are some resources out there (http://home.snafu.de/ohei/ofd/moviedict_e.html), but I think this would be a useful addition. This is one reason why we ask for literal translations (they'll make more sense in a dictionary format). For training the computer, the translation really only has to reflect the correct category. We'll lump together all the words that identify a director (directed, direction, directing, director, Regie, kantoku) into a group. For this, we really don't even need to know what language Regie is, just that the computer should use it to mark a name as a director. We do want to be able to make groupings that distinguish assistant directors or stage directors from directors, so for those we do need a specific translation.

2. Patterns found here may be useful in helping the computer identify the language of the credit and narrow down or prioritize the list of role words to look for. In the annotator, we provide what we think is the primary language of the movie (whatever is in 008 lang), but that doesn't necessarily say anything about the language of the credit. Credits given in notes are usually, but not always, translated into English. Credits for films in one language may be in another language. For example, some films in Arabic or African languages in our sample have credits in French.

It is hard for the computer to identify the language of very short passages, especially sentence fragments. If we know what movie languages commonly go with credits in a certain language, we may be able to make things more efficient for a computer.
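To make the two ideas above a little more concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not how the project's software actually works: the names ROLE_WORDS, role_category, and rank_credit_languages are hypothetical, the word list is a tiny sample, and the annotation pairs are made up rather than taken from our data. The first part maps role words to a category without recording their language; the second counts which credit languages co-occur with each movie language so the likelier ones can be tried first.

```python
from collections import Counter, defaultdict

# Hypothetical lexicon: role word (lowercased) -> category.
# The language of each word is deliberately not recorded, since the
# grouping only needs to know which category a word signals.
ROLE_WORDS = {
    "directed": "director",
    "direction": "director",
    "directing": "director",
    "director": "director",
    "regie": "director",
    "kantoku": "director",
    "assistant director": "assistant director",
    "stage director": "stage director",
    "editor": "editor",  # same category whether the word is English or Spanish
}


def role_category(credit_word):
    """Return the category for a role word, or None if it is not in the lexicon."""
    return ROLE_WORDS.get(credit_word.strip().lower())


def rank_credit_languages(annotations):
    """Rank credit languages for each movie language by how often they co-occur.

    `annotations` is an iterable of (movie_language, credit_language) pairs.
    """
    counts = defaultdict(Counter)
    for movie_lang, credit_lang in annotations:
        counts[movie_lang][credit_lang] += 1
    return {
        movie_lang: [lang for lang, _ in counter.most_common()]
        for movie_lang, counter in counts.items()
    }


# Made-up annotation pairs, for illustration only (not project data).
sample_annotations = [
    ("ara", "fre"),
    ("ara", "fre"),
    ("ara", "eng"),
    ("spa", "eng"),
]

print(role_category("Regie"))
print(rank_credit_languages(sample_annotations)["ara"])
```

With a ranking like this, the computer could look for role words in the most likely credit languages first instead of guessing the language from a very short fragment.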

Kelley

________________________________
From: olac-credits-bounces at lists.uoregon.edu on behalf of Kyrios, Alex (akyrios at uidaho.edu)
Sent: Friday, December 13, 2013 1:51 PM
To: olac-credits at lists.uoregon.edu
Subject: [OLAC-credits] Same word, multiple languages

How should we treat a credit when the language is indeterminate? Specifically, I’m getting “editor” in some Spanish records. If that’s the entirety of the credit, there’s no way to tell whether it’s English or Spanish. I could say “I don’t know what the language is,” but I know it’s one of the two. I think I was coding these as English until I realized that wasn’t necessarily correct.

Alex Kyrios
Metadata and Catalog Librarian
University of Idaho
208-885-2513
akyrios at uidaho.edu
