
The AMI Meeting Corpus is a multi-modal data set consisting of 100 hours of meeting recordings


FAQ

last modified 2009-03-09 12:55

Frequently Asked Questions


  • How do I access the AMI corpus?
    Log in or register here for free, then go to the "Access AMI corpus" section of the menu.
  • Why do I need to register?
    We need to be able to argue that creating the data was time and money well spent, so we need enough information about who uses it to make that case.
  • Where is the documentation?
    Full documentation of the corpus is available from the Documentation section of the menu.
  • Can I browse the data without downloading it?
    Yes. First log in or register here for free, then go to the Browse section to browse the AMI corpus online.
  • What annotations have been performed?
    All the documentation about the annotations is available in the Documentation section.
  • Can I download data?
    Yes. First log in or register here for free, then go to the Download section. That page can be used to download everything except the high resolution videos - that is, audio, reduced video, transcription and annotations. The chooser lets you specify which parts of the data you want, to reduce the download size.
  • Is there a manifest of the download files?
    You can find out which files should be in your download by looking at the wget script you used to obtain them - it lists each file separately. View it in any text editor. Some files that you might expect in a complete corpus might not be present; the HTML page you reach after asking for the download wget script explains any omissions.
  • Can I download everything there is?
    No. The full corpus contains high resolution videos that can only be obtained by requesting a FireWire hard disk drive. Information about this is available here; to see it, you need to log in or register here for free.
  • I can't find the transcriptions - where are they?
    They're in with the annotations, in a zip file on the Download page.
  • What consent form have meeting participants signed?
    The original text of the AMI consent form is in 041119-AMI_Meetings_Corpus_Consent_Formjp.pdf.
  • What is the best paper to reference if I use the corpus?
    Unless there is a specific publication detailing the aspects of it that you are using, the best compromise between being easy to find and being high-level is: Carletta, J. (2007) Unleashing the killer corpus: experiences in creating the multi-everything AMI Meeting Corpus. Language Resources and Evaluation Journal 41(2): 181-190.
  • How can I get a printable transcription?
    One way is to call up an NXT interface that has a transcription view that you like - you can print from NXT. To get a simpler transcription view as tab-delimited data, use NXT's FunctionQuery utility at the command line. After setting up your Java CLASSPATH to find NXT's programs, you can run, for example,

    java FunctionQuery -o IB4010 -c AMI-metadata.xml -q '($s segment)' -atts '$s@starttime' '$s@who' '$s'

    to get a very simple transcription for meetings that have only transcription and no annotation, and, for example,

    java FunctionQuery -o IS1004c -c AMI-metadata.xml -q '($s dact)' -atts '$s@starttime' '$s@who' '@extract(($t da-type):$s >"da-aspect" $t, gloss)' '$s'

    to get a slightly fancier transcription with one line per dialogue act, for coded meetings. For help with FunctionQuery, see the NXT documentation.
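
If you want to tidy the tab-delimited output before printing it, a small script can help. The following is only a rough sketch in Python, not part of NXT or the corpus tools. It assumes the FunctionQuery output has been saved as text with one row per line, that the columns appear in the order given to -atts (start time first, speaker second, everything else after), and that start times are plain numbers. The names Line, parse_rows and render, and the sample rows, are made up for illustration.

```python
import io
from typing import Iterable, List, NamedTuple


class Line(NamedTuple):
    start: float   # first column: the starttime attribute (assumed numeric)
    speaker: str   # second column: the who attribute
    text: str      # remaining columns, joined with spaces


def parse_rows(lines: Iterable[str]) -> List[Line]:
    """Turn tab-delimited rows into Line records, sorted by start time.

    Rows with fewer than three columns, or with a non-numeric first
    column, are skipped rather than treated as errors.
    """
    rows = []
    for raw in lines:
        fields = raw.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue
        try:
            start = float(fields[0])
        except ValueError:
            continue
        # Any extra columns (for example a dialogue act gloss) are kept
        # in front of the words, separated by spaces.
        text = " ".join(fields[2:]).strip()
        rows.append(Line(start, fields[1].strip(), text))
    rows.sort(key=lambda r: r.start)
    return rows


def render(rows: Iterable[Line]) -> str:
    """Format records as one printable line each."""
    return "\n".join(f"{r.start:.2f} {r.speaker}: {r.text}" for r in rows)


# Invented sample rows, not real corpus data.
SAMPLE = "12.50\tA\thello there\n3.0\tB\tgood morning\n\nnot a data row\n"
print(render(parse_rows(io.StringIO(SAMPLE))))
```

To use it on real output, pass an open file instead of the sample, for example parse_rows(open("transcript.txt")), where transcript.txt is a hypothetical file holding the saved FunctionQuery output.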
 