Forum Post: Re: Speaker id labels
Posted by jcarletta at 2009-03-03 15:55
There are a few ways to extract information from an NXT format corpus into some other format. In increasing order of difficulty, and suited to increasingly difficult kinds of data extraction:
- Use FunctionQuery, a command line interface to NXT, to extract what you want into a tab-delimited format, and further process that.
- If the information you want is all in one file, use normal XML processing.
- Write a Java program that uses the NOM API to load and traverse the data, writing the output that you want.
You can find more information about these techniques in the NXT documentation, particularly http://www.ltg.ed.ac.uk/NITE/search/search-methods.html and http://www.ltg.ed.ac.uk/NITE/search/data-processing.html.
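
As a rough illustration of the second option (everything you need is in one XML file), here is a minimal Python sketch using only the standard library. The element name `segment` and the attributes `id`, `speaker`, `starttime` and `endtime` are made up for this example, as is the sample data; a real NXT file will use whatever names its corpus metadata defines, and its attributes may be namespaced, so adjust accordingly.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: the element and attribute names are
# illustrative assumptions, not taken from any particular NXT corpus.
SAMPLE_XML = """<segments>
  <segment id="s1" speaker="A" starttime="0.0" endtime="1.2"/>
  <segment id="s2" speaker="B" starttime="1.2" endtime="2.5"/>
</segments>"""


def segments_to_tsv(xml_text):
    """Return one tab-delimited line per <segment> element."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # iter() walks the whole tree, so nesting depth does not matter.
    for seg in root.iter("segment"):
        writer.writerow([
            seg.get("id"),
            seg.get("speaker"),
            seg.get("starttime"),
            seg.get("endtime"),
        ])
    return out.getvalue()


if __name__ == "__main__":
    print(segments_to_tsv(SAMPLE_XML), end="")
```

This only works when a single file holds all the information. Once what you want is spread across several files that point at each other, FunctionQuery or the NOM API is likely the better route.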