The Orchive: A system for semi-automatic annotation and analysis of a large collection of bioacoustic recordings

Date

2013-12-23

Authors

Ness, Steven

Abstract

Advances in computer technology have enabled the collection, digitization and automated processing of huge archives of bioacoustic sound. Many of the tools previously used in bioacoustics work well with small to medium-sized audio collections, but are challenged when processing large collections ranging from tens of terabytes to petabytes. This thesis presents a system that helps researchers listen to, view and annotate these audio recordings, and run advanced audio feature extraction and machine learning algorithms on them. The system is designed to scale to petabyte size. In addition, it allows citizen scientists to participate in annotating these large archives using a casual game metaphor. The thesis evaluates the use of this system to annotate a large audio archive called the Orchive. The Orchive contains over 20,000 hours of orca vocalizations collected over the course of 30 years, and represents one of the largest continuous collections of bioacoustic recordings in the world. The effectiveness of our semi-automatic approach for deriving knowledge from these recordings is evaluated, and results demonstrating the utility of the system are presented.

Keywords

Bioacoustics, Machine learning, Orca, Human Computer Interaction, Citizen Science
