Taxonomic Prediction with Tree-Structured Covariances

Matthew B. Blaschko, Wojciech Zaremba, and Arthur Gretton

Overview:

Taxonomies have been proposed numerous times in the literature to encode semantic relationships between classes. Such taxonomies have been used to improve classification results by increasing the statistical efficiency of learning, as similarities between classes can be used to increase the amount of relevant data during training. In this paper, we show how data-derived taxonomies may be used in a structured prediction framework, and compare the performance of learned and semantically constructed taxonomies. Structured prediction in this case is multi-class categorization with the assumption that categories are taxonomically related. We make three main contributions: (i) we prove the equivalence between tree-structured covariance matrices and taxonomies; (ii) we use this covariance representation to develop a computationally efficient optimization algorithm for structured prediction with taxonomies; (iii) we show that taxonomies learned from data using the Hilbert-Schmidt Independence Criterion (HSIC) often perform better than imputed semantic taxonomies.
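
To make the link between taxonomies and covariance matrices in contribution (i) concrete, the following is a minimal Python sketch. It is not taken from the released MATLAB code. It assumes the common definition of a tree-structured covariance, in which the covariance between two leaf classes is the total length of the edges shared by their root-to-leaf paths. The function name tree_covariance, the toy taxonomy, and the edge lengths are all hypothetical and chosen only for illustration.

```python
def tree_covariance(parent, edge_length, leaves):
    """Covariance implied by a rooted tree with non-negative edge lengths.

    Assumed definition: C[i][j] is the total length of the edges shared by
    the root-to-leaf paths of leaves i and j.
    """
    def path_edges(node):
        # Each edge is identified by its child node; walk up to the root.
        edges = set()
        while parent[node] is not None:
            edges.add(node)
            node = parent[node]
        return edges

    paths = [path_edges(leaf) for leaf in leaves]
    return [[sum((edge_length[e] for e in p & q), 0.0) for q in paths]
            for p in paths]


# Hypothetical toy taxonomy (illustration only):
#   root -> animal -> {cat, dog}
#   root -> vehicle -> {car}
parent = {"root": None, "animal": "root", "vehicle": "root",
          "cat": "animal", "dog": "animal", "car": "vehicle"}
# Length of the edge entering each non-root node (made-up values).
edge_length = {"animal": 2.0, "vehicle": 1.0,
               "cat": 1.0, "dog": 0.5, "car": 3.0}
leaves = ["cat", "dog", "car"]

C = tree_covariance(parent, edge_length, leaves)
```

In this toy tree, cat and dog share only the edge into animal, so their covariance is 2.0, while each has zero covariance with car because their paths share no edge below the root.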

Figure: Learned VOC 2007 taxonomy.
Figure: Learned Oxford Flowers taxonomy.

Code:

MATLAB source code is available in a Git repository: https://github.com/blaschko/tree-structured-covariance. To begin, download the repository and run Demo.m.

The first time Demo.m is run, it calls a Unix shell script that downloads the Oxford Flowers dataset, and then runs a simple experiment on that data.

Data:

Three main data sets were used in the experiments reported in this paper: PASCAL VOC 2007 [3], Oxford Flowers [4], and WIPO-alpha [5]. Each is available for download from its respective webpage.

References:

  1. Blaschko, M. B., W. Zaremba, and A. Gretton: Taxonomic Prediction with Tree-Structured Covariances. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), 2013.
  2. Blaschko, M. B. and A. Gretton: Learning Taxonomies by Dependence Maximization. Neural Information Processing Systems (NIPS), 2008.
  3. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes (VOC) challenge. IJCV 88(2) (2010) 303-338
  4. Nilsback, M.E., Zisserman, A.: Delving deeper into the whorl of flower segmentation. Image and Vision Computing (2009)
  5. World Intellectual Property Organization: WIPO-alpha data set, http://www.wipo.int/ (2009)