SAXually Explicit Images: Data Mining Large Shape Databases

Google TechTalks
May 12, 2006

Eamonn Keogh

ABSTRACT
The problem of indexing large collections of time series and images has received much attention in the last decade; however, we argue that there is potentially great untapped utility in data mining such collections. Consider the following two concrete examples of data mining problems.

Motif Discovery (duplication detection): Given a large repository of time series or images, find approximately repeated patterns/images.

Discord Discovery: Given a large repository of time series or images, find the most unusual time series/image.
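
To make the two definitions concrete, here is a minimal brute-force sketch. It is not the method presented in the talk. It assumes that every item has already been reduced to an equal-length numeric sequence, that similarity is measured by Euclidean distance, and that "most unusual" means the item whose nearest neighbor is farthest away. The function names and the toy repository are hypothetical. Both searches compare all pairs, which is quadratic in the number of items and practical only for small collections.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_motif(series):
    """Return the index pair (i, j) of the two most similar items."""
    pairs = combinations(range(len(series)), 2)
    return min(pairs, key=lambda p: euclidean(series[p[0]], series[p[1]]))


def find_discord(series):
    """Return the index of the item farthest from its nearest neighbor."""
    def nearest_neighbor_distance(i):
        return min(
            euclidean(series[i], series[j])
            for j in range(len(series))
            if j != i
        )
    return max(range(len(series)), key=nearest_neighbor_distance)


# Hypothetical toy repository: items 0 and 1 are near-duplicates,
# item 3 resembles nothing else.
repo = [
    [0.0, 1.0, 0.0, -1.0],
    [0.1, 1.0, 0.0, -1.0],
    [0.0, 0.5, 1.0, 0.5],
    [5.0, -5.0, 5.0, -5.0],
]

motif = find_motif(repo)      # pair of approximately repeated items
discord = find_discord(repo)  # most unusual item
print(motif, discord)
```

On this toy repository the motif is the pair of near-duplicates (items 0 and 1) and the discord is item 3.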

As we will show, both these problems have applications in fields as diverse as anthropology, crime…