Por favor, use este identificador para citar o enlazar este ítem: http://inaoe.repositorioinstitucional.mx/jspui/handle/1009/764
Algorithm and hardware architecture for the discovery of frequent sequences
OSVALDO NAVARRO GUZMAN
RENE ARMANDO CUMPLIDO PARRA
LUIS VILLASEÑOR PINEDA
Acceso Abierto
Atribución-NoComercial-SinDerivadas
Sequences
Data mining
Pattern recognition
Hardware-software codesign
Sequential Pattern Mining is a widely addressed problem in data mining, with applications such as analyzing Web usage, automatic text reuse detection, analyzing purchase behavior, among others. Nevertheless, with the dramatic increase in data volume, the current approaches result inefficient when dealing with large input datasets, a large number of different symbols and low minimum supports. We propose a new sequential pattern mining algorithm, which follows a pattern growth scheme to discover frequent patterns, that is, by recursively growing an already known frequent pattern p using frequent symbols from the projected database with respect to p. Our algorithm only maintains in memory a structure of the pseudo-projections and the symbols required for the algorithm in case it has to go back and try to grow a pattern with another valid element. Also, we propose a hardware architecture that implements the processes of generating pseudo-projection databases and finding frequent elements from a projection database, which comprehends the most costly operations of our algorithm, in order to accelerate its running time. Experimental results showed that our algorithm has a better performance and scalability, in comparison with the UDDAG and PLWAP algorithms. Moreover, a performance estimate showed us that our hardware architecture significantly reduces the running time of our proposed algorithm. To our knowledge, this is the first hardware architecture that tackles the problem of sequential pattern mining.
Instituto Nacional de Astrofísica, Óptica y Electrónica
2012
Tesis de maestría
Inglés
Estudiantes
Investigadores
Público en general
Navarro-Guzman O.
CIENCIA DE LOS ORDENADORES
Versión aceptada
acceptedVersion - Versión aceptada
Aparece en las colecciones: Maestría en Ciencias Computacionales

Cargar archivos:


Fichero Tamaño Formato  
NavarroGO.pdf1.04 MBAdobe PDFVisualizar/Abrir