Please use this identifier to cite or link to this item: http://inaoe.repositorioinstitucional.mx/jspui/handle/1009/109
Mining interesting frequent patterns in a single graph using inexact matching
MARISOL FLORES GARRIDO
JESUS ARIEL CARRAZCO OCHOA
Acceso Abierto
Atribución-NoComercial-SinDerivadas
Pattern
Classification
Databases
Algorithm theory
Frequent graph pattern mining is an important problem in diverse research fields but, until recently, algorithms that focus on the problem had only used graph isomorphism to identify occurrences of a given pattern. In the last years, however, a few works have focused on the case where a pattern could differ from its occurrences, which is a scenario that corresponds better to reality and that could be important, for example, to analyze noisy data. Although these algorithms allow differences in labels and structural differences in edges, none of them considers structural differences in vertices. How can we identify occurrences that differ by one (or several) nodes from the pattern they represent? Our work approaches the problem of frequent graph pattern mining with three main characteristics. First, we use inexact matching, allowing structural differences in both edges and vertices. Second, we focus on the problem of mining patterns in a single graph, a problem that has been less explored than the case in which patterns are mined from a graph collection. Third, instead of mining the whole set of frequent patterns, we are interested in obtaining an interesting subset of patterns, thus lessening redundancies and facilitating the analysis of the output set. We introduce the algorithm AGraP, which, by allowing patterns that can have structural differences, in vertices as well as in edges, respect to their occurrences, is able to find patterns missed by other state of the art algorithms. Then, we introduce the algorithms CloseAFG, MaxAFG and IntAFG that, by focusing on closed, maximal or interesting patterns, respectively, are able to reduce the amount of mined patterns by AGraP, while retaining new potentially useful information found by allowing structural differences in vertices. Finally, we show, through experimental results, that the "extra" patterns found by our proposed algorithms can help in tasks of classification and clustering, thus suggesting that useful information about the input data can be captured through the patterns mined by our algorithms.
Instituto Nacional de Astrofísica, Óptica y Electrónica
2015-04
Tesis de doctorado
Inglés
Público en general
Flores-Garrido M.
CIENCIA DE LOS ORDENADORES
Appears in Collections:Doctorado en Ciencias Computacionales

Upload archives


File Description SizeFormat 
FloresGaM.pdf2.68 MBAdobe PDFView/Open