Please use this identifier to cite or link to this item: http://inaoe.repositorioinstitucional.mx/jspui/handle/1009/214
Hardware architecture for frequent itemset mining in static datasets using a segmentation strategy
MAURO MARTIN LETRAS LUNA
RENE ARMANDO CUMPLIDO PARRA
RAUDEL HERNANDEZ LEON
Open Access
Attribution-NonCommercial-NoDerivatives
Hardware architecture
Frequent itemset
FPGA
In recent years, the amount of information generated across distinct domains has increased significantly, and the size of datasets overwhelms the human capacity to process them and obtain valuable information. Because of this, Data Mining has emerged as a set of techniques and algorithms dedicated to finding patterns in datasets; these patterns are then used to classify or predict the behavior of phenomena related to the data. Association Rule Mining is an important branch of Data Mining, and it consists of finding relationships among the data in the form of implication rules. The problem is usually decomposed into two subproblems. The first is to find those itemsets whose occurrences exceed a predefined threshold in the database; those itemsets are called frequent itemsets. The second is to generate association rules from those frequent itemsets. This research explores Frequent Itemset Mining because, given the exhaustive nature of the problem, the huge amount of data in some cases makes it difficult to obtain a response within a time acceptable for the application. Many algorithms are dedicated to searching for frequent itemsets; the most widely used are Apriori, FP-Growth, and Eclat. They use strategies such as breadth-first search and depth-first search to traverse the search space. All of them must search the dataset, and some, like Apriori, must access it many times. FP-Growth reads the dataset only twice, but it must keep large amounts of data in memory. Frequent Itemset Mining is an exhaustive task, since the database must be read many times regardless of where the data is stored (in main memory or on hard disk).
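To make the definition of a frequent itemset concrete, the following is a minimal Python sketch of level-wise (Apriori-style, breadth-first) mining. It is an illustration only, not the method of the thesis: the function name `frequent_itemsets`, the parameter `min_support`, and the choice to treat a count greater than or equal to the threshold as frequent are all assumptions made here.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset whose count >= min_support.

    Hypothetical helper for illustration; the threshold convention (>=)
    is an assumption, not taken from the thesis.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the occurrences of each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Build size-k candidates by joining frequent (k-1)-itemsets and
        # keep only those whose (k-1)-subsets are all frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # One full pass over the dataset per level: this repeated scanning
        # is the cost the abstract attributes to Apriori.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1

    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(
        frequent_itemsets(data, 3).items(), key=lambda kv: sorted(kv[0])
    ):
        print(sorted(itemset), count)
```

Each iteration of the `while` loop rescans every transaction, which shows why the number of dataset passes grows with the length of the longest frequent itemset.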
Two ways to accelerate Frequent Itemset Mining have been reported in the literature: the first consists of improving existing software algorithms by proposing new heuristics to save time, and the second consists of developing hardware architectures dedicated to this task. The main goal of this research is to design a hardware architecture to accelerate the Frequent Itemset Mining process. A segmentation strategy based on equivalence classes is proposed to guarantee that all frequent itemsets will be found regardless of the available hardware resources. An FPGA implementation will be carried out to validate the proposed architecture and compare it with software-only implementations.
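The abstract does not describe how the equivalence classes are formed. As a hedged illustration of the general idea, the Python sketch below assumes prefix-based classes in the style of Eclat: every frequent itemset belongs to exactly one class, identified by its smallest item, and each class can be mined on its own. The function name `mine_by_equivalence_class` and the vertical (transaction-id set) layout are assumptions made here; the segmentation used by the thesis architecture may differ.

```python
def mine_by_equivalence_class(transactions, min_support):
    """Mine frequent itemsets one prefix-based equivalence class at a time.

    Hypothetical sketch: classes are keyed by the smallest item of each
    itemset, which is an assumption and not the scheme stated in the thesis.
    """
    # Vertical layout: map each item to the ids of transactions containing it.
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)

    # Frequent single items, in a fixed order so classes do not overlap.
    items = sorted(i for i, tids in tidsets.items() if len(tids) >= min_support)
    result = {}

    def extend(prefix, prefix_tids, tail):
        # Depth-first growth of the prefix using only later items, so an
        # itemset is never generated from two different classes.
        for idx, item in enumerate(tail):
            tids = prefix_tids & tidsets[item]
            if len(tids) >= min_support:
                itemset = prefix + (item,)
                result[itemset] = len(tids)
                extend(itemset, tids, tail[idx + 1:])

    # Each iteration handles one class independently of the others, so the
    # classes could be processed in sequence on a fixed-size resource.
    for idx, item in enumerate(items):
        result[(item,)] = len(tidsets[item])
        extend((item,), tidsets[item], items[idx + 1:])

    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(mine_by_equivalence_class(data, 3).items()):
        print(itemset, count)
```

Because the classes are disjoint and together cover every frequent itemset, processing them one after another yields the complete result no matter how many classes fit in the available resources at a time.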
Instituto Nacional de Astrofísica, Óptica y Electrónica
2015-11
Master's thesis
English
Students
Researchers
General public
Letras-Luna M.M.
COMPUTER SCIENCE
Accepted version
acceptedVersion - Accepted version
Appears in collections: Master's in Computer Science

Files in this item:
File: LetrasLMM.pdf; Size: 1.39 MB; Format: Adobe PDF