
Please use this identifier to cite or link to this item: http://inaoe.repositorioinstitucional.mx/jspui/handle/1009/2393

Title:	Determining and characterizing the reused text for plagiarism detection
Authors:	José Fernando Sánchez Vega; Esau Villatoro Tello; Manuel Montes y Gómez; Luis Villaseñor Pineda; Paolo Rosso
Access level:	Open Access
License:	Attribution-NonCommercial-NoDerivatives
Subject:	Plagiarism detection; Text reuse; Machine learning; Supervised classification
Abstract:	An important task in plagiarism detection is determining and measuring similar text portions between a given pair of documents. One of the main difficulties of this task resides on the fact that reused text is commonly modified with the aim of covering or camouflaging the plagiarism. Another difficulty is that not all similar text fragments are examples of plagiarism, since thematic coincidences also tend to produce portions of similar text. In order to tackle these problems, we propose a novel method for detecting likely portions of reused text. This method is able to detect common actions performed by plagiarists such as word deletion, insertion and transposition, allowing to obtain plausible portions of reused text. We also propose representing the identified reused text by means of a set of features that denote its degree of plagiarism, relevance and fragmentation. This new representation aims to facilitate the recognition of plagiarism by considering diverse characteristics of the reused text during the classification phase. Experimental results employing a supervised classification strategy showed that the proposed method is able to outperform traditionally used approaches.
Publisher:	Elsevier Ltd.
Publication date:	2013
Publication type:	Article
Language:	English
Audience:	Students; Researchers; General public
Citation:	Sánchez, F., et al. (2013). Determining and characterizing the reused text for plagiarism detection. Expert Systems with Applications, Vol. 40: 1804–1813
Knowledge area:	COMPUTER SCIENCE
Publication version:	acceptedVersion - Accepted version
Appears in collections:	Computer Science Articles (Artículos de Ciencias Computacionales)

Files:

File	Size	Format
210. Determining and Characterizing the Reused Text for Plagiarism Detection.pdf	666.93 kB	Adobe PDF