Conference proceedings and papers
Browsing Conference proceedings and papers by Center "Facultades y escuelas::E.T.S. de Ingeniería Informática"

Publication: A comparison of extrinsic clustering evaluation metrics based on formal constraints (Springer, 2009-05-11)
Artiles, Javier; Verdejo, Felisa; Amigo Cabrera, Enrique; Gonzalo Arroyo, Julio Antonio
There is a wide set of evaluation metrics available to compare the quality of text clustering algorithms. In this article, we define a few intuitive formal constraints on such metrics which shed light on which aspects of the quality of a clustering are captured by different metric families. These formal constraints are validated in an experiment involving human assessments, and compared with other constraints proposed in the literature. Our analysis of a wide range of metrics shows that only BCubed satisfies all formal constraints. We also extend the analysis to the problem of overlapping clustering, where items can simultaneously belong to more than one cluster. As BCubed cannot be directly applied to this task, we propose a modified version of BCubed that avoids the problems found with other metrics.

Publication: A data driven approach for person name disambiguation in web search results (2014-08-23)
Víctor Fresno, Víctor; Montalvo, Soto; Delgado Muñoz, Agustín Daniel; Martínez Unanue, Raquel
This paper presents an unsupervised approach for the task of clustering the results of a search engine when the query is a person name shared by different individuals. We propose an algorithm that calculates the number of clusters and establishes the groups of web pages according to the different individuals without the need for any training data or predefined thresholds, which the successful state-of-the-art systems require. In addition, most of those systems do not deal with social media web pages, so their performance could degrade in a real scenario. In this paper we also propose a heuristic method for the treatment of social networking profiles.
Our approach is evaluated on four gold standard collections for this task, obtaining highly competitive results, comparable to those obtained by some supervised approaches.

Publication: A Scorewriter Application using Electrooculography-based Human-Computer Interface (IEEE, 2022)
Pérez–Roa, Enrique M.; Mañoso Hierro, María Carolina; Pérez de Madrid y Pablo, Ángel; Romero Hortelano, Miguel
At present, many projects are being developed with human-computer interfaces in different areas, but few are related to music. In this work we present a scorewriter application that uses electrooculography as its input interface. On the one hand, the hardware used to record the electrooculogram consists mainly of a low-cost Arduino-based microcontroller board that receives the signal from the electrodes, collects it and sends it via USB to the computer. On the other hand, we use free software to implement the application running on the computer. This application is in charge of processing, classifying (using a neural network) and translating the signal into commands to finally build the song and play it. The modularity of the application allows it to be easily modified for other tasks using the same interface. Due to its nature, the application is very suitable for entertainment. Furthermore, due to the characteristics of its interface, it is also suitable for people with reduced mobility who want to easily perform simple music composition tasks.

Publication: A simple measure to assess non-response (2011-06-19)
Peñas Padilla, Anselmo; Rodrigo Yuste, Álvaro
There are several tasks where not responding is preferable to responding incorrectly. This idea is not new, but despite several previous attempts there is no commonly accepted measure to assess non-response. We study here an extension of the accuracy measure that has this feature and an easy-to-understand interpretation. The proposed measure (c@1) has a good balance of discrimination power, stability and sensitivity properties.
We also show how this measure is able to reward systems that maintain the same number of correct answers and at the same time decrease the number of incorrect ones, by leaving some questions unanswered. This measure is well suited for tasks such as Reading Comprehension tests, where multiple choices per question are given, but only one is correct.

Publication: Analyzing information retrieval methods to recover broken web links (2011-06-19)
Martínez Romo, Juan; Araujo Serna, M. Lourdes
In this work we compare different techniques to automatically find candidate web pages to substitute broken links. We extract information from the anchor text, the content of the page containing the link, and the cached page in some digital library. The selected information is processed and submitted to a search engine. We have compared different information retrieval methods for both the selection of terms used to construct the queries submitted to the search engine and the ranking of the candidate pages that it provides, in order to help the user find the best replacement. In particular, we have used term frequencies and a language model approach for the selection of terms, and co-occurrence measures and a language model approach for ranking the final results. To test the different methods, we have also defined a methodology which does not require user judgments, which increases the objectivity of the results.

Publication: Aplicación de técnica de inteligencia artificial y tratamiento de señales en fusión (CEA-IFAC, 2005, 2005-01-01)
Farias Castro, Gonzalo Alberto; Santos, M.

Publication: Automatic detection of trends in time-stamped sequences : an evolutionary approach (Springer-Verlag, 2009-01-14)
Merelo, Juan Julián; Araujo Serna, M. Lourdes
This paper presents an evolutionary algorithm for modeling the arrival dates in time-stamped data sequences such as newscasts, e-mails, IRC conversations, scientific journal articles or weblog postings.
These models are applied to the detection of buzz (i.e. terms that occur with a higher-than-normal frequency) in them, which has attracted a lot of interest in the online world with the increasing number of periodic content producers. That is why in this paper we have used this kind of online sequence to test our system, though it is also valid for other types of event sequences. The algorithm assigns frequencies (number of events per time unit) to time intervals so that it produces an optimal fit to the data. The optimization procedure is a trade-off between accurately fitting the data and avoiding too many frequency changes, thus overcoming the noise inherent in these sequences. This process has traditionally been performed using dynamic programming algorithms, which are limited by memory and efficiency requirements. This limitation can be a problem when dealing with long sequences, and suggests the application of alternative search methods with some degree of uncertainty to achieve tractability, such as the evolutionary algorithm proposed in this paper. This algorithm is able to reach the same solution quality as those classical dynamic programming algorithms, but in a shorter time. We also test different cost functions and propose a new one that yields better fits than the one originally proposed by Kleinberg on real-world data. Finally, several distributions of states for the finite state automata are tested, with the result that a uniform distribution produces much better fits than the geometric distribution also proposed by Kleinberg.
We also present a variant of the evolutionary algorithm, which achieves a fast fit of a sequence extended with new data by taking advantage of the fit obtained for the original subsequence.

Publication: Combining evaluation metrics via the unanimous improvement ratio and its application in weps clustering task (Association for the Advancement of Artificial Intelligence, 2011-12-01)
Artiles, Javier; Verdejo, Felisa; Amigo Cabrera, Enrique; Gonzalo Arroyo, Julio Antonio
Many Artificial Intelligence tasks cannot be evaluated with a single quality criterion, and some sort of weighted combination is needed to provide system rankings. A problem with weighted combination measures is that slight changes in the relative weights may produce substantial changes in the system rankings. This paper introduces the Unanimous Improvement Ratio (UIR), a measure that complements standard metric combination criteria (such as van Rijsbergen's F-measure) and indicates how robust the measured differences are to changes in the relative weights of the individual metrics. UIR is meant to elucidate whether a perceived difference between two systems is an artifact of how individual metrics are weighted. Besides discussing the theoretical foundations of UIR, this paper presents empirical results that confirm the validity and usefulness of the metric for the Text Clustering problem, where there is a tradeoff between precision- and recall-based metrics and results are particularly sensitive to the weighting scheme used to combine them.
Remarkably, our experiments show that UIR can be used as a predictor of how well differences between systems measured on a given test bed will also hold in a different test bed.

Publication: Composición fotográfica mediante el uso de un dron (Comité Español de Automática, 2024-07-15)
Sánchez García, Juan Miguel; Sánchez Moreno, José; Moreno Salinas, David
Photographic composition, known as mosaicking, is crucial in applications where the full extent of large surfaces cannot be captured in a single shot. Smaller sections must therefore be photographed and then composed to achieve a reproduction of reality that is as accurate as possible. This work presents the result of applying the principles of the different stages needed to create a mosaic, complemented by the use of a drone to capture the images. Creating the mosaic involves advanced image processing techniques that facilitate feature detection, geometric transformation and pixel alignment. However, experimentation with different algorithms has revealed that it is not always feasible to find a geometric transformation that produces a quality mosaic, especially when the characteristics of the photograph are not optimal, which may be attributable, in part, to the resolution of the photographic devices used.

Publication: Dataset Generation and Study of Deepfake Techniques (Springer, 2023)
Falcón López, Sergio Adrián; Robles Gómez, Antonio; Tobarra Abad, María de los Llanos; Pastor Vargas, Rafael
The consumption of multimedia content on the Internet has grown exponentially in recent years. This trend has contributed to fake news becoming highly influential in today's society. The latest techniques for spreading false digital information are based on methods of generating images and videos, known as Deepfakes.
Accordingly, our research work analyzes the most widely used Deepfake content generation methods, and explores different conventional and advanced tools for Deepfake detection. A specific dataset has also been built that includes both fake and real multimedia content. This dataset will allow us to verify whether the image and video forgery detection techniques used can detect manipulated multimedia content.

Publication: Detecting malicious tweets in trending topics using a statistical analysis of language (Elsevier, 2013-06-01)
Martínez Romo, Juan; Araujo Serna, M. Lourdes
Twitter spam detection is a recent area of research in which most previous works have focused on the identification of malicious user accounts and honeypot-based approaches. In this paper, however, we present a methodology based on two new aspects: the detection of spam tweets in isolation and without previous information about the user; and the application of a statistical analysis of language to detect spam in trending topics. Trending topics capture the emerging Internet trends and topics of discussion that are on everybody’s lips. This growing microblogging phenomenon therefore allows spammers to disseminate malicious tweets quickly and massively. In this paper we present the first work that tries to detect spam tweets in real time using language as the primary tool. We first collected and labeled a large dataset with 34K trending topics and 20 million tweets. Then, we proposed a reduced set of features that are hard for spammers to manipulate. In addition, we have developed a machine learning system with some orthogonal features that can be combined with other sets of features with the aim of analyzing emergent characteristics of spam in social networks. We have also conducted an extensive evaluation process that has allowed us to show how our system is able to obtain an F-measure at the same level as the best state-of-the-art systems based on the detection of spam accounts.
Thus, our system can be applied to Twitter spam detection in trending topics in real time, mainly because it analyzes tweets instead of user accounts.

Publication: Detection of Cerebral Ischaemia using Transfer Learning Techniques (IEEE)
Antón Munárriz, Cristina; Haut, Juan M.; Paoletti, Mercedes E.; Benítez Andrades, José Alberto; Pastor Vargas, Rafael; Robles Gómez, Antonio
Cerebrovascular accident (CVA) or stroke is one of the main causes of mortality and morbidity today, causing permanent disabilities. Its early detection helps reduce its effects and its mortality: time is brain. Currently, non-contrast computed tomography (NCCT) continues to be the first-line diagnostic method in stroke emergencies because it is a fast, available, and cost-effective technique that makes it possible to rule out haemorrhage and focus attention on the ischemic origin, that is, obstruction of arterial flow. NCCT scans are scored using a system called ASPECTS (Alberta Stroke Program Early Computed Tomography Score) according to the affected brain structures. This paper aims to detect, in an initial phase, those CTs of patients with stroke symptoms that present early alterations in CT density, using a binary classifier of CTs without and with stroke, to alert the doctor to their existence. For this, several neural network architectures well known from the ImageNet challenges (VGG, NasNet, ResNet and DenseNet) are implemented with 3D images, covering the entire brain volume.
The training results of these networks are presented, in which different parameters are tested to obtain maximum performance, which is achieved with a DenseNet3D network that reaches an accuracy of 98% on the training set and 95% on the test set.

Publication: Determinación de parámetros de la transfomada Wavelet para la clasificación de señales del diagnóstico scattering Thomson (Jornadas de Automática 2004, 2004-01-01)
Farias Castro, Gonzalo Alberto; Santos, M.; Fernández Marrón, José Luis; Dormido Canto, Sebastián

Publication: Disentangling categorical relationships through a graph of co-occurrences (American Physical Society, 2011-10-19)
Borge Holthoefer, Javier; Arenas, Alex; Capitán, José A.; Cuesta, José A.; Martínez Romo, Juan; Araujo Serna, M. Lourdes
The mesoscopic structure of complex networks has proven to be a powerful level of description to understand the linchpins of the system represented by the network. Nevertheless, the mapping of a series of relationships between elements, in terms of a graph, is sometimes not straightforward. Given that all the information we would extract using complex network tools depends on this initial graph, it is mandatory to preprocess the data to build it in the most accurate manner. Here we propose a procedure to build a network, attending only to statistically significant relations between constituents. We use a paradigmatic example of word associations to show the development of our approach.
By analyzing the modular structure of the obtained network, we are able to disentangle categorical relations, disambiguating words with success that is comparable to the best algorithms designed for the same end.

Publication: Evaluating Multilingual Question Answering Systems at CLEF (2010-05-17)
Forner, Pamela; Giampiccolo, Danilo; Magnini, Bernardo; Sutcliffe, Richard; Peñas Padilla, Anselmo; Rodrigo Yuste, Álvaro
The paper offers an overview of the key issues raised during the seven years’ activity of the Multilingual Question Answering Track at the Cross Language Evaluation Forum (CLEF). The general aim of the Multilingual Question Answering Track has been to test both monolingual and cross-language Question Answering (QA) systems that process queries and documents in several European languages, also drawing attention to a number of challenging issues for research in multilingual QA. The paper gives a brief description of how the task has evolved over the years and of the way in which the data sets have been created, also presenting a brief summary of the different types of questions developed. The document collections adopted in the competitions are sketched as well, and some data about participation are provided. Moreover, the main evaluation measures used to evaluate system performance are explained and an overall analysis of the results achieved is presented.

Publication: Explicit predictive control of a hybrid system: A case study: Control of the longitudinal dynamics of a comercial vehicle at very low speed (IEEE, 2015)
Mañoso Hierro, María Carolina; Pérez de Madrid y Pablo, Ángel; Romero Hortelano, Miguel
Many complex systems of theoretical and practical interest can be modeled as hybrid systems, i.e., systems that exhibit both continuous and discrete dynamics. This work focuses on the explicit predictive control of hybrid systems using two free Matlab/Simulink toolboxes designed specifically to synthesize and simulate controllers for this kind of system.
A case study is considered. The system is modeled, from experimental results, after the longitudinal dynamics of a commercial car at low speed. Simulation results are analyzed and a comparative study of both toolboxes is carried out.

Publication: Filling knowledge gaps in text for machine reading (2010-08-22)
Hovy, Eduard H.; Peñas Padilla, Anselmo
Texts are replete with gaps, information omitted since authors assume a certain amount of background knowledge. We define the process of enrichment that fills these gaps. We describe how enrichment can be performed using a Background Knowledge Base built from a large corpus. We evaluate the effectiveness of various openly available background knowledge bases and we identify the kind of information necessary for enrichment.

Publication: Forensic Technologies to Automate the Acquisition of Digital Evidences (IEEE, 2022)
García Guerrero, David; Tobarra Abad, María de los Llanos; Robles Gómez, Antonio; Pastor Vargas, Rafael
The main goal of this work is to propose the automatic remote acquisition of evidence. This automated capacity is of interest to companies with extensive networks and/or several locations, as it allows them to delegate and centralize the acquisition task at a single point in their structure, while saving time and travel costs. This research has been carried out through the initial implementation of a virtual laboratory made up of a network and different scenarios, together with an experimentation process. The virtual network includes both the machine from which automatic acquisitions are performed and the devices from which the evidence is retrieved. The group of devices will be made up of various experiments.
The aim is to analyze the viability of the acquisition in different scenarios, since distributed networks are not homogeneous in the real world.

Publication: Fundamentals of the MPC approach to stop-and-go Adaptive Cruise Control (IEEE, 2014)
Pérez de Madrid y Pablo, Ángel; Mañoso Hierro, María Carolina; Romero Hortelano, Miguel
In this paper we review the fundamentals of Adaptive Cruise Control / Stop-and-Go systems based on model predictive control theory. The driver’s decisions can be formulated as a few very general constraints, for which the mathematical expression is derived in depth. The resulting control system can adjust a car’s velocity to maintain an inter-vehicle safety distance, avoid collisions, stop and eventually accelerate, and observe other considerations such as passengers’ comfort without human intervention.

Publication: Identifying patterns for unsupervised grammar induction (2010-07-15)
Santamaría, Jesús; Araujo Serna, M. Lourdes
This paper describes a new method for unsupervised grammar induction based on the automatic extraction of certain patterns in the texts. Our starting hypothesis is that there exist some classes of words that function as separators, marking the beginning or the end of new constituents. Among these separators we distinguish those which trigger new levels in the parse tree. If we are able to detect these separators, we can follow a very simple procedure to identify the constituents of a sentence by taking the classes of words between separators. This paper is devoted to describing the process that we have followed to automatically identify the set of separators from a corpus annotated only with Part-of-Speech (POS) tags. The proposed approach has allowed us to improve on the results of previous proposals when parsing sentences from the Wall Street Journal corpus.