Person:
Amigo Cabrera, Enrique

ORCID
0000-0003-1482-824X
Surname
Amigo Cabrera
First name
Enrique

Search results

Showing 1 - 4 of 4
  • Publication
    MT Evaluation: human-like vs. human acceptable
    (2006-07-17) Giménez, Jesús; Màrquez, Lluís; Amigo Cabrera, Enrique; Gonzalo Arroyo, Julio Antonio
    We present a comparative study on Machine Translation Evaluation according to two different criteria: Human Likeness and Human Acceptability. We provide empirical evidence that there is a relationship between these two kinds of evaluation: Human Likeness implies Human Acceptability, but the reverse is not true. From the point of view of automatic evaluation, this implies that metrics based on Human Likeness are more reliable for system tuning. Our results also show that current evaluation metrics are not always able to distinguish between automatic and human translations. In order to improve the descriptive power of current metrics, we propose the use of additional syntax-based metrics, and metric combinations inside the QARLA Framework.
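    An illustrative sketch, in Python with invented scores, of the discrimination test this abstract alludes to (the QARLA Framework itself is not reproduced here): for each source sentence, check whether the metric ranks the human translation above every automatic output.

        # Invented metric scores: one human translation and two MT systems
        # per source sentence.
        human_scores = [0.91, 0.85, 0.78, 0.88]
        system_scores = [[0.62, 0.70], [0.86, 0.79], [0.75, 0.80], [0.65, 0.71]]

        # Fraction of sentences where the human translation outscores every
        # automatic one; 1.0 would mean the metric fully separates the two.
        wins = sum(h > max(s) for h, s in zip(human_scores, system_scores))
        print(f"discrimination rate: {wins / len(human_scores):.2f}")  # 0.50 here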
  • Publication
    The contribution of linguistic features to automatic machine translation evaluation
    (2009-08-02) Giménez, Jesús; Verdejo, M. Felisa; Amigo Cabrera, Enrique; Gonzalo Arroyo, Julio Antonio
    A number of approaches to Automatic MT Evaluation based on deep linguistic knowledge have been suggested. However, n-gram-based metrics are still the dominant approach today. The main reason is that the advantages of employing deeper linguistic information have not yet been clarified. In this work, we propose a novel approach for the meta-evaluation of MT evaluation metrics, since correlation coefficients against human judgments do not reveal details about the advantages and disadvantages of particular metrics. We then use this approach to investigate the benefits of introducing linguistic features into evaluation metrics. Overall, our experiments show that (i) both lexical and linguistic metrics present complementary advantages and (ii) combining both kinds of metrics yields the most robust meta-evaluation performance.
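    A minimal sketch of the correlation-based meta-evaluation this abstract argues is insufficient on its own, with a naive uniform combination of a lexical and a syntactic metric. All scores are invented, and the combination scheme is an assumption for illustration, not the one used in the paper (statistics.correlation requires Python 3.10+).

        from statistics import correlation  # Pearson's r

        human = [4.0, 2.5, 3.5, 1.0, 4.5]           # invented human adequacy judgments
        lexical = [0.60, 0.40, 0.55, 0.20, 0.70]    # e.g. an n-gram overlap metric
        syntactic = [0.70, 0.30, 0.60, 0.35, 0.65]  # e.g. a parse-overlap metric

        # Naive combination: uniform average of the two metric scores.
        combined = [(l + s) / 2 for l, s in zip(lexical, syntactic)]

        for name, scores in [("lexical", lexical), ("syntactic", syntactic),
                             ("combined", combined)]:
            print(name, round(correlation(human, scores), 3))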
  • Publication
    An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results
    (Association for Computational Linguistics, 2020-07-01) Amigo Cabrera, Enrique; Gonzalo Arroyo, Julio Antonio; Mizzaro, Stefano; Carrillo de Albornoz Cuadrado, Jorge Amando
    In Ordinal Classification tasks, items have to be assigned to classes that have a relative ordering, such as positive, neutral, negative in sentiment analysis. Remarkably, the most popular evaluation metrics for ordinal classification tasks either ignore relevant information (for instance, precision/recall on each of the classes ignores their relative ordering) or assume additional information (for instance, Mean Average Error assumes absolute distances between classes). In this paper, we propose a new metric for Ordinal Classification, Closeness Evaluation Measure, that is rooted in Measurement Theory and Information Theory. Our theoretical analysis and experimental results over both synthetic data and data from NLP shared tasks indicate that the proposed metric captures quality aspects from different traditional tasks simultaneously. In addition, it generalizes some popular classification (nominal scale) and error minimization (interval scale) metrics, depending on the measurement scale in which it is instantiated.
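    A toy illustration, with hypothetical sentiment labels, of the gap this abstract points at: accuracy (nominal scale) cannot tell a near miss from a far miss, while mean absolute error (interval scale) can, but only by assuming equal distances between classes. The Closeness Evaluation Measure itself is defined in the paper and is not reproduced here.

        # Ordinal classes encoded as 0=negative, 1=neutral, 2=positive.
        gold = [2, 2, 0, 1, 0]
        sys_a = [1, 2, 0, 1, 0]  # one near miss: neutral instead of positive
        sys_b = [0, 2, 0, 1, 0]  # one far miss: negative instead of positive

        def accuracy(gold, pred):
            return sum(g == p for g, p in zip(gold, pred)) / len(gold)

        def mae(gold, pred):
            return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)

        print(accuracy(gold, sys_a), accuracy(gold, sys_b))  # 0.8 0.8: identical
        print(mae(gold, sys_a), mae(gold, sys_b))            # 0.2 0.4: sys_a wins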
  • Publication
    Evaluating Sequence Labeling on the basis of Information Theory
    (Association for Computational Linguistics, 2025-07-01) Amigo Cabrera, Enrique; Álvarez Mellado, Elena; Carrillo de Albornoz Cuadrado, Jorge Amando; European Commission; Agencia Estatal de Investigación (España)
    Various metrics exist for evaluating sequence labeling problems (strict span matching, token-oriented metrics, token concurrence in sequences, etc.), each of them focusing on certain aspects of the task. In this paper, we define a comprehensive set of formal properties that captures the strengths and weaknesses of the existing metric families and prove that none of them is able to satisfy all properties simultaneously. We argue that it is necessary to measure how much information (correct or noisy) each token in the sequence contributes, depending on aspects such as sequence length, number of tokens annotated by the system, token specificity, etc. On this basis, we introduce the Sequence Labelling Information Contrast Model (SL-ICM), a novel metric based on information theory for evaluating sequence labeling tasks. Our formal analysis and experimentation show that the proposed metric satisfies all properties simultaneously.
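    A small sketch of two of the existing metric families named above, on a made-up labeling: strict span matching gives credit only for exactly matching spans, while token-level matching gives partial credit for overlap. SL-ICM itself is defined in the paper and is not reproduced here.

        # Spans as (start, end, label), end exclusive; the system recovers the
        # single gold entity with one boundary token missing (invented example).
        gold_spans = {(3, 6, "ORG")}
        sys_spans = {(3, 5, "ORG")}

        def f1(tp, n_sys, n_gold):
            p = tp / n_sys if n_sys else 0.0
            r = tp / n_gold if n_gold else 0.0
            return 2 * p * r / (p + r) if p + r else 0.0

        # Strict span matching: a span counts only if it matches exactly.
        strict = f1(len(gold_spans & sys_spans), len(sys_spans), len(gold_spans))

        # Token-level matching: compare individual labeled tokens instead.
        def tokens(spans):
            return {(i, lab) for s, e, lab in spans for i in range(s, e)}

        g, s = tokens(gold_spans), tokens(sys_spans)
        token_f1 = f1(len(g & s), len(s), len(g))

        print(strict, round(token_f1, 2))  # 0.0 vs 0.8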