
Martin Mozina; Claudio Giuliano; Ivan Bratko,
SAMT 2008 workshop on Crossmedia information analysis, extraction and management,
2008
, (SAMT 2008 workshop on Crossmedia information analysis, extraction and management,
Koblenz, Germany,
03/12/2008)

Elena Cabrio; Milen Ognianov Kouylekov; Bernardo Magnini,
Combining Specialized Entailment Engines for RTE4,
Text Analysis Conference (TAC 2008),
NIST - National Institute of Standards and Technol,
2008
, (Text Analysis Conference (TAC 2008),
Gaithersburg, Maryland, USA,
17/11/2008 - 19/11/2008)

Danilo Giampiccolo; Hoa Dang; Bernardo Magnini; Ido Dagan; Elena Cabrio; Bill Dolan,
The Fourth PASCAL Recognizing Textual Entailment Challenge,
Text Analysis Conference (TAC 2008),
2008
, (Text Analysis Conference (TAC 2008),
Gaithersburg, Maryland, USA,
17/11/2008 - 19/11/2008)

A. Corazza; Alberto Lavelli; G. Satta,
Measuring Parsing Difficulty Across Treebanks,
One of the main difficulties in statistical parsing is choosing the correct parse tree for the input sentence among all the parse trees allowed by the adopted grammar model. While this difficulty is usually evaluated by means of empirical performance measures, such as labeled precision and recall, several theoretical measures have also been proposed in the literature, mostly based on the notion of cross-entropy of a treebank. In this article we show how cross-entropy can be misleading to this end. We propose an alternative theoretical measure, called the expected conditional cross-entropy (ECC), which can be approximated through the inverse and normalized conditional log-likelihood of a treebank, relative to some model. We conjecture that the ECC provides a measure of the informativeness of a treebank, in such a way that more informative treebanks are easier to parse under the chosen model. We test our conjecture by comparing ECC values against standard performance measures across several treebanks for English, French, German and Italian, as well as other treebanks with different degrees of ambiguity and informativeness, obtained by means of artificial transformations of a source treebank. All of our experiments show the effectiveness of the ECC in characterizing parsing difficulty across different treebanks, making treebank comparison possible.,
2008