Global view on Corpora


A corpus of political speeches tagged with specific audience reactions, such as applause or laughter

Dataset of the SemEval-2015 Task "TimeLine: Cross-Document Event Ordering"

RTE-5 pairs annotated with linguistic phenomena and monothematic pairs

The dataset of the Evalita 2011 Named Entity Recognition Task

Tweets annotated with Named Entities following the NEEL-IT guidelines

Italian version of the English RTE-3 dataset

Italian Wikipedia automatically annotated with entity mentions

A corpus annotated with discourse contrast relations in Italian

An extension of the English ACE 2005 Corpus with Ground-truth Links to Wikipedia

Typed Predicate Argument Structures for Italian


A manually annotated Italian corpus of diary entries written by diabetic patients

Test data set of the EVENTI Pilot Task on "Temporal Processing of Historical Texts"

Annotated spoken requests in the tourism domain (Italian, Spanish, English and German)

An annotated corpus consisting of 525 news stories taken from a local newspaper

A semantically annotated corpus of 480 news articles in 4 languages

Wikipedia sentences with frame labels in English and Italian

A corpus of Italian news stories annotated with information about person cross-document coreference

The Italian section of the NewsReader MEANTIME corpus

A subpart of Ita-TimeBank annotated with factuality information

An English/Italian parallel corpus

A Temporally Annotated News Corpus in German

A gold standard dataset of entailment graphs for English and Italian

The TimeBank corpus taken from the TempEval-3 task, annotated with causal information

A manually annotated corpus of around 66,000 tokens