Global view on Corpora


A corpus of political speeches tagged with specific audience reactions, such as applause or laughter

Dataset of the SemEval-2015 Task "TimeLine: Cross-Document Event Ordering"

RTE-5 pairs annotated with linguistic phenomena and monothematic pairs

Tweets annotated with Named Entities following the NEEL-IT guidelines

The Dataset of the Evalita 2011 Named Entity Recognition Task

Italian version of the English RTE-3 dataset

A corpus annotated with discourse contrast relations in Italian

Italian Wikipedia automatically annotated with entity mentions

An extension of the English ACE 2005 corpus with ground-truth links to Wikipedia


A manually annotated Italian corpus of diary entries written by diabetic patients

Typed Predicate Argument Structures for Italian

Test data set of the EVENTI Pilot Task on "Temporal Processing of Historical Texts"

Annotated spoken requests in the tourism domain (Italian, Spanish, English and German)

An annotated corpus consisting of 525 news stories taken from a local newspaper

A semantically annotated corpus of 480 news articles in 4 languages

Wikipedia sentences with frame labels in English and Italian

The Italian section of the NewsReader MEANTIME corpus

A corpus of Italian news stories annotated with information about person cross-document coreference

A subpart of Ita-TimeBank annotated with factuality information

A Temporally Annotated News Corpus in German

An English/Italian parallel corpus

A gold standard dataset of entailment graphs for English and Italian

A manually annotated corpus of around 66,000 tokens

The TimeBank corpus taken from the TempEval-3 task, annotated with causal information