Technologies
Software
A software package aimed at recognizing entailment relations between two portions of text
A suite of modular Natural Language Processing (NLP) tools for analysis of Italian and English texts
An open source Java tool for Relation Extraction
An open source Java tool for efficiently searching the Web 1T 5-gram corpus
A Java tool for Feature Extraction for Natural Language Processing applications
An open source Java tool for Instance Filtering
An open source Java tool for language identification
A software tool for Web people search
A software tool for text categorization
An open source Java tool for Latent Semantic Indexing
EXCITEMENT Open Platform
Scalable storage for text and RDF data
Lexical resources
A multilingual lexical database in which the Italian WordNet is strictly aligned with Princeton WordNet
A lexical resource created by augmenting WordNet with domain labels. It includes WordNet-Affect.
A high-coverage resource containing roughly 155,000 words associated with a sentiment score
A FrameNet to WordNet Mapping
A domain-specific ontology for question answering in the domain of tourism
A sensorial lexicon that associates English words with senses
A lexicon for Italian discourse connectives
Corpora
A corpus of political speeches tagged with specific audience reactions, such as applause or laughter
An annotated corpus consisting of 525 news stories taken from a local newspaper
The Dataset of the Evalita 2011 Named Entity Recognition Task
A corpus of Italian news stories annotated with information about person cross-document coreference
Italian Wikipedia automatically annotated with entity mentions
An English/Italian parallel corpus
Typed Predicate Argument Structures for Italian
The TimeBank corpus taken from the TempEval-3 task, annotated with causal information
Annotated spoken requests in the tourism domain (Italian, Spanish, English and German)
RTE-5 pairs annotated with linguistic phenomena and monothematic pairs
Wikipedia sentences with frame labels in English and Italian
Italian version of the English RTE-3 dataset
A subpart of Ita-TimeBank annotated with factuality information
An extension of the English ACE 2005 Corpus with Ground-truth Links to Wikipedia
A gold standard dataset of entailment graphs for English and Italian
Test data set of the EVENTI Pilot Task on "Temporal Processing of Historical Texts"
Dataset of the SemEval-2015 Task "TimeLine: Cross-Document Event Ordering"
A semantically annotated corpus of 480 news articles in 4 languages
Tweets annotated with Named Entities following the NEEL-IT guidelines
The Italian section of the NewsReader MEANTIME corpus
A corpus annotated with discourse contrast relations in Italian
A Temporally Annotated News Corpus in German
A manually annotated Italian corpus of diary entries written by diabetic patients
A manually annotated corpus of around 66,000 tokens
Annotation Tools
A Tool for Cross-Document Event and Entity Coreference
A tool for annotation of linguistic data
A utility to compute phonetic features of tokenized sentences