jLSI (java Latent Semantic Indexing) is an open source Java tool for Latent Semantic Indexing. jLSI requires only shallow linguistic processing, such as tokenization, sentence splitting, part-of-speech tagging (optional), and lemmatization (optional). jLSI is released as free software with full source code, provided under the terms of the Apache License Version 2.0. jLSI is developed by Claudio Giuliano at the FBK Human Language Technologies group. Some of jLSI's features include:

  • Implements the latent semantic kernel
  • Written in Java
  • Supports user-defined data representation

The latest version of jLSI is 1.0, released 01 September 2007. Sources, binaries, and documentation are available here. You are welcome to use the code under the terms of the Apache License Version 2.0; however, please acknowledge its use with a citation:

  • Claudio Giuliano, jLSI User's Guide, Technical Report, FBK-irst, Trento, September 2007.

jLSI has been used in:

  • Claudio Giuliano. Fine-Grained Classification of Named Entities Exploiting Latent Semantic Kernels. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009), Boulder, CO, USA, June 4-5, 2009.
  • Sara Tonelli and Claudio Giuliano. Wikipedia as Frame Information Repository. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP 2009), Singapore, August 6-7, 2009.
  • David Tomás and Claudio Giuliano. A semi-supervised approach to question classification. In Proceedings of the 17th European Symposium on Artificial Neural Networks: Advances in Computational Intelligence and Learning, Bruges, Belgium, April 22-24, 2009.

jLSI has been funded by the X-Media Project.
