loading...
Essential Dimensions of Latent Semantic Indexing (LSI)
Big Island, Hawaii January 03-January 06
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/HICSS.2007.21340th Annual Hawaii International Conf ...
 This Article 
 
PURCHASE ARTICLE: $0
HTML
 
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
April Kontostathis, Ursinus College
Latent Semantic Indexing (LSI) is commonly used to match queries to documents in information retrieval applications. LSI has been shown to improve retrieval performance for some, but not all, collections, when compared to traditional vector space retrieval. In this paper, we first develop a model for understanding which values in the reduced dimensional space contain the term relationship (latent semantic) information. We then test this model by developing a modified version of LSI that captures this information, Essential Dimensions of LSI (EDLSI). EDLSI significantly improves retrieval performance on corpora that previously did not benefit from LSI, and offers improved runtime performance when compared with traditional LSI. Traditional LSI requires the use of a dimensionality reduction parameter which must be tuned for each collection. Applying our model, we have also shown that a small, fixed dimensionality reduction parameter (k=10) can be used to capture the term relationship information in a corpus.
Citation:
April Kontostathis, "Essential Dimensions of Latent Semantic Indexing (LSI)," hicss, pp.73c, 40th Annual Hawaii International Conference on System Sciences (HICSS'07), 2007
Usage of this product signifies your acceptance of the Terms of Use.