Information Retrieval
http://www.cc.gatech.edu/~isbell/tutorials/TextRetrieval.fm.pdf
Understanding Inverse Document Frequency: On theoretical arguments for IDF
Stephen Robertson, Microsoft Research, 7 JJ Thomson Avenue, Cambridge CB3 0FB, UK (and City University, London, UK)
http://www.soi.city.ac.uk/~ser/idfpapers/Robertson_idf_JDoc.pdf#search=
Using TF-IDF to Determine Word Relevance in Document Queries
Juan Ramos, Department of Computer Science, Rutgers University, 23515 BPO Way, Piscataway, NJ, 08855
http://www.cs.rutgers.edu/~mlittman/courses/ml03/iCML03/papers/ramos.pdf#search=
TEXT MINING
Ian H. Witten, Computer Science, University of Waikato, Hamilton, New Zealand
The "Information Retrieval" link looks like a good introduction to the techniques used to retrieve documents in response to user queries. It covers inverse document frequency, the vector space model, latent semantic indexing, and relevance and pseudo-relevance feedback.
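For anyone who wants to see how inverse document frequency and the vector space model fit together, here is a rough Python sketch. It is not taken from any of the linked papers. The whitespace tokenizer, the plain log(N/df) form of IDF, the function names, and the three toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real systems do more.
    return text.lower().split()


def idf_weights(docs):
    # Document frequency: the number of documents containing each term.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    # Assumed variant: idf(t) = log(N / df(t)), with no smoothing.
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf_vector(text, idf):
    # Sparse vector: raw term count times IDF; unknown terms are dropped.
    tf = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    # Score every document against the query, best match first.
    idf = idf_weights(docs)
    q = tfidf_vector(query, idf)
    scores = [(cosine(q, tfidf_vector(d, idf)), i) for i, d in enumerate(docs)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Hypothetical toy collection, made up for this example.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock markets fell sharply today",
]
ranking = rank("cat mat", docs)
for score, index in ranking:
    print(round(score, 3), docs[index])
```

A term that occurs in every document gets an IDF of zero under this variant, so it contributes nothing to the score. That is the basic intuition behind IDF: common words say little about which document is relevant.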