Friday, February 27, 2009

Read

A new method of N-gram statistics for large number of n and automatic extraction of words and phrases from large text data of Japanese (1994)
by Makoto Nagao, Shinsuke Mori
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.57.8416

Class-Based n-Gram Models of Natural Language (1992)
by Peter F. Brown, Peter V. Desouza, Robert L. Mercer, Jenifer C. Lai
Computational Linguistics
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.9919

Beyond word n-grams (1995)
by Fernando C. Pereira, Yoram Singer, Naftali Tishby
In Proceedings of the Third Workshop on Very Large Corpora
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.33.7

Friday, February 20, 2009

next in line

N-GRAM AND LOCAL CONTEXT ANALYSIS FOR PERSIAN TEXT RETRIEVAL
by Farhad Oroumchian, Abolfazl Aleahmad, Parsia Hakimian, Farzad Mahdikhani
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.131.8268

Modeling documents for structure recognition using generalized n-grams (1997)
by R. Brugger, A. Zramdini, R. Ingold
In Proceedings of ICDAR
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.2338

Discovering characteristic expressions from literary works: A new text analysis method beyond n-gram and kwic (2001)
by Masayuki Takeda, Tetsuya Matsumoto, Tomoko Fukuda, Ichirō Nanri
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.129.5337

currently reading

Information Retrieval

http://www.cc.gatech.edu/~isbell/tutorials/TextRetrieval.fm.pdf

Understanding Inverse Document Frequency: On theoretical arguments for IDF

Stephen Robertson, Microsoft Research, 7 JJ Thomson Avenue, Cambridge CB3 0FB, UK (and City University, London, UK)

http://www.soi.city.ac.uk/~ser/idfpapers/Robertson_idf_JDoc.pdf#search=

Using TF-IDF to Determine Word Relevance in Document Queries

Juan Ramos, Department of Computer Science, Rutgers University, 23515 BPO Way, Piscataway, NJ, 08855

http://www.cs.rutgers.edu/~mlittman/courses/ml03/iCML03/papers/ramos.pdf#search=

TEXT MINING

Ian H. Witten, Computer Science, University of Waikato, Hamilton, New Zealand