welcome to Aulibrary

Friday, 4 July 2014

CS6007 INFORMATION RETRIEVAL | syllabus (ELECTIVE-III)


CS6007    INFORMATION RETRIEVAL    (L T P C: 3 0 0 3)


OBJECTIVES:

The student should be made to:
• Learn the information retrieval models.
• Be familiar with Web search engines.
• Be exposed to link analysis.
• Understand Hadoop and MapReduce.
• Learn document text mining techniques.

UNIT I      INTRODUCTION    (9)

Introduction - History of IR - Components of IR - Issues - Open source search engine frameworks -
The impact of the Web on IR - The role of artificial intelligence (AI) in IR - IR versus Web search -
Components of a search engine - Characterizing the Web.

UNIT II      INFORMATION RETRIEVAL    (9)

Boolean and vector-space retrieval models - Term weighting - TF-IDF weighting - Cosine similarity -
Preprocessing - Inverted indices - Efficient processing with sparse vectors - Language model based
IR - Probabilistic IR - Latent Semantic Indexing - Relevance feedback and query expansion.
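
As an illustration of the TF-IDF weighting and cosine similarity topics in this unit, here is a minimal Python sketch. It is not part of the official syllabus. The function names `tf_idf_vectors` and `cosine_similarity` and the three toy documents are hypothetical, and the weighting variant (log-scaled term frequency times log inverse document frequency) is only one common choice, assumed here for the example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed variant: (1 + log tf) * log(N / df).
        vectors.append(
            {t: (1 + math.log(c)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy collection, tokenized by whitespace.
docs = [
    "information retrieval models".split(),
    "web search engine retrieval".split(),
    "text mining and clustering".split(),
]
vectors = tf_idf_vectors(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # share the term "retrieval"
sim_02 = cosine_similarity(vectors[0], vectors[2])  # share no terms
```

Storing each vector as a dictionary keeps only the non-zero weights, which is the idea behind the "efficient processing with sparse vectors" topic listed above.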

UNIT III      WEB SEARCH ENGINE – INTRODUCTION AND CRAWLING    (9)

Web search overview - Web structure - The user - Paid placement - Search engine optimization/spam -
Web size measurement - Web search architectures - Crawling - Meta-crawlers - Focused crawling -
Web indexes - Near-duplicate detection - Index compression - XML retrieval.

UNIT IV      WEB SEARCH – LINK ANALYSIS AND SPECIALIZED SEARCH    (9)

Link analysis - Hubs and authorities - PageRank and HITS algorithms - Searching and ranking -
Relevance scoring and ranking for the Web - Similarity - Hadoop and MapReduce - Evaluation -
Personalized search - Collaborative filtering and content-based recommendation of documents and
products - Handling the “invisible” Web - Snippet generation - Summarization - Question answering -
Cross-lingual retrieval.
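
As an illustration of the PageRank topic in this unit, here is a minimal Python sketch of the power-iteration form of the algorithm. It is not part of the official syllabus. The function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the four-page toy graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its score; scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        # Each page passes its rank in equal shares to the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy graph: C is linked to by A, B, and D; nothing links to D.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(graph)
```

In this toy graph, page C collects links from three pages and ends up with the highest score, while D, which no page links to, keeps only the baseline share.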

UNIT V      DOCUMENT TEXT MINING    (9)

Information filtering, organization, and relevance feedback - Text mining - Text classification and
clustering - Categorization algorithms: naive Bayes, decision trees, and nearest neighbor - Clustering
algorithms: agglomerative clustering, k-means, and expectation maximization (EM).

TOTAL: 45 PERIODS

OUTCOMES:

Upon completion of the course, students will be able to:
• Apply information retrieval models.
• Design a Web search engine.
• Use link analysis.
• Use Hadoop and MapReduce.
• Apply document text mining techniques.

TEXT BOOKS:

1. C. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval, Cambridge
University Press, 2008.
2. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts
and Technology behind Search, 2nd Edition, ACM Press Books, 2011.
3. Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in
Practice, 1st Edition, Addison Wesley, 2009.
4. Mark Levene, An Introduction to Search Engines and Web Navigation, 2nd Edition, Wiley, 2010.

REFERENCES:

1. Stefan Buettcher, Charles L. A. Clarke, and Gordon V. Cormack, Information Retrieval: Implementing
and Evaluating Search Engines, The MIT Press, 2010.
2. Ophir Frieder, Information Retrieval: Algorithms and Heuristics, The Information Retrieval Series,
2nd Edition, Springer, 2004.
3. Manu Konchady, Building Search Applications: Lucene, LingPipe, and Gate, 1st Edition, Mustru
Publishing, 2008.
