Unit 2
Muddiest point:
1. Why does stemming never lower recall?
2. Under what conditions does WSD not work?
Unit 1
Reading Notes
FOA section 1.1
Introduces “finding out about” (FOA), a cognitive activity carried out by asking questions through language or documents. In this process, coming up with questions, answering them, and assessing the answers play a vital role in finding out about something.
It also talks a little about the information retrieval tradition, at the core of which is the search engine. The schematic of this search engine is as follows:
A person with an information need comes up with a query. The algorithm then matches the descriptive features mentioned in the query against the documents that share those same features.
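The matching step described above can be sketched roughly as follows. This is only an illustration, not the algorithm from FOA: it assumes that the “descriptive features” are simply lowercased words, and the names `features`, `match`, and the sample documents are made up for this note.

```python
def features(text):
    """Lowercased word set, a crude stand-in for 'descriptive features' (assumption)."""
    return set(text.lower().split())


def match(query, documents):
    """Return (doc_id, shared_features) for documents sharing a feature with the query."""
    query_features = features(query)
    results = []
    for doc_id, text in documents.items():
        shared = query_features & features(text)
        if shared:
            results.append((doc_id, shared))
    # Documents sharing more features with the query come first; ties break on doc_id.
    results.sort(key=lambda pair: (-len(pair[1]), pair[0]))
    return results


# Hypothetical toy collection
docs = {
    "d1": "search engines match queries to documents",
    "d2": "cooking pasta at home",
    "d3": "documents and queries share features",
}
matches = match("queries about documents", docs)
for doc_id, shared in matches:
    print(doc_id, sorted(shared))
```

Here d2 shares no feature with the query, so it is never returned, while d1 and d3 each share two features.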
ES section 1.1 and 1.2
1.1 What Is Information Retrieval?
Introduces what information retrieval is, along with several kinds of search. Web search is the most popular and most heavily used. Desktop and file system search is especially effective for files stored on a local hard disk and possibly on disks connected over a local network. Digital libraries and other specialized IR systems support access to collections of high-quality material, often of a proprietary nature.
1.2 Information Retrieval Systems
It introduces the fundamental terminology and technology of information retrieval systems.
1.2.1 Basic IR System Architecture
Information need → a query to the IR system → processed by a search engine (maintaining collection statistics associated with the index) → computes a score → the result list may be subjected to further processing
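The flow above can be sketched in a few lines. This is a minimal sketch under my own assumptions, not the architecture from the book: the index is a plain inverted index, the only collection statistic kept is document frequency, and the score is a simple TF-IDF sum. The names `build_index` and `search` and the sample documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_index(documents):
    """Inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def search(query, index, num_docs):
    """Score documents with a simple TF-IDF sum (an assumed scoring function)."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the collection
        # Collection statistic: document frequency is the length of the postings list.
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Result list ordered by decreasing score; ties break on doc_id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy collection
docs = {
    "d1": "information retrieval systems",
    "d2": "retrieval of data",
    "d3": "cooking recipes",
}
index = build_index(docs)
ranked = search("information retrieval", index, len(docs))
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

Further processing of the result list (for example, removing duplicates) would happen after `search` returns.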
1.2.2 Documents and Update
A document refers to any self-contained unit that can be returned to the user as a search result.
1.2.3 Performance Evaluation
Two measures: efficiency (response time) and effectiveness (relevance; see the Probability Ranking Principle).
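Both measures can be illustrated with a small sketch. It assumes we already have an estimated probability of relevance for each document (the numbers below are invented), orders documents by decreasing estimate as the Probability Ranking Principle suggests, and times the call as a stand-in for response time. The helper names are hypothetical.

```python
import time


def rank_by_relevance(estimates):
    """Order doc_ids by decreasing estimated probability of relevance."""
    return sorted(estimates, key=lambda doc_id: (-estimates[doc_id], doc_id))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Invented relevance estimates for three documents
estimates = {"d1": 0.2, "d2": 0.9, "d3": 0.5}
ranking, seconds = timed(rank_by_relevance, estimates)
print(ranking, f"{seconds:.6f}s")
```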
MIR sections 1.1-1.4
1.1 Information Retrieval
It introduces the early development of IR, its use in the specialized field of libraries and digital libraries, and its move to center stage with the WWW.
1.2 The IR Problem
The primary goal of an IR system is to retrieve all the documents that are relevant to a user query while retrieving as few non-relevant documents as possible. There are two different user tasks: searching and browsing. The section also covers the difference between information retrieval and data retrieval (in terms of how accurate the results must be).
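The goal stated above is usually quantified with recall (how many of the relevant documents were retrieved) and precision (how many of the retrieved documents are relevant). A minimal sketch, with an invented result set and invented relevance judgments:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: 4 documents retrieved, 3 documents actually relevant
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this example two of the four retrieved documents are relevant, and two of the three relevant documents were found.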
1.3 The IR System
It introduces the software architecture of the IR system and the retrieval and ranking processes, much the same as in the previous reading.
1.4 The Web
1.4.1 It introduces the history of the Web's development.
1.4.2 It introduces the popularity of the Web and its “free to publish” feature.
1.4.3 The ranking and indexing components of any search engine are fundamental pieces of IR technology. The Web has had some major impacts on search: the nature of the document collection itself; the size of the collection and the volume of user queries submitted on a daily basis; the vast size of the document collection; the Web being not just a repository of documents and data, but also a medium for doing business; and Web advertising and other economic incentives (with a little about Web spam).
1.4.4 Practical issues on the Web: security, privacy, copyright and patent rights, scanning, and optical character recognition.