A vertical search engine, as distinct from a general web search engine, focuses on a specific segment of online content. Such engines are also called specialty or topical search engines. The vertical content area may be based on topicality, media type, or genre of content. Common verticals include shopping, the automotive industry, legal information, medical information, scholarly literature, job search and travel. Examples of vertical search engines include Mocavo, Nuroa, Trulia and Yelp. In contrast to general web search engines, which attempt to index large portions of the World Wide Web using a web crawler, vertical search engines typically use a focused crawler, which attempts to index only web pages relevant to a pre-defined topic or set of topics.
Some vertical search sites focus on individual verticals, while other sites include multiple vertical searches within one search engine.
Vertical search offers several potential benefits over general search engines:
Vertical search can be viewed as similar to enterprise search where the domain of focus is the enterprise, such as a company, government or other organization. In 2013, consumer price comparison websites with integrated vertical search engines such as FindTheBest drew large rounds of venture capital funding, indicating a growth trend for these applications of vertical search technology.
Domain-specific verticals focus on a specific topic. John Battelle describes this in his book The Search (2005):
Domain-specific search solutions focus on one area of knowledge, creating customized search experiences, that because of the domain's limited corpus and clear relationships between concepts, provide extremely relevant results for searchers.
In the domain-specific setting, one can combine the tf-idf approach, implemented via an inverted index, with semantic approaches based on semantic headers and semantic skeletons. Instead of the most frequent keywords, a set of entities is extracted from a portion of text to be matched against a potential question. This allows much more flexibility due to real-time reasoning capabilities while matching questions and answers in the form of semantic headers.
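The keyword-based half of this combination can be illustrated with a minimal sketch of tf-idf scoring over an inverted index. The function names, the whitespace tokenizer, the raw tf times log idf weighting and the three sample documents are all assumptions made for illustration; the semantic-header matching described above is not modelled here.

```python
import math
from collections import Counter, defaultdict

def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    postings = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            postings[term][doc_id] = tf
    return postings, len(docs)

def search(query, postings, n_docs):
    """Rank documents by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        docs_with_term = postings.get(term, {})
        if not docs_with_term:
            continue  # term does not occur in the corpus
        idf = math.log(n_docs / len(docs_with_term))
        for doc_id, tf in docs_with_term.items():
            scores[doc_id] += tf * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical three-document corpus for a shopping or automotive vertical.
DOCS = {
    "d1": "used car prices and car reviews",
    "d2": "apartment rental prices",
    "d3": "car rental deals",
}
postings, n_docs = build_index(DOCS)
results = search("car prices", postings, n_docs)
```

Here `results` ranks `d1` first because it contains both query terms and mentions "car" twice. In a limited domain corpus the index stays small, which is one reason this kind of scoring can be combined with heavier semantic matching at query time.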
A general search engine indexes all pages and typically searches in a breadth-first manner to collect documents. Spidering in a domain-specific search engine, by contrast, can be more efficient because it searches only a small subset of documents, focusing on a particular set. Such spidering can be accomplished using a reinforcement learning framework, which allows optimal behaviour and which, according to experimental results, is three times more efficient than breadth-first search.
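The difference between breadth-first and focused spidering can be sketched on a toy link graph. Note that this is not the reinforcement learning framework referred to above: it substitutes a simpler best-first heuristic, in which links found on more relevant pages are fetched earlier. The `WEB` graph, the `TOPIC` word set, the `ON_TOPIC_PAGES` ground truth and the `relevance` scoring are hypothetical and chosen only for illustration.

```python
import heapq

# Hypothetical toy web: page name -> (page text, outgoing links).
WEB = {
    "home":    ("news sports cars", ["sports", "cars", "news"]),
    "news":    ("daily news headlines", ["weather"]),
    "sports":  ("football scores", ["tennis"]),
    "cars":    ("used cars for sale", ["sedans", "trucks"]),
    "weather": ("rain forecast", []),
    "tennis":  ("tennis results", []),
    "sedans":  ("sedans cars listings", []),
    "trucks":  ("trucks cars listings", []),
}
TOPIC = {"cars", "sedans", "trucks"}           # assumed topic vocabulary
ON_TOPIC_PAGES = {"cars", "sedans", "trucks"}  # ground truth for evaluation only

def relevance(text):
    """Fraction of words on the page that belong to the topic vocabulary."""
    words = text.split()
    return sum(w in TOPIC for w in words) / len(words)

def crawl(start, budget, focused):
    """Fetch at most `budget` pages and return them in fetch order.

    With focused=False every link gets the same priority, so the heap
    degenerates to a FIFO queue (breadth-first). With focused=True a link
    inherits the relevance of the page it was found on (best-first).
    """
    frontier = [(0.0, 0, start)]  # (priority, insertion order, page)
    seen = {start}
    fetched = []
    counter = 1
    while frontier and len(fetched) < budget:
        _, _, page = heapq.heappop(frontier)
        text, links = WEB[page]
        fetched.append(page)
        score = relevance(text)
        for link in links:
            if link not in seen:
                seen.add(link)
                priority = -score if focused else 0.0
                heapq.heappush(frontier, (priority, counter, link))
                counter += 1
    return fetched

breadth_first = crawl("home", budget=6, focused=False)
best_first = crawl("home", budget=6, focused=True)
print(len(ON_TOPIC_PAGES & set(breadth_first)), "on-topic pages (breadth-first)")
print(len(ON_TOPIC_PAGES & set(best_first)), "on-topic pages (focused)")
```

With a budget of six fetches, the focused crawl in this toy graph reaches all three on-topic pages while the breadth-first crawl reaches two. The gap on a real crawl depends on how well the priority estimate predicts relevance, which is the part a learned policy is meant to improve.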