The Meaning and Future of the Semantic Web
Many of today’s search engines rely on keywords. That is, the user enters the words relevant to the search (“Albert Einstein” and “Nobel”, for example), and the application returns all documents containing those words. This approach has two main drawbacks. The first is the low precision or relevance of the results: a search returns many irrelevant documents, because the presence of a keyword in a document does not necessarily imply that the document is relevant.
The second is excessive sensitivity to the vocabulary used in the search: relevant results may never be reached, because many documents of interest do not include the keywords themselves, but synonyms, homonyms or antonyms of them.
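The two drawbacks can be illustrated with a minimal sketch of keyword matching. The documents and the function name `keyword_search` are invented for illustration and do not describe how any real search engine is implemented.

```python
def keyword_search(query, documents):
    """Return the ids of documents that contain every word of the query."""
    terms = query.lower().split()
    return [
        doc_id
        for doc_id, text in documents.items()
        if all(term in text.lower().split() for term in terms)
    ]


# Hypothetical three-document collection.
documents = {
    "d1": "Albert Einstein received the Nobel Prize in Physics in 1921",
    "d2": "My cat is named Einstein and my dog is named Nobel",
    "d3": "The author of relativity was awarded the 1921 physics prize",
}

results = keyword_search("Einstein Nobel", documents)
print(results)  # ['d1', 'd2']
```

Document d2 is returned although it is irrelevant (low precision), while d3 is missed although it is about the same topic, because it uses different vocabulary.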
A study by David Hawking and several other researchers assessed 20 conventional (keyword-based) search engines using 54 searches. The percentage of relevant results after inspecting the top 20 websites returned was 0.5% for the best search engine (Northern Light), and Google was the second most precise engine. Thus the popularity of keyword-based search engines has little to do with their accuracy and more to do with the patience of their users.
A semantic search is a query that takes into account the context, and therefore the meaning, of what is being asked (and not just the words of the query), in order to avoid the ambiguities of both the queries and the text of the documents that users are looking for. For example, a semantic search with the words “discoverer” and “penicillin” would return documents about Alexander Fleming even if those two terms did not appear in them, because it would identify the concepts that structure the search (penicillin is a product whose discoverer is being sought or, more formally, Medicine (Penicillin) Inventor Person (Alexander Fleming)). The ultimate goal of semantic search is to let users make more precise and expressive searches that yield meaningful results with minimal user intervention.
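As a rough sketch of the penicillin example, the concepts behind a query can be modeled as (subject, relation, object) triples. The triples, the word-to-concept tables and the function names below are all assumptions made for illustration; a real engine would rely on a full ontology and far richer language analysis.

```python
# Hypothetical facts stored as (subject, relation, object) triples.
TRIPLES = [
    ("Penicillin", "is_a", "Medicine"),
    ("Alexander Fleming", "is_a", "Person"),
    ("Penicillin", "discovered_by", "Alexander Fleming"),
]

# Hypothetical mappings from query words to relations and entities.
WORD_TO_RELATION = {"discoverer": "discovered_by", "inventor": "discovered_by"}
WORD_TO_ENTITY = {"penicillin": "Penicillin"}

DOCUMENTS = {
    "d1": "Alexander Fleming was a Scottish physician and microbiologist",
    "d2": "A recipe for sourdough bread",
}


def semantic_lookup(query):
    """Map query words to concepts and follow the matching triples."""
    words = query.lower().split()
    relations = {WORD_TO_RELATION[w] for w in words if w in WORD_TO_RELATION}
    entities = {WORD_TO_ENTITY[w] for w in words if w in WORD_TO_ENTITY}
    return [obj for subj, rel, obj in TRIPLES
            if subj in entities and rel in relations]


def semantic_search(query):
    """Return documents that mention any concept the query resolves to."""
    answers = semantic_lookup(query)
    return [doc_id for doc_id, text in DOCUMENTS.items()
            if any(answer in text for answer in answers)]


print(semantic_lookup("discoverer penicillin"))  # ['Alexander Fleming']
print(semantic_search("discoverer penicillin"))  # ['d1']
```

Document d1 is found even though it contains neither “discoverer” nor “penicillin”, because the query is resolved to the concept it asks about before any text is matched.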
It is generally accepted that semantic searches rely on techniques that extract information through the use of ontologies or metadata. Ontologies allow engines to formally define the domains of interest (scientific theories, for example) on a foundation expressive enough for users to specify their searches in some detail, either before or during the execution of the query.
From a technical standpoint, a semantic search engine is an application that understands both the searches and the texts of documents on the web by using algorithms that simulate comprehension, and on that basis gives accurate results without the user having to open and inspect the documents themselves. An engine of this type recognizes the right context for the words or sentences of a search. Search engines such as Google or Yahoo are not semantic, since their algorithms are mainly based on statistics generated from words and links, not on cognitive algorithms that capture the knowledge implicit in the words and their context. For example, given a search like “Who was Uranus?”, some search engines return results related to the seventh planet from the sun, when it is clear that the purpose of the search is to find information about the primordial god of the heavens in Greek mythology.
Semantic search engines do not always identify the meaning of a polysemous word correctly the first time. Therefore, they must have means of disambiguation to determine the exact meaning of the word in the search. For example, a semantic search engine that uses internal ontologies with concepts from computing and from means of transport must have tools to determine what the user is referring to in a query containing the word bus: it can mean a means of transport, or it can mean “digital system that transfers data between components of a computer or between computers”. To do this, the engine can choose the most likely meaning, ask the user to choose among several options (as does the search engine Hakia, which draws on ontology extraction and natural language processing (NLP)), or use the other words in the search to infer the exact meaning of bus in this context (e.g., a query such as “What time does the bus leave on Friday from Boston to Stockbridge?”).
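The third strategy, inferring the meaning from the other words in the query, can be sketched as a simple overlap count between the query and a set of cue words for each sense. The cue words, the sense labels and the function `disambiguate` are assumptions chosen for illustration; this is a simplified overlap heuristic, not the method of any particular engine.

```python
import re

# Hypothetical cue words associated with each sense of "bus".
SENSES = {
    "bus": {
        "transport": {"time", "leave", "friday", "route", "station", "ticket", "driver"},
        "computing": {"data", "computer", "components", "transfers", "memory", "cpu"},
    }
}


def disambiguate(word, query):
    """Pick the sense whose cue words overlap most with the rest of the query."""
    tokens = set(re.findall(r"[a-z]+", query.lower())) - {word}
    scores = {sense: len(tokens & cues) for sense, cues in SENSES[word].items()}
    best = max(scores, key=scores.get)
    # With no matching cue the sense stays undecided; the engine could then
    # fall back to the most likely meaning or ask the user to choose.
    return best if scores[best] > 0 else None


print(disambiguate("bus", "What time does the bus leave on Friday from Boston to Stockbridge?"))
# transport
print(disambiguate("bus", "Which bus transfers data between the CPU and memory?"))
# computing
print(disambiguate("bus", "bus"))
# None
```

Returning `None` when nothing matches keeps the other two strategies from the text available as fallbacks.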
Because a semantic search engine is based on algorithms that simulate the understanding of words and thus establish relationships between them, it can return documents of interest to the user even when they do not include the words or expressions of the search. For example, a semantic search engine given the word “marsupial” would show documents in which terms like these appear: kangaroo, koala, New Guinean quoll, monito del monte, rat kangaroo, possum, opossum, Tasmanian devil. As this example shows, semantic searches are far superior to those based on keywords, since they can find documents of interest that would never be found with keywords. Also, if you seek information on various species of marsupials, you would not need to formulate the query in different ways, with the name of each species, to obtain the desired information.
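The marsupial example can be approximated by expanding the query term with the narrower terms listed under it in an ontology. The tiny ontology, the documents and the function names below are illustrative assumptions, and plain substring matching stands in for real text analysis.

```python
# Hypothetical ontology fragment: a concept and some of its narrower terms.
ONTOLOGY = {
    "marsupial": ["kangaroo", "koala", "possum", "opossum", "tasmanian devil"],
}


def expand(term):
    """Return the term itself plus any narrower terms known to the ontology."""
    term = term.lower()
    return [term] + ONTOLOGY.get(term, [])


def expanded_search(term, documents):
    """Return documents that mention the term or any of its narrower terms."""
    candidates = expand(term)
    return [doc_id for doc_id, text in documents.items()
            if any(candidate in text.lower() for candidate in candidates)]


documents = {
    "d1": "The koala feeds almost exclusively on eucalyptus leaves",
    "d2": "The Tasmanian devil is known for its loud screech",
    "d3": "The giraffe is the tallest living land animal",
}

print(expanded_search("marsupial", documents))  # ['d1', 'd2']
```

Neither d1 nor d2 contains the word “marsupial”, so a plain keyword match would return nothing here, and a single expanded query replaces one query per species.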
The lack of structure and semantic annotation in web resources (Word, PDF, HTML pages, etc.) requires the semantic search engine to use cognitive algorithms that analyze the resources word by word and sentence by sentence in order to assign words and sentences to ontological concepts. Hence, semantic search engines do not yet cover as many web resources as conventional ones, which use statistical algorithms that are much faster and completely automated. This limitation will disappear as cognitive algorithms improve and as the “semantic islands” unite to form the semantic web, or at least “semantic continents”.
