This article is one in a series on designing the search experience.
If the interface is the most visible element of the search experience, then the search technology (search tech for short) is the most misunderstood one. It is the search tech, along with content—the other overlooked element—that most influences the quality of the search experience.
For example, an important job of search tech is to infer the meaning of a query. So if a user searches for 'tenders', then the tech, along with some human assistance, should be able to infer that tenders, tender, notice, quotation, quote, project-invitations and similar terms refer to the same thing. And since the query is a single term without qualifiers, perhaps the user is looking for a page listing all the tenders (a topic page of sorts). A good search tech can easily offer such experiences.
There are a couple of reasons why business and IT people misunderstand search tech. First, they believe that it is a utility item—you just have to switch it on and forget about it. Second, they think that the search engine is the search tech. They are surprised to hear that the search tech is a stack of technologies that, in addition to a search engine, may include text analytics, taxonomies, search analytics, visualisation and much more.
Don't blame the business and IT people for making such assumptions; blame the search tech vendors. They took advantage of the organisation's lack of search knowledge to peddle their products, promised Google-like search performance out of the box and built a great wall of search-ignorance that has withstood the march of understanding for a long time.
The good news is that the wall is showing signs of crumbling. The deluge of big data and the need to tame it has put search tech in the limelight once again. Emerging technologies like machine learning and artificial intelligence are forcing business and IT people to reassess how they design, implement and manage search tech.
Let’s now look at some important components of the search tech stack.
Taxonomies and associated vocabularies such as thesauri and dictionaries provide the semantic structure to make sense of content. These are created by humans and exploited by machines to offer relevant results to users. For example, the text of an article on football does not reveal much about the sport. But the taxonomy of sport can add that ‘football’ and ‘soccer’ mean the same thing (unless you are in the US where football is a completely different sport) and that it involves kicking a ball with the foot to score a goal.
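To make the football example concrete, here is a minimal sketch of how a machine might use a human-built vocabulary. The taxonomy contents, the `preferred_term` function and the `locale` rule are all made up for illustration; a real taxonomy tool would hold far richer relationships.

```python
# Hypothetical mini-taxonomy: preferred term -> variant labels used in content and queries.
SPORT_TAXONOMY = {
    "association football": {"football", "soccer"},
    "american football": {"gridiron"},
}


def preferred_term(label, taxonomy=SPORT_TAXONOMY, locale="GB"):
    """Return the preferred taxonomy term for a label, or None if it is unknown."""
    label = label.strip().lower()
    # Assumed locale rule: in the US, 'football' refers to American football.
    if locale == "US" and label == "football":
        return "american football"
    for term, variants in taxonomy.items():
        if label == term or label in variants:
            return term
    return None


print(preferred_term("Soccer"))
print(preferred_term("football", locale="US"))
```

With this in place, 'soccer' and 'football' resolve to the same concept outside the US, which is the kind of knowledge the raw text of an article does not carry.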
Text analytics has a powerful stack of technologies of its own to extract meaning from unstructured content and add structure to it. It uses rule sets and natural language computations to analyse the content, extract named entities, facts and summaries, and offer insights via clustering and sentiment analysis. It can also use taxonomies to auto-categorise content. For example, text analytics can check if a document is about football played in the US and then categorise it under ‘American football’. Tom Reamy’s book, Deep Text, offers an easy introduction to the world of text analytics.
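As a rough illustration of rule-based auto-categorisation, the sketch below counts cue phrases per category and picks the best match. The rule set and the `categorise` function are assumptions for this example; real text analytics products use much richer rules and language models.

```python
import re

# Hypothetical rule set: category -> cue phrases that suggest it.
RULES = {
    "American football": {"nfl", "touchdown", "quarterback", "super bowl"},
    "Association football": {"fifa", "goalkeeper", "penalty kick", "premier league"},
}


def categorise(text, rules=RULES):
    """Return the category whose cue phrases occur most often, or None if none occur."""
    lowered = text.lower()
    scores = {}
    for category, cues in rules.items():
        # Count whole-phrase occurrences of every cue for this category.
        scores[category] = sum(
            len(re.findall(r"\b" + re.escape(cue) + r"\b", lowered)) for cue in cues
        )
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(categorise("The quarterback threw a touchdown in the NFL final."))
```

A document with no cue phrases is left uncategorised rather than forced into a category, which keeps the tags trustworthy.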
The search engine serves up the most relevant documents by using a ranking algorithm. In its simplest form, when a search is executed the user’s query is compared against all the documents in the collection and each document is given a score on how well it matches the query. The documents are then sorted on this score, and the top n are returned. The quality of the search results is highly correlated to the quality of the content. For example, you can index 100,000 documents without any enhancements and offer ordinary search experiences. But you can also pass the collection through taxonomies and text analytics and deliver relevant, specific and extraordinary search experiences (an example is coming up soon).
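The score-sort-return loop described above can be sketched in a few lines. This is a toy version that assumes a simple term-frequency and inverse-document-frequency weighting; the `tokenize` and `search` functions are illustrative and are not how any particular search engine computes relevance.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, documents, n=3):
    """Score every document against the query and return the top n (index, score) pairs."""
    doc_tokens = [tokenize(doc) for doc in documents]
    total = len(documents)
    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in doc_tokens:
        df.update(set(tokens))
    scored = []
    for idx, tokens in enumerate(doc_tokens):
        tf = Counter(tokens)
        # Rarer terms weigh more; repeated terms in a document add to its score.
        score = sum(
            tf[term] * math.log(1 + total / df[term])
            for term in tokenize(query)
            if term in tf
        )
        if score > 0:
            scored.append((idx, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


docs = ["tender notice for road works", "annual report", "tender tender quotation"]
print(search("tender", docs))
```

Notice that the ranking only sees the words in the documents. Anything taxonomies or text analytics add to those documents becomes extra material for the scoring to work with.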
A few years back, you may not have even considered open source search engines. But today, it would be unwise not to evaluate them on the same terms as the commercial ones. Open source search engines like Solr and Elasticsearch are as powerful as, if not more powerful than, some commercial search engines.
Search analytics measures how search is performing. It collects the terms people use, the results they view and the actions they take. It also finds terms that get zero results or a low number of results. The benefit of analysing search performance is that tweaks can be made to close any gaps. This way search analytics and relevancy tuning go hand-in-hand. For example, if search analytics finds that people searching for 'hotline' are getting zero results, then adding it as a synonym for 'contact' (tuning) can solve the issue.
A key benefit of search analytics is to be a source of feedback to the taxonomies and text analytics configurations. For example, the word "hotline" can be offered as a synonym term to the taxonomy system so that it can also be in sync with ground realities.
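Here is a small sketch of that feedback loop, assuming the search log is available as (query, result count) pairs. The log format and the `zero_result_terms` and `add_synonym` helpers are hypothetical; analytics tools expose this data in their own ways.

```python
from collections import Counter


def zero_result_terms(search_log):
    """Return (query, count) pairs for queries that got no results, most frequent first."""
    misses = Counter(
        query.strip().lower() for query, result_count in search_log if result_count == 0
    )
    return misses.most_common()


def add_synonym(synonyms, term, target):
    """Return a copy of the synonym map with `term` added as a synonym of `target`."""
    updated = {key: set(values) for key, values in synonyms.items()}
    updated.setdefault(target, set()).add(term)
    return updated


# Hypothetical log entries: (query, number of results returned).
log = [("hotline", 0), ("contact", 12), ("Hotline", 0), ("tenders", 3), ("helpdesk", 0)]
print(zero_result_terms(log))

synonyms = add_synonym({"contact": {"phone"}}, "hotline", "contact")
print(synonyms)
```

The same synonym map can then be handed back to whoever maintains the taxonomy, so the tuning done in search does not drift away from the controlled vocabulary.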
The diagram below, taken from an insightful article by Patrick Lambe, shows how the different technologies are related in the search tech stack.
Example—how search tech works
Consider a collection of news articles. The users of this collection are policy makers who need to keep track of events and agreements between countries. One of their top tasks is to study meetings between political leaders. Knowing this background, how can we create a search experience that helps users get their job done in a simple, helpful way?
Consider a sample query: obama meets indian pm
What would a vanilla search engine deliver? It would show top ranked articles for the keywords in the query. But that isn’t very helpful to our user, is it? Consider this alternative.
In the screenshot above, the user gets the profile pictures of the two leaders with their full names. The search uses these names to query the collection and therefore gets more relevant results. The filters on the left offer the user semantic handles to refine the query. How is all of this done?
Here are the steps that the search tech takes:
- Processes the query to identify named entities such as people and places mentioned (text analytics).
- Looks up the acronym 'pm' against a taxonomy to find its expanded form.
- Identifies the entity 'indian pm' as a person and looks it up in the taxonomy, which returns 'Narendra Modi'.
- Does the same for the person 'obama', which actually returns two results: 'Barack Obama' and 'Michelle Obama' (the user is given the option to select the correct Obama).
- Modifies the search query to include the names of the entities involved to return relevant results.
- Looks up the entities 'Barack Obama' and 'Narendra Modi' in DBpedia to get their photos.
- Creates filters based on the taxonomic terms they are tagged with.
Finally, search usage is analysed to check what queries are used and to refine where necessary.
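The query-side steps above can be sketched roughly as follows. The lookup tables stand in for the taxonomy, the function names are invented for this example, and the DBpedia photo lookup and filter creation are left out. Picking the first candidate by default is also an assumption; in the experience described above the user makes that choice.

```python
# Hypothetical lookup tables standing in for the taxonomy.
ACRONYMS = {"pm": "prime minister"}
PEOPLE = {
    "obama": ["Barack Obama", "Michelle Obama"],
    "indian prime minister": ["Narendra Modi"],
}


def expand_acronyms(query):
    """Replace known acronyms in the query with their expanded forms."""
    return " ".join(ACRONYMS.get(word, word) for word in query.lower().split())


def find_people(query):
    """Return {mention: candidate names} for every known mention found in the query."""
    found = {}
    # Check longer mentions first so multi-word entities are matched before shorter ones.
    for mention in sorted(PEOPLE, key=len, reverse=True):
        if mention in query:
            found[mention] = PEOPLE[mention]
    return found


def rewrite_query(query, choose=lambda candidates: candidates[0]):
    """Expand acronyms, resolve people, and return (rewritten query, ambiguous mentions)."""
    expanded = expand_acronyms(query)
    people = find_people(expanded)
    rewritten = expanded
    for mention, candidates in people.items():
        # Swap the mention for the chosen full name, quoted as a phrase.
        rewritten = rewritten.replace(mention, '"' + choose(candidates) + '"')
    # Mentions with more than one candidate should be offered to the user to confirm.
    ambiguous = {m: c for m, c in people.items() if len(c) > 1}
    return rewritten, ambiguous


rewritten, ambiguous = rewrite_query("obama meets indian pm")
print(rewritten)
print(ambiguous)
```

The rewritten query is what gets sent to the search engine, while the ambiguous mentions drive the "which Obama?" prompt shown to the user.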
Just imagine if the search tech could work such magic for each top intent in your team, department and organisation. People would be more efficient in their jobs and happier in their lives.
Search technology is more than a search engine. It is a stack of related technologies that must be stitched together to create desired experiences. This identifying, stitching and testing takes time. But the results are worth it. In a project we did, we saw search usage go from 0.5% to 5% in under six months and a corresponding increase in satisfaction rates (the website gets around 1.4 million visits per month). Search tech does make work more efficient and people happier.