Information Architecture and Search Engine Analysis – SQA

The Scottish Qualifications Authority (SQA) accredits vocational qualifications offered across Scotland, including Scottish Vocational Qualifications, and approves the awarding bodies that wish to award them.

Desired outputs: recommendations for Lucene search engine configuration, wireframes for key pages

Users complained that searching the site was difficult and that search results were often irrelevant.

I needed to approach the problem from several directions. First, what were users looking for, and how well did the search engine perform with those queries? Second, were there other problems with the website?

First step: review data from Google Analytics to see which search terms were most commonly used. This revealed a combination of simple course names, natural language queries, spelling variations and abbreviations… nothing unexpected.

The next step was a little trickier: look at how the site’s search *engines performed with the different search strings…

The site uses two main search engines. One has a look-ahead predictive function, which would explain the large number of single course queries in Google Analytics; the other is a free text field. I observed two things: the predictive engine could be crashed by forcing a search before the search box returned a predictive match, and abbreviations or misspellings returned no matches at all.

The next step was to look at the accuracy of the matches returned: how robust was the quality of the search engine? I constructed a heuristic that awarded a relevancy score to each result based on how many clicks it took to reach the **destination page. I then cross-checked these scores against results from Google as a control.
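
A click-based heuristic of this kind could be sketched roughly as follows. This is only an illustration: the scoring scale, the function names and the sample queries and click counts are all invented for the example and are not the values used on the project.

```python
from statistics import mean

# Assumed scale: reaching the destination page in 1 click scores 5,
# each extra click loses a point, and 6 or more clicks (or never
# reaching the page at all) scores 0.
BEST_SCORE = 5

def relevancy(clicks_to_destination):
    """Score one search; None means the destination page was never reached."""
    if clicks_to_destination is None:
        return 0
    return max(0, BEST_SCORE + 1 - clicks_to_destination)

def compare_engines(observations):
    """observations maps query -> (internal_clicks, control_clicks).

    Returns the average score for the internal engine and for the control.
    """
    internal = [relevancy(i) for i, _ in observations.values()]
    control = [relevancy(c) for _, c in observations.values()]
    return mean(internal), mean(control)

# Invented sample data, purely to show the shape of the comparison.
sample = {
    "higher maths": (4, 1),
    "svq": (None, 2),
    "national 5 english": (2, 1),
}

internal_avg, control_avg = compare_engines(sample)
print(f"internal: {internal_avg:.2f}  control: {control_avg:.2f}")
```

Scoring the internal engine and the control on the same scale makes the gap between them easy to track as the configuration changes.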

Finally, I tested 20 users from 4 different personas against a number of testing scenarios. This confirmed the basic flaws and technical limitations of the search engine uncovered in the earlier research. I also observed that users were confused about the functions of the different search boxes (1), a problem compounded by the poor layout of the results (2).

I also discovered that it was the poor quality of the information architecture that was forcing users to search.


  • Search was consolidated into one scoped search

  • Search results were simplified and advanced search features were removed.

  • If search predictions were to be used they were to be funneled based on the scope of the search; a ‘bank’ of the top 100 predictions should be included as a minimum.
  • There were no metadata models, so I created metadata rules for 20 key pages as defined by SQA.
  • The order in which the homepage loaded its .js scripts was to change so that the scripts for search loaded first. This sped up the return of predictive searches and avoided crashes if the user forced the search.
  • Extra metadata fields were created to cater for abbreviations, misspellings and synonyms for each page. This was done within the CMS as additional attributes on the page template.
  • The ‘sloppiness’ factor for searches was increased (a term referring to the tolerance of fuzzy searches).
  • Finally, I constructed a sandbox for SQA to test the success of the internal engine versus Google. If the internal search continued to perform worse than Google, the Lucene search engine was to be abandoned completely.
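
To illustrate how the extra metadata fields and a looser fuzzy tolerance work together, here is a minimal sketch in plain Python. It is not Lucene's API or the CMS's data model: the `aliases` attribute, the page records and the `max_edits` parameter are hypothetical names chosen for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance: the number of single-character edits from a to b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character
                            curr[j - 1] + 1,      # insert a character
                            prev[j - 1] + cost))  # substitute a character
        prev = curr
    return prev[-1]

# Hypothetical page records: a title plus an extra attribute holding
# abbreviations, common misspellings and synonyms for that page.
PAGES = [
    {"title": "Scottish Vocational Qualifications",
     "aliases": ["svq", "svqs", "vocational qualifications"]},
    {"title": "Accreditation",
     "aliases": ["acreditation", "accredited"]},
]

def find_pages(query, max_edits=1):
    """Return titles whose title or aliases are within max_edits of the query."""
    q = query.strip().lower()
    hits = []
    for page in PAGES:
        terms = [page["title"].lower()] + page["aliases"]
        if any(edit_distance(q, term) <= max_edits for term in terms):
            hits.append(page["title"])
    return hits

print(find_pages("SVQ"))                        # abbreviation found via the alias field
print(find_pages("accreditaton"))               # typo tolerated with one edit allowed
print(find_pages("accreditaton", max_edits=0))  # the same typo fails with no tolerance
```

The alias field catches variations that are known in advance, while the edit tolerance catches typos nobody anticipated. Raising the tolerance too far starts to match unrelated terms, which is why it is worth testing any change in a sandbox.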

A very difficult project, which I can only compare to the joy one might experience trying to untangle a bowl of spaghetti: multiple search boxes spanning different data sources; a deep, impenetrable IA; weak metadata; poor quality of results; poor layouts… The combination of all these factors contributed to a problem of some magnitude.

With simpler layouts and a scoped search, SQA have been able to improve search engine performance. However, they have a long way to go to improve the information architecture, and without this essential step the user experience will never be perfect.

*3 search engines. I know, there’s the problem right there…!

** Destination page based on consensus among domain experts.

