The “What the Heck!?!” Web

This is my article in the Irish Times published today, on the semantic web and its opportunities. It is of course based on my work with Sophia and also DERI.

The early world wide web was chiefly used by companies to publish content and advertisements about themselves. In Ireland, it used to be the “worldwide wait” while our weak broadband links retrieved pages of interest. Today, the web has developed into a social network in which all of us can easily contribute content ourselves, by Tweets, Facebook posts, online comments and so on. However, sometimes I wonder whether it has become the “worldwide what the heck”, as inane and puerile content is frequently and automatically presented to us as a frustrating distraction.

Too often when we search the web, completely irrelevant results are presented, hiding the real answers we seek. Frequently when using a social network, pointless advertisements are served up to most of us for extraneous offers such as animal print clothing, teeth implants and half marathons. Why is the web so weird and witless?

Sir Tim Berners-Lee is the founding father of the web, and built the world’s first web site in 1991. Since 2001, he has been promoting the “semantic web” as an extension of the current world wide web to give well-defined meaning to the information available via the web, so enabling better co-operation of computers and people. Much of today’s web software does not understand the meaning of web pages: while it may understand that a page should be formatted in a certain way to look pretty on a screen, it may not understand that, for example, the page I am reading relates to treatment prescribed by my mother’s doctor, and that I now need to identify a reputable physiotherapist within 10km of her home. Sir Tim argues that if a reasonably large amount of the information and data available worldwide on the web can be categorized, sorted and understood by computers, then the web would become immeasurably more valuable as a global resource.

The first step for the semantic web has been to classify information using taxonomies (akin to the Dewey Decimal Classification system widely used in libraries). This can then be augmented by ontologies, which are akin to equipping a computer with concepts: for example, the concept of a “company” is a “legal entity” owned by “shareholders” and a set of “places” where “people” come together to offer a “service” or “product” bought by “customers” in conjunction with “partners” and “suppliers” and obeying “regulations” established by “government authorities”. In principle, once data and content are labelled and tagged using these approaches, then more intelligent software tools could be engineered to not only understand how to lay out a web page for a screen, but also to understand what each page is actually describing.
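
For readers who prefer to see this in code, the “company” concept could be written down as a handful of labelled relationships. This is only a minimal sketch in Python: the relation names (`is_a`, `owned_by`, `offers` and so on) and the `related` helper are invented for illustration and do not come from any real ontology standard.

```python
# Hypothetical mini-ontology for the "company" concept, held as
# (subject, relation, object) triples. All relation names are illustrative.
COMPANY_ONTOLOGY = [
    ("company", "is_a", "legal entity"),
    ("company", "owned_by", "shareholders"),
    ("company", "located_at", "places"),
    ("company", "offers", "service"),
    ("company", "offers", "product"),
    ("company", "works_with", "partners"),
    ("company", "works_with", "suppliers"),
    ("company", "obeys", "regulations"),
    ("customers", "buy", "service"),
    ("customers", "buy", "product"),
    ("government authorities", "establish", "regulations"),
]


def related(subject, relation, triples=COMPANY_ONTOLOGY):
    """Return every object linked to `subject` by `relation`, in listed order."""
    return [o for s, r, o in triples if s == subject and r == relation]


# Ask the toy ontology what a company offers and who sets its regulations.
print(related("company", "offers"))
print(related("government authorities", "establish"))
```

Once a page is tagged with relationships like these, software can answer a question such as “what does a company offer?” by looking up the concept rather than matching keywords.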

There are some obvious challenges. Some of the content on a specific web page could be vague and incomplete – or even sarcastic, ironic or deceitful. Taxonomies and ontologies should work across all languages. In the absence of a global central authority, independently developed ontologies may be inconsistent: is the concept of a “company” in Ireland entirely correct for, say, an IFSC back-office operation? Less obvious, but a key technical point, is that on the one hand web pages are currently structured as “trees” (a page contains sections which in turn contain paragraphs which in turn contain sentences, and so on), but on the other hand knowledge is structured as “graphs” (for example, people do not contain other people, but rather could be related, and/or friends, and/or share interests, and/or work for the same company).
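
The difference between a “tree” and a “graph” can be shown with a small Python sketch. The people, the company name and the relation labels below are all made up for illustration, and the `links_of` helper is just one simple way of walking such a graph.

```python
# A web page as a tree: every element sits inside exactly one parent.
page_tree = {
    "page": {
        "section": {
            "paragraph": ["sentence 1", "sentence 2"],
        },
    },
}

# Knowledge as a graph: nobody "contains" anybody else, and any two
# things may be linked in several ways. All names here are hypothetical.
relationships = [
    ("Aoife", "friend_of", "Brian"),
    ("Aoife", "related_to", "Ciara"),
    ("Aoife", "works_for", "ExampleCo"),
    ("Brian", "works_for", "ExampleCo"),
    ("Ciara", "shares_interest_with", "Brian"),
]


def links_of(person, edges=relationships):
    """Return (relation, other party) for every edge touching `person`, in either direction."""
    found = []
    for subject, relation, obj in edges:
        if subject == person:
            found.append((relation, obj))
        elif obj == person:
            found.append((relation, subject))
    return found


# Brian is reachable through three different kinds of relationship.
print(links_of("Brian"))
```

In the tree there is exactly one route from the page down to any sentence, whereas in the graph Brian can be reached through a friendship, an employer and a shared interest, which is why tree-shaped pages are an awkward fit for graph-shaped knowledge.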

The “social semantic web” adds human intervention via social networks to the semantic web. The automatic classification of web content using taxonomies and ontologies can be augmented by collaborative labeling and tagging of data by humans. Some strategists believe that exploiting social networking can lead to higher quality results: for example, if I am seeking a good physiotherapist within 10km of my mother’s home, would asking my social circle of friends and acquaintances lead to a better result than that from an automatic search? If physiotherapists want to advertise online, which is the optimal online advertiser to use?

The quest for higher quality online advertising – the “right ad at the right time in the right place” – is a strong commercial catalyst for a better and wiser web. Google attempts to solve the right ad challenge by inferring our interests from what we search for. Facebook attempts to solve it by analyzing our chit-chat with our friends. The more that an online advertiser can encourage us to directly or indirectly tell it about our interests, the more likely it is to become highly successful – and useful! Despite what many in the traditional newspapers believe, I believe that for them there is a substantial opportunity online, since they could then observe which specific articles we each read, and tailor advertisements to each one of us accordingly.

However, it would be disappointing if the sole benefit of the semantic web were better targeted advertising. Rather we should expect the semantic web to actively assist us, gently intervening when appropriate, politely bringing things to our attention. Currently we browse the web based on keywords we give the search engines, what our friends recommend, and what we come across. Rather than results from simple matching of keywords and just from what our friends like, imagine if web software tools were sufficiently powerful to unearth the latent intelligence already in the web. Medical clinicians may discover new links between diseases, deduced from research results already available today but currently lost in the mass of the web. Historians may realize that particular events are related, based on evidence whose importance had been overlooked. Researchers, journalists, genealogists and indeed all of us may discover new relationships between stories, events and data which were latent but hitherto unrecognized in the web.

There have been two decades of the world wide web, and a decade of the semantic web, but the web still has many “what the heck” moments. There is substantial opportunity for innovation to make the web wise and intelligent.

[Disclosure: Chris Horn is an Advisory Board member of the SFI-funded DERI project at NUIG researching the semantic web. He is also Chairman of Sophia Search, a Belfast-based company with semantic search and discovery solutions.]


This entry was posted in DERI, innovation, Ireland, Irish Times, social networking, Sophia.

3 Responses to The “What the Heck!?!” Web

  1. Seamus Grimes says:


    I would imagine that many people will opt to use all the privacy settings possible to prevent themselves being profiled for targeted marketing. Will this not defeat attempts by web companies to exploit personal data?

  2. chrisjhorn says:

    Hi Seamus,

    My view is that targeted marketing should be on an “opt-in” rather than “opt-out” basis.

    There is of course the trade-off between a free service (typically bootstrapped by risk capital…) and a paid-for service: would one be prepared to pay a membership fee to, e.g., Facebook, and in return expect/demand an ad-free interface? Would one be prepared to pay Google a membership fee and have no search-based ads? In somewhat the same vein, would you be prepared to pay a higher TV license fee to a national broadcaster like RTE and have no ads? Would you be prepared to pay a higher price for your newspaper if it were ad-free?…

    My own theory, for what it’s worth, is that the best solution is a completely free service augmented with really useful and relevant ads specifically tailored for each and every user and, in the clear opinion of each such user, of real value to each one of them. I have some specific ideas and work here, but can’t say much more for the moment … 🙂

    best wishes

  3. Seamus Grimes says:


    Obviously companies in this area need a business model that delivers profit, but from what I see, the trend and the underlying ‘philosophy’ (if that’s what you could call it) is to exploit the gullibility of many people and their lack of understanding of the degree to which privacy has become a commodity. An extreme example was the phone hacking business, but my view is that many of these companies are not far behind in being opportunistic.
