The evolution of human and machine interaction: where next?

Tim Olsen, Intelligent Automation Director at Hays Technology


Robots are digital workers. Digital workers work with digital data. The clue is in the name. I’m being flippant, but it is an important point when considering automation. Robotic Process Automation (RPA) sits at the core of automation technologies and is often the starting point for organisations wanting to drive productivity quickly.

The challenge is that few organisations are entirely digital, and that’s for two reasons:

1. Most organisations have been around for a while and have grown over time with manual processes and legacy systems which are now outdated (technical debt).

2. Humans aren’t digital – and yet they are inevitably the starting point for any product or service. There is an interaction with an analogue entity which, frustratingly, doesn’t behave in a consistent manner, using a variety of channels and media in an unstructured format – whether speech or writing. That doesn’t work well with robots, which need their incoming data in a consistent electronic format to make sense of it and then process it.

Given that the aim of automation is to maximise productivity, it is necessary to extend the scope of automation across as much of an end-to-end process as possible. We have established that the trickiest elements tend to be at the start with human interaction, so if you imagine a process map from left to right, we want to ‘shift the automation capability left’ to start earlier in the process.

So, how to deal with those pesky humans?


Speech is one of the most difficult and least structured forms of data imaginable. Think about accents, languages and the different ways of constructing a sentence. Ultimately, Natural Language Processing (NLP) converts sounds into text and then processes the text. Simple forms of NLP essentially look for trigger words, such as ‘bill’, ‘address’ or similar. More advanced solutions are conversational and replicate human speech patterns.

Whilst a chatbot usually uses typed text, the background programming is much the same, allowing the user to converse in real time with the machine. This creates digital data which robots can then pick up and process – for example to complete a payment, or a change of address. Some RPA vendors already offer speech-based triggers for bots as standard.
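The trigger-word approach described above can be sketched in a few lines of Python. The keywords and intent names here are invented for illustration and aren’t taken from any vendor’s product:

```python
# Minimal sketch of trigger-word intent matching: turn an unstructured
# utterance into a structured intent a bot can act on.
# Trigger words and intent names are illustrative assumptions.

TRIGGERS = {
    "bill": "billing_query",
    "address": "change_of_address",
    "payment": "make_payment",
}

def classify_utterance(text: str) -> str:
    """Return the first intent whose trigger word appears in the text."""
    for word in text.lower().split():
        for trigger, intent in TRIGGERS.items():
            if trigger in word:  # crude substring match on each word
                return intent
    return "unknown"  # no trigger found; hand off to a human agent

print(classify_utterance("I want to update my address please"))
# change_of_address
```

A real system would use a trained language model rather than substring matching, but the principle is the same: convert unstructured human input into consistent, bot-readable data.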


The simplest way of ‘forcing’ structured data is through the use of a web form – encouraging users to visit a web page and enter their requirements into a structured form. This is reliable and effective but not every sector of society wants to interact in this way.
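To illustrate how a web form ‘forces’ structure, here is a minimal validation sketch. The field names and rules are assumptions for the example, not any particular platform’s schema:

```python
import re

# Each form field is checked against a simple rule before a bot ever
# sees the data, guaranteeing a consistent structure downstream.
# Field names and patterns are illustrative assumptions.
RULES = {
    "name": re.compile(r"\S+"),
    "postcode": re.compile(r"^[A-Za-z0-9 ]{3,10}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}

def validate_form(submission: dict) -> list:
    """Return the names of any fields that fail validation."""
    errors = []
    for field, pattern in RULES.items():
        if not pattern.search(submission.get(field, "")):
            errors.append(field)
    return errors

print(validate_form({"name": "Tim", "postcode": "AB1 2CD",
                     "email": "tim@example.com"}))
# []  (an empty list means the submission is fully structured and valid)
```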

Optical Character Recognition (OCR) can look at an image and extract text from it – for example a written form or an email attachment. Handwriting, especially cursive, is still pretty problematic (I can’t read my own half the time), but the more prescriptive the format (think capital letters in boxes), the greater the accuracy that can be achieved.

Emails can be ‘read’ quite easily, and key words identified to allow classification and response. We can even identify sentiment from the type of words used and tailor our handling of the query accordingly.
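A crude sketch of the keyword classification and sentiment tailoring described above; the word lists and team names are invented for illustration:

```python
# Route an email by topic keyword and flag negative sentiment.
# NEGATIVE_WORDS and TOPICS are illustrative assumptions, not a
# production lexicon.

NEGATIVE_WORDS = {"complaint", "unacceptable", "angry", "disappointed"}
TOPICS = {"invoice": "billing", "refund": "payments", "address": "accounts"}

def route_email(body: str) -> tuple:
    """Return (team, priority) for an email body."""
    words = set(body.lower().replace(",", " ").replace(".", " ").split())
    team = next((t for kw, t in TOPICS.items() if kw in words), "general")
    priority = "urgent" if words & NEGATIVE_WORDS else "normal"
    return team, priority

print(route_email("I am disappointed, my invoice is wrong"))
# ('billing', 'urgent')
```

Commercial tools use statistical sentiment models rather than fixed word lists, but the handling logic – classify, then tailor the response – follows the same shape.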

A consistent approach must incorporate all media

Increasingly, we can blur the distinction between media. As mentioned, chatbots can effectively be visual or audible; some, like the Amelia chatbot, which presents an interactive ‘human’ face on screen, are both. We need to be able to transition between media seamlessly – a call can be initiated on a mobile phone and transition to a visual IVR, or vice versa. Chatbots need to hand off to human agents flawlessly, and conversational IVRs need to be imperceptible as machines.

Speech technologies have taken a massive step forward over the past ten years, yet we’re still only scratching the surface of what we will be doing via speech. As usage becomes more accepted and accuracy continues to grow, this medium will dominate our interactions and become more natural.

There is a snag, however. When we as humans know we’re interacting with a machine, we accept it. We may even find its limitations appealing (we all preferred R2-D2 to C-3PO, right?). When the machine becomes more and more humanoid, especially visually, something deep in the human psyche rejects it. We’ve all been creeped out by humanoid robots with fake skin and dead eyes. Until robots are genuinely imperceptible, there will be a chasm to cross psychologically – known as the uncanny valley.

In 1970, Masahiro Mori published his theory of people's reactions to robots looking and acting almost human – leading to the phrase “uncanny valley”.

Are we looking at human and machine interaction in the wrong way?

There is a change in direction which may turn the tables on this conundrum, however. Everyone expected physical bots to gradually become immersed in our daily lives, which, over time, they will. We also presumed that the differences between them and us would become less and less perceptible (we all have in mind the Voight-Kampff test deployed by fictional blade runners in a dystopian 2019 to detect replicants). However, very few expected humans to become absorbed instead into digital domains of their own.


As the metaverse evolves and we become immersed in augmented realities, operating and interacting with others through virtual avatars in a digital world, we will project a desired image of ourselves – a digital face or avatar. Anyone with a teenager who spends time on Roblox or World of Warcraft will know this is just over the horizon, not some unrealistic eventuality far in the future.

Speech will be the main means of communication, and as we become more digitised ourselves and start to work within the metaverse, the line we draw between human and robot workers will blur more and more. Bots will finally become fully interactive as AI capabilities grow and will be barely perceptible; it is then that they will become an active part of society as we know it.

Looking for more insights into the ways in which automation is shaping the world of work? Find them here.


Tim Olsen
Intelligent Automation Director, Hays Technology

Tim has worked in digital transformation for 20 years, developing solutions to improve user journeys and experience for blue-chip clients. More recently he grew the UK’s largest RPA Centre of Excellence (CoE) and went on to specialise in helping organisations overcome their barriers to scaling automation. He is a thought leader and evangelist for Intelligent Automation, and leads the IA Consulting specialism for Hays.