Predicting Workplace Outcomes with Machine Learning

07 Nov 2019

Our platform is helping to bring the working environment into the 21st century, using a wide range of real-time monitoring and indoor-positioning technology to help organisations get the most from their office environment: from monitoring environmental conditions, right down to tracking the utilisation of individual desks and guiding people to suitable spaces to work.

But there is a second huge bonus to all this technology which has so far gone relatively untapped: every minute of every hour of every day, each of these sensors sends a little packet of data to our servers to be securely stored. Over the course of a year, a single device will send more than half a million of these tiny packets. When scaled up to a modern office with thousands of desks, this steady trickle of data becomes a torrent, flowing every minute into our databases, waiting to be mined for valuable business insights.
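The half-a-million figure follows directly from one packet per minute. A quick check, assuming a 365-day year and a hypothetical office of 1,000 desk sensors:

```python
# One packet per minute, per sensor (as described above).
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: a non-leap year

packets_per_device_per_year = MINUTES_PER_HOUR * HOURS_PER_DAY * DAYS_PER_YEAR
print(packets_per_device_per_year)  # 525600

# Assumption: an office with 1,000 desk sensors, each reporting once a minute.
desks = 1000
packets_per_office_per_year = packets_per_device_per_year * desks
print(packets_per_office_per_year)  # 525600000
```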

Over the past few years, techniques and technologies around Artificial Intelligence, Neural Networks, Machine Learning, Deep Learning, and Data Science have emerged from academic research, ready to be exploited. These technologies promise to revolutionise our entire economy over the coming decades, helping us to become more productive and solve problems that were previously out of reach. Trading stocks, driving a car, and discovering new cancer drugs could all soon become occupations for highly intelligent computer models.

At Spica, we wanted to explore how we could apply these technologies in the workplace and uncover how this new wealth of data can directly benefit the businesses we work with. Since our largest test dataset relates to individual desk sensing (presence), could we build a Machine Learning model that can uncover which factors most strongly determine whether a desk will be used or not? Can we predict when a desk will be occupied in the future and predict how many people will be in the building? What effect does a workplace policy have on the average number of hours spent at work? How do changes in environmental conditions affect how the space is used – and how will changes in the weather change how the space is used in the future?

These are the kinds of insights we wish to gain so that we can help our customers provide a highly productive, comfortable, and efficient workplace, whether by optimising cleaning and maintenance services, reducing the energy used for power, light, and heating, or helping catering teams avoid food waste.

So, to the technology. If you were going to predict whether a particular desk in your office is occupied, probably the most important factor would be the hour of the day, since it’s natural to assume that a desk is more likely to be occupied at 2:00pm than at 2:00am.

Another important factor is the day of the week: people are generally in the office more on a Monday than on a Sunday. After a quick bit of data aggregation, we can see these simple patterns in our base dataset (roughly 1,000 desks and around 12 months of data).
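As a rough illustration of that aggregation step, here is a minimal sketch that averages occupancy by day of week and hour of day. The `(timestamp, occupied)` reading layout, the function name `hourly_occupancy`, and the synthetic single-desk data are all assumptions made for the example, not our production schema.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def hourly_occupancy(readings):
    """Average occupancy (0.0 to 1.0) for each (weekday, hour) cell.

    `readings` is an iterable of (timestamp, occupied) pairs, one per
    sensor reading. This layout is hypothetical, chosen for illustration.
    """
    totals = defaultdict(lambda: [0, 0])  # (weekday, hour) -> [occupied, count]
    for ts, occupied in readings:
        cell = totals[(ts.weekday(), ts.hour)]
        cell[0] += int(occupied)
        cell[1] += 1
    return {key: occ / n for key, (occ, n) in totals.items()}

# Synthetic readings for one desk over one week, one reading per minute:
# occupied 09:00-17:00 on weekdays, empty otherwise.
start = datetime(2019, 9, 2)  # a Monday
readings = []
for minute in range(7 * 24 * 60):
    ts = start + timedelta(minutes=minute)
    occupied = ts.weekday() < 5 and 9 <= ts.hour < 17
    readings.append((ts, occupied))

profile = hourly_occupancy(readings)
print(profile[(0, 14)])  # Monday 2pm -> 1.0
print(profile[(6, 14)])  # Sunday 2pm -> 0.0
```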

Using an algorithm called a “decision tree regressor”, we can combine these inputs into a model which will predict desk occupancy throughout the week.
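To give a flavour of what this looks like in code, here is a minimal sketch using scikit-learn's `DecisionTreeRegressor`. The choice of library, the feature layout (day of week, hour of day), and the synthetic occupancy figures are assumptions for illustration; the real model and data are not shown here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic training data (an assumption): four weeks of hourly occupancy
# for an office that is busy 09:00-17:00 on weekdays and near-empty otherwise.
rng = np.random.default_rng(0)
rows, targets = [], []
for week in range(4):
    for day in range(7):        # 0 = Monday ... 6 = Sunday
        for hour in range(24):
            busy = day < 5 and 9 <= hour < 17
            base = 0.7 if busy else 0.02
            rows.append([day, hour])
            targets.append(base + rng.normal(0, 0.01))  # small random noise

X = np.array(rows)     # features: [day of week, hour of day]
y = np.array(targets)  # target: fraction of desks occupied

model = DecisionTreeRegressor(random_state=0)
model.fit(X, y)

# Predict occupancy for Monday 14:00 and Sunday 14:00.
monday_2pm, sunday_2pm = model.predict(np.array([[0, 14], [6, 14]]))
print(f"Monday 14:00: {monday_2pm:.2f}")
print(f"Sunday 14:00: {sunday_2pm:.2f}")
```

With only these two inputs, the tree effectively learns an average occupancy for each slice of the week, which is why it reproduces the weekly rhythm so readily.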

By examining this model we can see some interesting details. For example, people go home early on a Friday, then a smaller number of people come back into the office after a couple of hours, possibly to collect their belongings after a trip to the pub! If we train this model on the first 80% of the days in our dataset, we can see how well it fares when predicting the occupancy during the latter 20%:
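The split is made in time order rather than at random, so the model is always tested on days it has never seen. A minimal sketch of that kind of split, assuming a hypothetical helper `chronological_split` and an illustrative date range:

```python
from datetime import date, timedelta

def chronological_split(days, train_fraction=0.8):
    """Split dates into an earlier (train) part and a later (test) part."""
    ordered = sorted(days)
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Assumption: 365 consecutive days of data starting on an illustrative date.
all_days = [date(2018, 11, 1) + timedelta(days=i) for i in range(365)]
train_days, test_days = chronological_split(all_days)

print(len(train_days), len(test_days))  # 292 73
print(train_days[-1] < test_days[0])    # True: no test day precedes a train day
```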

The accuracy of our model on our test set turned out to be 93% when we compared it with actuals. Not bad, but it could be a lot better. An interesting prediction of our model is that Monday is the second quietest weekday after Friday. A closer look at part of our test set yields this:

What happened on Monday? The model predicted a fairly busy day at the office as usual, but according to our actual data almost nobody showed up. Upon closer inspection, this particular Monday turns out to have been the August bank holiday.

Let’s educate our model about UK bank holidays. After retraining:
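One simple way to do this is to add a bank-holiday flag as an extra input alongside the day and hour. The sketch below assumes that approach; the `features` helper is hypothetical, and the hard-coded dates are the 2019 bank holidays for England and Wales, chosen purely for illustration.

```python
from datetime import date

# Assumption: bank holidays for England and Wales in 2019, hard-coded for
# illustration. A real system would load these from a maintained source.
BANK_HOLIDAYS_2019 = {
    date(2019, 1, 1),    # New Year's Day
    date(2019, 4, 19),   # Good Friday
    date(2019, 4, 22),   # Easter Monday
    date(2019, 5, 6),    # Early May bank holiday
    date(2019, 5, 27),   # Spring bank holiday
    date(2019, 8, 26),   # Summer bank holiday
    date(2019, 12, 25),  # Christmas Day
    date(2019, 12, 26),  # Boxing Day
}

def features(day, hour):
    """Build a feature row: [day of week, hour of day, bank-holiday flag]."""
    return [day.weekday(), hour, int(day in BANK_HOLIDAYS_2019)]

print(features(date(2019, 8, 26), 14))  # [0, 14, 1] (bank holiday Monday)
print(features(date(2019, 9, 2), 14))   # [0, 14, 0] (ordinary Monday)
```

With the extra column, the model can tell a bank holiday Monday apart from an ordinary one instead of averaging the two together.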

Over 99% accuracy! This corresponds to an average error of less than 1 minute in 60. If we chart the model again, we see that Monday has now overtaken Tuesday as the busiest day in the office, and Friday is closer to the other weekdays too.
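To make the "minutes in 60" reading concrete, here is a small worked example. It assumes accuracy is defined as one minus the mean absolute error on the occupied fraction of each hour; that definition and both function names are assumptions for illustration.

```python
def accuracy_from_mae(actual, predicted):
    """Accuracy as 1 minus mean absolute error (an assumed definition)."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return 1 - sum(errors) / len(errors)

def error_minutes_per_hour(accuracy):
    """Convert an accuracy score into average minutes of error per hour."""
    return (1 - accuracy) * 60

# Made-up occupancy fractions for four hours, with one small miss.
actual = [0.50, 0.75, 0.25, 0.00]
predicted = [0.50, 0.75, 0.25, 0.04]
print(f"{accuracy_from_mae(actual, predicted):.2f}")  # 0.99

print(f"{error_minutes_per_hour(0.93):.1f}")  # 4.2 minutes per hour
print(f"{error_minutes_per_hour(0.99):.1f}")  # 0.6 minutes per hour
```

On this reading, moving from 93% to 99% takes the average error from a little over four minutes per hour to well under one.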

Now, this all sounds pretty obvious to us – but the point here is that the machine learning algorithm has uncovered these patterns itself and can begin to uncover other patterns and correlations which are much less obvious. We will be able to train our model to anticipate seasonal effects such as the school holidays, and plan to integrate historic weather, transport and traffic data to see if these factors can help improve our model even more.

Training better models and integrating machine learning predictions into our products will become increasingly important in ensuring that the experience of working and operating in a building is the best it can be.