As IoT matures and the number of connected devices starts to soar from a relatively low level into the billions, the greater challenge for organisations employing IoT could lie more in managing the device data than in managing the devices themselves. If there are billions of devices, there will be trillions of data points transmitted on a daily basis. George Malim explores how all of this will be managed from the chip on the device through to the cloud, edge or on-premise data processing capability.
The sheer scale of data involved means that organisations cannot hope to address the challenges of managing IoT device data with casual approaches that involve applying technologies retrospectively. Technical frameworks, policies for ingesting, transforming and prioritising the data, and methods to assure compliance with regulations all need to be thought through, planned and designed, ideally in advance of deployment. Volume is a huge obstacle because only parts of IoT-originated data are valuable. This means that organisations need the capability to automatically sift and prioritise it to avoid the expense of processing, analysing and acting upon everything.
“In order to manage IoT data effectively, there has to be an overarching business strategy, matched up with supporting technology,” explains Peter Ruffley, the chairman of Zizo, a provider of edge analytics. “IoT data is in reality no different to any other type of data that we are creating with two big exceptions: one, there is a huge volume of it; and two, most of it is completely worthless.”
The issue of apparently worthless data is significant here because there’s often a feeling that all data contains at least some form of hidden value. Maybe it does, maybe it doesn’t, but storing and analysing it certainly involves cost. “One of the key issues here is that organisations expect to have to keep all of this data, due to the fact that there is a belief that there may be some value within it in the future,” confirms Ruffley. “This is most likely not the case and is why there needs to be a clear business strategy in place to best manage any IoT project.”
“This won’t be a one size fits all approach, but rather a combination of solutions in various locations – edge, cloud and on premise – which throws up its own challenges from a data management perspective,” he adds. “Data ownership is also a key challenge; understanding who owns what data, and what can be done with it will become a major part of the IoT data management lifecycle.”
Joel Chimoindes, the vice president of Maverick AV Solutions Europe, agrees that a first step is to gain understanding of where the value lies in the data you have. “In order to manage data effectively you have to understand first what data you need and what data is relevant for your business – your insights, the objective, the business outcome,” he explains. “Having decided what data you require, you should think about how to model and store that data.”
As a specialist in the audiovisual industry, Chimoindes gives the example of the challenges associated with data in digital signage. “The place where we come in is about how to collect and then communicate that data,” he says. “The main challenge of this is the sheer scale of digital signage networks and then the opportunity that AV endpoints offer for businesses. The proliferation of them means you have to be very clear what you want to achieve through an IoT project and how that will benefit your business.”
The disparity of devices adds a further layer of variation that device data strategies must address. “You have the [data] generating device to consider,” points out Fredrik Forslund, the vice president of enterprise and cloud erasure solutions at Blancco. “You have to work closely with manufacturers and understand what storage capabilities and security you have on the device. There will be quite smart ones like smart speakers and TVs and very stupid ones that are less intelligent and can’t do much more than broadcast some data via a GSM chip to the cloud.”
With all this data comes responsibility and Forslund is keen to emphasise that even with the hyperscale volumes of data generation, organisations still are bound by data regulation. “IoT generates a lot of new data and that generally goes straight into the cloud – either public or private,” he says. “Even so, someone has to be responsible for the data and then manage the lifecycle of the data from generation to ultimate erasure.”
Being responsible with IoT data is both a weighty burden and a potential key point of failure for IoT initiatives. “The rapid growth of IoT caused by expanding connections of unsecured end points, when combined with progressively worsening network attacks and system intrusions, has dramatically raised the risk exposure for unsecured networks to levels beyond most companies’ ability to calculate,” says Phil Celestini, the senior vice president and chief security and risk officer at Syniverse. “This emerging reality in turn calls into question previous risk acceptance decisions for connecting business systems to the public internet.”
So what are organisations doing to enable them to manage device data securely and responsibly? In essence, they’re turning to technology vendors to supply them with the data processing platforms they need to first handle scale and second uncover the value in the data. These vary widely depending on the data being collected and the format in which it is communicated. Some data is processed at the edge in relatively intelligent devices, thereby minimising what is communicated for central processing, while other data is simply shipped wholesale to the cloud for analysis.
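The idea of processing data at the edge so that less is communicated centrally can be illustrated with a minimal sketch. It assumes a simple "deadband" rule, in which a device forwards a reading only when it differs from the last forwarded value by more than a set threshold. The function name, the threshold and the sample temperatures are hypothetical and are not drawn from any vendor mentioned here.

```python
from typing import Iterable, Iterator, Optional


def deadband_filter(readings: Iterable[float], threshold: float) -> Iterator[float]:
    """Yield only readings that moved more than `threshold` from the last one sent."""
    last_sent: Optional[float] = None  # nothing has been forwarded yet
    for value in readings:
        # Always forward the first reading; afterwards forward only real changes.
        if last_sent is None or abs(value - last_sent) > threshold:
            last_sent = value
            yield value


# Hypothetical temperature samples collected on the device.
temperatures = [20.0, 20.1, 20.2, 21.5, 21.6, 19.0]
forwarded = list(deadband_filter(temperatures, threshold=1.0))
print(forwarded)  # [20.0, 21.5, 19.0]
```

In this example, three of the six readings leave the device. Whether such a rule is acceptable depends on what the business actually needs from the data, which is the point the interviewees return to throughout.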
“A huge variety of platforms are being used and developed to meet these needs,” confirms Ruffley. “These range from big cloud platforms like Microsoft Azure, down to [offerings from] specialist hardware and software vendors. Technologies such as time series databases which enable the analysis of data over time, through to NoSQL or NewSQL databases, which allow for high volumes of data alongside standard query technologies, for near real-time query, or Apache Spark for real-time data analysis and streaming, are being deployed. The main focus here for organisations is picking the right tool for the job; with a key understanding of what the costs will be at scale.”
For Chimoindes, effective data management is a complex equation involving six key areas – data gathering, transportation, storage, security, analysis and action – that must be addressed. “First is gathering the data: does the customer already have the means to measure particular data, or do new sensors need to be added?” he says. “Next is transportation. Moving large amounts of data can be costly and is also crucial to the success of IoT. Is a wired or wireless set-up ideal, will sensors feed straight to the cloud, or will you run on a hybrid of edge computing mixed with the cloud? This leads into the question of where your data will be stored.”
“Once you have established what data is to be gathered and how, you must ensure all this information is secure,” he adds. “There should be a security layer on the level of every sensor, the gateway or edge level and the cloud.”
“Analysis comes next and is when you can start to begin to search for solutions from the data you have collected using edge or cloud solutions to derive patterns that can then be solved,” he continues. “Finally, action. From the data gathered what rules, alarms or actions can be set up to improve the customer’s workplace?”
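The final "action" step, in which rules and alarms are applied to gathered data, might look something like the sketch below. It is an illustration under stated assumptions rather than a description of any product: the `Rule` structure, the field names (`temperature_c`, `seconds_since_heartbeat`), the thresholds and the action labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict[str, float]], bool]  # True when the rule fires
    action: str  # label of the action to trigger (hypothetical)


def evaluate(rules: List[Rule], reading: Dict[str, float]) -> List[str]:
    """Return the actions of every rule whose condition holds for this reading."""
    return [rule.action for rule in rules if rule.condition(reading)]


# Hypothetical rules for a display endpoint; missing fields default to 0.0.
RULES = [
    Rule("overheat", lambda r: r.get("temperature_c", 0.0) > 70.0, "raise_alarm"),
    Rule("offline", lambda r: r.get("seconds_since_heartbeat", 0.0) > 300.0, "notify_support"),
]

actions = evaluate(RULES, {"temperature_c": 82.5, "seconds_since_heartbeat": 12.0})
print(actions)  # ['raise_alarm']
```

Keeping the rules as data, separate from the code that evaluates them, is one way to let the business adjust thresholds without redeploying the analysis logic.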
This list is not exhaustive but it does help to lay out the size of the problem and the number of disciplines involved in device data management. Addressing them will require specialist expertise and, inevitably, human decision-making, even though automation is a prerequisite because of the volume of data involved and the speed of processing required.
“Due to the complex nature of any IoT deployment, there will be a requirement for specialist skills and expertise,” says Ruffley. “But, it is important to note that we must still hang onto first principles, which are how do we get the right data to the right people at the right time? This requires input from all areas of the business, including data science. As always, business engagement will make or break any analytical project.”
Chimoindes also acknowledges that human specialists are needed. “In the planning, installation and execution of an end-to-end solution, experts are still required at multiple levels,” he says. “The key figure is the solution architect to oversee the operation and then a human developer is still required to craft algorithms and patterns. The depth of understanding of the business and then what is possible in IoT will be the key to truly transformative solutions.”
Forslund agrees. “Everyone is looking for the ability to automate and integrate into the common process but sometimes that isn’t possible,” he says.
Ruffley sees technologies coming to the aid of business engagement, allowing greater intelligence to be automated. “Machine learning is going to be very important to battle the raft of data that is flowing into the IoT architecture, however, we have to create solutions that meet the first principles I mentioned earlier,” he adds. “This requires a new look at the data flow – from device to processing, and edge computing has a big part to play in sending the right data to tools and technologies to be accessed by individuals who can make a difference.”
New market requirements often result in new approaches from solution vendors and the area of IoT device data management is no exception. Chimoindes cites the aggregation of solutions from multiple vendors as a means to simplify the landscape for organisations. “Multiple vendors and products are needed in a single solution, so trying to make this packaged and repeatable is the key to success,” he says. “We are seeing manufacturers for the first time working together to create common protocols and solutions which can be scaled globally.”
Ruffley sees increased deployment of edge computing capability as another transformative technological lever. “The emergence of edge computing with connected micro data centres, such as Vapor IO, is taking the heat from the cloud data centres,” he says. “Also, with the creation of Microsoft Azure Stack and AWS Outposts, the bigger players are seeing a need to push processing further out into the wild. Finally, we are seeing some interesting hardware developments with players like Intel and Lenovo with their SE350 server simplifying data management at the edge.”
In spite of vendor efforts to simplify device data management, the complexity of integrating data from multiple different devices and then analysing it to create valuable insights can’t be underestimated.
“The challenge is the same as integrating data from multiple different data sources to meet those goals: it’s very difficult without the right business strategy and the right technology in place,” says Ruffley. “That being said, the problem is not insurmountable; the trick is working with the device suppliers to understand data formats, and working out what data is actually needed to deliver your insights that meet the defined strategy. Many organisations will have some form of data management strategy in place, but they will have to look at the scale of what they need, and what they don’t need to meet these objectives.”
For Chimoindes, just because something is complicated doesn’t mean it’s difficult. “Integration is complex, however there is no reason it has to be a challenge,” he says. “As long as you are crystal clear on what you are trying to achieve, and when you specify the solution this is part of the application from the start then technology is no longer a gatekeeper to smart solutions such as IoT.”
Data management doesn’t end until the life of the data is concluded with its secure erasure from the system and all the devices. “The area has been neglected because there’s a business rationale to focus on the top line of generating revenues and increasing value by having data and the ability to analyse it,” says Forslund. “However, the need to manage data from first collection right through to its erasure cannot be ignored. If you delay thinking about this to the future you’ll increase the impact of the challenges.”
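Managing data from first collection through to erasure implies knowing, at any moment, which records have outlived their purpose. The following minimal sketch assumes each record carries its own retention period; the `Record` fields and the `split_expired` helper are hypothetical names. It only identifies records that are due for erasure. Securely erasing them from storage and from the devices themselves is a separate task that this sketch does not attempt.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


@dataclass
class Record:
    device_id: str
    collected_at: datetime
    retention: timedelta  # how long this record may be kept (assumed policy)


def split_expired(records: List[Record], now: datetime) -> Tuple[List[Record], List[Record]]:
    """Split records into (keep, erase) according to their retention period."""
    keep: List[Record] = []
    erase: List[Record] = []
    for record in records:
        if now - record.collected_at >= record.retention:
            erase.append(record)  # retention period has elapsed
        else:
            keep.append(record)
    return keep, erase


# Hypothetical records with a seven-day retention policy.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
records = [
    Record("sensor-a", datetime(2024, 1, 1, tzinfo=timezone.utc), timedelta(days=7)),
    Record("sensor-b", datetime(2024, 1, 30, tzinfo=timezone.utc), timedelta(days=7)),
]
keep, erase = split_expired(records, now)
print([r.device_id for r in erase])  # ['sensor-a']
```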