The Power of Digital Twins: A Podcast with ThoughtWire’s Stephen Owens

In this edition of Early Adopter Research's Designing Enterprise Platforms Podcast, Dan Woods, principal analyst of Early Adopter Research, spoke with Stephen Owens, the CTO of ThoughtWire. They discussed ThoughtWire's platform for creating applications based on sensor data, digital twins, and various types of automation and analysis.

Their conversation covered:

* 2:00 — The power of semantics with digital twins
* 18:30 — The core dogmas of ThoughtWire
* 33:00 — ThoughtWire’s magic tricks with specific use cases

Listen to the podcast or read an edited version of the conversation below:

Woods: Today’s conversation is going to be around the original ideas I had about semantic technology and then about how those ideas are actually applied and put to work inside the Ambiant platform that ThoughtWire creates. The reason we’re talking is that I wrote an article called “Are Semantics the Undead of Enterprise Tech?” And what I meant by that article was that ever since the World Wide Web was created, the next thing that Tim Berners-Lee did was start working on semantics. People had built a variety of applications, and then, shortly after the web was created, they decided to create a set of standards. Those standards, RDF and OWL, were intended to let you create mappings of information, make information more useful, and do reasoning on information. What happened is that those standards have come and gone, and come and gone, in terms of levels of excitement about how they were actually going to change enterprise applications and make the web more powerful. They would come, then fade away, then come back. As far as I can tell, we’re now at a stage where the actual semantic standards have been around a while, but we now have mature graph technology, a lot of mature ETL technology, we understand how to do modeling, and we have the graph analytics and the algorithms. All of that stuff is mature now, so we can make more sense of all the semantics, and I think we’re entering an age in which semantics is finally going to come into its own. Now, you are obviously a close observer of this because your digital twins are essentially semantic models. How do you see where we’re at in terms of the state of semantic technology and whether it’s ready for prime time in the enterprise?

Owens: Yes, great question. I loved the article, first of all. I think that it hit on a lot of points that I also feel very strongly about in terms of the role of semantics in the enterprise and its sort of on-again, off-again nature. I guess I might define it more as a gawky adolescent than an undead zombie rising from the grave. As with many technologies, we’ve gone through periods where the semantic graph just looked unwieldy. There weren’t tools, there weren’t standards, and there wasn’t good knowledge in the industry, good experience in how to use it. And I think that we’re seeing more people who understand the tech now and it’s being applied in a range of industries. We’re actually hiring people out of bioinformatics and out of building code and finance. So, a lot of industries are starting to take advantage of it and I think it’s one of those maybe slow-burn technologies that really is coming into its own as the scale of our problems has grown and people just realize the need.

Another way of thinking about it is that the semantic standards were created and there was some set of tooling around them from the very beginning. But RDF didn’t find many applications in broad use except, perhaps, RSS feeds, which I think are based on RDF technology. The rise of graph databases, combined with semantics, and combined with an understanding that you need really legitimate domain knowledge to complete the picture, has really proven itself, especially in the digital twin area of manufacturing. I think those things explain why, right now, we’re actually ready for prime time. But also, look at the rising number of graph databases: the semantic technology usually defines a set of relationships, but then it comes to life in a graph, and having the operational maturity of the graph databases has really made a big difference in bringing this to life. I’m assuming that that’s sort of the way it works in your Ambiant platform.

Yes, it is. The existence of capable and suitably performant graph databases is making a big difference. And so is the existence of problems, like you described in The Industrial Internet, where we’re no longer dealing with silos of data where a fixed-schema model of relational table structure is sufficient to describe the richness of the information that we’re receiving. I think both of those things are really driving a lot of interest and just a lot of need. And there aren’t other tools that solve that same problem as well.

What I’d like to do today is talk about your origin story and then go into your product dogmas and then some of the points of technology leverage and then we’ll end up by talking about the use cases that you have chosen to go after. And perhaps one interesting discussion could be why you’ve chosen to go to market as a use case-based company rather than as a platform company. So, let’s start out back with the origin story, what was the moment that the team came together and thought, “We have to start ThoughtWire.”

Yes, it started with our CEO, Mike Monteith, who was actually working as an architect and technology lead for the Province of Ontario, working in healthcare. And the problem that he was faced with was, how do we bring together all of the clinical systems that inform patient care, in the same way that the banking industry brought together the debit card network? Looking at the technology vendors’ roadmaps over the next N years, looking at things like SOA technology and data duplication and ETL tech, he didn’t see a solution that would solve the problem he saw in healthcare. And primarily, the problem he came up against was that all of the information needed to deliver really great care is present in a huge range of systems. So an emergency nurse in an emergency room, or a surgeon preparing for an operation, doesn’t have convenient access, in the moment, to the information they need. So he actually defined quite a different problem than most of the vendor landscape was looking at, which is: how do we bring together a useful set of very confidential, privacy-sensitive information at the time it’s necessary, to provide a sort of just-in-time solution to an outcome that people need? So much of clinical technology is driven by transactional requirements. It’s driven by insurance and billing and other things, and it’s very true in Canada too; we have a single payer but the problem remains the same. One fascinating stat that Mike likes to quote is that at the time we started the company, technology adoption in healthcare had been going up by large multiples every year for the past two decades. Meanwhile, nursing productivity had gone down by 30% because we turned nurses into data entry operators. And it’s kind of a crime to take people who want to be providing patient care and make them interact with computers all day.

How did this problem area lead to the basic product dogmas that you have? Because the way I see it is one of your product dogmas is that there’s an urgent need for information integration in the use case areas that you’re addressing, both in commercial real estate and in healthcare. But the second dogma would be that using a digital twin is the right way to perform that information integration. And then I suppose a third dogma would be that using semantic technology to build that digital twin is the right way to do that.

Yes. So, I love the framework of dogmas to think about this, first of all, because I think it provides a really interesting framing. When I was sort of reanalyzing what we’ve done with the platform and how we got from our origin to where we are now, that lens of the dogmas that we chose is actually really productive. One, we believe that people matter. So, we’re trying to bring people into the system, not disintermediate them. And that’s driven a lot of platform decisions.

How would you explain the difference between bringing people into the system and not bringing them into the system?

A lot of the work that preceded what we are doing, work that we were reacting to, if we think about service-oriented architecture or system-to-system integration, was all about taking information from one system and making it accessible to another system. And that leaves a big last-mile problem, which is: okay, but who actually gets the outcome, who derives value from that integration? So many of the things we were looking at were multi-year integration and coordination problems between systems, to deliver tenuous value to an unknown set of stakeholders. We turned that around and said, hey, let’s start with the stakeholders, let’s start with the outcomes that are meaningful to them, and then figure out an architecture that makes it possible to attack those problems without the five-year plan to produce the uber system that has all the information in it. Which, by the way, is also a privacy disaster and a scaling problem and a security problem, with all kinds of other issues that come with it.

How did that lead you to the doorstep of semantic technology, digital twins, and that approach?

We had, first of all, the idea that we wanted to solve for an outcome. And an outcome implies a set of constraints that aren’t often true in system-to-system integration. So, for example, in healthcare, the big clinical systems that we were dealing with sometimes had thousands or even tens of thousands of individual tables of information in them. When you think of an integration project of that scale, just building the data dictionary around it is a multi-year effort. We turned it around and said, okay, we want to constrain this to a much smaller problem domain. It’s not about all the information that’s in there; it’s about the information that, let’s say, an admissions nurse needs to check a patient in, or a clinician needs before doing surgery to do a patient overview. When we put that constraint around it, we immediately came to the idea that, first of all, we’ve got a much smaller set of meaningful information, but we also don’t know in advance what the structure of that information is going to be, and we need it to be very flexible.

My background coming into the company was in semantic markup. I started back in the SGML days, worked through XML, and had patents at other companies to do with information structure and, in particular, with imposing structure externally on information that didn’t already have it. I had been very interested in RDF and in Tim Berners-Lee’s vision for the semantic web for quite a while, so applying that to this problem of a constrained but extremely flexible set of data just seemed kind of obvious. And, going along with that, one of the challenges that I think formal semantics has had is that the problems it has aimed at have been very, very big. A corpus of knowledge on the internet, or all the information in a particular subject domain, is a very hard modeling problem. Because we constrain our digital twins to particular outcomes, the modeling problem is smaller and it allows us to better apply that tech.

In this case, what’s happening is that you have a wealth of information, and the job you need to do is find the actual subsets, the rows and columns of those vast, 10,000-table systems, that are actually relevant to the problem at hand, and then pull those out into your digital twin and use them. Your job is to find the 5% or less of information, in this vast ocean of information, that is actually relevant to the problem and the outcomes you’re trying to create.

Yes, exactly right. 

Once you have the core dogmas of finding that information and bringing it into a digital twin, what are the dogmas or principles you follow for taking the information from that model and actually presenting it to the people using the system, so that they can get a better outcome?

There’s one more input dogma that I’ll add just before I go into that. And that’s that we have believed from the beginning that the scale of the inputs that people are dealing with, the set of systems that they are interacting with on a day-to-day basis, has been increasing geometrically and is starting to go up exponentially. It used to be that an individual, let’s say a nurse or an operations manager in a commercial building, had kind of one system that they had to deal with. Over the past 20 years, that’s gone from one to a small number of tens. And as the internet becomes the system that everybody’s dealing with, it goes from tens to hundreds and thousands. And so this means that the functionality and information acquisition problem that individuals have to deal with is scaling beyond the individual’s ability to deal with it.

It sounds like what you’re saying is, we accept the principle of information anxiety and we’re going to actually calm everybody down by taking responsibility for that and then delivering what’s relevant for a specific user?

Yes, and I think the information anxiety is heightened into what I’ll call a functional anxiety. Frequently, people fulfilling a job, being a nurse, being an operations manager, don’t believe that they even know which systems or capabilities are available to solve the problems that they have. They’re not comfortable that they know enough about the environment around them to do a good job at the things they’re being asked to do. And that’s a horrible position to be in as someone with responsibility for life safety, in the case of healthcare, or for the operation of a major commercial building.

Now let’s move to the trip from loading up that digital twin to using that digital twin to gain an outcome.

We started, again, in a very user-centric position. Coming from our view that people matter and we don’t want to disintermediate them, we want to empower them, we started all of our early engagements with the end users, the stakeholders that really cared about that. So, for example, we would go into a hospital and we’d talk directly to nurses or porters or physicians and find out what part of their job had been made difficult by the inability to access or record the information they need. Then we would discover where the systems were that had the available capability, the functionality or the information, and figure out how to bring it into a digital twin and what the twin would look like that could encode that information.

I’ll give you a concrete example. One of the very early things we did was give on-ward nurses an enriched view of the patients they were interacting with. We talked earlier about how transactional medical and clinical systems are; they tend to be good at ordering a new test or putting in a requisition for medication, but not so good at giving a holistic picture of what’s happening to a patient, on the ward, on a given day. So, when we talked to a nurse, they’d say, “Okay, what I really need to know is how long has this person been on the ward? Have they exceeded their length of stay? Are they going to be checked out and back to their own home or are they going to long-term care? Are they scheduled for a procedure soon?” That’s the shape of information they want. So, we start to build a model, a digital twin, that encodes the relationship between that provider, that nurse, and their patient, and all the information around it, and make it all accessible to them. In this particular case, that was a matter of driving all the way out to a user interface that describes the patient and their length of stay, with indicators that would notify the nurse if the patient had been in the bed too long or had a new physician appointment coming up the next day.
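
To make the shape of that kind of outcome-scoped twin concrete, here is a minimal sketch in Python using the rdflib library; the namespace, class names, and properties are entirely hypothetical stand-ins, not ThoughtWire’s actual Ambiant model:

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

# Hypothetical ward vocabulary for illustration only.
EX = Namespace("http://example.org/ward#")

g = Graph()
g.bind("ex", EX)

patient = EX["patient042"]
nurse = EX["nurse07"]

# Encode the provider-patient relationship and the facts the nurse asked for.
g.add((patient, RDF.type, EX.Patient))
g.add((nurse, RDF.type, EX.Nurse))
g.add((nurse, EX.caresFor, patient))
g.add((patient, EX.daysOnWard, Literal(6, datatype=XSD.integer)))
g.add((patient, EX.expectedLengthOfStay, Literal(4, datatype=XSD.integer)))
g.add((patient, EX.dischargeDestination, Literal("long-term care")))
g.add((patient, EX.nextProcedure, Literal("physician review, tomorrow 09:00")))

# Drive a UI indicator: which of this nurse's patients have exceeded their stay?
results = g.query("""
    PREFIX ex: <http://example.org/ward#>
    SELECT ?patient ?days ?expected WHERE {
        ex:nurse07 ex:caresFor ?patient .
        ?patient ex:daysOnWard ?days ;
                 ex:expectedLengthOfStay ?expected .
        FILTER (?days > ?expected)
    }
""")
for patient_uri, days, expected in results:
    print(f"Flag {patient_uri}: {days} days on ward, expected {expected}.")
```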

How did you go from the kind of information integration challenge and presenting that information in a holistic way that was tuned to the user, to adding analytics and advanced capabilities for finding insights that would be useful to the person using the system?

It’s a fairly natural outcome of the semantic models that we chose. Because we enrich the information on the way into the digital twin with a very deep layer of meaning, the relationships between the pieces of information sort of automatically surface new insights that the individual is unaware of. This became starkly apparent to us in a digital hospital we were working in, where the input signal was a binary signal from a fridge. The fridge would say, “I’m too warm,” or, “I’m too cold.” In the pre-digital-twin world, that signal would go to a maintenance worker whose job it is to come fix the fridge, and that’s kind of all we knew. In the digital twin world, we enriched that information so we knew what the role of that fridge was. Is it holding ice cream in the cafeteria, or is it holding blood plasma, or is it holding chemotherapy materials? So when that signal comes in, we combine and enrich that information to say, “Okay, we know that the signal is bad, the fridge is too warm.” We also know this is a pretty high-priority fridge. We also know who’s on staff in that ward at this given time, because we’ve combined it with the RTLS (real-time location system) to tell us which people are actually present. And we know the role of those people, we know whether they’re a pharmacist or they’re working blood services or whatever. So we can tell the right person who’s near it that a really high-priority event, possibly involving tens or hundreds of thousands of dollars’ worth of medication or other products, needs their attention, and they can just move those products into another fridge. Nobody has thought about operational data like that being combined with the business data, all the way down to employee schedules and roles, in real time, before.
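
Here is a small, purely illustrative sketch of that enrichment step; the context tables, role names, and routing rule are assumptions made for the example rather than ThoughtWire’s actual model:

```python
# Hypothetical, simplified context that a digital twin might hold about a hospital.
FRIDGE_CONTEXT = {
    "fridge-17": {"ward": "oncology-3", "contents": "chemotherapy", "priority": "high"},
    "fridge-02": {"ward": "cafeteria", "contents": "ice cream", "priority": "low"},
}

# Who is currently present on each ward, as reported by an RTLS feed.
STAFF_ON_WARD = {
    "oncology-3": [
        {"name": "R. Patel", "role": "pharmacist"},
        {"name": "J. Chen", "role": "porter"},
    ],
}

# Which roles should act on a problem with a given kind of contents.
RESPONSIBLE_ROLES = {
    "chemotherapy": {"pharmacist"},
    "blood plasma": {"blood services"},
}


def enrich_fridge_alarm(fridge_id, signal):
    """Turn a bare 'too warm' signal into a prioritized, routed alert."""
    ctx = FRIDGE_CONTEXT.get(fridge_id)
    if ctx is None or signal != "too_warm":
        return None
    nearby = STAFF_ON_WARD.get(ctx["ward"], [])
    wanted_roles = RESPONSIBLE_ROLES.get(ctx["contents"], set())
    recipient = next((p["name"] for p in nearby if p["role"] in wanted_roles), None)
    return {
        "priority": ctx["priority"],
        "message": f"{fridge_id} ({ctx['contents']}, ward {ctx['ward']}) is too warm",
        "notify": recipient or "facilities on-call",
    }


print(enrich_fridge_alarm("fridge-17", "too_warm"))
```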

Have you been able to use any graph algorithms, any advanced graph analytics, that help you identify opportunities for optimization or danger?

Yes. In one of our clinical applications, we provide early warning of patient deterioration heading toward code blue situations. We actually partnered with a research organization, Hamilton Health Sciences, and they had done all this really interesting research on the key indicators, from a patient point of view, that predict a patient decline heading toward a code blue or heart-stopped kind of scenario. There are things like mental acuity, their pain tolerance, their sensitivity, their blood pressure, their respiration rate, a whole bunch of things. Some of them are collected by sensors, so you can do a good job with telemetry from medical devices, and some of them come from individuals; you have to have people who are noticing them and recording them. One of the really unique things about the digital twin is that we bring the people-level information together with the machine telemetry, run the advanced analytics on it, based on this research about how these factors should be graded, rated, and applied, and then surface the resulting event to be actioned.
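
A rough sketch of how sensor telemetry and human-recorded observations might be combined into one score is below; the thresholds and weights are invented for illustration and are not the validated indicators from the Hamilton Health Sciences research:

```python
# Hypothetical early-warning scoring: bands and weights are illustrative only,
# not the clinically validated model referenced in the conversation.
BANDS = {
    "respiration_rate": [(8, 3), (11, 1), (20, 0), (24, 2), (float("inf"), 3)],
    "systolic_bp":      [(90, 3), (100, 2), (110, 1), (219, 0), (float("inf"), 3)],
}
CONSCIOUSNESS_SCORE = {"alert": 0, "confused": 3, "voice": 3, "pain": 3, "unresponsive": 3}


def early_warning_score(vitals, observations):
    """Combine device telemetry (vitals) with nurse-recorded observations."""
    score = 0
    for name, value in vitals.items():
        for upper_bound, points in BANDS.get(name, []):
            if value <= upper_bound:
                score += points
                break
    score += CONSCIOUSNESS_SCORE.get(observations.get("consciousness", "alert"), 0)
    return score


telemetry = {"respiration_rate": 26, "systolic_bp": 95}  # from bedside devices
charted = {"consciousness": "confused"}                   # recorded by a nurse

score = early_warning_score(telemetry, charted)
if score >= 5:
    print(f"Escalate: early-warning score {score} suggests deterioration risk.")
```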

Let’s move on to the points of technology leverage. The idea of technology leverage is that you’re able to do something relatively simply that, in the past, might have been very complex and taken a lot of time. The product architecture that you have is essentially this: you have a bunch of connectors that can bring in data from lots of different sources, you have a digital twin that allows you to make sense of that data, and you have applications that use the digital twin and deliver a user experience that brings everything to life. You also have a user experience for the people who are actually maintaining the system, not just the people who are using the applications. What have you learned about this product architecture that helps you create leverage?

Several things. We live in a very dynamic technology landscape, so adding and removing services from deployed applications is relatively common, as is having heterogeneous data sources. Even in the same commercial building, we’ll have multiple different HVAC systems or multiple different lighting systems. And so the ability to define the interaction between the digital twin and the built environment around it has to be really dynamic. It can’t be, you know, a set of precompiled binaries that have to be deployed into an environment and restarted to talk to a new thing.
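
One way to picture that kind of dynamism, as a sketch only and not ThoughtWire’s actual mechanism, is a connector described as declarative data that can be registered into a running deployment without a restart; the registry, field names, and protocol are assumptions:

```python
import json

# Connectors known to the running system, keyed by name.
CONNECTOR_REGISTRY = {}


def register_connector(config):
    """Hot-register a new source system from a declarative description."""
    CONNECTOR_REGISTRY[config["name"]] = config
    print(f"connector '{config['name']}' now live ({config['protocol']})")


# A description like this could arrive at runtime, e.g. when a second
# lighting system shows up in the same building.
new_source = json.loads("""
{
  "name": "lighting-vendor-b",
  "protocol": "bacnet",
  "poll_interval_s": 30,
  "maps_to": "Lighting_System"
}
""")
register_connector(new_source)
```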

It seems like a core part of the intellectual property that you’re creating is these normalized domain models inside the digital twin, so that you understand not just what a specific HVAC system is but what the general characteristics of an HVAC system are, so that you can map any of the hundreds of HVAC systems to that standardized domain model and then, above that, deal with the applications.

Yes, that’s right. And we work with industry standards for those domain models. In healthcare, that’s HL7, and for smart buildings there’s a range of them, but Brick is certainly an important one. And then we also enhance them and combine them.
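
As an illustration of that normalization idea, the sketch below maps vendor-specific point names onto shared, Brick-style concepts; the vendor names, raw point names, and mapping table are hypothetical examples rather than a published mapping:

```python
# Vendor-specific point names from two different HVAC systems are mapped onto
# shared, Brick-style concepts so applications above the twin never see vendor quirks.
VENDOR_POINT_MAP = {
    ("vendorA", "SA-T"):       "Supply_Air_Temperature_Sensor",
    ("vendorA", "ZN-T-SP"):    "Zone_Air_Temperature_Setpoint",
    ("vendorB", "SupplyTemp"): "Supply_Air_Temperature_Sensor",
    ("vendorB", "RoomSetPt"):  "Zone_Air_Temperature_Setpoint",
}


def normalize_point(vendor, raw_name, value):
    """Translate a raw building-system point reading into the twin's vocabulary."""
    concept = VENDOR_POINT_MAP.get((vendor, raw_name), "Unclassified_Point")
    return {"concept": concept, "value": value, "source": f"{vendor}:{raw_name}"}


# Two different systems end up speaking the same language inside the twin.
print(normalize_point("vendorA", "SA-T", 17.5))
print(normalize_point("vendorB", "SupplyTemp", 18.1))
```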

The key capabilities in the stack we just talked about are integration of data, complex event processing, security, reasoning engines, process orchestration, alerts and notifications, analytics and reporting, data privacy, mobile enablement, and then self-tuning and optimization. Could you give me a few examples of magic tricks that you can perform that show how an individual is really helped and can do many powerful things based on the automation and analytics provided by the Ambiant platform?

Yes, definitely. One of the amazing core capabilities of the digital twin itself is the ability to react at whatever time scale the events occur in. We operate across everything from very low-level device time scales, where it’s millisecond-level event frequency for our RTLS systems, all the way up to human time scales, where sometimes people don’t notice for a couple of minutes. And so that allows us to have a very unique asynchronous processing model that both keeps enough context to provide the right, meaningful answer in the moment and responds whenever an event occurs. In CRE, for example, we use that to do things like alert the building operations staff about near real-time changes in their building. Take an example: let’s say they’ve got their building in lockdown. It’s the end of the day, they’ve put the security system in lock mode, and all the outside doors are locked. So now they’re walking around the building, and as soon as one of those doors opens, that building is no longer in a secure state. But finding that out has been surprisingly difficult. The security system, as far as it’s concerned, is in a locked state. You know, if you pass a security card through the reader, that’s fine, you’re not really breaching the lock state, and yet the state of the building has changed. So this is something where we’re listening to elevator events, we’re listening to lighting events, occupancy sensors in the lighting system and the HVAC system, CO2 sensors. By bringing all of that together, using this asynchronous processing model, we can keep people informed and allow them to take action when things are happening.

Another one that came up was actually an office cleaning use case. It never fails to amaze me just how sophisticated humans can make the simplest of interactions. There was a very security-conscious client we were talking to, where the cleaners may not clean an office unless the occupant is in that office, because the occupant has to be eyes-on at all times if someone else is in their office. So these poor cleaners are literally running around the floor, trying to find somebody in each office. We can change that experience so that they’re simply alerted: as soon as the person walks into their office, they know it’s available for cleaning, they get a green checkmark on it, and they can go clean it.
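
The lockdown example above can be sketched as a simple cross-check of events from other subsystems against the security system’s declared state; everything below, from the event shapes to the rule itself, is an assumption made for illustration:

```python
from datetime import datetime

# The building's "secured" state, as declared by the security system.
building_state = {"lockdown": True}


def on_event(event):
    """React asynchronously to any event, at whatever time scale it arrives."""
    if not building_state["lockdown"]:
        return
    # Signals from other subsystems that contradict a truly secured building.
    suspicious = (
        (event["type"] == "door" and event["status"] == "open")
        or (event["type"] == "elevator" and event["status"] == "moving")
        or (event["type"] == "occupancy" and event["status"] == "occupied")
    )
    if suspicious:
        print(
            f"[{datetime.now():%H:%M:%S}] Lockdown exception: "
            f"{event['type']} '{event['source']}' reports {event['status']} "
            "while the building is in its secured state."
        )


# Events arriving from different subsystems, on very different time scales.
on_event({"type": "door", "source": "loading-dock-2", "status": "open"})
on_event({"type": "occupancy", "source": "floor-6-east", "status": "occupied"})
```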

What’s really interesting about your platform is that it mirrors what’s happening in the cloud right now, where we have a separation of data storage and compute engines, and increasingly the compute engines, whether it’s Redshift or BigQuery or whichever database, or other analytical systems, are using that object storage as the raw material. Numerous compute engines can be going after the same object storage, or creating new object storage, at the same time. It seems like what you guys are really doing is putting the digital twin in the role of that kind of data platform. And then you’re able to provide as many different kinds of compute engines as you need, whether it’s complex event processing or a reasoning engine or process orchestration or various forms of alert definition and notification or analytics and reporting; you can put all of those on top of that digital twin layer to do whatever you need. And when a new one comes along, maybe it’s AI and ML or maybe it’s graph analytics, you can just add it as well and incorporate it in the applications. I assume that was something that you did on purpose.

Yes, exactly right. And this follows on from a bunch of research in the sort of symbolic stream of AI. AI and ML have moved toward statistical learning because of the prevalence of big data in that field, but before statistical learning there was this big movement toward symbolic AI and symbolic processing, and a lot of the work we did comes more from that tradition. This ability to apply arbitrary rules and compute to known collections of facts, statements of fact, not just raw data but assertions of truth, is very intentional and has ended up as a very interesting and scalable platform for solving these problems.
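
A toy forward-chaining example in that symbolic tradition is shown below, with rules applied to asserted facts rather than raw data; the facts and rules are invented for illustration and are not ThoughtWire’s rule language:

```python
# Asserted facts as subject-predicate-object triples.
facts = {
    ("fridge-17", "stores", "chemotherapy"),
    ("chemotherapy", "isA", "high-value-medication"),
    ("fridge-17", "reports", "too-warm"),
}


def apply_rules(facts):
    """Derive new facts until no rule produces anything further (forward chaining)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in derived:
            # Rule 1: anything storing a high-value medication is a critical asset.
            if p == "stores" and (o, "isA", "high-value-medication") in derived:
                new.add((s, "isA", "critical-asset"))
            # Rule 2: a critical asset reporting a fault needs immediate attention.
            if p == "reports" and (s, "isA", "critical-asset") in derived:
                new.add((s, "needs", "immediate-attention"))
        if not new.issubset(derived):
            derived |= new
            changed = True
    return derived


for fact in sorted(apply_rules(facts) - facts):
    print("derived:", fact)
```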

If you look at the landscape of systems that are trying to take advantage of the internet of things and industrial applications, you can see a variety of them that are going to market as platforms. Like C3 IoT and a variety of others, these systems are really all about creating a platform. They hand that platform to the user and, maybe with a partner or maybe on their own, the user builds an application for their own use. You guys decided, “We’re going to build a platform, but that’s not the right way to go to market. We want to build applications and go to market with applications.” Why is it that you chose that direction?

That’s a great question. When we were looking at the problem, and I think this really comes right from the origin, from the belief that user interaction and putting the users’ problems first is crucial to providing leverage for the organizations that we’re dealing with, we felt the gap was problematic between the traditional IT providers, the people who would have adopted our platform as a solution platform and tried to bring it into these organizations, and the stakeholders in the organization who could benefit from what we’d built. They were continuing to deliver technology that was of benefit to the organizational stakeholders, the administration or the ownership or the payers, as opposed to the operational stakeholders. And we felt that if we want to directly benefit the operational stakeholders, the building management, the building operators, the employees of those organizations, the nurses, the doctors, then we had to be willing to take on that last-mile role of showing how the platform can apply directly to their problems, instead of just to the kind of ETL problems that IT or the CIO or the exec suite perceives.

Why do you think that the use of semantics is perhaps more narrow than you would think, given its power?

I think that semantics is hard. I still remember the first time I dealt with an SGML file, and I’m sure you know the expression, “It’s turtles, all the way down.” When I dealt with SGML for the first time, the idea that there wasn’t a defined meaning for a tag, that what it meant was a consensual illusion, that there was nowhere I could go look it up, was kind of meaningless to me. It’s, like, I didn’t get it. And so I think there’s a level of abstraction and abstract thinking required to appreciate the power of the schema-less, RDF kind of graph approach to modeling data. And modeling is not a particularly universal skill; a lot of programmers get through much of their career without deep exposure to formal data modeling or data modeling techniques. I think it’s the concrete use cases and demonstrations that companies like ours are producing now that make the value more self-evident. And as companies like ours and others show this value, and it’s happening in financial services, in insurance, in bioinformatics, and in a range of fields, I think those demonstrations will start to change the leverage. But it takes time.