
Mesosphere DC/OS Brings Large-Scale Real-Time Processing to Geospatial Data

Sep 29th, 2017 6:00am

All of a sudden, the planet Earth has become one of the world’s most important sources of real-time data. So the business of gathering that data — climate information, travel and commuting data, crime statistics, sporting event attendance, freeway traffic — is expanding to serve a growing number of academic institutions, research facilities, emergency response teams, humanitarian and relief organizations, and intelligence agencies (yes, they’re growing too).

These use cases require a scale that goes beyond what traditional enterprise infrastructures offer: high throughput, minimal latency and a next-generation degree of orchestration to process weather data, freeway traffic and any other information from an effectively unlimited number of nodes that are programmatically capable of capturing and delivering data for analysis.

What’s emerging is the need for platforms that absorb input from these nodes, process the data securely and do so in isolation.

The Spatiotemporal Tipping Point

“What we’re seeing now, with this emergence of IoT [the Internet of Things], is a new class of customers,” explained Adam Mollenkopf, who leads GIS development for real-time and (very) big data at Esri, the geospatial software provider. “They can’t just use a few machines that process thousands of events per second. We have customers that need millions of events per second, potentially, depending on the use cases they have. And taking our traditional architecture and trying to deploy that across tens or hundreds or thousands of machines, is not really a reasonable or tenable thing to do. This requires a new approach.”

Esri’s geospatial mapping platform ArcGIS has evolved from a global mapping application into a sophisticated geocoding system, assimilating a huge variety of data sources — including crowdsourcing — and producing 3D analytical plots in real time. At the center of ArcGIS is what Mollenkopf describes as a spatiotemporal database, capable of rendering results that represent not just the present but the recent past, aided by 3D animation. It might seem obvious that such an application will never be used by everybody the way Google Maps or The Weather Channel app is. But Esri’s architects have already begun asking the question: why not?

“If you took the concept of a rack of machines in a data center and treated it as one logical unit,” he told attendees at MesosCon 2017 earlier this month, “we would just treat this as one operating system where we schedule work to run. So we don’t look at it as 33 machines; we look at it as a whole bunch of resources. We have a lot of RAM, a lot of storage, a lot of CPU resources that we can make use of. And we schedule work to run on that, which is what [Mesosphere] DC/OS is all about. DC/OS gives us a different starting point for us to deploy these applications through clusters.”

The machines themselves must be isolated, for a number of reasons that only begin with security. But they should be manageable as though they were a single unit.

On the surface, that sounds like “hyperconvergence,” the hardware-management methodology being built into more and more servers. Yet HPE has been experimenting with pre-loading certain models of its ProLiant servers with DC/OS, as an alternative to its own hyperconverged systems. It’s an indication that certain classes of customers require a scheduling mechanism that provides a deeper level of control, and a more reliable means of gauging performance.

A few years ago, Esri entered into a partnership with Mesosphere to produce an internal platform that Mollenkopf, for better or worse, dubbed Trinity. He found himself explaining to customers, no, it’s not a religious reference; no, it’s not named after the atomic bomb project.  It’s supposed to be a character from “The Matrix.”

Trinity is a managed services stack, deployed on Mesosphere’s Data Center Operating System (DC/OS), that incorporates the Scala language for the connectors used in assembling data into hubs; Kafka for assimilating the data from the brokers used to represent those hubs; Spark for consuming that data as discrete topics, and for hosting ArcGIS’ proprietary geospatial analysis; Elasticsearch for maintaining the spatiotemporal data archive; Spark (again) for performing batch analytics on that archive; and recently, the Play Framework (built on Akka) for quickly developing web apps using a classic console methodology.
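To make the shape of that pipeline a little more concrete, here is a minimal sketch in Scala of the Kafka-to-Spark leg. It is not Esri’s Trinity code: the topic name, field names and bounding box are invented for illustration, the Kafka connector dependency (spark-sql-kafka) is assumed, and output goes to the console rather than to the spatiotemporal Elasticsearch archive.

```scala
// Illustrative only: a Spark Structured Streaming job that consumes position
// events from a (hypothetical) Kafka topic and keeps those inside a bounding
// box, roughly mirroring the Kafka -> Spark leg of the stack described above.
// Requires the spark-sql-kafka connector on the classpath.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object GeoStreamSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("geo-stream-sketch").getOrCreate()
    import spark.implicits._

    // Schema of the hypothetical JSON events emitted by the Scala connectors.
    val schema = new StructType()
      .add("sensorId", StringType)
      .add("lat", DoubleType)
      .add("lon", DoubleType)
      .add("observedAt", TimestampType)

    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // placeholder address
      .option("subscribe", "vehicle-positions")          // hypothetical topic
      .load()
      .select(from_json($"value".cast("string"), schema).as("e"))
      .select("e.*")

    // Keep only events inside a Los Angeles-area bounding box.
    val inArea = events.filter(
      $"lat".between(33.7, 34.3) && $"lon".between(-118.7, -117.9))

    // Esri's pipeline would feed a spatiotemporal Elasticsearch archive; for a
    // self-contained sketch, write the filtered stream to the console instead.
    inArea.writeStream
      .format("console")
      .outputMode("append")
      .start()
      .awaitTermination()
  }
}
```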

(You can provision and deploy an open source demo of Trinity at work with real-world IoT data, using Azure or AWS plus the key components of the Trinity stack, by downloading the package and following the instructions on GitHub.)

What’s missing from this scheme is Docker. While containerization is a critical part of Esri’s solution, the firm needed to maintain multiple deployment options for itself and its partners. Mesos provides flexibility in how it handles what it considers “frameworks”; by comparison, Mollenkopf has maintained since the beginning of the partnership, Docker would tie Esri’s hands with respect to those options.

There’s another reason for Esri’s architecture decision, dealing with the need for a tighter relationship between geospatial applications and the infrastructure that supports them.

Temporal Inconsistency

There is a popular hypothesis that any application, once containerized and orchestrated, accrues at least some benefits over its old, monolithic architecture. It is at a global scale that this hypothesis falls apart. Many high-availability, database-driven applications were created with specific server deployments in mind. At the beginning of the real-time era, “conventional” architecture relied upon timing algorithms and processing pipelines that took the measure of the capabilities and limitations of the underlying servers.

In 1996, the noted computer science professor Azer Bestavros explained this scenario in plain and unmistakable terms: “Database systems are designed to manage persistent data that are shared among concurrent tasks,” Prof. Bestavros wrote. “Maintaining its logical consistency while processing concurrent transactions is one of the primary requirements for most databases. In real-time databases, temporal aspects of data and timing constraints of transactions must be considered as well. The temporal consistency of data requires that the actual state of the external world and the state represented by the contents of the database must be close enough to remain within the tolerance limit of applications.”

This proximity, the professor continued, could not be guaranteed without a comprehensive modeling of the underlying operating system, and the components it maintained. That connection is precisely what gets severed by the decoupling of the software platform from physical infrastructure — the disconnection that makes virtualization at this scale feasible.
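In code, Bestavros’ temporal-consistency constraint boils down to comparing the age of a stored observation against an application-defined tolerance. The Scala fragment below is a toy illustration of that rule, not code from any system discussed in this article; the tolerance value and field names are invented.

```scala
// Illustrative only: a stored reading is temporally consistent for an
// application if the gap between the moment the world was observed and "now"
// stays within that application's tolerance.
import java.time.{Duration, Instant}

final case class Reading(sensorId: String, value: Double, observedAt: Instant)

object TemporalConsistency {
  // True if the stored reading still tracks the external world closely enough.
  def isFresh(r: Reading, now: Instant, tolerance: Duration): Boolean =
    Duration.between(r.observedAt, now).compareTo(tolerance) <= 0

  def main(args: Array[String]): Unit = {
    val tolerance = Duration.ofSeconds(5) // hypothetical application limit
    val reading   = Reading("gps-17", 42.0, Instant.now().minusSeconds(3))
    println(isFresh(reading, Instant.now(), tolerance)) // true: 3s is within 5s
  }
}
```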

For years, software developers were deterred from building real-time database-driven systems for virtualized infrastructure. Until recently, the principal means of ensuring deterministic, predictable performance relied upon the same intimate connection between high-level software and low-level hardware that virtualization prohibited. As a result, simply virtualizing the real-time systems that were relied upon at the turn of the century rendered them unreliable.

Decision-Readiness

“We needed something to come in and basically create a paradigm shift,” declared Todd G. Myers, a senior architect with the DoD’s National Geospatial-Intelligence Agency (NGA).

Left to right: NGA system engineering contractor Kevin Fitzhenry; NGA senior architect Todd G. Myers, speaking at MesosCon 2017 in Los Angeles.

NGA is America’s key repository for live geospatial data, gathered by the country’s huge collection of sensors and satellites. But its need to assemble the mass of commercially collected and processed data (including from Esri) from assets outside of the Department of Defense compelled the agency to rethink its business model. Myers said during a keynote session at MesosCon, held in Los Angeles earlier this month, that the paradigm shift NGA had been looking for, from a technological perspective, involved the incorporation of Mesos frameworks, container services, and a more commercial, transactional approach to managing its IT.

The processing of geospatial data is a critical, modern function of what DoD employees call the “IC” — the intelligence community. To maintain its position at the center of this community, NGA needed partners. But it wasn’t going to get partners by doing business the way government agencies typically work: by advertising for bids from candidate suppliers, and taking months to consider their applications.

So it built a platform called NSG Data Analytic Architecture Service (NDAAS), with a cloud-like provisioning model. Here, individual tenants have limited access to geospatial data on a transactional basis. Tenants define their own jobs and how they want to run them. And to make everything easier for its people to manage, NGA applies a policy model that isolates each job, but stages them on an equal footing with one another.
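The shape of such a policy model can be sketched in a few lines of Scala. This is purely illustrative: none of the tenant names, quota values or admission rules below come from NGA.

```scala
// Illustrative only: tenants define their own jobs, every job is admitted
// against the same per-job ceiling, and each admitted job would then be handed
// to the scheduler (Marathon, in NGA's stack) to run in isolation.
final case class JobSpec(tenant: String, name: String, cpus: Double, memMb: Int)

object TenantPolicy {
  // Every tenant gets the same per-job ceiling, so no job can crowd out another.
  val MaxCpus  = 4.0   // hypothetical quota
  val MaxMemMb = 8192  // hypothetical quota

  def admit(spec: JobSpec): Either[String, JobSpec] =
    if (spec.cpus > MaxCpus || spec.memMb > MaxMemMb)
      Left(s"${spec.tenant}/${spec.name} exceeds the per-job quota")
    else
      Right(spec)

  def main(args: Array[String]): Unit = {
    println(admit(JobSpec("relief-org-a", "flood-extent", cpus = 2.0, memMb = 4096)))
    println(admit(JobSpec("mission-b", "full-reprocess", cpus = 16.0, memMb = 65536)))
  }
}
```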

“We’re trying to streamline to keep up with the volume of the data,” Myers told The New Stack, “that needs to be provided as products or services to humanitarian, or mission space, or DoD mission partners. The means by which we get that delivered out, will be the same.”

NGA Scale’s stack consists of some 17 open source components, most of which will be familiar to New Stack readers and Mesos users, including Kafka, Jenkins, Ansible, Elasticsearch, PostgreSQL for relational queries, Weave Net for software-defined networking, Redis for data caching, GlusterFS as the file system, and Marathon for scheduling long-running services. “We’re trying to break away from databases and servers all tightly coupled, and having ephemeral data pipelines,” said Myers.

In the past, he noted, NGA did indeed cache certain data clusters for expedited access by particular customers, and you can guess which ones he meant. But as NGA began deploying its services stack on Mesos, and managing it with DC/OS, it discovered that the act of accumulating those caches and moving them to designated locations consumed too much time and too many resources — so much so that it eliminated any advantage to be gained from having the caches in the first place.

Quite literally, when your customer space is the entire planet, searching for “the edge” in order to gain a few seconds’ advantage ends up being wasted time.

“In most of the [intelligence] community, NGA is recognized as a more forward-leaning agency,” said Myers. “We’re taking full advantage of the opportunity to bring in what I consider to be the next-generation thing now.”

What Esri’s and NGA’s experiences with global-scale staging of geospatial technologies demonstrate is that a single, centralized, conventional “command-and-control” system doesn’t scale well — at least, no better for a technology of machines than for an organization of people. Componentizing the platform, and enabling options for provisioning and scaling each of those components, gives the entire organization greater freedom to maneuver. And such freedom will only become more precious on this ever-smaller spatiotemporal sensor cluster on which we live.

Mesosphere sponsored this story.

TNS owner Insight Partners is an investor in: Docker.