Blurred Boundaries in the Clouds with Fog around the Edges
In our first piece we wrote about how greater abstraction of compute from hardware, growing sophistication of event-triggered and “serverless” software design, and a continuous movement towards the edges of the cloud all conspire to make the coming decade’s software much more diffuse, local, and discrete. We did not mention the buzzy term “DevSecOps” by name, speaking instead about testing, pipelines, and ongoing feedback loops between continuous development, security, and day-to-day operations. This shift towards permanent feedback loops is making diffuse and discrete software manageable and sustainable across this increasingly distributed topography. Instead of thinking in terms of browsers, applications, physical storage, clients and servers, all of us are starting to rely, and will come to rely much more, on the mental models of services, nodes, wallets, hubs, and agents.
“Hyperconverged” software is not so much installed or deployed as disseminated, grabbing resources and permissions at any scale wherever it finds itself. It blurs all kinds of boundaries that have traditionally structured the “layer/stack” metaphor of software development, and the economics of the software industry are evolving accordingly. Take, for example, Cloudflare’s recent announcement that their new “trustless” service offerings might make many corporate VPNs and tunnel services obsolete: they’ve gone from security company (DDoS-protection as a service) to infrastructure company to infrastructural security company, and in the process they’ve become essential to the infrastructure of Web 2.0.
Critics worry, however, about the long-term ramifications of having no “firewall” between operators of platforms and service providers on those same platforms, throwing into question their neutrality and the fairness of their pricing. Especially as the cloud market lurches towards an infrastructural software version of the “Amazon Basics” conundrum, and as competition among and within clouds is threatened by a trend towards massive mergers and acquisitions, it is worth questioning how the neutrality of a given cloud’s operations or competition among huge players might be preserved for the good of the economy as a whole. In a recent paper, which we will be republishing shortly on this blog, Spherity’s own Carsten Stöcker explores the economics of platforms from the perspective of exactly this problem and asks how open standards could lead us out of the impasse of clouds as walled gardens.
In more exciting (and decentralizing) business-model tendencies, event-driven, agent-based infrastructures could soon generate new, truly on-demand and “entity-driven” business models that respond directly to the needs of connected identities at scale. Rather than relying on the centralized “attention economy” to connect buyers and sellers via a surveillance-based economy of middlemen, demand could be signaled in a privacy-preserving way across a trustless internet and supply could find it in a disintermediated way. In many ways, identity and encryption are just as generative of “trustless” and spontaneous business models as they are fundamental to secure data transmission in trustless environments, as we have argued in almost every publication of the Spherity blog.
But before these kinds of global data economics become concrete realities, decentralized identity may need to move away from the blockchain-based infrastructure it currently uses for its key management needs, or at least for its maintenance, rotation, and revocation of identity-controlling key-pairs. For this reason, a theoretical protocol called Key Event Receipt Infrastructure (KERI for short) is currently being incubated in the Decentralized Identity Foundation. A dedicated state-consensus system of this kind, or others like it, could create a more efficient, shared infrastructure that could work on any cloud or in any security perimeter to maintain keystate for millions of entities. Infrastructural advances like these (in lockstep with security and user experience advances) will be essential to the scaling and uptake of identity-secured metaplatforms for more entity-driven businesses. We expect that first movers will start to adopt KERI as early as 2020.
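The core KERI idea, a hash-chained log of key events in which each event pre-commits to the digest of the *next* key (“pre-rotation”), can be sketched in a few lines. This is a toy illustration of the pattern only, not the KERI specification; all field names and helper functions here are our own invention.

```python
import hashlib, json

def digest(event: dict) -> str:
    # Canonical JSON digest stands in for KERI's self-addressing identifiers.
    return hashlib.sha256(json.dumps(event, sort_keys=True).encode()).hexdigest()

def incept(current_key: str, next_key: str) -> dict:
    # Inception commits only to the *hash* of the next key, so the
    # rotation key can stay offline until it is actually needed.
    return {"type": "icp", "seq": 0, "key": current_key,
            "next_digest": hashlib.sha256(next_key.encode()).hexdigest(),
            "prior": None}

def rotate(log: list, new_current: str, new_next: str) -> dict:
    prior = log[-1]
    # The revealed key must match the commitment made one event earlier.
    assert hashlib.sha256(new_current.encode()).hexdigest() == prior["next_digest"], \
        "revealed key does not match pre-rotation commitment"
    return {"type": "rot", "seq": prior["seq"] + 1, "key": new_current,
            "next_digest": hashlib.sha256(new_next.encode()).hexdigest(),
            "prior": digest(prior)}

log = [incept("keyA", "keyB")]
log.append(rotate(log, "keyB", "keyC"))   # valid: keyB was pre-committed
```

Because each event chains to the digest of its predecessor, any witness holding receipts of the log can detect tampering or forks without consulting a blockchain.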
What this “low-trust” evolution in cloud topography means for SSI
As the shifts mentioned above grow more and more mainstream and ripple out into the consciousness of software designers and developers everywhere, the vocabulary and mental models available to them will shift in the direction SSI has been moving for years. Namely, this means “agencies”, agents, edge and cloud layers, and agent-to-agent protocols, orchestrating a collection of largely independent infrastructure components to manage and control the identity relationships and transactions of any given “client” identity in a peer-to-peer network. Given the move towards the kind of cloud development described above, an ever-more containerized stack, and systems like KERI coming into common usage that isolate key control operations from other low-level infrastructure, the only limitations on mobility between clouds and between service providers will be lock-in techniques that will grow increasingly hard to justify. Overcoming these could be great news for the economics of SSI.
In such a future, “Vaults” will also grow in importance and centrality. A vault is an abstraction of credential and data storage, and of key control, spread across various edge- and cloud-layer instances and making up an entity’s custom or personalized self-sovereign identity infrastructure. We are imagining an economy in which humans and enterprises select one “agency” among many interoperable, competing ones. These agencies will host cloud-layer software that is synchronized with the edge-layer identity software running on the edge devices of their “clients”.
In particular, enterprise vaults will manage the keys and credentials associated with an organization, aggregating data and control in both directions: they will manage all kinds of individual credentials that employees and other participants can take with them elsewhere, while also delegating to employees various authorities to represent the enterprise and transact on its behalf in a global peer-to-peer economy. Similar to human KYC credentials, enterprises will manage massive collections of their own Know Your Enterprise (“KYE”) credentials and those of their partners, clients, and suppliers: legal registration numbers, DUNS numbers, GS1 Global Location Numbers, vendor on-boarding and audit reports, customer master data, bank account data, and much more. Interacting with the vaults of my enterprise peers ensures that I always have the most recent set of master data and that I can on-board a new business partner using existing credentials. In an administrative analogue to the “once-only” principle, data can be maintained in one place, and subsets or cross-sections of that data can be fetched by different stakeholders in real time for all different business processes.
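The “once-only” pattern can be pictured as a vault answering scoped queries over a single maintained master-data record, with each stakeholder fetching only the cross-section its role needs. This is a hypothetical sketch; all field values, scope names, and the `present` helper are purely illustrative.

```python
# One maintained master-data record in an enterprise vault (illustrative values).
master_data = {
    "legal_name": "Example GmbH",
    "duns": "123456789",              # illustrative DUNS number
    "gs1_gln": "4012345000009",       # illustrative GS1 Global Location Number
    "bank_account": "DE00 0000 0000", # sensitive: only for billing partners
    "audit_report": "2019-Q4-pass",
}

# Each partner role is entitled to a different cross-section of the record.
SCOPES = {
    "logistics": ["legal_name", "gs1_gln"],
    "billing":   ["legal_name", "duns", "bank_account"],
    "auditor":   ["legal_name", "duns", "audit_report"],
}

def present(role: str) -> dict:
    # "Once-only": the data lives in one place; stakeholders fetch subsets.
    return {key: master_data[key] for key in SCOPES[role]}
```

Updating the record in one place immediately propagates to every partner’s next fetch, which is exactly the efficiency gain the once-only principle promises.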
More forward-looking enterprises are eagerly waiting for these technologies to be standardized, stabilized, and road-tested by the market, to unlock substantial efficiency gains on top of gains in security and confidentiality. We predict that in 2020 many operational field tests will be conducted with decentralized enterprise identity vaults to manage these kinds of vendor/customer relationships. These operational solutions will be deployed based on rigorous business cases and in the context of regulatory requirements in existing industry ecosystems, such as the pharmaceutical, automotive, mining, and industrial-solutions industries. These solutions will likely be scaled out quickly in 2021. Enterprises will start to push human SSI use cases for B2B2C or B2B2E (C = Customer, E = Employee) into production much faster than government SSI projects.
Historically, the community of projects building SSI solutions on the shared Hyperledger Indy codebase steered by the Sovrin Foundation has focused substantial effort and iteration on open-source, agent-driven interoperability, already empowering the kinds of data flows described above. As the gains made on the shared basis of Indy are applied outward to the broader collaborative architecture of the Hyperledger Aries project, interoperability on the agent-to-agent API layer is being road-tested more broadly. An enterprise can be either a customer or a vendor at a given time, so agent-to-agent communication needs to be fully “symmetrical” with regards to offering, requesting, issuing and presenting credentials, even if some enterprises require more or less complexity in their issuance capabilities for different use cases.
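The required symmetry can be pictured with a minimal agent API that serves both directions of the relationship: every peer can issue as well as hold and present credentials. This is a toy sketch of the pattern, not the Aries protocols; signatures, transport, and agent names are all invented for illustration.

```python
class Agent:
    """A symmetrical peer: the same object can act as issuer/verifier
    on one relationship and as holder/prover on another."""

    def __init__(self, name: str):
        self.name = name
        self.wallet = []   # credentials this agent holds

    def issue(self, holder: "Agent", claim: dict) -> dict:
        # Issue a credential about the peer and deliver it to their wallet.
        cred = {"issuer": self.name, "subject": holder.name, "claim": claim}
        holder.wallet.append(cred)
        return cred

    def present(self, claim_key: str) -> list:
        # Present any held credentials containing the requested attribute.
        return [c for c in self.wallet if claim_key in c["claim"]]

acme = Agent("acme")          # acts as vendor in this relationship...
bob = Agent("bobs-parts")
acme.issue(bob, {"approved_vendor": True})
# ...and as customer in the other direction, with the exact same API:
bob.issue(acme, {"approved_customer": True})
```

Because both directions run through one interface, an enterprise does not need separate “issuer stacks” and “holder stacks” for its vendor and customer roles.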
By current roadmaps, there should be 20 interoperable, fully symmetrical Aries-based wallets in stable release working on the Hyperledger Indy ledger in 2021. By 2022, we predict there will be upwards of 50 such interoperable wallets in total, integrating not just the Indy and Ethereum ledgers but also others brought in via the tools and primitives elaborated through the wide-based Aries project. Bitcoin (whether via Sidetree or via something that evolves out of the fast-moving Lightning layer), Corda, and maybe even some iterated and evolved form of Libra all seem reasonable candidates to support these kinds of projects in the coming years.
“Decentralize Alexa!”: Secure virtual personal assistants as SSI’s first killer app
Despite an initial boom in smart home devices and “virtual private assistants” (VPA) like Alexa and Siri that span multiple devices and centralized networks, we are already seeing a major backlash due to significant data privacy concerns. Even in the historically privacy-lax United States, scandal after scandal after scandal has pushed public perception to a tipping point which may prove an existential threat to the core business model of the centralized VPA.
Truly self-sovereign and secure VPA agents could automate all the same tasks, however, without risking the same kind of data leakage and amassing of personal data. Building on the core technologies described above, and a few others that we will now touch on briefly, the promise of the VPA could be realized not just in our lifetime but in the decade to come. We expect businesses and/or hacker communities to begin experimenting with such systems and devices in earnest as SSI goes mainstream, and we hope to see tentative real-world trials as soon as 2022 or 2023.
One key requirement before such systems are workable is for all the moving parts to have heterogeneous (and probably hierarchical) systems of encryption and selective disclosure, to minimize leakage and risk. One way of doing this would be to take advantage of bleeding-edge key management ideas and what some have called a “Cambrian explosion” in mathematical theory around so-called “zero-knowledge proofs” (ZKPs). While some forms of zero-knowledge are already in use in SSI, this envelope will be pushed much further in the decade to come, not only for quantum-resistance but also to decentralize risk and build heterogeneity and resilience into an increasingly “trustless” topography, as discussed in part 1 of this essay. Over time, these kinds of trusted islands/moats, runtime verifications, and execution environments will probably have to incorporate zero-knowledge mathematics to stay safe enough for high-security use cases like VPAs. These same incremental gains in trustless security, combined with multi-party computation (MPC), will advance privacy-preserving data sharing in cyber-physical value chains and complex use cases around delegation and enterprise identity.
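To make the zero-knowledge idea concrete, here is a toy non-interactive Schnorr proof (made non-interactive via the Fiat-Shamir heuristic), one of the simplest ZKP constructions: the prover demonstrates knowledge of a discrete logarithm without revealing it. The parameters below are deliberately tiny and insecure; this illustrates the principle only and is in no way production cryptography.

```python
import hashlib
import secrets

# Toy group: p = 2q + 1 with q prime; g = 4 generates the order-q subgroup.
# These numbers are illustration-sized, NOT cryptographically secure.
p, q, g = 1019, 509, 4

def prove(x: int) -> tuple:
    """Prove knowledge of x such that y = g^x mod p, without revealing x."""
    y = pow(g, x, p)
    r = secrets.randbelow(q - 1) + 1
    t = pow(g, r, p)                                   # commitment
    c = int.from_bytes(hashlib.sha256(f"{t}|{y}".encode()).digest(), "big") % q
    s = (r + c * x) % q                                # response
    return y, t, s

def verify(y: int, t: int, s: int) -> bool:
    c = int.from_bytes(hashlib.sha256(f"{t}|{y}".encode()).digest(), "big") % q
    return pow(g, s, p) == (t * pow(y, c, p)) % p      # check g^s == t * y^c

secret = 123                    # known only to the prover
assert verify(*prove(secret))   # verifier learns nothing about `secret`
```

The verifier sees only `y`, `t`, and `s`; the check succeeds because g^s = g^(r+cx) = t · y^c, yet no computation on those values recovers `x`. Production SSI systems use far richer constructions (for example, selective disclosure over credential attributes), but the shape of the guarantee is the same.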
While all the above will be needed to make VPAs secure and heterogeneous enough on the software level, they will also require a certain degree of advancement in hardware security and identity as well. In terms of connected devices and IoT, we expect Device Identifier Composition Engine Architectures (DICE) to go mainstream, providing new security and privacy technologies applicable to systems and components, even those without a Trusted Platform Module (TPM). These DICE components provide self-certifying hardware features that are very important for implicit identity-based attestation. We expect that DICE, which combines multiple best practices for secure key management, will start to be available at some scale by 2021/2022 in hardware components and open-source projects. The combination of these building blocks makes so-called “autonomous things” sure to follow, which could spin up something like an autonomous economy well before the decade’s halfway mark.
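The heart of the DICE approach can be sketched simply: a per-device Unique Device Secret is combined with a measurement (hash) of each successive firmware layer, so the derived identity implicitly attests to exactly the code that booted. The following is a rough illustration of that chaining, with names and layering of our own choosing rather than the actual specification.

```python
import hashlib
import hmac

def measure(code: bytes) -> bytes:
    # Measurement of a firmware layer: a hash of its code.
    return hashlib.sha256(code).digest()

def derive_cdi(secret: bytes, code: bytes) -> bytes:
    # Compound Device Identifier for the next layer: HMAC(secret, H(code)).
    return hmac.new(secret, measure(code), hashlib.sha256).digest()

uds = b"burned-into-silicon"            # per-device secret, never exported
cdi_l0 = derive_cdi(uds, b"bootloader v1")   # identity after layer 0 boots
cdi_l1 = derive_cdi(cdi_l0, b"firmware v1")  # identity after layer 1 boots

# Any change to the booted code yields a completely unrelated identity,
# which is what makes the attestation "implicit":
tampered = derive_cdi(cdi_l0, b"firmware v2")
assert tampered != cdi_l1
```

Because each layer only ever sees its own derived secret, compromise of a later layer does not expose the device secret or the identities of earlier layers.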
Cyber-Physical systems and agile hardware
In the event-driven, “hyperconverged” decade to come, and relying on a lot of the same technological building blocks outlined above, the boundary between data-events and physical-events will also be blurred. Perhaps Spherity’s business niche skews our perspective, but we also anticipate the boundary between digital-native assets and digitally-twinned physical ones will blur more as well, as more and more manufactured things will be digitally twinned from “birth”. Smart cities will buzz with collective data serving the common good, if we can get the incentives and the standards right.
The data-rich manufacturing of the future is an even more dramatic example of this convergence of data and things: AI-powered manufacturing processes, full of robots, human oversight, and swarms of sensors, will be in a continuous feedback loop with the data-rich, AI-powered, digitally-twinned products they produce. The metaphor of DevSecOps and secure CPS DID integration might well apply to the manufacturing floor of some factories by 2030, as analytics and design merge into continuous processes.
Another particularly porous space where merely “smart” things will give way to fully cyber-physical ones is in healthcare, where data-generating personal wearables, data-driven medicine, and implanted devices have the potential to gradually turn us all into “cyborgs”. This boundary between the body and its complements might already be more blurred today if not for the serious cybersecurity and privacy implications. The arrival of privacy-preserving data standards, more iron-clad anonymization techniques, better AI benchmarking, and collective bargaining/control over data can’t come soon enough.
We expect that by 2024 there will be AI-controlled systems for decentralized infrastructures in which decentralized identity is used to establish closed cyber-physical loops for operating and maintaining fleets of decentralized assets. DIDs and verifiable credentials will be used for attribute-based access management, data attestation, and provenance within distributed cloud-edge infrastructures.
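A rough sketch of what such attribute-based access management could look like, with a credential shaped loosely after the W3C Verifiable Credentials data model. The DIDs, policy format, and `authorize` helper below are hypothetical, and signature verification is elided for brevity.

```python
# A credential attesting attributes of an asset (illustrative DIDs throughout).
credential = {
    "issuer": "did:example:fleet-operator",
    "credentialSubject": {
        "id": "did:example:maintenance-drone-7",
        "role": "maintenance",
        "site": "plant-42",
    },
}

# A policy guarding one cyber-physical action: who may do what, and on
# whose say-so.
POLICY = {
    "action": "open-valve",
    "requires": {"role": "maintenance", "site": "plant-42"},
    "trusted_issuers": {"did:example:fleet-operator"},
}

def authorize(cred: dict, policy: dict) -> bool:
    # Access follows from *attributes* attested by a trusted issuer,
    # not from a centrally managed account list.
    if cred["issuer"] not in policy["trusted_issuers"]:
        return False
    subject = cred["credentialSubject"]
    return all(subject.get(k) == v for k, v in policy["requires"].items())

assert authorize(credential, POLICY)
```

The same check runs identically on any cloud or edge node holding the policy, which is what makes the pattern attractive for distributed infrastructures.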
Medical research on many fronts, particularly cyber-physical ones, is held back or even stymied by today’s paradigms of data ownership, consent, and capitalization. Indeed, much of the cultural impetus (and funding) driving identity systems forward today comes from the medical sector, which is looking for ways around extractive data practices and siloed data. In 2023, medical use cases will push decentralized identity into health applications with cyber-physical feedback loops (the “Virtual Clinic”), with a focus on securing end users.
Critical Infrastructure 4.0
The other major driver of open-ended identity infrastructure has lately been governments and militaries looking to explore alternate protections for both data infrastructure and physical infrastructure, against a backdrop of rapidly professionalizing malware and hacking industries worldwide. In the years since the StuxNet and NotPetya incidents drove home the risks of digitizing critical infrastructure, electrical grids and power plants have come to be seen as the internet-connected pacemakers of entire cities and nations. Now is a great time to be developing bold new ideas in cybersecurity or network topographies, particularly if game-changing techniques like quantum computing or quantum encryption are involved.
These days, there is a kind of “quantum gold rush” underway, as billions in venture capital are poured into the development of quantum computers. When these are stable and operational, many predict they will break much of the world’s asymmetrical encryption schemata, which is motivating a race to “quantum-proof” all the vital infrastructure currently relying on said asymmetrical encryption systems, including our identity infrastructure. Research into symmetrical encryption, which is not racing against the same kind of “Y2K”-like deadline, is also flourishing, precisely because it enjoys that reprieve.
This is as much a geopolitical as an economic contest, and many recent entrants into the fray are driving more and more attention to the race for quantum “supremacy”. For years, China has outspent the rest of the world investing in quantum technologies generally, but particularly in so-called “quantum cryptography”, which uses pairs of entangled (“spooky”) particles to pre-share “quantum” key material for tamper-evident key management. This return to the original use-case of encryption (shielded signals intelligence between exactly two parties) is ironic and fascinating, and China will likely follow through on its plans to build a quantum-encrypted layer for global communications at great expense in the decade to come.
The devil is in the details, though, because even the most ironclad encryption can still be insecure in its implementation, as a recent whitepaper claimed of China’s current quantum-secured communication channel. Securing the cryptography, software, identity systems, and networks of critical infrastructure is hardly sufficient to the task of protecting critical assets like power plants or electrical grids, since in many cases the hardware itself (and, more vulnerable still, the firmware) is a much more worrisome attack vector. The difficulties that governments and militaries have experienced trying to defend their hardware supply chains show that the weakest attack vector is usually social or economic: one bribed inspector or one weak sysadmin password can do as much damage as a cracked elliptic curve! As the old saying goes, the weakest link in every information security system is the humans.
In so-called “supply chain hacks,” malicious code is snuck into manufacturing processes, OTA (“over-the-air”) updates, remote control data, and other automatic configuration or firmware updates, and activated or exploited much later, once the hardware or software has been installed and configured. (NotPetya, by some forensic theories, was spread by such methods.) The most critical infrastructure, then, might well be the delivery pipeline for software updates and patches, since a breakdown of this infrastructure (or even a breakdown of trust in this infrastructure) could have ripple effects on the entirety of the interconnected (and interdependent) world. The prescription we propose, it will surprise no one to hear, is more identity, more open standards, more open-source code review, more crowd-sourced infrastructure hardening, and more immutable audit trails of all things software and hardware. We expect that by 2025, safety and civil order will be seriously threatened by cyber-physical infrastructure hacks in at least 10 megacities. Governments will start to significantly ramp up investments into city continuity capabilities and cyber-physical disaster recovery plans by this time as well.
Other Infrastructures, Still Awaiting 1.0
At the risk of sounding like a broken record, we could apply the same prescription to another field that has increasingly captured the world’s attention: “deep fakes,” or documentary forgeries that cannot be proven false by the human eye or forensic analysis. The problem is not new, but like quantum computing, the watershed is coming soon, and calls to start building infrastructure for documentary provenance are growing in urgency and prominence. By far the best overview of these problems, as well as an even-handed treatment of the conceptual obstacles to solving them quickly, has been put out by the non-profit Witness. As one media insider put it in his predictions for 2020, “Deepfakes will impact democracy and bring about a need for publisher certified content.” All kinds of blockchain and other technical solutions have been proposed for this ecosystem-wide certification and auditing problem, as well as mitigation strategies for all non-auditable material, but fundamentally, what we really need is a trust fabric and reliable reputation systems for publishing. This is, ultimately, a data commons problem that can only be solved with an open, non-profit, and neutral meta-platform.
It is not difficult to create dangerously high-quality fake pictures. Nor is it difficult to create fake entities, or fake IoT and telematic data sets on behalf of fake entities, that are consumed by people, enterprises, data markets, and computerized systems. The problem of deep IoT fakes will have a dramatic impact on supply chains when hackers simulate real IoT devices and fake IoT data streams or machine-learning labels. This will be recognized in 2022 as a major industry problem and a blocker for the adoption of connected IoT systems as people move more towards dynamically defined value chains. As the consequences of fake data can be very serious, owners of platforms will be held accountable for the distribution of fake data sets and for damages resulting from the use of this data. Realistic assessments of risk and liability may well hold back the development and application of these technologies until provenance can be adequately verified for the data sets training the algorithms. We expect that this problem will be addressed with data provenance and risk-scoring models applied to verifiable data chains. By 2028, this will evolve into a new risk-management domain: “insure your cyber-physical value chain” and “insure your AI”.
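One way to picture such a verifiable data chain: each reading is authenticated by the device and linked to the digest of the previous entry, so injected, altered, or reordered readings break the chain. A minimal sketch, under the (unrealistic for production) assumption of a single shared device key; a real deployment would use per-device asymmetric keys and anchored checkpoints.

```python
import hashlib
import hmac
import json

DEVICE_KEY = b"device-secret"   # stand-in for a per-device signing key

def append(chain: list, reading: dict) -> None:
    # Link each entry to the digest of the previous one, then MAC the body.
    prev = (hashlib.sha256(chain[-1]["body"].encode()).hexdigest()
            if chain else "genesis")
    body = json.dumps({"prev": prev, "reading": reading}, sort_keys=True)
    tag = hmac.new(DEVICE_KEY, body.encode(), hashlib.sha256).hexdigest()
    chain.append({"body": body, "tag": tag})

def verify(chain: list) -> bool:
    prev = "genesis"
    for entry in chain:
        expected = hmac.new(DEVICE_KEY, entry["body"].encode(),
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(entry["tag"], expected):
            return False                       # forged or altered reading
        if json.loads(entry["body"])["prev"] != prev:
            return False                       # broken or reordered chain
        prev = hashlib.sha256(entry["body"].encode()).hexdigest()
    return True

chain = []
append(chain, {"sensor": "temp", "value": 21.5})
append(chain, {"sensor": "temp", "value": 21.7})
assert verify(chain)
chain[0]["body"] = chain[0]["body"].replace("21.5", "99.9")  # injected fake
assert not verify(chain)
```

Risk-scoring models could then consume only readings from chains that verify, and discount or flag everything else, which is the kind of provenance gating described above.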
On that final note, we would like to turn our attention to one last data-commons infrastructure remaining to be built. The most fundamental physical infrastructure of life on earth (climate, water quality, energy, and minerals) is becoming increasingly urgent to audit, to identify, to standardize, and to expose to open-source code review. Elaborating and enacting a climate policy that keeps economy and industry within the limits of the sustainable is definitively a task for democratic institutions, not scientists, billionaires, or technologists. Nevertheless, technologists can and should start working today to support this mission by retooling our literal and conceptual frameworks to accommodate a fuller accounting of the earth and humankind’s footprint.
This means making the finitude of earth’s resources, and not the arbitrary scarcity of man-made currencies, the base unit of measure in our systems of accounting, leaving no “externalities” out of scope. It also means thinking in terms of a data commons, finding alternatives to national governments or privately-held corporations to build, govern, and maintain the neutral and open information systems that can allow for this kind of common-good accounting. We were proud to sponsor the Berlin portion of the international Open Climate Collabathon in November, which took this “global accounting” as its structuring theme, and we look forward to advancing the conversation about the informational needs of a sustainable world economy.
There is, of course, no more urgent topic: if the generations alive today fail to make significant progress along these lines in the decade to come, there will be little to predict ten years from now. So let’s make it happen and make real progress on this starting today!