ZK Identity: Why and How (Part 1)

0xPARC

Last month, we kicked off the 0xPARC ZK-Identity Working Group: a working group experimenting with zkSNARKs to build digital identity tools. This post is the first in a series on why advances in cryptography will be important for enabling new identity primitives. This first post covers the “Why”; future posts will cover the “How.”

The design of online identity systems has been the subject of intense debate in the last few years. Modern digital identity systems have enabled new and complex kinds of online interactions and communities. Unfortunately, many of these systems also have significant weaknesses.

Many of these weaknesses can be attributed to inherent limitations of centralized identity system designs. Firstly, these systems are generally built around central points of control—and therefore, central points of failure. Modern e-commerce, social media, and messaging platforms are vulnerable to pressure or interference from powerful actors (such as authoritarian governments), or to technical attacks from malicious hackers; when a central operator is coerced or hacked, many parties beyond just the central operator are put at risk. Secondly, these systems rely on the concentration of power in the hands of operators who cannot possibly be fully aligned (financially, socially, or morally) with all of their users—for example, private social media companies with diverse global audiences must often make decisions about what constitutes an act of unjustified censorship versus an act in the interest of public safety, though they are often ill-equipped to do so.

Decentralized and cryptographic mechanisms are not a magical panacea, but they do offer some useful tools, and they expand the design space for digital identity systems. As more of our social and economic lives move online, designing secure, privacy-preserving, and user-controlled identity systems will become increasingly important. In this post, we’ll argue that new cryptographic primitives such as zkSNARKs will be crucial for building identity systems with these properties.

At their core, zkSNARKs are useful because they enable users of digital systems to produce credible claims of arbitrary complexity, without reliance on a trusted party. All identity systems are built around some mechanism for generating credible claims of identity and reputation—typically, fairly complex claims attached to attestations from a trusted authority, like a government or a company. By applying zkSNARK constructions to claims about identity and reputation, we can rearchitect digital identity systems and put control and data custody in the hands of users.

Credible Claims

As zkSNARKs operate over precise, mathematically-defined “claims,” we must first precisely break down the nature of claims involved in identity systems.

It’s hard to do business with a completely unknown and untrusted counterparty. Common sense tells us that as trust between counterparties decreases, the likelihood of cooperation does as well; game theory tells us that the optimal strategy in a one-shot prisoner’s dilemma is always to defect. Would you be more willing to buy a used car from a close friend who is tightly linked to your social circles, or from a sleazy Craigslist seller dropping by from out-of-town, who won’t even give you their name?

To build trust with each other, we need to be able to make credible claims: claims about our identities and reputations that people who we interact with can find believable. It’s not really a credible claim if the Craigslist seller above assures you he’s “sold tons of cars before, and everyone loves my cars—take my word for it.” But if this claim was associated with a collection of five-star ratings from verified buyers on a popular website that you know about, it would certainly feel more credible.

The idea of credible claims sounds obvious, but building and legitimizing a mechanism (in this example, a popular listings website) for producing credible claims is no easy task. In traditional models, our usual solution is to delegate record-keeping management to a trusted authority, so that they can attest to our claims about identity and reputation and lend them credibility. This authority must prove their legitimacy and trustworthiness over time (often in adversarial environments), and maintain infrastructure for generating and distributing attestations at scale.

Crucially, in most models, the central authority’s attestation is what makes a claim credible: this is a valid government ID, so I’m a citizen; that is an accurate list of my followers, so I’m a social influencer; these are a vetted set of reviews and ratings, so I’m a trustworthy online retailer.

Another application for credible claims lives at an even lower level of the stack. To begin with, how do you know that the person or business you’re interacting with is presenting you with a claim about themselves, and not someone else? In systems that depend on a trusted authority, these authorities take on the even more basic function of attesting to identity itself. An API access token, a government-issued passport, or a chain of signatures generated by Certificate Authorities when you visit a website are all attestations for claims about identity.

Useful identity systems allow participants to make a very wide variety of complex credible claims:

Privacy

The fact that nearly all identity systems inherently require privacy to function as intended further complicates the picture.

Privacy is important for ethical and ideological reasons that are sometimes controversial; but even more fundamentally, it’s often necessary as a simple matter of system design. For example, almost all identity systems rely on the notion of secret data to generate believable claims about identity—a password, Social Security number, private key, credit card PIN, account recovery question, etc. This data must be kept private, for obvious reasons. Additionally, the process of producing credible claims with completely transparent data may have negative externalities, or at least externalities that are hard to reason about; privacy guarantees prevent these. For example, if you had to present your entire financial history just to buy or sell an item in an online marketplace—bank statements, credit card transactions, loan payments, and the works—a counterparty could use this information to initiate out-of-scope interactions that have nothing to do with the original transaction (negative examples include advertising, or even harassing and blackmailing). Privacy “sandboxes” one-off interactions, cleanly defining and limiting their scope, so that we can build more complex systems from simple and understandable building blocks.

In traditional systems that require privacy, we have to delegate even more power to the central authority—in such systems, central authorities store private data and attest to credible claims about this data that are nearly impossible to verify.

The Role of Cryptography

So far, all of the models we’ve discussed for credible claim generation and identity systems have involved a centralized actor. And as we’ve discussed, there are plenty of reasons why we may want to explore systems that don’t rely on a powerful record-keeper or manager.

Immediately, we run into the obvious problems: how can I trust your claims, when I don’t have your data? If you send me your data, how do I know that the data is valid? And what do we do if you’re trying to make claims about private data? This is exactly where cryptography comes in.

Viewed through this lens, much of applied cryptography (and consensus) in the last fifty years has been a project of gradually expanding the scope of credible claims that can be made without a trusted authority, under various resource constraints and privacy conditions:

Progress has historically been slow—each of these cryptographic primitives defines a new and tightly scoped kind of claim, whose structure is highly specified. However, this has changed in the last several years.
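To make "tightly scoped" concrete, here is a minimal sketch of one such older primitive, a Merkle inclusion proof. It supports exactly one kind of claim: "this item belongs to the set committed to by this root hash." The function names (`build_levels`, `prove`, `verify`) and the sample member list are our own illustrative assumptions, not part of any particular library, and the odd-level padding rule is a simplification that production designs handle more carefully. Note that the proof is not zero-knowledge: the verifier learns the item itself.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(item: bytes) -> bytes:
    # Domain-separate leaves from internal nodes.
    return _h(b"\x00" + item)


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def build_levels(items):
    """Return every level of the tree, from leaf hashes up to the root."""
    level = [leaf_hash(i) for i in items]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Simplification (assumption): pad odd levels by repeating the last node.
            level = level + [level[-1]]
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        index //= 2
    return proof


def verify(root, item, proof):
    """Recompute the root from the item and its sibling path."""
    acc = leaf_hash(item)
    for sibling, sibling_is_right in proof:
        acc = node_hash(acc, sibling) if sibling_is_right else node_hash(sibling, acc)
    return acc == root


# Hypothetical member list, for illustration only.
members = [b"alice", b"bob", b"carol"]
levels = build_levels(members)
root = levels[-1][0]
proof = prove(levels, 2)
print(verify(root, b"carol", proof))
```

The verifier needs only the root and a short proof, not the full member list, and no authority has to vouch for the result. But the structure of the claim is fixed in advance: anything beyond set membership needs a different primitive.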

What’s exciting today is that we now have the machinery to make arbitrary credible claims efficiently, thanks to SNARKs. And with the zero-knowledge property of zkSNARKs, we can also tune the privacy guarantees of our claims exactly to our liking.

Here are a few examples of the kinds of claims that you can make with a zkSNARK, that would not have been possible before:

These claims can be combined, composed, and even programmed in arbitrarily complex ways.

While all of this is theoretically possible, we still have a long way to go. Producing a robust suite of ZK identity tools for the next generation of applications requires making substantial improvements in performance, reliability, developer experience, and application design patterns. In the next post, we’ll discuss our understanding of the road ahead.

Addendum: What’s in an Identity?

To understand where cryptography can be useful for building an identity system, it’s useful to break down the idea of an identity system into its key components.

In analyzing a particular identity system, we might ask some of the following questions:

Some of these concepts blend into each other: identity, reputation, and proof-of-identity are closely related, and not easily divisible. For example, in some systems, the atomic unit of identity is even defined as “that which a central authority can provide a valid attestation for”—there is no notion of a Facebook account that isn’t stored in Facebook’s database.

In general, however, we use identity in this series of posts to refer to a persistent tag for an entity (a person, an organization, a bot) that stays constant and representative of the entity over time: legal personhood, a public key, an account ID, etc. We use reputation to refer to the claims about past behavior that can be made about the entity (“Alice has always kept her word,” “Bob has always paid his credit card bills on time,” “Comfort Homes has always used accurate pictures for its Airbnb listings”).

Links and Acknowledgements

Thanks to Yi Sun and David Schwartz for feedback and review.
