ZK Identity: Why and How (Part 1)

Last month, we kicked off the 0xPARC ZK-Identity Working Group: a working group experimenting with zkSNARKs to build digital identity tools. This post is the first in a series on why advances in cryptography will be important for enabling new identity primitives. This first post covers the “Why”; future posts will cover the “How.”

The design of online identity systems has been the subject of intense debate in the last few years. Modern digital identity systems have enabled new and complex kinds of online interactions and communities. Unfortunately, many of these systems also have significant weaknesses.

Many of these weaknesses can be attributed to inherent limitations of centralized identity system designs. Firstly, these systems are generally built around central points of control—and therefore, central points of failure. Modern e-commerce, social media, and messaging platforms are vulnerable to pressure or interference from powerful actors (such as authoritarian governments), or to technical attacks from malicious hackers; when a central operator is coerced or hacked, many parties beyond just the central operator are put at risk. Secondly, these systems rely on the concentration of power in the hands of operators who cannot possibly be fully aligned (financially, socially, or morally) with all of their users—for example, private social media companies with diverse global audiences must often make decisions about what constitutes an act of unjustified censorship versus an act in the interest of public safety, though they are often ill-equipped to do so.

Decentralized and cryptographic mechanisms are not a magical panacea, but they do offer some useful tools, and they expand the design space for digital identity systems. As more of our social and economic lives move online, designing secure, privacy-preserving, and user-controlled identity systems will become increasingly important. In this post, we’ll argue that new cryptographic primitives such as zkSNARKs will be crucial for building identity systems with these properties.

At their core, zkSNARKs are useful because they enable users of digital systems to produce credible claims of arbitrary complexity, without reliance on a trusted party. All identity systems are built around some mechanism for generating credible claims of identity and reputation—typically, fairly complex claims attached to attestations from a trusted authority, like a government or a company. By applying zkSNARK constructions to claims about identity and reputation, we can rearchitect digital identity systems and put control and data custody in the hands of users.

Credible Claims

As zkSNARKs operate over precise, mathematically-defined “claims,” we must first precisely break down the nature of claims involved in identity systems.

It’s hard to do business with a completely unknown and untrusted counterparty. Common sense tells us that as trust between counterparties decreases, the likelihood of cooperation does as well; game theory tells us that the optimal strategy in a one-shot prisoner’s dilemma is always to defect. Would you be more willing to buy a used car from a close friend who is tightly linked to your social circles, or from a sleazy Craigslist seller dropping by from out-of-town, who won’t even give you their name?

To build trust with each other, we need to be able to make credible claims: claims about our identities and reputations that people who we interact with can find believable. It’s not really a credible claim if the Craigslist seller above assures you he’s “sold tons of cars before, and everyone loves my cars—take my word for it.” But if this claim was associated with a collection of five-star ratings from verified buyers on a popular website that you know about, it would certainly feel more credible.

The idea of credible claims sounds obvious, but building and legitimizing a mechanism (in this example, a popular listings website) for producing credible claims is no easy task. In traditional models, our usual solution is to delegate record-keeping management to a trusted authority, so that they can attest to our claims about identity and reputation and lend them credibility. This authority must prove their legitimacy and trustworthiness over time (often in adversarial environments), and maintain infrastructure for generating and distributing attestations at scale.

Crucially, in most models, the central authority’s attestation is what makes a claim credible. This is a valid government ID, so I’m a citizen; that is an accurate list of my followers, so I’m a social influencer; these are a vetted set of reviews and ratings, so I’m a trustworthy online retailer.

Another application for credible claims lives at an even lower level of the stack. To begin with, how do you know that the person or business you’re interacting with is presenting you with a claim about themselves, and not someone else? In systems that depend on a trusted authority, these authorities take on the even more basic function of attesting to identity itself. An API access token, a government-issued passport, or a chain of signatures generated by Certificate Authorities when you visit a website are all attestations for claims about identity.

Useful identity systems allow participants to make a very wide variety of complex credible claims:

(Digital) When you order food delivery via DoorDash, the DoorDash webserver makes a credible claim to you (“I am the DoorDash webserver” via a chain of certificate signatures rooted in a Certificate Authority); you make credible claims about your identity to DoorDash via a third-party identity provider (“I am a DoorDash user who should be allowed to access this account’s saved credit cards” via “Login with Google”); you make credible claims about future payment to DoorDash via various financial institutions (“I have the money to pay you for my order, and this payment will arrive soon” via a credit card provider that does not decline the transaction).
(Physical) When you take out a mortgage to purchase a house, you’re implicitly making a huge number of credible claims about your identity and reputation to your bank, to your real estate agent, to a seller, and to the government.
(Mixture) When you apply for a job, you make credible claims to your potential employer by drawing on many different attestation systems. You claim that you have adequate training and temperament for the job, citing attestations (degrees, certificates) from educational institutions or professional certification authorities, from other colleagues you’ve worked with, and from prior companies. Social media and other online account providers provide further attestations for implicit claims about the kind of person you are.

Privacy

The fact that nearly all identity systems inherently require privacy to function as intended further complicates the picture.

Privacy is important for ethical and ideological reasons that are sometimes controversial; but even more fundamentally, it’s often necessary as a simple matter of system design. For example, almost all identity systems rely on the notion of secret data to generate believable claims about identity—a password, Social Security number, private key, credit card PIN, account recovery question, etc. This data must be kept private, for obvious reasons. Additionally, the process of producing credible claims with completely transparent data may have negative externalities, or at least externalities that are hard to reason about; privacy guarantees prevent these. For example, if you had to present your entire financial history just to buy or sell an item in an online marketplace—bank statements, credit card transactions, loan payments, and the works—a counterparty could use this information to initiate out-of-scope interactions that have nothing to do with the original transaction (negative examples include advertising, or even harassing and blackmailing). Privacy “sandboxes” one-off interactions, cleanly defining and limiting their scope, so that we can build more complex systems from simple and understandable building blocks.

In traditional systems that require privacy, we have to delegate even more power to the central authority—in such systems, central authorities store private data and attest to credible claims about this data that are nearly impossible to verify.

The Role of Cryptography

So far, all of the models we’ve discussed for credible claim generation and identity systems have involved a centralized actor. And as we’ve discussed, there are plenty of reasons why we may want to explore systems that don’t rely on a powerful record-keeper or manager.

Immediately, we run into the obvious problems: how can I trust your claims, when I don’t have your data? If you send me your data, how do I know that the data is valid? And what do we do if you’re trying to make claims about private data? This is exactly where cryptography comes in.

Viewed through this lens, much of applied cryptography (and consensus) over the last fifty years has been a project of gradually expanding the scope of credible claims that can be made without a trusted authority, under various resource constraints and privacy conditions:

Digital signature schemes allow me to make a credible claim about the consistency of my identity online, across a sequence of multiple different actions, by signing a series of messages with the same private key. “I am authorized to charge to Alice’s credit card.”
Group signature schemes allow me to make more complex privacy-preserving claims about identity. “I am a member of this alumni organization, but I won’t tell you which member.”
Signature aggregation, multi-signature, and threshold signature schemes allow me to make claims about group behavior, under various different resource constraints. “This large collective body—not just a single rogue employee—has authorized a currency transfer from our financial accounts.”
Consensus schemes and programmable smart contracts allow me to make credible and irreversible commitments to future actions. “If you send me digital asset A, I will immediately send you digital asset B in return.”
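
To make the first item in this list concrete, here is a minimal sketch of a textbook Schnorr-style signature in Python. The post does not name a particular scheme, so this choice is ours, purely for illustration. The group parameters (`P`, `G`) are toy values and the function names are hypothetical; a real system would rely on an audited library and a standard curve. The point is only that several messages verified against the same public key form a credible claim that one key holder produced them all.

```python
import hashlib
import secrets

# Toy parameters for illustration only (NOT secure): P is the Mersenne prime 2**127 - 1.
P = 2**127 - 1
ORDER = P - 1   # exponents can be reduced modulo P - 1 (Fermat's little theorem)
G = 3


def challenge(commitment: int, message: bytes) -> int:
    """Hash the commitment together with the message into an exponent."""
    data = commitment.to_bytes(16, "big") + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def keygen():
    """Return (secret_key, public_key)."""
    secret_key = secrets.randbelow(ORDER - 1) + 1
    return secret_key, pow(G, secret_key, P)


def sign(secret_key: int, message: bytes):
    """Return a signature (e, s) on the message."""
    nonce = secrets.randbelow(ORDER - 1) + 1
    commitment = pow(G, nonce, P)
    e = challenge(commitment, message)
    s = (nonce + secret_key * e) % ORDER
    return e, s


def verify(public_key: int, message: bytes, signature) -> bool:
    """Check a signature using only the public key."""
    e, s = signature
    # G^s * public_key^(-e) equals the signer's commitment when the signature is valid.
    commitment = (pow(G, s, P) * pow(public_key, ORDER - e, P)) % P
    return challenge(commitment, message) == e


if __name__ == "__main__":
    sk, pk = keygen()
    first = sign(sk, b"first post")
    second = sign(sk, b"second post")
    # Both messages verify against the same public key.
    print(verify(pk, b"first post", first), verify(pk, b"second post", second))
```

Note that the verifier never sees the secret key, and no third party has to vouch for the link between the two messages.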

Progress has historically been slow—each of these cryptographic primitives defines a new and tightly scoped kind of claim, whose structure is highly specified. However, this has changed in the last several years.

What’s exciting today is that we now have the machinery to make arbitrary credible claims efficiently, thanks to SNARKs. And with the zero-knowledge property of zkSNARKs, we can also tune the privacy guarantees of our claims exactly to our liking.

Here are a few examples of the kinds of claims that you can make with a zkSNARK that would not have been possible before:

“I’m a trustworthy debtor: I’ve paid off large loans from three trusted banks in a timely fashion, though I won’t reveal the banks or what the loans were taken out for.”
“I’m a respected community member: Though I am writing this post anonymously, under my named account I have accumulated over 10000 upvotes on this forum.”
“I’m a long-term cryptotoken collector: Ethereum addresses that I control collectively hold at least two NFTs from the Dark Forest Valhalla collection, and at least 100 ETH.”

These claims can be combined, composed, and even programmed in arbitrarily complex ways.
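
As a rough illustration, the statement behind the collector example can be written as an ordinary predicate over public parameters and a private witness. A zkSNARK would let a prover convince a verifier that such a predicate returns true without revealing the witness. The Python sketch below only evaluates the predicate and generates no proof. All names (`Holding`, `PublicStatement`, `collector_claim`, `upvote_claim`) are hypothetical. It also omits checks a real circuit would need, such as proving control of each address and proving that the balances match on-chain state.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Holding:
    """Private witness entry: one address the prover controls (hypothetical shape)."""
    address: str
    eth_balance: int      # whole ETH, for simplicity
    valhalla_nfts: int


@dataclass(frozen=True)
class PublicStatement:
    """Public thresholds that both prover and verifier can see."""
    min_eth: int
    min_nfts: int


def collector_claim(statement: PublicStatement, witness: List[Holding]) -> bool:
    """True if the private holdings collectively meet the public thresholds."""
    total_eth = sum(h.eth_balance for h in witness)
    total_nfts = sum(h.valhalla_nfts for h in witness)
    return total_eth >= statement.min_eth and total_nfts >= statement.min_nfts


def upvote_claim(min_upvotes: int, private_upvotes: int) -> bool:
    """True if the prover's (hidden) named account has more than min_upvotes upvotes."""
    return private_upvotes > min_upvotes


if __name__ == "__main__":
    witness = [Holding("0xaaa", 60, 1), Holding("0xbbb", 45, 1)]
    statement = PublicStatement(min_eth=100, min_nfts=2)
    # Composition is just boolean logic over individual claims.
    print(collector_claim(statement, witness) and upvote_claim(10000, 12345))
```

Because each claim is just a function returning a boolean, combining claims amounts to combining predicates, which is what makes them programmable.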

While all of this is theoretically possible, we still have a long way to go. Producing a robust suite of ZK identity tools for the next generation of applications requires making substantial improvements in performance, reliability, developer experience, and application design patterns. In the next post, we’ll discuss our understanding of the road ahead.

Addendum: What’s in an Identity?

To understand where cryptography can be useful for building an identity system, it’s useful to break down the idea of an identity system into its key components.

In analyzing a particular identity system, we might ask some of the following questions:

What is the atomic unit of identity?
- Physical world: identity is often associated with legal personhood. In other words, the atomic unit of identity is an individual person, or a corporation.
- Cyberspace: identity can be a Google/Facebook/Twitter account; the public/private key pair associated with a Certificate Authority; a holder of some Ethereum-based token (which may not be tied to a specific address!); or others.
What constitutes a valid proof or attestation of identity? Who can issue attestations for identity? Who can revoke the privileges associated with identity attestation?
- Physical world: a valid attestation might look like a state-issued ID or an EIN letter. The government ultimately has power over the privileges that come with holding a valid identity attestation: for example, the government can revoke your passport.
- Cyberspace: a valid attestation might be a Facebook-provided OAuth token, or a valid digital signature (or chain of signatures). Various service providers have power over various attestations: for example, Twitter can ban your account.
Who custodies auxiliary data associated with your identity? Who can access this data, and who controls this access?
- Physical world: auxiliary data is held by a combination of government agencies and bureaucracies, private service providers (banks, credit score agencies), and private individuals (your personal network).
- Cyberspace: in centralized models, auxiliary data is held by big tech companies. In decentralized models, auxiliary data is held by a combination of client software (a browser, a personal webserver) which you control, as well as decentralized storage networks (for example, historical transaction data or smart contract state in a blockchain).
What records, artifacts, or attestations signal reputation and credibility? Who decides these signals and how they are interpreted? Who has access to the underlying input data that determines reputation? Who can access these signals?
- Physical world: credit score reports, background checks, social references, letters of employment, credentials and honors and titles.
- Cyberspace: NFT ownership, account age and previous activity, networks of attestations, karma/forum upvotes.

Some of these concepts blend into each other: identity, reputation, and proof-of-identity are closely related, and not easily divisible. For example, in some systems, the atomic unit of identity is even defined as “that which a central authority can provide a valid attestation for”—there is no notion of a Facebook account that isn’t stored in Facebook’s database.

In general, however, we use identity in this series of posts to refer to a persistent tag for an entity (a person, an organization, a bot) that stays constant and representative of the entity over time: legal personhood, a public key, an account ID, etc. We use reputation to refer to the claims about past behavior that can be made about the entity (“Alice has always kept her word,” “Bob has always paid his credit card bills on time,” “Comfort Homes has always used accurate pictures for its Airbnb listings”).

Links and Acknowledgements

Thanks to Yi Sun and David Schwartz for feedback and review.
