ZKML: Bridging AI/ML and Web3 with Zero-Knowledge Proofs

0. Introduction

I am thrilled to share that my ZKML project has been successfully completed with the invaluable support of the Ecosystem Support Program of Privacy & Scaling Explorations (Ethereum Foundation). The platform bridges the AI/ML and Web3 worlds, offering a privacy-preserving solution with the potential to benefit both industries.

This is a proof of concept (POC) of an end-to-end platform that lets machine learning developers convert their TensorFlow Keras models into ZK-compatible versions. This all-in-one solution consists of three core components:

circomlib-ml: A comprehensive Circom library containing circuits that compute common layers in TensorFlow Keras.
keras2circom: A user-friendly translator that converts ML models in Python into Circom circuits.
ZKaggle: A decentralized bounty platform for hosting, verifying, and paying out bounties, similar to Kaggle, but with the added benefit of privacy preservation.

ZKML addresses the limitations of traditional machine learning bounty platforms, which often require full model disclosure for performance verification. The solution leverages ZKPs to enable developers to verify private models with public data, ensuring privacy and security. This is a powerful POC that can attract experienced Web2 developers to the Web3 ecosystem.

1. Background and Rationale

The challenges of traditional ML bounties

Traditional machine learning bounty platforms, such as Kaggle, often require developers to submit their full model to the host for performance verification. This can lead to several issues:

Loss of intellectual property: Disclosing the complete model architecture and weights may expose valuable trade secrets or innovative techniques that developers would prefer to keep private.
Lack of transparency: The evaluation process can be opaque, and participants may not be able to verify the rankings of their models against others.
Data privacy concerns: Sharing models that have been trained on sensitive data may inadvertently reveal information about the underlying data, violating privacy norms and regulations.

These challenges have created a demand for solutions that can protect the privacy of machine learning models and the data they are trained on.

The potential of ZKPs in machine learning

ZKPs present a promising approach to address the challenges faced by traditional ML bounties. By leveraging the power of ZKPs, ZKML offers a privacy-preserving solution with the following benefits:

Model privacy: Developers can participate in bounties without disclosing their entire model architecture and weights, protecting their intellectual property.
Transparent verification: ZKPs enable the verification of model performance without revealing the model’s internals, fostering a transparent and trustless evaluation process.
Data privacy: ZKPs can be used to verify private data with public models or private models with public data, ensuring that sensitive information remains undisclosed.

Integrating ZKPs into the machine learning process provides a secure and privacy-preserving platform that addresses the limitations of traditional ML bounties. This not only promotes the adoption of machine learning in privacy-sensitive industries but also attracts experienced Web2 developers to explore the possibilities within the Web3 ecosystem.

2. Current Scope: A Comprehensive POC

circomlib-ml: A Circom Library for Machine Learning

circomlib-ml is a library of circuit templates for machine learning tasks, written in the Circom language. It contains templates for neural network layers, such as convolutional layers, dense layers, and activation functions. These templates can be composed into custom circuits for machine learning tasks.

keras2circom: Seamless Model Conversion

keras2circom is a Python tool that transpiles TensorFlow Keras models into Circom circuits. This lets developers convert models built in a popular deep learning framework into privacy-preserving ZKP circuits.

ZKaggle: A Decentralized Bounty Platform for Machine Learning

ZKaggle’s first version emerged as a hackathon submission at ETHGlobal FVM Space Warp Hack. The platform enabled decentralized computing by allowing users to share their processing power and monetize their proprietary machine learning models. With a browser-based frontend, bounty providers could upload their data to Filecoin and create computing tasks with associated rewards. Bounty hunters could browse available bounties, download data, and perform computations locally. Upon completion, they would submit a proof with hashed results on-chain for the bounty provider to review. Once approved, bounty hunters could claim their rewards by providing the pre-image of the hashed results. ZKPs were used to maintain a succinct proof of computation and enable bounty hunters to monetize private models with credibility.
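The commit-and-reveal step in this flow can be sketched in a few lines of Python. This is only an illustration of the idea, not the ZKaggle contract logic: SHA-256 stands in for whatever hash the platform actually uses, and the names `commit`, `reveal_matches`, and `results` are hypothetical.

```python
import hashlib


def commit(results: bytes) -> str:
    """Hash the computation results; only this digest is published at first."""
    return hashlib.sha256(results).hexdigest()


def reveal_matches(commitment: str, preimage: bytes) -> bool:
    """Check that a revealed pre-image hashes to the earlier commitment."""
    return hashlib.sha256(preimage).hexdigest() == commitment


# Hypothetical results produced locally by a bounty hunter.
results = b"label=7;label=2;label=1"
commitment = commit(results)

print(reveal_matches(commitment, results))     # honest reveal
print(reveal_matches(commitment, b"label=0"))  # reveal that does not match
```

The bounty provider only sees the digest until approval, and the reward is released once the revealed pre-image reproduces that digest.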

ZKaggleV2 is an improved version with additional functionality. In this version, multiple files are aggregated into a single circuit, allowing for more efficient processing. The platform also verifies the accuracy of the computations and incorporates a secure method for transferring model weights from the bounty hunter to the bounty provider: the weights are encrypted under a shared key derived through an elliptic curve Diffie-Hellman (ECDH) key exchange. This ensures that only authorized parties can access and use the model weights, in keeping with the platform's commitment to privacy and data protection.

3. Code Highlights

circomlib-ml: ZK-friendly Polynomial Activation

circomlib-ml/circuits/Poly.circom

pragma circom 2.0.0;

// Poly activation layer: <https://arxiv.org/abs/2011.05530>
template Poly (n) {
    signal input in;
    signal output out;

    out <== in * in + n*in;
}

Inspired by Ali, R. E., So, J., & Avestimehr, A. S. (2020), the Poly() template was added to implement f(x)=x**2+x as an alternative activation layer to ReLU. Because ReLU is not a polynomial, it costs a large number of constraints per layer. Replacing ReLU with the polynomial activation f(n,x)=x**2+n*x drastically reduces the number of constraints, at the cost of a slight drop in model performance. The parameter n is required when declaring the component; it compensates for the scaling of floating-point weights and biases into integers.
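To see why the parameter n is needed, here is a small Python sketch. It assumes the input has been scaled by a factor s (called `scale` below, an illustrative name). With x_int = s*x, the circuit's x_int**2 + n*x_int equals s**2 * (x**2 + x) exactly when n = s, so the output carries the scale s**2. This is consistent with calc_bias_scale in keras2circom returning current_scale * current_scale for Poly.

```python
from fractions import Fraction


def poly_float(x):
    """Reference activation on real values: f(x) = x**2 + x."""
    return x * x + x


def poly_scaled(x_int: int, n: int) -> int:
    """Integer form computed by the Poly(n) template: in*in + n*in."""
    return x_int * x_int + n * x_int


scale = 10**6                 # assumed scale of the incoming signal
x = Fraction(-37, 100)        # the real-valued input -0.37, kept exact
x_int = int(x * scale)        # integer signal fed to the circuit

out_int = poly_scaled(x_int, n=scale)

# The integer output, divided by scale**2, recovers f(x) exactly.
print(Fraction(out_int, scale**2) == poly_float(x))
```

Fractions are used instead of floats so that the equality check is exact rather than approximate.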

keras2circom: Model Weights “Quantization”

keras2circom/keras2circom/circom.py

...
    def to_json(self, weight_scale: float, current_scale: float) -> typing.Dict[str, typing.Any]:
        '''convert the component params to json format'''
        self.weight_scale = weight_scale
        self.bias_scale = self.calc_bias_scale(weight_scale, current_scale)
        # print(self.name, current_scale, self.weight_scale, self.bias_scale)

        json_dict = {}
        for signal in self.inputs:
            if signal.value is not None:
                if signal.name == 'bias' or signal.name == 'b':
                    # print(signal.value)
                    json_dict.update({f'{self.name}_{signal.name}': list(map('{:.0f}'.format, (signal.value*self.bias_scale).round().flatten().tolist()))})
                else:
                    json_dict.update({f'{self.name}_{signal.name}': list(map('{:.0f}'.format, (signal.value*self.weight_scale).round().flatten().tolist()))})
        return json_dict

    def calc_bias_scale(self, weight_scale: float, current_scale: float) -> float:
        '''calculate the scale factor of the bias of the component'''
        if self.template.op_name in ['ReLU', 'Flatten2D', 'ArgMax', 'MaxPooling2D', 'GlobalMaxPooling2D']:
            return current_scale
        if self.template.op_name == 'Poly':
            return current_scale * current_scale
        return weight_scale * current_scale
...

Circom only accepts integers as signals, but TensorFlow weights and biases are floating-point numbers. Instead of quantizing the model, keras2circom scales the weights up by a factor of 10**m; the larger m is, the higher the precision. Biases (if any) must then be scaled up by 10**(2m) or more, so that they match the scale of the products they are added to and the network output stays correct. keras2circom automates this process by calculating the maximum m possible and scaling each layer accordingly.
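The scaling rule can be checked on a tiny dense layer. The sketch below is not taken from keras2circom; the helper names and the example numbers are made up for illustration, and it assumes the layer input already carries the scale 10**m. Because each product x*w carries the scale current_scale * weight_scale, the bias has to be scaled by that same product before it is added.

```python
def to_int(values, scale):
    """Scale floating-point values and round them to integers."""
    return [round(v * scale) for v in values]


def dense_int(x_int, w_int, b_int):
    """Integer dense layer: one row of weights per output neuron."""
    return [
        sum(x * w for x, w in zip(x_int, row)) + b
        for row, b in zip(w_int, b_int)
    ]


m = 3
weight_scale = 10**m
current_scale = 10**m                      # assumed scale of the layer input
bias_scale = weight_scale * current_scale  # 10**(2m), as in calc_bias_scale

x = [0.5, -1.25]
weights = [[0.2, 0.4], [-0.1, 0.3]]
bias = [0.05, -0.7]

x_int = to_int(x, current_scale)
w_int = [to_int(row, weight_scale) for row in weights]
b_int = to_int(bias, bias_scale)

out_int = dense_int(x_int, w_int, b_int)
print(out_int)                             # integer outputs at scale 10**(2m)
print([v / bias_scale for v in out_int])   # rescaled back to real values
```

The output of this layer is at scale 10**(2m), which then becomes the current scale for the next layer. That is why the required bias scale grows as the network gets deeper.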

ZKaggle: IPFS CID Matching and Universal Encryption Circuits

ZKaggleV2/hardhat/circuits/utils/cid.circom

pragma circom 2.0.0;

include "../sha256/sha256.circom";
include "../../node_modules/circomlib-ml/circuits/circomlib/bitify.circom";

// convert a 797x8 bit array (pgm) to the corresponding CID (in two parts)
template getCid() {
    signal input in[797*8];
    signal output out[2];

    component sha = Sha256(797*8);
    for (var i=0; i<797*8; i++) {
        sha.in[i] <== in[i];
    }

    component b2n[2];

    for (var i=1; i>=0; i--) {
        b2n[i] = Bits2Num(128);
        for (var j=127; j>=0; j--) {
            b2n[i].in[127-j] <== sha.out[i*128+j];
        }
        out[i] <== b2n[i].out;
    }
}

Machine learning datasets are frequently too large to be uploaded directly onto the blockchain, so they are instead uploaded to IPFS. To ensure data integrity throughout the model computation process, a proof-of-concept circuit demonstrates that the IPFS Content Identifier (CID) of a file uploaded as a raw buffer can be computed inside a Circom circuit. This verifies that the computation is performed on the designated file, thereby maintaining the integrity of the process.
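An off-chain counterpart of this circuit might look like the Python sketch below. It rests on several assumptions that the circuit itself does not state: that the Sha256 template emits digest bits most-significant bit first (as circomlib's does), so that out[0] is the high 128 bits and out[1] the low 128 bits, and that the file's CID is a CIDv1 with the raw codec and a sha2-256 multihash. The function names are illustrative. The digest is split in two because a 256-bit value does not fit into a single field element of the roughly 254-bit field Circom uses by default.

```python
import base64
import hashlib


def digest_halves(data: bytes) -> tuple[int, int]:
    """Split the SHA-256 digest into two 128-bit big-endian integers."""
    digest = hashlib.sha256(data).digest()
    high = int.from_bytes(digest[:16], "big")
    low = int.from_bytes(digest[16:], "big")
    return high, low


def raw_cid_v1(data: bytes) -> str:
    """Build a base32 CIDv1 (raw codec, sha2-256) for a raw buffer."""
    digest = hashlib.sha256(data).digest()
    # 0x01 = CID version 1, 0x55 = raw codec,
    # 0x12 = sha2-256 multihash code, 0x20 = digest length (32 bytes)
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded  # "b" is the multibase prefix for lowercase base32


data = bytes(797)  # placeholder 797-byte buffer, the size the circuit expects
high, low = digest_halves(data)
print(high, low)
print(raw_cid_v1(data))
```

Under these assumptions, comparing the two circuit outputs with the digest embedded in the file's CID is enough to tie a proof to a specific file on IPFS.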

ZKaggleV2/hardhat/circuits/utils/encrypt.circom

pragma circom 2.0.0;

include "../../node_modules/circomlib-ml/circuits/crypto/encrypt.circom";
include "../../node_modules/circomlib-ml/circuits/crypto/ecdh.circom";

// encrypt 1000 inputs
template encrypt1000() {
    // public inputs
    signal input public_key[2];

    // private inputs
    signal input in[1000];
    signal input private_key;

    // outputs
    signal output shared_key;
    signal output out[1001];

    component ecdh = Ecdh();

    ecdh.private_key <== private_key;
    ecdh.public_key[0] <== public_key[0];
    ecdh.public_key[1] <== public_key[1];

    component enc = EncryptBits(1000);
    enc.shared_key <== ecdh.shared_key;

    for (var i = 0; i < 1000; i++) {
        enc.plaintext[i] <== in[i];
    }

    for (var i = 0; i < 1001; i++) {
        out[i] <== enc.out[i];
    }

    shared_key <== ecdh.shared_key;
}
...

To maintain the integrity of the proof during the bounty claim process, ZKaggleV2 incorporates a universal model weight encryption circuit. This circuit is precompiled and deployed for use across all bounties and models. The existing implementation supports models with up to 1000 weights, and any model with fewer weights can be zero-padded at the end to conform to the required size. This approach ensures a consistent and secure method of handling model weights.
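The overall flow (derive a shared key, zero-pad the weights, encrypt, decrypt on the other side) can be illustrated with a toy Python sketch. It deliberately swaps in simpler primitives than the circuit uses: classic Diffie-Hellman modulo a prime instead of ECDH on an elliptic curve, and a SHA-256-based additive mask instead of the EncryptBits template. It also does not try to reproduce the circuit's 1001-element output layout. All names and constants are illustrative, and the code is not suitable for real use.

```python
import hashlib
import secrets

P = 2**127 - 1       # toy prime modulus, NOT the curve used by the circuit
G = 3                # toy generator, chosen only for illustration
MAX_WEIGHTS = 1000   # fixed input size of the universal circuit


def public_key(private_key: int) -> int:
    return pow(G, private_key, P)


def shared_key(private_key: int, other_public_key: int) -> int:
    """Both parties arrive at the same value from their own private key."""
    return pow(other_public_key, private_key, P)


def pad_weights(weights, size=MAX_WEIGHTS):
    """Zero-pad at the end so every model fits the same circuit."""
    if len(weights) > size:
        raise ValueError("model has more weights than the circuit supports")
    return list(weights) + [0] * (size - len(weights))


def mask(key: int, index: int) -> int:
    """Toy keystream value for one position, derived from the shared key."""
    data = key.to_bytes(16, "big") + index.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P


def encrypt(weights, key):
    return [(w + mask(key, i)) % P for i, w in enumerate(pad_weights(weights))]


def decrypt(ciphertext, key):
    return [(c - mask(key, i)) % P for i, c in enumerate(ciphertext)]


hunter_sk = secrets.randbelow(P - 2) + 1
provider_sk = secrets.randbelow(P - 2) + 1

weights = [12, 345, 6789]  # hypothetical scaled integer weights
ciphertext = encrypt(weights, shared_key(hunter_sk, public_key(provider_sk)))
recovered = decrypt(ciphertext, shared_key(provider_sk, public_key(hunter_sk)))
print(recovered[:3], len(recovered))
```

Only the bounty provider's public key and the ciphertext need to be visible. The hunter's private key and the plaintext weights stay private, which mirrors the split between public and private inputs in the circuit.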

Please visit the respective repositories linked above for full implementation and usage details.

4. Limitations and Potential Improvements

Proving Scheme: Groth16

The project currently employs Groth16 as the proving scheme to minimize proof size. However, the platform could be extended to support other proving schemes supported by snarkjs that do not require a circuit-specific trusted setup, such as PLONK or FFLONK.

Contract Size and Local Testing

At present, the contracts and frontend can only be tested locally because the contract size exceeds the EIP-170 limit of 24,576 bytes of deployed bytecode. This constraint prevents deployment on the Ethereum mainnet (or its testnets) and restricts the platform's usability for wider audiences. To address this limitation, developers could investigate L2 solutions or EVM-compatible chains that allow larger contracts, enabling this POC to be deployed and used more extensively.
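A quick way to check a compiled contract against this limit is to measure its deployed (runtime) bytecode. The Python helper below is a minimal sketch: the function names are hypothetical, and it assumes the bytecode is available as a hex string such as the one found in a compiler artifact.

```python
EIP170_MAX_CODE_SIZE = 0x6000  # 24,576 bytes, as specified by EIP-170


def runtime_size(bytecode_hex: str) -> int:
    """Size in bytes of a hex-encoded bytecode string (optional 0x prefix)."""
    hex_body = bytecode_hex[2:] if bytecode_hex.startswith("0x") else bytecode_hex
    return len(bytes.fromhex(hex_body))


def exceeds_eip170(bytecode_hex: str) -> bool:
    """True if the deployed bytecode is too large for mainnet deployment."""
    return runtime_size(bytecode_hex) > EIP170_MAX_CODE_SIZE


# Placeholder bytecode one byte over the limit.
too_big = "0x" + "00" * (EIP170_MAX_CODE_SIZE + 1)
print(runtime_size(too_big), exceeds_eip170(too_big))
```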

5. TLDR and Call to Action

In summary, this project is a proof-of-concept platform that aims to bridge the AI/ML and Web3 worlds using ZKPs. It offers a suite of tools consisting of circomlib-ml, keras2circom, and ZKaggleV2.

The open-source community is invited to contribute to the ongoing development of ZKML. In particular, contributions in the form of additional templates for circomlib-ml, extending support for more layers in keras2circom, and reporting any bugs or issues encountered are highly encouraged. Through collaboration and contributions to this exciting project, the boundaries of secure and privacy-preserving machine learning in the Web3 ecosystem can be pushed even further.

https://hackmd.io/@cathie/zkml
