zkML: Evolving the Intelligence of Smart Contracts Through Zero-Knowledge Cryptography

Proving machine learning (ML) model inference via zkSNARKs is poised to be one of the most significant advancements in smart contracts this decade. This development opens up an excitingly large design space, allowing applications and infrastructure to evolve into more complex and intelligent systems.

By adding ML capabilities, smart contracts can be made more autonomous and dynamic, allowing them to make decisions based on real-time on-chain data, rather than just static rules. Smart contracts would be flexible and adaptable to various scenarios, including those that may not have been anticipated when the contract was initially created. In short, ML capabilities will broaden the automation, accuracy, efficiency, and flexibility of any smart contract we put on-chain.

In many ways, it is surprising that smart contracts do not use embedded ML models today, given the prominence of ML in the majority of applications outside of web3. This absence is largely due to the high computational cost of running these models on-chain. For example, FastBERT, a compute-optimized language model, uses ~1800 MFLOPs (million floating-point operations), which would be infeasible to run directly on the EVM.
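To make "infeasible" concrete, here is a rough back-of-the-envelope estimate. It is a sketch under assumptions that are not from this article: EVM gas prices of 5 for `MUL` and 3 for `ADD`, a block gas limit of 30 million, and an even split between multiplies and adds. The EVM has no native floating point, so the real cost would be higher than this lower bound.

```python
# Rough lower bound on the gas needed to run a FastBERT-scale model on the EVM.
# Assumed values (not from the article): MUL = 5 gas, ADD = 3 gas,
# block gas limit = 30 million, operations split evenly between MUL and ADD.

OPS_PER_RUN = 1_800 * 10**6      # ~1800 MFLOPs, the figure quoted in the text
GAS_PER_MUL = 5                  # assumption
GAS_PER_ADD = 3                  # assumption
BLOCK_GAS_LIMIT = 30_000_000     # assumption


def estimate_gas(ops: int) -> int:
    """Count only arithmetic opcodes; ignore memory, stack, and storage costs."""
    muls = ops // 2
    adds = ops - muls
    return muls * GAS_PER_MUL + adds * GAS_PER_ADD


gas = estimate_gas(OPS_PER_RUN)
blocks = gas / BLOCK_GAS_LIMIT

print(f"gas for arithmetic alone: {gas:,}")
print(f"full blocks consumed:     {blocks:,.0f}")
```

Under these assumptions a single model run would need the entire capacity of hundreds of consecutive blocks, before counting any overhead for emulating floating-point numbers.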

When considering the application of ML models on-chain, the primary focus is on the inference phase: applying the model to make predictions over real-world data. In order to have ML-scaled smart contracts, contracts must be able to ingest such predictions, but as we mentioned earlier, it’s infeasible to run the model directly on the EVM. zkSNARKs give us a solution: anyone can run a model off-chain and generate a succinct and verifiable proof showing that the intended model did in fact produce a particular result. This proof can be published on-chain and ingested by smart contracts to amplify their intelligence.

In this piece, we will:

Review the potential applications and use cases for on-chain ML
Explore the emerging projects and infrastructure building at the heart of zkML
Discuss some of the challenges of existing implementations and what the future of zkML could look like

A Quick Primer on ML

Machine learning (ML) is a subfield of artificial intelligence (AI) that focuses on developing algorithms and statistical models that enable computers to learn from and make predictions or decisions based on data. ML models typically have three primary components:

Training Data: A set of input data that is used to train a machine learning algorithm to make predictions or classify new data. Training data can take many forms, such as images, text, audio, numerical data, or a combination of these.
Model Architecture: The overall structure or design of a machine learning model. It defines the types and number of layers, activation functions, and connections between nodes or neurons. The choice of architecture depends on the specific problem and data being used.
Model Parameters: Values or weights that the model learns during the training process to make predictions. These values are adjusted iteratively through an optimization algorithm to minimize the error between predicted and actual outcomes.

Models are produced and deployed in two phases:

Training Phase: During training, the model is exposed to a labeled dataset and adjusts its parameters to minimize the error between predicted and actual outcomes. The training process typically involves several iterations or epochs, and the accuracy of the model is evaluated on a separate validation set.
Inference Phase: The inference phase is when a trained machine learning model is used to make predictions on new, unseen data. The model takes in input data and applies the learned parameters to generate an output, such as a classification or regression prediction.
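The two phases can be illustrated with a toy example. This is a minimal sketch, assuming a one-feature linear model and hand-written gradient descent; the data, learning rate, and function names are illustrative and not taken from any zkML project.

```python
# Toy illustration of the two phases for a one-feature linear model y = w*x + b.

def train(data, lr=0.05, epochs=2000):
    """Training phase: adjust parameters to minimize mean squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def infer(params, x):
    """Inference phase: apply the learned parameters to a new input."""
    w, b = params
    return w * x + b


# Labeled training data generated from y = 2x + 1.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

params = train(training_data)       # expensive, done once
prediction = infer(params, 10.0)    # cheap, done per input
print(params, prediction)
```

Only the `infer` step is what zkML proofs typically attest to today: given fixed parameters and an input, the stated output was computed correctly.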

zkML today focuses on the inference phase of ML models rather than the training phase, primarily because of the computational complexity of verifying training in-circuit. This focus on verifying inference is not a limitation, though: we expect some very interesting use cases and applications to come from the inference phase.

Verified Inference Scenarios

There are four possible scenarios for verified inference:

Private Input, Public Model. A Model Consumer (MC) may wish to keep their inputs private from a Model Provider (MP). For example, an MC may wish to prove the result of a credit-scoring model to a lender without disclosing their personal financial information. This could be done using a pre-commitment scheme and running the model locally.
Public Input, Private Model. A common issue with ML-as-a-Service is that the MP may wish to hide their parameters or weights to protect their IP, while the MC wants to verify, in an adversarial setting, that the generated inferences do in fact come from the specified model. Think of it this way: an MP has an incentive to run a lighter model to save on costs when serving inferences to an MC. Using a commitment to the model weights on-chain, an MC can audit a private model at any time.
Private Input, Private Model. This scenario arises when the data being used for inference is highly sensitive or confidential, and the model itself is hidden to protect IP. An example of this might include auditing a healthcare model using private patient information. Compositional techniques in zk, multi-party computation (MPC), or variations of fully homomorphic encryption (FHE) can be used to service this scenario.
Public Input, Public Model. When all aspects of the model can be public, zkML solves for a different use case: compression and verification of off-chain computation for an on-chain environment. For larger models, it is more cost-effective to verify a succinct zk proof of an inference than to re-run the model on-chain.
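The commitment to model weights mentioned in the private-model scenario can be sketched with a salted hash. This is a simplified illustration, assuming SHA-256 over a JSON encoding of the weights; the function names are hypothetical. In a real zkML system the proof would show inside the circuit that the committed weights were used, so the weights are never revealed. Here the opening is checked in the clear, as an auditor with access to the weights might do.

```python
import hashlib
import json
import secrets


def commit(weights, salt: bytes) -> str:
    """Bind the provider to a set of weights without publishing them."""
    payload = salt + json.dumps(weights, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def verify_opening(commitment: str, weights, salt: bytes) -> bool:
    """Check that revealed weights and salt match the published commitment."""
    return secrets.compare_digest(commitment, commit(weights, salt))


# The provider publishes only the commitment (e.g. on-chain).
weights = [0.12, -0.5, 1.7]
salt = secrets.token_bytes(32)   # random salt hides low-entropy weights
commitment = commit(weights, salt)

# Later, an audit confirms the served model matches the commitment...
print(verify_opening(commitment, weights, salt))

# ...and a swapped-in lighter model does not.
print(verify_opening(commitment, [0.12, -0.5], salt))
```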

Applications and Opportunities

Verified ML inference opens up a new design space for smart contracts. Some crypto-native applications include:

DeFi

**Verifiable Off-Chain ML Oracles.** Continued adoption of generative AI may help push industries to implement signature schemes for their content (e.g. a news publication signing an article or image). Signed data is ZK-ready and makes the data composable and trustworthy. ML models can process this signed data off-chain to make predictions and classifications (e.g. classifying the outcome of an election or a weather event). These off-chain ML oracles could be used to trustlessly settle real-world prediction markets, insurance protocol contracts, and more by verifying the inference and publishing the proof on-chain.
**ML-parameterized DeFi applications.** Many facets of DeFi could be more automated. For example, a lending protocol could use an ML model to update parameters in real time. Today, lending protocols primarily trust off-chain models run by organizations to determine the collateral factor, LTV, liquidation threshold, etc., but a better alternative could be community-trained open-source models that can be run and verified by anyone.
Automated trading strategies. A common way of showcasing the return profile of a financial model strategy is for the MP to provide various backtests to investors. However, there is no way to verify the strategist is following the model when executing a trade – instead, the investor must trust the strategist is actually following the model. zkML offers a solution where the MP can provide proof of the financial model inference when deploying into a specific position. This could be particularly useful for DeFi managed vaults.

Security

**Fraud monitoring for smart contracts.** Rather than having slow human governance or centralized actors control the ability to pause a contract, an ML model could be used to detect possible malicious behavior and enact a pause.

Traditional ML

A decentralized, trustless implementation of Kaggle. A protocol or marketplace could be created that allows an MC or other interested party to verify the accuracy of a model without the MP disclosing the model weights. This can be useful for selling models, running competitions around model accuracy, etc.
Decentralized prompt marketplaces for generative AI. Prompt creation for generative AI has evolved into a sophisticated craft where the prompts that generate the best outputs are often complex, with many modifiers. External parties could be willing to purchase these sophisticated prompts from a creator. zkML can be used in two ways here: 1) verifying the outputs of a prompt to assure a potential purchaser that the prompt does in fact create the desired images, and 2) allowing the prompt owner to maintain ownership of the prompt after a purchase by keeping it obfuscated from the purchaser while still generating verified images for them.

Identity

**Replacing the private key with privacy-preserving biometric authentication.** Private key management remains one of the largest frictions in web3 UX. Abstracting the private key via facial recognition or other unique factors is one possible solution from zkML.
Fair airdrops and contributor rewards. An ML model could be used to create detailed personas of users that determine airdrop allocations or contribution rewards based on multiple factors. This can be particularly powerful when combined with an identity solution. In this scenario, one possibility could be having users run an open-source model that evaluates their participation in an application along with higher context participation like governance forum posts to infer an allocation for them. They then provide this proof to the contract to receive their token allocation.

Web3 Social

Filtering for web3 social media. The decentralized nature of web3 social applications will result in higher volumes of spam and malicious content. Ideally, a social media platform could use an open-source ML model that is agreed upon by the community and publish proofs of the model’s inference when electing to filter a post. For more on this, see Daniel Kang’s zkML analysis of the Twitter algorithm.
Advertising / Recommendations. As a social media user, I may be willing to view personalized advertisements but wish to keep my preferences and interests private from advertisers. I could elect to run a model on my tastes locally that feeds into media applications to serve me content. Advertisers in this scenario may be willing to pay end users to do this; however, these models would likely be much less sophisticated than the targeted advertising models in production today.

Creator Economy / Gaming

In-game Economic Rebalancing. ML models could be used to dynamically adjust token issuance, supply, burn, voting thresholds, etc. One possible model would be an incentivized contract that rebalances an in-game economy if a certain rebalancing threshold is met and a proof of inference is verified.
New types of on-chain games. Cooperative human-versus-AI games and other innovative takes on on-chain games could be created where a trustless AI model serves as a non-playable character. Each action the NPC takes is published to the chain with a proof that anyone can verify to confirm the correct model is being run. In the case of Modulus Labs’ Leela vs. the World, the verifier wants to ensure the stated 1900 Elo AI is picking the chess moves and not Magnus Carlsen. Another example is AI Arena, a Super Smash Bros.-style AI fighting game. Players in high-stakes competitive environments would want to be sure that there is no interference or cheating involved with their trained models.

Emerging Projects and Infrastructure

The ecosystem for zkML can be broadly broken down into four primary categories:

Model-to-Proof Compilers: Infrastructure that compiles a model from an existing format (e.g. PyTorch, ONNX, etc.) into a verifiable computational circuit.
Generalized Proving Systems: Proving systems built to verify an arbitrary computational trace.
zkML-Specific Proving Systems: Proving systems specifically built to verify an ML model’s computational trace.
Applications: Projects working on unique zkML use cases.

Model-to-Proof Compilers

When looking at the zkML ecosystem, most attention has been given to creating model-to-proof compilers. Generally, these compilers translate high-level ML models written in PyTorch, TensorFlow, or the like into zk circuits.

EZKL is a library and command-line tool for doing inference for deep learning models in a zk-SNARK. With EZKL, you can define a computational graph in PyTorch or TensorFlow, export it as an ONNX file with some sample inputs in a JSON file, and point EZKL to these files to generate a zkSNARK circuit. With the latest round of performance improvements, EZKL can now prove an [MNIST](https://en.wikipedia.org/wiki/MNIST_database)-sized model in about 2 seconds and 700MB of RAM. EZKL has seen significant early adoption, serving as infrastructure for various hackathon projects.

Cathie So’s circomlib-ml library contains various ML circuit templates for Circom. Circuits include some of the most common ML functions. Keras2circom, also built by Cathie, is a Python tool that transpiles a Keras model into a Circom circuit using the underlying circomlib-ml library.

LinearA has developed two frameworks for zkML: Tachikoma and Uchikoma. Tachikoma is used to convert neural networks into an integer-only form and generate a computational trace. Uchikoma is a tool that converts TVM’s intermediate representation into programming languages that don’t support floating point operations. LinearA plans to support Circom, which uses field arithmetic, and Solidity, which uses signed and unsigned integer arithmetic.

Daniel Kang’s zkml is a framework for constructing proofs of ML model execution in ZK-SNARKs based on his work in the Scaling up Trustless DNN Inference with Zero-Knowledge Proofs paper. As of this writing, it can prove an MNIST circuit in around 16 seconds using around 5GB of memory.

Among the more generalized model-to-proof compilers are Nil Foundation and Risc Zero. Nil Foundation’s zkLLVM is an LLVM-based circuit compiler capable of verifying computational models written in popular programming languages such as C++, Rust, and JavaScript/TypeScript, among others. It is more general infrastructure than some of the other model-to-proof compilers noted here, but it is still suitable for complex computations like zkML. This can be particularly powerful when combined with their proof market.

Risc Zero has built a general-purpose zkVM targeting the open-source RISC-V instruction set, hence supporting existing mature languages such as C++ and Rust, as well as the LLVM toolchain. This allows for seamless integration between host and guest zkVM code, similar to Nvidia’s CUDA C++ toolchain, but with a ZKP engine in place of a GPU. Similar to Nil, it is possible to verify the computational trace of an ML model using Risc Zero.

Generalized Proving Systems

Improvements in proving systems have been the primary enabler in bringing zkML to fruition, specifically the introduction of custom gates and lookup tables. This is primarily due to ML’s reliance on nonlinearities. In short, nonlinearities are introduced through activation functions (e.g. ReLU, sigmoid, and tanh), which are applied to the outputs of linear transformations within a neural network. These nonlinearities are challenging to implement in zk circuits due to limitations around mathematical operation gates. Bitwise decomposition and lookup tables can help solve this problem. A lookup table precomputes the possible outcomes of a nonlinearity, which interestingly is much more computationally efficient in zk.
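The lookup-table idea can be sketched outside of any proving system. The example below is illustrative only: it assumes 8-bit signed quantized activations and uses plain set membership to stand in for a lookup argument, in which the prover would show that every (input, output) pair in its witness appears in a fixed public table.

```python
# Precompute ReLU over every possible 8-bit signed input, then check claimed
# (input, output) pairs by table membership instead of re-deriving them.
# The 8-bit range is an assumption chosen to keep the table small.

BITS = 8
LO = -(1 << (BITS - 1))         # -128
HI = (1 << (BITS - 1)) - 1      #  127


def relu(x: int) -> int:
    return x if x > 0 else 0


# The fixed table: one row per possible input.
RELU_TABLE = {(x, relu(x)) for x in range(LO, HI + 1)}


def check_activation(x: int, y: int) -> bool:
    """A claimed pair is valid only if it is a row of the table."""
    return (x, y) in RELU_TABLE


def check_layer(pre, post) -> bool:
    """Check every activation in a layer against the table."""
    return len(pre) == len(post) and all(
        check_activation(x, y) for x, y in zip(pre, post)
    )


print(check_layer([-3, 0, 5], [0, 0, 5]))   # honest layer output
print(check_layer([-3, 0, 5], [3, 0, 5]))   # tampered first activation
```

The same table construction works for sigmoid or tanh once inputs and outputs are quantized, since the table simply enumerates the function over a finite domain.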

[Plonkish](https://zcash.github.io/halo2/concepts/arithmetization.html) proof systems tend to be the most popular backends for zkML for this reason. Halo2 and Plonky2, with their table-style arithmetization scheme, can handle neural network nonlinearities well via lookup arguments. In addition, the former has a vibrant developer tooling ecosystem coupled with flexibility, making it the de facto backend for many projects including EZKL.

Other proof systems have their benefits as well. R1CS-based proof systems include Groth16, valued for its small proof sizes, and Gemini, valued for its handling of extremely large circuits and its linear-time prover. STARK-based systems like the Winterfell prover/verifier library are also useful, especially when used via Giza’s tooling, which takes a Cairo program’s trace as input and generates a STARK proof with Winterfell to attest to the correctness of the output.

zkML-Specific Proving Systems

Some progress has been made in designing efficient proof systems that can handle the complex, circuit-unfriendly operations of advanced ML models. Systems like zkCNN, which is based on the GKR proving system, or systems that use compositional techniques, like Zator, are often more performant than their generalized counterparts, as evidenced by Modulus Labs’ benchmarking report.

zkCNN is a method for proving the correctness of convolutional neural networks using zero knowledge proofs. It uses a sumcheck protocol to prove fast Fourier transforms and convolutions with a linear prover time that is faster than computing the result asymptotically. Several improvements and generalizations have been introduced for interactive proofs, including verifying the convolutional layer, the ReLU activation function, and max pooling. zkCNN is particularly interesting given Modulus Labs’ report on benchmarking where they found it outperformed other generalized proving systems in both proof generation speed and RAM consumption.

Zator is a project aimed at exploring the use of recursive SNARKs to verify deep neural networks. The current constraint for verifying deeper models is fitting the entire computation trace into a single circuit. Zator proposes verifying one layer at a time using recursive SNARKs, which can verify N-step repeated computation incrementally. They use Nova to reduce N instances of computation into a single instance that can be verified at the cost of a single step. With this approach, Zator was able to snark a network with 512 layers, which is as deep as most production AI models today. Proof generation and verification times for Zator are still too large for mainstream use cases, but their compositional technique is interesting nonetheless.
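The intuition behind reducing N instances to one can be sketched with a random linear combination. This is a heavily simplified illustration and not Nova itself: it assumes a purely linear relation A·w = b over a small illustrative prime field, with no commitments and no zero knowledge, whereas Nova folds quadratic (relaxed R1CS) constraints with additional error terms. All names here are hypothetical.

```python
import secrets

P = 2**61 - 1   # illustrative prime modulus, not the field used by any real system


def matvec(A, w):
    """Matrix-vector product over the field."""
    return [sum(a * x for a, x in zip(row, w)) % P for row in A]


def check(A, w, b) -> bool:
    """Check one instance of the linear relation A·w = b."""
    return matvec(A, w) == [y % P for y in b]


def fold(instances, r):
    """Combine N (w, b) instances into one using powers of a challenge r."""
    w_acc = [0] * len(instances[0][0])
    b_acc = [0] * len(instances[0][1])
    coeff = 1
    for w, b in instances:
        w_acc = [(acc + coeff * x) % P for acc, x in zip(w_acc, w)]
        b_acc = [(acc + coeff * y) % P for acc, y in zip(b_acc, b)]
        coeff = coeff * r % P
    return w_acc, b_acc


# The same step relation A is repeated for every instance.
A = [[1, 2], [3, 4]]
honest = [([1, 1], [3, 7]), ([2, 0], [2, 6]), ([0, 5], [10, 20])]

r = secrets.randbelow(P - 1) + 1          # verifier's random nonzero challenge
w_folded, b_folded = fold(honest, r)
print(check(A, w_folded, b_folded))       # one check stands in for three
```

Because the relation is linear, the folded instance satisfies it whenever every original instance does, and a single incorrect instance makes the folded check fail for any nonzero challenge.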

Applications

Given its early stage, much of the focus in zkML has been on the above infrastructure. However, there are a few projects working on applications today.

Modulus Labs is one of the most diverse projects within the zkML space, working on both example applications and relevant research. On the application side, Modulus Labs has demonstrated zkML use cases through RockyBot – an on-chain trading bot – and Leela vs. the World – a chess game where all of humanity plays against a verified, on-chain instance of the Leela chess engine. The team has also branched into research, writing The Cost of Intelligence, which benchmarked various proving systems for speed and efficiency across differing model sizes.

Worldcoin is applying zkML in its attempt to make a privacy-preserving proof of personhood protocol. Worldcoin is using custom hardware to process high-resolution iris scans which are inserted into their Semaphore implementation. This can then be used to perform useful operations like membership attestation and voting. They currently use a trusted runtime environment with a secure enclave to verify the camera-signed iris scan, but they ultimately aim to use a ZKP to attest to the correct inference of the neural network for cryptographic-level security guarantees.

Giza is a protocol that enables the deployment of AI models on-chain using a fully trustless approach. It uses a technology stack that includes the ONNX format for representing machine learning models, the Giza Transpiler for converting these models into the Cairo program format, the ONNX Cairo Runtime for executing the models in a verifiable and deterministic way, and the Giza Model smart contract for deploying and executing the models on-chain. While Giza could also fit into the Model-to-Proof compiler category, their positioning as a marketplace for ML models is one of the more interesting applications out there today.

Gensyn is a distributed hardware provisioning network used to train ML models. Specifically, they are engineering a probabilistic audit system based on gradient descent and using model checkpoints to enable a decentralized GPU network to service the training of full-scale models. While their zkML application here is highly specific to their use case – they want to ensure that when a node downloads and trains a piece of a model, it is honest about the model updates – it showcases the power of combining zk and ML.

ZKaptcha is focused on the bot problem within web3, providing captcha services for smart contracts. Their current implementation has the end user essentially producing a proof of human work by completing the captcha, which is verified by their on-chain verifier and accessible by a smart contract via a couple of lines of code. Today, they rely only on zk, but they intend to implement zkML in the future, similar to existing web2 captcha services that analyze behaviors like mouse movements to determine if a user is human.

Given how early the zkML market is, a lot of applications have been experimented with at the hackathon level. Projects include AI Coliseum, an on-chain AI competition built using EZKL that uses ZK proofs to validate machine learning outputs, Hunter z Hunter, a photo scavenger hunt using the EZKL library to validate an image classification model’s outputs with halo2 circuits, and zk Section 9, which converted an AI image generating model into a circuit for minting and verification of AI art.

The Challenges of zkML

While improvements and optimizations are being made at light-speed, some core challenges remain for the zkML space. These challenges range from technical to practical and include:

Quantization with minimal accuracy loss
Circuit size, especially when a network is composed of many layers
Efficient proofs for matrix multiplication
Adversarial attacks

Quantization is the process of representing floating-point numbers, which most ML models use to represent model parameters and activations, as fixed-point numbers, which is necessary when dealing with the field arithmetic of zk circuits. The impact of quantization on accuracy for machine learning models depends on the level of precision that is used. Generally, using lower precision (i.e., fewer bits) can result in reduced accuracy because it can introduce rounding and approximation errors. However, there are several techniques that can be used to minimize the impact of quantization on accuracy, such as fine-tuning the model after quantization and using techniques like quantization-aware training. In addition, Zero Gravity – a hackathon project at zkSummit 9 – has shown that alternative neural network architectures developed for edge devices such as the weightless neural network can be useful for avoiding the problems of quantization in-circuit.
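A minimal sketch of fixed-point quantization follows. The 8 fractional bits and the prime modulus are illustrative assumptions, not the parameters of any particular proving system; the point is to show where rounding error enters and how signed values map into a field.

```python
# Fixed-point quantization sketch: floats -> scaled integers -> field elements.
# SCALE_BITS and P are illustrative assumptions.

SCALE_BITS = 8
SCALE = 1 << SCALE_BITS     # 256 steps per unit
P = 2**61 - 1               # stand-in prime modulus


def quantize(x: float) -> int:
    """Round to the nearest multiple of 1/SCALE, stored as an integer."""
    return round(x * SCALE)


def dequantize(q: int) -> float:
    return q / SCALE


def to_field(q: int) -> int:
    """Negative integers wrap around to the top of the field."""
    return q % P


def from_field(f: int) -> int:
    """Interpret the upper half of the field as negative values."""
    return f if f <= P // 2 else f - P


def fixed_mul(a: int, b: int) -> int:
    """The raw product carries SCALE twice, so shift once to rescale.
    The right shift floors toward negative infinity, adding rounding error."""
    return (a * b) >> SCALE_BITS


w, x = -0.7314, 1.25
qw, qx = quantize(w), quantize(x)

exact = w * x
approx = dequantize(fixed_mul(qw, qx))
print(f"exact={exact:.6f} approx={approx:.6f} error={abs(exact - approx):.6f}")
print(from_field(to_field(qw)) == qw)   # signed value survives the round trip
```

Raising `SCALE_BITS` shrinks the error but widens every intermediate value, which is one way the accuracy versus circuit cost tradeoff shows up in practice.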

Outside of quantization, hardware is another key challenge. Once a machine learning model is properly represented via a circuit, it is cheap and fast to verify proofs of its inferences due to the succinctness property of zk. The challenge here is not on the verifier, but on the prover, since RAM consumption and proof generation time scale quickly as the model size grows. Certain proving systems (e.g. GKR-based systems that use sumcheck protocols and layered arithmetic circuits) or compositional techniques (e.g. wrapping Plonky2, which has fast proving times but large proof sizes for large models, with Groth16, whose proof size does not grow with the complexity of the model) are better suited for handling these issues, but managing tradeoffs is a core challenge for projects building in zkML.

On the adversarial side, there is also still work to be done. For one, if a trustless protocol or DAO elects to implement a model, there is still risk around adversarial attacks during the training phase (e.g. training a model to behave a specific way when it sees an input which could be used to manipulate later inferences). Federated learning techniques and training phase zkML could be one way of minimizing this attack surface.

Another core challenge is the risk of model-stealing attacks when a model is privacy-preserved. While a model’s weights can be obfuscated, it is theoretically possible to reverse engineer the weights given enough input-output pairs. This is mainly a risk for smaller scale models, but it is a risk nonetheless.
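For the simplest case the attack is easy to demonstrate. The sketch below assumes a hidden linear model that can only be queried as a black box; the weights and function names are hypothetical. A linear model with d inputs is fully recovered with d + 1 queries, while nonlinear networks require many more queries and approximate methods, which is why the risk is greatest for small models.

```python
# Model-extraction sketch against a black-box linear model.
# The hidden weights below are hypothetical and chosen for illustration.

def make_private_model():
    """Return a query function; the weights are hidden in the closure."""
    weights = [0.5, -1.25, 2.0]
    bias = 0.75

    def query(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias

    return query


def extract_linear_model(query, dim: int):
    """Recover weights and bias from dim + 1 input-output pairs."""
    bias = query([0.0] * dim)              # the zero input isolates the bias
    weights = []
    for i in range(dim):
        unit = [0.0] * dim
        unit[i] = 1.0                      # a unit vector isolates one weight
        weights.append(query(unit) - bias)
    return weights, bias


oracle = make_private_model()
stolen_weights, stolen_bias = extract_linear_model(oracle, dim=3)
print(stolen_weights, stolen_bias)
```

Proving that each inference came from the committed model does nothing to prevent this, since the attacker only uses legitimate, verified outputs.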

Scaling Smart Contracts

Although there have been some challenges with optimizing these models to operate within the constraints of zk, improvements are being made at exponential speed, and some anticipate we will reach parity with the broader ML space soon, assuming further hardware acceleration. To underscore the pace of these improvements, zkML has gone from 0xPARC’s zk-MNIST demonstration in 2021, which showed how a small-scale [MNIST](https://en.wikipedia.org/wiki/MNIST_database) image classification model could be run in a verifiable circuit, to Daniel Kang’s paper doing the same for ImageNet-scale models just under a year later. In April of 2022, this ImageNet-scale model was further improved from 79% accuracy to 92% accuracy, and networks as large as GPT-2 are potentially feasible in the near term, albeit with slow proving times today.

We view zkML as a rich and growing ecosystem that looks to scale the capabilities of blockchains and smart contracts to be more flexible, adaptable, and intelligent.

While zkML is still in its early stages of development, it has already begun to show promising results. As the technology evolves and matures, we can expect to see even more innovative use cases for zkML on-chain.

https://en.foresightnews.pro/zkml-evolving-the-intelligence-of-smart-contracts-through-zero-knowledge-cryptography/
