Workshop summary
On July 15, at the "Data Security and Trusted AI" academic workshop hosted by the Tsinghua University Institute for Intelligent Industry (AIR), MoQi Technology CEO and co-founder Tai Cheng delivered a presentation titled "High-Performance, Privacy-Preserving Biometric Technology." Tai explained how fingerprint recognition can be reframed as a high-precision image search problem and reported an industry first: automated, high-precision matching within seconds against a database of two billion entries. He also described the properties that privacy-preserving biometric features should have and approaches to implementing them.
Urgency of overcoming large-database degradation for high-performance biometrics
As fingerprint-based biometric systems are increasingly deployed across scenarios, ensuring both recognition performance and user privacy has become an urgent technical challenge.
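Biometric tasks generally fall into two types: verification, also known as 1:1 matching, and identification, also known as 1:N matching. Because an identification query is compared against all N enrolled entries, the chance of a false match grows roughly in proportion to N, so the 1:N problem is roughly N times harder than 1:1. It becomes especially difficult as database size approaches the two-billion level.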
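To make the dependence on N concrete, the following minimal sketch assumes that each of the N comparisons is an independent trial with the same 1:1 false match rate p. This is a simplification, since real comparison scores are correlated. Under that assumption the chance of at least one false match is 1 - (1 - p)^N, which stays close to N·p while N·p is small. The function name and the example rate of 1e-9 are illustrative and do not come from the talk.

```python
import math

def identification_fmr(fmr_1to1: float, n: int) -> float:
    """Chance of at least one false match across n independent comparisons.

    Assumes every comparison has the same 1:1 false match rate and that
    comparisons are independent (an idealization).
    """
    # 1 - (1 - p)^n, computed with log1p/expm1 so tiny p does not round to 0.
    return -math.expm1(n * math.log1p(-fmr_1to1))

# Hypothetical 1:1 false match rate, chosen only for illustration.
p = 1e-9
for n in (1, 10**6, 2 * 10**9):
    print(f"N = {n:>13,}: false match probability = {identification_fmr(p, n):.6g}")
```

Under these assumptions, a matcher that is almost never wrong in 1:1 mode would still return a false match for most queries at the two-billion scale, which is why identification at that size needs a far lower per-comparison error rate.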
Beyond the algorithmic challenges of billion-scale fingerprint identification, traditional fingerprint recognition pipelines have four main issues:
- It is difficult to automatically process low-quality fingerprint images; systems still rely on fingerprint experts to manually annotate detailed features. This creates a high technical barrier and low efficiency.
- Traditional methods are based on local minutiae features, which only capture a small portion of fingerprint information. Curvature and geometric information are often lost. As database size increases, accuracy degrades rapidly, a phenomenon known as "large-database degradation."
- Deep learning approaches require large amounts of training data, but such data are not easy to obtain.
- Matching against large databases takes too long.
Converting fingerprint matching into high-precision image search
MoQi Technology has explored new technical directions and identified a viable approach: convert fingerprint matching into a high-precision image search problem. The company developed a high-precision image indexing engine that comprises three components: adaptive multi-scale image representation and indexing, a self-learning framework requiring little or no annotation, and a high-speed heterogeneous search system.

Figure: multi-scale features
Key elements of the approach include:
- First, construct an optimal multi-scale representation for fingerprint images using a more effective mathematical framework. This makes high-precision, high-performance image search possible. At every intermediate scale from pixels to the whole image, different types of features are extracted, such as labels, vectors, and graphs, substantially expanding the captured fingerprint information.
- Second, employ a self-learning AI framework that requires only very few samples, reducing annotation needs by a factor of several thousand to ten thousand.
- Third, use a high-performance heterogeneous architecture to improve both accuracy and speed. A multi-layer distributed system optimized for visual search handles the multi-scale features. High-volume image comparisons that do not require the highest precision can be assigned to GPUs, while images requiring higher precision can be handled by CPUs, enabling fast fingerprint image matching.
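The division of labor between a cheap high-volume stage and a precise low-volume stage can be illustrated with a generic coarse-to-fine search. This is a minimal single-machine sketch and not MoQi's engine. The vector features, cosine scoring, truncated "coarse" vectors, and all names and parameters are assumptions made for illustration, and both stages run on the CPU here.

```python
import heapq
import math
import random

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def coarse_to_fine_search(probe, gallery, coarse_dims=8, shortlist=50, top_k=3):
    """Return the top_k (score, entry_id) pairs for probe.

    gallery maps entry_id -> feature vector (hypothetical layout).
    """
    # Stage 1: cheap score on a truncated vector for every entry.
    coarse = ((cosine(probe[:coarse_dims], vec[:coarse_dims]), entry_id)
              for entry_id, vec in gallery.items())
    candidates = heapq.nlargest(shortlist, coarse)
    # Stage 2: full-precision score, but only for the shortlist.
    fine = ((cosine(probe, gallery[entry_id]), entry_id)
            for _, entry_id in candidates)
    return heapq.nlargest(top_k, fine)

# Synthetic data: 1000 random entries, probe is a noisy copy of entry 42.
rng = random.Random(0)
gallery = {i: [rng.gauss(0, 1) for _ in range(32)] for i in range(1000)}
probe = [x + rng.gauss(0, 0.05) for x in gallery[42]]
results = coarse_to_fine_search(probe, gallery)
print("best match:", results[0][1])
```

The first stage touches every entry but does little work per entry, which is the kind of workload the article assigns to GPUs. The second stage does more work per entry on a small shortlist.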

Figure: manual annotation of detailed features (top); annotation-free matching that automatically finds similar regions in fingerprint images (bottom)

With these ideas and the underlying technical innovations, MoQi Technology's next-generation fingerprint recognition system changes the traditional workflow, achieving automated, high-precision matching within seconds at the two-billion scale. The same techniques have been applied beyond fingerprints to other images such as palm prints and may be extended to broader image search applications.
Privacy as a key challenge in biometric development
Biometric systems bring convenience, but raise growing concerns about data and privacy protection. Privacy issues include not only database breaches but also various external attacks such as spoofing, device substitution, replay attacks, and brute-force attacks. Biometric recognition therefore requires more effective privacy protection mechanisms.
A truly privacy-preserving biometric system should satisfy three properties:
- Irreversibility. Given a stored matching feature, it should be computationally difficult to recover the original template. This prevents misuse of stored biometric data for spoofing or replay attacks and improves system security.
- Revocability. If a template is compromised or deemed insecure, it should be safely revoked and replaced with a new template, similar to revoking and reissuing passwords.
- Non-linkability. It should be computationally difficult to determine whether one or more transformed templates originate from the same original feature. In other words, a user's biometric identities across different applications should not be linkable.
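As a rough illustration of revocability and non-linkability, the sketch below protects a feature vector with a key-seeded random projection and keeps only the sign of each projection. It is a toy under stated assumptions, not a secure construction. The feature vector is synthetic, `random.Random` is not a cryptographic generator, and the function names and key strings are hypothetical.

```python
import random

def protected_template(features, app_key, n_bits=128):
    """Project features onto key-seeded random directions and keep the signs."""
    rng = random.Random(app_key)  # illustration only, not cryptographically secure
    bits = []
    for _ in range(n_bits):
        direction = [rng.gauss(0, 1) for _ in features]
        projection = sum(d * f for d, f in zip(direction, features))
        bits.append(1 if projection >= 0 else 0)
    return bits

def hamming_fraction(a, b):
    """Fraction of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Synthetic enrolled feature vector and a slightly noisy fresh sample.
rng = random.Random(1)
enrolled = [rng.gauss(0, 1) for _ in range(32)]
fresh = [x + rng.gauss(0, 0.05) for x in enrolled]

bank = protected_template(enrolled, "bank-key-v1")
bank_fresh = protected_template(fresh, "bank-key-v1")
clinic = protected_template(enrolled, "clinic-key-v1")

same_key_distance = hamming_fraction(bank, bank_fresh)
cross_key_distance = hamming_fraction(bank, clinic)
print("same key, fresh sample:", same_key_distance)
print("same finger, different key:", cross_key_distance)
```

With the same key, a fresh sample lands close to the enrolled template. With a different key, the same features produce a template that looks unrelated, so revoking amounts to issuing a new key, and templates held by different applications are hard to link. Keeping only the signs discards magnitude information, but that alone does not establish irreversibility.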
Only solutions meeting these three criteria can be considered cancelable biometrics. Three representative schemes are Biohashing, Fuzzy Commitment, and Fuzzy Vault.
- Biohashing. This approach relies on both the original biometric template and an external key for verification, which can improve accuracy. Its drawback is that a remembered key is required; if the key is exposed, the original biometric data may become insecure.
- Fuzzy Commitment. This scheme uses error-correcting codes. Its advantage is that users do not need to remember a key, but revocability and non-linkability may be compromised depending on the error-correcting code construction.
- Fuzzy Vault. This method assumes the original biometric template contains many feature points and partitions the space into a grid, quantizing feature points into grid cells. A secret is encoded as a polynomial evaluated at the genuine points, which are hidden among random chaff points. If a query supplies enough correct points, the polynomial can be recovered using generalized Reed-Solomon decoding; otherwise, reconstructing the polynomial from a set dominated by erroneous points is considered computationally intractable. The difficulty of polynomial reconstruction provides security, but comparison speed is slow.
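The Fuzzy Commitment idea can be sketched with a toy error-correcting code. The example below assumes the biometric has already been reduced to a fixed-length bit string and uses a 5x repetition code with majority-vote decoding. Practical schemes use much stronger codes, and all names here are illustrative. Enrollment stores only the XOR of the biometric bits with a random codeword, plus a hash of the secret.

```python
import hashlib
import secrets

REP = 5  # toy code: each secret bit is repeated REP times

def encode(bits):
    """Repetition-encode a list of bits."""
    return [b for b in bits for _ in range(REP)]

def decode(codeword):
    """Majority vote over each block of REP bits."""
    return [int(sum(codeword[i:i + REP]) > REP // 2)
            for i in range(0, len(codeword), REP)]

def digest(bits):
    return hashlib.sha256(bytes(bits)).hexdigest()

def commit(biometric_bits):
    """Return (helper data, hash of secret). Length must be a multiple of REP."""
    secret = [secrets.randbelow(2) for _ in range(len(biometric_bits) // REP)]
    helper = [b ^ c for b, c in zip(biometric_bits, encode(secret))]
    return helper, digest(secret)

def verify(sample_bits, helper, stored_digest):
    """XOR the sample with the helper, correct errors, compare hashes."""
    noisy_codeword = [s ^ h for s, h in zip(sample_bits, helper)]
    return digest(decode(noisy_codeword)) == stored_digest

# Synthetic 40-bit "biometric" for illustration.
enrolled = [secrets.randbelow(2) for _ in range(40)]
helper, stored = commit(enrolled)

genuine = list(enrolled)
for i in (0, 7, 13):           # one flipped bit in each of three blocks
    genuine[i] ^= 1
impostor = [b ^ 1 for b in enrolled]  # every bit differs

print("genuine accepted:", verify(genuine, helper, stored))
print("impostor accepted:", verify(impostor, helper, stored))
```

The user never has to remember the secret, because a sufficiently close sample regenerates it. The choice of code fixes how many bit errors are tolerated, and the stored helper data is where the revocability and non-linkability concerns mentioned above arise.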

Figure: Biohashing, Fuzzy Commitment, and Fuzzy Vault
In summary, these three schemes have trade-offs; they cannot simultaneously maximize accuracy, security, and matching speed. Developing privacy-preserving biometric solutions is therefore challenging and remains an active area of research.
Ongoing research
Although privacy-preserving biometric techniques are not yet widely adopted, they hold significant potential. MoQi Technology continues research into privacy-preserving biometric authentication and high-performance biometric search, with the aim of advancing technical capabilities in the field.