## INTRODUCTION

In a classical random walk, a particle or “walker” moves stochastically around a discrete space. Quantum walks are the quantum analog, where the walker is also governed by quantum effects such as superposition, quantum interference, and entanglement. They have attracted broad interest because of a growing range of applications in quantum information processing (*1*–*4*) and quantum simulations (*5*, *6*). Their markedly distinct properties provide a quantum speedup in solving problems, such as database search (*7*) and graph isomorphism (GI) (*8*, *9*), that more broadly can be applied to pattern recognition and computer vision (*10*), network analysis and navigation (*11*), and website traffic optimization (*12*) and could find application in the use and analysis of large but imperfect graph states in measurement-based quantum computing (*13*).

Further application of quantum walks can be explored using multiple quantum walkers (*14*). This is because the quantum state space increases exponentially with the number of walkers, and scenarios are emerging where the underlying particle properties are strongly influential. How distinguishable quantum particles are determines the amount of nonclassical multiparticle interference and leads to rich and complex interference phenomena (*15*). Specifying particle exchange statistics can facilitate generation of high-dimensional entangled states via multifermion quantum walks (*16*), enables study of braiding statistics with anyonic walks (*17*), and can dictate whether sampling from multiparticle quantum interference patterns is efficient (fermions) or intractable (bosons) to reproduce on a classical computer (*18*).

A range of physical systems (*19*–*21*) including photonics (*22*–*24*) have been used to implement analog simulations of quantum walks as well as digital simulations with quantum logic (*19*, *25*, *26*). By using arrays of evanescently coupled integrated waveguides, quantum walks of up to five photons (*27*) have been demonstrated, and using the inherent stability of integrated optics, linear graphs of hundreds of vertices could be realized (*28*). However, these planar devices are each limited to graphs defined by the layout of the passive underlying photonic circuit, requiring additional modified circuits to observe different variations in quantum walk parameters, such as time evolution (*29*). Increasing capabilities of integrated optics for implementing unitary transformation are demonstrated over the recent years (*24*, *26*, *30*, *31*), including a silica nonreconfigurable multipath unitary that is fixed at the point of fabrication (*30*), reconfigurability over six paths in silica (*31*), programmable optical circuits in silicon (*24*), and multiple reconfigurable two-mode operations combined in silicon, together with pre-entanglement to implement reconfigurable two-qubit processes (*26*).

Here, we present a fully programmable silicon photonic device capable of simulating quantum walk dynamics of correlated particles with full control over all important parameters including Hamiltonian structure, evolution time, particle distinguishability, and exchange symmetry. Specifically, we realize the device by implementing two simultaneous reconfigurable five-mode operations acting on the two parties of a bipartite photonic entangled state: Two on-chip photon pair sources (*32*) generate spatially entangled photons that are manipulated to continuously tune the distinguishability and exchange symmetry of two simulated particles (*33*, *34*); they then undergo continuous-time quantum walks (CTQWs) on any graph of up to five vertices for arbitrary time evolution by using five-mode universal optical circuits (*35*, *31*). By equating the state space of multiparticle quantum walks on one graph to a single particle undergoing a quantum walk evolution on another exponentially larger graph (*14*, *36*), we can program particle properties to access geometries of larger single-particle Hamiltonians that are otherwise beyond the capabilities of a fixed-size universal linear optics circuit acting on single photons or coherent light. With the full programmability of the device and the inherent subwavelength stability of its monolithic integration, we are able to experimentally implement quantum walk–based algorithms of hundreds of graph configurations on a single device—we search for marked vertices in graphs and distinguish nonisomorphic graph pairs. These demonstrations physically explore the suitability of simulating quantum walk phenomena and applications in silicon photonics, as the scale and capability of the technology grow.

## RESULTS

### Silicon photonic chip for quantum walks with tunable particle properties

Particle exchange symmetry and indistinguishability are critical to the dynamics of multiparticle quantum walks (*33*, *34*)—for example, the antisymmetry associated to fermions leads to the Pauli exclusion principle, while the symmetry associated to bosons enables bunching. Exchange symmetry refers to a wave function acquiring a phase ϕ after interchanging two indistinguishable particles. This effect is characterized by the creation operator

and annihilation operator *a _{j}* of mode

*j*satisfying

and

${a}_{i}{a}_{j}^{\u2020}={e}^{i\mathrm{\varphi}}{a}_{j}^{\u2020}{a}_{i}+{\mathrm{\delta}}_{\text{ij}}$, where ϕ is 0 for bosons and π for fermions. Indistinguishability, which we denote as γ (0 ≤ γ ≤ 1), decides the strength of multiparticle quantum interference in the quantum walk evolution. For example, a pair of partially distinguishable photons in the polarization degree of freedom can be represented as

$\mid {H}_{1}\u27e9(\mathrm{\gamma}{H}_{2}+\sqrt{1-{\mathrm{\gamma}}^{2}}\mid {V}_{2}\u27e9)$, where *H* and *V* represent the horizontal and vertical polarizations in modes 1 and 2, respectively.

We extend the protocols of (*33*, *34*) to simulate the CTQW evolutions of multiple particles with tunable ϕ by adding control over both γ and ϕ. The protocol works by sending each particle of an entangled pair of photons through identical copies of the target CTQW evolution and then measuring the corresponding correlated detection probabilities (Fig. 1A)—controlling parameters of the entangled state tunes the properties of the simulated particles. For two particles parameterized by ϕ and γ evolving in a unitary CTQW evolution *U*, the probability to measure two particles occupying vertices *r* and *q* is

(1)with the inputs *j* and *k* of a given graph. Now, consider two photons in an entangled state

, with α, β ∈ ℝ and α^{2} + β^{2} = 1, and pass each photon into a copy of *U*, denoted *U ^{a}* and

*U*, respectively. The correlated detection probability

^{b} of measuring a photon at output *r* of *U ^{a}* and

*q*of

*U*is

^{b}(2)

Comparing

${\mathrm{\Gamma}}_{r,q}^{\mathrm{\varphi},\mathrm{\gamma}}$and

${P}_{r,q}^{\mathrm{\varphi}}$, we obtain

${\mathrm{\Gamma}}_{r,q}^{\mathrm{\varphi},\mathrm{\gamma}}$by measuring

${P}_{r,q}^{\mathrm{\varphi}}$as below (see section S1)

$${\mathrm{\Gamma}}_{r,q}^{\mathrm{\varphi},\mathrm{\gamma}}=\mathrm{\mu}{P}_{r,q}^{\mathrm{\varphi}}$$(3)where

$\mathrm{\mu}=\frac{1}{2}(1+\mathrm{\gamma}+2\mathrm{\gamma}\sqrt{1-{\mathrm{\gamma}}^{2}})$, by creating the two-photon entangled state as

$$\mid \mathrm{\Psi}(\mathrm{\varphi},\mathrm{\gamma})\u27e9=\frac{1}{\sqrt{2}}(\frac{\mathrm{\gamma}+\sqrt{1-{\mathrm{\gamma}}^{2}}}{\sqrt{\mathrm{\mu}}}{a}_{j}^{\u2020}{b}_{k}^{\u2020}+{e}^{i\mathrm{\varphi}}\frac{\mathrm{\gamma}}{\sqrt{\mathrm{\mu}}}{a}_{k}^{\u2020}{b}_{j}^{\u2020})\mid 0\u27e9$$(4)

To generate the entangled state (Eq. 4) and apply the unitary evolutions, our device consists of two main parts: entangled photon-pair generation and universal linear optical transformation (Fig. 1, B and C). The 11 × 3 – mm^{2}–sized chip comprises mainly 2 spontaneous four wave mixing (SFWM) photon sources, 22 simultaneously running thermo-optic phase shifters, 32 multimode interferometer beam splitters, and 16 optical grating couplers (not shown). The two SFWM sources used to create possible (signal-idler) photon pairs are pumped with a laser that is launched into the chip and split across the two sources. The spatially bunched photon pairs are coherently generated in any of the two sources. By postselecting the cases that the signal photons exit at the top of the device (as orientated in Fig. 1C) and the idler photons exit at the bottom, respectively, the required path-entangled state ∣Ψ(ϕ, γ)⟩ is created with success probability of one-quarter. The two five-mode universal linear optical networks are designed to implement unitary transformation of up to five dimensions with fixed inputs and can be reconfigured at ≈kHz rates with thermal phase shifts (*37*).

### Harnessing particle properties to simulate CTQWs on complex graphs

The CTQW evolution on a graph *G* is governed by the adjacency matrix *A* of *G*, whose elements are *A _{jk}* = 1, if vertices

*j*and

*k*are connected by an unweighted edge (

*A*=

_{jk}*w*for an edge of weight

*w*), and

*A*= 0 otherwise. The unitary time evolution at time

_{jk}*t*can be derived as

(5)where ∣ψ_{ini}⟩ and ∣ψ(*t*)⟩ are the initial and evolved states in the orthonormal basis ∣1⟩, ∣2⟩, ⋯, ∣*N*⟩ corresponding to the vertices of *G*. With our device, we first experimentally simulated the CTQW dynamics of two particles with continuously tunable particle exchange symmetry and indistinguishability. For this, the two Reck *et al.*–style networks are configured to implement the same CTQW evolution of a specific time, and the particle property parameters ϕ and γ are controlled by creating the corresponding two-photon entangled state ∣Ψ(ϕ, γ)⟩.

As shown in Fig. 2A, we implemented CTQW evolution on a four-vertex circle graph and obtained the two-particle interference statistics for different cases of exchange symmetry and indistinguishability. The results show substantial changes of the particle correlations in the same unitary evolution depending on the particle properties. With particle exchange symmetry changing, the two-particle interference statistics ranges from corresponding to the Bose-Einstein statistics (ϕ = 0), to the intermediate fractional statistics (

$\mathrm{\varphi}=\frac{\mathrm{\pi}}{4},\frac{\mathrm{\pi}}{2},\frac{3\mathrm{\pi}}{4}$), and to Fermi-Dirac statistics (ϕ = π). With indistinguishability increasing from 0 to 1, the particle correlations are gradually changing from being governed by classical effects (γ = 0) to the case that quantum interference becomes dominant (

$\mathrm{\gamma}=\frac{1}{2},\frac{\sqrt{2}}{2},\frac{\sqrt{3}}{2}$), and lastly, quantum interference completely determines (γ = 1). For example, diagonal terms are still reduced for the fermionic case when γ < 1, but the Pauli exclusion principle does not hold anymore. The dynamics of CTQW evolution of correlated particles can also be easily simulated on our device by reconfiguring the on-chip networks to implement different time evolutions (see section S4).

Multiparticle quantum walks can simulate a single-particle walk on an exponentially increasing sized graph, whose size and geometry depend on the particle properties. A multiparticle quantum walk of *P* fully distinguishable particles on a graph of *N* vertices can simulate a single-particle walk on the Cartesian product graph of *N ^{P}* vertices, while

*P*fully indistinguishable particles can simulate a single-particle walk on a graph of

vertices (for bosons) or

$\left(\begin{array}{c}N\\ P\end{array}\right)$ vertices (for fermions). Given the adjacency matrix of a graph *G*, the simulated larger Hamiltonians corresponding to multiparticle CTQWs on *G* can be obtained explicitly, according to the specific particle indistinguishability and exchange symmetry (see section S2). For example, as shown in Fig. 2 (B and C), two indistinguishable fermion CTQWs on a 4-vertex star graph simulate a single-particle walk on a 6-vertex circle graph—such quantum walks were previously implemented physically using a passive three-dimensional direct-write waveguide chip (*38*), and two indistinguishable boson CTQWs on the same 4-vertex star graph simulate a single-particle walk on a 10-vertex graph (see additional results in section S5).

Particle properties are important factors affecting the efficiency of classical simulation of multiparticle quantum interference, e.g., simulation of multiparticle interference with bosonic symmetry is classically inefficient (*18*), but with fermionic symmetry, it is efficient, and partial indistinguishability reduces the hardness of boson sampling task (*39*). This implies that some simulated exponentially sized quantum walk evolution in our scheme can be simulated classically efficiently, depending on the specific properties of particles undergoing physical quantum interference, while most would include both classically tractable and intractable aspects, giving rise to complicated analysis of complexities. On the other hand, controllable particle properties enable experiments regarding fundamental interest: Tunable distinguishability enables investigation of the nonmonotonicity of the quantum-to-classical transition (*40*) and the full spectrum of multiphoton nonclassical interference (*41*); the ability to tune continuously between bosonic and fermionic quantum interference with a single device may be essential for verification of boson sampling (*34*, *42*).

On our device, CTQW evolution operators are calculated classically and then configured to the optical circuits; this nevertheless does not eliminate the potential quantum computational advantage. When the underpinning structures and even processes of quantum walks can be classically described and simulated, there are known examples that indicate that the properties of the resulting output quantum state are difficult to reproduce classically—in (*25*), it was shown that for quantum walks on circulant graphs, it is likely to be inefficient on a classical computer to sample from the output distributions, while research into boson sampling (*18*) is revealing that the probability distributions associated with the output states of multiboson quantum interference in large unitary processes are also inefficient to sample from with classical resources, even when the underpinning unitary process acting on the modes is completely understood. While in our scheme, for example, a sample from the output states of *P*-boson quantum walk can be obtained via *P* photons entering *P* copies of *N*-dimensional optical circuits (*34*), with polynomial complexity [i.e., *O*(*N*^{3} + *PN*^{2})] in total for calculating the underpinning unitary externally and configuring the circuits, which, by contrast, is classically intractable as the size of experiment increases.

### Implementing quantum walk–based search

The quantum walk model provides a framework for solving the search problem, able to achieve a quadratic speedup over classical algorithms in the same way as Grover’s algorithm (*7*, *43*). Quantum walk–based search is attractive because it is expected to have useful properties with regard to their robustness to noise and ease of physical implementation (*44*). To find one marked element *a _{w}* in an unsorted database

*a*

_{1},

*a*

_{2}, ⋯,

*a*, the quantum walk–based algorithm constructs a search Hamiltonian from the adjacency matrix of the graph

_{N}*G*of

*N*vertices and performs a continuous-time evolution. The element

*a*can be found by measuring the position of the evolved quantum walker on the graph (

_{w}*43*).

Specifically, the search Hamiltonian for finding the vertex *w* in the graph *G* is defined as

(6)where − ∣*w*⟩⟨*w*∣ is the “oracle Hamiltonian” to identify the marked vertex, *A* represents the adjacency matrix of *G*, and κ represents the jumping rate between adjacent vertices. Starting with an equal superposition state over all vertices as

, the search evolves with the unitary operator of *e*^{−iHsht}. The probability of finding the vertex *w* at time *t* is obtained as

(7)

It has been shown that *P _{w}* can achieve

*O*(1) with an evolution time of

, for searching on complete graphs, hypercubes (*7*), and even almost all graphs (*43*).

The optimality of the quantum walk–based search holds even for more general cases. Suppose that finding *n _{w}* marked vertices of

*G*(denoted as

*w*

_{1},

*w*

_{2}, ⋯,

*w*) and starting with an initial superposition state over

_{nw}*n*chosen vertices (denoted as

_{r}*r*

_{1},

*r*

_{2}, ⋯,

*r*), i.e.,

_{nr}, we can define the search Hamiltonian to be

$${H}_{\text{sh}}=-{\displaystyle \sum _{i=1}^{{n}_{r}}}\mid {r}_{i}\u3009\u3008{r}_{i}\mid -{\displaystyle \sum _{j=1}^{{n}_{w}}}\mid {w}_{j}\u3009\u3008{w}_{j}\mid -\mathrm{\kappa}A$$(8)

The probability of finding one marked vertex is then obtained by

$${P}_{w}=\sum _{i=1}^{{n}_{w}}{\mid \mathrm{\u3008}{w}_{i}\mid {e}^{-{\mathit{iH}}_{\text{sh}}t}\mid {\mathrm{\psi}}_{\text{ini}}\mathrm{\u3009}\mid}^{2}$$(9)when *n _{r}* =

*n*≪

_{w}*N*,

*P*can also achieve

_{w}*O*(1) with an evolution time

, holding quadratic speedup over classical algorithms (see section S6).

Full programmability of our device enables implementation of quantum walk–based searches on a diverse variety of graph structures and in various scenarios with different initial states and targets. By first using a single-photon CTQW, we realized the search over a five-vertex complete graph (Fig. 3A) to find one marked vertex, starting with a uniformly superposition state (Fig. 3D) or with a single-vertex state (Fig. 3E) and to find two marked vertices with an initial superposition state over two vertices (Fig. 3F). With the capability of simulating multiparticle CTQWs with controllable particle properties, our chip allows realization of the quantum walk–based search over graphs larger and with higher connectivity than a single-photon CTQW does. By using a two-fermion CTQW on a four-vertex circle graph, we realized the search over a six-vertex graph (Fig. 3B) to find two marked vertices, starting with a single-vertex state (Fig. 3G). We also realized the search over a 15-vertex graph (Fig. 3C) to find two marked vertices via two-boson CTQW on a five-vertex complete graph, starting with a single-vertex state (Fig. 3H). We show the experimentally obtained probability of finding the marked vertices evolving in time for each case in Fig. 3, with the classical fidelities between experimental and theoretical results (see additional results in section S6). Note that the targets are previously known in these search implementations; however, this does not imply that the algorithmic dynamics of quantum walk search can also be classically tractable—example exceptions include searching on circulant graphs (*25*) and searching via multiboson CTQWs (*18*). Compared to the implementations of search in pre-engineered systems (*28*), our programmable device makes it possible to investigate rich behaviors in quantum walk search on graphs with dynamical structure (*45*) and target and with the presence of noise and disorder (*44*).

### Implementing quantum walk–based GI algorithm

The GI problem is to determine whether two graphs can be made identical by relabeling their vertices, and it plays an important role in the fields of pattern recognition and computer vision. For graphs of polynomial size *M*, there are, in total, *M*! possible permutations over *M* vertices, which makes it hard to solve GI for polynomial-sized graphs using a brute force approach. GI is thought to be possibly nondeterministic polynomial (NP)-intermediate (neither NP-complete nor polynomial time) problem, and currently, the best proven general algorithm is in quasi-polynomial time (*46*). To explore the potentials of quantum computation in solving GI, a number of algorithms have been proposed on the basis of various quantum models (*47*–*49*), and most of them use quantum walk evolutions to calculate “graph certificates” to distinguish nonisomorphic graphs (*8*, *9*, *50*–*52*). The quantum walk (QW)-based GI algorithms are mainly different in the aspects of the used model (i.e., discrete or continuous time), the particles involved, the presence of particle interactions and localized inhomogeneities, and the way of constructing graph certificates (see section S7). Although none of the proposed QW-based GI algorithms have been proven analytically to be able to distinguish all graphs in polynomial time, their capability of distinguishing nonisomorphic graphs has been tested by abundant numerical simulations on wide classes of graphs including the strongly regular graphs (SRGs) that are difficult to distinguish.

Here, we present a single-particle CTQW-based algorithm for tackling the GI problem, aiming to distinguish two nonisomorphic polynomial-sized graphs in polynomial time. The algorithm is specifically tailored for the implementation in linear optics using entangled photons together with reconfigurable optical circuits. It obtains its distinguishing power by circularly adding phases to the edges of vertices of the graph, which adds local inhomogeneities into the graphs to break the symmetry with respect to the walk evolution that exists between two similar graphs. Specifically, we construct a hierarchical graph certificate *C* for each of the two graphs and distinguish graphs by comparing two certificates *C _{G}* and

*C*: If

_{H}*C*≠

_{G}*C*, the algorithm determines that the two graphs are nonisomorphic; if

_{H}*C*=

_{G}*C*and an isomorphism between graphs is found by running an extension to the algorithm in polynomial time, the two graphs are determined to be isomorphic; otherwise, the algorithm cannot distinguish the graphs down to their isomorphism classes (Fig. 4A).

_{H}*C*=

*C*

^{(s)}(

*s*= 0, ⋯,4) consists of the sorted lists defined as

(10)where *i* ∈ 1,2, ⋯, *N* represents the *N* vertices of the graph, and *A*^{(s)} is the CTQW Hamiltonian that is obtained by adding phases to the elements of the adjacency matrix *A*: *A*^{(0)} is equal to the original *A*; *A*^{(1)} is obtained by iteratively adding phases to each edge of a single vertex; and *A*^{(2)}, *A*^{(3)}, and *A*^{(4)} are obtained by iteratively adding phases to two, four, and six edges from two distinct vertices, respectively. Numerically, we have tested the algorithm for more than 226 million pairs of nonisomorphic graphs, including 46,119 pairs of SRGs with same parameters, and all nonisomorphic graphs were successfully distinguished by using up to *C*^{(4)} certificate (Fig. 4B). The numerical results indicate that the distinguishing ability is increasing with the level of certificates, i.e., phase additions to more edges. However, it remains to be proven whether the algorithm using up to *C*^{(4)} certificate or even some higher level can distinguish all graphs, and further analytical investigations are required. There could be a fraction of graphs on which the algorithm fails to distinguish—a possible direction for further investigation is to explore whether the unique certificates constructed were sufficient for the number of nonisomorphic SRGs of Latin square group with particular parameters (*52*) that have not been tested here. Thus, our algorithm does not hold the overall theoretical superiority over the best proven GI algorithm (*46*); it can nevertheless find potentials in applications such as discriminating graphs and measuring graph similarity, due to its distinguishing ability for broad classes of graphs.

On a classical computer, a CTQW on an *N*-vertex graph can be simulated via approximating the exponential of the adjacency matrix, which has the complexity of *O*(*N*^{3}) for a general, unstructured matrix (*53*); and thus, our algorithm has an overall classical complexity of up to *O*(*N*^{11}) for constructing *C*^{(4)}—which requires *N*^{8} times of CTQW evolutions (see section S7). Although our algorithm can be implemented on a classical computer for polynomial-sized graphs, potential quantum computational advantage can be achieved when implementing the algorithm using quantum systems for particular kinds of graphs. Using a digital quantum computer, CTQWs on families of graphs [e.g., circulant graphs (*25*)] can be implemented efficiently, i.e., via *O*[poly (log *N*)]-sized quantum circuits; the proposed algorithm could accordingly have a much reduced complexity, e.g., *O*[*N*^{8}poly (log *N*)] for constructing *C*^{(4)}. Using our device, a sample of CTQW on the graph of

vertices can be obtained by a *P*-photon CTQW on an *N*-vertex graph with *O*(*N*^{3} + *PN*^{2}) complexity, which suggests that a sample of polynomial-sized CTQW could be obtained with logarithmic complexity via multiphoton walk. Therefore, for the polynomial-sized graphs that are constructed via multiphoton walks, the proposed GI algorithm can be implemented using our device, achieving a potential quantum speedup compared to the complexity cost of using a classical computer. In addition, here we use phase addition to edges as the way to introduce local inhomogeneities, and there could also be other ways that induce possibly different complexities.

With our device, we experimentally demonstrated the proposed CTQW-based algorithm for distinguishing graphs. In total, 763 pairs of graphs were tested, including 85 pairs of isomorphic graphs and 678 pairs of nonisomorphic graphs. By using two-particle CTQW evolutions with fermionic or bosonic symmetry, we are able to test graphs of up to 15 vertices. In the experiments, we constructed the graph certificates *C _{G}*,

*C*of the given graphs and distinguished them by investigating the classical fidelity between the two experimentally obtained certificates defined as

_{H}. Isomorphic graphs have ideally unity fidelity, while nonisomorphic graphs can only achieve a theoretical fidelity less than 1. As shown in Fig. 4C, the experimental and theoretical results achieve high agreement. Furthermore, the fidelity between the proposed graph certificates can be further used for applications of practical interest, such as measuring the similarity between two graphs. Consider, for example, an unweighted five-vertex graph and the same graph but with only one weighted edge, then the obtained fidelities show clear distinctness changing with the weight (Fig. 4D), suggesting a possible way for ranking similar graphs.

**Acknowledgments: **We are grateful to J. Adcock, A. Montanaro, Y. Liu, S. Chakraborty, and L. Novo for the useful discussion. We acknowledge support from the China Greatwall Technology. **Funding:** This work is supported by the National Natural Science Foundation of China (11804389 and 61632021) and the BAQIS Research Program (Y18G16). X.Q. acknowledges support from the National Young Talents Program, the Youth Talent Lifting Project (18-JCJQ-QT-023), and the Open Fund of the State Key Laboratory of Optoelectronic Materials and Technologies SYSU (OMET-2018-KF-05). P.X. acknowledges support from the Open Fund (OEMT-2018-KF-04) from the State Key Laboratory of Optoelectronic Materials and Technologies SYSU. J.C.F.M. acknowledges support from the Centre for Nanoscience and Quantum Information, a Leverhulme Trust early career fellowship, and a European Research Council starting grant (ERC-2018-STG 803665). X.C. acknowledges support from the Key R&D Program of Guangdong Province (2018B030329001) and the Local Innovative and Research Teams Project of Guangdong Pearl River Talents Program (2017BT01X121). **Author contributions:** X.Q., X.C., X.Y., and J.W. conceived and designed the project. X.Q., J.D.A.M., and J.C.F.M. developed the theoretical scheme. X.Q., R.G., L.C., and X.C. designed and fabricated the device. X.Q., Y.W., S.X., and Y.L. built the experimental setup. X.Q., Y.W., S.X., Y.L., A.H., X.F., P.X., T.Y., F.X., and M.D. carried out the experiments and analyzed the data. X.Q., Y.W., S.X., and J.B.W. carried out the analysis of the algorithms. X.Q., X.C., and J.W. managed the project. All authors discussed the results and contributed to writing the manuscript. **Competing interests:** The authors declare that they have no competing interests. **Data and materials availability:** All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.