07 January 2021
###
The cost of universality

The authors of this work provides a comparative study of the overhead of state distillation and code-switching, and enables a deeper understanding of these two approaches to fault-tolerant universal quantum computation (FTQC). FTQC requires a gate set that involves at least one non-Clifford gate, which is typically generated through such protocols. State distillation relies on first producing many noisy encoded copies of a T state (magic state) and then processing them using Clifford operations to output a high fidelity encoded version of the state. The high-fidelity T state can be then used to implement the non-Clifford T gate. State generation and distillation is still a resource intensive process, despite significant recent improvements. A compelling alternative is ‘code-switching’, for example via gauge fixing to a 3D topological code. However, moving to 3D architectures is experimentally difficult, even if it significantly reduces the overhead compared to state distillation.

This work estimates the resources needed to prepare high-fidelity T states encoded in the 2D color code. This is accomplished via either state distillation or code-switching assuming that both approaches are implemented using noisy quantum-local operations in 3D. In particular, these two approaches were compared by simulating noisy circuits built from single-qubit state preparations, unitaries and measurements, and two-qubit unitaries between nearby qubits.

It is reported that the circuit noise threshold achieved for the code-switching implementation in this work is the highest presented to date and is equivalent to the circuit-noise threshold for the state distillation scheme with the 2D color code.

A direct way to switch between the 2D color code and the 3D subsystem color code is explored. The method exploits a particular gauge-fixing of the 3D subsystem color code for which the code state admits a local tensor product structure in the bulk and can therefore be prepared in constant time. The restriction decoder was adapted for the 3D color code with a boundary and produced a high error threshold. However, the failure probability of implementing the T gate with code-switching and the estimated T gate threshold is found to be lower than the one calculated for the state distillation.

The results suggest that code-switching does not offer substantial savings over state distillation in terms of both space overhead, i.e., the number of physical qubits required, and space-time overhead (the space overhead multiplied by the number of physical time units required) for most circuit noise error rates. Code switching holds a lot of promise, but is dependent on the clever design of codes that have transversal implementations of non-Clifford gates, and such a protocol is at the moment inferior to the state distillation. There exist other protocols that rival state distillation, like the pieceable fault-tolerant implementation of the CCZ gate, which in general shows the importance of having a protocol for non-Clifford gate implementation with a relatively small number of qubits and operations.

This work estimates the resources needed to prepare high-fidelity T states encoded in the 2D color code. This is accomplished via either state distillation or code-switching assuming that both approaches are implemented using noisy quantum-local operations in 3D. In particular, these two approaches were compared by simulating noisy circuits built from single-qubit state preparations, unitaries and measurements, and two-qubit unitaries between nearby qubits.

It is reported that the circuit noise threshold achieved for the code-switching implementation in this work is the highest presented to date and is equivalent to the circuit-noise threshold for the state distillation scheme with the 2D color code.

A direct way to switch between the 2D color code and the 3D subsystem color code is explored. The method exploits a particular gauge-fixing of the 3D subsystem color code for which the code state admits a local tensor product structure in the bulk and can therefore be prepared in constant time. The restriction decoder was adapted for the 3D color code with a boundary and produced a high error threshold. However, the failure probability of implementing the T gate with code-switching and the estimated T gate threshold is found to be lower than the one calculated for the state distillation.

The results suggest that code-switching does not offer substantial savings over state distillation in terms of both space overhead, i.e., the number of physical qubits required, and space-time overhead (the space overhead multiplied by the number of physical time units required) for most circuit noise error rates. Code switching holds a lot of promise, but is dependent on the clever design of codes that have transversal implementations of non-Clifford gates, and such a protocol is at the moment inferior to the state distillation. There exist other protocols that rival state distillation, like the pieceable fault-tolerant implementation of the CCZ gate, which in general shows the importance of having a protocol for non-Clifford gate implementation with a relatively small number of qubits and operations.

Tagged under

30 December 2020
###
Quantum computational advantage using photons

Recently, a team from the University of Science and Technology of China (USTC) demonstrated an experiment on a photonic quantum computer that outperformed even the fastest classical supercomputer in a computational task. Such kind of experiments are targeting algorithms and hardware platforms that can provide “quantum supremacy”, which occurs when a quantum computer is outperforming a classical computer.

A photonic quantum computer harnesses particles of light (photons) and consists of a complex array of optical devices, such as light sources, beam splitters, mirrors and photon detectors, that shuttle photons around. In such a computer, the quantum computation is based on a process called Boson Sampling, which is a task deliberately designed to prove quantum supremacy. Boson sampling is trying to understand what the distribution of photons is going to be at the output of a photonic interferometer. In the case of the quantum device implementation of boson sampling, the problem is solved `by itself’ since the distribution of the measured output is the desired photon distribution. In the case of the classical computer, a large computation is required to find the photon distribution, which increases with the size of the problem since the photon’s quantum properties lead to an exponentially increasing number of possible distributions. If operated with large numbers of photons and many channels, the quantum computer will produce a distribution of numbers that is too complex for a classical computer to calculate. In the new experiment, up to 76 photons traversed a network of 100 channels, which is a much larger amount than previously demonstrated, both experimentally and numerically.

This claim for quantum supremacy comes to reinforce what Google presented last year with their superconducting qubit-based quantum computer. The main difference between the two experiments in terms of the result is that the photonics experiment can create many more possible output states: ~1030 of them compared to ~1016. Such a large number makes it infeasible to calculate the whole probability distribution over outputs and store it for future generation of samples (something other researchers suggested as a rebuttal against Google’s claims, but which can certainly not hold in this new experiment).

Although researchers are currently looking for ways to get similar results with classical computers, it has not yet been successful. The main concern around this quantum experiment is the photon loss. It was reported that up to ~70% of the photons get lost on their way through the beam splitter network, allowing only ~30% to be detected. Typically, that amount of photon loss would be considered fatal for quantum supremacy. Furthermore, the classical simulations that are used for comparisons require fixing the rate of noise and then letting the numbers of photons and modes go to infinity. However, any real experiment has a fixed number of photons and modes (in USTC’s case, they’re ~50 and ~100 respectively).

Achieving the goal of quantum supremacy through such kind of experiments does not indicate the definitive, general, superiority of quantum computers over classical computers, since such kind of problems are deliberately designed to be hard for classical computers. On the other hand, it would also be an understatement to say this experiment is `only a proof of principle’, since boson sampling could have actual practical applications, for example solving specialized problems in quantum chemistry and mathematics.

Currently, most proposals in the literature apply boson sampling to vibronic spectra or finding dense subgraphs, but it is not certain whether these proposals will yield real speedups for a task of practical interest that involves estimating specific numbers (as opposed to sampling tasks, where boson sampling almost certainly does yield exponential speedups).

Future research will focus both on algorithm development, exploiting the particular characteristics of such a specialized quantum device, as well as experimental improvements such as decreased photon loss, higher quality sources and detectors, and larger number of modes. The described experiment presents a promising indication of this sub-field of quantum computing, and we keep a close eye on future developments.

A photonic quantum computer harnesses particles of light (photons) and consists of a complex array of optical devices, such as light sources, beam splitters, mirrors and photon detectors, that shuttle photons around. In such a computer, the quantum computation is based on a process called Boson Sampling, which is a task deliberately designed to prove quantum supremacy. Boson sampling is trying to understand what the distribution of photons is going to be at the output of a photonic interferometer. In the case of the quantum device implementation of boson sampling, the problem is solved `by itself’ since the distribution of the measured output is the desired photon distribution. In the case of the classical computer, a large computation is required to find the photon distribution, which increases with the size of the problem since the photon’s quantum properties lead to an exponentially increasing number of possible distributions. If operated with large numbers of photons and many channels, the quantum computer will produce a distribution of numbers that is too complex for a classical computer to calculate. In the new experiment, up to 76 photons traversed a network of 100 channels, which is a much larger amount than previously demonstrated, both experimentally and numerically.

This claim for quantum supremacy comes to reinforce what Google presented last year with their superconducting qubit-based quantum computer. The main difference between the two experiments in terms of the result is that the photonics experiment can create many more possible output states: ~1030 of them compared to ~1016. Such a large number makes it infeasible to calculate the whole probability distribution over outputs and store it for future generation of samples (something other researchers suggested as a rebuttal against Google’s claims, but which can certainly not hold in this new experiment).

Although researchers are currently looking for ways to get similar results with classical computers, it has not yet been successful. The main concern around this quantum experiment is the photon loss. It was reported that up to ~70% of the photons get lost on their way through the beam splitter network, allowing only ~30% to be detected. Typically, that amount of photon loss would be considered fatal for quantum supremacy. Furthermore, the classical simulations that are used for comparisons require fixing the rate of noise and then letting the numbers of photons and modes go to infinity. However, any real experiment has a fixed number of photons and modes (in USTC’s case, they’re ~50 and ~100 respectively).

Achieving the goal of quantum supremacy through such kind of experiments does not indicate the definitive, general, superiority of quantum computers over classical computers, since such kind of problems are deliberately designed to be hard for classical computers. On the other hand, it would also be an understatement to say this experiment is `only a proof of principle’, since boson sampling could have actual practical applications, for example solving specialized problems in quantum chemistry and mathematics.

Currently, most proposals in the literature apply boson sampling to vibronic spectra or finding dense subgraphs, but it is not certain whether these proposals will yield real speedups for a task of practical interest that involves estimating specific numbers (as opposed to sampling tasks, where boson sampling almost certainly does yield exponential speedups).

Future research will focus both on algorithm development, exploiting the particular characteristics of such a specialized quantum device, as well as experimental improvements such as decreased photon loss, higher quality sources and detectors, and larger number of modes. The described experiment presents a promising indication of this sub-field of quantum computing, and we keep a close eye on future developments.

Tagged under

13 December 2020
###
Micron-scale electro-acoustic qubit architecture for FTQC

In this paper, researchers from Amazon AWS & IQIM present an architecture for a fault-tolerant quantum computer, that is based on hybrid acoustic-electro devices to implement a stabilized cat code with highly biased noise, dominated by dephasing. To combat these sources of noise, they concatenated the cat code with an outer code that focuses mostly on correcting the dephasing errors, based on the repetition code and the surface code. The assumed error model is critical, since it will affect the fidelities of all required operations (initialization, measurement, gates, etc.) based on which the results are compared to previous works. Therefore, a detailed error analysis of measurements and gates, including CNOT and Toffoli gates is presented according to this realistic noise model.

Fault-tolerant quantum computing requires a universal set of gates, which can be divided into two categories, namely gates that belong to the Clifford group and gates that do not. Clifford gates can be typically achieved easily for a variety of codes, however non-Clifford gates require sophisticated protocols to create and then purify to increase their fidelity, like the magic state preparation/distillation protocol. A novel magic-state distillation protocol for Toffoli states is introduced here (injected via lattice surgery), which in combination with the error correction techniques that were used, result in a lower overhead compared to previous works. Actually, it is estimated that the factory that generates the magic states only accounts for approximately 7% of the total resource overhead requirements, with the other 93% coming from the rotated surface code.

In terms of quantum advantage, the authors find that with around 1,000 superconducting circuit components, one could construct a fault-tolerant quantum computer that can run circuits which are intractable for classical supercomputers.

However, when comparing this work to other related works, one should keep in mind that the assumed gate fidelities and the assumed error model can greatly affect the presented results. The error model in this work assumes Z error rates that are far less optimistic than typically assumed for transmon qubits and due to the cat state encoding there is bit-flip noise suppression that can naturally lead to increased performance. Furthermore, transmon architecture resource estimates are based on a simple depolarizing noise model, whereas this noise model has been derived from first principles modeling of the hardware, making the analysis more realistic.

Moreover, the authors claim to have a comparable or up to 3 times fewer qubits required compared to other superconducting transmon qubit architectures, based on the assumed gate fidelities. Also, similar runtime figures are reported compared to other superconducting transmon qubit architectures, however an important distinction of this protocol is that the magic states are created slightly faster than they can be transported to the main algorithm, whereas in other architectures the main algorithm has to wait for the magic states to be created which is a bottleneck in the runtime.

Although such a protocol shows promise for fault-tolerant quantum computing, the injection of magic states comes with an additional qubit cost for data access and routing. The choice of routing solution leads to a lower bound on runtime execution, so more careful optimization of routing costs and speed of magic state injection is crucial.

Fault-tolerant quantum computing requires a universal set of gates, which can be divided into two categories, namely gates that belong to the Clifford group and gates that do not. Clifford gates can be typically achieved easily for a variety of codes, however non-Clifford gates require sophisticated protocols to create and then purify to increase their fidelity, like the magic state preparation/distillation protocol. A novel magic-state distillation protocol for Toffoli states is introduced here (injected via lattice surgery), which in combination with the error correction techniques that were used, result in a lower overhead compared to previous works. Actually, it is estimated that the factory that generates the magic states only accounts for approximately 7% of the total resource overhead requirements, with the other 93% coming from the rotated surface code.

In terms of quantum advantage, the authors find that with around 1,000 superconducting circuit components, one could construct a fault-tolerant quantum computer that can run circuits which are intractable for classical supercomputers.

However, when comparing this work to other related works, one should keep in mind that the assumed gate fidelities and the assumed error model can greatly affect the presented results. The error model in this work assumes Z error rates that are far less optimistic than typically assumed for transmon qubits and due to the cat state encoding there is bit-flip noise suppression that can naturally lead to increased performance. Furthermore, transmon architecture resource estimates are based on a simple depolarizing noise model, whereas this noise model has been derived from first principles modeling of the hardware, making the analysis more realistic.

Moreover, the authors claim to have a comparable or up to 3 times fewer qubits required compared to other superconducting transmon qubit architectures, based on the assumed gate fidelities. Also, similar runtime figures are reported compared to other superconducting transmon qubit architectures, however an important distinction of this protocol is that the magic states are created slightly faster than they can be transported to the main algorithm, whereas in other architectures the main algorithm has to wait for the magic states to be created which is a bottleneck in the runtime.

Although such a protocol shows promise for fault-tolerant quantum computing, the injection of magic states comes with an additional qubit cost for data access and routing. The choice of routing solution leads to a lower bound on runtime execution, so more careful optimization of routing costs and speed of magic state injection is crucial.

Tagged under

01 December 2020
###
Active Quantum Research Areas: Barren Plateaus in PQCs

In this edition of Active Quantum Research Areas (AQRAs), we discuss recent research on barren plateaus in variational quantum algorithms.

Parametrized quantum circuits (PQCs) are employed in Quantum Neural Networks (QNNs) and Variational Quantum Algorithms (VQAs), which are typical researched algorithms that may allow for near-term quantum advantage. QNNs have been recently proposed as generalizations of classical neural networks that can potentially analyse quantum data more efficiently. VQAs have been proposed as a more resilient to noise circuit for the NISQ era.

Both algorithms may rely on gradient based optimization to iteratively minimize a cost function evaluated by measuring output(s) of a quantum processor. Training PQCs involves a hybrid quantum-classical optimization loop. Typically, the problem is encoded in a cost (or loss) function that is ideally efficient to evaluate on a quantum computer but computationally expensive for a classical one. While the quantum computer evaluates the cost, a classical optimizer updates some (usually continuous) parameters associated with the quantum operations. PQCs with fixed gate structure are often referred to as a variational ansatze.

A `barren plateau' is the phenomenon of exponentially vanishing gradients in sufficiently expressive parametrized quantum circuits and it was first coined in the context of quantum circuits in [McClean2018]. It has been established that the onset of a barren plateau regime depends on the cost function, although the particular behaviour has been demonstrated only for certain classes of cost functions. Barren plateau landscapes correspond to gradients that vanish exponentially in the number of qubits. Therefore, an exponentially large precision is needed to navigate through the landscape. Such landscapes have been demonstrated for variational quantum algorithms and quantum neural networks with either deep circuits or global cost functions. Even if one manages to avoid these barren plateaus, there is an additional difficulty of optimizing in the presence of the hardware noise that defines NISQ devices, which is expected to modify the cost landscape.

The classical optimization step, integral to these algorithms, can be implemented using either gradient-free or gradient-based methods. At first glance, the use of the former seems more effective as the evaluation of the cost function is subject to imperfection(s). However, further research developed methods of evaluating gradients analytically (i.e. without relying on finite differences), and the access to gradients was proven to speed up the convergence to local minima.

In the absence of a barren plateau, the determination of a minimizing direction in the cost function landscape does not require an exponentially large precision, meaning that one can always navigate through the landscape by measuring expectation values with a precision that grows at most polynomially with the system size. Polynomial overhead is the standard goal for quantum algorithms, which aim to achieve a speedup over classical algorithms that often scale exponentially. Hence, the absence or existence of a barren plateau can determine whether or not a quantum speedup is achievable.

Recently, particularly in the past month of November 2020, many interesting papers appeared on the arXiv describing research on these barren plateaus, which we briefly summarize here.

In [Pesah2020], the authors analyse gradient scaling for the parameters of their proposed Quantum Convolutional Neural Network (QCNN). The main result of this research is the proposition of a novel method for analyzing the scaling of the variance and lower bounding the variance for the QCNN architecture.

In [Zhang2020], the authors propose a tree tensor architecture for QNNs. The main result is the proof that tree tensor QNNs have gradients that vanish polynomially with the qubit number, irrespective of which encoding methods are employed.

In [Fontana2020], the authors examine ways to exploit the symmetries and degeneracies that exist in PQCs and investigate how hardware noise affect such symmetries. The main result is that they find an exponentially large symmetry in PQCs, yielding an exponentially large degeneracy of the minima in the cost landscape. Also, they show that noise (specifically non-unital noise) can break these symmetries and lift the degeneracy of minima, making many of them local minima instead of global minima. The main contribution is an novel optimization method, which exploits underlying symmetries in PQCs.

In [Uvarov2020], the authors derive a lower bound on the variance of the gradient, which depends mainly on the width of the circuit causal cone of each term in the Pauli decomposition of the cost function. The main contribution is that this result further clarifies the conditions under which barren plateaus can occur. Given a Hamiltonian, they are able to estimate its susceptibility to the barren plateaus, therefore one can pre-process Hamiltonians in order to make the optimization more viable.

In [Arrasmith2020], the authors answer the question of whether only gradient based optimizers suffer from the vanishing gradient phenomenon. The main result is that gradient-free optimizers do not solve the barren plateau problem and prove that cost function differences are exponentially suppressed in a barren plateau. Hence, without exponential precision, gradient-free optimizers will not make progress in the optimization.

In [Anand2020], the authors explore Natural Evolutionary Strategies (NES) for the optimization of randomly-initialized PQCs in the region of vanishing gradients. The main result is that using the NES gradient estimator the exponential decrease in variance can be alleviated. They compare their two approaches, namely the exponential and separable NES, against standard gradient descent and show that optimization of randomly initialized PQCs can be performed with significantly less circuit evaluations using NES, while achieving comparable accuracy to gradient based methods. They also show that NES methods can be used to amplify gradients and improve parameter initialization for gradient-based approaches.

It is clear from the ongoing research that the vanishing gradient problem is substantial when working with PQCs, which in turn seems to be a crucial aspect for proving quantum advantage in the NISQ era. Although different approaches have been suggested to overcome the limitations arising from barren plateaus, there is not one simple way to do it. Most of the strategies discussed, focus on the near term heuristic solutions that can be obtained from PQCs, with the main goal being a polynomial instead of an exponential overhead with the number of qubits for the precision of the cost function, which is a way one can prove quantum advantage. This is done by employing new optimization strategies, optimizers, re-design of the Hamiltonian/ansatz, exploitation of symmetries and many other ways. Researchers in this field will likely keep inventing new ways to avoid or mitigate barren plateaus for the foreseeable future, and these recent developments have ushered in a new era of understanding of this domain.

References:

[Anand2020] Natural Evolutionary Strategies for Variational Quantum Computation, by Abhinav Anand, Matthias Degroote and Alán Aspuru-Guzik, arXiv:2012.00101

[Arrasmith2020] Effect of barren plateaus on gradient-free optimization, by Andrew Arrasmith, M. Cerezo, Piotr Czarnik, Lukasz Cincio, Patrick J. Coles, arXiv:2011.12245

[Fontanta2020] Optimizing parametrized quantum circuits via noise-induced breaking of symmetries, Enrico Fontana, M. Cerezo, Andrew Arrasmith, Ivan Rungger, Patrick J. Coles, arXiv:2011.08763

[Pesah2020] Absence of Barren Plateaus in Quantum Convolutional Neural Networks, by Arthur Pesah, M. Cerezo, Samson Wang, Tyler Volkoff, Andrew T. Sornborger and Patrick J. Coles, arXiv:2011.02966

[McClean2018] Barren plateaus in quantum neural network training landscapes, by McClean, J.R., Boixo, S., Smelyanskiy, V.N., R. Babbush and H. Neven, Nat Commun 9, 4812 (2018)

[Uvarov2020] On barren plateaus and cost function locality in variational quantum algorithms, by Alexey Uvarov and Jacob Biamonte, arXiv:2011.10530

[Zhang2020] Toward Trainability of Quantum Neural Networks, by Kaining Zhang, Min-Hsiu Hsieh, Liu Liu and Dacheng Tao, arXiv:2011.06258

Parametrized quantum circuits (PQCs) are employed in Quantum Neural Networks (QNNs) and Variational Quantum Algorithms (VQAs), which are typical researched algorithms that may allow for near-term quantum advantage. QNNs have been recently proposed as generalizations of classical neural networks that can potentially analyse quantum data more efficiently. VQAs have been proposed as a more resilient to noise circuit for the NISQ era.

Both algorithms may rely on gradient based optimization to iteratively minimize a cost function evaluated by measuring output(s) of a quantum processor. Training PQCs involves a hybrid quantum-classical optimization loop. Typically, the problem is encoded in a cost (or loss) function that is ideally efficient to evaluate on a quantum computer but computationally expensive for a classical one. While the quantum computer evaluates the cost, a classical optimizer updates some (usually continuous) parameters associated with the quantum operations. PQCs with fixed gate structure are often referred to as a variational ansatze.

A `barren plateau' is the phenomenon of exponentially vanishing gradients in sufficiently expressive parametrized quantum circuits and it was first coined in the context of quantum circuits in [McClean2018]. It has been established that the onset of a barren plateau regime depends on the cost function, although the particular behaviour has been demonstrated only for certain classes of cost functions. Barren plateau landscapes correspond to gradients that vanish exponentially in the number of qubits. Therefore, an exponentially large precision is needed to navigate through the landscape. Such landscapes have been demonstrated for variational quantum algorithms and quantum neural networks with either deep circuits or global cost functions. Even if one manages to avoid these barren plateaus, there is an additional difficulty of optimizing in the presence of the hardware noise that defines NISQ devices, which is expected to modify the cost landscape.

The classical optimization step, integral to these algorithms, can be implemented using either gradient-free or gradient-based methods. At first glance, the use of the former seems more effective as the evaluation of the cost function is subject to imperfection(s). However, further research developed methods of evaluating gradients analytically (i.e. without relying on finite differences), and the access to gradients was proven to speed up the convergence to local minima.

In the absence of a barren plateau, the determination of a minimizing direction in the cost function landscape does not require an exponentially large precision, meaning that one can always navigate through the landscape by measuring expectation values with a precision that grows at most polynomially with the system size. Polynomial overhead is the standard goal for quantum algorithms, which aim to achieve a speedup over classical algorithms that often scale exponentially. Hence, the absence or existence of a barren plateau can determine whether or not a quantum speedup is achievable.

Recently, particularly in the past month of November 2020, many interesting papers appeared on the arXiv describing research on these barren plateaus, which we briefly summarize here.

In [Pesah2020], the authors analyse gradient scaling for the parameters of their proposed Quantum Convolutional Neural Network (QCNN). The main result of this research is the proposition of a novel method for analyzing the scaling of the variance and lower bounding the variance for the QCNN architecture.

In [Zhang2020], the authors propose a tree tensor architecture for QNNs. The main result is the proof that tree tensor QNNs have gradients that vanish polynomially with the qubit number, irrespective of which encoding methods are employed.

In [Fontana2020], the authors examine ways to exploit the symmetries and degeneracies that exist in PQCs and investigate how hardware noise affect such symmetries. The main result is that they find an exponentially large symmetry in PQCs, yielding an exponentially large degeneracy of the minima in the cost landscape. Also, they show that noise (specifically non-unital noise) can break these symmetries and lift the degeneracy of minima, making many of them local minima instead of global minima. The main contribution is an novel optimization method, which exploits underlying symmetries in PQCs.

In [Uvarov2020], the authors derive a lower bound on the variance of the gradient, which depends mainly on the width of the circuit causal cone of each term in the Pauli decomposition of the cost function. The main contribution is that this result further clarifies the conditions under which barren plateaus can occur. Given a Hamiltonian, they are able to estimate its susceptibility to the barren plateaus, therefore one can pre-process Hamiltonians in order to make the optimization more viable.

In [Arrasmith2020], the authors answer the question of whether only gradient based optimizers suffer from the vanishing gradient phenomenon. The main result is that gradient-free optimizers do not solve the barren plateau problem and prove that cost function differences are exponentially suppressed in a barren plateau. Hence, without exponential precision, gradient-free optimizers will not make progress in the optimization.

In [Anand2020], the authors explore Natural Evolutionary Strategies (NES) for the optimization of randomly-initialized PQCs in the region of vanishing gradients. The main result is that using the NES gradient estimator the exponential decrease in variance can be alleviated. They compare their two approaches, namely the exponential and separable NES, against standard gradient descent and show that optimization of randomly initialized PQCs can be performed with significantly less circuit evaluations using NES, while achieving comparable accuracy to gradient based methods. They also show that NES methods can be used to amplify gradients and improve parameter initialization for gradient-based approaches.

It is clear from the ongoing research that the vanishing gradient problem is substantial when working with PQCs, which in turn seems to be a crucial aspect for proving quantum advantage in the NISQ era. Although different approaches have been suggested to overcome the limitations arising from barren plateaus, there is not one simple way to do it. Most of the strategies discussed, focus on the near term heuristic solutions that can be obtained from PQCs, with the main goal being a polynomial instead of an exponential overhead with the number of qubits for the precision of the cost function, which is a way one can prove quantum advantage. This is done by employing new optimization strategies, optimizers, re-design of the Hamiltonian/ansatz, exploitation of symmetries and many other ways. Researchers in this field will likely keep inventing new ways to avoid or mitigate barren plateaus for the foreseeable future, and these recent developments have ushered in a new era of understanding of this domain.

References:

[Anand2020] Natural Evolutionary Strategies for Variational Quantum Computation, by Abhinav Anand, Matthias Degroote and Alán Aspuru-Guzik, arXiv:2012.00101

[Arrasmith2020] Effect of barren plateaus on gradient-free optimization, by Andrew Arrasmith, M. Cerezo, Piotr Czarnik, Lukasz Cincio, Patrick J. Coles, arXiv:2011.12245

[Fontanta2020] Optimizing parametrized quantum circuits via noise-induced breaking of symmetries, Enrico Fontana, M. Cerezo, Andrew Arrasmith, Ivan Rungger, Patrick J. Coles, arXiv:2011.08763

[Pesah2020] Absence of Barren Plateaus in Quantum Convolutional Neural Networks, by Arthur Pesah, M. Cerezo, Samson Wang, Tyler Volkoff, Andrew T. Sornborger and Patrick J. Coles, arXiv:2011.02966

[McClean2018] Barren plateaus in quantum neural network training landscapes, by McClean, J.R., Boixo, S., Smelyanskiy, V.N., R. Babbush and H. Neven, Nat Commun 9, 4812 (2018)

[Uvarov2020] On barren plateaus and cost function locality in variational quantum algorithms, by Alexey Uvarov and Jacob Biamonte, arXiv:2011.10530

[Zhang2020] Toward Trainability of Quantum Neural Networks, by Kaining Zhang, Min-Hsiu Hsieh, Liu Liu and Dacheng Tao, arXiv:2011.06258

Tagged under

20 November 2020
###
Quantum algorithm for solving nonlinear differential equations

Qu&Co in collaboration with our academic advisor Oleksandr Kyriienko at the University of Exeter has developed a proprietary quantum algorithm which promises a generic and efficient way to solve nonlinear differential equations. The algorithm is compatible with near-term quantum-processors, with promising extensions for fault-tolerant implementation. Using a quantum feature map encoding, we define functions as expectation values of parametrized quantum circuits. We use automatic differentiation to represent function derivatives in an analytical form as differentiable quantum circuits (DQCs), thus avoiding inaccurate finite difference procedures for calculating gradients. We describe a hybrid quantum-classical workflow where DQCs are trained to satisfy differential equations and specified boundary conditions. As a particular example setting, we show how this approach can implement a spectral method for solving differential equations in a high-dimensional feature space. From a technical perspective, we design a Chebyshev quantum feature map that offers a powerful basis set of fitting polynomials and possesses rich expressivity. We simulate the algorithm to solve an instance of Navier-Stokes equations, and compute density, temperature and velocity profiles for the fluid flow in a convergent-divergent nozzle.

Tagged under

03 November 2020
###
The power of data in quantum machine learning

Enhancing machine learning applications with quantum computing is currently being massively investigated, since it might prove quantum advantage during the NISQ era. Enhancement is typically performed through improvement of the training process of existing classical models or enhancing inference in graphical models. Another way is through the construction of quantum models that generate correlations between variables that are inefficient to represent through classical computation (e.g. quantum neural networks). If the model leverages a quantum circuit that is hard to sample results from classically, then there is potential for a quantum advantage.

The main message of this work is to show quantitatively that when classical models are provided with some training data, even if those were obtained from a quantum model that cannot be computed classically, they can reach similar performance as the quantum model. The authors provide rigorous prediction error bounds for training classical and quantum ML methods based on kernel functions in order to learn quantum mechanical models. It has been proven that kernel methods provide provable guarantees, but are also very flexible in the functions they can learn.

The main contribution of the author’s work lies in the development of quantum kernels and the implementation of a guidebook that generates ML problems which give a large separation between quantum and classical models. The use of prediction error bounds quantifies the separation between prediction errors of quantum and classical ML models for a fixed amount of training data. Typically, the comparison is done based on the geometric difference defined by the closest efficient classical ML model, however, in practice, one should consider the geometric difference with respect to a suite of optimized classical ML models. When the geometric difference is small then a classical ML method is guaranteed to provide similar or better performance in prediction on the data set, independent of the function values or labels. When the geometry differs greatly, there exists a data set that exhibits large prediction advantage using the quantum ML model.

The small geometric difference is a consequence of the exponentially large Hilbert space employed by existing quantum models, where all inputs are too far apart. To circumvent the setback, the authors propose an improvement which enlarges the geometric difference by projecting quantum states embedded from classical data back to approximate classical representation. This proposal allows for the construction of a data set that demonstrates large prediction advantage over common classical ML models in numerical experiments up to 30 qubits. In that way, one can use a small quantum computer to generate efficiently verifiable ML problems that could be challenging for classical ML models.

The main message of this work is to show quantitatively that when classical models are provided with some training data, even if those were obtained from a quantum model that cannot be computed classically, they can reach similar performance as the quantum model. The authors provide rigorous prediction error bounds for training classical and quantum ML methods based on kernel functions in order to learn quantum mechanical models. It has been proven that kernel methods provide provable guarantees, but are also very flexible in the functions they can learn.

The main contribution of the author’s work lies in the development of quantum kernels and the implementation of a guidebook that generates ML problems which give a large separation between quantum and classical models. The use of prediction error bounds quantifies the separation between prediction errors of quantum and classical ML models for a fixed amount of training data. Typically, the comparison is done based on the geometric difference defined by the closest efficient classical ML model, however, in practice, one should consider the geometric difference with respect to a suite of optimized classical ML models. When the geometric difference is small then a classical ML method is guaranteed to provide similar or better performance in prediction on the data set, independent of the function values or labels. When the geometry differs greatly, there exists a data set that exhibits large prediction advantage using the quantum ML model.

The small geometric difference is a consequence of the exponentially large Hilbert space employed by existing quantum models, where all inputs are too far apart. To circumvent the setback, the authors propose an improvement which enlarges the geometric difference by projecting quantum states embedded from classical data back to approximate classical representation. This proposal allows for the construction of a data set that demonstrates large prediction advantage over common classical ML models in numerical experiments up to 30 qubits. In that way, one can use a small quantum computer to generate efficiently verifiable ML problems that could be challenging for classical ML models.

Tagged under

05 October 2020
###
Quantum Speed-up in Supervised Machine Learning

There is currently a big effort in proving the existence of quantum advantage in NISQ devices, with typical applications including machine learning due to its wide applicability. However, an argument has been raised stipulating whether the advantage being proven in such applications arises from the nature of quantum computing or the way the data is provided. Actually, there are recent works which have shown that, when classical algorithms have an analogous sampling access to data, then they can reach similar performance as their quantum counterparts. Therefore, a fair study concerning quantum advantage in such a setting should only assume classical access to data for both the classical and quantum algorithm.

Most quantum algorithms being explored at the moment are of the variational nature, where a circuit is selected from a parametrized circuit family via classical optimization. However, such algorithms are heuristic and there is no formal evidence that they exhibit a genuine quantum advantage. In their work, the authors describe an exponential quantum speedup via the use of a quantum-enhanced feature space, where each data point is mapped non-linearly to a quantum state and then classified by a linear classifier in the high dimensional Hilbert space. The classification is achieved through a support vector machine (SVM), whose kernel matrix is obtained by measuring the pairwise inner product of the feature states, known as quantum kernel estimation (QKE). This kernel matrix is then given to a classical optimizer that efficiently finds the linear classifier that optimally separates the training data in feature space by running a convex quadratic program.

The advantage of such a quantum learner comes from the ability to recognize classically intractable complex patterns, using a feature map. This implementation combines the ideas of the generalization of soft margin classifiers and rigorously bounding the misclassification error of the SVM-QKE algorithm, thereby providing quantum advantage for the quantum classifier with guaranteed high accuracy. Furthermore, it is proven that this SVM classifier can learn the optimal separator in the exponentially large feature space, while also making it robust against additive sampling errors due to the error mitigation techniques that were used.

Finally, the classification problem showing the exponential quantum speed-up was based on the discrete logarithm problem (DLP) and it is proven that no efficient classical algorithm can achieve an accuracy that is inverse-polynomially better than random guessing, assuming the widely-believed classical hardness of DLP.

The results from this paper are relevant to a broader class of quantum machine learning algorithms exploiting feature maps and which aim to avoid the input-output problem. Although this particular problem is not practically motivated, these results set a positive theoretical foundation for the search of practical quantum advantage in machine learning.

Most quantum algorithms being explored at the moment are of the variational nature, where a circuit is selected from a parametrized circuit family via classical optimization. However, such algorithms are heuristic and there is no formal evidence that they exhibit a genuine quantum advantage. In their work, the authors describe an exponential quantum speedup via the use of a quantum-enhanced feature space, where each data point is mapped non-linearly to a quantum state and then classified by a linear classifier in the high dimensional Hilbert space. The classification is achieved through a support vector machine (SVM), whose kernel matrix is obtained by measuring the pairwise inner product of the feature states, known as quantum kernel estimation (QKE). This kernel matrix is then given to a classical optimizer that efficiently finds the linear classifier that optimally separates the training data in feature space by running a convex quadratic program.

The advantage of such a quantum learner comes from the ability to recognize classically intractable complex patterns, using a feature map. This implementation combines the ideas of the generalization of soft margin classifiers and rigorously bounding the misclassification error of the SVM-QKE algorithm, thereby providing quantum advantage for the quantum classifier with guaranteed high accuracy. Furthermore, it is proven that this SVM classifier can learn the optimal separator in the exponentially large feature space, while also making it robust against additive sampling errors due to the error mitigation techniques that were used.

Finally, the classification problem showing the exponential quantum speed-up was based on the discrete logarithm problem (DLP) and it is proven that no efficient classical algorithm can achieve an accuracy that is inverse-polynomially better than random guessing, assuming the widely-believed classical hardness of DLP.

The results from this paper are relevant to a broader class of quantum machine learning algorithms exploiting feature maps and which aim to avoid the input-output problem. Although this particular problem is not practically motivated, these results set a positive theoretical foundation for the search of practical quantum advantage in machine learning.

Tagged under

29 September 2020
###
Meta variational quantum eigensolver

Qu&Co comments on this publication:

In this paper, the meta-VQE algorithm is presented, which is an adaptation of the Variational Quantum Eigensolver (VQE). VQE is a variational algorithm that was originally proposed for finding the ground state energy of a given Hamiltonian by variationally minimizing its expectation value with a parametrized quantum circuit. The cost function of VQE is the expected value of the model Hamiltonian. The variational principle states that this value is an upper bound of the ground state energy, so everything reduces to minimize this value by fine-tuning the parameters of the circuit.

The meta-VQE algorithm is inspired by quantum machine learning algorithms (QML) and is able to learn the ground state energy profile of a parametrized Hamiltonian. First, this circuit is trained with a set of Hamiltonian parameters, which are encoded in the gates of an encoding unitary. By designing a cost function with the expected values of all these Hamiltonian training points, the algorithm extracts the optimal values of the variational parameters. Then, one can compute the energy for other Hamiltonian values by just running the meta-VQE circuit with the parameters obtained in the minimization. In addition, one can also use the result of a meta-VQE training as a starting point of a standard VQE algorithm, the opt-meta-VQE, instead of random initialization.

This technique can make the algorithm converge to the correct solution, while previous variants of VQE often suffer from convergence issues due to the exponential increase of the Hilbert space paired with a random initialized ansatz, typically resulting in a limited probability of finding the ground state as the quantum system increases. The characteristic trait of the meta-VQE is that it can be used to first explore the ground state energies of Hamiltonian parameter space with only a few training points and then use the result as an initial state for a precise VQE, resulting in high precision. Furthermore, meta-VQE can easily be adapted to be a part of other VQE strategies to increase performance. Moreover, the meta-VQE is able to capture global correlation with a few training points alleviating the refined optimization of the individual points in a later step. The authors conclude that the meta-VQE can find the general energy shape but not provide an accurate value, in contrast to standard VQE. However, the opt-meta-VQE proves valuable, achieving better results than standard VQE with random initialization.

Tagged under

12 August 2020
###
Density Functional Theory on a Quantum Computer

A prospective application of quantum computing is solving quantum chemistry problems, however, obtaining exact solutions is difficult due to the lack of general method of obtaining such solutions. Typically the solution lies in finding the ground state energy, even though the energy is not descriptive enough to fully characterize all desired properties of a system. In order to find such properties, many measurements of the wavefunction are required. These measurements are expensive, because the wavefunction cannot be copied and must often be re-prepared before a second measurement is performed. Finding the full wavefunction would require exponentially many measurements, so one option would be to encode many solutions into one measurement by using a machine learned (ML) model. Training of the ML model requires finding exact quantities at several different external potentials. Besides that, one can use density functional theory (DFT), to replace the wavefunction with the one-body density, n(r), which has fewer variables. When DFT is used, instead of the Hamiltonian, the universal functional, F[n], must be found. The quantities required for the classical user to find self-consistent solutions are the exact functional (determining the energy) and the functional derivative. So, in addition to finding F[n], one also must find some other quantity such as the density, n(r), or the Kohn-Sham(KS) potential, vs(r). With these components, one can fully characterize a quantum ground state and solve for other measurable quantities.

The authors propose an algorithm that finds the ML model for F[n] on the quantum computer if a ground-state wavefunction is available. The algorithm leaves the wavefunction largely undisturbed so it can be used as the starting state for another system, greatly reducing the pre-factor required to solve other systems through a quantum counting algorithm to extract descriptive quantities such as the density. Most of this algorithm is kept entirely on the quantum computer to motivate future improvements for speed, but the counting algorithm does allow for information to be output classically. Moreover, the authors demonstrate that the exact Kohn-Sham potential can be solved in a faster way with a gradient evaluated on a cost function for the Kohn-Sham system.

The novelty of the proposed algorithm suggested in this work is the limitation of the number of measurements and re-preparations of the wavefunction especially in the case of time-dependent quantities. Since there is no algorithm for the general case that is exponentially better than the proposed algorithm, limiting the number of measurements and re-preparations of the wavefunction is as best as one can achieve. The proposed algorithm is a combination of several known algorithms including quantum phase estimation, quantum amplitude estimation, and quantum gradient methods that are iteratively used to train a machine learned model.

The authors propose an algorithm that finds the ML model for F[n] on the quantum computer if a ground-state wavefunction is available. The algorithm leaves the wavefunction largely undisturbed so it can be used as the starting state for another system, greatly reducing the pre-factor required to solve other systems through a quantum counting algorithm to extract descriptive quantities such as the density. Most of this algorithm is kept entirely on the quantum computer to motivate future improvements for speed, but the counting algorithm does allow for information to be output classically. Moreover, the authors demonstrate that the exact Kohn-Sham potential can be solved in a faster way with a gradient evaluated on a cost function for the Kohn-Sham system.

The novelty of the proposed algorithm suggested in this work is the limitation of the number of measurements and re-preparations of the wavefunction especially in the case of time-dependent quantities. Since there is no algorithm for the general case that is exponentially better than the proposed algorithm, limiting the number of measurements and re-preparations of the wavefunction is as best as one can achieve. The proposed algorithm is a combination of several known algorithms including quantum phase estimation, quantum amplitude estimation, and quantum gradient methods that are iteratively used to train a machine learned model.

Tagged under

Qu&Co comments on this publication:

Various heuristic quantum optimization approaches have been suggested to solve combinatorial optimization problems, since the NP hardness of such problems makes heuristics the only viable option for many problems that need to be routinely solved in real-world applications. Usually the performance of the heuristic approach is examined; however, an equally important aspect is their application to current hardware. During the NISQ era, hardware with only a limited number of qubits is available and quantum error correction cannot be fully exploited. Furthermore, the error rates of the quantum circuitry is still high (10^{-3} – 10^{-4}), which makes the computation inefficient.

This paper reviews existing approaches and develop new methods or improvements of many prominent approaches to combinatorial optimization on a small quantum computer by compiling circuits that implement them and optimizing their fault-tolerant realizations. Many of these methods are bottlenecked by calls to the same subroutines; thus, optimized circuits for those primitives should be of interest regardless of which heuristic is most effective in practice. In essentially all heuristic approaches to quantum optimization there is a primitive that is repeated many times in order to perform the optimization. Instead of investigating how many times those primitives must be repeated, one should focus on the best strategies for realizing those primitives within a fault-tolerant cost model. The goal of this paper is to estimate the performance of an early universal quantum computer for key steps of combinatorial optimization.

These bottlenecks are compiled for several families of optimization problems and it is reported for how long and for what size systems one can perform these heuristics in the surface code given a range of resource budgets. The obtained results discourage the notion that any quantum optimization heuristic realizing only a quadratic speedup will achieve an advantage over classical algorithms on modest superconducting qubit surface code processors without significant improvements in the implementation of the surface code. The essential reason for this is the substantial constant factor slowdown between error-corrected quantum computation and classical computation. Based on these results, we will either need quantum optimization algorithms that afford speedups which are much better than quadratic, or we will need significant improvements in the way that we realize error-correction.

For example, the authors calculate that to implement problems between N = 64 and N = 1024, hundreds of thousands of physical qubits are required when physical gate error rates are on the order of 10^{-4} and sometimes over a million are required for physical gate error rates on the order of 10^{-3}. Even more concerning is that the number of updates achieved in a day (given realistic cycle times for the error correcting codes) is relatively low, on the order of about ten thousand updates for the smallest instances considered of the cheapest cost functions. With such overheads, these heuristics would need to yield dramatically better improvements in the objective function per step than classical optimization heuristics. Therefore, barring significant advances in the implementation of the surface code (e.g., much faster state distillation), quantum optimization algorithms offering only a quadratic speedup are unlikely to produce any quantum advantage on the first few generations of superconducting qubit surface code processors.

Tagged under

27 June 2020
###
Quantum optimization for linear algebra problems

Qu&Co comments on this publication:

One hallmark problem in computational linear algebra is the binary linear least squares (BLLS), which is formally in the NP-Hard complexity class. Efficient classical methods for solving this problem exists with limited approximations to the solution. Quantum computing may solve these problems with a better approximation ratio and/or in a faster runtime scaling. So-far, this problem has only been considered on a quantum annealing by mapping it to a QUBO. In this paper, the problem is solved using a QAOA approach on the gate-based model of quantum computing. The performance is assessed both on a wavefunction simulator, shotnoise simulator and on the 5-qubit IBM cloud computing quantum device ibmq_london. As an outlook: BLLS may serve as a building block for other problems such as Non-negative Binary Matrix Factorization, or clubbed together for a fixed-point approximation of real variables. This paper was partially supervised by Vincent Elfving from Qu & Co.

Tagged under

11 May 2020
###
How many qubits for quantum supremacy?

Qu&Co comments on this publication:

Quantum computational supremacy (QCS) arguments have been provided to demonstrate that quantum computers will soon outperform classical computers in a variety of algorithms. However, in order to truly prove supremacy, several strict measures need to be taken: an appropriate algorithm must be selected, a quantum device to run the algorithm must be designed, the ability to verify the results of the calculations must be considered and a complexity theory that supports the claim that a classical computer would be unable to run such an algorithm should be provided. Quantum circuits running on quantum computing chips that are currently experimentally realized might still be able to be simulated in highly parallelized state-of-the-art classical supercomputers, therefore one can only make *conjectures* about QCS at the moment. Typically, classical simulation of certain families of quantum circuits require scaling that is worse than any polynomial in the size of the circuit, which prevents us from calculating exactly the number of qubits these quantum circuits must have for their classical simulation to be intractable on modern classical supercomputers.

In this paper, three refined fine-grained conjectures about quantum supremacy are provided and it is calculated that 208 qubits and 500 gates for Instantaneous Quantum Polynomial-Time (IQP) circuits, 420 qubits and 500 constraints for Approximate Optimization Algorithm (QAOA) circuits and 98 photons and 500 optical elements are sufficient. Although noise in current quantum devices cannot be fully approximated, a lower bound on the runtime of all three algorithms for any multiplicative-error classical simulation is provided.

This paper provides a concrete estimation on the number of qubits required for three algorithms that have gained a lot of attention during the NISQ era. While the orginal work stems from 2018, the number of qubits required has been recalculated in the newest version of this paper, which provides a good indication of how fidelity of quantum chips has been improved in the last two years, as well as the latest understanding in complexity and the on-going evolution in classical competition.

Tagged under

- Avoiding Barren Plateaus 22 January 2022 in Blog
- Decompositional QGNN 15 January 2022 in Blog
- Quantum Capsule Networks 08 January 2022 in Blog
- Quantum Digital-Analog Fermionic Simulation 31 December 2021 in Blog
- Quantum Semidefinite Programming 19 December 2021 in Blog
- Shallow Circuits in Variational QML 14 December 2021 in Blog

Copyright © Qu & Co BV