QNG optimizer using the Fubini-Study metric
The Quantum Natural Gradient optimizer generalizes classical natural gradient descent to the quantum regime by accounting for the non-Euclidean geometry of parameterized quantum states through the Fubini-Study metric tensor. Unlike vanilla gradient descent, which treats parameter space as flat, QNG uses the Quantum Fisher Information matrix to rescale update directions according to the state's local curvature. This approach often yields faster convergence and more stable optimization trajectories in variational quantum algorithms.
In variational quantum algorithms, a parameterized quantum state |\psi(\theta)\rangle is generated by applying a sequence of parameterized gates to a reference state. Classical gradient descent updates parameters as if the geometry of this state space were Euclidean, which ignores the fact that moving by \Delta \theta in parameter space can produce vastly different changes in the quantum state depending on the local curvature. The Fubini-Study metric tensor quantifies this intrinsic geometry by measuring how the quantum state changes with respect to infinitesimal parameter variations.
The Quantum Fisher Information matrix is the real part of the Quantum Geometric Tensor, which itself derives from the overlap between states with infinitesimally shifted parameters. For pure states, the QFI captures the sensitivity of the state to parameter changes and is directly related to the Bures metric on the projective Hilbert space. When used as a preconditioner in gradient descent, the inverse QFI rescales the update direction to follow the natural geodesics of the quantum state manifold rather than straight lines in parameter space.
Geometrically, the QFI endows the parameter space with a Riemannian metric that reflects the information geometry of the underlying quantum state family. This metric is invariant under reparameterizations of the circuit and captures genuine physical distinguishability rather than arbitrary coordinate dependencies. The classical Fisher information emerges as a special case when the metric is evaluated with respect to measurement statistics rather than the pure state itself.
QNG Update Rule
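In standard notation (with \eta for the step size and \mathcal{L} for the cost function, symbols mine), the QNG update rule reads:

```latex
\theta_{t+1} = \theta_t - \eta \, g^{+}(\theta_t) \, \nabla_\theta \mathcal{L}(\theta_t)
```

where g^{+} denotes the (pseudo-)inverse of the Fubini-Study metric tensor evaluated at the current parameters.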
Metric Tensor
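With |\psi(\theta)\rangle the parameterized state introduced above, the Fubini-Study metric tensor is:

```latex
g_{ij}(\theta) = \mathrm{Re}\!\left[ \langle \partial_i \psi | \partial_j \psi \rangle
  - \langle \partial_i \psi | \psi \rangle \langle \psi | \partial_j \psi \rangle \right]
```

The second term projects out changes that only alter the global phase, so g measures genuine motion on the projective Hilbert space.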
Quantum Geometric Tensor
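The Quantum Geometric Tensor, whose real part gives the metric (and, up to the conventional factor of 4, the Quantum Fisher Information for pure states), is:

```latex
G_{ij}(\theta) = \langle \partial_i \psi | \partial_j \psi \rangle
  - \langle \partial_i \psi | \psi \rangle \langle \psi | \partial_j \psi \rangle,
\qquad g_{ij} = \mathrm{Re}\, G_{ij}, \qquad F_{ij} = 4\, g_{ij}
```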
Computing the full QFI matrix exactly requires evaluating O(P^2) expectation values for P parameters, which quickly becomes prohibitive as circuit width and depth increase. To address this, several approximate strategies have been developed that exploit the structure of typical variational ansätze. The block-diagonal approximation computes the metric tensor for individual layers independently, ignoring cross-layer correlations. This reduces the complexity to O(L d^2), where L is the number of layers and d is the maximum number of parameters per layer, while still often capturing the dominant curvature directions.
The diagonal approximation goes further by retaining only the diagonal elements of the metric tensor, assuming parameters are locally orthogonal. While this sacrifices some geometric accuracy, it reduces the computational cost to O(P) and can still provide significant convergence benefits over vanilla gradient descent. In hardware-efficient ansätze where gates within a layer commute or act on disjoint qubits, the block-diagonal approximation becomes exact, making it particularly suitable for NISQ-era circuits.
Regularization is essential when inverting the metric tensor, as the QFI can become singular or ill-conditioned near critical points in the optimization landscape. Adding a small multiple of the identity matrix, known as Tikhonov regularization, ensures numerical stability while minimally distorting the natural geometry. Adaptive regularization schemes that shrink the regularization strength as optimization progresses have been shown to balance stability and geometric fidelity effectively.
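As a plain-NumPy illustration with a hypothetical near-singular metric, a small Tikhonov term restores a usable condition number while barely perturbing the well-conditioned directions:

```python
import numpy as np

# Hypothetical metric tensor near a critical point: eigenvalues 0.4999 and 0.0001
g = np.array([[0.25, 0.2499],
              [0.2499, 0.25]])
grad = np.array([0.1, -0.05])

lam = 1e-2  # Tikhonov strength; shifts every eigenvalue up by lam
g_reg = g + lam * np.eye(2)

print(np.linalg.cond(g))      # ~5.0e3, nearly singular
print(np.linalg.cond(g_reg))  # ~50, comfortably invertible

# Solve the linear system rather than forming the inverse explicitly
nat_grad = np.linalg.solve(g_reg, grad)
print(nat_grad)
```

Solving the regularized system directly is both cheaper and numerically safer than computing the inverse matrix.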
Regularized Inverse
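With \lambda > 0 a small Tikhonov constant, the regularized inverse used in practice is:

```latex
\left( g(\theta) + \lambda I \right)^{-1}, \qquad \lambda > 0 \text{ small}
```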
Block-Diagonal Structure
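For an L-layer ansatz, the block-diagonal approximation keeps only the within-layer blocks:

```latex
g \;\approx\; \bigoplus_{l=1}^{L} g^{(l)} \;=\; \mathrm{diag}\!\left( g^{(1)}, \ldots, g^{(L)} \right)
```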
Parameter Update
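Combining the pieces (again with \eta the step size and \mathcal{L} the cost, notation mine), the regularized QNG parameter update is:

```latex
\theta_{t+1} = \theta_t - \eta \left( g(\theta_t) + \lambda I \right)^{-1} \nabla_\theta \mathcal{L}(\theta_t)
```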
Runnable implementations you can copy and experiment with.
Compute a single Quantum Natural Gradient update step using finite differences for the Fubini-Study metric tensor and the parameter-shift rule for gradients.
from qiskit import QuantumCircuit
from qiskit.circuit import Parameter
from qiskit.quantum_info import SparsePauliOp, Statevector
import numpy as np

n_qubits = 2
theta = Parameter("θ")
phi = Parameter("φ")

qc = QuantumCircuit(n_qubits)
qc.rx(theta, 0)
qc.ry(phi, 1)
qc.cx(0, 1)

obs = SparsePauliOp.from_list([("ZZ", 1.0)])

def get_state(params):
    bound = qc.assign_parameters({theta: params[0], phi: params[1]})
    return Statevector.from_label("0" * n_qubits).evolve(bound)

def expectation(params):
    return np.real(get_state(params).expectation_value(obs))

def gradient(params):
    # Parameter-shift rule: exact for gates generated by Pauli rotations
    shift = np.pi / 2
    grad = np.zeros(len(params))
    for i in range(len(params)):
        plus = params.copy(); plus[i] += shift
        minus = params.copy(); minus[i] -= shift
        grad[i] = (expectation(plus) - expectation(minus)) / 2
    return grad

def metric_tensor(params):
    # Forward differences: g_ij ≈ Re[(<ψ_i|ψ_j> - <ψ_i|ψ><ψ|ψ_j>) / eps^2],
    # the second term removing the global-phase contribution
    eps = 1e-4
    n = len(params)
    g = np.zeros((n, n))
    sv = get_state(params)
    for i in range(n):
        for j in range(n):
            p_i = params.copy(); p_i[i] += eps
            p_j = params.copy(); p_j[j] += eps
            sv_i = get_state(p_i)
            sv_j = get_state(p_j)
            o_i = np.vdot(sv.data, sv_i.data)
            o_j = np.vdot(sv.data, sv_j.data)
            o_ij = np.vdot(sv_i.data, sv_j.data)
            g[i, j] = np.real((o_ij - np.conj(o_i) * o_j) / eps**2)
    return g

params = np.array([0.5, 0.5])
grad = gradient(params)
g = metric_tensor(params)
reg = 0.01  # Tikhonov regularization before inversion
g_inv = np.linalg.inv(g + reg * np.eye(len(params)))
new_params = params - 0.01 * g_inv @ grad
print(f"Gradient: {grad}")
print(f"Metric tensor:\n{g}")
print(f"Updated params: {new_params}")

Use PennyLane's native QNGOptimizer to train a variational circuit with quantum natural gradient descent.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def circuit(params):
    qml.RX(params[0], wires=0)
    qml.RY(params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

opt = qml.QNGOptimizer(stepsize=0.01)
params = np.array([0.5, 0.5], requires_grad=True)
for i in range(100):
    params = opt.step(circuit, params)
print(f"Optimized params: {params}")
print(f"Final energy: {circuit(params)}")