Quantum Embeddings for Machine Learning

Team 2, ENPH 353

Joshua Himmens and George Sleen

S. Lloyd, M. Schuld, A. Ijaz, J. Izaac, and N. Killoran, "Quantum embeddings for machine learning," arXiv preprint arXiv:2001.03622, 2020.

  • Quantum computers have extremely limited circuit depths
  • Variational quantum classifiers struggle to train embeddings
  • Embeddings can be trained to naturally distinguish classes
  • There are competing metrics to semantic embeddings

Background

Qubits
Quantum computers operate on qubits, the fundamental unit of quantum information
  • A qubit is a unit of quantum information similar to a classical bit
  • They exist as a superposition of the basis states \(|0\rangle\) and \(|1\rangle\), represented as a point on the surface of the Bloch sphere
  • Single-qubit operations are rotations of the state vector on the sphere
Bloch sphere
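
As a rough illustration of the Bloch sphere picture, the sketch below maps a single-qubit state to its (x, y, z) coordinates and applies a rotation about the y-axis. The helper names `bloch_vector` and `ry` are chosen here for illustration and do not come from the paper.

```python
import numpy as np

def bloch_vector(state):
    """Return the (x, y, z) Bloch coordinates of a normalized single-qubit state."""
    a, b = state
    x = 2 * np.real(np.conj(a) * b)        # expectation value of Pauli-X
    y = 2 * np.imag(np.conj(a) * b)        # expectation value of Pauli-Y
    z = np.abs(a) ** 2 - np.abs(b) ** 2    # expectation value of Pauli-Z
    return np.array([x, y, z])

def ry(theta):
    """Rotation by angle theta about the y-axis of the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

zero = np.array([1, 0], dtype=complex)   # |0>, the north pole
plus = ry(np.pi / 2) @ zero              # equal superposition of |0> and |1>
one = ry(np.pi) @ zero                   # |1>, the south pole

for name, state in [("|0>", zero), ("|+>", plus), ("|1>", one)]:
    print(name, np.round(bloch_vector(state), 3))
```

A quarter turn takes the north pole to the equator (an equal superposition) and a half turn takes it to the south pole.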
Qubits and Quantum Computers
Quantum computers operate on qubits, the fundamental unit of quantum information
  • Qubits are difficult to work with because they decohere (lose their quantum state) extremely quickly
  • Qubits can interfere with each other
Bloch sphere
Classification and Hyperplanes
Classification of data is done by creating hyperplanes in a high-dimensional Hilbert space
  • Each piece of data is embedded as a point in a Hilbert space
  • \(n\) qubits form a \(2^n\)-dimensional Hilbert space
  • Hyperplanes are used to classify the embedded data
Hyperplane classification
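
The sketch below illustrates the dimension counting and the hyperplane idea under simplifying assumptions: each feature is encoded by one RY rotation on its own qubit (an embedding chosen here for illustration, not the trained embedding from the paper), and the decision is the sign of an expectation value, which is a linear function of the density matrix \(|\psi\rangle\langle\psi|\). The names `embed` and `classify` and the observable `M` are hypothetical.

```python
import numpy as np
from functools import reduce

def embed(x):
    """Map n real features to an n-qubit product state (one RY rotation per qubit)."""
    qubits = [np.array([np.cos(xi / 2), np.sin(xi / 2)]) for xi in x]
    return reduce(np.kron, qubits)

def classify(state, observable):
    """Return +1 or -1 from the sign of <psi|M|psi> (real amplitudes assumed)."""
    value = state @ observable @ state
    return 1 if value >= 0 else -1

# Hypothetical observable: Pauli-Z on the first qubit, identity on the other two.
Z = np.diag([1.0, -1.0])
M = np.kron(Z, np.eye(4))

psi = embed([0.3, 1.2, 2.5])
print(len(psi))            # 3 qubits give a 2**3 = 8 dimensional state vector
print(classify(psi, M))
```

With this observable the expectation value reduces to the cosine of the first feature, so the decision boundary sits where that feature equals \(\pi/2\).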

Implementations

Approaches
Two complementary approaches to quantum machine learning

Quantum Metric Learning

  • Training is done to optimize the embedding
  • The classifier is analytically chosen after training
Metric learning

Variational Quantum Classifiers (VQCs)

  • Classifier is a parametrized variational quantum circuit
  • Similar to traditional machine learning
  • The embedding is only weakly trainable
NISQ Computing
Neural embedding networks are too large for NISQ computers
  • Noisy intermediate-scale quantum (NISQ) computers only permit limited circuit depth due to qubit decoherence
  • Networks cannot be scaled large enough to make VQCs alone effective
  • Qubit decoherence times are on the order of 1 ms
NISQ computing
Loss Functions for Metric Learning
The loss function in quantum metric learning is a distance metric
  • Metric learning aims to:
    • Maximize distance between unrelated data (Maximize trace distance)
    • Minimize distance between related data (Maximize fidelity)
Data overlap
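
A minimal sketch of such a loss, assuming a toy single-qubit embedding RY(theta * x)|0> with one trainable parameter `theta`, and a cost of the form 1 - D_HS/2 between the two class ensembles. The function names, the toy data, and the embedding are all illustrative assumptions.

```python
import numpy as np

def embed(x, theta):
    """Toy trainable embedding: RY(theta * x) applied to |0>."""
    half = theta * x / 2
    return np.array([np.cos(half), np.sin(half)])

def ensemble(xs, theta):
    """Density matrix of a class: the average of |psi><psi| over its samples."""
    states = [embed(x, theta) for x in xs]
    return sum(np.outer(s, s) for s in states) / len(states)

def hs_cost(xs_a, xs_b, theta):
    """Cost 1 - D_HS / 2: small when classes are tight and far apart."""
    diff = ensemble(xs_a, theta) - ensemble(xs_b, theta)
    return 1.0 - 0.5 * np.trace(diff @ diff)

class_a = np.array([-1.0, -0.9])   # made-up one-dimensional samples
class_b = np.array([1.0, 0.9])

for theta in (0.0, np.pi / 4, np.pi / 2):
    print(round(theta, 3), round(hs_cost(class_a, class_b, theta), 4))
```

At theta = 0 every sample maps to the same state and the cost is at its maximum of 1; near theta = pi/2 the two classes land close to orthogonal states and the cost approaches 0.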
Distance Metric Implementations
The Hilbert-Schmidt distance is easier to optimize than the trace distance and gives similar results
  • Trace distance \(D_{\text{tr}}\) is essential to quantum information
  • Hilbert-Schmidt distance \(D_{\text{HS}}\) is closely related and much easier to compute
  • Being much easier to compute makes it more practical to optimize
\[D_{\text{tr}}(\rho, \sigma) = \tfrac{1}{2} \text{Tr}\,|\rho - \sigma|, \qquad |A| = \sqrt{A^\dagger A}\]
\[D_{\text{HS}}(\rho, \sigma) = \text{Tr}[(\rho - \sigma)^2]\]
\[\tfrac{1}{2} D_{\text{HS}} \leq D_{\text{tr}}^2 \leq r \, D_{\text{HS}}\]
\[r = \frac{\text{rank}(\rho)\,\text{rank}(\sigma)}{\text{rank}(\rho) + \text{rank}(\sigma)}\]
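
The two distances and the bound relating them can be checked numerically. The sketch below takes the trace distance as half the sum of the absolute eigenvalues of \(\rho - \sigma\) (the trace norm) and draws random low-rank density matrices; `random_density_matrix` is a helper made up for this check.

```python
import numpy as np

def trace_distance(rho, sigma):
    """Half the trace norm of rho - sigma, via its (real) eigenvalues."""
    eigs = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * np.sum(np.abs(eigs))

def hs_distance(rho, sigma):
    """Hilbert-Schmidt distance Tr[(rho - sigma)^2]."""
    diff = rho - sigma
    return np.real(np.trace(diff @ diff))

def random_density_matrix(dim, rank, rng):
    """Random positive semidefinite matrix of the given rank with unit trace."""
    a = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    m = a @ a.conj().T
    return m / np.trace(m).real

rng = np.random.default_rng(0)
rank_rho, rank_sigma = 2, 3
rho = random_density_matrix(8, rank_rho, rng)
sigma = random_density_matrix(8, rank_sigma, rng)

r = rank_rho * rank_sigma / (rank_rho + rank_sigma)
d_tr, d_hs = trace_distance(rho, sigma), hs_distance(rho, sigma)
print(0.5 * d_hs <= d_tr ** 2 <= r * d_hs)
```

For two pure states both inequalities become equalities, which is one way to see why optimizing the Hilbert-Schmidt distance tracks the trace distance closely for low-rank ensembles.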
SWAP procedure
The efficacy of a quantum embedding is measured by performing a SWAP procedure to find the fidelity. Trace distance is measured directly.

Fidelity

  • How tightly data is clustered
  • \(F = |\langle\phi|\psi\rangle|^2\)

Trace distance

  • How far apart data clusters are
  • \(D_{\text{tr}}(\rho, \sigma) = \tfrac{1}{2} \text{Tr}\,|\rho - \sigma|\)
Fidelity and distance

The SWAP procedure (SWAP test) is a simple quantum circuit that uses an ancilla qubit to interfere the swapped and unswapped versions of two states. The probability of measuring the ancilla in \(|0\rangle\) is \((1 + F)/2\), which gives the fidelity \(F\) of the two states.
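
A small state-vector simulation of this circuit for two single-qubit states is sketched below (Hadamard on the ancilla, controlled-SWAP, Hadamard again, then read the ancilla). The function name `swap_test_fidelity` and the qubit ordering (ancilla first) are choices made for this example.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)

def swap_test_fidelity(phi, psi):
    """Simulate the SWAP test on single-qubit states phi and psi and return F."""
    ancilla = np.array([1, 0], dtype=complex)
    state = np.kron(ancilla, np.kron(phi, psi))      # ancilla is the first qubit
    h_on_ancilla = np.kron(H, np.eye(4))
    # Controlled-SWAP: identity when the ancilla is |0>, SWAP when it is |1>.
    cswap = np.block([[np.eye(4), np.zeros((4, 4))],
                      [np.zeros((4, 4)), SWAP]])
    state = h_on_ancilla @ cswap @ h_on_ancilla @ state
    p0 = np.sum(np.abs(state[:4]) ** 2)              # probability ancilla reads 0
    return 2 * p0 - 1                                # invert P(0) = (1 + F) / 2

zero = np.array([1, 0], dtype=complex)
one = np.array([0, 1], dtype=complex)
plus = (zero + one) / np.sqrt(2)

print(round(swap_test_fidelity(zero, plus), 6))
print(round(np.abs(np.vdot(zero, plus)) ** 2, 6))    # direct overlap for comparison
```

On hardware the probability would be estimated from repeated measurements of the ancilla rather than read off the state vector.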

Questions!

  • Quantum computers have extremely limited circuit depths
  • Variational quantum classifiers struggle to train embeddings
  • Embeddings can be trained to naturally distinguish classes
  • There are competing metrics to semantic embeddings