Data encoding
Introduction and notation
To use a quantum algorithm, classical data must somehow be brought into a quantum circuit. This is usually referred to as data encoding, but is also called data loading. Recall from previous lessons the notion of a feature mapping, a mapping of data features from one space to another. Just transferring classical data to a quantum computer is a sort of mapping, and could be called a feature mapping. In practice, the built-in feature mappings in Qiskit (like `z_feature_map` and `zz_feature_map`) will typically include rotation layers and entangling layers that extend the state to many dimensions in the Hilbert space. This encoding process is a critical part of quantum machine learning algorithms and directly affects their computational capabilities.
Some of the encoding techniques below can be efficiently classically simulated; this is particularly easy to see in encoding methods that yield product states (that is, they do not entangle qubits). And remember that quantum utility is most likely to lie where the quantum-like complexity of the dataset is well-matched by the encoding method. So it is very likely that you will end up writing your own encoding circuits. Here, we show a wide variety of possible encoding strategies simply so that you can compare and contrast them, and see what is possible. There are some very general statements that can be made about the usefulness of encoding techniques. For example, efficient_su2 (see below) with a full entangling scheme is much more likely to capture quantum features of data than methods that yield product states (like z_feature_map). But this does not mean efficient_su2 is sufficient, or sufficiently well-matched to your dataset, to yield a quantum speed-up. That requires careful consideration of the structure of the data being modeled or classified. There is also a balancing act with circuit depth, since many feature maps which fully entangle the qubits in a circuit yield very deep circuits, too deep to get usable results on today's quantum computers.
Notation
A dataset is a set of $M$ data vectors: $\text{X} = \{\vec{x}^{(1)}, \dots, \vec{x}^{(j)}, \dots, \vec{x}^{(M)}\}$, where each vector is $N$ dimensional, that is, $\vec{x}^{(j)} = (x^{(j)}_1, \dots, x^{(j)}_N) \in \mathbb{R}^N$. This could be extended to complex data features. In this lesson, we may occasionally use these notations for the full set $\text{X}$ and its specific elements like $\vec{x}^{(j)}$. But we will mostly refer to the loading of a single vector from our dataset at a time, and will often simply refer to a single vector of features as $\vec{x}$.
Additionally, it is common to use the symbol $\Phi(\vec{x})$ to refer to the feature mapping of data vector $\vec{x}$. In quantum computing specifically, it is common to refer to mappings using a notation that reinforces the unitary nature of these operations, $U(\vec{x})$. One could correctly use the same symbol for both; both are feature mappings. Throughout this course, we tend to use:
- $\Phi(\vec{x})$ when discussing feature mappings in machine learning, generally, and
- $U(\vec{x})$ when discussing circuit implementations of feature mappings.
Normalization and information loss
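In classical machine learning, training data features are often "normalized" or rescaled, which often improves model performance. Two common ways of doing this are min-max normalization and standardization. In min-max normalization, feature columns of the data matrix (say, feature $k$) are normalized:
$$x'^{(i)}_k = \frac{x^{(i)}_k - \min\{x^{(j)}_k \mid \vec{x}^{(j)} \in \text{X}\}}{\max\{x^{(j)}_k \mid \vec{x}^{(j)} \in \text{X}\} - \min\{x^{(j)}_k \mid \vec{x}^{(j)} \in \text{X}\}}$$
where min and max refer to the minimum and maximum of feature $k$ over the data vectors $\vec{x}^{(j)}$ in the dataset $\text{X}$. All the feature values then fall in the unit interval: $x'^{(i)}_k \in [0, 1]$ for all data vectors $i$ and all features $k$.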
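The min-max rescaling described above can be sketched in a few lines of plain Python. This is a minimal illustration, not part of the lesson's own code: the function name `min_max_normalize` and the toy dataset are hypothetical, and mapping a constant feature to 0.0 is an assumption made here only to avoid dividing by zero.

```python
def min_max_normalize(dataset):
    """Rescale each feature column of `dataset` to the unit interval [0, 1]."""
    n_features = len(dataset[0])
    # Column-wise minimum and maximum over all data vectors
    mins = [min(vec[k] for vec in dataset) for k in range(n_features)]
    maxs = [max(vec[k] for vec in dataset) for k in range(n_features)]
    normalized = []
    for vec in dataset:
        row = []
        for k in range(n_features):
            span = maxs[k] - mins[k]
            # Assumption: a constant feature (zero span) is mapped to 0.0
            row.append((vec[k] - mins[k]) / span if span != 0 else 0.0)
        normalized.append(row)
    return normalized


# Hypothetical toy dataset: 3 data vectors with 2 features each
toy_data = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
print(min_max_normalize(toy_data))
```

Each column is rescaled independently, so the smallest value of every feature maps to 0 and the largest to 1.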
Normalization is also a fundamental concept in quantum mechanics and quantum computing, but it is slightly different from min-max normalization. Normalization in quantum mechanics requires that the length (in the context of quantum computing, the 2-norm) of a state vector $|\psi\rangle$ is equal to unity: $\langle\psi|\psi\rangle = 1$, ensuring that measurement probabilities sum to 1. The state is normalized by dividing by the 2-norm; that is, by rescaling
$$|\psi\rangle \rightarrow \frac{|\psi\rangle}{\sqrt{\langle\psi|\psi\rangle}}$$
In quantum computing and quantum mechanics, this is not a normalization imposed by people on the data, but a fundamental property of quantum states. Depending on your encoding scheme, this constraint may affect how your data are rescaled. For example, in amplitude encoding (see below), the data vector is normalized, as is required by quantum mechanics, and this affects the scaling of the data being encoded. In phase encoding, feature values are recommended to be rescaled as $x^{(i)}_k \in (0, 2\pi]$ so that there is no information loss due to the modulo-$2\pi$ effect of encoding to a qubit phase angle[1,2].
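Both points can be checked numerically. The sketch below is illustrative only: `normalize_state` is a hypothetical helper, and the amplitudes and the angle `theta` are made-up values. It divides a vector by its 2-norm so the probabilities sum to 1, and then shows that two angles differing by $2\pi$ give the same phase factor, which is why feature values outside a $2\pi$-wide interval cannot be told apart after phase encoding.

```python
import cmath
import math


def normalize_state(amplitudes):
    """Rescale a list of (possibly complex) amplitudes to unit 2-norm."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    if norm == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [a / norm for a in amplitudes]


# Hypothetical unnormalized two-qubit state vector (four amplitudes)
state = normalize_state([1.0, 1.0j, -1.0, 0.0])
total_probability = sum(abs(a) ** 2 for a in state)

# Two feature values that differ by 2*pi produce the same phase factor
theta = 0.5
same_phase = cmath.isclose(
    cmath.exp(1j * theta), cmath.exp(1j * (theta + 2 * math.pi)), abs_tol=1e-12
)

print(total_probability)  # 1 up to floating-point rounding
print(same_phase)
```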
Methods of encoding
In the next few sections, we will refer to a small example classical dataset consisting of data vectors, each with features:
In the notation introduced above, we might say the feature of the data vector in our set is for example.
Basis encoding
Basis encoding encodes a classical $P$-bit string into a computational basis state of a $P$-qubit system. Take for example the feature value $5$. This can be represented as a $4$-bit string as $(0101)$, and by a $4$-qubit system as the quantum state $|0101\rangle$. More generally, for a $P$-bit string: $(b_1, b_2, \dots, b_P)$, the corresponding $P$-qubit state is $|b_1 b_2 \dots b_P\rangle$ with $b_n \in \{0, 1\}$ for $n = 1, \dots, P$. Note that this is just for a single feature.
Basis encoding in quantum computing represents each classical bit as a separate qubit, mapping the binary representation of data directly onto quantum states in the computational basis. When multiple features need to be encoded, each feature is first converted to its binary form and then assigned to a distinct group of qubits — one group per feature — where each qubit reflects a bit in the binary representation of that feature.
As an example, let us encode the vector (5, 7, 0).
Suppose all features are stored in four bits (more than we need, but enough to represent any integer that is single-digit in base 10):
5 → binary 0101
7 → binary 0111
0 → binary 0000
These bit strings are assigned to three sets of four qubits, so the overall 12-qubit basis state is:
$$|0101\,0111\,0000\rangle$$
Here, the first four qubits represent the first feature, the next four qubits the second feature, and the last four qubits the third feature. The code below converts the data vector (5,7,0) to a quantum state, and is generalized to do so for other single-digit features.
from qiskit import QuantumCircuit
# Data point to encode
x = 5 # binary: 0101
y = 7 # binary: 0111
z = 0 # binary: 0000
# Convert each to 4-bit binary list
x_bits = [int(b) for b in format(x, "04b")] # [0,1,0,1]
y_bits = [int(b) for b in format(y, "04b")] # [0,1,1,1]
z_bits = [int(b) for b in format(z, "04b")] # [0,0,0,0]
# Combine all bits
all_bits = x_bits + y_bits + z_bits # [0,1,0,1,0,1,1,1,0,0,0,0]
# Initialize a 12-qubit quantum circuit
qc = QuantumCircuit(12)
# Apply x-gates where the bit is 1
for idx, bit in enumerate(all_bits):
    if bit == 1:
        qc.x(idx)
qc.draw("mpl")
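The bit bookkeeping in the code above can be factored into a small reusable helper. The following is a sketch in plain Python, not part of the lesson: the function name `basis_encoding_bits` and its `bits_per_feature` parameter are hypothetical. It returns the concatenated bit list and the positions that would receive an X gate, so it can be checked without building a circuit.

```python
def basis_encoding_bits(features, bits_per_feature=4):
    """Return the concatenated bit list for a vector of non-negative integers."""
    all_bits = []
    for value in features:
        if not 0 <= value < 2 ** bits_per_feature:
            raise ValueError(f"{value} does not fit in {bits_per_feature} bits")
        # Zero-padded binary string, most significant bit first
        all_bits.extend(int(b) for b in format(value, f"0{bits_per_feature}b"))
    return all_bits


bits = basis_encoding_bits([5, 7, 0])
# Positions where the bit is 1, i.e. where an X gate would be applied
x_gate_positions = [idx for idx, bit in enumerate(bits) if bit == 1]

print(bits)              # [0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0]
print(x_gate_positions)  # [1, 3, 5, 6, 7]
```

Each entry of `x_gate_positions` could then be passed to `qc.x(...)` as in the loop above. Keep in mind that Qiskit labels basis states with qubit 0 as the rightmost character, so a printed state label may appear reversed relative to this bit list.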

Check your understanding
Read the question below and think about your answer before reading the solution.
Write code to encode the first vector in our example dataset, $\vec{x}^{(1)} = (4, 8, 5)$, using basis encoding.
Answer:
import math
from qiskit import QuantumCircuit
# Data point to encode
x = 4 # binary: 0100
y = 8 # binary: 1000
z = 5 # binary: 0101
# Convert each to 4-bit binary list
x_bits = [int(b) for b in format(x, '04b')] # [0,1,0,0]
y_bits = [int(b) for b in format(y, '04b')] # [1,0,0,0]
z_bits = [int(b) for b in format(z, '04b')] # [0,1,0,1]
# Combine all bits
all_bits = x_bits + y_bits + z_bits # [0,1,0,0,1,0,0,0,0,1,0,1]
# Initialize a 12-qubit quantum circuit
qc = QuantumCircuit(12)
# Apply x-gates where the bit is 1
for idx, bit in enumerate(all_bits):
    if bit == 1:
        qc.x(idx)
qc.draw('mpl')
Amplitude encoding
Amplitude encoding encodes data into the amplitudes of a quantum state. It represents a normalized classical $N$-dimensional data vector, $\vec{x}$, as the amplitudes of an $n$-qubit quantum state, $|\psi_x\rangle$:
$$|\psi_x\rangle = \frac{1}{\alpha}\sum_{i=1}^{N} x_i |i\rangle$$
where $N$ is the same dimension of the data vectors as before, $x_i$ is the $i^\text{th}$ element of $\vec{x}$ and $|i\rangle$ is the $i^\text{th}$ computational basis state. Here, $\alpha$ is a normalization constant to be determined from the data being encoded. This is the normalization condition imposed by quantum mechanics:
$$\sum_{i=1}^{N} |x_i|^2 = |\alpha|^2$$
In general, this is a different condition than the min/max normalization used for each feature across all data vectors. Precisely how this is navigated will depend on your problem. But there is no way around the quantum mechanical normalization condition above.
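As a quick numerical check of the normalization condition, the sketch below computes the normalization constant (called `alpha` here) and the resulting amplitudes for a small vector. The helper name `amplitude_normalize` and the two-element input are hypothetical, chosen so the arithmetic is easy to follow.

```python
import math


def amplitude_normalize(vector):
    """Return (alpha, amplitudes), where alpha is the 2-norm of `vector`."""
    alpha = math.sqrt(sum(abs(x) ** 2 for x in vector))
    if alpha == 0:
        raise ValueError("the zero vector cannot be amplitude encoded")
    return alpha, [x / alpha for x in vector]


# Hypothetical two-feature vector: 3^2 + 4^2 = 25, so alpha = 5
alpha, amplitudes = amplitude_normalize([3.0, 4.0])
print(alpha)       # 5.0
print(amplitudes)  # [0.6, 0.8]
```

The squared amplitudes sum to 1, so they can be read as measurement probabilities. Note that only the ratios between features survive this rescaling; the overall magnitude of the vector is absorbed into `alpha`.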
In amplitude encoding, each feature in a data vector is stored as an amplitude of a different quantum state. As a system of $n$ qubits provides $2^n$ amplitudes, amplitude encoding of $N$ features requires $n \ge \log_2(N)$ qubits.
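The qubit count can be computed directly. In the sketch below, both function names are hypothetical, and padding the unused amplitudes with zeros is an assumption about how to handle a feature count that is not a power of two.

```python
def qubits_for_amplitude_encoding(num_features):
    """Smallest number of qubits n (at least 1) with 2**n >= num_features."""
    if num_features < 1:
        raise ValueError("need at least one feature")
    # (num_features - 1).bit_length() equals ceil(log2(num_features))
    return max(1, (num_features - 1).bit_length())


def pad_to_power_of_two(vector):
    """Assumption: fill the unused amplitudes with zeros."""
    n = qubits_for_amplitude_encoding(len(vector))
    return list(vector) + [0] * (2 ** n - len(vector))


print(qubits_for_amplitude_encoding(3))  # 2
print(pad_to_power_of_two([4, 8, 5]))    # [4, 8, 5, 0]
```

Compare this with basis encoding above, where three 4-bit features needed 12 qubits; here three features fit in two qubits, at the cost of storing them as amplitudes rather than as directly readable bits.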
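As an example, let's encode the first vector in our example dataset, $\vec{x}^{(1)} = (4, 8, 5)$, using amplitude encoding. Normalizing the resulting vector, we get:
$$4^2 + 8^2 + 5^2 = 105,$$
so the normalization constant is $\alpha = \sqrt{105}$.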