Krylov quantum diagonalization
In this lesson on Krylov quantum diagonalization (KQD) we will answer the following:
- What is the Krylov method, generally?
- Why does the Krylov method work and under what conditions?
- How does quantum computing play a role?
The quantum part of the calculation is based largely on work in Ref [1].
The video below gives an overview of Krylov methods in classical computing, motivates their use, and explains how quantum computing can play a role in that workstream. The subsequent text offers more detail and implements a Krylov method both classically, and using a quantum computer.
1. Introduction to Krylov methods
A Krylov subspace method can refer to any of several methods built around what is called the Krylov subspace. A complete review of these is beyond the scope of this lesson, but Refs [2-4] all give substantially more background. Here, we will focus on what a Krylov subspace is, how and why it is useful in solving eigenvalue problems, and finally how it can be implemented on a quantum computer.

Definition: Given a symmetric, positive semi-definite matrix $A$, the Krylov space of order $r$ is the space spanned by the vectors obtained by multiplying increasing powers of $A$, up to $A^{r-1}$, with a reference vector $v_0$:

$$\mathcal{K}_r(A, v_0) = \operatorname{span}\{v_0,\; A v_0,\; A^2 v_0,\; \dots,\; A^{r-1} v_0\}.$$
Although the vectors above span what we call a Krylov subspace, there is no reason to think that they will be orthogonal. One often uses an iterative orthonormalizing process similar to Gram-Schmidt orthogonalization. Here the process is slightly different, since each new vector is made orthogonal to the others as it is generated; in this context the procedure is called Arnoldi iteration. Starting with the normalized initial vector $v_0$, one generates the next vector $A v_0$, and then ensures that this second vector is orthogonal to the first by subtracting off its projection onto $v_0$. That is

$$v_1' = A v_0 - (v_0^T A v_0)\, v_0, \qquad v_1 = \frac{v_1'}{\lVert v_1' \rVert}.$$

It is now easy to see that $v_0^T v_1 = 0$, since

$$v_0^T v_1' = v_0^T A v_0 - (v_0^T A v_0)(v_0^T v_0) = 0.$$

We do the same for the next vector, ensuring it is orthogonal to both of the previous two:

$$v_2' = A v_1 - (v_0^T A v_1)\, v_0 - (v_1^T A v_1)\, v_1, \qquad v_2 = \frac{v_2'}{\lVert v_2' \rVert}.$$
If we repeat this process for all vectors, we have a complete orthonormal basis for a Krylov space. Note that the orthogonalization process will yield the zero vector once $r$ exceeds the dimension $N$ of the matrix, since $N$ orthogonal vectors necessarily span the full space. The process will also yield zero if any vector is an eigenvector of $A$, since all subsequent vectors would be multiples of that vector.
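Before working an example, the iteration just described can be sketched in a few lines of NumPy. This is our own illustrative sketch (the function name and the test vector are our choices, not part of the lesson's later code):

```python
import numpy as np

def krylov_basis(A, v0, r):
    """Build an orthonormal basis of the order-r Krylov space K_r(A, v0)
    by Arnoldi-style iteration: multiply by A, then orthogonalize
    against all previous vectors and normalize."""
    v = v0 / np.linalg.norm(v0)
    basis = [v]
    for _ in range(r - 1):
        w = A @ basis[-1]
        for u in basis:                 # subtract projections onto earlier vectors
            w = w - (u @ w) * u
        norm = np.linalg.norm(w)
        if norm < 1e-12:                # no new direction: basis is complete,
            break                       # or v0 was an eigenvector of A
        basis.append(w / norm)
    return np.array(basis)

A = np.array([[4.0, -1, 0], [-1, 4, -1], [0, -1, 4]])
V = krylov_basis(A, np.array([1.0, 2, 3]), 3)
print(np.round(V @ V.T, 8))             # identity: the basis is orthonormal
```

Each new vector is $A$ times the previous one, orthogonalized against all earlier vectors; the early exit triggers exactly in the two zero-yielding cases described above.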
1.1 A simple example: Krylov by hand
Let us step through the generation of a Krylov subspace on a trivially small matrix, so that we can see the process. We start with a matrix of interest to us:

$$A = \begin{pmatrix} 4 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 4 \end{pmatrix}$$
For this small example, we can determine the eigenvectors and eigenvalues easily even by hand. We show the numerical solution here.
# Required packages for this lesson
!pip install -q matplotlib numpy qiskit qiskit-ibm-runtime scipy sympy
# One might use linalg.eigh here, but later matrices may not be Hermitian. So we use linalg.eig in this lesson.
import numpy as np
A = np.array([[4, -1, 0], [-1, 4, -1], [0, -1, 4]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print("The eigenvalues are ", eigenvalues)
print("The eigenvectors are ", eigenvectors)
The eigenvalues are [2.58578644 4. 5.41421356]
The eigenvectors are [[ 5.00000000e-01 -7.07106781e-01 5.00000000e-01]
[ 7.07106781e-01 1.37464400e-16 -7.07106781e-01]
[ 5.00000000e-01 7.07106781e-01 5.00000000e-01]]
We record them here for later comparison:

$$\lambda_1 = 4 - \sqrt{2} \approx 2.586, \qquad \lambda_2 = 4, \qquad \lambda_3 = 4 + \sqrt{2} \approx 5.414.$$
We would like to study how this process works (or fails) as we increase the dimension $r$ of our Krylov subspace. To this end, we will apply this process:
- Generate a subspace of the full vector space starting with a randomly-chosen vector (call it $v_0$ once it is normalized, as above).
- Project the full matrix $A$ onto that subspace, and find the eigenvalues of that projected matrix.
- Increase the size of the subspace by generating more vectors, ensuring that they are orthonormal, using a process similar to Gram-Schmidt orthogonalization.
- Project $A$ onto the larger subspace and find the eigenvalues of the resulting projected matrix.
- Repeat this until the eigenvalues converge (or, in this toy case, until you have generated vectors spanning the full vector space of the original matrix $A$).
A normal implementation of the Krylov method would not solve the eigenvalue problem for the projected matrix at every subspace dimension as the subspace is built. One would construct the subspace of the desired dimension, project the matrix onto that subspace, and diagonalize the projected matrix once. Projecting and diagonalizing at each subspace dimension is done here only to check convergence.
Dimension $r = 1$:

We choose a random vector, say

$$v_0 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.$$

If it is not already normalized, normalize it. We now project our matrix onto the subspace of this one vector:

$$v_0^T A v_0 = 4.$$

This is our projection of the matrix onto our Krylov subspace when it contains just the single vector $v_0$. The eigenvalue of this $1 \times 1$ matrix is trivially 4. We can think of this as our zeroth-order estimate of the eigenvalues (in this case just one) of $A$. Although it is a poor estimate, it is the correct order of magnitude.
Dimension $r = 2$:

We now generate the next vector in our subspace by operating with $A$ on the previous vector:

$$A v_0 = \begin{pmatrix} 4 \\ -1 \\ 0 \end{pmatrix}.$$

Now we subtract off the projection of this vector onto our previous vector to ensure orthogonality:

$$A v_0 - (v_0^T A v_0)\, v_0 = \begin{pmatrix} 4 \\ -1 \\ 0 \end{pmatrix} - 4 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}.$$

If it is not already normalized, normalize it. In this case, the vector was already normalized, so

$$v_1 = \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}.$$

We now project our matrix $A$ onto the subspace of these two vectors:

$$\begin{pmatrix} v_0^T A v_0 & v_0^T A v_1 \\ v_1^T A v_0 & v_1^T A v_1 \end{pmatrix} = \begin{pmatrix} 4 & 1 \\ 1 & 4 \end{pmatrix}.$$

We are still left with the problem of determining the eigenvalues of this matrix. But this matrix is slightly smaller than the full matrix. In problems involving very large matrices, working with this smaller subspace may be highly advantageous. The eigenvalues of this $2 \times 2$ matrix are 3 and 5. Although this is still not a good estimate, it is better than the zeroth-order estimate. We will carry this out for one more iteration, to ensure the process is clear. Doing so does, however, undercut the point of the method, since we will end up diagonalizing a 3x3 matrix in the next iteration, meaning we have not saved time or computational power.
Dimension $r = 3$:

We now generate the next vector in our subspace by operating with $A$ on the previous vector:

$$A v_1 = \begin{pmatrix} 1 \\ -4 \\ 1 \end{pmatrix}.$$

Now we subtract off the projections of this vector onto both of our previous vectors to ensure orthogonality:

$$A v_1 - (v_0^T A v_1)\, v_0 - (v_1^T A v_1)\, v_1 = \begin{pmatrix} 1 \\ -4 \\ 1 \end{pmatrix} - 1 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} - 4 \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$

If it is not already normalized, normalize it. In this case, the vector was already normalized, so

$$v_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$

We now project our matrix onto the subspace of these three vectors:

$$\begin{pmatrix} 4 & 1 & 0 \\ 1 & 4 & 1 \\ 0 & 1 & 4 \end{pmatrix}.$$

We now determine the eigenvalues:

$$\lambda = 4 - \sqrt{2}, \quad 4, \quad 4 + \sqrt{2}.$$

These eigenvalues are exactly the eigenvalues of the original matrix $A$. This must be the case, since we have expanded our Krylov subspace to span the entire vector space of the original matrix $A$.
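As a numerical check of the worked example, the sketch below builds the subspace step by step and prints the eigenvalue estimates at each dimension. Here we take the initial vector to be $(1, 0, 0)^T$; a different starting vector would give different intermediate matrices but the same final eigenvalues:

```python
import numpy as np

A = np.array([[4.0, -1, 0], [-1, 4, -1], [0, -1, 4]])
v0 = np.array([1.0, 0, 0])           # assumed initial vector for this check

basis = [v0 / np.linalg.norm(v0)]
for _ in range(2):
    w = A @ basis[-1]
    for u in basis:                  # orthogonalize against previous vectors
        w = w - (u @ w) * u
    basis.append(w / np.linalg.norm(w))

V = np.array(basis)
for r in (1, 2, 3):                  # project A onto the first r basis vectors
    Ar = V[:r] @ A @ V[:r].T
    print(r, np.sort(np.linalg.eigvals(Ar).real))
```

The estimates progress from {4} at dimension 1, to {3, 5} at dimension 2, to the exact spectrum {4 − √2, 4, 4 + √2} at dimension 3.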
In this small example, the Krylov method may not appear any easier than direct diagonalization. Indeed, as we will see in later sections, the Krylov method is only advantageous above a certain matrix dimension; it is intended to help us solve eigenvalue/eigenvector problems for extremely large matrices.

This is the only example we will show worked “by hand”, but section 2 below shows computational examples.
Clarification of terms
A common misconception is that there is just a single Krylov subspace for a given problem. But of course, since there are many initial vectors to which our matrix could be applied, there are many possible Krylov subspaces. We will only use the phrase "the Krylov subspace" to refer to a specific Krylov subspace already defined for a specific example. For general problem-solving approaches we will refer to "a Krylov subspace". A final clarification is that it is valid to refer to a "Krylov space". One often sees it called a "Krylov subspace" because of its use in the context of projecting matrices from an initial space into a subspace. In keeping with that context, we will mostly refer to it as a subspace here.
Check your understanding
Read the questions below, think about your answer, then click the triangle to reveal each solution.
Explain why it is (a) not useful, and (b) not possible, to extend the dimension of the Krylov subspace beyond the dimension of the matrix of interest.
Answer:
(a) Since we are orthonormalizing the vectors as we produce them, a set of $N$ such vectors (where $N$ is the dimension of the matrix) forms a complete basis, meaning a linear combination of them can be used to create any vector in the space; additional vectors add nothing. (b) The orthogonalization process consists of subtracting off the projection of a new vector onto all previous vectors. If the previous vectors span the full vector space, then subtracting off projections onto the full space will always leave us with a zero vector.
Suppose a fellow researcher is demonstrating the Krylov method applied to a small toy matrix for some colleagues. This fellow researcher plans to use the matrix and initial vector:
and
Is there something wrong with this? How would you advise your colleague?
Answer:
Your colleague has accidentally chosen an eigenvector for his/her initial vector. Acting with the matrix on the initial vector will simply return the same vector back, scaled by the eigenvalue. This will not generate a subspace of increasing dimension. Advise your colleague to select a different initial vector, making sure it is not an eigenvector.
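To see the failure mode concretely, here is a minimal sketch using a hypothetical 2x2 matrix and one of its eigenvectors (our own toy example, not necessarily the matrix from the exercise):

```python
import numpy as np

A = np.array([[2.0, 1], [1, 2]])       # hypothetical example matrix
v = np.array([1.0, 1]) / np.sqrt(2)    # an eigenvector of A (eigenvalue 3)

w = A @ v                              # acting with A merely rescales v
w = w - (v @ w) * v                    # subtract the projection onto v
print(np.linalg.norm(w))               # ~0: no new direction is generated
```

The orthogonalization step leaves nothing behind, so the subspace never grows beyond dimension 1.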
Apply the Krylov method to the matrix

$$A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix},$$

selecting an appropriate new initial vector. Write down the estimates of the minimum eigenvalue at the 0th and 1st order of your Krylov subspace.
Answer:
There are many possible answers depending on the choice of initial vector. We will choose:

$$v_0 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

To get $v_1$ we apply $A$ once to $v_0$, and then make the result orthogonal to $v_0$ and normalize it:

$$v_1 = \frac{1}{\sqrt{6}} \begin{pmatrix} -1 \\ 2 \\ -1 \end{pmatrix}.$$

At 0th order, the projection onto our Krylov subspace is

$$v_0^T A v_0 = \frac{7}{3} \approx 2.333.$$

At 1st order, the projection onto this Krylov subspace is the $2 \times 2$ matrix computed below.
This can be done by hand, but is most easily done using numpy:
import numpy as np
vstar = np.array([[1/np.sqrt(3), 1/np.sqrt(3), 1/np.sqrt(3)], [-1/np.sqrt(6), np.sqrt(2/3), -1/np.sqrt(6)]])
A = np.array([[1, 1, 0],
[1, 1, 1],
[0, 1, 1]])
v = np.array([[1/np.sqrt(3),-1/np.sqrt(6)],[1/np.sqrt(3),np.sqrt(2/3)],[1/np.sqrt(3),-1/np.sqrt(6)]])
proj = vstar@A@v
print(proj)
eigenvalues, eigenvectors = np.linalg.eig(proj)
print("The eigenvalues are ", eigenvalues)
print("The eigenvectors are ", eigenvectors)
outputs:
[[ 2.33333333 0.47140452]
[ 0.47140452 -0.33333333]]
The eigenvalues are [ 2.41421356 -0.41421356]
The eigenvectors are [[ 0.98559856 -0.16910198]
[ 0.16910198 0.98559856]]
The minimum eigenvalue estimate is -0.414.
1.2 Types of Krylov methods
"Krylov subspace methods" can refer to any of several iterative techniques used to solve large linear systems and eigenvalue problems. What they all have in common is that they construct an approximate solution from a Krylov subspace
where is the initial guess (see Ref [5]). They differ in how they choose the best approximation from this subspace, balancing factors such as convergence rate, memory usage, and overall computational cost. The focus of this lesson is to leverage quantum computing in the context of Krylov subspace methods; an exhaustive discussion of these methods is beyond its scope. The brief definitions below are for context only and include some references for investigating these methods further.
The conjugate gradient (CG) method: This method is used for solving symmetric, positive definite linear systems [6]. It minimizes the $A$-norm of the error at each iteration, making it particularly effective for systems arising from discretized elliptic PDEs [7]. We will use this approach in the next section to motivate why a Krylov subspace would be an effective subspace in which to probe for improved solutions to linear systems.
The generalized minimal residual (GMRES) method: This is designed for solving general nonsymmetric linear systems. It minimizes the residual norm over a Krylov space at each iteration, making it robust but potentially memory-intensive for large systems [7].
The minimal residual (MINRES) method: This method is used for solving symmetric indefinite linear systems. It is similar to GMRES but takes advantage of the matrix symmetry to reduce computational cost [8].
Other approaches of note include the full orthogonalization method (FOM), which is closely related to Arnoldi's method for eigenvalue problems, the bi-conjugate gradient (BiCG) method, and the induced dimension reduction (IDR) method.
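For a concrete taste of one of these methods, SciPy ships a conjugate gradient solver. The sketch below (with our own toy symmetric, positive definite system, not one from this lesson) solves $Ax = b$ with it:

```python
import numpy as np
from scipy.sparse.linalg import cg

# A small symmetric, positive definite system A x = b (toy example)
A = np.array([[4.0, -1, 0], [-1, 4, -1], [0, -1, 4]])
b = np.array([1.0, 2, 3])

x, info = cg(A, b)                      # info == 0 indicates convergence
print(info, np.linalg.norm(A @ x - b))  # residual norm should be small
```

Internally, CG selects its iterate from exactly the Krylov subspaces discussed above, which is why it converges in at most $N$ iterations in exact arithmetic.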
1.3 Why the Krylov subspace method works
Here we will motivate, through the lens of steepest descent, why the Krylov subspace method is an efficient way to approximate matrix eigenvalues via iterative refinement of approximate eigenvectors. We will argue that, given an initial guess of a ground state, the space of successive corrections to that initial guess that yields the fastest convergence is a Krylov subspace. We stop short of a rigorous proof of convergence behavior.
Assume our matrix of interest $A$ is symmetric and positive definite. This makes our argument most relevant to the CG method above. We make no assumptions about sparsity here; nor are we claiming that $A$ must be Hermitian (which it would need to be if it were a Hamiltonian).
We typically wish to solve a problem of the form

$$A x = b.$$

One might imagine that $b = \lambda x$, where $\lambda$ is some constant, as in an eigenvalue problem. But our problem statement remains more general for now.
We start with a vector $x_0$ that is an approximate solution. Although there are parallels between this guess and $v_0$ in Section 1.1, we are not leveraging these here. Our guess has some error, which we call

$$e_0 = x_* - x_0,$$

where $x_*$ is the exact solution. We also define the residual

$$R_0 = b - A x_0 = A e_0.$$

Here we use capital $R$ to distinguish the residual from the dimension $r$ of our Krylov subspace.

We now want to make a correction step of the form

$$x_1 = x_0 + v,$$

which we hope improves our approximation. Here $v$ is some vector yet to be determined. Let $e_1 = x_* - x_1$ be the error after the correction is made. Then

$$e_1 = e_0 - v.$$

We are interested in how our error behaves when transformed by our matrix. So let us calculate the $A$-norm of the error. That is

$$\lVert e_1 \rVert_A^2 = e_1^T A e_1 = (e_0 - v)^T A (e_0 - v) = c - 2 v^T R_0 + v^T A v,$$

where we have used the symmetry of $A$ and also that $A e_0 = R_0$. Here $c = e_0^T A e_0$ is some constant independent of $v$. As mentioned in Section 1.2, the $A$-norm of the error is not the only quantity we might choose to minimize, but it is a good one. We want to see how this quantity varies with our choice of correction vector $v$. So we define the function $f$ by setting

$$f(v) = v^T A v - 2 v^T R_0.$$

Up to the constant $c$, $f$ is just the error as a function of the correction, measured in the $A$-norm. Hence, we want to choose $v$ such that $f(v)$ is as small as possible. For this purpose, we compute the gradient of $f$. Using the symmetry of $A$, we have

$$\nabla f(v) = 2 A v - 2 R_0.$$

The gradient points in the direction of steepest ascent, meaning its opposite gives us the direction in which the function decreases the most: the direction of steepest descent. At our initial guess $x_0$, where $v = 0$, we have that $-\nabla f(0) = 2 R_0$. Thus, the function decreases the most in the direction of the residual $R_0$. So our initial choice would benefit most by the addition of the vector $\alpha_0 R_0$ for some scalar $\alpha_0$.
In the next step, we choose, again, a vector and add it to the current approximation. Using the same argument as before, we choose $\alpha_1 R_1$ for some scalar $\alpha_1$, where $R_1 = b - A x_1$. We continue in this manner, such that the iteration of our vector is

$$x_{j+1} = x_j + \alpha_j R_j.$$

Equivalently, we want to build up the space from which we choose our improved estimates by adding $R_0$, $R_1$, $R_2$, and so on, in order. The estimated vector after $r$ steps lies in

$$x_0 + \operatorname{span}\{R_0, R_1, \dots, R_{r-1}\}.$$

Now, using the relation that

$$R_{j+1} = b - A x_{j+1} = R_j - \alpha_j A R_j,$$

we see that

$$\operatorname{span}\{R_0, R_1, \dots, R_{r-1}\} = \operatorname{span}\{R_0, A R_0, \dots, A^{r-1} R_0\} = \mathcal{K}_r(A, R_0).$$

That is, the space we build up that most efficiently approximates the correct solution is exactly the space built up by successive operation of the matrix $A$ on $R_0$. A Krylov subspace is the space spanned by the vectors of successive directions of steepest descent.
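The conclusion above is easy to check numerically: run a few steepest-descent steps (with the standard exact line search $\alpha_j = R_j^T R_j / R_j^T A R_j$, our choice for this sketch) and verify that the residuals lie in the Krylov space generated by the first residual:

```python
import numpy as np

# Symmetric, positive definite test matrix and right-hand side (our own choices)
A = np.array([[4.0, -1, 0, 0], [-1, 4, -1, 0], [0, -1, 4, -1], [0, 0, -1, 4]])
b = np.array([1.0, 2, 3, 4])

x = np.zeros(4)
residuals = []
for _ in range(3):
    R = b - A @ x                    # residual = direction of steepest descent
    residuals.append(R)
    alpha = (R @ R) / (R @ A @ R)    # exact line search along R
    x = x + alpha * R

# The residuals span the same space as {R0, A R0, A^2 R0}
R0 = residuals[0]
K = np.column_stack([R0, A @ R0, A @ A @ R0])
res = np.column_stack(residuals)
print(np.linalg.matrix_rank(K), np.linalg.matrix_rank(np.column_stack([K, res])))
```

The two printed ranks agree: appending the residuals to the Krylov vectors adds no new directions.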
Finally we reiterate that we have made no numerical claims about the scaling of this approach, nor have we discussed the comparative benefit for sparse matrices. This is only meant to motivate the use of Krylov subspace methods, and add some intuitive sense for them. We will now explore the behavior of these methods numerically.
Check your understanding
Read the question below, think about your answer, then click the triangle to reveal the solution.
In the workflow above, we proposed minimizing the $A$-norm of the error. What other quantities might one consider minimizing in seeking the ground state and its eigenvalue?
Answer:
One could imagine minimizing the norm of the residual vector instead of the $A$-norm of the error. There might also be cases in which considering the error vector itself is useful.
2. Krylov methods in classical computation
In this section we implement Arnoldi iterations computationally so that we may leverage a Krylov subspace in solving eigenvalue problems. We will apply this first to a small-scale example, then examine how computation time scales as the size of the matrix of interest increases. A key idea here will be that the generation of the vectors spanning the Krylov space will be a large contributor to total computing time required. The memory required will vary between specific Krylov methods. But memory constraints can limit the use of traditional Krylov methods.
2.1 Simple small-scale example
In the process of creating a Krylov subspace, we will need to orthonormalize the vectors in our subspace. Let us define a function that takes an established vector from our subspace, vknown (not assumed to be normalized), and a candidate vector to add to our subspace, vnext, and makes vnext orthogonal to vknown. Let us further define a function that steps through this process for all established vectors in our Krylov subspace and normalizes the result, to ensure a fully orthonormal set.
# vknown is some established vector in our subspace. vnext is one we wish to add, which must be orthogonal to vknown.
def orthog_pair(vknown, vnext):
    vknown = vknown / np.sqrt(vknown.T @ vknown)
    diffvec = vknown.T @ vnext * vknown
    vnext = vnext - diffvec
    return vnext


# v is the candidate vector to be added to our subspace. s is the existing subspace.
def orthoset(v, s):
    v = v / np.sqrt(v.T @ v)
    temp = v
    for i in range(len(s)):
        temp = orthog_pair(s[i], temp)
    v = temp / np.sqrt(temp.T @ temp)
    return v
Now let us define a function that builds an iteratively larger and larger Krylov subspace, until the space of Krylov vectors spans the full space of the original matrix. This will enable us to see how well the eigenvalues obtained using our Krylov subspace method match the exact values, as a function of Krylov subspace dimension. Importantly, our function krylov_full_build returns the Krylov vectors, the projected Hamiltonians, the eigenvalues, and the time required.
# Necessary imports and definitions to track time in microseconds
import time
def time_mus():
    return int(time.time() * 1000000)
# This function constructs a Krylov subspace that spans the whole space of the original matrix.
# Input:
# v0 : initial vector
# matrix : original matrix to be diagonalized
# Output:
# ks : Krylov vectors
# Hs : projected Hamiltonians
# eigs : eigenvalues
# k_tot_times : time required for the operation
def krylov_full_build(v0, matrix):
    t0 = time_mus()
    b = v0 / np.sqrt(v0 @ v0.T)
    A = matrix
    ks = []
    ks.append(b)
    Hs = []
    eigs = []
    Hs.append(b.T @ A @ b)
    eigs.append(np.array([b.T @ A @ b]))
    k_tot_times = []
    for j in range(len(A) - 1):
        vec = A @ ks[j].T
        ortho = orthoset(vec, ks)
        ks.append(ortho)
        ksarray = np.array(ks)
        Hs.append(ksarray @ A @ ksarray.T)
        eigs.append(np.linalg.eig(Hs[j + 1]).eigenvalues)
        k_tot_times.append(time_mus() - t0)
    # Return the Krylov vectors, the projected Hamiltonians, the eigenvalues, and the cumulative times required.
    return (ks, Hs, eigs, k_tot_times)
We will test this on a matrix that is still quite small, but larger than we might want to do by hand.
# Define our small test matrix
test_matrix = np.array(
[
[4, -1, 0, 1, 0],
[-1, 4, -1, 2, 1],
[0, -1, 4, 3, 3],
[1, 2, 3, 4, 0],
[0, 1, 3, 0, 4],
]
)
# Give the test matrix and an initial guess as arguments in the function defined above. Calculate outputs.
test_ks, test_Hs, test_eigs, test_k_tot_times = krylov_full_build(
np.array([0.5, 0.5, 0, 0.5, 0.5]), test_matrix
)
We can check our functions by ensuring that in the last step (when the Krylov space is the full vector space of the original matrix) the eigenvalues from the Krylov method exactly match those of the exact numerical diagonalization:
print(np.linalg.eig(test_matrix).eigenvalues)
print(test_eigs[len(test_matrix) - 1])
[-1.36956923 8.43756009 2.9040308 5.34436028 4.68361806]
[-1.36956923 8.43756009 2.9040308 4.68361806 5.34436028]
That was successful. Of course, what really matters is how good our approximation is as a function of the dimension of our Krylov subspace. Because we are often concerned with finding ground states and other minimum eigenvalues (and for other more algebraic reasons explained below), let's look at the error in our estimate of the lowest eigenvalue as a function of Krylov subspace dimension:
def errors(matrix, krylov_eigs):
    targ_min = min(np.linalg.eig(matrix).eigenvalues)
    err = []
    for i in range(len(matrix)):
        err.append(min(krylov_eigs[i]) - targ_min)
    return err
import matplotlib.pyplot as plt
krylov_error = errors(test_matrix, test_eigs)
plt.plot(krylov_error)
plt.axhline(y=0, color="red", linestyle="--") # Add dashed red line at y=0
plt.xlabel("Order of Krylov subspace") # Add x-axis label
plt.ylabel("Error in minimum eigenvalue") # Add y-axis label
plt.show()
We see that the minimum eigenvalue is approached fairly accurately well before the subspace spans the full space, and the agreement is exact once the Krylov subspace dimension reaches 5, the full dimension of the matrix.
2.2 Time scaling with matrix dimension
Let us convince ourselves that the Krylov method can be advantageous over exact numerical eigensolvers in the following way:
- Construct random matrices of increasing dimension (not sparse, so not the ideal application for KQD).
- Determine eigenvalues using two methods: directly using NumPy, and using a Krylov subspace.
- Choose a cutoff for how precise the Krylov eigenvalue estimates must be before we accept them.
- Compare the wall time required to solve in these two ways.
Caveats: As we will discuss in detail below, Krylov quantum diagonalization is best applied to operators whose matrix representations are sparse and/or can be written using a small number of groups of commuting Pauli operators. The random matrices we are using here do not fit that description. These are only useful in probing the scale at which classical Krylov methods might be useful. Secondly, in using the Krylov method we will calculate eigenvalues using many different-sized Krylov subspaces. We will report the time required for the minimum-dimension Krylov subspace that achieves our required accuracy for the ground state eigenvalue. Again, this is a bit different from solving a problem that is intractable for exact eigensolvers, since we are using the exact solution to assess the dimension needed.
We begin by generating our set of random matrices.
import numpy as np
# Set the random seed
np.random.seed(42)
# how many random matrices will we make
num_matrix = 200
matrices = []
for m in range(1, num_matrix):
    matrices.append(np.random.rand(m, m))
We now diagonalize each matrix directly, using numpy. We calculate the time required for diagonalization for later comparison.
matrix_numpy_times = []
matrix_numpy_eigs = []
for mm in range(num_matrix - 1):
    t0 = time_mus()
    matrix_numpy_eigs.append(min(np.linalg.eig(matrices[mm]).eigenvalues))
    matrix_numpy_times.append(time_mus() - t0)
plt.plot(matrix_numpy_times)
plt.xlabel("Dimension of matrix") # Add x-axis label
plt.ylabel("Time to diagonalize (microsec)") # Add y-axis label
plt.show()
Note that in the image above, the anomalously high time around a dimension of 125 may be due to the random nature of the matrices or due to implementation on the classical processor used, but it is not reproducible. Re-running the code will yield a different profile with different anomalous peaks.
Now for each matrix we will build up a Krylov subspace and calculate eigenvalues in steps. At each step, we will check to see if the lowest eigenvalue has been obtained to within our specified absolute error. The subspace that first gives us eigenvalues within our specified error is the subspace for which we will record computation times. Executing this cell may take several minutes, depending on processor speed. Feel free to skip evaluation or reduce the maximum dimension of matrices diagonalized. Looking at the pre-calculated results is sufficient.
# Choose the relative error you can tolerate, and make a list for tracking the Krylov subspace size at which that error is achieved.
abserr = 0.05
accept_subspace_size = []
# Lists to store the total time spent on the Krylov method, and the matrix index at which each time was recorded.
matrix_krylov_tot_times = []
matrix_krylov_dim = []
# Step through all our random matrices
for mm in range(0, num_matrix - 1):
    test_ks, test_Hs, test_eigs, test_k_tot_times = krylov_full_build(
        np.ones(len(matrices[mm])), matrices[mm]
    )
    # We have not yet found a Krylov subspace that produces our minimum eigenvalue to within the required error.
    found = 0
    for j in range(0, len(matrices[mm]) - 1):
        # If we still haven't found the desired subspace...
        if found == 0:
            # ...but if this one satisfies the requirement, then record everything
            if (
                abs((min(test_eigs[j]) - matrix_numpy_eigs[mm]) / matrix_numpy_eigs[mm])
                < abserr
            ):
                accept_subspace_size.append(j)
                matrix_krylov_tot_times.append(test_k_tot_times[j])
                matrix_krylov_dim.append(mm)
                found = 1
Let us plot the times we have obtained for these two methods for comparison:
plt.plot(matrix_numpy_times, color="blue")
plt.plot(matrix_krylov_dim, matrix_krylov_tot_times, color="green")
plt.xlabel("Dimension of matrix") # Add x-axis label
plt.ylabel("Time to diagonalize (microsec)") # Add y-axis label
plt.show()
These are the actual times required, but for the purposes of discussion, let us smooth these curves by averaging over a few adjacent points / matrix dimensions. This is done below:
smooth_numpy_times = []
smooth_krylov_times = []
# Choose the number of adjacent points over which to average forward; the same will be used backward.
smooth_steps = 10
# We will do this smoothing for all points/matrix dimensions
for i in range(len(matrix_krylov_tot_times)):
    # Ensure we don't exceed the boundaries of our lists
    start = max(0, i - smooth_steps)
    end = min(len(matrix_krylov_tot_times) - 1, i + smooth_steps)
    # Dummy variables for accumulating an average over adjacent points. This is done for both the Krylov and the NumPy calculations.
    smooth_count = 0
    smooth_numpy_sum = 0
    smooth_krylov_sum = 0
    for j in range(start, end):
        smooth_numpy_sum = smooth_numpy_sum + matrix_numpy_times[j]
        smooth_krylov_sum = smooth_krylov_sum + matrix_krylov_tot_times[j]
        smooth_count = smooth_count + 1
    # Append the averaged adjacent values to our new smoothed lists
    smooth_numpy_times.append(smooth_numpy_sum / smooth_count)
    smooth_krylov_times.append(smooth_krylov_sum / smooth_count)
plt.plot(smooth_numpy_times, color="blue")
plt.plot(smooth_krylov_times, color="green")
plt.xlabel("Dimension of matrix") # Add x-axis label
plt.ylabel("Time to diagonalize (smoothed, microsec)") # Add y-axis label
plt.show()
Note that the time required for the building of a Krylov subspace initially exceeds the time required for numpy's full diagonalization. But as the size of the matrix grows, the Krylov method becomes advantageous. This is true even if we lower our acceptable error, but the advantage sets in at a larger matrix size. This is worth picking apart.
The time complexity of numerical diagonalization is $O(N^3)$ for an $N \times N$ matrix (with some variation among algorithms). The time complexity of generating an orthonormal basis of $N$ vectors is also $O(N^3)$. So the advantage of the Krylov method is not related to the use of an orthonormal basis per se, but to the use of a particular orthonormal basis that effectively picks out the eigenvalues of interest using far fewer than $N$ vectors. We have already seen this from the sketch of a proof in the first section of this lesson, and this is critical for the convergence guarantees in Krylov methods. Let us review our progress so far:
- For very large matrices, the Krylov subspace method may yield approximate eigenvalues within required tolerances faster than traditional diagonalization algorithms.
- For such very large matrices, the generation of a Krylov subspace is the most time-consuming part of the Krylov subspace method.
- Thus an efficient way of generating a Krylov subspace would be highly valuable. This is finally where quantum computing comes into the picture.
Check your understanding
Read the question below, think about your answer, then click the triangle to reveal the solution.
Refer to the smoothed plot of diagonalization times vs. matrix dimension above.
(a) At approximately what matrix dimension did the Krylov method become faster, according to this plot?
(b) What aspects of the calculation could change that dimension at which the Krylov method becomes faster?
Answer:
(a) Answers may vary if you re-run the calculation, but the Krylov method becomes faster at approximately dimension 80-85. (b) There are many possible answers. Some important factors are the precision we require and the sparsity of the matrices being diagonalized.
3. Krylov via time evolution
Everything that we have described so far can be done classically. So how and when would we use a quantum computer? For very large matrices, the Krylov method can require long computing times and large amounts of memory. The time required to apply an $N \times N$ matrix $A$ to a vector scales like $O(N^2)$ in the worst case. Even multiplying a sparse matrix onto a vector (the typical case for classical Krylov-type solvers) has a time complexity scaling like $O(N)$. This is done for every vector we want in our subspace. The subspace dimension $r$ is usually not a significant fraction of $N$, but generating all $r$ vectors still scales like $O(r N^2)$ in the worst case. Although there are other steps, like orthogonalization, this is the dominant scaling to keep in mind.
Quantum computing allows us to change what attributes of the problem determine the scaling of the time and resources required. Instead of dependence on matrix size across the board, we will see things like number of shots and number of non-commuting Pauli terms that make up the Hamiltonian. Let’s explore how this works.
3.1 Time evolution
Recall that the operator that time-evolves a quantum state is $e^{-iHt/\hbar}$ (and it is very common, especially in quantum computing, to drop the $\hbar$ from the notation). One way of understanding, and even realizing, such an exponential function of an operator is to look at its Taylor series expansion:

$$e^{-iHt}\,|\psi_0\rangle = \sum_{k=0}^{\infty} \frac{(-it)^k}{k!}\, H^k\, |\psi_0\rangle.$$

Note that this operation acting on some initial vector yields a sum of terms with increasing powers of $H$ applied to the initial state. It looks like we can just make our Krylov subspace by time-evolving our initial guess state!
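We can illustrate this classically. By the Taylor series, $e^{-iHt}|\psi_0\rangle$ is a linear combination of the power-Krylov vectors $H^k|\psi_0\rangle$, so states evolved to several times $t_j = j\,\Delta t$ generically span the same space as low powers of $H$ applied to $|\psi_0\rangle$. A sketch with our own toy Hamiltonian and initial state:

```python
import numpy as np
from scipy.linalg import expm

# Toy Hermitian "Hamiltonian" and initial state (our own choices)
H = np.array([[4.0, -1, 0], [-1, 4, -1], [0, -1, 4]])
psi0 = np.array([1.0, 2, 3]) / np.linalg.norm([1.0, 2, 3])

dt = 0.1
evolved = np.column_stack([expm(-1j * H * j * dt) @ psi0 for j in range(3)])
powers = np.column_stack([np.linalg.matrix_power(H, k) @ psi0 for k in range(3)])

# Both families of vectors span (numerically) the same 3-dimensional Krylov space
print(np.linalg.matrix_rank(evolved), np.linalg.matrix_rank(powers))
```

Here both families have full rank 3 and span the same space; this is the classical analogue of building a Krylov subspace from time-evolved states on a quantum computer.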