Pasqal Documentation

qek.kernel.kernel

module
qek.kernel.kernel

The Quantum Evolution Kernel itself, for use in a machine-learning pipeline.

Classes

  • BaseKernel Base class for implementations of the Quantum Evolution Kernel.

  • FastQEK A lower-level, optimized implementation of the Quantum Evolution Kernel that requires processed data.

  • IntegratedQEK A variant of the Quantum Evolution Kernel that supports fit/transform/fit_transform from raw data (graphs).

Functions

  • count_occupation_from_bitstring Counts the number of '1' bits in a binary string.

  • dist_excitation_and_vec Calculates the distribution of excitation energies from a dictionary mapping bitstrings to their counts.

class
BaseKernel (mu: float, size_max: int | None = None, similarity: Callable[[NDArray[np.floating], NDArray[np.floating]], np.floating] | None = None)

Bases : abc.ABC, Generic[KernelData]

Base class for implementations of the Quantum Evolution Kernel.

Unless you are implementing a new kernel, you should probably use one of the subclasses:

  • FastQEK (lower-level API, requires processed data, optimized);

  • IntegratedQEK (higher-level API, accepts graphs, slower).

Initialize the kernel.

Attributes

  • X (Sequence[ProcessedData]) Training data used for fitting the kernel.

  • kernel_matrix (np.ndarray) Kernel matrix. This is assigned in the fit() method.


Parameters

  • mu : float Scaling factor for the Jensen-Shannon divergence

  • size_max : int, optional If specified, only consider the first size_max qubits of bitstrings. Otherwise, consider all qubits. You may use this to trade precision in favor of speed.

  • similarity : optional If specified, a custom similarity metric to use. Otherwise, use the Jensen-Shannon divergence.

Note: This class does not accept raw data, but rather ProcessedData. See class IntegratedQEK for a subclass that provides a more powerful API, at the expense of performance.

Methods

  • to_processed_data Convert the raw data into features.

  • default_similarity The Jensen-Shannon similarity metric used to compute the kernel, used when calling kernel(X1, X2).

  • similarity Compute the similarity between two graphs using Jensen-Shannon divergence.

  • fit Fit the kernel to the training dataset by storing the dataset.

  • transform Transform the dataset into the kernel space with respect to the training dataset.

  • fit_transform Fit the kernel to the training dataset and transform it.

  • create_train_kernel_matrix Compute a kernel matrix for a given training dataset.

  • create_test_kernel_matrix Compute a kernel matrix for a given testing dataset and training set.

  • set_params Set multiple parameters for the kernel.

  • get_params Retrieve the value of all parameters.
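
The fit/transform/fit_transform workflow above follows the scikit-learn estimator pattern: fit() stores the training data, transform() computes similarities against it. A minimal, self-contained sketch of that pattern (a toy stand-in with a made-up similarity, not the real BaseKernel):

```python
import numpy as np

class ToyKernel:
    """Toy illustration of the fit/transform pattern; not the real BaseKernel."""

    def __init__(self, mu: float):
        self.mu = mu

    def fit(self, X, y=None):
        # fit() simply stores the training dataset.
        self.X = list(X)
        return self

    def transform(self, X_test, y_test=None):
        # Entry (i, j) is the similarity between test item i and training item j.
        sim = lambda p, q: np.exp(-self.mu * np.abs(np.asarray(p) - np.asarray(q)).sum())
        return np.array([[sim(t, x) for x in self.X] for t in X_test])

    def fit_transform(self, X, y=None):
        self.fit(X)
        return self.transform(X)

kernel = ToyKernel(mu=0.5)
K = kernel.fit_transform([[0.5, 0.5], [0.8, 0.2], [0.1, 0.9]])
print(K.shape)  # (3, 3); diagonal entries are 1.0
```

Because fit() only stores the data, all the computational cost lives in transform(), exactly as in the method descriptions above.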

method
to_processed_data (X: Sequence[KernelData]) → Sequence[ProcessedData]

Convert the raw data into features.

Raises

  • NotImplementedError

method
default_similarity (row: NDArray[np.floating], col: NDArray[np.floating]) → np.floating

The Jensen-Shannon similarity metric used to compute the kernel, used when calling kernel(X1, X2).

This is the default similarity, if no parameter similarity is provided.

method
similarity (graph_1: KernelData, graph_2: KernelData) → float

Compute the similarity between two graphs using Jensen-Shannon divergence.

This method computes the square of the Jensen-Shannon divergence (JSD) between two probability distributions over bitstrings. The JSD is a measure of the difference between two probability distributions, and it can be used as a kernel for machine learning algorithms that require a similarity function.

The input graphs are assumed to have been processed using the ProcessedData class from qek_os.data_io.dataset.

Parameters

  • graph_1 : KernelData First graph.

  • graph_2 : KernelData Second graph.

Returns

  • float Similarity between the two graphs, scaled by a factor that depends on mu.
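
For intuition, here is a self-contained sketch of a Jensen-Shannon-based similarity on two excitation distributions. The exact scaling QEK applies is not spelled out here; the exponential form exp(-mu * JS) is an assumption for illustration only:

```python
import numpy as np

def js_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """Jensen-Shannon divergence between two discrete probability distributions."""
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0  # skip zero-probability entries in the KL sum
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def similarity(p, q, mu: float = 0.5) -> float:
    # Assumed form: similarity decays exponentially with the scaled divergence.
    return float(np.exp(-mu * js_divergence(np.asarray(p), np.asarray(q))))

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(similarity(p, p))  # identical distributions -> 1.0
```

Identical distributions have zero divergence, hence similarity 1.0; the similarity shrinks toward 0 as the distributions diverge.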

method
fit (X: Sequence[KernelData], y: list | None = None) → None

Fit the kernel to the training dataset by storing the dataset.

Parameters

  • X : Sequence[KernelData] The training dataset.

  • y : list | None Target variable for the dataset sequence. This argument is ignored, provided only for compatibility with machine-learning libraries.

method
transform (X_test: Sequence[KernelData], y_test: list | None = None) → np.ndarray

Transform the dataset into the kernel space with respect to the training dataset.

Parameters

  • X_test : Sequence[KernelData] The dataset to transform.

  • y_test : list | None Target variable for the dataset sequence. This argument is ignored, provided only for compatibility with machine-learning libraries.

Returns

  • np.ndarray Kernel matrix where each entry represents the similarity between the given dataset and the training dataset.

Raises

  • ValueError

method
fit_transform (X: Sequence[KernelData], y: list | None = None) → np.ndarray

Fit the kernel to the training dataset and transform it.

Parameters

  • X : Sequence[KernelData] The dataset to fit and transform.

  • y : list | None Target variable for the dataset sequence. This argument is ignored, provided only for compatibility with machine-learning libraries.

Returns

  • np.ndarray Kernel matrix for the training dataset.

method
create_train_kernel_matrix (train_dataset: Sequence[KernelData]) → np.ndarray

Compute a kernel matrix for a given training dataset.

This method computes a symmetric N x N kernel matrix from the Jensen-Shannon divergences between all pairs of graphs in the input dataset. The resulting matrix can be used as a similarity metric for machine learning algorithms.

Parameters

  • train_dataset : Sequence[KernelData] A list of objects to compute the kernel matrix from.

Returns

  • np.ndarray An N x N symmetric matrix where the entry at row i and column j represents the similarity between the graphs in positions i and j of the input dataset.
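
A sketch of the shape and symmetry of such a matrix, using a stand-in similarity function (not the actual QEK metric):

```python
import numpy as np

def create_train_kernel_matrix(train_dists, sim):
    """Symmetric N x N matrix of pairwise similarities."""
    n = len(train_dists)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            # The similarity is symmetric, so fill both triangles at once.
            K[i, j] = K[j, i] = sim(train_dists[i], train_dists[j])
    return K

# Stand-in similarity, for illustration only.
sim = lambda p, q: float(np.exp(-np.abs(np.asarray(p) - np.asarray(q)).sum()))
K = create_train_kernel_matrix([[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]], sim)
print(K.shape)  # (3, 3)
```

Exploiting symmetry halves the number of similarity evaluations, which matters because each real evaluation involves a divergence computation.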

method
create_test_kernel_matrix (test_dataset: Sequence[KernelData], train_dataset: Sequence[KernelData]) → np.ndarray

Compute a kernel matrix for a given testing dataset and training set.

This method computes an N x M kernel matrix from the Jensen-Shannon divergences between all pairs of graphs in the testing dataset (N graphs) and the training dataset (M graphs). The resulting matrix can be used as a similarity metric for machine learning algorithms, particularly when evaluating performance on the test dataset using a trained model.

Parameters

  • test_dataset : Sequence[KernelData] The testing dataset.

  • train_dataset : Sequence[KernelData] The training set.

Returns

  • np.ndarray An N x M matrix where the entry at row i and column j represents the similarity between the graph in position i of the test dataset and the graph in position j of the training set.
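
A sketch of the rectangular case, with rows indexed by test items and columns by training items (stand-in similarity, for illustration only):

```python
import numpy as np

def create_test_kernel_matrix(test_dists, train_dists, sim):
    """N x M matrix: row i = test item i, column j = training item j."""
    K = np.zeros((len(test_dists), len(train_dists)))
    for i, t in enumerate(test_dists):
        for j, tr in enumerate(train_dists):
            K[i, j] = sim(t, tr)
    return K

# Stand-in similarity, for illustration only.
sim = lambda p, q: float(np.exp(-np.abs(np.asarray(p) - np.asarray(q)).sum()))
K = create_test_kernel_matrix(
    [[0.5, 0.5], [0.9, 0.1]],              # 2 test items (N = 2)
    [[0.5, 0.5], [0.2, 0.8], [1.0, 0.0]],  # 3 training items (M = 3)
    sim,
)
print(K.shape)  # (2, 3)
```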

method
set_params (**kwargs: dict[str, Any]) → None

Set multiple parameters for the kernel.

Parameters

  • **kwargs : dict[str, Any] Arbitrary keyword dictionary where keys are attribute names and values are their respective values.

method
get_params (deep: bool = True) → dict[str, Any]

Retrieve the value of all parameters.

Parameters

  • deep : bool Ignored for the time being. Added for compatibility with various machine learning libraries, such as scikit-learn.

Returns

  • dict[str, Any] A dictionary of parameters and their respective values. Note that this method always returns a copy of the dictionary.

class
FastQEK (mu: float, size_max: int | None = None, similarity: Callable[[NDArray[np.floating], NDArray[np.floating]], np.floating] | None = None)

Bases : BaseKernel[ProcessedData]

FastQEK: a lower-level, optimized implementation of the Quantum Evolution Kernel that operates on processed data.

Initialize the kernel.

Attributes

  • X (Sequence[ProcessedData]) Training data used for fitting the kernel.

  • kernel_matrix (np.ndarray) Kernel matrix. This is assigned in the fit() method.


Parameters

  • mu : float Scaling factor for the Jensen-Shannon divergence

  • size_max : int, optional If specified, only consider the first size_max qubits of bitstrings. Otherwise, consider all qubits. You may use this to trade precision in favor of speed.

  • similarity : optional If specified, a custom similarity metric to use. Otherwise, use the Jensen-Shannon divergence.

Note: This class does not accept raw data, but rather ProcessedData. See class IntegratedQEK for a subclass that provides a more powerful API, at the expense of performance.

Methods

method
to_processed_data (X: Sequence[ProcessedData]) → Sequence[ProcessedData]

Convert the raw data into features.

class
IntegratedQEK (mu: float, extractor: BaseExtractor[GraphType], size_max: int | None = None, similarity: Callable[[NDArray[np.floating], NDArray[np.floating]], np.floating] | None = None)

Bases : BaseKernel[GraphType]

A variant of the Quantum Evolution Kernel that supports fit/transform/fit_transform from raw data (graphs).

Initialize an IntegratedQEK.

Performance note

This class uses an extractor to convert the raw data into features. This can be very slow if you use, for instance, a remote QPU, as the queues to access a QPU can be very long. If you are using this class in an interactive application or a server, it will block the entire thread during the wait.

We recommend using this class only with local emulators.


Parameters

  • mu : float Scaling factor for the Jensen-Shannon divergence

  • extractor : BaseExtractor[GraphType] An extractor (e.g. a QPU or a quantum emulator) used to convert the raw data (graphs) into features.

  • size_max : int, optional If specified, only consider the first size_max qubits of bitstrings. Otherwise, consider all qubits. You may use this to trade precision in favor of speed.

  • similarity : optional If specified, a custom similarity metric to use. Otherwise, use the Jensen-Shannon divergence.

Methods

method
to_processed_data (X: Sequence[GraphType]) → Sequence[ProcessedData]

Convert the raw data into features.

Performance note

This method can be very slow if you use, for instance, a remote QPU, as the queues to access a QPU can be very long. If you are using it in an interactive application or a server, it will block the entire thread during the wait.

function
count_occupation_from_bitstring (bitstring: str) → int

Counts the number of '1' bits in a binary string.

Parameters

  • bitstring : str A binary string containing only '0's and '1's.

Returns

  • int The number of '1' bits found in the input string.
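
The description above fully specifies this behavior, so an equivalent one-liner:

```python
def count_occupation_from_bitstring(bitstring: str) -> int:
    # Number of '1' characters, i.e. the number of excited qubits in the sample.
    return bitstring.count("1")

print(count_occupation_from_bitstring("01101"))  # 3
print(count_occupation_from_bitstring("0000"))   # 0
```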

function
dist_excitation_and_vec (count_bitstring: dict[str, int], size_max: int | None = None) → np.ndarray

Calculates the distribution of excitation energies from a dictionary mapping bitstrings to their respective counts.

Parameters

  • count_bitstring : dict[str, int] A dictionary mapping binary strings to their counts.

  • size_max : int | None If specified, only keep the first size_max entries of the distribution in the output. Otherwise, keep all values.

Returns

  • np.ndarray A NumPy array where the entry at index i is the normalized count of bitstrings containing i '1' bits.

Raises

  • ValueError
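
A rough, self-contained sketch of such a distribution. The real function's truncation and error handling may differ; this assumes normalization by the total count, truncation after size_max entries, and a ValueError on empty input:

```python
import numpy as np

def dist_excitation_and_vec(count_bitstring, size_max=None):
    """Normalized distribution of excitation counts (number of '1' bits)."""
    if not count_bitstring:
        raise ValueError("count_bitstring must not be empty")
    n_qubits = len(next(iter(count_bitstring)))
    size = n_qubits + 1 if size_max is None else size_max
    dist = np.zeros(size)
    total = sum(count_bitstring.values())
    for bitstring, count in count_bitstring.items():
        k = bitstring.count("1")
        if k < size:  # assumption: excitations beyond size_max are dropped
            dist[k] += count / total
    return dist

dist = dist_excitation_and_vec({"00": 2, "01": 1, "11": 1})
print(dist)  # entries: 0.5, 0.25, 0.25
```

Out of 4 shots, half have zero excitations, a quarter have one, and a quarter have two, so the distribution sums to 1.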