Analyzing QUBO solutions
A solver returns a QUBOSolution instance. It contains candidate solutions represented as bitstrings, together with their respective QUBO cost evaluations. If a quantum approach was used, it also stores the counts (frequencies obtained by sampling the quantum device), as well as the probabilities (frequencies divided by the total number of samples).
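As a minimal sketch of that last relation (using made-up counts, not values from this tutorial), probabilities are simply normalized counts:

```python
# Hypothetical counts for four sampled bitstrings
counts = [14, 13, 8, 10]
total = sum(counts)

# Probabilities are frequencies divided by the total number of samples
probs = [c / total for c in counts]

print(probs)
print(sum(probs))  # probabilities sum to 1
```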
To analyze the solutions from one or several QUBO solvers, we can use QUBOAnalyzer. It builds a pandas DataFrame internally (accessible via the df attribute), which its plotting and comparison methods operate on. Below we show how to compare two solutions (generated randomly to keep the tutorial simple).
In [ ]:
import torch
from qubosolver.qubo_analyzer import QUBOAnalyzer
from qubosolver.data import QUBOSolution
In [ ]:
num_bitstrings = 100
bit_length = 3
costs = torch.randint(1, 20, (2**bit_length,), dtype=torch.float)
bitstrings = torch.randint(0, 2, (num_bitstrings, bit_length))
bitstrings, counts = bitstrings.unique(dim=0, return_counts=True)
solution1 = QUBOSolution(bitstrings, costs, counts)
bitstrings = torch.randint(0, 2, (num_bitstrings, bit_length))
bitstrings, counts = bitstrings.unique(dim=0, return_counts=True)
solution2 = QUBOSolution(bitstrings, costs, counts)
In [ ]:
# Create the analyzer with our two solutions
analyzer = QUBOAnalyzer([solution1, solution2], labels=["sol1", "sol2"])
In [ ]:
df = analyzer.df
print("Combined DataFrame:")
print(df)

Combined DataFrame:
   labels bitstrings  costs  counts  probs
0    sol1        000   17.0      14   0.14
1    sol1        001    4.0      13   0.13
2    sol1        010   11.0       8   0.08
3    sol1        011   14.0      10   0.10
4    sol1        100   15.0      13   0.13
5    sol1        101   18.0      20   0.20
6    sol1        110    7.0      10   0.10
7    sol1        111   10.0      12   0.12
8    sol2        000   17.0      11   0.11
9    sol2        001    4.0      16   0.16
10   sol2        010   11.0       7   0.07
11   sol2        011   14.0      15   0.15
12   sol2        100   15.0      15   0.15
13   sol2        101   18.0      13   0.13
14   sol2        110    7.0      12   0.12
15   sol2        111   10.0      11   0.11
In [ ]:
filtered_cost_df = analyzer.filter_by_cost(max_cost=10)
print("DataFrame after filtering by cost (<10):")
print(filtered_cost_df)

DataFrame after filtering by cost (<10):
   labels bitstrings  costs  counts  probs
1    sol1        001    4.0      13   0.13
6    sol1        110    7.0      10   0.10
9    sol2        001    4.0      16   0.16
14   sol2        110    7.0      12   0.12
In [ ]:
# Filter by percentage: keep the top 10% (highest probability) bitstrings per solution
filtered_percent_df = analyzer.filter_by_percentage(
    column="probs", order="descending", top_percent=0.1
)
print("DataFrame after filtering by top 10% (by probability):")
filtered_percent_df

DataFrame after filtering by top 10% (by probability):
Out[ ]:
| | labels | bitstrings | costs | counts | probs |
|---|---|---|---|---|---|
| 0 | sol1 | 101 | 18.0 | 20 | 0.20 |
| 1 | sol2 | 001 | 4.0 | 16 | 0.16 |
In [ ]:
# Filter by probability: keep bitstrings above a threshold (here 0.01, for example)
# (Probabilities are computed from counts for each solution.)
filtered_prob_df = analyzer.filter_by_probability(min_probability=0.01)
print("DataFrame after filtering by probability:")
print(filtered_prob_df)

DataFrame after filtering by probability:
   labels bitstrings  costs  counts  probs
0    sol1        000   17.0      14   0.14
1    sol1        001    4.0      13   0.13
2    sol1        010   11.0       8   0.08
3    sol1        011   14.0      10   0.10
4    sol1        100   15.0      13   0.13
5    sol1        101   18.0      20   0.20
6    sol1        110    7.0      10   0.10
7    sol1        111   10.0      12   0.12
8    sol2        000   17.0      11   0.11
9    sol2        001    4.0      16   0.16
10   sol2        010   11.0       7   0.07
11   sol2        011   14.0      15   0.15
12   sol2        100   15.0      15   0.15
13   sol2        101   18.0      13   0.13
14   sol2        110    7.0      12   0.12
15   sol2        111   10.0      11   0.11
In [ ]:
avg_cost_df = analyzer.average_cost()
print("Average cost for all bitstrings per solution:")
print(avg_cost_df)
print('------------------------------------------------')
avg_cost_df = analyzer.average_cost(0.5)
print("Average cost for top 50% bitstrings per solution:")
print(avg_cost_df)
print('------------------------------------------------')
avg_cost_df = analyzer.average_cost(0.1)
print("Average cost for top 10% bitstrings per solution:")
print(avg_cost_df)
print('------------------------------------------------')
avg_cost_df = analyzer.average_cost(0.01)
print("Average cost for top 1% bitstrings per solution:")
print(avg_cost_df)

Average cost for all bitstrings per solution:
  labels  average cost  bitstrings considered
0   sol1          12.0                      8
1   sol2          12.0                      8
------------------------------------------------
Average cost for top 50% bitstrings per solution:
  labels  average cost  bitstrings considered
0   sol1           9.2                      5
1   sol2           9.2                      5
------------------------------------------------
Average cost for top 10% bitstrings per solution:
  labels  average cost  bitstrings considered
0   sol1           4.0                      1
1   sol2           4.0                      1
------------------------------------------------
Average cost for top 1% bitstrings per solution:
  labels  average cost  bitstrings considered
0   sol1           4.0                      1
1   sol2           4.0                      1
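Averaging over all bitstrings is a plain (unweighted) mean of the cost column, and the top-percent variant averages only the lowest-cost fraction. A sketch with the costs from this tutorial, taking the same cutoffs as the output above (the library's exact rounding rule for the cutoff may differ):

```python
# Costs of the 8 unique bitstrings from the combined DataFrame above
costs = [17.0, 4.0, 11.0, 14.0, 15.0, 18.0, 7.0, 10.0]

# Average over all bitstrings
print(sum(costs) / len(costs))  # 12.0

# Average over the lowest-cost fraction (here 5 of 8, as reported for top 50%)
best = sorted(costs)[:5]
print(sum(best) / len(best))  # 9.2
```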
In [ ]:
best_bit_df = analyzer.best_bitstrings()
print("Best bitstring per solution:")
print(best_bit_df)

Best bitstring per solution:
  labels bitstrings  costs  counts  probs
0   sol1        001    4.0      13   0.13
1   sol2        001    4.0      16   0.16
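best_bitstrings boils down to picking the lowest-cost row per label; an equivalent pandas one-liner on a toy DataFrame (a sketch, not the library's code):

```python
import pandas as pd

# Toy subset of the combined DataFrame above
df = pd.DataFrame({
    "labels": ["sol1", "sol1", "sol2", "sol2"],
    "bitstrings": ["001", "101", "001", "101"],
    "costs": [4.0, 18.0, 4.0, 18.0],
})

# For each label, keep the row with the minimal cost
best = df.loc[df.groupby("labels")["costs"].idxmin()]
print(best)
```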
In [ ]:
df_with_gaps = analyzer.calculate_gaps(opt_cost=10)
print("DataFrame after calculating gaps (opt_cost=10):")
print(df_with_gaps)

DataFrame after calculating gaps (opt_cost=10):
   labels bitstrings  costs  counts  probs  gaps
0    sol1        000   17.0      14   0.14   0.7
1    sol1        001    4.0      13   0.13   0.6
2    sol1        010   11.0       8   0.08   0.1
3    sol1        011   14.0      10   0.10   0.4
4    sol1        100   15.0      13   0.13   0.5
5    sol1        101   18.0      20   0.20   0.8
6    sol1        110    7.0      10   0.10   0.3
7    sol1        111   10.0      12   0.12   0.0
8    sol2        000   17.0      11   0.11   0.7
9    sol2        001    4.0      16   0.16   0.6
10   sol2        010   11.0       7   0.07   0.1
11   sol2        011   14.0      15   0.15   0.4
12   sol2        100   15.0      15   0.15   0.5
13   sol2        101   18.0      13   0.13   0.8
14   sol2        110    7.0      12   0.12   0.3
15   sol2        111   10.0      11   0.11   0.0
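Judging from the output above, the gap is the relative distance of a candidate's cost to the optimal cost, |cost − opt_cost| / |opt_cost|. A minimal sketch of that formula (not the library's implementation):

```python
def relative_gap(cost: float, opt_cost: float) -> float:
    """Relative distance between a candidate cost and the optimal cost."""
    return abs(cost - opt_cost) / abs(opt_cost)

print(relative_gap(17.0, 10.0))  # 0.7, matching bitstring 000 above
print(relative_gap(10.0, 10.0))  # 0.0: a candidate at the optimum has zero gap
```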
In [ ]:
# Filter by percentage: keep the top 10% (smallest gap) bitstrings per solution
filtered_percent_df = analyzer.filter_by_percentage(column="gaps", top_percent=0.1)
print("DataFrame after filtering by top 10% (by gap):")
print(filtered_percent_df)

DataFrame after filtering by top 10% (by gap):
  labels bitstrings  costs  counts  probs  gaps
0   sol1        111   10.0      12   0.12   0.0
1   sol2        111   10.0      11   0.11   0.0
In [ ]:
plot1 = analyzer.plot(
    x_axis="bitstrings",
    y_axis="probs",
    sort_by="probs",
    sort_order="ascending",
    context="notebook",
)
In [ ]:
plot2 = analyzer.plot(
    x_axis="costs",
    y_axis="probs",
    sort_by="costs",
    sort_order="ascending",
    context="notebook",
)
In [ ]:
# Only plot bitstrings with probability above 0.1
plot2 = analyzer.plot(
    x_axis="costs",
    y_axis="probs",
    sort_by="costs",
    sort_order="ascending",
    probability_threshold=0.1,
    context="notebook",
)
In [ ]:
# Only plot bitstrings with cost below the given threshold
plot2 = analyzer.plot(
    x_axis="costs",
    y_axis="probs",
    sort_by="costs",
    sort_order="ascending",
    cost_threshold=11,
    context="notebook",
)
In [ ]:
# Only plot the top 10% of bitstrings per solution
plot2 = analyzer.plot(
    x_axis="costs",
    y_axis="probs",
    sort_by="costs",
    sort_order="ascending",
    top_percent=0.1,
    context="notebook",
)
In [ ]:
# Only plot a subset of the solutions, selected by label
plot2 = analyzer.plot(
    x_axis="costs",
    y_axis="probs",
    sort_by="costs",
    sort_order="ascending",
    labels=['sol1'],
    context="notebook",
)
In [ ]:
# Create a new solution with different bitstrings and costs
bitstrings = torch.randint(0, 2, (5, bit_length))
bitstrings, counts = bitstrings.unique(dim=0, return_counts=True)
costs = torch.randint(1, 20, (len(bitstrings),), dtype=torch.float)
solution3 = QUBOSolution(bitstrings, costs, counts)
# Create the analyzer with our three solutions
analyzer = QUBOAnalyzer([solution1, solution2, solution3], labels=["sol1", "sol2", "sol3"])
In [ ]:
# Compare the solutions
analyzer.compare_qubo_solutions(["sol1", "sol3"])
print("\n -------------------------------------- \n")
analyzer.compare_qubo_solutions(["sol1", "sol2"])

Comparing two lists of bitstrings:
1. sol1: 8 bitstrings (8 unique strings)
2. sol3: 3 bitstrings (3 unique strings)
Bitstrings in sol1 not present in sol3:
- 111
- 101
- 011
- 000
- 010
Ratio of different bitstrings: 5/8 = 62%

 -------------------------------------- 

Comparing two lists of bitstrings:
1. sol1: 8 bitstrings (8 unique strings)
2. sol2: 8 bitstrings (8 unique strings)
The lists contain exactly the same bitstrings.
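The comparison above is essentially a set difference over the unique bitstrings of the two solutions; a rough sketch of the same bookkeeping (not the library's implementation, and with sol3's bitstrings chosen to reproduce the run above):

```python
def compare_bitstrings(a: list[str], b: list[str]) -> set[str]:
    """Return the bitstrings of `a` that do not appear in `b`, with a summary line."""
    missing = set(a) - set(b)
    ratio = len(missing) / len(set(a))
    print(f"Ratio of different bitstrings: {len(missing)}/{len(set(a))} = {ratio:.0%}")
    return missing

sol1 = ["000", "001", "010", "011", "100", "101", "110", "111"]
sol3 = ["001", "100", "110"]  # hypothetical: matches the 3 unique strings above
missing = compare_bitstrings(sol1, sol3)  # prints "Ratio of different bitstrings: 5/8 = 62%"
```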