Optimizing GAN for HEP ¶
↑ Click on “Open in Colab” to execute this notebook live on Google Colaboratory.
In this Google Colab notebook, we provide step-by-step instructions on how to reproduce the results and figures presented in our paper. All the results were obtained using the NERSC/Cori supercomputer at Lawrence Berkeley National Laboratory.
Getting Started ¶
In order to get started with running HYPPO and performing all the post-optimization analysis on the results, we need to install a few packages. The following code snippets will:
- Install HYPPO through the pip Python package manager;
- Download all the HYPPO product files from our project;
- Install the HEP/GAN package;
- Install the latest version of the matplotlib library to avoid common errors from older versions.
Software ¶
[1]:
%%capture
!pip install hpo-uq
HYPPO product files ¶
Next, the following commands will download all the log files generated by the HYPPO software during the HPO searches, as well as all the SLURM scripts used to submit the jobs to NERSC's Cori cluster, and the best models saved by the HEP/GAN package.
[2]:
%%capture
!git clone https://gitlab.com/hpo-uq/publications/paper-2 PaperData
GAN for HEP package ¶
We can now clone the git repository that contains the HEP/GAN package as well as both datasets used in this work.
[3]:
%%capture
!git clone https://gitlab.com/hpo-uq/applications/gan4hep
!cd gan4hep && pip install .
Matplotlib ¶
[4]:
%%capture
!pip install -U matplotlib
The following command will restart the runtime. This is required in Google Colab for the matplotlib upgrade to be taken into account. Executing it will cause the session to crash and restart automatically; this is expected behaviour, and the error messages that get displayed can be ignored.
[ ]:
import os
os.kill(os.getpid(), 9)
The following settings from matplotlib will be used for all the figures generated in this notebook.
[1]:
import matplotlib.pyplot as plt
# The 'seaborn' style was renamed 'seaborn-v0_8' in matplotlib 3.6 and removed in 3.8,
# so fall back gracefully depending on the installed version.
try:
    plt.style.use('seaborn-v0_8')
except OSError:
    plt.style.use('seaborn')
plt.rc('font', size=15)
plt.rc('axes', labelsize=15, titlesize=15)
plt.rc('legend', fontsize=15)
plt.rc('xtick', labelsize=15)
plt.rc('ytick', labelsize=15)
Hadronization Processes ¶
Parameterized simulations ¶
[2]:
from gan4hep import load_data
data = load_data(filename='gan4hep/data/ClusterTo2Pi0.dat')
dataset_size = len(data['phi'])
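To quickly verify that the dataset was loaded correctly, the following minimal check can be run (it assumes load_data returns a dictionary of per-event arrays, as used in the plotting cell below):
[ ]:
# Sanity check on the loaded hadronization dataset (illustrative only)
print('%i events loaded' % dataset_size)
print('Available variables:', sorted(data.keys()))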
[ ]:
import math, numpy
import matplotlib.pyplot as plt
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14,5),dpi=300)
ax1.hist(numpy.vstack((data['px'],data['py'],data['pz'])).T, 40, density=True,histtype='step',lw=2,color=['limegreen','orangered','dodgerblue'],label=[r'$p_x$',r'$p_y$',r'$p_z$'])
ax1.set_ylim(ymin=0.95)
ax1.set_xlabel('Momentum')
ax1.legend(loc='upper center',ncol=3)
ax2.hist(numpy.vstack((data['phi'],data['theta'])).T, 40, density=True,histtype='step',lw=2,color=['crimson','mediumorchid'],label=[r'$\phi$',r'$\theta$'])
ax2.set_xticks([-math.pi/2,-math.pi/4,0,math.pi/4,math.pi/2])
ax2.set_xticklabels([r'-$\pi$/2',r'-$\pi$/4','0',r'+$\pi$/4',r'+$\pi$/2'])
ax2.set_xlabel('Angle')
ax2.legend(loc='upper center')
plt.tight_layout()
plt.show()

Broad search solutions ¶
For the broad search, we created 3 repositories, one for each of 3 distinct hyperparameter optimization approaches:
- 100 stores the results from 20 independent random sampling searches, with 100 different sets of hyperparameters evaluated in each search;
- 10GP stores the results from 20 independent Gaussian Process optimizations, each starting with an initial experimental design of 10 sets of hyperparameters;
- 10RBF stores the results from 20 independent Radial Basis Function optimizations, each starting with an initial experimental design of 10 sets of hyperparameters.
names : [noise_dim,batch_size, lr_gen,lr_disc,gen_layersize,disc_layersize]
mult : [ 4, 32, 0.00005,0.00005, 8, 8]
xlow : [ 1, 8, 1, 1, 8, 8]
xup : [ 100, 64, 200, 200, 128, 128]
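The block above is the HYPPO search-space definition. Assuming each sampled integer between xlow and xup is multiplied by the corresponding mult factor (consistent with the scaling described in the learning-rate searches later in this notebook, though the exact semantics should be checked against the HYPPO documentation), the resulting ranges can be sketched as follows:
[ ]:
# Illustrative only: map the integer search bounds to actual hyperparameter ranges,
# assuming each sampled integer is multiplied by its 'mult' factor.
names = ['noise_dim', 'batch_size', 'lr_gen', 'lr_disc', 'gen_layersize', 'disc_layersize']
mult = [4, 32, 0.00005, 0.00005, 8, 8]
xlow = [1, 8, 1, 1, 8, 8]
xup = [100, 64, 200, 200, 128, 128]
for name, m, lo, hi in zip(names, mult, xlow, xup):
    print('%s: %g to %g' % (name, m * lo, m * hi))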
All the results can be loaded as follows:
[12]:
import hyppo
random = hyppo.extract('PaperData/hadron/100/*/logs/evaluation_*_01.log',target_unc='STD',target_loss='outer')
gp = hyppo.extract('PaperData/hadron/10GP/*/logs/surrogate_*_01.log',target_unc='STD')
rbf = hyppo.extract('PaperData/hadron/10RBF/*/logs/surrogate_*_01.log',target_unc='STD')
# Store details from best random model for final figure
best_random = random.loc[random.loss.idxmin()]
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/*/logs/surrogate_*_01.log (1978/1979)
A total of almost 2,000 architectures were evaluated for each approach. The approach that uses the RBF (Radial Basis Function) optimization appears to outperform both the random sampling and Gaussian Process approaches in finding optimal solutions.
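This can be checked directly from the DataFrames loaded above (a minimal sketch; random, gp and rbf all expose a loss column):
[ ]:
# Compare the number of evaluations and the best loss found by each approach
for name, df in [('Random Sampling', random), ('Gaussian Process', gp), ('Radial Basis Function', rbf)]:
    print('%s: %i evaluations, best loss = %.5f' % (name, len(df), df.loss.min()))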
[ ]:
import matplotlib.pyplot as plt
input_data = [[random,'Random Sampling'],[gp,'Gaussian Process'],[rbf,'Radial Basis Function']]
fig, ax = plt.subplots(2,len(input_data),figsize=(4*len(input_data)+2,5),gridspec_kw={'height_ratios': [1, 3]},dpi=300,sharex=True,sharey='row')
plt.subplots_adjust(left=0.05,top=0.93,right=0.9,wspace=0.05,hspace=0.1)
for i,(data,title) in enumerate(input_data):
    ax[0][i].set_title('(%i) %s' % (i+1,title))
    print('Dataset %i : %i solutions displayed' % (i+1,len(data)))
    ax[0][i].hist(data.loss,range=[0,1.6],rwidth=0.9,bins=16,histtype='bar')
    im = ax[1][i].scatter(data.loss,data.stdev,c=data.lr_gen,lw=0.1,alpha=0.5,s=20,cmap='plasma')
    ax[1][i].minorticks_off()
    ax[1][i].set_xlabel('Best Mean Wasserstein Distance')
    ax[1][i].scatter([0.125],[0.030], facecolors='none', edgecolors='tomato',s=1200,lw=2)
ax[1][0].set_ylabel(r'1$\mathrm{\sigma}$ Deviation')
cax = plt.axes([0.91, 0.125, 0.02, 0.575])
cbar = plt.colorbar(im,cax=cax)
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label('Generator Learning Rate')
plt.savefig('case1_solutions.jpg')
plt.show()
plt.close()
Dataset 1 : 2000 solutions displayed
Dataset 2 : 1979 solutions displayed
Dataset 3 : 1979 solutions displayed

Loss Convergence ¶
[ ]:
import numpy
from hyppo.utils.extract import mult_convergence, extract, single_convergence
from matplotlib.ticker import FormatStrFormatter
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(14,8),dpi=300)
ax1 = fig.add_axes([0.08,0.56,0.30,0.38])
ax2 = fig.add_axes([0.39,0.56,0.10,0.38])
ax3 = fig.add_axes([0.56,0.56,0.30,0.38])
ax4 = fig.add_axes([0.87,0.56,0.10,0.38])
ax5 = fig.add_axes([0.08,0.10,0.30,0.38])
ax6 = fig.add_axes([0.39,0.10,0.10,0.38])
ax7 = fig.add_axes([0.56,0.10,0.41,0.38])
ax = numpy.array([[ax1,ax2],[ax3,ax4],[ax5,ax6],[ax7,None]])
colors = ['teal','orange','tomato']
titles = ['Random Sampling','Gaussian Process','Radial Basis Function']
for n,path in enumerate(['100','10GP','10RBF']):
    print(path)
    extras = {'target_unc':'STD','with_surrogate':False, 'target_loss':'outer'} if path=='100' else {'target_unc':'STD','with_surrogate':True}
    loss, std = mult_convergence('PaperData/hadron/%s'%path,**extras)
    loss, std = loss[:100], std[:100]
    ax[n,0].plot(loss,lw=2,zorder=2,color=colors[n],drawstyle='steps-post')
    ax[n,0].fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[n],step='post',zorder=2)
    losses = []
    for i in range(20):
        solutions = extract('PaperData/hadron/%s/%03i/logs/*_01.log'%(path,i+1),**extras).head(100)
        loss1, std1 = single_convergence(solutions)
        ax[n,0].plot(loss1,lw=0.5,zorder=1,color='0.5',drawstyle='steps-post')
        ax[n,0].errorbar(x=solutions.nevals,y=solutions.loss,yerr=solutions.stdev,fmt='o',ms=0.8,c='0.7',mfc='k',lw=0)
        losses.extend(solutions.loss)
    ax[n,1].hist(losses,range=[0,1.6],rwidth=0.9,bins=16,histtype='bar',orientation='horizontal',color='slategrey')
    ax[n,0].set_ylabel('Wasserstein distance')
    ax[n,1].set_ylim([0,1.6])
    ax[n,1].set_xlim([0,600])
    ax[n,0].set_ylim([0,1.6])
    ax[n,0].set_xlim([0,100])
    ax[n,0].set_title(titles[n])
    ax[n,1].xaxis.set_ticklabels([])
    ax[n,1].yaxis.set_ticklabels([])
    ax[3,0].plot(loss,color=colors[n],lw=1,zorder=1,drawstyle='steps-post',label=titles[n])
    ax[3,0].fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[n],step='post')
ax[3,0].axvline(10,color='black',lw=1,ls='dashed')
ax[3,0].set_yscale('log')
ax[2,0].set_xlabel('Index of function evaluations')
ax[3,0].grid(which='minor', color='w', linestyle='dashed', lw=0.8)
ax[3,0].yaxis.set_major_formatter(FormatStrFormatter('%.2f'))
ax[3,0].set_xlabel('Index of function evaluations')
ax[3,0].set_ylabel('Best Wasserstein distance so far')
ax[3,0].set_title('Loss Convergence')
leg = ax[3,0].legend(loc='upper right')
for line in leg.get_lines():
    line.set_linewidth(4.0)
ax[3,0].set_xlim([0,100])
plt.show()
100
10GP
10RBF
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/015/logs/surrogate*_01.log (98/99)
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/015/logs/*_01.log (108/109)

[ ]:
import hyppo
conv_rbf = hyppo.mult_convergence('PaperData/hadron/10RBF',with_surrogate=True,target_unc='STD')
conv_gp = hyppo.mult_convergence('PaperData/hadron/10GP',with_surrogate=True,target_unc='STD')
conv_random = hyppo.mult_convergence('PaperData/hadron/100',with_surrogate=False,target_unc='STD',target_loss='outer')
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/015/logs/surrogate*_01.log (98/99)
[ ]:
import matplotlib.pyplot as plt
from matplotlib.ticker import FormatStrFormatter
colors = ['teal','orange','tomato']
plt.figure(figsize=(0.8*14,7),dpi=300)
for i,((loss,std),title) in enumerate(((conv_random,'Random Sampling'),(conv_gp,'Gaussian Process'),(conv_rbf,'Radial Basis Function'))):
    plt.plot(loss,color=colors[i],lw=1,zorder=1,drawstyle='steps-post',label=title)
    plt.fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[i],step='post')
plt.axvline(10,color='black',lw=1,ls='dashed')
plt.xlim([0,100])
plt.yscale('log')
plt.grid(which='minor', color='w', linestyle='dashed', lw=0.8)
plt.gca().yaxis.set_major_formatter(FormatStrFormatter('%.2f'))
plt.xlabel('Index of function evaluations')
plt.ylabel('Best Wasserstein distance so far (log scaling)')
leg = plt.legend(loc='upper right')
for line in leg.get_lines():
    line.set_linewidth(4.0)
plt.tight_layout()
plt.savefig('case1_convergence.jpg')
plt.show()

Sensitivity analysis ¶
[ ]:
import hyppo
import matplotlib.pyplot as plt
results = hyppo.extract('PaperData/hadron/10RBF/*/logs/surrogate_*_01.log',target_unc='STD')
best_losses = results[results['loss']<0.05]
data1, hp_names = hyppo.get_custom_data(results,sensitivity=True)[1:]
data2 = hyppo.get_custom_data(best_losses,sensitivity=True)[1]
fig,ax = plt.subplots(3,6,figsize=(14,6),gridspec_kw={'height_ratios': [1,3,3]},dpi=300,sharey='row',sharex='col')
plt.draw()
plt.subplots_adjust(top=0.97,wspace=0.05,hspace=0.05,right=0.92,left=0.07,bottom=0.15)
for n in range(6):
    if n+1>data1.shape[1]-3:
        break
    ax[0][n-1].hist(results[hp_names[n]],rwidth=0.9,bins=16,histtype='bar',color='slategrey')
    im1 = ax[1][n-1].scatter(data1[:,n+3],results.loss,c=results['stdev'],lw=0.5,ec='black',alpha=0.5,s=20,cmap='turbo')
    im2 = ax[2][n-1].scatter(data2[:,n+3],best_losses.loss,c=best_losses['stdev'],lw=0.5,ec='black',s=20,cmap='turbo')
    ax[2][n-1].set_xlabel(hp_names[n])
    ax[2][n-1].tick_params(axis='x',labelrotation=45)
ax[1][0].set_ylabel('Wasserstein\nDistance')
ax[2][0].set_ylabel('Wasserstein\nDistance below %.2f' % max(best_losses.loss))
cax = plt.axes([0.93, 0.15, 0.02, 0.69])
cbar = plt.colorbar(im1,cax=cax)
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label(r'1$\mathrm{\sigma}$ Deviation')
cbar.ax.xaxis.set_ticks_position('top')
cbar.ax.xaxis.set_label_position('top')
plt.savefig('case1_sensitivity.jpg')
plt.show()
plt.close()
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/*/logs/surrogate_*_01.log (1978/1979)

[ ]:
plt.subplots(figsize=(8,6))
plt.scatter(results.batch_size,results.lr_gen,c=results.loss,cmap='turbo',ec='k',lw=0.2)
plt.xlabel('Batch Size')
plt.ylabel('Generator Learning Rate')
cbar = plt.colorbar()
cbar.set_label('Wasserstein Distance')
plt.tight_layout()
plt.savefig('figure.png',dpi=200)
plt.show()

Generator learning rate ¶
Based on the sensitivity analysis from the broad search, we notice a strong sensitivity to the generator learning rate. Here we investigate how low the generator learning rate can be pushed and whether the loss starts increasing below some threshold. We fixed all the hyperparameters except the generator learning rate to values that correspond to one of the most optimal solutions found during the broad search, i.e. with noise_dim set to 4, batch_size to 768, lr_disc to 0.00265, gen_layersize to 72, and disc_layersize to 288.
Broad search ¶
We generated 200 linearly spaced values from 5 to 1,000. Each value is then scaled by a factor of 1e-5 to produce the generator learning rate.
[ ]:
import numpy
samples = numpy.linspace(5,1000,200)
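# As noted above, each integer sample is scaled by 1e-5 inside HYPPO,
# spanning generator learning rates from 5e-5 to 1e-2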
numpy.savetxt('samples.txt',samples,fmt='%i')
broad_search = hyppo.extract('PaperData/hadron/learning_rate/broad/logs/evaluation*_01.log',target_unc='STD').sort_values(by=['lr_gen'])
Focus search ¶
We now generate 200 logarithmically spaced values from 1 to 100,000. Each value is multiplied by 100 to ensure unique integer points are defined. The values are then scaled by a factor of 1e-9 to produce the expected generator learning rate.
[ ]:
import numpy
samples = numpy.logspace(0,5,200)
samples_scaled = numpy.array(samples*1e2,dtype=int)
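# Scaled by 1e-9 inside HYPPO, these integers span learning rates from 1e-7 to 1e-2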
numpy.savetxt('samples.txt',samples_scaled,fmt='%i')
[ ]:
import hyppo
import pandas as pd
focus_search = [
hyppo.extract('PaperData/hadron/learning_rate/focus/EP0020/logs/evaluation*_01.log',target_unc='STD').sort_values(by=['lr_gen']),
pd.concat([
hyppo.extract('PaperData/hadron/learning_rate/focus/EP0100/run1/logs/evaluation*_01.log',target_unc='STD'),
hyppo.extract('PaperData/hadron/learning_rate/focus/EP0100/run2/logs/evaluation*_01.log'),
],ignore_index=True).sort_values(by=['lr_gen']),
hyppo.extract('PaperData/hadron/learning_rate/focus/EP0200/run*/logs/evaluation*_01.log').sort_values(by=['lr_gen']),
hyppo.extract('PaperData/hadron/learning_rate/focus/EP1000/run*/logs/evaluation*_01.log').sort_values(by=['lr_gen']),
]
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/learning_rate/focus/EP1000/run*/logs/evaluation*_01.log (169/170)
[ ]:
import hyppo
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
fig, ax = plt.subplots(1,2,figsize=(14,6),dpi=300)
ax[0].set_title('Broad Search')
ax[0].errorbar(x=broad_search.lr_gen,y=broad_search.loss,yerr=broad_search.stdev.values,fmt='o',ms=2,c='0.7',mfc='k',lw=0.5)
ax[0].set_xlabel('Generator Learning Rate')
ax[0].set_ylabel('Best Wasserstein Distance')
ax[0].add_patch(Rectangle((1e-7, 0), 1e-3-1e-7, 0.16, edgecolor='red', fill=False, linestyle='solid', lw=1))
ax[0].set_xlim(0,0.006)
ax[0].set_ylim(ymax=1.4)
colors = ['navy','teal','orange','tomato']
epochs = [20,100,200,1000]
for color,epoch,lr in zip(colors,epochs,focus_search):
    ax[1].plot(lr.lr_gen,lr.loss,c=color,lw=0.5,label='%i epochs'%epoch)
    ax[1].fill_between(lr.lr_gen,lr.loss-lr.stdev,lr.loss+lr.stdev,color=color,alpha=0.1)
ax[1].set_title('Focus Search')
ax[1].set_xlim(1e-7,1e-3)
ax[1].set_ylim(0,0.16)
ax[1].set_xscale('log')
ax[1].set_xlabel('Generator Learning Rate')
leg = ax[1].legend(loc='upper right',handlelength=1)
for line in leg.get_lines():
    line.set_linewidth(4.0)
plt.tight_layout(w_pad=0.1)
plt.savefig('case1_lr.jpg')
plt.show()

Focus search solutions ¶
Based on the (much narrower) range of values covered by each hyperparameter for the solutions that provide the lowest losses, i.e. below 0.05, we performed a series of focus searches to narrow down the region of optimal solutions in the explored hyperparameter space.
[4]:
import hyppo
data = hyppo.extract('PaperData/hadron/focus_search/logs/*_01.log')
print(len(data),'solutions')
best = data.loc[data.loss.idxmin()]
data[data.loss==min(data.loss)]
70 solutions
[4]:
| | loss | stdev | generator | discriminator | tot_prms | noise_dim | batch_size | lr_gen | lr_disc | gen_layersize | disc_layersize | nevals | time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 49 | 0.01398 | 0.00349 | 5202 | 61237 | 66439 | 3 | 642 | 0.000004 | 0.0024 | 65 | 243 | 56 | 12241.653 |
[ ]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,1,figsize=(14,6),dpi=300)
plt.subplots_adjust(left=0.08,top=0.95,bottom=0.12,right=0.88,wspace=0.05,hspace=0.1)
print('%i solutions displayed' % len(data))
im = ax.scatter(data.loss,data.stdev,c=data.tot_prms/1e3,lw=0.5,alpha=0.7,s=70,ec='black',cmap='terrain',zorder=2)
ax.minorticks_off()
ax.set_xlabel('Best Wasserstein Distance')
ax.set_ylabel(r'1$\mathrm{\sigma}$ Deviation')
ax.set_xscale('log')
ax.set_yscale('log')
ax.grid(which='minor', color='w', linestyle='dashed', lw=0.8)
ax.scatter([best.loss],[best.stdev], facecolors='none', edgecolors='tomato',s=500,lw=5)
cax = plt.axes([0.89, 0.12, 0.02, 0.83])
cbar = plt.colorbar(im,cax=cax,format='%ik')
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label('Number of trainable parameters')
plt.savefig('case1_focus.jpg')
plt.show()
70 solutions displayed

Final results ¶
[5]:
import os,numpy,math
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from gan4hep import get_inference
n_bins = 20
fig, ax = plt.subplots(1,2,figsize=(14,5),dpi=300,sharex=True)
plt.subplots_adjust(bottom=0.12, left=0.08, wspace=0.15, right=0.98, top=0.9)
config = dict(histtype='step', lw=1)
histograms = [[],[]]
unopt = math.pi/2 * get_inference('PaperData/hadron/best_model/best_random/generator',filename='gan4hep/data/ClusterTo2Pi0.dat',n_events=dataset_size,batch_size=int(best_random.batch_size),noise_dim=int(best_random.noise_dim))[2][0]
for trial in range(8):
    tot_wdis, truths, predictions = get_inference(os.path.join('PaperData/hadron/best_model/optimized/%02i/generator'%(trial+1)),filename='gan4hep/data/ClusterTo2Pi0.dat',n_events=dataset_size,batch_size=int(best.batch_size),noise_dim=int(best.noise_dim))
    tot_wdis = numpy.median(tot_wdis)
    print(trial+1,numpy.mean(tot_wdis))
    for idx in range(2):
        trial_counts = []
        for n in range(predictions.shape[0]):
            # Invisible histograms (alpha=0) are only used to extract the binned counts
            counts, bins, _ = ax[idx].hist(predictions[n,:,idx],density=True, bins=n_bins, alpha=0, range=[-numpy.float32(numpy.pi/2), numpy.float32(numpy.pi/2)], color='tomato', histtype='step', lw=0.1)
            trial_counts.append(counts)
        histograms[idx].append(numpy.median(trial_counts,axis=0))
for idx,name in enumerate(['phi','theta']):
    ax[idx].hist(truths[:, idx], density=True, bins=n_bins, range=[-numpy.float32(numpy.pi/2), numpy.float32(numpy.pi/2)], color='black', **config)
    ax[idx].hist(unopt[:, idx], density=True, bins=n_bins, range=[-numpy.float32(numpy.pi/2), numpy.float32(numpy.pi/2)], color='blue', **config)
    ax[idx].set_xticks([-numpy.float32(math.pi/2),-numpy.float32(math.pi/4),0,numpy.float32(math.pi/4),numpy.float32(math.pi/2)])
    ax[idx].set_xticklabels([r'-$\pi$/2',r'-$\pi$/4','0',r'+$\pi$/4',r'+$\pi$/2'])
    ax[idx].set_xlabel(r"$\%s$" % name)
    ax[idx].set_xlim(-5*math.pi/8,5*math.pi/8)
    mean = numpy.mean(histograms[idx],axis=0)
    stdev = numpy.std(histograms[idx],axis=0)
    ax[idx].stairs(mean, bins, color='red', lw=1)
    ax[idx].stairs(mean-stdev/2, bins, baseline=mean+stdev/2, color='red',lw=1, fill=True, alpha=0.2)
    ax[idx].set_ylim([0.25,0.38] if idx==0 else [0,0.6])
ax[0].set_ylabel('Normalized Distribution')
line1 = Line2D([0], [0], label='Simulated Dataset', color='black', lw=5)
line2 = Line2D([0], [0], label='Best-random-GAN-generated events', color='blue', lw=5)
line3 = Line2D([0], [0], label='Optimized-GAN-generated events', color='red', lw=5)
fig.legend(handles=[line1,line2,line3], bbox_to_anchor=[0.03,0.95,1,0.05], loc='center', ncol=3)
plt.savefig('case1_histograms.jpg')
plt.show()
1 0.010046488139778376
2 0.019267020747065544
3 0.013843555003404617
4 0.012103751068934798
5 0.01173804746940732
6 0.011390266939997673
7 0.017358052544295788
8 0.013876878656446934

Dimuon decay ¶
Parameterized simulations ¶
[6]:
import numpy
data = {}
filename = 'gan4hep/data/mc16d_364100_100Kentries_dimuon.output'
events = numpy.loadtxt(filename)
data['E1'] = events[:,0].astype(numpy.float32)
data['phi1'] = events[:,1].astype(numpy.float32)
data['theta1'] = events[:,2].astype(numpy.float32)
data['E2'] = events[:,3].astype(numpy.float32)
data['phi2'] = events[:,4].astype(numpy.float32)
data['theta2'] = events[:,5].astype(numpy.float32)
dataset_size = len(events)
[ ]:
import math, numpy
import matplotlib.pyplot as plt
fig, (ax1,ax2) = plt.subplots(1,2,figsize=(14,5),dpi=300)
ax1.hist(numpy.vstack((data['E1'],data['E2'])).T, 40, histtype='step',lw=2,color=['dodgerblue','tomato'],label=[r'$E_1$',r'$E_2$'])
ax1.set_xlabel('Energy')
ax1.legend(loc='upper right')
ax2.hist(numpy.vstack((data['phi1'],data['phi2'])).T, 40,histtype='step',lw=2,color=['dodgerblue','tomato'],label=[r'$\phi_1$',r'$\phi_2$'])
ax2.hist(numpy.vstack((data['theta1'],data['theta2'])).T, 40,histtype='step',lw=2,color=['yellowgreen','gold'],label=[r'$\theta_1$',r'$\theta_2$'])
ax2.set_xlabel('Angle')
ax2.set_xticks([-math.pi,-math.pi/2,0,math.pi/2,math.pi])
ax2.set_xticklabels([r'-$\pi$',r'-$\pi$/2','0',r'+$\pi$/2',r'+$\pi$'])
ax2.legend(loc='lower center')
plt.tight_layout()
plt.show()

Broad search solutions ¶
names : [noise_dim,batch_size, lr_gen,lr_disc,gen_layersize,disc_layersize]
mult : [ 4, 32, 0.00005,0.00005, 8, 8]
xlow : [ 1, 8, 1, 1, 8, 8]
xup : [ 100, 64, 200, 200, 128, 128]
[15]:
import hyppo
random = hyppo.extract('PaperData/dimuon/100/*/logs/*_01.log')
gp = hyppo.extract('PaperData/dimuon/10GP/*/logs/*_01.log')
rbf = hyppo.extract('PaperData/dimuon/10RBF/*/logs/*_01.log')
# Store details from best random model for final figure
best_random = random.loc[random.loss.idxmin()]
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/dimuon/10RBF/*/logs/*_01.log (2174/2175)
[24]:
best_random
[24]:
loss 0.22607
stdev 0.13811
generator 451974.00000
discriminator 91465.00000
tot_prms 543439.00000
noise_dim 316.00000
batch_size 480.00000
lr_gen 0.00010
lr_disc 0.00070
gen_layersize 528.00000
disc_layersize 296.00000
nevals 277.00000
time 167.80300
Name: 1331, dtype: float64
[ ]:
import numpy
import matplotlib.pyplot as plt
input_data = [[random,'Random Sampling'],[gp,'Gaussian Process'],[rbf,'Radial Basis Function']]
fig, ax = plt.subplots(2,len(input_data),figsize=(4*len(input_data)+2,5),gridspec_kw={'height_ratios': [1, 3]},dpi=300,sharex=True,sharey='row')
if len(input_data)==1:
    ax = numpy.array(ax,ndmin=2).T
plt.subplots_adjust(left=0.05,top=0.93,right=0.9,wspace=0.05,hspace=0.1)
for i,(data, title) in enumerate(input_data):
    data = data[:2000]
    ax[0][i].set_title('(%i) %s' % (i+1,title))
    print('Dataset %i : %i solutions displayed' % (i+1,len(data)))
    ax[0][i].hist(data.loss,range=[0.2,1.1],rwidth=0.9,bins=16,histtype='bar')
    im = ax[1][i].scatter(data.loss,data.stdev,c=data.lr_gen,lw=0.1,alpha=0.5,s=20,cmap='plasma')
    ax[1][i].minorticks_off()
    ax[1][i].set_xlabel('Best Mean Wasserstein Distance')
ax[1][0].set_ylabel(r'1$\mathrm{\sigma}$ Deviation')
cax = plt.axes([0.91, 0.125, 0.02, 0.575])
cbar = plt.colorbar(im,cax=cax)
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label('Generator Learning Rate')
plt.savefig('case2_solutions.jpg')
plt.show()
plt.close()
Dataset 1 : 2000 solutions displayed
Dataset 2 : 2000 solutions displayed
Dataset 3 : 2000 solutions displayed

Loss convergence ¶
[ ]:
import numpy
from hyppo.utils.extract import mult_convergence, extract, single_convergence
from matplotlib.ticker import FormatStrFormatter
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(14,8),dpi=300)
ax1 = fig.add_axes([0.08,0.56,0.30,0.38])
ax2 = fig.add_axes([0.39,0.56,0.10,0.38])
ax3 = fig.add_axes([0.56,0.56,0.30,0.38])
ax4 = fig.add_axes([0.87,0.56,0.10,0.38])
ax5 = fig.add_axes([0.08,0.10,0.30,0.38])
ax6 = fig.add_axes([0.39,0.10,0.10,0.38])
ax7 = fig.add_axes([0.56,0.10,0.41,0.38])
ax = numpy.array([[ax1,ax2],[ax3,ax4],[ax5,ax6],[ax7,None]])
colors = ['teal','orange','tomato']
titles = ['Random Sampling','Gaussian Process','Radial Basis Function']
for n,path in enumerate(['100','10GP','10RBF']):
    print(path)
    extras = {'with_surrogate':False} if path=='100' else {'with_surrogate':True}
    loss, std = mult_convergence('PaperData/dimuon/%s'%path,**extras)
    loss, std = loss[:100], std[:100]
    ax[n,0].plot(loss,lw=2,zorder=2,color=colors[n],drawstyle='steps-post')
    ax[n,0].fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[n],step='post',zorder=2)
    losses = []
    for i in range(20):
        solutions = extract('PaperData/dimuon/%s/%03i/logs/*_01.log'%(path,i+1),**extras).head(100)
        loss1, std1 = single_convergence(solutions)
        ax[n,0].plot(loss1,lw=0.5,zorder=1,color='0.5',drawstyle='steps-post')
        ax[n,0].errorbar(x=solutions.nevals,y=solutions.loss,yerr=solutions.stdev,fmt='o',ms=0.8,c='0.7',mfc='k',lw=0)
        losses.extend(solutions.loss)
    ax[n,1].hist(losses,range=[0,1],rwidth=0.9,bins=16,histtype='bar',orientation='horizontal',color='slategrey')
    ax[n,0].set_ylabel('Wasserstein distance')
    ax[n,1].set_ylim([0,1])
    ax[n,1].set_xlim([0,600])
    ax[n,0].set_ylim([0,1])
    ax[n,0].set_xlim([0,100])
    ax[n,0].set_title(titles[n])
    ax[n,1].xaxis.set_ticklabels([])
    ax[n,1].yaxis.set_ticklabels([])
    ax[3,0].plot(loss,color=colors[n],lw=1,zorder=1,drawstyle='steps-post',label=titles[n])
    ax[3,0].fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[n],step='post')
ax[3,0].axvline(10,color='black',lw=1,ls='dashed')
ax[3,0].set_yscale('log')
ax[2,0].set_xlabel('Index of function evaluations')
ax[3,0].grid(which='minor', color='w', linestyle='dashed', lw=0.8)
ax[3,0].yaxis.set_major_formatter(FormatStrFormatter('%.2f'))
ax[3,0].set_xlabel('Index of function evaluations')
ax[3,0].set_ylabel('Best Wasserstein distance so far')
ax[3,0].set_title('Loss Convergence')
leg = ax[3,0].legend(loc='upper right')
for line in leg.get_lines():
    line.set_linewidth(4.0)
ax[3,0].set_xlim([0,100])
plt.show()
100
10GP
10RBF
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/dimuon/10RBF/006/logs/surrogate*_01.log (98/99)
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/dimuon/10RBF/006/logs/*_01.log (108/109)

Sensitivity study ¶
[ ]:
import hyppo
import matplotlib.pyplot as plt
results = hyppo.extract('PaperData/dimuon/10RBF/*/logs/*_01.log')
best_losses = results[results['loss']<0.26]
data1, hp_names = hyppo.get_custom_data(results,sensitivity=True)[1:]
data2 = hyppo.get_custom_data(best_losses,sensitivity=True)[1]
fig,ax = plt.subplots(3,6,figsize=(14,6),gridspec_kw={'height_ratios': [1,3,3]},dpi=300,sharey='row',sharex='col')
plt.draw()
plt.subplots_adjust(top=0.97,wspace=0.05,hspace=0.05,right=0.92,left=0.08,bottom=0.15)
for n in range(6):
    if n+1>data1.shape[1]-3:
        break
    ax[0][n-1].hist(results[hp_names[n]],rwidth=0.9,bins=16,histtype='bar',color='slategrey')
    im1 = ax[1][n-1].scatter(data1[:,n+3],results.loss,c=results['stdev'],lw=0.5,ec='black',alpha=0.5,s=20,cmap='turbo')
    im2 = ax[2][n-1].scatter(data2[:,n+3],best_losses.loss,c=best_losses['stdev'],lw=0.5,ec='black',s=20,cmap='turbo')
    ax[2][n-1].set_xlabel(hp_names[n])
    ax[2][n-1].tick_params(axis='x',labelrotation=45)
ax[1][0].set_ylabel('Wasserstein\ndistance')
ax[2][0].set_ylabel('Wasserstein\ndistance below %.2f' % max(best_losses.loss))
cax = plt.axes([0.925, 0.15, 0.02, 0.69])
cbar = plt.colorbar(im1,cax=cax)
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label(r'1$\mathrm{\sigma}$ Deviation')
cbar.ax.xaxis.set_ticks_position('top')
cbar.ax.xaxis.set_label_position('top')
plt.savefig('case2_sensitivity.jpg')
plt.show()
plt.close()
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/dimuon/10RBF/*/logs/*_01.log (2174/2175)

Focus search solutions ¶
names : [noise_dim,batch_size, lr_gen,lr_disc, gen_layersize,disc_layersize]
mult : [ 1, 32,0.000001,0.00001, 8, 8]
xlow : [ 1, 8, 10, 1, 8, 8]
xup : [ 100, 64, 100, 200, 128, 128]
[8]:
import hyppo
data = hyppo.extract('PaperData/dimuon/focus_search/logs/*_01.log',raw=False).sort_values(by=['loss'])
print(len(data),'solutions')
best = data.loc[data.loss.idxmin()]
data[data.loss==min(data.loss)]
67 solutions
[8]:
| | loss | stdev | generator | discriminator | tot_prms | noise_dim | batch_size | lr_gen | lr_disc | gen_layersize | disc_layersize | nevals | time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 44 | 0.01304 | 0.00136 | 1025286 | 12169 | 1037455 | 96 | 512 | 0.000095 | 0.00005 | 960 | 104 | 17 | 4258.711 |
[ ]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,1,figsize=(14,6),dpi=300)
plt.subplots_adjust(left=0.08,top=0.95,bottom=0.12,right=0.88,wspace=0.05,hspace=0.1)
print('%i solutions displayed' % len(data))
im = ax.scatter(data.loss,data.stdev,c=data.tot_prms/1e3,lw=0.5,alpha=0.7,s=70,ec='black',cmap='terrain',zorder=2)
ax.minorticks_off()
ax.set_xlabel('Best Wasserstein Distance')
ax.set_ylabel(r'1$\mathrm{\sigma}$ Deviation')
ax.set_xscale('log')
ax.set_yscale('log')
ax.grid(which='minor', color='w', linestyle='dashed', lw=0.8)
ax.scatter([best.loss],[best.stdev], facecolors='none', edgecolors='tomato',s=500,lw=5)
cax = plt.axes([0.89, 0.12, 0.02, 0.83])
cbar = plt.colorbar(im,cax=cax,format='%ik')
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label('Number of trainable parameters')
plt.savefig('case2_focus.jpg')
plt.show()
67 solutions displayed

Final results ¶
[ ]:
from gan4hep.train_gan_2angles import run_training
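# best_random is a pandas Series; unpacking it passes each entry
# (noise_dim, batch_size, lr_gen, ...) as a keyword argument to run_training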
run_training(filename='gan4hep/data/mc16d_364100_100Kentries_dimuon.output',epochs=1000,**best_random)
[16]:
import os,numpy,math
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from gan4hep import get_inference
fig, ax = plt.subplots(2,3,figsize=(14,6),dpi=300,sharex='col')
plt.subplots_adjust(bottom=0.1, left=0.08, hspace=0.2, right=0.98, top=0.9)
ax = ax.flatten()
config = dict(histtype='step', lw=1)
histograms = [[],[],[],[],[],[]]
all_bins = []
unopt = get_inference('PaperData/dimuon/best_model/best_random/generator',filename='gan4hep/data/mc16d_364100_100Kentries_dimuon.output',n_events=dataset_size,batch_size=int(best_random.batch_size),noise_dim=int(best_random.noise_dim),dimuon=True)[2][0]
for trial in range(8):
    tot_wdis, truths, predictions = get_inference('PaperData/dimuon/best_model/optimized/%02i/generator'%(trial+1),filename='gan4hep/data/mc16d_364100_100Kentries_dimuon.output',n_events=dataset_size,n_iter=1,batch_size=int(best.batch_size),noise_dim=int(best.noise_dim),dimuon=True)
    tot_wdis = numpy.median(tot_wdis)
    print(trial+1,numpy.mean(tot_wdis))
    for idx in range(6):
        n_bins = 20 if idx not in [0,3] else 100
        trial_counts = []
        for n in range(predictions.shape[0]):
            # Invisible histograms (alpha=0) are only used to extract the binned counts
            counts, bins, _ = ax[idx].hist(predictions[n,:,idx],density=True, bins=n_bins, alpha=0, range=[-1,1], color='tomato', histtype='step', lw=0.1)
            trial_counts.append(counts)
        if trial==0:
            all_bins.append(bins)
        histograms[idx].append(numpy.median(trial_counts,axis=0))
for idx,name in enumerate(['Scaled $\mathrm{p_{T}}$','Scaled $\\eta$','Scaled $\phi$','Scaled $\mathrm{p_{T}}$','Scaled $\\eta$','Scaled $\phi$']):
    n_bins = 20 if idx not in [0,3] else 100
    ax[idx].hist(truths[:, idx], bins=n_bins, density=True, range=[-1,1], color='black', **config)
    ax[idx].hist(unopt[:, idx], density=True, bins=n_bins, range=[-1,1], color='blue', **config)
    if idx>=3:
        ax[idx].set_xlabel(r"%s" % name)
    mean = numpy.mean(histograms[idx],axis=0)
    stdev = numpy.std(histograms[idx],axis=0)
    ax[idx].stairs(mean, all_bins[idx], color='red',lw=1)
    ax[idx].stairs(mean-stdev/2, all_bins[idx], baseline=mean+stdev/2, color='red',lw=1, fill=True, alpha=0.2)
    if idx in [0,3]:
        ax[idx].set_ylabel(('1st' if idx==0 else '2nd')+' Muon\nNormalized Distribution')
        ax[idx].set_xlim([-1,-0.6])
    if idx in [2,5]:
        ax[idx].set_ylim([0.3,0.7])
line1 = Line2D([0], [0], label='Simulated Dataset', color='black', lw=5)
line2 = Line2D([0], [0], label='Best-random-GAN-generated events', color='blue', lw=5)
line3 = Line2D([0], [0], label='Optimized-GAN-generated events', color='red', lw=5)
fig.legend(handles=[line1,line2,line3], bbox_to_anchor=[0.03,0.95,1,0.05], loc='center', ncol=3)
plt.savefig('case2_histograms.jpg')
plt.show()
1 0.010210095128665367
2 0.012668120519568523
3 0.017405066794405382
4 0.010580390226095915
5 0.012686156473743418
6 0.010451896348968148
7 0.013104425355171164
8 0.016322168754413724

Processing time ¶
[ ]:
import glob
import numpy
time = {'hadron':None,'dimuon':None}
for name in time.keys():
    times, params = [], []
    for filename in glob.glob('PaperData/%s/10RBF/001/logs/*.log' % name):
        for line in numpy.loadtxt(filename,dtype=str,delimiter='\n'):
            if 'Number of parameters' in line:
                nprms = int(line.split()[-1].replace(',',''))
            if '\tTime' in line or 'Train Time' in line:
                times.append(float(line.split()[-2]))
                params.append(nprms)
    tot, mean, std, length = numpy.sum(times), numpy.mean(times), numpy.std(times), len(times)
    # Total serial time, plus estimates assuming 5x (single-level) and 50x (nested)
    # parallel speedups, matching the three bars in the chart below
    time[name] = numpy.array([[tot,tot/5,tot/50],[std*length,std*length/5,std*length/50]])
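To inspect these estimates in readable units, the following small sketch uses the time dictionary computed above (each entry holds a row of totals and a row of errors):
[ ]:
# Print estimated wall times in hours for each case and parallelization level
for name, (totals, errors) in time.items():
    for label, t in zip(['serial', 'single-level', 'nested'], totals):
        print('%s / %s: %.1f hours' % (name, label, t / 3600))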
[ ]:
import matplotlib
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,2,figsize=(14,5.2),dpi=300)
for i,name in enumerate(['hadron','dimuon']):
    ax[i].bar(range(3), time[name][0], yerr=time[name][1], align='center', alpha=0.5, ecolor='black',error_kw=dict(lw=2, capsize=5, capthick=2))
    ax[i].set_yscale('log')
    ax[i].set_yticks([3600,3600*24,3600*24*7])
    ax[i].set_yticklabels(['1 hour','1 day','1 week'])
    ax[i].set_xticks([0,1,2])
    ax[i].set_xticklabels(['No Parallelization','Single-level\nParallelization','Asynchronous\nNested Parallelism\n(this work)'])
    ax[i].set_title('Hadronization Processes' if i==0 else 'Dimuon Decay')
plt.tight_layout()
plt.savefig('speedup.jpg')
plt.show()
