Optimizing GAN for HEP

Open In Colab ArXiv DOI

↑ Click on “Open in Colab” to execute this notebook live on Google Colaboratory.

In this Google Colab notebook, we provide step-by-step instructions on how to reproduce the results and figures presented in our paper. All results were obtained on the NERSC/Cori supercomputer at Lawrence Berkeley National Laboratory.

Getting Started

In order to get started with running HYPPO and performing the post-optimization analysis of the results, we need to install a few packages. The code snippets below will:

  1. Install HYPPO through the pip Python package manager;

  2. Download all the HYPPO product files from our project;

  3. Install the HEP/GAN package;

  4. Install the latest version of the matplotlib library to avoid common errors from older versions.

Software

[1]:
%%capture
!pip install hpo-uq

HYPPO product files

The following commands will download all the log files generated by the HYPPO software during the HPO searches, as well as all the SLURM scripts used to submit the jobs to NERSC's Cori cluster, and the best models saved by the HEP/GAN package.

[2]:
%%capture
!git clone https://gitlab.com/hpo-uq/publications/paper-2 PaperData

GAN for HEP package

We can now clone the git repository that contains the HEP/GAN package as well as both datasets used in this work.

[3]:
%%capture
!git clone https://gitlab.com/hpo-uq/applications/gan4hep
!cd gan4hep && pip install .

Matplotlib

[4]:
%%capture
!pip install -U matplotlib

The following command will restart the runtime. This is required in Google Colab for the matplotlib upgrade to take effect. Executing it will cause the session to crash and restart automatically; this is expected behaviour, and the error messages that get displayed can be ignored.

[ ]:
import os
# Kill the current process so that Colab automatically restarts the runtime
os.kill(os.getpid(), 9)

The following matplotlib settings will be used for all the figures generated in this notebook.

[1]:
import matplotlib.pyplot as plt
plt.style.use('seaborn')
plt.rc('font', size=15)
plt.rc('axes', labelsize=15, titlesize=15)
plt.rc('legend', fontsize=15)
plt.rc('xtick', labelsize=15)
plt.rc('ytick', labelsize=15)

Hadronization Processes

Parameterized simulations

[2]:
from gan4hep import load_data
data = load_data(filename='gan4hep/data/ClusterTo2Pi0.dat')
dataset_size = len(data['phi'])
[ ]:
import math, numpy
import matplotlib.pyplot as plt
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14,5),dpi=300)
ax1.hist(numpy.vstack((data['px'],data['py'],data['pz'])).T, 40, density=True,histtype='step',lw=2,color=['limegreen','orangered','dodgerblue'],label=[r'$p_x$',r'$p_y$',r'$p_z$'])
ax1.set_ylim(ymin=0.95)
ax1.set_xlabel('Momentum')
ax1.legend(loc='upper center',ncol=3)
ax2.hist(numpy.vstack((data['phi'],data['theta'])).T, 40, density=True,histtype='step',lw=2,color=['crimson','mediumorchid'],label=[r'$\phi$',r'$\theta$'])
ax2.set_xticks([-math.pi/2,-math.pi/4,0,math.pi/4,math.pi/2])
ax2.set_xticklabels([r'-$\pi$/2',r'-$\pi$/4','0',r'+$\pi$/4',r'+$\pi$/2'])
ax2.set_xlabel('Angle')
ax2.legend(loc='upper center')
plt.tight_layout()
plt.show()
../_images/papers_gan4hep_17_0.png

Broad search solutions

For the broad search, we created 3 repositories, one for each of the 3 hyperparameter optimization approaches:

  - 100 stores the results from 20 independent random sampling searches, each evaluating 100 different sets of hyperparameters;

  - 10GP stores the results from 20 independent Gaussian Process optimizations, each starting from an initial experimental design of 10 sets of hyperparameters;

  - 10RBF stores the results from 20 independent Radial Basis Function optimizations, each starting from an initial experimental design of 10 sets of hyperparameters.

The following search space was used for all three approaches:

names  : [noise_dim,batch_size,  lr_gen,lr_disc,gen_layersize,disc_layersize]
mult   : [        4,        32, 0.00005,0.00005,            8,             8]
xlow   : [        1,         8,       1,      1,            8,             8]
xup    : [      100,        64,     200,    200,          128,           128]
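
Assuming the convention suggested by these values (each hyperparameter is drawn as an integer between xlow and xup and then scaled by the corresponding mult factor, which is consistent with the best solutions reported below), the actual search ranges can be recovered with the short sketch below. This helper is illustrative only and not part of HYPPO.

[ ]:
# Illustrative only: recover the actual search range of each hyperparameter,
# assuming an integer is sampled in [xlow, xup] and multiplied by mult
names = ['noise_dim', 'batch_size', 'lr_gen', 'lr_disc', 'gen_layersize', 'disc_layersize']
mult  = [4, 32, 0.00005, 0.00005, 8, 8]
xlow  = [1, 8, 1, 1, 8, 8]
xup   = [100, 64, 200, 200, 128, 128]
for name, m, lo, hi in zip(names, mult, xlow, xup):
    print('%-15s : %g to %g' % (name, m * lo, m * hi))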

All the results can be loaded as follows:

[12]:
import hyppo
random = hyppo.extract('PaperData/hadron/100/*/logs/evaluation_*_01.log',target_unc='STD',target_loss='outer')
gp     = hyppo.extract('PaperData/hadron/10GP/*/logs/surrogate_*_01.log',target_unc='STD')
rbf    = hyppo.extract('PaperData/hadron/10RBF/*/logs/surrogate_*_01.log',target_unc='STD')
# Store details from best random model for final figure
best_random = random.loc[random.loss.idxmin()]
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/*/logs/surrogate_*_01.log (1978/1979)

A total of almost 2,000 architectures were evaluated for each approach. The Radial Basis Function (RBF) optimization appears to outperform both the random sampling and Gaussian Process approaches at finding optimal solutions.
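
This can be checked directly on the extracted tables; the short printout below is an illustrative addition summarizing each strategy from the DataFrames loaded above.

[ ]:
# Summarize each search strategy (illustrative, not part of the original analysis)
for label, df in [('Random Sampling', random), ('Gaussian Process', gp),
                  ('Radial Basis Function', rbf)]:
    print('%-22s : %4i evaluations, best loss = %.5f, median loss = %.5f'
          % (label, len(df), df.loss.min(), df.loss.median()))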

[ ]:
import matplotlib.pyplot as plt
input_data = [[random,'Random Sampling'],[gp,'Gaussian Process'],[rbf,'Radial Basis Function']]
fig, ax = plt.subplots(2,len(input_data),figsize=(4*len(input_data)+2,5),gridspec_kw={'height_ratios': [1, 3]},dpi=300,sharex=True,sharey='row')
plt.subplots_adjust(left=0.05,top=0.93,right=0.9,wspace=0.05,hspace=0.1)
for i,data in enumerate(input_data):
    data, title = data
    ax[0][i].set_title('(%i) %s' % (i+1,title))
    print('Dataset %i : %i solutions displayed' % (i+1,len(data)))
    ax[0][i].hist(data.loss,range=[0,1.6],rwidth=0.9,bins=16,histtype='bar')
    im = ax[1][i].scatter(data.loss,data.stdev,c=data.lr_gen,lw=0.1,alpha=0.5,s=20,cmap='plasma')
    ax[1][i].minorticks_off()
    ax[1][i].set_xlabel('Best Mean Wasserstein Distance')
    ax[1][i].scatter([0.125],[0.030], facecolors='none', edgecolors='tomato',s=1200,lw=2)
ax[1][0].set_ylabel(r'1$\mathrm{\sigma}$ Deviation')
cax = plt.axes([0.91, 0.125, 0.02, 0.575])
cbar = plt.colorbar(im,cax=cax)
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label('Generator Learning Rate')
plt.savefig('case1_solutions.jpg')
plt.show()
plt.close()
Dataset 1 : 2000 solutions displayed
Dataset 2 : 1979 solutions displayed
Dataset 3 : 1979 solutions displayed
../_images/papers_gan4hep_21_1.png

Loss convergence

[ ]:
import numpy
from hyppo.utils.extract import mult_convergence, extract, single_convergence
from matplotlib.ticker import FormatStrFormatter
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(14,8),dpi=300)
ax1 = fig.add_axes([0.08,0.56,0.30,0.38])
ax2 = fig.add_axes([0.39,0.56,0.10,0.38])
ax3 = fig.add_axes([0.56,0.56,0.30,0.38])
ax4 = fig.add_axes([0.87,0.56,0.10,0.38])
ax5 = fig.add_axes([0.08,0.10,0.30,0.38])
ax6 = fig.add_axes([0.39,0.10,0.10,0.38])
ax7 = fig.add_axes([0.56,0.10,0.41,0.38])
ax = numpy.array([[ax1,ax2],[ax3,ax4],[ax5,ax6],[ax7,None]])
colors = ['teal','orange','tomato']
titles = ['Random Sampling','Gaussian Process','Radial Basis Function']
for n,path in enumerate(['100','10GP','10RBF']):
    print(path)
    extras = {'target_unc':'STD','with_surrogate':False, 'target_loss':'outer'} if path=='100' else {'target_unc':'STD','with_surrogate':True}
    loss, std = mult_convergence('PaperData/hadron/%s'%path,**extras)
    loss, std = loss[:100], std[:100]
    ax[n,0].plot(loss,lw=2,zorder=2,color=colors[n],drawstyle='steps-post')
    ax[n,0].fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[n],step='post',zorder=2)
    losses = []
    for i in range(20):
        solutions = extract('PaperData/hadron/%s/%03i/logs/*_01.log'%(path,i+1),**extras).head(100)
        loss1, std1 = single_convergence(solutions)
        ax[n,0].plot(loss1,lw=0.5,zorder=1,color='0.5',drawstyle='steps-post')
        ax[n,0].errorbar(x=solutions.nevals,y=solutions.loss,yerr=solutions.stdev,fmt='o',ms=0.8,c='0.7',mfc='k',lw=0)
        losses.extend(solutions.loss)
    ax[n,1].hist(losses,range=[0,1.6],rwidth=0.9,bins=16,histtype='bar',orientation='horizontal',color='slategrey')
    ax[n,0].set_ylabel('Wasserstein distance')
    ax[n,1].set_ylim([0,1.6])
    ax[n,1].set_xlim([0,600])
    ax[n,0].set_ylim([0,1.6])
    ax[n,0].set_xlim([0,100])
    ax[n,0].set_title(titles[n])
    ax[n,1].xaxis.set_ticklabels([])
    ax[n,1].yaxis.set_ticklabels([])
    ax[3,0].plot(loss,color=colors[n],lw=1,zorder=1,drawstyle='steps-post',label=titles[n])
    ax[3,0].fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[n],step='post')
    ax[3,0].axvline(10,color='black',lw=1,ls='dashed')
    ax[3,0].set_yscale('log')
ax[2,0].set_xlabel('Index of function evaluations')
ax[3,0].grid(which='minor', color='w', linestyle='dashed', lw=0.8)
ax[3,0].yaxis.set_major_formatter(FormatStrFormatter('%.2f'))
ax[3,0].set_xlabel('Index of function evaluations')
ax[3,0].set_ylabel('Best Wasserstein distance so far')
ax[3,0].set_title('Loss Convergence')
leg = ax[3,0].legend(loc='upper right')
for line in leg.get_lines():
    line.set_linewidth(4.0)
ax[3,0].set_xlim([0,100])
plt.show()
100
10GP
10RBF
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/015/logs/surrogate*_01.log (98/99)
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/015/logs/*_01.log (108/109)
../_images/papers_gan4hep_23_1.png
[ ]:
import hyppo
conv_rbf    = hyppo.mult_convergence('PaperData/hadron/10RBF',with_surrogate=True,target_unc='STD')
conv_gp     = hyppo.mult_convergence('PaperData/hadron/10GP',with_surrogate=True,target_unc='STD')
conv_random = hyppo.mult_convergence('PaperData/hadron/100',with_surrogate=False,target_unc='STD',target_loss='outer')
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/015/logs/surrogate*_01.log (98/99)
[ ]:
import matplotlib.pyplot as plt
from matplotlib.ticker import FormatStrFormatter
colors = ['teal','orange','tomato']
plt.figure(figsize=(0.8*14,7),dpi=300)
for i,((loss,std),title) in enumerate(((conv_random,'Random Sampling'),(conv_gp,'Gaussian Process'),(conv_rbf,'Radial Basis Function'))):
    plt.plot(loss,color=colors[i],lw=1,zorder=1,drawstyle='steps-post',label=title)
    plt.fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[i],step='post')
    plt.axvline(10,color='black',lw=1,ls='dashed')
    plt.xlim([0,100])
    plt.yscale('log')
plt.grid(which='minor', color='w', linestyle='dashed', lw=0.8)
plt.gca().yaxis.set_major_formatter(FormatStrFormatter('%.2f'))
plt.xlabel('Index of function evaluations')
plt.ylabel('Best Wasserstein distance so far (log scaling)')
leg = plt.legend(loc='upper right')
for line in leg.get_lines():
    line.set_linewidth(4.0)
plt.tight_layout()
plt.savefig('case1_convergence.jpg')
plt.show()
../_images/papers_gan4hep_25_0.png

Sensitivity analysis

[ ]:
import hyppo
import matplotlib.pyplot as plt
results = hyppo.extract('PaperData/hadron/10RBF/*/logs/surrogate_*_01.log',target_unc='STD')
best_losses = results[results['loss']<0.05]
data1, hp_names = hyppo.get_custom_data(results,sensitivity=True)[1:]
data2 = hyppo.get_custom_data(best_losses,sensitivity=True)[1]
fig,ax = plt.subplots(3,6,figsize=(14,6),gridspec_kw={'height_ratios': [1,3,3]},dpi=300,sharey='row',sharex='col')
plt.draw()
plt.subplots_adjust(top=0.97,wspace=0.05,hspace=0.05,right=0.92,left=0.07,bottom=0.15)
for n in range(6):
    if n+1>data1.shape[1]-3:
        break
    ax[0][n-1].hist(results[hp_names[n]],rwidth=0.9,bins=16,histtype='bar',color='slategrey')
    im1 = ax[1][n-1].scatter(data1[:,n+3],results.loss,c=results['stdev'],lw=0.5,ec='black',alpha=0.5,s=20,cmap='turbo')
    im2 = ax[2][n-1].scatter(data2[:,n+3],best_losses.loss,c=best_losses['stdev'],lw=0.5,ec='black',s=20,cmap='turbo')
    ax[2][n-1].set_xlabel(hp_names[n])
    ax[2][n-1].tick_params(axis='x',labelrotation=45)
ax[1][0].set_ylabel('Wasserstein\nDistance')
ax[2][0].set_ylabel('Wasserstein\nDistance below %.2f' % max(best_losses.loss))
cax = plt.axes([0.93, 0.15, 0.02, 0.69])
cbar = plt.colorbar(im1,cax=cax)
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label(r'1$\mathrm{\sigma}$ Deviation')
cbar.ax.xaxis.set_ticks_position('top')
cbar.ax.xaxis.set_label_position('top')
plt.savefig('case1_sensitivity.jpg')
plt.show()
plt.close()
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/hadron/10RBF/*/logs/surrogate_*_01.log (1978/1979)
../_images/papers_gan4hep_27_1.png
[ ]:
plt.subplots(figsize=(8,6))
plt.scatter(results.batch_size,results.lr_gen,c=results.loss,cmap='turbo',ec='k',lw=0.2)
plt.xlabel('Batch Size')
plt.ylabel('Generator Learning Rate')
cbar = plt.colorbar()
cbar.set_label('Wassertein Distance')
plt.tight_layout()
plt.savefig('figure.png',dpi=200)
plt.show()
../_images/papers_gan4hep_28_0.png

Generator learning rate

The sensitivity analysis from the broad search reveals that the loss is strongly sensitive to the generator learning rate. Here we investigate how low the generator learning rate can go, and whether the loss starts increasing again below some threshold. All hyperparameters except the generator learning rate were fixed to values corresponding to one of the best solutions found during the broad search, i.e. with noise_dim set to 4, batch_size to 768, lr_disc to 0.00265, gen_layersize to 72, and disc_layersize to 288.
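
A minimal sketch of such a scan is shown below. The entry point and its signature are assumptions: we reuse the run_training function that appears later for the dimuon case, and the actual hadron training interface may differ.

[ ]:
from gan4hep.train_gan_2angles import run_training
# Hypothetical scan: every hyperparameter is fixed to the broad-search optimum
# quoted above, and only the generator learning rate is varied
fixed = dict(noise_dim=4, batch_size=768, lr_disc=0.00265,
             gen_layersize=72, disc_layersize=288)
for lr_gen in [1e-3, 1e-4, 1e-5, 1e-6, 1e-7]:
    run_training(filename='gan4hep/data/ClusterTo2Pi0.dat',
                 epochs=1000, lr_gen=lr_gen, **fixed)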

Focus search solutions

Based on the much narrower range of values covered by each hyperparameter among the solutions with the lowest losses, i.e. below 0.05, we performed a series of focus searches to narrow down the region of optimal solutions within the explored hyperparameter space.

[4]:
import hyppo
data = hyppo.extract('PaperData/hadron/focus_search/logs/*_01.log')
print(len(data),'solutions')
best = data.loc[data.loss.idxmin()]
data[data.loss==min(data.loss)]
70 solutions
[4]:
       loss    stdev  generator  discriminator  tot_prms  noise_dim  batch_size    lr_gen  lr_disc  gen_layersize  disc_layersize  nevals       time
49  0.01398  0.00349       5202          61237     66439          3         642  0.000004   0.0024             65             243      56  12241.653
[ ]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,1,figsize=(14,6),dpi=300)
plt.subplots_adjust(left=0.08,top=0.95,bottom=0.12,right=0.88,wspace=0.05,hspace=0.1)
print('%i solutions displayed' % len(data))
im = ax.scatter(data.loss,data.stdev,c=data.tot_prms/1e3,lw=0.5,alpha=0.7,s=70,ec='black',cmap='terrain',zorder=2)
ax.minorticks_off()
ax.set_xlabel('Best Wasserstein Distance')
ax.set_ylabel(r'1$\mathrm{\sigma}$ Deviation')
ax.set_xscale('log')
ax.set_yscale('log')
ax.grid(which='minor', color='w', linestyle='dashed', lw=0.8)
ax.scatter([best.loss],[best.stdev], facecolors='none', edgecolors='tomato',s=500,lw=5)
cax = plt.axes([0.89, 0.12, 0.02, 0.83])
cbar = plt.colorbar(im,cax=cax,format='%ik')
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label('Number of trainable parameters')
plt.savefig('case1_focus.jpg')
plt.show()
70 solutions displayed
../_images/papers_gan4hep_38_1.png

Final results

[5]:
import os,numpy,math
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from gan4hep import get_inference
n_bins = 20
fig, ax = plt.subplots(1,2,figsize=(14,5),dpi=300,sharex=True)
plt.subplots_adjust(bottom=0.12, left=0.08, wspace=0.15, right=0.98, top=0.9)
config = dict(histtype='step', lw=1)
histograms = [[],[]]
unopt = math.pi/2 * get_inference('PaperData/hadron/best_model/best_random/generator',filename='gan4hep/data/ClusterTo2Pi0.dat',n_events=dataset_size,batch_size=int(best_random.batch_size),noise_dim=int(best_random.noise_dim))[2][0]
for trial in range(8):
    tot_wdis, truths, predictions = get_inference(os.path.join('PaperData/hadron/best_model/optimized/%02i/generator'%(trial+1)),filename='gan4hep/data/ClusterTo2Pi0.dat',n_events=dataset_size,batch_size=int(best.batch_size),noise_dim=int(best.noise_dim))
    tot_wdis = numpy.median(tot_wdis)
    print(trial+1,numpy.mean(tot_wdis))
    for idx in range(2):
        trial_counts = []
        for n in range(predictions.shape[0]):
            counts, bins, _ = ax[idx].hist(predictions[n,:,idx],density=True, bins=n_bins, linewidth=0.5, alpha=0, range=[-numpy.float32(numpy.pi/2), numpy.float32(numpy.pi/2)], color='tomato', histtype='step', lw=0.1)
            trial_counts.append(counts)
        histograms[idx].append(numpy.median(trial_counts,axis=0))
for idx,name in enumerate(['phi','theta']):
    ax[idx].hist(truths[:, idx], density=True, bins=n_bins, range=[-numpy.float32(numpy.pi/2), numpy.float32(numpy.pi/2)], color='black', **config)
    ax[idx].hist(unopt[:, idx], density=True, bins=n_bins, range=[-numpy.float32(numpy.pi/2), numpy.float32(numpy.pi/2)], color='blue', **config)
    ax[idx].set_xticks([-numpy.float32(math.pi/2),-numpy.float32(math.pi/4),0,numpy.float32(math.pi/4),numpy.float32(math.pi/2)])
    ax[idx].set_xticklabels([r'-$\pi$/2',r'-$\pi$/4','0',r'+$\pi$/4',r'+$\pi$/2'])
    ax[idx].set_xlabel(r"$\%s$" % name)
    ax[idx].set_xlim(-5*math.pi/8,5*math.pi/8)
    mean = numpy.mean(histograms[idx],axis=0)
    stdev = numpy.std(histograms[idx],axis=0)
    ax[idx].stairs(mean, bins, color='red', lw=1)
    ax[idx].stairs(mean-stdev/2, bins, baseline=mean+stdev/2, color='red',lw=1, fill=True, alpha=0.2)
    ax[idx].legend(loc='upper center',ncol=1)
    ax[idx].set_ylim([0.25,0.38] if idx==0 else [0,0.6])
ax[0].set_ylabel('Normalized Distribution')
line1 = Line2D([0], [0], label='Simulated Dataset', color='black', lw=5)
line2 = Line2D([0], [0], label='Best-random-GAN-generated events', color='blue', lw=5)
line3 = Line2D([0], [0], label='Optimized-GAN-generated events', color='red', lw=5)
fig.legend(handles=[line1,line2,line3], bbox_to_anchor=[0.03,0.95,1,0.05], loc='center',  ncol=3)
plt.savefig('case1_histograms.jpg')
plt.show()
1 0.010046488139778376
2 0.019267020747065544
3 0.013843555003404617
4 0.012103751068934798
5 0.01173804746940732
6 0.011390266939997673
7 0.017358052544295788
WARNING:matplotlib.legend:No artists with labels found to put in legend.  Note that artists whose label start with an underscore are ignored when legend() is called with no argument.
WARNING:matplotlib.legend:No artists with labels found to put in legend.  Note that artists whose label start with an underscore are ignored when legend() is called with no argument.
8 0.013876878656446934
../_images/papers_gan4hep_40_3.png

Dimuon decay

Parameterized simulations

[6]:
import numpy
data = {}
filename = 'gan4hep/data/mc16d_364100_100Kentries_dimuon.output'
events = numpy.loadtxt(filename)
data['E1']     = events[:,0].astype(numpy.float32)
data['phi1']   = events[:,1].astype(numpy.float32)
data['theta1'] = events[:,2].astype(numpy.float32)
data['E2']     = events[:,3].astype(numpy.float32)
data['phi2']   = events[:,4].astype(numpy.float32)
data['theta2'] = events[:,5].astype(numpy.float32)
dataset_size = len(events)
[ ]:
import math, numpy
import matplotlib.pyplot as plt
fig, (ax1,ax2) = plt.subplots(1,2,figsize=(14,5),dpi=300)
ax1.hist(numpy.vstack((data['E1'],data['E2'])).T, 40, histtype='step',lw=2,color=['dodgerblue','tomato'],label=[r'$E_1$',r'$E_2$'])
ax1.set_xlabel('Energy')
ax1.legend(loc='upper right')
ax2.hist(numpy.vstack((data['phi1'],data['phi2'])).T, 40,histtype='step',lw=2,color=['dodgerblue','tomato'],label=[r'$\phi_1$',r'$\phi_2$'])
ax2.hist(numpy.vstack((data['theta1'],data['theta2'])).T, 40,histtype='step',lw=2,color=['yellowgreen','gold'],label=[r'$\theta_1$',r'$\theta_2$'])
ax2.set_xlabel('Angle')
ax2.set_xticks([-math.pi,-math.pi/2,0,math.pi/2,math.pi])
ax2.set_xticklabels([r'-$\pi$',r'-$\pi$/2','0',r'+$\pi$/2',r'+$\pi$'])
ax2.legend(loc='lower center')
plt.tight_layout()
plt.show()
../_images/papers_gan4hep_44_0.png

Broad search solutions

names  : [noise_dim,batch_size,  lr_gen,lr_disc,gen_layersize,disc_layersize]
mult   : [        4,        32, 0.00005,0.00005,            8,             8]
xlow   : [        1,         8,       1,      1,            8,             8]
xup    : [      100,        64,     200,    200,          128,           128]
[15]:
import hyppo
random = hyppo.extract('PaperData/dimuon/100/*/logs/*_01.log')
gp     = hyppo.extract('PaperData/dimuon/10GP/*/logs/*_01.log')
rbf    = hyppo.extract('PaperData/dimuon/10RBF/*/logs/*_01.log')
# Store details from best random model for final figure
best_random = random.loc[random.loss.idxmin()]
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/dimuon/10RBF/*/logs/*_01.log (2174/2175)
[24]:
best_random
[24]:
loss                   0.22607
stdev                  0.13811
generator         451974.00000
discriminator      91465.00000
tot_prms          543439.00000
noise_dim            316.00000
batch_size           480.00000
lr_gen                 0.00010
lr_disc                0.00070
gen_layersize        528.00000
disc_layersize       296.00000
nevals               277.00000
time                 167.80300
Name: 1331, dtype: float64
[ ]:
import matplotlib.pyplot as plt
input_data = [[random,'Random Sampling'],[gp,'Gaussian Process'],[rbf,'Radial Basis Function']]
fig, ax = plt.subplots(2,len(input_data),figsize=(4*len(input_data)+2,5),gridspec_kw={'height_ratios': [1, 3]},dpi=300,sharex=True,sharey='row')
if len(input_data)==1:
    ax = numpy.array(ax,ndmin=2).T
plt.subplots_adjust(left=0.05,top=0.93,right=0.9,wspace=0.05,hspace=0.1)
for i,(data, title) in enumerate(input_data):
    data = data[:2000]
    ax[0][i].set_title('(%i) %s' % (i+1,title))
    print('Dataset %i : %i solutions displayed' % (i+1,len(data)))
    ax[0][i].hist(data.loss,range=[0.2,1.1],rwidth=0.9,bins=16,histtype='bar')
    im = ax[1][i].scatter(data.loss,data.stdev,c=data.lr_gen,lw=0.1,alpha=0.5,s=20,cmap='plasma')
    ax[1][i].minorticks_off()
    ax[1][i].set_xlabel('Best Mean Wasserstein Distance')
ax[1][0].set_ylabel(r'1$\mathrm{\sigma}$ Deviation')
cax = plt.axes([0.91, 0.125, 0.02, 0.575])
cbar = plt.colorbar(im,cax=cax)
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label('Generator Learning Rate')
plt.savefig('case2_solutions.jpg')
plt.show()
plt.close()
Dataset 1 : 2000 solutions displayed
Dataset 2 : 2000 solutions displayed
Dataset 3 : 2000 solutions displayed
../_images/papers_gan4hep_48_1.png

Loss convergence

[ ]:
import numpy
from hyppo.utils.extract import mult_convergence, extract, single_convergence
from matplotlib.ticker import FormatStrFormatter
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(14,8),dpi=300)
ax1 = fig.add_axes([0.08,0.56,0.30,0.38])
ax2 = fig.add_axes([0.39,0.56,0.10,0.38])
ax3 = fig.add_axes([0.56,0.56,0.30,0.38])
ax4 = fig.add_axes([0.87,0.56,0.10,0.38])
ax5 = fig.add_axes([0.08,0.10,0.30,0.38])
ax6 = fig.add_axes([0.39,0.10,0.10,0.38])
ax7 = fig.add_axes([0.56,0.10,0.41,0.38])
ax = numpy.array([[ax1,ax2],[ax3,ax4],[ax5,ax6],[ax7,None]])
colors = ['teal','orange','tomato']
titles = ['Random Sampling','Gaussian Process','Radial Basis Function']
for n,path in enumerate(['100','10GP','10RBF']):
    print(path)
    extras = {'with_surrogate':False} if path=='100' else {'with_surrogate':True}
    loss, std = mult_convergence('PaperData/dimuon/%s'%path,**extras)
    loss, std = loss[:100], std[:100]
    ax[n,0].plot(loss,lw=2,zorder=2,color=colors[n],drawstyle='steps-post')
    ax[n,0].fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[n],step='post',zorder=2)
    losses = []
    for i in range(20):
        solutions = extract('PaperData/dimuon/%s/%03i/logs/*_01.log'%(path,i+1),**extras).head(100)
        loss1, std1 = single_convergence(solutions)
        ax[n,0].plot(loss1,lw=0.5,zorder=1,color='0.5',drawstyle='steps-post')
        ax[n,0].errorbar(x=solutions.nevals,y=solutions.loss,yerr=solutions.stdev,fmt='o',ms=0.8,c='0.7',mfc='k',lw=0)
        losses.extend(solutions.loss)
    ax[n,1].hist(losses,range=[0,1],rwidth=0.9,bins=16,histtype='bar',orientation='horizontal',color='slategrey')
    ax[n,0].set_ylabel('Wasserstein distance')
    ax[n,1].set_ylim([0,1])
    ax[n,1].set_xlim([0,600])
    ax[n,0].set_ylim([0,1])
    ax[n,0].set_xlim([0,100])
    ax[n,0].set_title(titles[n])
    ax[n,1].xaxis.set_ticklabels([])
    ax[n,1].yaxis.set_ticklabels([])
    ax[3,0].plot(loss,color=colors[n],lw=1,zorder=1,drawstyle='steps-post',label=titles[n])
    ax[3,0].fill_between(range(len(loss)),loss-std/2,loss+std/2,alpha=0.2,lw=0,color=colors[n],step='post')
    ax[3,0].axvline(10,color='black',lw=1,ls='dashed')
    ax[3,0].set_yscale('log')
ax[2,0].set_xlabel('Index of function evaluations')
ax[3,0].grid(which='minor', color='w', linestyle='dashed', lw=0.8)
ax[3,0].yaxis.set_major_formatter(FormatStrFormatter('%.2f'))
ax[3,0].set_xlabel('Index of function evaluations')
ax[3,0].set_ylabel('Best Wasserstein distance so far')
ax[3,0].set_title('Loss Convergence')
leg = ax[3,0].legend(loc='upper right')
for line in leg.get_lines():
    line.set_linewidth(4.0)
ax[3,0].set_xlim([0,100])
plt.show()
100
10GP
10RBF
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/dimuon/10RBF/006/logs/surrogate*_01.log (98/99)
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/dimuon/10RBF/006/logs/*_01.log (108/109)
../_images/papers_gan4hep_50_1.png

Sensitivity study

[ ]:
import hyppo
import matplotlib.pyplot as plt
results = hyppo.extract('PaperData/dimuon/10RBF/*/logs/*_01.log')
best_losses = results[results['loss']<0.26]
data1, hp_names = hyppo.get_custom_data(results,sensitivity=True)[1:]
data2 = hyppo.get_custom_data(best_losses,sensitivity=True)[1]
fig,ax = plt.subplots(3,6,figsize=(14,6),gridspec_kw={'height_ratios': [1,3,3]},dpi=300,sharey='row',sharex='col')
plt.draw()
plt.subplots_adjust(top=0.97,wspace=0.05,hspace=0.05,right=0.92,left=0.08,bottom=0.15)
for n in range(6):
    if n+1>data1.shape[1]-3:
        break
    ax[0][n-1].hist(results[hp_names[n]],rwidth=0.9,bins=16,histtype='bar',color='slategrey')
    im1 = ax[1][n-1].scatter(data1[:,n+3],results.loss,c=results['stdev'],lw=0.5,ec='black',alpha=0.5,s=20,cmap='turbo')
    im2 = ax[2][n-1].scatter(data2[:,n+3],best_losses.loss,c=best_losses['stdev'],lw=0.5,ec='black',s=20,cmap='turbo')
    ax[2][n-1].set_xlabel(hp_names[n])
    ax[2][n-1].tick_params(axis='x',labelrotation=45)
ax[1][0].set_ylabel('Wasserstein\ndistance')
ax[2][0].set_ylabel('Wasserstein\ndistance below %.2f' % max(best_losses.loss))
cax = plt.axes([0.925, 0.15, 0.02, 0.69])
cbar = plt.colorbar(im1,cax=cax)
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label(r'1$\mathrm{\sigma}$ Deviation')
cbar.ax.xaxis.set_ticks_position('top')
cbar.ax.xaxis.set_label_position('top')
plt.savefig('case2_sensitivity.jpg')
plt.show()
plt.close()
WARNING: Some hyperparameter sets were evaluated more than once in PaperData/dimuon/10RBF/*/logs/*_01.log (2174/2175)
../_images/papers_gan4hep_52_1.png

Focus search solutions

names  : [noise_dim,batch_size, lr_gen,lr_disc, gen_layersize,disc_layersize]
mult   : [        1,        32,0.000001,0.00001,            8,             8]
xlow   : [        1,         8,      10,      1,            8,             8]
xup    : [      100,        64,     100,    200,          128,           128]
[8]:
import hyppo
data = hyppo.extract('PaperData/dimuon/focus_search/logs/*_01.log',raw=False).sort_values(by=['loss'])
print(len(data),'solutions')
best = data.loc[data.loss.idxmin()]
data[data.loss==min(data.loss)]
67 solutions
[8]:
       loss    stdev  generator  discriminator  tot_prms  noise_dim  batch_size    lr_gen  lr_disc  gen_layersize  disc_layersize  nevals      time
44  0.01304  0.00136    1025286          12169   1037455         96         512  0.000095  0.00005            960             104      17  4258.711
[ ]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,1,figsize=(14,6),dpi=300)
plt.subplots_adjust(left=0.08,top=0.95,bottom=0.12,right=0.88,wspace=0.05,hspace=0.1)
print('%i solutions displayed' % len(data))
im = ax.scatter(data.loss,data.stdev,c=data.tot_prms/1e3,lw=0.5,alpha=0.7,s=70,ec='black',cmap='terrain',zorder=2)
ax.minorticks_off()
ax.set_xlabel('Best Wasserstein Distance')
ax.set_ylabel(r'1$\mathrm{\sigma}$ Deviation')
ax.set_xscale('log')
ax.set_yscale('log')
ax.grid(which='minor', color='w', linestyle='dashed', lw=0.8)
ax.scatter([best.loss],[best.stdev], facecolors='none', edgecolors='tomato',s=500,lw=5)
cax = plt.axes([0.89, 0.12, 0.02, 0.83])
cbar = plt.colorbar(im,cax=cax,format='%ik')
cbar.set_alpha(1)
cbar.draw_all()
cbar.set_label('Number of trainable parameters')
plt.savefig('case2_focus.jpg')
plt.show()
67 solutions displayed
../_images/papers_gan4hep_55_1.png

Final results

[ ]:
from gan4hep.train_gan_2angles import run_training
# Retrain a GAN using the best set of hyperparameters found by random sampling
run_training(filename='gan4hep/data/mc16d_364100_100Kentries_dimuon.output',epochs=1000,**best_random)
[16]:
import os,numpy,math
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from gan4hep import get_inference
fig, ax = plt.subplots(2,3,figsize=(14,6),dpi=300,sharex='col')
plt.subplots_adjust(bottom=0.1, left=0.08, hspace=0.2, right=0.98, top=0.9)
ax = ax.flatten()
config = dict(histtype='step', lw=1)
histograms = [[],[],[],[],[],[]]
all_bins = []
unopt = get_inference('PaperData/dimuon/best_model/best_random/generator',filename='gan4hep/data/mc16d_364100_100Kentries_dimuon.output',n_events=dataset_size,batch_size=int(best_random.batch_size),noise_dim=int(best_random.noise_dim),dimuon=True)[2][0]
for trial in range(8):
    tot_wdis, truths, predictions = get_inference('PaperData/dimuon/best_model/optimized/%02i/generator'%(trial+1),filename='gan4hep/data/mc16d_364100_100Kentries_dimuon.output',n_events=dataset_size,n_iter=1,batch_size=int(best.batch_size),noise_dim=int(best.noise_dim),dimuon=True)
    tot_wdis = numpy.median(tot_wdis)
    print(trial+1,numpy.mean(tot_wdis))
    for idx in range(6):
        n_bins = 20 if idx not in [0,3] else 100
        trial_counts = []
        for n in range(predictions.shape[0]):
            counts, bins, _ = ax[idx].hist(predictions[n,:,idx],density=True, bins=n_bins, linewidth=0.5, alpha=0, range=[-1,1], color='tomato', histtype='step', lw=0.1)
            trial_counts.append(counts)
        if trial==0:
            all_bins.append(bins)
        histograms[idx].append(numpy.median(trial_counts,axis=0))
for idx,name in enumerate(['Scaled $\mathrm{p_{T}}$','Scaled $\\eta$','Scaled $\phi$','Scaled $\mathrm{p_{T}}$','Scaled $\\eta$','Scaled $\phi$']):
    n_bins = 20 if idx not in [0,3] else 100
    ax[idx].hist(truths[:, idx], bins=n_bins, density=True, range=[-1,1], color='black', **config)
    ax[idx].hist(unopt[:, idx], density=True, bins=n_bins, range=[-1,1], color='blue', **config)
    if idx>=3:
        ax[idx].set_xlabel(r"%s" % name)
    mean = numpy.mean(histograms[idx],axis=0)
    stdev = numpy.std(histograms[idx],axis=0)
    ax[idx].stairs(mean, all_bins[idx], color='red',lw=1)
    ax[idx].stairs(mean-stdev/2, all_bins[idx], baseline=mean+stdev/2, color='red',lw=1, fill=True, alpha=0.2)
    if idx in [0,3]:
        ax[idx].set_ylabel(('1st' if idx==0 else '2nd')+' Muon\nNormalized Distribution')
        ax[idx].set_xlim([-1,-0.6])
    if idx in [2,5]:
        ax[idx].set_ylim([0.3,0.7])
line1 = Line2D([0], [0], label='Simulated Dataset', color='black', lw=5)
line2 = Line2D([0], [0], label='Best-random-GAN-generated events', color='blue', lw=5)
line3 = Line2D([0], [0], label='Optimized-GAN-generated events', color='red', lw=5)
fig.legend(handles=[line1,line2,line3], bbox_to_anchor=[0.03,0.95,1,0.05], loc='center',  ncol=3)
plt.savefig('case2_histograms.jpg')
plt.show()
1 0.010210095128665367
2 0.012668120519568523
3 0.017405066794405382
4 0.010580390226095915
5 0.012686156473743418
6 0.010451896348968148
7 0.013104425355171164
8 0.016322168754413724
../_images/papers_gan4hep_58_1.png

Processing time

[ ]:
import glob
import numpy
time = {'hadron':None,'dimuon':None}
for name in time.keys():
  times, params = [], []
  # Scan every log file of one 10RBF search and record, for each trained
  # architecture, its number of trainable parameters and its training time
  for filename in glob.glob('PaperData/%s/10RBF/001/logs/*.log' % name):
    for line in numpy.loadtxt(filename,dtype=str,delimiter='\n'):
      if 'Number of parameters' in line:
        nprms = int(line.split()[-1].replace(',',''))
      if '\tTime' in line or 'Train Time' in line:
        times.append(float(line.split()[-2]))
        params.append(nprms)
  tot, mean, std, length = numpy.sum(times), numpy.mean(times), numpy.std(times), len(times)
  # Total accumulated training time (and its uncertainty), then the same values
  # divided by 5 and 50: the speedup factors of the single-level and nested
  # parallelization schemes compared in the figure below
  time[name] = numpy.array([[tot,tot/5,tot/50],[std*length,std*length/5,std*length/50]])
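
The accumulated totals can be converted into human-readable units as a quick cross-check of the bar chart generated below; this printout is an illustrative addition, not part of the original analysis.

[ ]:
# Illustrative: print the accumulated training time per case in readable units
for name, values in time.items():
    # values[0] holds [total, total/5, total/50] seconds of training time
    tot, single, nested = values[0]
    print('%s : %.1f days sequential, %.1f days single-level, %.1f hours nested'
          % (name, tot/86400, single/86400, nested/3600))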
[ ]:
import matplotlib
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1,2,figsize=(14,5.2),dpi=300)
for i,name in enumerate(['hadron','dimuon']):
  ax[i].bar(range(3), time[name][0], yerr=time[name][1], align='center', alpha=0.5, ecolor='black',error_kw=dict(lw=2, capsize=5, capthick=2))
  ax[i].set_yscale('log')
  ax[i].set_yticks([3600,3600*24,3600*24*7])
  ax[i].set_yticklabels(['1 hour','1 day','1 week'])
  ax[i].set_xticks([0,1,2])
  ax[i].set_xticklabels(['No Parallelization','Single-level\nParallelization','Asynchronous\nNested Parallelism\n(this work)'])
  ax[i].set_title('Hadronization Processes' if i==0 else 'Dimuon Decay')
plt.tight_layout()
plt.savefig('speedup.jpg')
plt.show()
../_images/papers_gan4hep_61_0.png