Fitting

A lot of time in our analysis we need to extract observable by fitting to a shape function. In this tutorial we will show you how to fit this in Python. The tools we will focus in this tutorial is iminuit, and probfit.They are heavily influenced by ROOFIT and ROOT's MINUIT(and also PyMinuit).

The basic idea of fitting is quite simple

  1. We have PDF and data.
  2. We build our cost function.
  3. We use minimizer to find shape parameters and one of those is our physics observable.

Minimizer

We will start this tutorial with minimizer. Minuit is hands down(for me) the best minimizer there is for HEP. There are couple wrapper for Minuit in Python. The one we will show here is iminuit. It's relatively new. It has all the function I use but might not have your favorite feature; but, you are welcome to implement it(I'll point you in the right direction).

iminuit has its own quick start tutorial that givens you an overview of its feature and hard core tutorial that teach you how to complex stuff like using cython, make your own costfunction fast, and even parallel computing. We will be showing here basic feature of iminuit. If you need to do advance stuff take a look at those tutorials.

Quick Start

In [1]:
from iminuit import Minuit, describe
In [2]:
%load_ext inumpy
In [3]:
#define a function to minimize
#making x,y correlates on purpose
def f(x,y,z):
    return (x-2)**2 + (y-x)**2 + (z-4)**2
In [4]:
#iminuit relies on python introspection to read function signature
describe(f) # one of the most useful function from iminuit
Out[4]:
['x', 'y', 'z']
In [5]:
#notice here that it automatically knows about x,y,z
#no RooRealVar etc. needed here
m = Minuit(f)
#it warns about every little thing that might go wrong
-c:3: InitialParamWarning: errordef is not given. Default to 1.
-c:3: InitialParamWarning: Parameter x does not have initial value. Assume 0.
-c:3: InitialParamWarning: Parameter x is floating but does not have initial step size. Assume 1.
-c:3: InitialParamWarning: Parameter y does not have initial value. Assume 0.
-c:3: InitialParamWarning: Parameter y is floating but does not have initial step size. Assume 1.
-c:3: InitialParamWarning: Parameter z does not have initial value. Assume 0.
-c:3: InitialParamWarning: Parameter z is floating but does not have initial step size. Assume 1.
In [6]:
#the most frequently asked question is Does my fit converge
#also look at your console it print progress if you use print_level=2
m.migrad();
#bonus: see that link with a plus sign on the top left corner?
#clicking it will give you a latex table that you can copy paste
#to your paper/beamer slide;

FCN = 6.57752842029e-23 NFCN = 52 NCALLS = 52
EDM = 6.57730573206e-23 GOAL EDM = 1e-05 UP = 1.0
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 x 2.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00
2 y 2.000000e+00 1.414214e+00 0.000000e+00 0.000000e+00
3 z 4.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00

Accessing Value/Error

In [7]:
print m.values
print m.errors
print m.fval
{'y': 1.999999999999814, 'x': 1.999999999997652, 'z': 3.999999999992544}
{'y': 1.414213562372341, 'x': 0.9999999999995464, 'z': 0.9999999999998791}
6.57752842029e-23
In [8]:
m.print_param()
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 x 2.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00
2 y 2.000000e+00 1.414214e+00 0.000000e+00 0.000000e+00
3 z 4.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00
In [9]:
#correlation matrix
#Only Chrome/safari gets the vertical writing mode right.
m.print_matrix() 
+
x
y
z
x 1.00 0.71 -0.00
y 0.71 1.00 -0.00
z -0.00 -0.00 1.00

Checking convergence

More details on this in the tip section at the end. Basically return value of migrad tells you a bunch of fit status.

In [10]:
print m.migrad_ok()
print m.matrix_accurate()
True
True

minos

In [11]:
m.minos(); #do m.minos('x') if you need just 1 of them
Minos status for x: VALID
Error -0.999999999998 1.0
Valid True True
At Limit False False
Max FCN False False
New Min False False
Minos status for y: VALID
Error -1.41421356237 1.41421356237
Valid True True
At Limit False False
Max FCN False False
New Min False False
Minos status for z: VALID
Error -0.999999999993 1.00000000001
Valid True True
At Limit False False
Max FCN False False
New Min False False
Out[11]:
{'x': {'lower_new_min': False, 'upper': 1.0000000000023477, 'lower': -0.9999999999976521, 'at_lower_limit': False, 'min': 1.999999999997652, 'at_lower_max_fcn': False, 'is_valid': True, 'upper_new_min': False, 'at_upper_limit': False, 'lower_valid': True, 'upper_valid': True, 'at_upper_max_fcn': False, 'nfcn': 18L},
 'y': {'lower_new_min': False, 'upper': 1.414213562373281, 'lower': -1.4142135623729089, 'at_lower_limit': False, 'min': 1.999999999999814, 'at_lower_max_fcn': False, 'is_valid': True, 'upper_new_min': False, 'at_upper_limit': False, 'lower_valid': True, 'upper_valid': True, 'at_upper_max_fcn': False, 'nfcn': 28L},
 'z': {'lower_new_min': False, 'upper': 1.000000000007456, 'lower': -0.9999999999925441, 'at_lower_limit': False, 'min': 3.999999999992544, 'at_lower_max_fcn': False, 'is_valid': True, 'upper_new_min': False, 'at_upper_limit': False, 'lower_valid': True, 'upper_valid': True, 'at_upper_max_fcn': False, 'nfcn': 18L}}
In [12]:
m.merrors
Out[12]:
{('x', -1.0): -0.9999999999976521,
 ('x', 1.0): 1.0000000000023477,
 ('y', -1.0): -1.4142135623729089,
 ('y', 1.0): 1.414213562373281,
 ('z', -1.0): -0.9999999999925441,
 ('z', 1.0): 1.000000000007456}

Contour and Profile(Scan)

In [13]:
m.draw_mncontour('x','y') # you can get the raw value using m.mncontour  or m.mncontour_grid;
In [14]:
m.draw_profile('x') # this is 1d evaluation not minos scan;

Initial value, Limit, initial error, fixing

In [15]:
m = Minuit(f, x=2, y=4, fix_y=True, limit_z=(-10,10), error_z=0.1)
-c:1: InitialParamWarning: errordef is not given. Default to 1.
-c:1: InitialParamWarning: Parameter x is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter z does not have initial value. Assume 0.
In [16]:
m.migrad();

FCN = 2.00000011701 NFCN = 30 NCALLS = 30
EDM = 1.17011093083e-07 GOAL EDM = 1e-05 UP = 1.0
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 x 3.000070e+00 7.071068e-01 0.000000e+00 0.000000e+00
2 y 4.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 FIXED
3 z 3.999673e+00 9.980094e-01 0.000000e+00 0.000000e+00 -10.0 10.0

Building cost fuction using probfit

You could write your always own cost function(see iminuit hardcore tutorial for example). But why should you. probfit provide convenience functor for you to build a simple cost function like UnbinnedLH, BinnedLH, BinnedChi2, Chi2Regression.

Let's try to fit a simple gaussian with unbinned likelihood.

In [17]:
from math import exp, pi, sqrt
from probfit import UnbinnedLH
from iminuit import Minuit
In [18]:
seed(0)
gdata = randn(1000)
hist(gdata,bins=100, histtype='step');
In [19]:
#you can define your pdf manually like this
def my_gauss(x, mu, sigma):
    return exp(-0.5*(x-mu)**2/sigma**2)/(sqrt(2*pi)*sigma)
In [20]:
#probfit use the same describe magic as iminuit
#Build your favorite cost function like this
#Notice no RooRealVar etc. It use introspection to
#find parameters
#the only requirement is that the first argument is
#in independent variable the rest are parameters
ulh = UnbinnedLH(my_gauss, gdata)
describe(ulh)
Out[20]:
['mu', 'sigma']
In [21]:
m = Minuit(ulh, mu=0.2, sigma=1.5)
m.set_up(0.5) #remember up is 0.5 for likelihood and 1 for chi^2
ulh.show(m)
-c:1: InitialParamWarning: Parameter mu is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter sigma is floating but does not have initial step size. Assume 1.
In [22]:
m.migrad();
ulh.show(m)

FCN = 1405.88688927 NFCN = 67 NCALLS = 67
EDM = 7.65781084211e-07 GOAL EDM = 5e-06 UP = 0.5
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 mu -4.528735e-02 3.121221e-02 0.000000e+00 0.000000e+00
2 sigma 9.870165e-01 2.206980e-02 0.000000e+00 0.000000e+00

Another way to fit gaussian

probfit comes with a bunch of builtin functions so you don't have to write your own pdf. If you can't find your favorite pdf there is nothing preventing you from doing:

def my_secret_pdf(x,y,z): return secret_formula(x,y,z)

But, it's better if you fork our project, implement it and submit a pull request.

In [23]:
from probfit import gaussian
In [24]:
ulh = UnbinnedLH(gaussian, gdata)
describe(ulh)
Out[24]:
['mean', 'sigma']
In [25]:
m = Minuit(ulh, mean=0.2, sigma =0.3)
m.migrad()
ulh.show(m)
-c:1: InitialParamWarning: Parameter mean is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter sigma is floating but does not have initial step size. Assume 1.

FCN = 1405.8868885 NFCN = 84 NCALLS = 84
EDM = 2.0619807636e-10 GOAL EDM = 5e-06 UP = 0.5
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 mean -4.525667e-02 3.121275e-02 0.000000e+00 0.000000e+00
2 sigma 9.870336e-01 2.207075e-02 0.000000e+00 0.000000e+00

Other Cost functions

Binned $\chi^2$

Just a $\chi^2$ with symmetric poisson error assumption. Currently doesn't support unextended fit yet(easy fix, anyone wanna do it?). But, binned likelihood support both extended and unextended fit.

In [26]:
from probfit import Extended, BinnedChi2
seed(0)
gdata = randn(10000)
In [27]:
mypdf = Extended(gaussian)
describe(mypdf) # just basically N*gaussian(x,mean,sigma)
Out[27]:
['x', 'mean', 'sigma', 'N']
In [28]:
bx2 = BinnedChi2(mypdf, gdata, bound=(-3,3))#create cost function
bx2.show(args={'mean':1.0, 'sigma':1.0, 'N':10000}) #another way to draw it
In [29]:
m = Minuit(bx2, mean=1.0, sigma=1.0, N=1000.)
m.migrad()
bx2.show(m)
-c:1: InitialParamWarning: Parameter mean is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter sigma is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter N is floating but does not have initial step size. Assume 1.

FCN = 33.9840496564 NFCN = 87 NCALLS = 87
EDM = 2.21724517031e-07 GOAL EDM = 1e-05 UP = 1.0
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 mean -1.856158e-02 1.003145e-02 0.000000e+00 0.000000e+00
2 sigma 9.883410e-01 7.437716e-03 0.000000e+00 0.000000e+00
3 N 9.970810e+03 9.970433e+01 0.000000e+00 0.000000e+00

Binned Likelihood

Poisson binned log likelihood with minimum subtractacted(aka likelihood ratio).

In [30]:
from probfit import Extended, BinnedLH
seed(0)
gdata = randn(10000)
In [31]:
mypdf = gaussian
describe(mypdf) # just basically N*gaussian(x,mean,sigma)
Out[31]:
['x', 'mean', 'sigma']
In [32]:
blh = BinnedLH(mypdf, gdata, bound=(-3,3))#create cost function
#it can also do extended one if you pass it an extended pdf and pass extended=True to BinnedLH
blh.show(args={'mean':1.0, 'sigma':1.0})
In [33]:
m = Minuit(blh, mean=1.0, sigma=1)
m.set_up(0.5)
m.migrad()
blh.show(m)
-c:1: InitialParamWarning: Parameter mean is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter sigma is floating but does not have initial step size. Assume 1.

FCN = 16.9210948316 NFCN = 59 NCALLS = 59
EDM = 1.35282725541e-06 GOAL EDM = 5e-06 UP = 0.5
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 mean -1.944627e-02 1.005891e-02 0.000000e+00 0.000000e+00
2 sigma 9.895137e-01 7.536312e-03 0.000000e+00 0.000000e+00

$\chi^2$ Regression

Some time you just want a simple line fit as opposed to fitting pdf.

In [34]:
from probfit import Chi2Regression
In [35]:
x = linspace(-10,10,30)
y = 3*x**2 +2*x + 1
#add some noise
y = y+randn(30)*10
errorbar(x,y,10, fmt='b.')
Out[35]:
<Container object of 3 artists>
In [36]:
#there is a poly2 builtin but just to remind you that you can do this
def my_poly(x, a, b, c):
    return a*x**2+ b*x+ c
In [37]:
err = np.array([10]*30)
x2reg= Chi2Regression(my_poly, x, y, error=err)
x2reg.draw(args={'a':1,'b':2,'c':3})
In [38]:
m = Minuit(x2reg, a=1, b=2, c=3)
m.migrad()
x2reg.show(m)
-c:1: InitialParamWarning: Parameter a is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter b is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter c is floating but does not have initial step size. Assume 1.

FCN = 34.2023676505 NFCN = 52 NCALLS = 52
EDM = 3.89079131844e-07 GOAL EDM = 1e-05 UP = 1.0
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 a 3.011252e+00 5.738228e-02 0.000000e+00 0.000000e+00
2 b 2.121267e+00 3.058568e-01 0.000000e+00 0.000000e+00
3 c 2.134416e+00 2.741159e+00 0.000000e+00 0.000000e+00

Let's do some physics

Remeber the D mass?? Let's try to fit relativistic Breit-Wigner to it.

In [39]:
from root_numpy import root2rec
In [40]:
data = root2rec('data/*.root')
bb = root2rec('data/B*.root')
cc = root2rec('data/cc*.root')
In [41]:
hs = np.hstack
hist([hs(data.DMass), hs(bb.DMass), hs(cc.DMass)], bins=50, histtype='step');

Simple fit

First lets fit bb's DMass alone with a Breit-Wigner.

In [42]:
from probfit import UnbinnedLH, draw_compare_hist,\
                    vector_apply, Normalized, rtv_breitwigner,\
                    linear, rename, AddPdfNorm
from iminuit import Minuit
In [43]:
bb_dmass = hs(bb.DMass)
print bb.DMass.size
3219
In [44]:
#you can compare them like this
draw_compare_hist(rtv_breitwigner, {'m':1.87, 'gamma':0.01}, bb_dmass, normed=True);
In [45]:
#Our DMass is a truncated one so we need to have it normalized properly
#this is easy with Normalized functor which normalized your pdf
#this might seems a bit unusual if you never done functinal programming
#but normalize just wrap the function around and return a new function
signalpdf = Normalized(rtv_breitwigner,(1.83,1.91))
In [46]:
ulh = UnbinnedLH(signalpdf, bb_dmass)
In [47]:
m = Minuit(ulh, m=1.875, gamma=0.01)#I shift it on purpose
m.set_up(0.5)
ulh.show(m) #you can see it before the fit begins;
-c:1: InitialParamWarning: Parameter m is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter gamma is floating but does not have initial step size. Assume 1.
In [48]:
%timeit -n1 -r1 m.migrad();
-c:257: SmallIntegralWarning: (1.8630708456039429, 1.875, -0.05716159358979792)

FCN = -10458.6241946 NFCN = 53 NCALLS = 53
EDM = 6.52822663213e-07 GOAL EDM = 5e-06 UP = 0.5
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 m 1.869087e+00 1.344898e-04 0.000000e+00 0.000000e+00
2 gamma 1.151186e-02 3.003198e-04 0.000000e+00 0.000000e+00

1 loops, best of 1: 142 ms per loop
In [49]:
ulh.show(m) #looks good
In [50]:
m.minos();#do m.minos('m') if you need just 1 parameter
Minos status for m: VALID
Error -0.000134384231326 0.000134627933923
Valid True True
At Limit False False
Max FCN False False
New Min False False
Minos status for gamma: VALID
Error -0.000295920358877 0.000304715369101
Valid True True
At Limit False False
Max FCN False False
New Min False False
Out[50]:
{'gamma': {'lower_new_min': False, 'upper': 0.00030471536910134004, 'lower': -0.00029592035887724555, 'at_lower_limit': False, 'min': 0.01151186448673394, 'at_lower_max_fcn': False, 'is_valid': True, 'upper_new_min': False, 'at_upper_limit': False, 'lower_valid': True, 'upper_valid': True, 'at_upper_max_fcn': False, 'nfcn': 12L},
 'm': {'lower_new_min': False, 'upper': 0.00013462793392262257, 'lower': -0.00013438423132646268, 'at_lower_limit': False, 'min': 1.8690871087182035, 'at_lower_max_fcn': False, 'is_valid': True, 'upper_new_min': False, 'at_upper_limit': False, 'lower_valid': True, 'upper_valid': True, 'at_upper_max_fcn': False, 'nfcn': 14L}}
In [51]:
m.print_param()
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 m 1.869087e+00 1.344898e-04 -1.343842e-04 1.346279e-04
2 gamma 1.151186e-02 3.003198e-04 -2.959204e-04 3.047154e-04

More ComplexPDF

Now let's add the background and fit it. Looks like a job for linear + breitwigner.

In [52]:
bound = (1.83,1.91)
bgpdf = Normalized(linear,bound)
In [53]:
describe(bgpdf)
Out[53]:
['x', 'm', 'c']
In [54]:
#remember our breit wigner also has m argument which means different thing
describe(rtv_breitwigner)
Out[54]:
['x', 'm', 'gamma']
In [55]:
#renaming is easy
signalpdf = Normalized(rename(rtv_breitwigner,['x','mass','gamma']),(1.83,1.91))
In [56]:
describe(signalpdf)
Out[56]:
['x', 'mass', 'gamma']
In [57]:
#now we can add them
total_pdf = AddPdfNorm(signalpdf,bgpdf)
#if you want to just directly add them up with out the factor(eg. adding extended pdf)
#use AddPdf
describe(total_pdf)
Out[57]:
['x', 'mass', 'gamma', 'm', 'c', 'f_0']
In [58]:
ulh = UnbinnedLH(total_pdf, np.hstack(data.DMass))
m = Minuit(ulh, mass=1.87, gamma=0.01, m=0., c=1, f_0=0.7)
ulh.show(m)
-c:2: InitialParamWarning: Parameter mass is floating but does not have initial step size. Assume 1.
-c:2: InitialParamWarning: Parameter gamma is floating but does not have initial step size. Assume 1.
-c:2: InitialParamWarning: Parameter m is floating but does not have initial step size. Assume 1.
-c:2: InitialParamWarning: Parameter c is floating but does not have initial step size. Assume 1.
-c:2: InitialParamWarning: Parameter f_0 is floating but does not have initial step size. Assume 1.
In [59]:
m.migrad();
-c:1: SmallIntegralWarning: (1.8630708456039429, 1.87, -0.09079225177349604)
-c:1: SmallIntegralWarning: (1.8630708456039429, 0.0, -0.007922517734960222)

FCN = -21505.9979204 NFCN = 248 NCALLS = 248
EDM = 4.16787410874e-09 GOAL EDM = 5e-06 UP = 0.5
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 mass 1.869398e+00 1.414321e-04 0.000000e+00 0.000000e+00
2 gamma 1.081224e-02 3.908496e-04 0.000000e+00 0.000000e+00
3 m -4.677901e-01 3.999174e-02 0.000000e+00 0.000000e+00
4 c 1.041123e+00 8.896113e-02 0.000000e+00 0.000000e+00
5 f_0 5.347971e-01 1.279253e-02 0.000000e+00 0.000000e+00

In [60]:
ulh.show(m, parts=True)
In [61]:
m.print_matrix()
+
mass
gamma
m
c
f_0
mass 1.00 -0.02 -0.05 -0.05 -0.02
gamma -0.02 1.00 -0.01 -0.01 0.61
m -0.05 -0.01 1.00 -0.85 -0.00
c -0.05 -0.01 -0.85 1.00 -0.00
f_0 -0.02 0.61 -0.00 -0.00 1.00

Note on complex PDF

There is nothing preventing you from doing something like this: def mypdf(x,mass, gamma, m, c, f_0): return brietwigner(x, mass, gamma) + f_0*(m*x+c)/normalization ulh=UnbinnedLH(mypdf, data) m=Minuit(ulh, **initial_values) m.migrad()

If your PDF is more complicated than what you can do with AddPDF and AddPDFNorm, it might be easier to write it out manually.

Simultaneous Fit

Let try to simultaneous fit 2 gaussians with the same width but different mean.

In [64]:
from probfit import SimultaneousFit
In [65]:
data1 = randn(10000)
data2 = randn(10000)+10
hist([data1,data2], histtype='step', bins=100);
In [66]:
#note here that they share the same sigma
g1 = rename(gaussian, ['x','mu1','sigma'])
g2 = rename(gaussian, ['x','mu2','sigma'])
print describe(g1)
print describe(g2)
['x', 'mu1', 'sigma']
['x', 'mu2', 'sigma']
In [67]:
#make two likelihood and them up
ulh1 = UnbinnedLH(g1,data1)
ulh2 = UnbinnedLH(g2,data2)
sim = SimultaneousFit(ulh1,ulh2)
print describe(sim) #note the sigma merge
['mu1', 'sigma', 'mu2']
In [68]:
sim.draw(args=(0.5, 1.5, 10.5))
In [69]:
m = Minuit(sim, mu1=0.5, sigma=1.5, mu2=10.5)
-c:1: InitialParamWarning: errordef is not given. Default to 1.
-c:1: InitialParamWarning: Parameter mu1 is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter sigma is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter mu2 is floating but does not have initial step size. Assume 1.
In [70]:
m.migrad();

FCN = 28276.585769 NFCN = 81 NCALLS = 81
EDM = 1.53288894679e-08 GOAL EDM = 1e-05 UP = 1.0
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 mu1 1.023630e-02 1.407006e-02 0.000000e+00 0.000000e+00
2 sigma 9.949032e-01 7.035022e-03 0.000000e+00 0.000000e+00
3 mu2 9.994690e+00 1.407006e-02 0.000000e+00 0.000000e+00

In [71]:
sim.show(m)

Note on simultaneous fit

Again there is nothing preventing you from doing def my_cost_function(mu1, mu2, sigma): return ulh1(mu1,sigma)+ulh2(mu2.sigma)

m=Minuit(my_cost_function, **initial_values) m.migrad()

If your cost function is more complex than adding them together this is the to do it.

Toy generation.

This is invert CDF implementation (not accept reject). Large overhead but fast element-wise. Anyone want to signup for accept/reject?

In [72]:
from probfit import gen_toy
In [73]:
toy = gen_toy(total_pdf, 1000, (1.83,1.91), mass=1.87, gamma=0.01, c=1.045, m=-0.43, f_0=0.5, quiet=False)
['x', 'mass', 'gamma', 'm', 'c', 'f_0']
In [74]:
hist(toy, bins=100, histtype='step');
In [75]:
ulh = UnbinnedLH(total_pdf, toy)
m = Minuit(ulh, mass=1.87, gamma=0.01, c=1.045, m=-0.43, f_0=0.5)
m.migrad();
ulh.show(m)
-c:2: InitialParamWarning: Parameter mass is floating but does not have initial step size. Assume 1.
-c:2: InitialParamWarning: Parameter gamma is floating but does not have initial step size. Assume 1.
-c:2: InitialParamWarning: Parameter m is floating but does not have initial step size. Assume 1.
-c:2: InitialParamWarning: Parameter c is floating but does not have initial step size. Assume 1.
-c:2: InitialParamWarning: Parameter f_0 is floating but does not have initial step size. Assume 1.

FCN = -2756.9986626 NFCN = 472 NCALLS = 472
EDM = 3.10277679189e-07 GOAL EDM = 5e-06 UP = 0.5
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 mass 1.870038e+00 3.294864e-04 0.000000e+00 0.000000e+00
2 gamma 9.423832e-03 9.778754e-04 0.000000e+00 0.000000e+00
3 m 4.031586e+03 1.414212e+00 0.000000e+00 0.000000e+00
4 c 6.188177e+03 1.414212e+00 0.000000e+00 0.000000e+00
5 f_0 5.332493e-01 3.463870e-02 0.000000e+00 0.000000e+00

Tips

A lot of time you want to generate toy from fitted parameters. Retyping/Repeating yourself is not the best use of time. Use python dictionary expansion.

In [76]:
m.fitarg
Out[76]:
{'c': 6188.177096551795,
 'error_c': 1.414212148161654,
 'error_f_0': 0.03463870349452257,
 'error_gamma': 0.0009778753714724175,
 'error_m': 1.4142121450350984,
 'error_mass': 0.0003294864038653029,
 'f_0': 0.533249323216673,
 'fix_c': False,
 'fix_f_0': False,
 'fix_gamma': False,
 'fix_m': False,
 'fix_mass': False,
 'gamma': 0.009423831866177658,
 'limit_c': None,
 'limit_f_0': None,
 'limit_gamma': None,
 'limit_m': None,
 'limit_mass': None,
 'm': 4031.5859291021125,
 'mass': 1.8700381598984956}
In [77]:
toy = gen_toy(total_pdf, 1000, (1.83,1.91), quiet=False, **m.fitarg)#note the double star
['x', 'mass', 'gamma', 'm', 'c', 'f_0']

Saving/Loading your toy

using np.save

In [78]:
np.save('mytoy.npy', toy)
In [79]:
loaded_toy = np.load('mytoy.npy')
hist(loaded_toy, bins=100, histtype='step');

Recipe

I won't cover it but just something you may find useful.

Using Cython

Skip this part if you don't have cython

A much more comprehensive example of how to use cython is provided in iminuit hard core tutorial. We will show a simple example here.

In [80]:
%load_ext cythonmagic
In [81]:
%%cython
from libc.math cimport sqrt
cimport cython

cdef double pi = 3.14159265358979323846264338327

@cython.embedsignature #you need this or experimental @cython.binding.
cpdef double cython_bw(double x, double m, double gamma):
    cdef double mm = m*m
    cdef double xm = x*x-mm
    cdef double gg = gamma*gamma
    cdef double s = sqrt(mm*(mm+gg))
    cdef double N = (2*sqrt(2)/pi)*m*gamma*s/sqrt(mm+s)
    return N/(xm*xm+mm*gg)
In [82]:
cython_bw(1,2,3)
Out[82]:
0.2585302502852219
In [83]:
from probfit import describe
describe(cython_bw)
Out[83]:
['x', 'm', 'gamma']
In [84]:
ulh = UnbinnedLH(cython_bw, bb_dmass)
In [85]:
m = Minuit(ulh, m=1.875, gamma = 0.01)
ulh.show(m)
-c:1: InitialParamWarning: Parameter m is floating but does not have initial step size. Assume 1.
-c:1: InitialParamWarning: Parameter gamma is floating but does not have initial step size. Assume 1.
In [86]:
%timeit -r1 -n1 m.migrad()

FCN = -10170.1470088 NFCN = 56 NCALLS = 56
EDM = 9.31305324182e-07 GOAL EDM = 5e-06 UP = 0.5
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 m 1.869105e+00 1.236187e-04 0.000000e+00 0.000000e+00
2 gamma 9.609371e-03 2.250473e-04 0.000000e+00 0.000000e+00

1 loops, best of 1: 68.8 ms per loop

Building Cost Functions Manually

See iminuit hardcore tutorial.

Checking convergence programatically

In [87]:
status, param = m.migrad()

FCN = -10170.1470097 NFCN = 10 NCALLS = 66
EDM = 4.27228260843e-13 GOAL EDM = 5e-06 UP = 0.5
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 m 1.869105e+00 1.236193e-04 0.000000e+00 0.000000e+00
2 gamma 9.609357e-03 2.250544e-04 0.000000e+00 0.000000e+00

In [88]:
print status.has_covariance
print status.is_valid
True
True
In [89]:
#or an umbrella call
m.migrad_ok()
Out[89]:
True
In [90]:
#minos
results = m.minos('m')
print results['m']
print results['m'].upper_valid
print results['m'].lower_valid
Minos status for m: VALID
Error -0.000123642360121 0.000123614309471
Valid True True
At Limit False False
Max FCN False False
New Min False False
{'lower_new_min': False, 'upper': 0.0001236143094705069, 'lower': -0.00012364236012110086, 'at_lower_limit': False, 'min': 1.8691050836421013, 'at_lower_max_fcn': False, 'is_valid': True, 'upper_new_min': False, 'at_upper_limit': False, 'lower_valid': True, 'upper_valid': True, 'at_upper_max_fcn': False, 'nfcn': 12L}
True
True

Saving/Reuse fit argument

Some time we want to resue fitting argument.

In [91]:
m.fitarg
Out[91]:
{'error_gamma': 0.0002250543924548983,
 'error_m': 0.000123619287240136,
 'fix_gamma': False,
 'fix_m': False,
 'gamma': 0.00960935659431392,
 'limit_gamma': None,
 'limit_m': None,
 'm': 1.8691050836421013}
In [92]:
old_fitarg = m.fitarg
ulh2 = UnbinnedLH(cython_bw, bb_dmass)
m2 = Minuit(ulh2, **old_fitarg)
In [93]:
m2.print_param()
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 m 1.869105e+00 1.236193e-04 0.000000e+00 0.000000e+00
2 gamma 9.609357e-03 2.250544e-04 0.000000e+00 0.000000e+00
In [94]:
#since fitarg is just a dictionary you can dump it via pickle
import pickle
out = open('my_fitarg.pck','w')
pickle.dump(old_fitarg,out)
out.close()
In [95]:
fin = open('my_fitarg.pck','r')
loaded_fitarg = pickle.load(fin)
fin.close()
In [96]:
print loaded_fitarg
{'fix_m': False, 'm': 1.8691050836421013, 'limit_m': None, 'error_m': 0.000123619287240136, 'fix_gamma': False, 'error_gamma': 0.0002250543924548983, 'limit_gamma': None, 'gamma': 0.00960935659431392}
In [97]:
m3 = Minuit(ulh2, **loaded_fitarg)
In [98]:
m3.migrad();

FCN = -10170.1470097 NFCN = 29 NCALLS = 29
EDM = 1.23520523748e-10 GOAL EDM = 5e-06 UP = 0.5
Valid Valid Param Accurate Covar PosDef Made PosDef
True True True True False
Hesse Fail HasCov Above EDM Reach calllim
False True False False
+ Name Value Parab Error Minos Error- Minos Error+ Limit- Limit+ FIXED
1 m 1.869105e+00 1.236185e-04 0.000000e+00 0.000000e+00
2 gamma 9.609347e-03 2.250465e-04 0.000000e+00 0.000000e+00