Python:Fitting

This page contains examples of polynomial fitting, general linear regression, and nonlinear regression. Each section includes example code that may come in useful for later courses. The example code assumes that a file named <code>Cantilever.dat</code> exists in the same directory and contains two columns of data: the first is the mass (in kg) placed at the end of a beam and the second is the resulting displacement (in inches) measured at the end of the beam. For EGR 103, this file is:

<pre>
0.000000        0.005211
0.113510002     0.158707
0.227279999     0.31399
0.340790009     0.474619
0.455809998     0.636769
0.569320007     0.77989
0.683630005     0.936634
0.797140015     0.999986
</pre>
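If <code>Cantilever.dat</code> is not already available, one way to create it is to save the values above with NumPy; the following is a minimal sketch (the file name and column order follow the description above):

<syntaxhighlight lang='python'>
# Minimal sketch: write Cantilever.dat (mass in kg, displacement in inches)
# so that the examples on this page can run.
import numpy as np

cantilever = np.array([
    [0.000000,    0.005211],
    [0.113510002, 0.158707],
    [0.227279999, 0.31399],
    [0.340790009, 0.474619],
    [0.455809998, 0.636769],
    [0.569320007, 0.77989],
    [0.683630005, 0.936634],
    [0.797140015, 0.999986],
])
np.savetxt('Cantilever.dat', cantilever)
</syntaxhighlight>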

== Common Command Reference ==

All links below are to the NumPy v1.15 manual at [https://docs.scipy.org/doc/numpy-1.15.0/index.html NumPy v1.15 Manual]; these commands show up in just about all the examples:
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.copy.html numpy.ndarray.copy]
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.min.html numpy.ndarray.min]
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.max.html numpy.ndarray.max]
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.linspace.html numpy.linspace]
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.loadtxt.html numpy.loadtxt]
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.mean.html numpy.mean]
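As a quick illustration of these commands (a minimal sketch, not part of the original examples), the following loads the data file and builds a set of evenly spaced model points:

<syntaxhighlight lang='python'>
# Minimal sketch of the common commands listed above.
import numpy as np

beam_data = np.loadtxt('Cantilever.dat')            # numpy.loadtxt
mass = beam_data[:, 0].copy()                       # numpy.ndarray.copy
print(mass.min(), mass.max())                       # numpy.ndarray.min / numpy.ndarray.max
print(np.mean(mass))                                # numpy.mean
x_model = np.linspace(mass.min(), mass.max(), 100)  # numpy.linspace
</syntaxhighlight>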

== Polynomial Fitting ==

Polynomial fits are those where the dependent data is related to some set of integer powers of the independent variable. NumPy's <code>polyfit</code> function (the counterpart of MATLAB's built-in <code>polyfit</code> command) can determine the coefficients of a polynomial fit.

=== Specific Command References ===

All links below are to the NumPy v1.15 manual at [https://docs.scipy.org/doc/numpy-1.15.0/index.html NumPy v1.15 Manual]:
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.polyfit.html numpy.polyfit]
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.polyval.html numpy.polyval]

=== Example Code ===

In the example code below, <code>n</code> determines the order of the fit. Not much else would ever need to change.

<syntaxhighlight lang='python'>
# %% Import modules
import numpy as np
import matplotlib.pyplot as plt


# %% Load and manipulate data
# Load data from Cantilever.dat
beam_data = np.loadtxt('Cantilever.dat')
# Copy data from each column into new variables
mass = beam_data[:, 0].copy()
disp = beam_data[:, 1].copy()
# Convert mass to force
force = mass * 9.81
# Convert disp to meters
disp = (disp * 2.54) / 100

# %% Rename and create model data
x = force
y = disp
xmodel = np.linspace(np.min(x), np.max(x), 100)

# %% Perform calculations
n = 1
p = np.polyfit(force, disp, n)
print(p)

# %% Generate estimates and model
yhat = np.polyval(p, x)
ymodel = np.polyval(p, xmodel)

# %% Calculate statistics
st = np.sum((y - np.mean(y))**2)
sr = np.sum((y - yhat)**2)
r2 = (st - sr) / st
print('st: {}\nsr: {}\nr2: {}'.format(st, sr, r2))

# %% Generate and save plots
plt.figure(1)
plt.clf()
plt.plot(x, y, 'ko', label='Data')
plt.plot(x, yhat, 'ks', label='Estimates', mfc='none')
plt.plot(xmodel, ymodel, 'k-', label='Model')
plt.grid(1)
plt.legend()
</syntaxhighlight>
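With <code>n = 1</code>, the coefficient array <code>p</code> holds the slope and intercept (highest power first), and <code>np.polyval</code> can evaluate the fit at a new force value. The following is a minimal sketch, assuming <code>p</code> comes from the <code>np.polyfit</code> call above and using an arbitrary force value chosen only for illustration:

<syntaxhighlight lang='python'>
# Minimal sketch: interpret and reuse the coefficients from the fit above.
# Assumes p is the result of the np.polyfit call in the example code.
import numpy as np

slope, intercept = p        # highest power first for n = 1
new_force = 5.0             # hypothetical force in N, illustration only
predicted_disp = np.polyval(p, new_force)
print(slope, intercept, predicted_disp)
</syntaxhighlight>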

== General Linear Regression ==

General linear regression involves finding some set of coefficients for fits that can be written as:

<center><math>
\hat{y}(x)=\sum_{j=1}^{M}a_j\phi_j(x)
</math></center>

where the <math>a_j</math> are the coefficients of the fit and the <math>\phi_j</math> are the specific functions of the independent variable that make up the fit.
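For the straight-line fit used in the example below, <math>\phi_1(x)=x</math> and <math>\phi_2(x)=1</math>. Writing <math>x_i</math> and <math>y_i</math> for the individual data points, the coefficients come from solving the overdetermined system

<center><math>
\begin{bmatrix}
x_1 & 1 \\
x_2 & 1 \\
\vdots & \vdots \\
x_N & 1
\end{bmatrix}
\begin{bmatrix}
a_1 \\ a_2
\end{bmatrix}
\approx
\begin{bmatrix}
y_1 \\ y_2 \\ \vdots \\ y_N
\end{bmatrix}
</math></center>

in the least-squares sense; this is exactly what <code>np.linalg.lstsq</code> does with the matrix built by <code>np.block</code> in the example code.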

=== Specific Command References ===

All links below are to the NumPy v1.15 manual at [https://docs.scipy.org/doc/numpy-1.15.0/index.html NumPy v1.15 Manual]:
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.block.html numpy.block]
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.linalg.lstsq.html numpy.linalg.lstsq]
* [https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.reshape.html numpy.reshape]

=== Example Code ===

The example code below performs a general linear fit of one variable. It solves the same fit as given above, just in a different way: it uses linear algebra to find the coefficients that minimize the sum of the squares of the estimate residuals for a general linear fit.

<syntaxhighlight lang='python'>
# %% Import modules
import numpy as np
import matplotlib.pyplot as plt


# %% Load and manipulate data
# Load data from Cantilever.dat
beam_data = np.loadtxt('Cantilever.dat')
# Copy data from each column into new variables
mass = beam_data[:, 0].copy()
disp = beam_data[:, 1].copy()
# Convert mass to force
force = mass * 9.81
# Convert disp to meters
disp = (disp * 2.54) / 100

# %% Rename and create model data
xv = np.reshape(force, (-1, 1))
yv = np.reshape(disp, (-1, 1))
xmodel = np.linspace(np.min(xv), np.max(xv), 100)

# %% Perform calculations
def yfun(xe, coefs):
    return coefs[0] * xe + coefs[1]

a_mat = np.block([[xv**1, xv**0]])
pvec = np.linalg.lstsq(a_mat, yv, rcond=None)[0]
print(pvec)

# %% Generate estimates and model
yhat = yfun(xv, pvec)
ymodel = yfun(xmodel, pvec)

# %% Calculate statistics
st = np.sum((yv - np.mean(yv))**2)
sr = np.sum((yv - yhat)**2)
r2 = (st - sr) / st
print('st: {}\nsr: {}\nr2: {}'.format(st, sr, r2))

# %% Generate and save plots
plt.figure(1)
plt.clf()
plt.plot(xv, yv, 'ko', label='Data')
plt.plot(xv, yhat, 'ks', label='Estimates', mfc='none')
plt.plot(xmodel, ymodel, 'k-', label='Model')
plt.grid(1)
plt.legend()
</syntaxhighlight>
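Other general linear fits follow the same pattern; only the basis functions in <code>yfun</code> and the corresponding columns of <code>a_mat</code> change. The following is a minimal sketch (not part of the original example) of a quadratic fit and a trig-with-offset fit, assuming <code>xv</code> and <code>yv</code> are the column vectors created in the example above:

<syntaxhighlight lang='python'>
# Minimal sketch: alternate basis functions for the same lstsq approach.
# Assumes xv and yv are the (N, 1) column vectors from the example above.
import numpy as np

# Quadratic model: a1*x**2 + a2*x + a3
a_mat_quad = np.block([[xv**2, xv**1, xv**0]])
pvec_quad = np.linalg.lstsq(a_mat_quad, yv, rcond=None)[0]

# Trig model with offset: a1*cos(x) + a2*sin(x) + a3
a_mat_trig = np.block([[np.cos(xv), np.sin(xv), xv**0]])
pvec_trig = np.linalg.lstsq(a_mat_trig, yv, rcond=None)[0]

print(pvec_quad)
print(pvec_trig)
</syntaxhighlight>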

== Nonlinear Regression ==

Nonlinear regression is both more powerful and more sensitive than linear regression. For inherently nonlinear fits, it will also produce a better <math>S_r</math> value than linearization, since the nonlinear regression process minimizes the <math>S_r</math> of the actual data rather than that of the ''transformed'' values. The sensitivity comes into play because the optimization routine may find ''local'' minima rather than ''global'' minima. A good starting guess will work wonders.

=== Specific Command References ===

The link below is to the SciPy v1.1.0 reference guide at [https://docs.scipy.org/doc/scipy/reference/index.html SciPy]:
* [https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html scipy.optimize.curve_fit]

=== Example Code ===

Note in the example code that the initial guess gives 1 for the slope and 2 for the intercept. While these numbers are quite far from the optimized values of about 0.0034 for the slope and 0.00055 for the intercept, the optimization routine is still able to find the correct values.

<syntaxhighlight lang='python'>
# %% Import modules
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt

# %% Load and manipulate data
# Load data from Cantilever.dat
beam_data = np.loadtxt('Cantilever.dat')
# Copy data from each column into new variables
mass = beam_data[:, 0].copy()
disp = beam_data[:, 1].copy()
# Convert mass to force
force = mass * 9.81
# Convert disp to meters
disp = (disp * 2.54) / 100

# %% Rename and create model data
x = force
y = disp
xmodel = np.linspace(np.min(x), np.max(x), 100)

# %% Perform calculations
def yfun(x, *coefs):
    return coefs[0] * x + coefs[1]

popt = opt.curve_fit(yfun, x, y, [1, 2])[0]
print(popt)

# %% Generate estimates and model
yhat = yfun(x, *popt)
ymodel = yfun(xmodel, *popt)

# %% Calculate statistics
st = np.sum((y - np.mean(y))**2)
sr = np.sum((y - yhat)**2)
r2 = (st - sr) / st
print('st: {}\nsr: {}\nr2: {}'.format(st, sr, r2))

# %% Generate and save plots
plt.figure(1)
plt.clf()
plt.plot(x, y, 'ko', label='Data')
plt.plot(x, yhat, 'ks', label='Estimates', mfc='none')
plt.plot(xmodel, ymodel, 'k-', label='Model')
plt.grid(1)
plt.legend()
</syntaxhighlight>
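The model in the example above happens to be linear; the same <code>curve_fit</code> call handles inherently nonlinear models with no change in structure. The following is a minimal sketch (not part of the original page) that fits a hypothetical saturating-exponential model to synthetic data generated inside the snippet, just to show where the fit function and starting guess change:

<syntaxhighlight lang='python'>
# Minimal sketch: curve_fit with an inherently nonlinear model.
# The data here are synthetic and the model is hypothetical; they exist only
# to keep the example self-contained.
import numpy as np
import scipy.optimize as opt

def sat_fun(x, *coefs):
    # Saturating exponential: y = a * (1 - exp(-b * x))
    return coefs[0] * (1 - np.exp(-coefs[1] * x))

np.random.seed(0)
x_syn = np.linspace(0, 10, 50)
y_syn = 3.0 * (1 - np.exp(-0.7 * x_syn)) + 0.05 * np.random.normal(size=x_syn.size)

# With a nonlinear model, a reasonable starting guess matters even more.
popt_syn = opt.curve_fit(sat_fun, x_syn, y_syn, [1, 1])[0]
print(popt_syn)
</syntaxhighlight>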
