Thursday, 27 August 2015

Wrapping up

So, the GSoC period has come to an end now.

Project Goals  
My project was about creating Matlab compatible wrappers for the optim package.  Here is a brief list of my project goals.

1- lsqnonlin wrapping nonlin_residmin and residmin_stat
2- lsqcurvefit wrapping nonlin_curvefit and curvefit_stat
3- nlinfit wrapping nonlin_curvefit (it was initially decided to wrap leasqr but changed to avoid extra computations)
4- quadprog wrapping __qp__ instead of qp and returning lambda in the form of a structure as in Matlab
5- fmincon wrapping nonlin_min
6- Test and demos for the above functions
7- Stretch goals: previously decided to create other missing functions or perhaps additional backends but before midterm I decided instead to include optimoptions in my to do list.


Achievements
The functions lsqnonlinlsqcurvefit and nlinfit are complete with tests and demos and integrated in the optim package. Since nlinfit is from the statistics package in Matlab, additional functions such as statset, statget were required for handling options. These functions are implemented with minor modifications in optimset, optimget and __all_opts__ as statset, statget and __all_stat_opts__ and are now a part of optim package.

The function quadprog required directly wrapping __qp__ instead of qp for the ordering of lambda. It is in the final stages of review and will soon be integrated.

fmincon has not been thoroughly reviewed yet. I will send it to Olaf after quadprog is committed to optim. 

Hiccups in the stretch goal
I couldn't create optimoptions in the GSoC time frame because it was a bit open ended and I had to come up with an object oriented design for the function. I was trying to understand how Matlab implements it for quite some time. Anyway, I didn't pursue it further and shifted my focus on the refinement of my almost complete functions to get them integrated in optim.  

Interesting Takeaways
This is my first experience of working with any open source organization and it's definitely a pleasant one. It's delightful to see people using my functions and possibly benefiting from my work [2-3]. :)

I think I have managed to meet all the goals I set before the start of GSoC. (Regrets? Well, I could have saved more time for optimoptions and it would've been better to discuss it way before than being stuck for a while.)

I'm extremely grateful to the Octave community, especially my mentors Olaf Till and Nir Krakauer for their unrelenting support. GSoC wouldn't have been possible without their constructive feedback. I have learned a lot from this experience.


[1] https://github.com/AsmaAfzal/octave_workspace/issues/1
[2] http://octave.1599824.n4.nabble.com/GSoC-2015-Optimization-Package-Non-linear-and-constrained-least-squares-lsqcurvefit-lsqlin-lsqnonlin-tp4668777p4670940.html

Sunday, 16 August 2015

Week 11 and 12: Integrating existing work in optim package.

A recap of the progress in two weeks:

  • I had to let go of optimoptions (for GSoC) mainly because of
    • time constraints 
    • and also because I don't have much experience with objected oriented programming. For optimoptions, I will have to come up with a design. I started with class implementation using classdef as in Matlab, but it is in its infancy in Octave and it could possibly be a limiting factor.
  • I am refining my existing functions and including tests and demos so they can be integrated in the optim package.
    • lsqnonlin and lsqcurvefit required additional options documentation and OutputFcn and Display setting. These two functions have now been successfully integrated in the optim package. [1]
    • Functions nlinfit and quadprog are under review.
    • I am working on fmincon now. Still have to discuss which backend should be used. lm_feasible can return Lagrange multipliers, gradient and hessian, but since it adheres to the constraints in all iterations, it behaves differently (from Matlab's algos) and sometimes less efficiently as octave_sqp, which only respects the constraints for the final result.
[1] http://sourceforge.net/p/octave/optim/ci/85d621b7e31a3546383248431b2f340d43edd6da/

Monday, 3 August 2015

Week 10: Preliminary work on optimoptions

Thoroughly checking how optimoptions works in  Matlab.

options = optimoptions (SolverName)

Things to do:

  • Identify if Solver name is the right string or function handle

  • Cater for multiple Algorithms
     A subset of options for different algorithms.
  • Transfer relevant options of different solvers to modify/create option.
    oldoptions = optimoptions('fmincon','TolX',1e-10)
    newoptions = optimoptions('lsqnonlin',oldoptions)
  • Using dot notation or optimoptions to modify previously created options.
    (Second argument in optimoptions can be old options)
  • Display options:
    Set by user on top (for the current algo)
    Default options
    Options set for other algorithms.
Implementation ideas:

In Matlab, these two calls generate the same options object optim.options.Fmincon:
optimoptions('fmincon')

optim.options.Fmincon
What would be more appropriate, IMO, will be to have a function optimoptions of the following format
opts = function optimoptions(SolverName,varargin)
     ...

    obj = optim.options.SolverName(varargin)

    opts = struct(obj)

    ...

This function will
  1. Instantiate the relevant class and request for default options from the solver.
  2. Compare the user provided options to add relevant options.
  3. Display options of the current algorithm.
  4. The output can also be returned in the form of a struct to be compatible with optimget.




Thursday, 23 July 2015

Week 8 and 9: The ordering of lambda in quadprog

I was trying to dig through __qp__.cc to figure out why the ordering of lambda in [1] does not match that of quadprog in Matlab.
The example shows how the values were different:

C = [0.9501    0.7620    0.6153    0.4057
    0.2311    0.4564    0.7919    0.9354
    0.6068    0.0185    0.9218    0.9169
    0.4859    0.8214    0.7382    0.4102
    0.8912    0.4447    0.1762    0.8936];
d = [0.0578
    0.3528
    0.8131
    0.0098
    0.1388];
A =[0.2027    0.2721    0.7467    0.4659
    0.1987    0.1988    0.4450    0.4186
    0.6037    0.0152    0.9318    0.8462];
b =[0.5251
    0.2026
    0.6721];
Aeq = [3 5 7 9];
beq = 4;
lb = -0.1*ones(4,1);
ub = 1*ones(4,1);
H = C ' * C;

f = -C ' * d;

[x, obj_qp, INFO, lambda] = qp ([],H,f,Aeq,beq,lb,ub,[],A,b);
lambda = 
   0.01654
   0.06743
   0.00000
   0.24993
   0.00000
   0.00000
   0.00000
   0.00000
   0.00000
   0.00000
   0.49819
   0.00000


Reordering lambda based on the length of constraints resulted in
lambda.eqlin = 
   0.01654
lambda.lower =
   0.06743
   0.00000
   0.24993
   0.00000

lambda.upper =
   0.00000
   0.00000
   0.00000
   0.00000

lambda.ineqlin =
   0.00000
   0.49819
   0.00000


Matlab however,  gave lambda.eqlin= -0.0165 for this example and lambda.lower =
    0.0674
    0.2499
         0
         0

There were two issues with the ordering:
  1.  the sign for Lagrange multipliers corresponding to linear equality constraint is always different from Matlab's
  2. The multipliers corresponding to the bound constraints as underlined above are swapped.
I tried several different examples to understand what is going on. For all the examples, the sign for lambda.eqlin was consistently different. Although, I still can't pinpoint why but for now I am just multiplying lambda.eqlin by -1.

For the swapping issue, I tried the same example with just the bound constraints:

[x, obj_qp, INFO, lambda] = qp ([],H,f,[],[],lb,ub)
lambda= 
   0.07377
   0.00000
   0.29791
   0.00000
   0.00000
   0.00000
   0.00000
   0.00000

but only considering lower bound constraints gave:
[x, obj_qp, INFO, lambda] = qp ([],H,f,[],[],lb,[])
lambda=
   0.07377
   0.29791
   0.00000
   0.00000


which is how it is supposed to be. Tracing back, I found out that the ordering of lambda vector in qp.m [2] is not [equality constraints; lower bounds; upper bounds; other inequality constraints] like I previously assumed. From lines 287 and 288 in qp.m [2], the bound constraints are added to the inequality constraint matrix alternatively. So the issue wasn't swapping but understanding how the constraints are passed to __qp__.

In my code in [1], I had to make significant changes in the original qp.m code such as:
- Inequality constraint matrix has the order: [Linear inequality constraints; lower bounds; upper bounds]
- Check for too close bounds forming an equality constraint- This brings indexing issues as now the Lagrange multiplier value corresponding to bounds is in place of multipliers corresponding to linear equality constraints.
Also, Matlab only accepts too close bounds when using a medium scale algorithm and since the lower bound is approximately equal to the upper bound and it is considered as a single equality constraint, the single Lagrange multiplier is placed in the corresponding lambda.upper field while the corresponding lambda.lower value is zero.

Continuing the above example:

lb(4) = 0.3;
ub = 0.3*ones(4,1);


[x, obj_qp, INFO, lambda] = qp ([],H,f,[],[],lb,ub)
lambda =

  -0.04828
   0.04154
   0.00000
   0.28615
   0.00000
   0.00000
   0.01546


Here, lb(4) = ub(4) and hence the constraint is treated as an equality constraint so the value for corresponding Lagrange multipliers is present on top (underlined)
I added checks for such cases and now my code in [1] gives same results as Matlab:

[x,obj,flag,op,lambda]=quadprog(H,f,[],[],[],[],lb,ub)

lambda =
  scalar structure containing the fields:
    lower =
       0.04154
       0.28615
       0.00000
       0.00000
    upper =

       0.000000
       0.000000
       0.015456
       0.048279


- qp.m strips off the -Inf constraints before passing them to __qp__. I am doing the same in quadprog. I have added further checks to make sure the multipliers are placed in the right positions in their respective fields. 

Plans for the next weeks:
- Get feedback from my mentor on the changes in quadprog.
- Begin intial work on optimoptions.

[1] https://github.com/AsmaAfzal/octave_workspace/blob/master/quadprog.m
[2] http://fossies.org/dox/octave-4.0.0/qp_8m_source.html

Tuesday, 14 July 2015

Week 7: quadprog wrapping __qp__

I have wrapped quadprog on __qp__ instead of qp.m in [1].

Main differences between quadprog in [1] and qp.m.

- Input argument placement
  quadprog(H, f, A, b, Aeq, beq, lb, ub, x0, options)  =  qp (x0, H, f, Aeq, beq, lb, ub, [], A, b, options) 

- Check for empty inputs A and b
qp ([],H,f,Aeq,beq,lb,ub,[],A,[])

This works. qp simply ignores inequality constraints due to if checks
in lines  258, 266 and 275 of qp.m. Matlab gives an error if A is empty and b is not and vice versa.

Example:
quadprog (H, f, A, [], Aeq, beq, lb, ub)
Error: The number of rows in A must be the same as the length of b. I have added this check in line 181 in [1].

- Lambda output as a structure instead of a vector as in qp.m.

Ordering of lambda:
  • The order of lambda vector output (qp_lambda) from __qp__(in my code) is [equality constraints; inequality constraints; lower bounds; upper bounds]. 
  • The multipliers are present if the constraints are given as inputs so the size of qp_lambda depends on the size of constraints. 
  • Variables idx_ineq, idx_lb and idx_ub make sure I pick the right values. 
Example:

H = diag([1; 0]);
f = [3; 4];
A = [-1 -3; 2 5; 3 4];
b = [-15; 100; 80];
l = zeros(2,1);
[x, obj, info, qp_lambda] = qp ([], H, f, [], [],l,[],[], A, b)
[x,fval,exitflag,output,lambda] = quadprog (H, f, A, b,[],[],l,[])

qp_lambda =

   1.66667
   0.00000
   1.33333
   0.00000
   0.00000

lambda =
scalar structure containing the fields:

    lower =

       1.66667
       0.00000

    upper = [](0x0)
    eqlin = [](0x0)
    ineqlin =

       1.33333
       0.00000
       0.00000


Things to do:
  • Check the sign issue for lambda.eqlin (qp gives values -1* Matlab's)
  • Check if __qp__ changes the order of constraints. The values of lambda from qp.m in the last example in [2] are there but not coinciding with the respective constraints.
  • Move on to optimoptions.

[1] https://github.com/AsmaAfzal/octave_workspace/blob/master/quadprog.m
[2] https://github.com/AsmaAfzal/octave_workspace/blob/master/runquadprog.m

Thursday, 2 July 2015

Week 5 and 6: Refining fmincon

So my fmincon implementation is coming in shape [1]. 

[x,fval,exitflag,output,lambda,grad,hessian] = 
fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)

I came across a few issues which turned out to be bugs. Olaf pushed fixes in the central repository. Listing the issues for the record:
Setting gradc (the gradient of general equality/inequality functions)A bug in nonlin_min.m (and __nonlin_residmin__.m)
Example:
      objective_function = @ (p) p(1)^2 + p(2)^2;
 pin = [-2; 5];
 constraint_function = @ (p) p(1)^2 + 1 - p(2);
 gradc = @(p) [2*p(1);-1];
 [p, objf, cvg, outp] = nonlin_min (objective_function, pin, optimset
("equc", {constraint_function, gradc}))

error: function handle type invalid as index value
- Giving linear inequality/ equality constraints to lm_feasible. A bug in nonlin_min.m
Example:
      f = @(x) -x(1) * x(2) * x(3);
S = [1  -1;   2  -2;   2  -2]
b = [0;72];
x0 = [10;10;10];
[x,fval] = nonlin_min( f, x0, optimset ("inequc",{S,b}) )
      error: __lm_feasible__: operator -: nonconformant arguments (op1 is
3x1, op2 is 3x0)
 
-Any zero value in initial guess vector for nonin_residmin/nonlin_min gave an error. Required a minor change of sign(0)==0 in __dfdp__.m.
Example:
      k = 1:10;
func = @(x) 2 + 2 * k - exp (k * x(1)) - exp (k * x(2));
x0 = [0;0.5];
x = nonlin_residmin(func,x0)

warning: division by zero
warning: called from
    __dfdp__ at line 367 column 21
    __nonlin_residmin__> at line -1 column -1
    __lm_svd__ at line 191 column 9
    __nonlin_residmin__ at line 1125 column 21
    nonlin_residmin at line 98 column 21
    runlsqnonlin at line 9 column 3
error: svd: cannot take SVD of matrix containing Inf or NaN values

Functionality for Returning Hessian and Gradient
New options "ret_objf_grad" and "ret_hessian" to be introduced in nonlin_min (by Olaf). If anyone of these options is set to true, the 'outp' structure output argument of nonlin_min will contain additional fields .objf_grad and .hessian. My code currently checks this. 

Rearranging values of lambda in the fields of a structure.
- For lm_feasible, outp will contain an additional field lambda, a structure which contains Lagrange multipliers in fields separated by constraint type.

I added an additional feature in [1] to cater for the non linear constraint function set using deal();
Example:
      
      c = @(x) [x(1) ^ 2 / 9 + x(2) ^ 2 / 4 - 1;
        x(1) ^ 2 - x(2) - 1];
ceq = @(x) tanh (x(1)) - x(2);
nonlinfcn = @(x) deal (c(x), ceq(x));
obj = @(x) cosh (x(1)) + sinh (x(2));
z = fmincon (obj, [0;0], [], [], [], [], [], [], nonlinfcn)

z =
   -0.6530
   -0.5737

To do:
1- Write test cases/refined examples for for lsqnonlin, lsqcurvefit, nlinfit and fmincon.
2- Start wrapping quadprog to __qp__ instead of qp.m (because of the ordering of the lambda output).

Wednesday, 24 June 2015

Progress Update: Midterm Evaluation


Objective:
Adding functions to the Optim package for Octave using existing back-ends.

Expected deliverables before midterm:
  • 'lsqnonlin' using 'nonlin_residmin'
Done in [1]. Differences in backends, nonlin_residmin uses "lm_svd" algorithm for optimization as currently the only backend. However, lsqnonlin in Matlab can choose between "trust-region-reflective" and "Levenberg-Marquardt" (LM) algorithms.
Another difference is in complex inputs. lm_svd does not support complex valued inputs whereas Matlab's LM algorithm can accept complex input arguments. One way of providing complex inputs to lsqnonlin in Octave is to split the real and imaginary parts into separate variables and running the optimization.  
  • 'lsqcurvefit' using 'nonlin_curvefit', 'nonlin_residmin', or 'lsqnonlin'
Done in [2] using nonlin_curvefit. lsqcurvefit is very similar to lsqnonlin with only a few minor interface differences. lsqcurvefit explicitly takes independent variables and the observations as inputs while these values can be wrapped inside the objective function while using lsqnonlin. Additional bounds for the optimized parameters can be specified. 
  • 'nlinfit' using 'leasqr',
I wrapped nlinfit on nonlin_curvefit and curvefit_stat as leasqr repeats the optimization to compute the additional statistics (Jacobian and Covariance matrix) while curvefit_stat saves this computation overhead. I have partially implemented nlinfit in [3] (It hasn't been thoroughly reviewed yet). Two missing features are: 1) Error models and Error parameters estimation, and 2) Robust Weight function. Meanwhile, no such functionality exists in the Octave's optimization backends for the missing features. My current implementation supports the input of scalar positive array of weights for robust regression.  
Since nlinfit is from the statistics toolbox of Matlab, it uses statset and statget to create and get options respectively. I created additional functions statset, statget and __all_stat_opts__ with minor changes to the code in optimset, optimget and __all_opts__.
  • 'fmincon' using 'nonlin_min',
In progress [4].


Future goals:
    1. Complete fmincon implementation.
    2. Create solver specific options using optimoptions and desirably still be able to use optimget. 
    3. Arranging lambda output for quadprog by wrapping it on __qp__ instead of qp.m
    4. Test cases for all the implemented functions.

[1] https://github.com/AsmaAfzal/octave_workspace/blob/master/lsqnonlin.m
[2] https://github.com/AsmaAfzal/octave_workspace/blob/master/lsqcurvefit.m
[3] https://github.com/AsmaAfzal/octave_workspace/blob/master/nlinfit.m
[4] https://github.com/AsmaAfzal/octave_workspace/blob/master/fmincon.m

Monday, 22 June 2015

Week4: fmincon wrapping nonlin_min

Time flies.. A third of the way through already..

fmincon/nonlin_min is the most elaborate function of all that I have previously 
implemented so before actual coding I would like to thoroughly check the the 
mapping of arguments and options. 

[x,fval,exitflag,output,lambda,grad,hessian] = fmincon (fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
[x,fval,exitflag,output] = nonlin_min (fun,x0,settings)

A,b Aeq, beq- Linear inequality and equality constraints
Inequality constraints:
Matlab standard: A * x - b <= 0 ,
Octave standard: m.' * x  + v >= 0. 
This implies: m=-A.' , v=b.
Set in Octave using 
optimset ("inequc", {m,v})
Similar for equality constraints:
optimset ("equc", {m,v})

lb, ub- Lower and upper bounds
Set in Octave using optimset
 optimset ("lbound", ..., "ubound",...)

nonlcon- Nonliner constraint function handle 
In Matlab, nonlinear constraints are given in a function with the following format

function [c,ceq,gradc,gradceq] = mycon(x)
c = ...     % Compute nonlinear inequalities at x.  
ceq = ...   % Compute nonlinear equalities at x.
% Optional output arguments:
gradc = ...   % Gradient of c(x).
gradceq = ...   % Gradient of ceq(x).
options = optimoptions('fmincon','GradConstr','on')

Alternative to nonlcon in nonlin_min
  optimset ("equc", {constraint_function})

Options- Options common to both fmincon and nonlin_min
User-supplied Gradient
In Matlab, the objective function must return the second output when the 
GradObj option is set:
optimset('GradObj','on')
Octave:
optimset("objf_grad",@fun)
User-supplied Hessian
In Matlab, the objective function must return the third output when the Hessian 
option is set:
optimset('Hessian','user-supplied')
Octave:
optimset("objf_hessian",@fun)

Lambda- returned as a structure in Matlab but as a vector field of the "output" 
ouput argument in Octave.

Things to investigate:
1. Algorithm mapping. 
2. Exitflag mapping.
3. Setting gradc (the gradient of general equality/inequality functions). The second 
entry for the equc/inequc setting implements this feature in Octave as stated by 
optim_doc but I was unable to make it work properly.
4. Returning additional statistics (Hessian, Gradient)- I used 
residmin_stat/curvefit_stat previously for lsqnonlin and lsqcurvefit. No such function for nonlin_min.
5. Rearranging values of lambda in the fields of a structure.

P.s. this week I spent quite some time with Mercurial and how to possibly work in the optim package. There was a slight miscommunication/confusion but it is clear now and I will continue to publish my code on github..




Sunday, 14 June 2015

Week3: nlinfit, statset, stat_get and __all_stat_opts__

So this week I achieved the following milestones:

Wrapping nlinfit on nonlin_curvefit

[beta,R,J,CovB,MSE,ErrorModelInfo] = nlinfit(X,Y,modelfun,beta0,options,Name,Value)

Implementation [1]:

I chose not to wrap nlinfit on lsqcurvefit because 
  1. We might end up wrapping lsqcurvefit on lsqnonlin eventually so it is undecided.
  2. The default options for lsqnonlin/lsqcurvefit are different from nlinfit. 
Missing features:
  1. RobustWgtFun - The field RobustWgtFun in options can be provided with a function handle which computes the residuals at every iteration. The backend optimization algorithm in Octave currently does not support this functionality.  
  2. Name-Value pairs. Currently the only implemented one is "weights",  which takes an array of weights for the weighted optimization. "ErrorModelInfo" and "ErrorParameters" are not implemented. The possible error models include, "constant", "proportional" and "combined". The error model also translates to a weight function which helps in reducing the effect of outliers.   
  3. ErrorModelInfo- output field which gives information about the error variance, and estimates the parameters of Error models. 
Setting options using statset

In Matlab, options for statistics toolbox are set using statset [2].  The functionality is almost identical to optimset but the different functions in Matlab are because of different toolboxes (statset for statistics and optimset for optimization toolbox).

Added functions:
statset.m, statget.m and __all_stat_opts__.m [3]-[5]
Creating these functions was pretty straight forward.

Still to do:
  1. Have to check if the weighted residual and weighted Jacobian output in octave is consistent with Matlab and further refine the functions with the feedback from my mentors.
  2. Move on to fmincon wrapping nonlin_min.
[1] https://github.com/AsmaAfzal/octave_workspace/blob/master/nlinfit.m
[2] http://uk.mathworks.com/help/stats/statset.html
[3] https://github.com/AsmaAfzal/octave_workspace/blob/master/statset.m
[4] https://github.com/AsmaAfzal/octave_workspace/blob/master/statget.m
[5] https://github.com/AsmaAfzal/octave_workspace/blob/master/__all_stat_opts__.m

Tuesday, 9 June 2015

Week 2: lsqnonlin and lsqcurvefit

A bit late blogging about week 2. 

Almost completed functions lsqnonlin and lsqcurvefit. 
Successfully mapped user-specified Jacobian. 
In Matlab, if the Jacobian option is set to "on",  the model function must return 
a second output which is the Jacobian function.
In Octave, the Jacobian function handle is given to the dfdp option using optimset.

Lsqnonlin function description:

[x,resnorm,residual,exitflag,output,lambda,jacobian] = ...
lsqnonlin(fun,x0,lb,ub,options)

This function maps on:

[x, residual, exitflag, output] = nonlin_residmin (fun, x0, options)

Features of lsqnonlin in Octave:
  • Input arguments: Acceptable forms are lsqnonlin(fun,x0), lsqnonlin(fun,x0,lb,ub) and lsqnonlin(fun,x0,lb,ub,options)
  • Outputs
    • x, exitflag, residual and output currently same as nonlin_residmin.
    • resnorm=sum(residual.^2)
    • Lambda is computed using the complementary pivoting in __lm_svd__.
      It's values differ from Matlab's due to the difference in backends.
    • Jacobian is computed using the function residmin_stat ().


Lsqcurvefit function description:

[x,resnorm,residual,exitflag,output,lambda,jacobian] = ...
lsqcurvefit(fun,x0,xdata,ydata,lb,ub,options)

This function maps on

[x, fy, exitflag, output] = nonlin_curvefit (fun, x0, xdata, ydata, options)

Features of lsqcurvefit in Octave:
  • Input arguments: Acceptable forms are lsqcurvefit (fun,x0,xdata,ydata), lsqcurvefit (fun,x0,xdata,ydata,lb,ub) and lsqcurvefit (fun,x0,lb,ub,xdata,ydata,options)
  • Outputs
    • x, exitflag, residual and output currently same as nonlin_curvefit.
    • residual = fy-ydata, resnorm = sum(residual.^2)
    • Lambda and Jacobian same as in lsqnonlin.
There are only minor interface differences between lsqcurvefit and lsqnonlin.

This week's plan:

  • Hopefully, with lsqnonlin and lsqcurvefit wrapped up, I'll move on to nlinfit
  • Three key challenges need to be addressed when wrapping nlinfit using nonlin_curvefit
    and curvefit_stat:
    • Weight functions: Currently, no such functionality exists in nonlin_curvefit,
       where a user can specify weight functions to perform Robust regression
      (weights computed using the specified function in every iteration).
    • Error Models and ErrorModelInfo
    • Setting options using statset instead of optimset or optimoptions. 

Sunday, 31 May 2015

optimoptions


Answering some questions to better understand the behavior of optimoptions.
Although the answers have been discussed in [1], I am pasting some examples for clarity.
*I was using the terms solver/algorithm in the wrong context before. Solver names are functions such as lsqnonlin, fmincon, etc. and one solver can have multiple algorithms such as interior-point, lev-mar, etc. *

1) Is there an error or warning if optimoptions is used to assign an option not contained in the specified algorithm? 

If the option belongs to a different algorithm of the solver, Matlab stores it as options "not" used by the current algorithm so if we change the algorithm to which the option belongs, we do not need to set it again. 

For eg. The default algorithm of lsqnonlin is trust-region-reflective but when I try to set the option 'ScaleProblem' which is specific to 'levenberg-marquardt' algorithm, I get: 

opts = optimoptions ('lsqnonlin', 'ScaleProblem', 'Jacobian')

opts = 

  lsqnonlin options:

   Options used by current Algorithm ('trust-region-reflective'):
   (Other available algorithms: 'levenberg-marquardt')

   Set by user:
     No options set by user.

   Default:

 Algorithm: 'trust-region-reflective'
    DerivativeCheck: 'off'
     .               .  
     .               .  
     .               .  
     .               .  
    TolX: 1.0000e-06

   Show options not used by current Algorithm ('trust-region-reflective')

Set by user:
    ScaleProblem: 'jacobian'

This gives the same result using dot notation:

opts = optimoptions ('lsqnonlin')
opts.ScaleProblem = 'Jacobian'

2) ... or to assign an option that does not exist in the specified solver or any solver?

This gives an error. Trying to set the option for SQP Algorithm, which is not used by lsqnonlin:

opts=optimoptions('lsqnonlin', 'MaxSQPIter', 100)
Error using optimoptions
'MaxSQPIter' is not an option for LSQNONLIN

using dot notation:

opts.MaxSQPIter = 100;
No public field MaxSQPIter exists for class optim.options.Lsqnonlin.
 
3) If options are transfered to a different solver with optimoptions, are there errors or warnings if the new solver does not have some of these options?

No errors or warnings. The options common to both solvers are copied. They could be options of different algorithms. For eg. 

opts = optimoptions ('fmincon', 'Algorithm', 'sqp', 'TolX', 1e-10) 
opts_lsq = optimoptions ('lsqnonlin', opts) 

Options set in opts_lsq are: 
                PrecondBandWidth: 0 
                TolX: 1.0000e-10 

The option PrecondBandwidth belongs to the trust-region algorithm of fmincon solver. 
Another option copied from opts in opts_lsq belongs to the lev-mar algorithm of lsqnonlin. It is the stored option as mentioned in 1) 

  ScaleProblem: 'none' 
 
4) Can options returned by optimoptions be used with optimset, and   vice versa? 

This returns an error in both cases:

opts = optimoptions ('lsqnonlin','MaxIter',100);
opts = optimset (opts, 'TolX', 1e-6);
Error using optimset
Cannot use OPTIMSET to alter options created using OPTIMOPTIONS.
Use OPTIMOPTIONS instead.

opts = optimset ('TolX', 1e-6);
opts = optimoptions ('lsqnonlin', opts);
Error using optimoptions
Invalid option name specified. Provide a string (such as 'Display').

References: 
[1] http://octave.1599824.n4.nabble.com/lsqnonlin-and-nonlin-residmin-tp4670540p4670569.html