[mlpack] Cross-validation and hyper-parameter tuning infrastructure

Kirill Mishchenko ki.mishchenko at gmail.com
Wed Mar 29 09:15:16 EDT 2017


Thanks for your answer; I was thinking about roughly the same solution.

I have yet another question, an organisational one. There are several evaluation phases during the coding period of the GSoC program, namely three: at the end of June, at the end of July, and at the end of August. My question is: should I plan my coding so that I finish implementing some logical part of the functionality by the start of each evaluation phase?

For example, suppose I plan to implement functionality A1, A2, and B such that I need to spend approximately a week on A1, another week on A2, and three weeks on B. Also suppose that A1 and A2 correspond to one logical module, and that B can be implemented (and tested) once at least one of A1 and A2 is done. If I decide to implement them in the order A1, B, A2, I will likely finish B approximately by the start of the first evaluation phase. On the other hand, if I implement them in the order A1, A2, B, I will likely be unable to finish B by the start of the first evaluation phase. So, which planning should I prefer in the described situation: the more logical one (A1, A2, B) or the evaluation-phase-oriented one (A1, B, A2)? Or does it just depend on my preferences?

Best regards,

Kirill Mishchenko



> On 22 Mar 2017, at 01:45, Ryan Curtin <ryan at ratml.org> wrote:
> 
> On Tue, Mar 21, 2017 at 05:09:46PM +0500, Kirill Mishchenko wrote:
>> Ryan,
>> 
>> I’m working on a proposal for the idea, and I am wondering whether the
>> hyper-parameter module should be flexible enough to support metrics
>> with different optimization directions. E.g., if we use accuracy as a
>> metric, then we want to find a model that maximises this metric; on the
>> other hand, if we use some kind of error as a metric (like mean squared
>> error), then we need to find a model that minimises it. So, again, the
>> question is whether the hyper-parameter module should be flexible
>> enough to maximise some metrics and minimise others?
> 
> Hi Kirill,
> 
> I agree, this would be nice to support.  Although it's true that you can
> maximize, e.g., (1 / RMSE), that might be a bit awkward.  One idea might
> be to use traits for the individual metrics being optimized, like:
> 
> /**
>  * I'm just making up this interface, maybe you have a better idea.
>  */
> class RMSE
> {
>  public:
>   // The signature almost certainly needs to be different.
>   template<typename Model>
>   static double Evaluate(Model& m,
>                          arma::mat& testData,
>                          arma::Mat<size_t>& testLabels);
> 
>   // This boolean can be used with SFINAE to determine whether
>   // minimization or maximization is necessary.  Maybe it could have a
>   // better name too.
>   static const bool NeedsMinimization = true;
> };
> 
> What do you think?  You could easily generate a wrapper class
> automatically for any type where NeedsMinimization is false, so that you
> could use the mlpack optimizers with it.  (Probably most of the
> optimizers won't be useful, but in the end any optimizers you implement
> for hyper-parameter tuning, like grid search, could probably follow the
> same API as the rest of the mlpack optimizers, and later be used for
> other tasks too.)
> 
> -- 
> Ryan Curtin    | "You can think about it... but don't do it."
> ryan at ratml.org |   - Sheriff Justice
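
For reference, here is a minimal sketch of the wrapper generation Ryan
describes above. It is only a sketch built on the quoted (made-up)
interface: NegatedMetric and MinimizedForm are hypothetical names, not
part of mlpack's actual API.

#include <type_traits>
#include <mlpack/core.hpp>

/**
 * Hypothetical sketch (not mlpack API): wraps any metric whose
 * NeedsMinimization is false and negates its value, so that a
 * minimizing optimizer effectively maximizes the original metric.
 */
template<typename MetricType>
class NegatedMetric
{
 public:
  template<typename Model>
  static double Evaluate(Model& m,
                         arma::mat& testData,
                         arma::Mat<size_t>& testLabels)
  {
    // Maximizing MetricType is equivalent to minimizing its negation.
    return -MetricType::Evaluate(m, testData, testLabels);
  }

  // After negation, minimization is always the right direction.
  static const bool NeedsMinimization = true;
};

// Choose the minimizable form of a metric at compile time.  A plain
// std::conditional is enough here; SFINAE would work as well.
template<typename MetricType>
using MinimizedForm = typename std::conditional<
    MetricType::NeedsMinimization,
    MetricType,
    NegatedMetric<MetricType>>::type;

With this, a hyper-parameter tuner could always minimize
MinimizedForm<Accuracy>::Evaluate(...) without caring about the
direction of the underlying metric (Accuracy is likewise just a made-up
example name here).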
