UBaymodel

UBayFS

class UBaymodel.UBaymodel(data, target, feat_names=[], M=100, tt_split=0.75, nr_features='auto', method=['mrmr'], prior_model='dirichlet', weights=[1], constraints=None, l=1, optim_method='GA', popsize=100, maxiter=100, random_state=None)

Initialization of a UBaymodel.

Parameters:

data (<numpy array> or <pandas dataframe>) – Dataset on which feature selection shall be performed. Variable types must be numeric or integer.
target (<numpy array> or <pandas dataframe>) – Response variable of data. Variable types must be numeric or integer.
feat_names (<list>) – List holding feature names. Preferably a list of string values. If empty, feature names will be generated automatically. Default: feat_names=[].
M (<int>) – Positive integer determining the number of ensemble models. Default: M=100.
tt_split (<float>) – Ratio of samples used for training a single ensemble model. Default: tt_split=0.75.
nr_features (<string or int>) –
Number of features selected in a single ensemble. Default: nr_features="auto".
- string="auto" : A random number between 1 and the total number of features.
- int : A positive integer.
method (<list of strings>) –
List of feature selectors used as ensemble feature selectors. Default: method=["mrmr"]. Current options are:
- mrmr : minimum Redundancy Maximum Relevance criterion. This method supports classification and regression tasks.
- chi : chi-squared statistic
- fisher : Fisher score (classification only)
prior_model (<string>) – Type of prior. Default: prior_model="dirichlet". So far, “dirichlet” is the only implemented prior model type.
weights (<list>) – A list of integers defining the prior weights of the features. If a list with only one entry is used, this value is assigned to each feature as prior weight. Default: weights=[1]
constraints (<UBayconstraint>) – A UBayconstraint object describing user-defined constraints. See description UBayconstraint. Default: constraints=None
l (<float>) – Positive float. The Lagrange parameter defining the penalization strength imposed on a feature set violating the constraints. Default: l=1
optim_method (<string>) – Optimizer. Currently only the genetic algorithm (“GA”) is available. Default: optim_method="GA"
popsize (<integer>) – Positive integer for the population size in GA. Default: popsize=100
maxiter (<integer>) – Positive integer for the maximal number of GA iterations. Default: maxiter=100
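
The defaulting rules described for feat_names, weights and nr_features can be sketched in plain Python. This is a minimal illustration and not the library's code: the helper name `resolve_defaults`, the generated name pattern `f0, f1, ...` and the use of `random.Random` for seeding are assumptions made for this example.

```python
import random


def resolve_defaults(n_features, feat_names=None, weights=None,
                     nr_features="auto", random_state=None):
    """Hypothetical helper mirroring the documented defaulting rules."""
    rng = random.Random(random_state)

    # Empty feat_names: generate names automatically.
    # The naming pattern used here is an assumption.
    if not feat_names:
        feat_names = [f"f{i}" for i in range(n_features)]

    # A weight list with a single entry is assigned to every feature.
    weights = [1] if weights is None else list(weights)
    if len(weights) == 1:
        weights = weights * n_features
    if len(weights) != n_features:
        raise ValueError("weights needs one entry, or one entry per feature")

    # "auto": a random number between 1 and the total number of features.
    if nr_features == "auto":
        nr_features = rng.randint(1, n_features)
    elif not (isinstance(nr_features, int) and nr_features > 0):
        raise ValueError("nr_features must be 'auto' or a positive integer")

    return feat_names, weights, nr_features
```

Passing a fixed `random_state` makes the "auto" choice reproducible in this sketch. Whether the library uses its own random_state argument in the same way is not stated here.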

admissibility(state, log=True)

Get admissibility of a feature set.

Parameters:

state (<np.array>) – Binary 1-d array indicating which features are selected (1) and which are not selected (0).
log (<boolean>) – Whether to use log-scale.

Return type: A numeric value.
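
The binary state encoding can be illustrated with two small helpers that convert between feature names and a 0/1 vector. Both function names are hypothetical and not part of the package. The sketch uses plain lists, so the result would still need to be wrapped in a NumPy array before being passed as state.

```python
def names_to_state(feat_names, selected):
    """Encode selected feature names as a 0/1 list aligned with feat_names."""
    selected = set(selected)
    unknown = selected - set(feat_names)
    if unknown:
        raise ValueError(f"unknown feature names: {sorted(unknown)}")
    return [1 if name in selected else 0 for name in feat_names]


def state_to_names(feat_names, state):
    """Decode a 0/1 list back into the names of the selected features."""
    if len(feat_names) != len(state):
        raise ValueError("state must have one entry per feature")
    return [name for name, bit in zip(feat_names, state) if bit == 1]
```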

evaluateFS(state, method='spearman', log=False)

Evaluate a feature set.

Return type: A <dictionary> with different key parameters of the selected feature set.

getConstraints()

Get side constraints.

Return type: A list.

getOptim()

Get optimization parameters.

Return type: A dictionary with the optimization parameters.

getWeights()

Get prior weights.

Return type: A numpy array with the prior weights.

posteriorExpectation()

Posterior expectation score.

Return type: A numeric value.

sampleInitial(post_scores, size)

Sample an initial feature set based on a search heuristic.

Return type: A binary <numpy array> feature set.

setConstraints(constraints, append=False)

Set side constraints.

Parameters:

constraints (<UBayconstraint>) – A UBayconstraint object describing user-defined constraints. See description UBayconstraint.
append (<boolean>) –
- True: Append a new constraint to the list of present constraints
- False: Replace all present constraints with the new constraint
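
The difference between the two append modes can be sketched with a small stand-in class. `ConstraintStore` and its method names are hypothetical and only mimic the documented behaviour. Strings stand in for UBayconstraint objects.

```python
class ConstraintStore:
    """Hypothetical stand-in that mimics the documented append semantics."""

    def __init__(self):
        self._constraints = []

    def set_constraints(self, constraint, append=False):
        if append:
            # append=True: add the new constraint to the present ones.
            self._constraints.append(constraint)
        else:
            # append=False: replace all present constraints with the new one.
            self._constraints = [constraint]

    def get_constraints(self):
        # Return a copy so callers cannot modify the internal list.
        return list(self._constraints)
```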

setOptim(optim_method, popsize, maxiter)

Set parameters for optimization.

Parameters:

optim_method (<string>) – Currently only genetic algorithm (“GA”) possible.
popsize (<integer>) – Positive integer for the population size in GA.
maxiter (<integer>) – Positive integer for the maximal number of GA iterations.

setWeights(weights, block_list=None, block_matrix=None)

Set prior weights.

Parameters:

weights (<list>) – A list of integers defining the prior weights of the features. If a list with only one entry is used, this value is assigned to each feature as prior weight.
block_list (<list>) – Block assignment information for features.
block_matrix (<np.array>) – Numpy array matrix defining the block assignment information for features.
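
The text does not specify the layout of the block assignment. One plausible reading, assumed here purely for illustration, is that block_list holds one list of feature indices per block, and block_matrix is the equivalent 0/1 matrix with one row per block and one column per feature. The helper `block_list_to_matrix` is hypothetical and uses nested lists instead of a NumPy array.

```python
def block_list_to_matrix(block_list, n_features):
    """Build a blocks-by-features 0/1 matrix from per-block index lists.

    The assumed layout (rows are blocks, columns are features) is an
    illustration, not a documented guarantee of the library.
    """
    matrix = []
    for block in block_list:
        row = [0] * n_features
        for idx in block:
            if not 0 <= idx < n_features:
                raise IndexError(f"feature index {idx} out of range")
            row[idx] = 1
        matrix.append(row)
    return matrix
```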

train()

Train the UBaymodel.

Returns:

<pandas dataframe> with the optimal feature set and their names as index
<list> of selected feature names
