batch_learner.agglo_gfmm

General fuzzy min-max neural network trained by the agglomerative learning algorithm with full similarity matrix.

class hbbrain.numerical_data.batch_learner.agglo_gfmm.AgglomerativeLearningGFMM(theta=0.5, gamma=1, min_simil=0.5, simil_measure='mid', asimil_type='max', is_draw=False)

Bases: BaseGFMMClassifier

Agglomerative learning algorithm with full similarity matrix for a general fuzzy min-max neural network with numerical data.

See [1] for more detailed information regarding this learning algorithm.

Note

This implementation uses the accelerated mechanism presented in [2] to speed up training.
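The following is a minimal sketch of the acceptance tests that an agglomeration step of this kind typically applies before merging two same-class hyperboxes: the similarity must reach min_simil, the merged hyperbox must not exceed theta in any dimension, and it must not overlap hyperboxes of other classes. The function names (`merged_box`, `within_theta`, `overlaps`, `can_merge`) are hypothetical and the code uses plain Python lists; it is an illustration under these assumptions, not the library's implementation.

```python
def merged_box(v_a, w_a, v_b, w_b):
    """Smallest hyperbox covering boxes a and b (per-feature min and max)."""
    v = [min(p, q) for p, q in zip(v_a, v_b)]
    w = [max(p, q) for p, q in zip(w_a, w_b)]
    return v, w


def within_theta(v, w, theta):
    """True if no edge of the box is longer than theta."""
    return all(hi - lo <= theta for lo, hi in zip(v, w))


def overlaps(v_a, w_a, v_b, w_b):
    """True if the two boxes intersect in every dimension."""
    return all(lo_a <= hi_b and lo_b <= hi_a
               for lo_a, hi_a, lo_b, hi_b in zip(v_a, w_a, v_b, w_b))


def can_merge(box_a, box_b, similarity, other_class_boxes,
              theta=0.5, min_simil=0.5):
    """Each box is a (min_point, max_point) pair of the same class (assumed)."""
    v, w = merged_box(*box_a, *box_b)
    return (similarity >= min_simil
            and within_theta(v, w, theta)
            and not any(overlaps(v, w, vo, wo) for vo, wo in other_class_boxes))


a = ([0.1, 0.1], [0.2, 0.2])
b = ([0.3, 0.15], [0.4, 0.3])
far_box = ([0.8, 0.8], [0.9, 0.9])      # other class, far away
inner_box = ([0.2, 0.2], [0.25, 0.25])  # other class, inside the merged region

print(can_merge(a, b, 0.9, [far_box]))              # all three tests pass
print(can_merge(a, b, 0.9, [far_box], theta=0.25))  # merged box too large
print(can_merge(a, b, 0.9, [inner_box]))            # overlaps another class
```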

Parameters:

theta (float, optional, default=0.5): Maximum hyperbox size for numerical features.
gamma (float or ndarray of shape (n_features,), optional, default=1): Sensitivity parameter controlling how quickly the membership function decreases in each continuous feature.
min_simil (float, optional, default=0.5): Minimum similarity required for two hyperboxes to be agglomerated.
simil_measure ({'short', 'long', 'mid'}, optional, default='mid'): Similarity measure used to compare two hyperboxes. It can be based on the shortest gap ('short'), the middle gap ('mid'), or the longest gap ('long') between the two hyperboxes.
asimil_type ({'max', 'min'}, optional, default='max'): Used when simil_measure is 'mid'. Selects whether the maximum or the minimum of the two differing values produced by the middle-distance similarity measure is taken.
is_draw (boolean, optional, default=False): Whether to show the construction of hyperboxes progressively on a canvas window during training.
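To make the roles of gamma, simil_measure and asimil_type more concrete, here is a rough sketch of how gap-based similarity measures of this kind are commonly written. It assumes a ramp function f(r, gamma) = clip(r * gamma, 0, 1) and takes the minimum over features. The exact formulas used in [1] and in the library may differ, and all function names below (`ramp`, `simil_short`, `simil_long`, `simil_mid`) are hypothetical.

```python
def ramp(r, g):
    """Ramp threshold: 0 for r*g < 0, r*g in between, 1 for r*g > 1."""
    return min(1.0, max(0.0, r * g))


def _per_feature(gamma, n):
    """Accept a scalar gamma or one value per feature."""
    return list(gamma) if hasattr(gamma, "__len__") else [gamma] * n


def simil_short(v_j, w_j, v_h, w_h, gamma=1):
    """Similarity based on the shortest gap between two hyperboxes (assumed form)."""
    g = _per_feature(gamma, len(v_j))
    return min(min(1 - ramp(v_h[i] - w_j[i], g[i]),
                   1 - ramp(v_j[i] - w_h[i], g[i])) for i in range(len(v_j)))


def simil_long(v_j, w_j, v_h, w_h, gamma=1):
    """Similarity based on the longest gap between two hyperboxes (assumed form)."""
    g = _per_feature(gamma, len(v_j))
    return min(min(1 - ramp(w_h[i] - v_j[i], g[i]),
                   1 - ramp(w_j[i] - v_h[i], g[i])) for i in range(len(v_j)))


def _mid_one_way(v_j, w_j, v_h, w_h, g):
    """One direction of the middle-gap measure; swapping j and h can change it."""
    return min(min(1 - ramp(w_h[i] - w_j[i], g[i]),
                   1 - ramp(v_j[i] - v_h[i], g[i])) for i in range(len(v_j)))


def simil_mid(v_j, w_j, v_h, w_h, gamma=1, asimil_type="max"):
    """Combine the two directional values with max or min (assumed form)."""
    g = _per_feature(gamma, len(v_j))
    s_jh = _mid_one_way(v_j, w_j, v_h, w_h, g)
    s_hj = _mid_one_way(v_h, w_h, v_j, w_j, g)
    return max(s_jh, s_hj) if asimil_type == "max" else min(s_jh, s_hj)


# Two one-dimensional hyperboxes: j = [0.1, 0.2] and h = [0.4, 0.6].
v_j, w_j, v_h, w_h = [0.1], [0.2], [0.4], [0.6]
print(simil_short(v_j, w_j, v_h, w_h))
print(simil_mid(v_j, w_j, v_h, w_h, asimil_type="max"))
print(simil_mid(v_j, w_j, v_h, w_h, asimil_type="min"))
print(simil_long(v_j, w_j, v_h, w_h))
```

Under these assumed formulas the longest-gap measure is the strictest and the shortest-gap measure the most permissive, so the same min_simil threshold admits fewer merges with 'long' than with 'short'.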

References

[1]

B. Gabrys, “Agglomerative learning algorithms for general fuzzy min-max neural network”, Journal of VLSI signal processing systems for signal, image and video technology, vol. 32, no. 1, pp. 67-82, 2002.

[2]

T.T. Khuat and B. Gabrys, “Accelerated learning algorithms of general fuzzy min-max neural network using a novel hyperbox selection rule,” Information Sciences, vol. 547, pp. 887-909, 2021.

Examples

>>> from hbbrain.numerical_data.batch_learner.agglo_gfmm import AgglomerativeLearningGFMM
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> from sklearn.preprocessing import MinMaxScaler
>>> scaler = MinMaxScaler()
>>> scaler.fit(X)
MinMaxScaler()
>>> X = scaler.transform(X)
>>> clf = AgglomerativeLearningGFMM(theta=0.1, min_simil=0.8, simil_measure='short')
>>> clf.fit(X, y)
>>> print("Number of hyperboxes = %d"%clf.get_n_hyperboxes())
Number of hyperboxes = 65
>>> clf.predict(X[[10, 50, 100]])
array([0, 1, 2])

Attributes:

V (array-like of shape (n_hyperboxes, n_features)): Matrix storing the minimal points for numerical features of all existing hyperboxes; each row is the minimal point of one hyperbox.
W (array-like of shape (n_hyperboxes, n_features)): Matrix storing the maximal points for numerical features of all existing hyperboxes; each row is the maximal point of one hyperbox.
C (array-like of shape (n_hyperboxes,)): Vector storing the class labels corresponding to the existing hyperboxes.
N_samples (array-like of shape (n_hyperboxes,)): Vector storing the number of samples fully included in each existing hyperbox.
is_exist_missing_value (boolean): Whether there are any missing values in the continuous features of the training data.
elapsed_training_time (float): Training time in seconds.
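As an illustration of how these attributes relate, the sketch below builds a toy V, W and C by hand and counts the samples that lie fully inside each hyperbox, which is the quantity N_samples describes. The values and helper names (`contains`, `count_included`) are made up for the example, and plain lists stand in for arrays. How the library updates N_samples during training is not shown here.

```python
# Row j of V and W holds the minimal and maximal point of hyperbox j.
V = [[0.1, 0.1], [0.6, 0.5]]
W = [[0.3, 0.4], [0.9, 0.8]]
C = [0, 1]  # class label of each hyperbox


def contains(v, w, x):
    """True if v <= x <= w holds in every feature."""
    return all(lo <= xi <= hi for lo, xi, hi in zip(v, x, w))


def count_included(V, W, X):
    """Number of samples in X fully included in each hyperbox."""
    return [sum(contains(v, w, x) for x in X) for v, w in zip(V, W)]


X = [[0.2, 0.2], [0.3, 0.4], [0.7, 0.6], [0.5, 0.5]]
print(count_included(V, W, X))  # [2, 1]
```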

Methods

`delay`([delay_constant])	Delay a time period to display hyperboxes
`draw_hyperbox_and_boundary`([window_name, ...])	Draw the existing hyperboxes and their decision boundaries among classes
`fit`(X, y)	Fit the model to the given training data using the agglomerative learning algorithm with the full similarity matrix.
`get_n_hyperboxes`()	Get number of hyperboxes in the trained hyperbox-based model
`get_params`([deep])	Get parameters for this estimator.
`get_sample_explanation`(xl, xu[, ...])	Get useful information for explaining the reason behind the predicted result for the input pattern
`initialise_canvas_graph`([n_dims, ...])	Initialise a canvas to draw hyperboxes
`predict`(X[, type_boundary_handling])	Predict class labels for samples in X.
`predict_proba`(X)	Predict class probabilities of the input samples X.
`predict_with_membership`(X)	Predict class membership values of the input samples X.
`score`(X, y[, sample_weight])	Return the mean accuracy on the given test data and labels.
`set_params`(**params)	Set the parameters of this estimator.
`show_sample_explanation`(xl, xu, ...[, ...])	Show explanation for predicted results of an input pattern under the form of parallel coordinates or hyperboxes in 2D or 3D planes.
`simple_pruning`(Xl_val, Xu_val, y_val[, ...])	Prune low-quality hyperboxes based on a pre-defined accuracy threshold for each hyperbox

fit(X, y)

Fit the model to the given training data using the agglomerative learning algorithm with the full similarity matrix.

Parameters:

X (array-like of shape (n_samples, n_features)): Training vector, where n_samples is the number of samples and n_features is the number of features.
y (array-like of shape (n_samples,)): Target vector relative to X.

Returns:

self (object): Fitted hyperbox-based model.

get_sample_explanation(xl, xu, type_boundary_handling=1)

Get information that helps explain the predicted result for an input pattern.

Parameters:

xl (ndarray of shape (n_features,)): Minimum point of the input pattern to be explained.
xu (ndarray of shape (n_features,)): Maximum point of the input pattern to be explained.
type_boundary_handling (int, optional, default=PROBABILITY_MEASURE (aka 1)): How samples located on the boundary are handled.

Returns:

y_pred (int): The predicted class of the input pattern.
dict_mem_val_classes (dictionary): Dictionary storing the membership values for all classes. The key is the class label and the value is the corresponding membership value.
dict_min_point_classes (dictionary): Dictionary storing the minimal points of the hyperboxes that have the maximum membership value for each class. The key is the class label and the value is the minimal points of all hyperboxes corresponding to that class.
dict_max_point_classes (dictionary): Dictionary storing the maximal points of the hyperboxes that have the maximum membership value for each class. The key is the class label and the value is the maximal points of all hyperboxes corresponding to that class.
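The per-class membership dictionary can be pictured with the sketch below. It assumes the usual general fuzzy min-max membership function, where the membership of an input [xl, xu] in a hyperbox [v, w] is the minimum over features of 1 - f(xu - w, gamma) and 1 - f(v - xl, gamma), with f a ramp clipped to [0, 1]. The helpers `ramp`, `membership` and `class_memberships` are hypothetical names, not the library's API.

```python
def ramp(r, g):
    """Ramp threshold: clip r*g to the interval [0, 1]."""
    return min(1.0, max(0.0, r * g))


def membership(xl, xu, v, w, gamma=1):
    """Membership of the input [xl, xu] in the hyperbox [v, w] (assumed form)."""
    g = list(gamma) if hasattr(gamma, "__len__") else [gamma] * len(v)
    return min(min(1 - ramp(xu[i] - w[i], g[i]),
                   1 - ramp(v[i] - xl[i], g[i])) for i in range(len(v)))


def class_memberships(xl, xu, V, W, C, gamma=1):
    """Highest membership value reached by any hyperbox of each class."""
    best = {}
    for v, w, c in zip(V, W, C):
        m = membership(xl, xu, v, w, gamma)
        if m > best.get(c, -1.0):
            best[c] = m
    return best


V = [[0.1, 0.1], [0.6, 0.5]]
W = [[0.3, 0.4], [0.9, 0.8]]
C = [0, 1]
x = [0.35, 0.2]  # a point sample, so xl == xu

mem = class_memberships(x, x, V, W, C, gamma=1)
y_pred = max(mem, key=mem.get)  # class with the largest membership value
print(mem, y_pred)
```

Raising gamma makes membership fall off faster outside a hyperbox: with gamma=4 the same sample keeps a membership of about 0.8 for class 0 but drops to 0 for class 1.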

predict(X, type_boundary_handling=1)

Predict class labels for samples in X.

Note

Several winner hyperboxes representing different class labels can have the same membership value with respect to an input pattern \(X_i\). In that case an additional criterion selects the final winner hyperbox, whose class label becomes the prediction for \(X_i\). The criterion is either a probability derived from the membership values and the number of samples included in the winner hyperboxes, or the Manhattan distance between the central points of the winner hyperboxes and the input sample.

Parameters:

X (array-like of shape (n_samples, n_features)): The data matrix for which we want to predict the targets.
type_boundary_handling (int, optional, default=PROBABILITY_MEASURE (aka 1)): How to choose among many winner hyperboxes, i.e., PROBABILITY_MEASURE or MANHATTAN_DIS.

Returns:

y_pred (ndarray of shape (n_samples,)): Vector containing the predictions. In binary and multiclass problems, this is a vector of n_samples predicted class labels.
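The Manhattan-distance criterion mentioned in the note can be sketched as follows: among winner hyperboxes with equal membership, choose the one whose central point is closest to the input sample in L1 distance. The function name `manhattan_tie_break` and the tuple layout are hypothetical. The probability-based criterion is not sketched because its formula is not given on this page.

```python
def manhattan_tie_break(x, winners):
    """Pick a label among tied winners.

    x       : input sample (list of feature values)
    winners : list of (min_point, max_point, label) tuples that share the
              same membership value for x (hypothetical layout)
    """
    def distance_to_centre(v, w):
        # L1 distance between x and the central point (v + w) / 2 of the box
        return sum(abs((lo + hi) / 2 - xi) for lo, hi, xi in zip(v, w, x))

    _, _, label = min(winners, key=lambda box: distance_to_centre(box[0], box[1]))
    return label


x = [0.5, 0.5]
winners = [
    ([0.2, 0.2], [0.4, 0.4], 0),  # centre (0.3, 0.3), distance 0.4
    ([0.5, 0.4], [0.7, 0.6], 1),  # centre (0.6, 0.5), distance 0.1
]
print(manhattan_tie_break(x, winners))  # 1
```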

simple_pruning(Xl_val, Xu_val, y_val, acc_threshold=0.5, keep_empty_boxes=False, type_boundary_handling=1)

Prune low-quality hyperboxes based on a pre-defined accuracy threshold for each hyperbox.

Parameters:

Xl_val (array-like of shape (n_samples, n_features)): Data matrix containing the lower bounds of the validation patterns.
Xu_val (array-like of shape (n_samples, n_features)): Data matrix containing the upper bounds of the validation patterns.
y_val (ndarray of shape (n_samples,)): Vector containing the true class label of each validation pattern.
acc_threshold (float, optional, default=0.5): The minimum accuracy a hyperbox needs in order to be kept unchanged.
keep_empty_boxes (boolean, optional, default=False): Whether to keep hyperboxes that take no part in the prediction process on the validation set. If True, they are kept; otherwise, whether to keep or remove them is decided based on the classification accuracy on the validation dataset.
type_boundary_handling (int, optional, default=PROBABILITY_MEASURE (aka 1)): How samples located on the boundary are handled.

Returns:

self: A hyperbox-based model with the low-quality hyperboxes pruned.
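The pruning rule can be pictured with the sketch below. It assumes that, for each validation sample, the index of the winner hyperbox is already known, and it keeps a hyperbox when the fraction of samples it classified correctly is at least acc_threshold. The function name `boxes_to_keep` is hypothetical. One labelled simplification: when keep_empty_boxes is False, this sketch simply drops unused hyperboxes, whereas the documented behaviour decides based on validation accuracy.

```python
def boxes_to_keep(winner_idx, y_val, box_labels,
                  acc_threshold=0.5, keep_empty_boxes=False):
    """Return one boolean per hyperbox: True means the box is kept.

    winner_idx : index of the hyperbox that predicted each validation sample
    y_val      : true label of each validation sample
    box_labels : class label of each hyperbox
    """
    n_boxes = len(box_labels)
    used = [0] * n_boxes     # how many samples each box predicted
    correct = [0] * n_boxes  # how many of those predictions were right
    for j, y_true in zip(winner_idx, y_val):
        used[j] += 1
        correct[j] += int(box_labels[j] == y_true)

    keep = []
    for j in range(n_boxes):
        if used[j] == 0:
            # Simplification: unused boxes follow keep_empty_boxes directly.
            keep.append(keep_empty_boxes)
        else:
            keep.append(correct[j] / used[j] >= acc_threshold)
    return keep


box_labels = [0, 1, 1]
winner_idx = [0, 0, 1, 1, 1]  # box 2 never wins on the validation set
y_val = [0, 0, 1, 0, 0]

print(boxes_to_keep(winner_idx, y_val, box_labels))
print(boxes_to_keep(winner_idx, y_val, box_labels, keep_empty_boxes=True))
```

Here box 0 is right on both of its samples and is kept, box 1 is right on only one of three and is pruned, and the fate of box 2 depends on keep_empty_boxes.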