multigranular_learner.multi_resolution_gfmm

A multi-resolution hierarchical granular representation based classifier using general fuzzy min-max neural network.

class hbbrain.numerical_data.multigranular_learner.multi_resolution_gfmm.MultiGranularGFMM(n_partitions=4, granular_theta=[0.1, 0.2, 0.3], gamma=1, min_membership_aggregation=0.5, random_state=0)[source]

Bases: BaseHyperboxClassifier

A multi-resolution hierarchical granular representation based classifier using general fuzzy min-max neural network.

This class implements the multi-granular learning algorithm, which constructs classifiers from multiresolution hierarchical granular representations using hyperbox fuzzy sets. The algorithm forms a series of granular inferences hierarchically through many levels of abstraction. An attractive characteristic of our classifier is that, by reusing the knowledge learned at lower levels of abstraction, it can maintain high accuracy in comparison to other fuzzy min-max models even at a low degree of granularity. In addition, our approach can reduce the data size significantly and handle the uncertainty and incompleteness associated with data in real-world applications.

The construction process of the classifier consists of two phases. The first phase formulates the model at the greatest level of granularity, while the second phase reduces the complexity of the constructed model and deduces it from data at higher abstraction levels. The details of this algorithm can be found in [1].

Parameters:

n_partitions (int, default=4): Number of disjoint subsets into which the original training set is split to build base learners.
granular_theta (list of float, optional, default=[0.1, 0.2, 0.3]): Maximum hyperbox sizes at the granularity levels, one value per level.
gamma (float or ndarray of shape (n_features,), optional, default=1): A sensitivity parameter describing how fast the membership function decreases in each continuous feature.
min_membership_aggregation (float, optional, default=0.5): Minimum membership value between two hyperboxes for them to be aggregated into a larger hyperbox at a higher level of abstraction.
random_state (int, RandomState instance or None, default=0): Controls the stratified random sampling of the original dataset to form disjoint subsets for training base learners.
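
As a rough illustration of how gamma shapes the membership function, the sketch below evaluates the membership of an input box [xl, xu] in a hyperbox [v, w] using the ramp function commonly given for the general fuzzy min-max neural network. The names `ramp` and `membership` are illustrative and not part of hbbrain, and a scalar gamma is assumed.

```python
def ramp(r, gamma):
    """Ramp threshold function: clip r * gamma to the range [0, 1]."""
    return min(1.0, max(0.0, r * gamma))


def membership(xl, xu, v, w, gamma=1.0):
    """Membership of the input box [xl, xu] in the hyperbox [v, w].

    Illustrative only: per feature, the membership drops linearly (at a
    speed set by gamma) with how far the input sticks out of the hyperbox;
    the overall value is the minimum over all features.
    """
    per_feature = []
    for xl_i, xu_i, v_i, w_i in zip(xl, xu, v, w):
        above = 1.0 - ramp(xu_i - w_i, gamma)  # violation of the max point
        below = 1.0 - ramp(v_i - xl_i, gamma)  # violation of the min point
        per_feature.append(min(above, below))
    return min(per_feature)


box_v, box_w = [0.2, 0.2], [0.4, 0.4]
inside = membership([0.3, 0.3], [0.3, 0.3], box_v, box_w)
outside_slow = membership([0.5, 0.3], [0.5, 0.3], box_v, box_w, gamma=1.0)
outside_fast = membership([0.5, 0.3], [0.5, 0.3], box_v, box_w, gamma=4.0)
```

A point inside the hyperbox gets full membership, while the same outside point receives a lower membership as gamma grows.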

References

[1]

T.T. Khuat, F. Chen, and B. Gabrys, “An Effective Multiresolution Hierarchical Granular Representation Based Classifier Using General Fuzzy Min-Max Neural Network,” IEEE Transactions on Fuzzy Systems, vol. 29, no. 2, pp. 427-441, 2021.

Examples

>>> from sklearn.datasets import load_iris
>>> from hbbrain.numerical_data.multigranular_learner.multi_resolution_gfmm import MultiGranularGFMM
>>> X, y = load_iris(return_X_y=True)
>>> from sklearn.preprocessing import MinMaxScaler
>>> scaler = MinMaxScaler()
>>> scaler.fit(X)
MinMaxScaler()
>>> X = scaler.transform(X)
>>> clf = MultiGranularGFMM(n_partitions=2, granular_theta=[0.1, 0.2, 0.3, 0.4, 0.5], gamma=1, min_membership_aggregation=0.6, random_state=0)
>>> clf.fit(X, y)
>>> clf.predict(X[[10, 50, 100]])
array([0, 1, 2])
>>> clf.predict(X[[10, 50, 100]], level=0)
array([0, 1, 2])
>>> print("Number of hyperboxes at granularity 1 = %d"%clf.get_n_hyperboxes(0))
Number of hyperboxes at granularity 1 = 77
>>> clf.predict(X[[10, 50, 100]], level=4)
array([0, 1, 2])
>>> print("Number of hyperboxes at granularity 5 = %d"%clf.get_n_hyperboxes(4))
Number of hyperboxes at granularity 5 = 11

Attributes:

granularity_level (dict): A mapping between the maximum hyperbox size and the granular level.
smallest_theta (float): Maximum hyperbox size at the highest granularity level.
higher_level_theta (list of float): Maximum hyperbox sizes of higher abstraction levels apart from the highest granularity level.
granular_classifiers_ (ndarray of BaseGranular objects with shape (n_granularity_levels,)): An array of general fuzzy min-max neural networks, one for each granularity level.
base_learners_ (list): A list of base learners trained from disjoint subsets of input training patterns.
is_exist_missing_value (boolean): Whether there are any missing values in the continuous features of the training data.
elapsed_training_time (float): Training time in seconds.

Methods

`delay`([delay_constant])	Delay a time period to display hyperboxes
`draw_2D_hyperbox_and_boundary_granular_level`([...])	Draw the existing hyperboxes and their decision boundaries among classes at a given granularity level.
`draw_2D_hyperbox_and_boundary_partitions`([...])	Draw the existing hyperboxes and their decision boundaries among classes in a given partition.
`draw_hyperbox_and_boundary`([window_name, ...])	Draw the existing hyperboxes and their decision boundaries among classes
`fit`(X, y[, learning_type, X_val, y_val, ...])	Fit the model according to the given training data using the multi granularity learning algorithm.
`get_n_hyperboxes`([level])	Get number of hyperboxes at a given granularity level.
`get_n_hyperboxes_at_partition`([partition])	Get number of hyperboxes at a given partition.
`get_params`([deep])	Get parameters for this estimator.
`get_sample_explanation_granular_level`(xl, xu)	Get useful information for explaining the reason behind the predicted result for the input pattern
`granular_learning_phase_1`(Xl, Xu, y[, ...])	Training a granular general fuzzy min-max neural network using a learning algorithm in phase 1 to distribute disjoint subsets into working processes to build base learners.
`granular_learning_phase_2`()	Training a granular general fuzzy min-max neural network using a learning algorithm in phase 2 to reduce number of hyperboxes while keeping a good classification performance.
`initialise_canvas_graph`([n_dims, ...])	Initialise a canvas to draw hyperboxes
`predict`(X[, level])	Predict class labels for samples in X at a given granularity level.
`predict_at_partitions`(Xl, Xu[, partition])	Predict class labels for samples in the form of hyperboxes represented by lower bounds Xl and upper bounds Xu using the base learner at a given partition.
`predict_proba`(X[, level])	Predict class probabilities of the input samples X at a given granularity level.
`predict_with_membership`(X[, level])	Predict class memberships of the input samples X at a given granularity level.
`score`(X, y[, sample_weight])	Return the mean accuracy on the given test data and labels.
`set_params`(**params)	Set the parameters of this estimator.
`show_sample_explanation`(xl, xu, ...[, ...])	Show explanation for predicted results of an input pattern under the form of parallel coordinates or hyperboxes in 2D or 3D planes.
`simple_pruning`(V, W, C, N_samples, ...[, ...])	Simply prune low-quality hyperboxes based on a pre-defined accuracy threshold for each hyperbox.

draw_2D_hyperbox_and_boundary_granular_level(window_name='Hyperbox-based classifier and its decision boundaries', level=0)[source]

Draw the existing hyperboxes and their decision boundaries among classes at a given granularity level.

Note

This method only works on 2-dimensional datasets.

Parameters:

window_name (str, optional, default="Hyperbox-based classifier and its decision boundaries"): Name of the plotting window showing hyperboxes and their decision boundaries.
level (int, optional, default=0): The granularity level at which to draw the hyperboxes and their decision boundaries.

Returns:

None.

draw_2D_hyperbox_and_boundary_partitions(window_name='Base learners and its decision boundaries', partition=0, fig_num=100)[source]

Draw the existing hyperboxes and their decision boundaries among classes in a given partition.

Note

This method only works on 2-dimensional datasets.

Parameters:

window_name (str, optional, default="Base learners and its decision boundaries"): Name of the plotting window showing hyperboxes and their decision boundaries.
partition (int, optional, default=0): The partition whose hyperboxes and decision boundaries are drawn.
fig_num (int, optional, default=100): Index of the drawing canvas.

Returns:

None.

fit(X, y, learning_type=1, X_val=None, y_val=None, acc_threshold=0.5, keep_empty_boxes=False)[source]

Fit the model according to the given training data using the multi granularity learning algorithm.

Parameters:

X (array-like of shape (n_samples, n_features)): Training vectors, where n_samples is the number of samples and n_features is the number of features.
y (array-like of shape (n_samples,)): Target vector relative to X.
learning_type (enum (int), optional, default=HETEROGENEOUS_CLASS_LEARNING): Learning type used to build base learners from disjoint datasets. It takes one of two defined enum values: HETEROGENEOUS_CLASS_LEARNING or HOMOGENEOUS_CLASS_LEARNING. Heterogeneous class learning means that base learners are trained on the input samples in their original order. Homogeneous class learning means that input data are sorted and grouped by class label before the training process starts.
X_val (array-like of shape (n_val_samples, n_features), optional, default=None): A matrix containing the validation set, where n_val_samples is the number of validation samples and n_features is the number of features.
y_val (array-like of shape (n_val_samples,), optional, default=None): Target vector relative to X_val.
acc_threshold (float, optional, default=0.5): The minimum accuracy that each hyperbox must reach to be retained.
keep_empty_boxes (boolean, optional, default=False): Whether to keep the hyperboxes that do not take part in the prediction process on the validation set. If True, keep them; otherwise, the decision to keep or remove them is based on the classification accuracy on the validation dataset.

Returns:

self (object): Fitted multigranular general fuzzy min-max neural network.
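
The difference between the two learning types comes down to the order in which samples are presented. The sketch below illustrates that ordering only; `order_samples` is a hypothetical helper and does not reproduce how hbbrain handles the enum values internally.

```python
def order_samples(y, homogeneous=False):
    """Return sample indices in the order they would be presented.

    Hypothetical helper: heterogeneous learning keeps the input order,
    homogeneous learning groups the samples by class label first.
    """
    indices = list(range(len(y)))
    if homogeneous:
        # sorted() is stable, so samples of one class keep their relative order
        indices = sorted(indices, key=lambda i: y[i])
    return indices


labels = [1, 0, 2, 0, 1]
heterogeneous_order = order_samples(labels)
homogeneous_order = order_samples(labels, homogeneous=True)
```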

get_n_hyperboxes(level=-1)[source]

Get number of hyperboxes at a given granularity level.

Parameters:

level (int, optional, default=-1): The granularity level for which to get the number of hyperboxes. If level is -1, return the number of hyperboxes across all granularity levels.

Returns:

int: Number of hyperboxes at the given granularity level.

get_n_hyperboxes_at_partition(partition=0)[source]

Get number of hyperboxes at a given partition.

Parameters:

partition (int, optional, default=0): The partition whose base learner's number of hyperboxes is returned.

Returns:

int: Number of hyperboxes at the given partition.

get_sample_explanation_granular_level(xl, xu, level=0)[source]

Get useful information for explaining the reason behind the predicted result for the input pattern.

Parameters:

xl (ndarray of shape (n_features,)): Minimum point of the input pattern which needs to be explained.
xu (ndarray of shape (n_features,)): Maximum point of the input pattern which needs to be explained.
level (int, optional, default=0): The granularity level used to generate the prediction.

Returns:

y_pred (int): The predicted class of the input pattern.
dict_mem_val_classes (dictionary): A dictionary storing the membership values for all classes. The key is the class label and the value is the corresponding membership value.
dict_min_point_classes (dictionary): A dictionary storing the minimal points of the hyperboxes having the maximum membership value for each class. The key is the class label and the value is the minimal points of all such hyperboxes corresponding to that class.
dict_max_point_classes (dictionary): A dictionary storing the maximal points of the hyperboxes having the maximum membership value for each class. The key is the class label and the value is the maximal points of all such hyperboxes corresponding to that class.

granular_learning_phase_1(Xl, Xu, y, learning_type=1, X_val=None, y_val=None, acc_threshold=0.5, keep_empty_boxes=False)[source]

Training a granular general fuzzy min-max neural network using a learning algorithm in phase 1 to distribute disjoint subsets into working processes to build base learners. After that, the resulting hyperboxes from all base learners will be merged and pruned.

Parameters:

Xl (array-like of shape (n_samples, n_features)): The data matrix containing the lower bounds of the input training patterns.
Xu (array-like of shape (n_samples, n_features)): The data matrix containing the upper bounds of the input training patterns.
y (array-like of shape (n_samples,)): Target vector relative to input training hyperboxes [Xl, Xu].
learning_type (enum (int), optional, default=HETEROGENEOUS_CLASS_LEARNING): Learning type used to build base learners from disjoint datasets. It takes one of two defined enum values: HETEROGENEOUS_CLASS_LEARNING or HOMOGENEOUS_CLASS_LEARNING. Heterogeneous class learning means that base learners are trained on the input samples in their original order. Homogeneous class learning means that input data are sorted and grouped by class label before the training process starts.
X_val (array-like of shape (n_samples, n_features)): The data matrix containing the validation patterns.
y_val (ndarray of shape (n_samples,)): A vector containing the true class label of each validation pattern.
acc_threshold (float, optional, default=0.5): The minimum accuracy that each hyperbox must reach to be retained.
keep_empty_boxes (boolean, optional, default=False): Whether to keep the hyperboxes that do not take part in the prediction process on the validation set. If True, keep them; otherwise, the decision to keep or remove them is based on the classification accuracy on the validation dataset.

Returns:

self (object): A granular general fuzzy min-max neural network trained by a phase-1 learning algorithm.
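
Phase 1 relies on splitting the training data into disjoint subsets, one per base learner. A minimal sketch of a stratified split is given below, assuming a simple round-robin deal per class; `stratified_partitions` and its `seed` argument are hypothetical and not the routine used inside hbbrain.

```python
import random
from collections import defaultdict


def stratified_partitions(y, n_partitions, seed=0):
    """Split sample indices into disjoint subsets with a similar class mix.

    Hypothetical helper, not the hbbrain implementation: the indices of
    each class are shuffled and then dealt round-robin to the partitions.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)

    partitions = [[] for _ in range(n_partitions)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            partitions[pos % n_partitions].append(idx)
    return partitions


labels = [0] * 6 + [1] * 4
parts = stratified_partitions(labels, n_partitions=2, seed=0)
```

Each subset would then be used to train one base learner before the resulting hyperboxes are merged and pruned.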

granular_learning_phase_2()[source]

Training a granular general fuzzy min-max neural network using a learning algorithm in phase 2 to reduce number of hyperboxes while keeping a good classification performance.

Returns:

self (object): A granular general fuzzy min-max neural network trained by a phase-2 learning algorithm.
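
To give a feel for how hyperboxes might be aggregated at a higher abstraction level, the sketch below tests whether two hyperboxes could be merged. It assumes the merge requires the same class, a membership value of at least min_membership_aggregation, and a merged hyperbox no larger than the level's theta in any feature. `merged_box` and `can_aggregate` are hypothetical names, and any further checks applied by the algorithm in [1] are omitted.

```python
def merged_box(v1, w1, v2, w2):
    """Smallest hyperbox covering both [v1, w1] and [v2, w2]."""
    v = [min(a, b) for a, b in zip(v1, v2)]
    w = [max(a, b) for a, b in zip(w1, w2)]
    return v, w


def can_aggregate(v1, w1, v2, w2, same_class, membership_value,
                  theta, min_membership_aggregation=0.5):
    """Hypothetical merge test for two hyperboxes at one abstraction level."""
    if not same_class:
        return False
    if membership_value < min_membership_aggregation:
        return False
    v, w = merged_box(v1, w1, v2, w2)
    # the merged box must not exceed the maximum size theta in any feature
    return all(w_i - v_i <= theta for v_i, w_i in zip(v, w))


box_a = ([0.1, 0.1], [0.2, 0.2])
box_b = ([0.25, 0.15], [0.3, 0.25])
fits = can_aggregate(*box_a, *box_b, same_class=True,
                     membership_value=0.95, theta=0.3)
too_big = can_aggregate(*box_a, *box_b, same_class=True,
                        membership_value=0.95, theta=0.1)
```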

predict(X, level=-1)[source]

Predict class labels for samples in X at a given granularity level.

Note

If several winner hyperboxes representing different class labels have the same membership value with respect to the input pattern \(X_i\), an additional criterion based on the minimum distance between the input sample and the centroids of the winner hyperboxes is used to find the final winner hyperbox, whose class label is then used as the predicted class label of the input pattern \(X_i\).

Parameters:

X (array-like of shape (n_samples, n_features)): The data matrix for which we want to predict the targets.
level (int, optional, default=-1): The granularity level used to generate predicted classes for the input testing samples. If level is -1, the predicted class for each sample is the class receiving the most votes across all available granularity levels.

Returns:

y_pred (ndarray of shape (n_samples,)): Vector containing the predictions. In binary and multiclass problems, this is a vector of n_samples class labels.
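
The voting behaviour for level=-1 can be pictured with the following sketch, which combines per-level predictions by majority vote. `vote_across_levels` is an illustrative name, and the tie-breaking rule (first label seen wins) is an assumption made for this sketch only.

```python
from collections import Counter


def vote_across_levels(predictions_per_level):
    """Combine per-level predictions by majority vote.

    predictions_per_level[k][i] is the label predicted for sample i at
    granularity level k. Ties go to the label seen first (an arbitrary
    choice made for this sketch).
    """
    n_samples = len(predictions_per_level[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(level[i] for level in predictions_per_level)
        voted.append(votes.most_common(1)[0][0])
    return voted


per_level = [
    [0, 1, 2],  # level 0
    [0, 1, 1],  # level 1
    [0, 2, 2],  # level 2
]
final = vote_across_levels(per_level)
```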

predict_at_partitions(Xl, Xu, partition=0)[source]

Predict class labels for samples in the form of hyperboxes represented by lower bounds Xl and upper bounds Xu using the base learner at a given partition.

Note

If several winner hyperboxes representing different class labels have the same membership value with respect to the input pattern \(X_i\), given as a hyperbox with lower bound \(Xl_i\) and upper bound \(Xu_i\), an additional criterion based on the minimum distance between the centroids of the winner hyperboxes and the input sample is used to find the final winner hyperbox, whose class label is then used as the predicted class label of the input hyperbox \(X_i\).

Parameters:

Xl (array-like of shape (n_samples, n_features)): The data matrix containing the lower bounds of input patterns for which we want to predict the targets.
Xu (array-like of shape (n_samples, n_features)): The data matrix containing the upper bounds of input patterns for which we want to predict the targets.
partition (int, optional, default=0): The partition whose base learner is used to generate predicted classes for the input testing samples.

Returns:

y_pred (ndarray of shape (n_samples,)): Vector containing the predictions. In binary and multiclass problems, this is a vector of n_samples class labels.

predict_proba(X, level=-1)[source]

Predict class probabilities of the input samples X at a given granularity level.

The predicted class probability at a given granularity level is the ratio of the membership value of the representative hyperbox of that class at the given granularity level to the sum of the membership values of the representative hyperboxes of all classes taking part in the prediction procedure.

Parameters:

X (array-like of shape (n_samples, n_features)): The input samples.
level (int, optional, default=-1): The granularity level used to generate predicted class probabilities for the input testing samples. If level is -1, the predicted class probability for each sample is the average of the probability values at all available granularity levels.

Returns:

proba (ndarray of shape (n_samples, n_classes)): The class probabilities of the input samples. The classes are ordered by ascending integer class label.
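
The normalisation described above, together with the averaging used when level=-1, can be sketched for a single sample as follows. The helper names are hypothetical, and the uniform fallback for an all-zero membership vector is an assumption of this sketch rather than documented behaviour.

```python
def memberships_to_proba(class_memberships):
    """Normalise per-class membership values so they sum to one."""
    total = sum(class_memberships)
    if total == 0:
        # assumption for this sketch: fall back to a uniform distribution
        return [1.0 / len(class_memberships)] * len(class_memberships)
    return [m / total for m in class_memberships]


def average_over_levels(rows_per_level):
    """Average per-class values of one sample over all granularity levels."""
    n_levels = len(rows_per_level)
    return [sum(values) / n_levels for values in zip(*rows_per_level)]


# membership of the representative hyperbox of each class, per level
level_0 = memberships_to_proba([1.0, 0.5, 0.5])
level_1 = memberships_to_proba([0.75, 0.75, 0.0])
combined = average_over_levels([level_0, level_1])
```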

predict_with_membership(X, level=-1)[source]

Predict class memberships of the input samples X at a given granularity level.

The predicted class memberships are the membership values of the representative hyperbox of that class at a given granularity level.

Parameters:

X (array-like of shape (n_samples, n_features)): The input samples.
level (int, optional, default=-1): The granularity level used to generate predicted class memberships for the input testing samples. If level is -1, the predicted class membership value for each sample is the average of the class memberships over all granularity levels.

Returns:

mem_vals (ndarray of shape (n_samples, n_classes)): The class memberships of the input samples. The classes are ordered by ascending integer class label.

simple_pruning(V, W, C, N_samples, Centroids, Xl_val, Xu_val, y_val, acc_threshold=0.5, keep_empty_boxes=False)[source]

Simply prune low-quality hyperboxes based on a pre-defined accuracy threshold for each hyperbox.

Parameters:

V (array-like of shape (n_hyperboxes, n_features)): A matrix storing all minimal points for numerical features of all existing hyperboxes, where each row is the minimal point of a hyperbox.
W (array-like of shape (n_hyperboxes, n_features)): A matrix storing all maximal points for numerical features of all existing hyperboxes, where each row is the maximal point of a hyperbox.
C (array-like of shape (n_hyperboxes,)): A vector storing the class labels corresponding to the existing hyperboxes.
N_samples (array-like of shape (n_hyperboxes,)): A vector storing the number of samples fully included in each existing hyperbox.
Centroids (array-like of shape (n_hyperboxes, n_features)): A matrix storing the centroid points of all existing hyperboxes, where each row is the centroid of a hyperbox.
Xl_val (array-like of shape (n_samples, n_features)): The data matrix containing the lower bounds of the validation patterns.
Xu_val (array-like of shape (n_samples, n_features)): The data matrix containing the upper bounds of the validation patterns.
y_val (ndarray of shape (n_samples,)): A vector containing the true class label of each validation pattern.
acc_threshold (float, optional, default=0.5): The minimum accuracy that each hyperbox must reach to be retained.
keep_empty_boxes (boolean, optional, default=False): Whether to keep the hyperboxes that do not take part in the prediction process on the validation set. If True, keep them; otherwise, the decision to keep or remove them is based on the classification accuracy on the validation dataset.

Returns:

new_V (array-like of shape (n_new_hyperboxes, n_features)): A matrix storing all minimal points for numerical features of all remaining hyperboxes after pruning, where each row is the minimal point of a hyperbox.
new_W (array-like of shape (n_new_hyperboxes, n_features)): A matrix storing all maximal points for numerical features of all remaining hyperboxes after pruning, where each row is the maximal point of a hyperbox.
new_C (array-like of shape (n_new_hyperboxes,)): A vector storing the class labels corresponding to the remaining hyperboxes after pruning.
new_N_samples (array-like of shape (n_new_hyperboxes,)): A vector storing the number of samples fully included in each remaining hyperbox after pruning.
new_Centroids (array-like of shape (n_new_hyperboxes, n_features)): A matrix storing the centroid points of all remaining hyperboxes after pruning, where each row is the centroid of a hyperbox.
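
The accuracy-based part of the pruning can be sketched as below, given the index of the winning hyperbox for each validation sample. `prune_by_accuracy` and its `winners` argument are hypothetical. As a deliberate simplification, unused hyperboxes are simply dropped when keep_empty_boxes is False, whereas the method described above decides this from the classification accuracy on the validation set.

```python
def prune_by_accuracy(n_hyperboxes, winners, y_val, box_labels,
                      acc_threshold=0.5, keep_empty_boxes=False):
    """Return indices of hyperboxes to keep.

    winners[i] is the index of the hyperbox that classified validation
    sample i. Simplification for this sketch: unused hyperboxes are simply
    dropped when keep_empty_boxes is False.
    """
    n_used = [0] * n_hyperboxes
    n_correct = [0] * n_hyperboxes
    for box, truth in zip(winners, y_val):
        n_used[box] += 1
        if box_labels[box] == truth:
            n_correct[box] += 1

    kept = []
    for box in range(n_hyperboxes):
        if n_used[box] == 0:
            if keep_empty_boxes:
                kept.append(box)
        elif n_correct[box] / n_used[box] >= acc_threshold:
            kept.append(box)
    return kept


box_labels = [0, 1, 1, 0]
winners = [0, 0, 1, 1, 1]
y_val = [0, 0, 0, 0, 1]
kept_strict = prune_by_accuracy(4, winners, y_val, box_labels)
kept_with_empty = prune_by_accuracy(4, winners, y_val, box_labels,
                                    keep_empty_boxes=True)
```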

hbbrain.numerical_data.multigranular_learner.multi_resolution_gfmm.convert_granular_theta_to_level(granular_thetas)[source]

Convert a list of maximum hyperbox sizes to the corresponding granular levels.

Parameters:

granular_thetas (list): A list containing the maximum hyperbox sizes for all granularity levels.

Returns:

level_dic (dict): A mapping between the maximum hyperbox size and the granular level.
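
One plausible shape for such a mapping is sketched below. Both the direction of the mapping (size to level) and the zero-based level numbering are assumptions made here for illustration, chosen to match the use of level=0 for the first granularity level in the class example; `thetas_to_levels` is a hypothetical name and not the library function.

```python
def thetas_to_levels(granular_thetas):
    """Map each maximum hyperbox size to a zero-based level index.

    Assumed layout for illustration only; the dictionary built by the
    library may be organised differently.
    """
    return {theta: level for level, theta in enumerate(granular_thetas)}


level_map = thetas_to_levels([0.1, 0.2, 0.3])
```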

hbbrain.numerical_data.multigranular_learner.multi_resolution_gfmm.predict_with_centroids(V, W, C, N_samples, Centroids, Xl, Xu, g=1)[source]

Predict class labels for samples in X represented in the form of intervals [Xl, Xu]. This is a common function to determine the class labels for X with respect to a trained hyperbox-based classifier represented by [V, W, C]. It uses the winner-takes-all principle: each sample in X is assigned the class label of the hyperbox with the maximum membership value for that sample. If several winner hyperboxes with different classes share the same maximum membership value, the Euclidean distance from the input pattern to the centroid of each hyperbox is used to choose among them. If two winner hyperboxes are at the same Euclidean distance from the input pattern, the one with the higher number of included samples is selected.

Parameters:

V (array-like of shape (n_hyperboxes, n_features)): A matrix storing all minimal points for numerical features of all existing hyperboxes, where each row is the minimal point of a hyperbox.
W (array-like of shape (n_hyperboxes, n_features)): A matrix storing all maximal points for numerical features of all existing hyperboxes, where each row is the maximal point of a hyperbox.
C (array-like of shape (n_hyperboxes,)): A vector storing the class labels corresponding to the existing hyperboxes.
N_samples (array-like of shape (n_hyperboxes,)): A vector storing the number of samples fully included in each existing hyperbox.
Centroids (array-like of shape (n_hyperboxes, n_features)): A matrix storing the centroid points of all existing hyperboxes, where each row is the centroid of a hyperbox.
Xl (array-like of shape (n_samples, n_features)): The data matrix containing the lower bounds of input patterns for which we want to predict the targets.
Xu (array-like of shape (n_samples, n_features)): The data matrix containing the upper bounds of input patterns for which we want to predict the targets.
g (float or array-like of shape (n_features,), optional, default=1): A sensitivity parameter describing how fast the membership function decreases in each dimension.

Returns:

y_pred (ndarray of shape (n_samples,)): A vector containing the predictions. In binary and multiclass problems, this is a vector of n_samples class labels.
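
The order of the tie-breaking criteria can be illustrated with the sketch below, which picks the winner for a single point given precomputed membership values. `pick_winner` is a hypothetical helper; it assumes a point input (lower and upper bounds coincide) and applies the tie-break to all hyperboxes sharing the maximum membership.

```python
import math


def pick_winner(memberships, centroids, n_samples, x):
    """Index of the winning hyperbox for the point x.

    Order of criteria: highest membership, then smallest Euclidean
    distance from x to the centroid, then largest number of samples.
    """
    best = max(memberships)
    candidates = [j for j, m in enumerate(memberships) if m == best]
    return min(
        candidates,
        key=lambda j: (math.dist(x, centroids[j]), -n_samples[j]),
    )


memberships = [1.0, 1.0, 0.8]
centroids = [[0.25, 0.5], [0.75, 0.5], [0.5, 0.5]]
n_samples = [5, 9, 3]

# equal membership and equal distance: the box with more samples wins
winner_tie = pick_winner(memberships, centroids, n_samples, [0.5, 0.5])
# equal membership only: the closer centroid wins
winner_near = pick_winner(memberships, centroids, n_samples, [0.375, 0.5])
```

Note that the third hyperbox never wins here even though its centroid coincides with the first query point, because its membership value is lower.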

hbbrain.numerical_data.multigranular_learner.multi_resolution_gfmm.predict_with_membership(V, W, C, Xl, Xu, g=1)[source]

Return class membership values for samples in X represented in the form of intervals [Xl, Xu]. This is a common function to determine the membership values of an input X with respect to a trained hyperbox-based classifier represented by [V, W, C].

Parameters:

V (array-like of shape (n_hyperboxes, n_features)): A matrix storing all minimal points for numerical features of all existing hyperboxes, where each row is the minimal point of a hyperbox.
W (array-like of shape (n_hyperboxes, n_features)): A matrix storing all maximal points for numerical features of all existing hyperboxes, where each row is the maximal point of a hyperbox.
C (array-like of shape (n_hyperboxes,)): A vector storing the class labels corresponding to the existing hyperboxes.
Xl (array-like of shape (n_samples, n_features)): The data matrix containing the lower bounds of input patterns for which we want to predict the targets.
Xu (array-like of shape (n_samples, n_features)): The data matrix containing the upper bounds of input patterns for which we want to predict the targets.
g (float or array-like of shape (n_features,), optional, default=1): A sensitivity parameter describing how fast the membership function decreases in each dimension.

Returns:

mem_vals (ndarray of shape (n_samples, n_classes)): A matrix containing the membership values of each input sample for all classes.

hbbrain.numerical_data.multigranular_learner.multi_resolution_gfmm.remove_contained_hyperboxes(V, W, C, N_samples, Centroids)[source]

Remove all hyperboxes contained in other hyperboxes with the same class label and update the centroids of the larger hyperboxes that contain the removed hyperboxes.

Parameters:

V (array-like of shape (n_hyperboxes, n_features)): A matrix storing all minimal points for numerical features of all existing hyperboxes, where each row is the minimal point of a hyperbox.
W (array-like of shape (n_hyperboxes, n_features)): A matrix storing all maximal points for numerical features of all existing hyperboxes, where each row is the maximal point of a hyperbox.
C (array-like of shape (n_hyperboxes,)): A vector storing the class labels corresponding to the existing hyperboxes.
N_samples (array-like of shape (n_hyperboxes,)): A vector storing the number of samples fully included in each existing hyperbox.
Centroids (array-like of shape (n_hyperboxes, n_features)): A matrix storing the centroid points of all existing hyperboxes, where each row is the centroid of a hyperbox.

Returns:

new_V (array-like of shape (n_new_hyperboxes, n_features)): A matrix storing all minimal points for numerical features of all hyperboxes after removal of fully contained hyperboxes, where each row is the minimal point of a hyperbox.
new_W (array-like of shape (n_new_hyperboxes, n_features)): A matrix storing all maximal points for numerical features of all hyperboxes after removal of fully contained hyperboxes, where each row is the maximal point of a hyperbox.
new_C (array-like of shape (n_new_hyperboxes,)): A vector storing the class labels corresponding to the remaining hyperboxes after removal of fully contained hyperboxes.
new_N_samples (array-like of shape (n_new_hyperboxes,)): A vector storing the number of samples fully included in each remaining hyperbox.
new_Centroids (array-like of shape (n_new_hyperboxes, n_features)): A matrix storing the centroid points of all remaining hyperboxes after removal of fully contained hyperboxes, where each row is the centroid of a hyperbox.
n_removed_hyperboxes (int): Number of hyperboxes that have been removed because they are contained in at least one larger hyperbox with the same class label.
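
A minimal sketch of the containment check is given below. It assumes that the centroid of the containing hyperbox becomes the sample-weighted mean of both centroids and that its sample count absorbs that of the removed hyperbox; these update rules and the name `remove_contained` are assumptions for illustration, not the library code. Sample counts are assumed to be positive.

```python
def remove_contained(V, W, C, N_samples, Centroids):
    """Drop hyperboxes lying inside a same-class hyperbox.

    Assumption for this sketch: the container's centroid becomes the
    sample-weighted mean of both centroids and its sample count grows by
    the count of the removed hyperbox.
    """
    n = len(V)
    N = list(N_samples)
    cents = [list(c) for c in Centroids]
    removed = set()
    for i in range(n):  # candidate to remove
        for j in range(n):  # potential container
            if i == j or j in removed or C[i] != C[j]:
                continue
            inside = all(V[j][d] <= V[i][d] and W[i][d] <= W[j][d]
                         for d in range(len(V[i])))
            if inside:
                total = N[i] + N[j]
                cents[j] = [(N[j] * cj + N[i] * ci) / total
                            for cj, ci in zip(cents[j], cents[i])]
                N[j] = total
                removed.add(i)
                break
    keep = [k for k in range(n) if k not in removed]
    return ([V[k] for k in keep], [W[k] for k in keep], [C[k] for k in keep],
            [N[k] for k in keep], [cents[k] for k in keep], len(removed))


V = [[0.0, 0.0], [0.25, 0.25], [0.25, 0.25]]
W = [[1.0, 1.0], [0.5, 0.5], [0.5, 0.5]]
C = [0, 0, 1]
N_samples = [3, 1, 2]
Centroids = [[0.5, 0.5], [0.25, 0.25], [0.5, 0.5]]
new_V, new_W, new_C, new_N, new_Cent, n_removed = remove_contained(
    V, W, C, N_samples, Centroids)
```

In this example the second hyperbox is absorbed by the first, while the third survives because it belongs to a different class.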