# Classifying Data with Neural Networks

## Classifying Vector Data
Create an artificial dataset from three normally distributed clusters:
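One way to sample such a dataset (the cluster centers, spread, and point counts here are illustrative choices):

```wolfram
(* three illustrative cluster centers in 2D *)
centers = {{1, 1}, {3, 1}, {2, 3}};
(* sample 200 points per cluster from an isotropic normal distribution *)
data = Table[
   RandomVariate[
    MultinormalDistribution[c, 0.1 IdentityMatrix[2]], 200],
   {c, centers}];
```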
Plot the dataset:
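Assuming the samples are stored as data, a list of three point lists, one per cluster:

```wolfram
ListPlot[data, PlotStyle -> {Red, Green, Blue}]
```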
The training data consists of rules mapping each point to the cluster it belongs to:
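A sketch of building these rules, assuming data as sampled above and the colors Red, Green, and Blue as class labels:

```wolfram
(* associate every point in a cluster with that cluster's label *)
trainingData = Flatten[
   MapThread[Thread[#1 -> #2] &, {data, {Red, Green, Blue}}]];
```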
Create a net to compute the probability of a point lying in each cluster, using a "Class" decoder to classify the input as Red, Green or Blue:
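One such net; a single LinearLayer followed by a SoftmaxLayer amounts to multinomial logistic regression over the three classes:

```wolfram
net = NetChain[
  {LinearLayer[3], SoftmaxLayer[]},
  "Input" -> 2,
  "Output" -> NetDecoder[{"Class", {Red, Green, Blue}}]]
```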
Train the net on the data:
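Assuming the net and training rules constructed in the preceding steps:

```wolfram
trained = NetTrain[net, trainingData]
```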
Evaluate the net on the centers of each cluster:
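Assuming trained is the result of NetTrain and the same illustrative centers as before; the "Class" decoder returns the most probable class for each point:

```wolfram
trained[{{1, 1}, {3, 1}, {2, 3}}]
```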
Show the contours in feature space at which each class reaches a posterior probability of 0.75:
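One way to visualize these regions, assuming the trained net above; the "Class" decoder's "Probabilities" property returns an association of per-class probabilities:

```wolfram
RegionPlot[
 {trained[{x, y}, "Probabilities"][Red] > 0.75,
  trained[{x, y}, "Probabilities"][Green] > 0.75,
  trained[{x, y}, "Probabilities"][Blue] > 0.75},
 {x, 0, 4}, {y, 0, 4}, PlotStyle -> {Red, Green, Blue}]
```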
Plot the classifier's uncertainty, as measured by the entropy of its predicted class distribution, as a function of position:
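A sketch, again assuming the trained net from the earlier steps:

```wolfram
(* entropy of the predicted class distribution at a point *)
entropy[point_] := With[
   {p = Values[trained[point, "Probabilities"]]},
   -Total[p Log[p]]];
DensityPlot[entropy[{x, y}], {x, 0, 4}, {y, 0, 4}]
```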
Perform logistic regression on the Fisher Iris dataset. First, obtain the training data:
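The Fisher iris data ships with ExampleData; each example is a rule mapping four numeric measurements to a species name:

```wolfram
trainingData = ExampleData[{"MachineLearning", "FisherIris"}, "TrainingData"];
```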
Build a list of the unique labels to which each example is assigned:
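Assuming trainingData is the list of rules obtained above:

```wolfram
(* the right-hand sides of the rules are the species labels *)
classes = Union[Values[trainingData]]
```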
Create a NetChain to perform the classification, using a "Class" decoder to interpret the output of the net as probabilities for each class:
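A sketch, assuming classes is the list of three species names from the previous step:

```wolfram
net = NetChain[
  {LinearLayer[3], SoftmaxLayer[]},
  "Input" -> 4,  (* four numeric features per example *)
  "Output" -> NetDecoder[{"Class", classes}]]
```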
NetTrain will automatically use a CrossEntropyLossLayer with a NetEncoder to interpret the class labels in the training data:
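With the data in rules form, no manual loss specification is needed:

```wolfram
trained = NetTrain[net, trainingData]
```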
Classify an input:
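For example, with illustrative measurement values:

```wolfram
trained[{5.1, 3.5, 1.4, 0.2}]
```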
Obtain the probabilities associated with the classification:
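The "Class" decoder exposes the full distribution via the "Probabilities" property:

```wolfram
trained[{5.1, 3.5, 1.4, 0.2}, "Probabilities"]
```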
Use NetMeasurements to test the classification performance of the trained net on the test set:
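A sketch; the matching test split is also available from ExampleData:

```wolfram
testData = ExampleData[{"MachineLearning", "FisherIris"}, "TestData"];
NetMeasurements[trained, testData, "Accuracy"]
```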
## Classifying Categorical Data
Perform logistic regression on a dataset containing both categorical and numeric values.
First, obtain the training data, and drop rows containing missing data:
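The Titanic dataset from ExampleData is one suitable source; a sketch, assuming examples of the form {class, age, sex} -> survival:

```wolfram
data = ExampleData[{"MachineLearning", "Titanic"}, "Data"];
(* drop examples containing any Missing[...] value *)
data = Select[data, FreeQ[#, _Missing] &];
```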
Split the data into training and test datasets:
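One way to split, assuming data as above; the 80/20 ratio is an illustrative choice:

```wolfram
(* shuffle, then hold out the last 20% of examples for testing *)
{trainData, testData} =
  TakeDrop[RandomSample[data], Ceiling[0.8 Length[data]]];
```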
Categorical variables cannot be used directly in neural networks and must be encoded as arrays.
Create "Class" encoders that encode the categorical variables as one-hot encoded vectors:
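A sketch, assuming the categorical features take the values shown; the encoder names are a choice:

```wolfram
classEncoder = NetEncoder[{"Class", {"1st", "2nd", "3rd"}, "UnitVector"}];
sexEncoder = NetEncoder[{"Class", {"female", "male"}, "UnitVector"}];
```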
Applying the encoders to class labels produces unit vectors:
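For example, assuming the classEncoder defined above:

```wolfram
classEncoder["2nd"]  (* a unit vector with a 1 in the second position *)
```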
Create a network with an input corresponding to each feature and using a "Boolean" decoder to interpret the output of the net as the probability of survival.
The input features are first concatenated together before being further processed:
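A sketch of such a graph, assuming the encoders above; the input port names "class", "age", and "sex" are a choice, not fixed by the API:

```wolfram
net = NetGraph[
  {CatenateLayer[], LinearLayer[1], LogisticSigmoid},
  {{NetPort["class"], NetPort["age"], NetPort["sex"]} -> 1 -> 2 -> 3},
  "class" -> classEncoder,
  "age" -> "Scalar",
  "sex" -> sexEncoder,
  "Output" -> NetDecoder["Boolean"]]
```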
Train the net on the training data. NetTrain will automatically attach a CrossEntropyLossLayer["Binary"] layer to the output of the net:
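Since the net has several named input ports, each example must be supplied as an association of port values; a sketch, assuming rows of the form {class, age, sex} -> survival with "survived"/"died" labels:

```wolfram
(* convert a data row into port values plus a Boolean target *)
toExample[{c_, a_, s_} -> out_] :=
  <|"class" -> c, "age" -> a, "sex" -> s,
    "Output" -> (out === "survived")|>;
trained = NetTrain[net, toExample /@ trainData]
```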
Predict whether a passenger will survive:
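For example, with illustrative passenger attributes:

```wolfram
trained[<|"class" -> "1st", "age" -> 30, "sex" -> "female"|>]
```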
Obtain the probability of survival associated with the input:
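A sketch, assuming the "Boolean" decoder exposes the underlying probabilities via a "Probabilities" property (check the NetDecoder reference for the exact property name):

```wolfram
trained[<|"class" -> "1st", "age" -> 30, "sex" -> "female"|>,
 "Probabilities"]
```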
Plot the survival probability as a function of age for some combinations of "class" and "sex":
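One possible plot, under the same assumption about the "Probabilities" property; the two feature combinations shown are illustrative:

```wolfram
Plot[
 {trained[<|"class" -> "1st", "age" -> a, "sex" -> "female"|>,
    "Probabilities"][True],
  trained[<|"class" -> "3rd", "age" -> a, "sex" -> "male"|>,
    "Probabilities"][True]},
 {a, 0, 80}, PlotLegends -> {"1st/female", "3rd/male"},
 AxesLabel -> {"age", "P(survived)"}]
```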
Use NetMeasurements to test the accuracy of the trained net on the test set:
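The test rows must be converted to port associations the same way as the training rows; a self-contained sketch:

```wolfram
(* convert {class, age, sex} -> survival rows into port associations *)
toExample[{c_, a_, s_} -> out_] :=
  <|"class" -> c, "age" -> a, "sex" -> s,
    "Output" -> (out === "survived")|>;
NetMeasurements[trained, toExample /@ testData, "Accuracy"]
```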
The accuracy is typically comparable to that obtained using Classify with the "LogisticRegression" method:
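For comparison, Classify accepts the same rules-format data directly:

```wolfram
cl = Classify[trainData, Method -> "LogisticRegression"];
ClassifierMeasurements[cl, testData, "Accuracy"]
```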
Perform multitask learning by creating a net that produces two separate classifications.
First, obtain training data:
The training data consists of an image and the corresponding high-level and low-level labels:
Extract the unique labels from the "Label" and "SubLabel" columns:
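A sketch, assuming each training example is an association with "Input" (an image), "Label", and "SubLabel" keys:

```wolfram
labels = Union[Lookup[trainingData, "Label"]];
sublabels = Union[Lookup[trainingData, "SubLabel"]];
```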
Create a base convolutional net that will produce a vector of 500 features:
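One possible architecture; the layer sizes and the 32x32 input resolution are illustrative choices:

```wolfram
baseNet = NetChain[{
   ConvolutionLayer[32, 3], Ramp, PoolingLayer[2],
   ConvolutionLayer[64, 3], Ramp, PoolingLayer[2],
   FlattenLayer[], LinearLayer[500], Ramp},
  "Input" -> NetEncoder[{"Image", {32, 32}}]]
```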
Create a NetGraph that will produce separate classifications for the high-level and low-level labels:
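A sketch, assuming the baseNet and the labels/sublabels lists from the previous steps; each branch adds a linear classification head over the shared features:

```wolfram
net = NetGraph[
  <|"base" -> baseNet,
    "label" -> NetChain[{LinearLayer[Length[labels]], SoftmaxLayer[]}],
    "sublabel" -> NetChain[{LinearLayer[Length[sublabels]], SoftmaxLayer[]}]|>,
  {"base" -> "label" -> NetPort["Label"],
   "base" -> "sublabel" -> NetPort["SubLabel"]},
  "Label" -> NetDecoder[{"Class", labels}],
  "SubLabel" -> NetDecoder[{"Class", sublabels}]]
```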
Train the network. NetTrain will automatically attach CrossEntropyLossLayer objects to both outputs, taking the target values from the training data using the corresponding names "Label" and "SubLabel":
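Assuming the examples are associations keyed by the port names "Input", "Label", and "SubLabel":

```wolfram
trained = NetTrain[net, trainingData]
```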
Evaluate the trained network on a single image:
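With img a placeholder for one image from the dataset; a net with multiple outputs returns an association of all of them:

```wolfram
trained[img]  (* <|"Label" -> ..., "SubLabel" -> ...|> *)
```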
Evaluate the trained network on several images, taking only the "SubLabel" output:
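With imgs a placeholder for a list of images; a NetPort specification selects a single output:

```wolfram
trained[imgs, NetPort["SubLabel"]]
```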
Get a property of the class decoder of the "SubLabel" output on an input image:
From a random sample, select the images for which the net produces the highest- and lowest-entropy predictions for "Label":
Use NetMeasurements to test the accuracy for both outputs of the net:
Produce a subnetwork that computes only "SubLabel" predictions:
Make a prediction on a single image: