After much hype, Google finally released TensorFlow 2.0, the latest version of its flagship deep learning platform. TensorFlow 2.0 introduces a lot of long-awaited features. This article briefly covers how you can develop simple classification and regression models using TensorFlow 2.0.
Classification with TensorFlow 2.0
If you have ever worked with the Keras library, you are in for a treat. TensorFlow 2.0 now uses the Keras API as its default library for training classification and regression models. Before TensorFlow 2.0, one of the major criticisms of earlier versions of TensorFlow stemmed from the complexity of model creation. Previously, you needed to stitch graphs, sessions, and placeholders together in order to create even a simple logistic regression model. With TensorFlow 2.0, creating classification and regression models has become a piece of cake.
So without further ado, let's develop a classification model with TensorFlow.
The dataset for the classification example can be downloaded freely from this link. Download the file in CSV format. If you open the downloaded CSV file, you will see that it doesn't contain any headers. The details of the columns are available at the UCI machine learning repository. I recommend that you read the dataset information in detail from the download link. I will briefly summarize the dataset in this section.
The dataset consists of 7 columns:
- price (the buying price of the car)
- maint (the maintenance cost)
- doors (number of doors)
- persons (the seating capacity)
- lug_capacity (the luggage capacity)
- safety (how safe is the car)
- output (the condition of the car)
Given the first 6 columns, the task is to predict the value of the 7th column, i.e. the output. The output column can take one of four values: "unacc" (unacceptable), "acc" (acceptable), "good", and "vgood" (very good).
Before we import the dataset into our application, we need to import the required libraries.
```python
import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib inline

import seaborn as sns
sns.set(style="darkgrid")
```
Before we proceed, make sure that you have the latest version of TensorFlow, i.e. TensorFlow 2.0, installed. You can check your TensorFlow version with the following command:
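```python
print(tf.__version__)
```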
If you do not have TensorFlow 2.0 installed, you can upgrade to the latest version via the following command:
```
$ pip install --upgrade tensorflow
```
Importing the Dataset
The following script imports the dataset. Change the path to your CSV data file accordingly.
```python
cols = ['price', 'maint', 'doors', 'persons', 'lug_capacity', 'safety', 'output']
cars = pd.read_csv(r'/content/drive/My Drive/datasets/car_dataset.csv', names=cols, header=None)
```
Since the CSV file doesn't contain column headers by default, we passed a list of column headers to the read_csv() method via the names parameter.
Let's now see the first 5 rows of the dataset via the head() method:
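```python
cars.head()
```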
You can see the 7 columns in the dataset.
Data Analysis and Preprocessing
Let's briefly analyze the dataset by plotting a pie chart that shows the distribution of the output. The following script increases the default plot size.
```python
plot_size = plt.rcParams["figure.figsize"]
plot_size[0] = 8
plot_size[1] = 6
plt.rcParams["figure.figsize"] = plot_size
```
And the following script plots the pie chart showing the output distribution.
```python
cars.output.value_counts().plot(kind='pie', autopct='%0.05f%%',
                                colors=['lightblue', 'lightgreen', 'orange', 'pink'],
                                explode=(0.05, 0.05, 0.05, 0.05))
```
The output shows that the majority of cars (70%) are in unacceptable condition, while about 20% of cars are in acceptable condition. The ratio of cars in good and very good condition is very low.
All the columns in our dataset are categorical. Deep learning is based on statistical algorithms, and statistical algorithms work with numbers. Therefore, we need to convert the categorical information into numeric columns. There are various approaches to do this, but one of the most common is one-hot encoding. In one-hot encoding, a new column is created for each unique value in the categorical column. A 1 is placed in the new column for the rows where that unique value originally appeared, and a 0 everywhere else. This might sound complex, but the small example below makes it clear.
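For instance, here is a tiny, hypothetical column and the result of one-hot encoding it with pandas' get_dummies function (the same function used in the script below):

```python
import pandas as pd

# A hypothetical toy column with three unique values
toy = pd.Series(['low', 'med', 'high', 'low'])
print(pd.get_dummies(toy, prefix='price'))

# Each unique value gets its own column; a 1 marks the rows where
# that value originally appeared (newer pandas versions may display
# these as booleans instead of 0/1):
#    price_high  price_low  price_med
# 0           0          1          0
# 1           0          0          1
# 2           1          0          0
# 3           0          1          0
```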
The following script converts categorical columns into numeric columns:
```python
price = pd.get_dummies(cars.price, prefix='price')
maint = pd.get_dummies(cars.maint, prefix='maint')
doors = pd.get_dummies(cars.doors, prefix='doors')
persons = pd.get_dummies(cars.persons, prefix='persons')
lug_capacity = pd.get_dummies(cars.lug_capacity, prefix='lug_capacity')
safety = pd.get_dummies(cars.safety, prefix='safety')

labels = pd.get_dummies(cars.output, prefix='condition')
```
To create our feature set, we can merge the first six columns horizontally:
```python
X = pd.concat([price, maint, doors, persons, lug_capacity, safety], axis=1)
```
Let's see how our label column looks now:
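```python
labels.head()
```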
The labels dataframe is basically a one-hot encoded version of the output column from our dataset. The output column had four unique values: unacc, acc, good, and vgood. In the one-hot encoded label dataset, you can see four columns, one for each unique value in the output column, with a 1 in the column corresponding to the value that originally existed in that row. For instance, the output column's value was unacc in the first five rows, so you can see a 1 in the first five rows of the condition_unacc column.
Let's now convert our labels into a NumPy array, since deep learning models in TensorFlow accept NumPy arrays as input:
```python
y = labels.values
```
The final step before we can train our TensorFlow 2.0 classification model is to divide the dataset into training and test sets:
```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
```
To train the model, let's import the TensorFlow 2.0 classes. Execute the following script:
```python
from tensorflow.keras.layers import Input, Dense, Activation, Dropout
from tensorflow.keras.models import Model
```
As I said earlier, TensorFlow 2.0 uses the Keras API for training the model. In the script above, we import the Input, Dense, Activation, and Dropout classes from the tensorflow.keras.layers module. Similarly, we import the Model class from the tensorflow.keras.models module.
The next step is to create our classification model:
```python
input_layer = Input(shape=(X.shape[1],))
dense_layer_1 = Dense(15, activation='relu')(input_layer)
dense_layer_2 = Dense(10, activation='relu')(dense_layer_1)
output = Dense(y.shape[1], activation='softmax')(dense_layer_2)

model = Model(inputs=input_layer, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
```
As can be seen from the script, the model contains three dense layers. The first two dense layers contain 15 and 10 nodes, respectively, with the relu activation function. The final dense layer contains 4 nodes (y.shape[1] == 4) with the softmax activation function, since this is a classification task. The model is trained using the categorical_crossentropy loss function and the adam optimizer. The evaluation metric is accuracy.
The following script shows the model summary:
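```python
print(model.summary())
```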
Model: "model" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_1 (InputLayer) [(None, 21)] 0 _________________________________________________________________ dense (Dense) (None, 15) 330 _________________________________________________________________ dense_1 (Dense) (None, 10) 160 _________________________________________________________________ dense_2 (Dense) (None, 4) 44 ================================================================= Total params: 534 Trainable params: 534 Non-trainable params: 0 _________________________________________________________________ None
Finally, to train the model execute the following script:
```python
history = model.fit(X_train, y_train, batch_size=8, epochs=50, verbose=1, validation_split=0.2)
```
The model will be trained for 50 epochs, but for the sake of space, only the results of the last few epochs are displayed here:
```
Epoch 45/50
1105/1105 [==============================] - 0s 219us/sample - loss: 0.0114 - acc: 1.0000 - val_loss: 0.0606 - val_acc: 0.9856
Epoch 46/50
1105/1105 [==============================] - 0s 212us/sample - loss: 0.0113 - acc: 1.0000 - val_loss: 0.0497 - val_acc: 0.9856
Epoch 47/50
1105/1105 [==============================] - 0s 219us/sample - loss: 0.0102 - acc: 1.0000 - val_loss: 0.0517 - val_acc: 0.9856
Epoch 48/50
1105/1105 [==============================] - 0s 218us/sample - loss: 0.0091 - acc: 1.0000 - val_loss: 0.0536 - val_acc: 0.9856
Epoch 49/50
1105/1105 [==============================] - 0s 213us/sample - loss: 0.0095 - acc: 1.0000 - val_loss: 0.0513 - val_acc: 0.9819
Epoch 50/50
1105/1105 [==============================] - 0s 209us/sample - loss: 0.0080 - acc: 1.0000 - val_loss: 0.0536 - val_acc: 0.9856
```
By the end of the 50th epoch, we have a training accuracy of 100% and a validation accuracy of 98.56%, which is impressive.
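Since we stored the return value of fit() in the history variable, we can also plot how the accuracy evolved across epochs. Here is a minimal sketch; the 'acc' and 'val_acc' keys match the metric names shown in the training log above:

```python
# Visualize training vs. validation accuracy per epoch
plt.plot(history.history['acc'], label='train accuracy')
plt.plot(history.history['val_acc'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()
```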
Let's finally evaluate the performance of our classification model on the test set:
```python
score = model.evaluate(X_test, y_test, verbose=1)

print("Test Score:", score[0])
print("Test Accuracy:", score[1])
```
Here is the output:
```
WARNING:tensorflow:Falling back from v2 loop because of error: Failed to find data adapter that can handle input: <class 'pandas.core.frame.DataFrame'>, <class 'NoneType'>
346/346 [==============================] - 0s 55us/sample - loss: 0.0605 - acc: 0.9740
Test Score: 0.06045335989359314
Test Accuracy: 0.9739884
```
Our model achieves an accuracy of 97.39% on the test set. Though it is slightly lower than the 100% training accuracy, it is still very good given that we chose the number of layers and nodes arbitrarily. You can add more layers with more nodes to the model and see if you can get better results on the validation and test sets.
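For instance, here is a sketch of one deeper variant you could experiment with; the layer sizes here are arbitrary choices, not tuned values:

```python
# A deeper, wider variant of the classification model (layer sizes are arbitrary)
input_layer = Input(shape=(X.shape[1],))
dense_layer_1 = Dense(64, activation='relu')(input_layer)
dense_layer_2 = Dense(32, activation='relu')(dense_layer_1)
dense_layer_3 = Dense(16, activation='relu')(dense_layer_2)
output = Dense(y.shape[1], activation='softmax')(dense_layer_3)

model = Model(inputs=input_layer, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
# You would then call model.fit() as before and compare the scores.
```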
Regression with TensorFlow 2.0
In a regression problem, the goal is to predict a continuous value. In this section, you will see how to solve a regression problem with TensorFlow 2.0.
The dataset for this problem can be downloaded freely from this link. Download the CSV file.
The following script imports the dataset. Do not forget to change the path to your own CSV data file.
```python
petrol_cons = pd.read_csv(r'/content/drive/My Drive/datasets/petrol_consumption.csv')
```
Let's print the first five rows of the dataset via the head() method:
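```python
petrol_cons.head()
```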
You can see that there are five columns in the dataset. The regression model will be trained on the first four columns, i.e. Petrol_tax, Average_income, Paved_Highways, and Population_Driver_License(%). The value of the last column, i.e. Petrol_Consumption, will be predicted. As you can see, there is no discrete set of values for the output column; the predicted value can be any continuous value.
In the data preprocessing step, we will simply split the data into features and labels, divide the data into training and test sets, and then normalize the data. For regression problems in general, and for regression problems with deep learning in particular, it is highly recommended that you normalize your dataset. Since all the columns are numeric, we do not need to perform one-hot encoding here.
```python
X = petrol_cons.iloc[:, 0:4].values
y = petrol_cons.iloc[:, 4].values

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
```
In the above script, the feature set X contains the first four columns of the dataset, while the label set y contains only the 5th column. Next, the dataset is divided into training and test sets via the train_test_split method of the sklearn.model_selection module. The value of the test_size attribute is 0.2, which means that the test set will contain 20% of the original data and the training set will consist of the remaining 80%. Finally, the StandardScaler class from the sklearn.preprocessing module is used to scale the dataset.
The next step is to train our model. This process is quite similar to training the classification model. The only changes will be in the loss function and the number of nodes in the output dense layer. Since we are now predicting a single continuous value, the output layer will have only 1 node.
```python
input_layer = Input(shape=(X.shape[1],))
dense_layer_1 = Dense(100, activation='relu')(input_layer)
dense_layer_2 = Dense(50, activation='relu')(dense_layer_1)
dense_layer_3 = Dense(25, activation='relu')(dense_layer_2)
output = Dense(1)(dense_layer_3)

model = Model(inputs=input_layer, outputs=output)
model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mean_squared_error"])
```
Our model consists of four dense layers with 100, 50, 25, and 1 node, respectively. For regression problems, one of the most commonly used loss functions is mean_squared_error. The following script prints the summary of the model:
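```python
model.summary()
```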
Model: "model_2" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_4 (InputLayer) [(None, 4)] 0 _________________________________________________________________ dense_10 (Dense) (None, 100) 500 _________________________________________________________________ dense_11 (Dense) (None, 50) 5050 _________________________________________________________________ dense_12 (Dense) (None, 25) 1275 _________________________________________________________________ dense_13 (Dense) (None, 1) 26 ================================================================= Total params: 6,851 Trainable params: 6,851 Non-trainable params: 0
Finally, we can train the model with the following script:
```python
history = model.fit(X_train, y_train, batch_size=2, epochs=100, verbose=1, validation_split=0.2)
```
Here is the result from the last 5 training epochs:
```
Epoch 96/100
30/30 [==============================] - 0s 2ms/sample - loss: 510.3316 - mean_squared_error: 510.3317 - val_loss: 10383.5234 - val_mean_squared_error: 10383.5234
Epoch 97/100
30/30 [==============================] - 0s 2ms/sample - loss: 523.3454 - mean_squared_error: 523.3453 - val_loss: 10488.3036 - val_mean_squared_error: 10488.3037
Epoch 98/100
30/30 [==============================] - 0s 2ms/sample - loss: 514.8281 - mean_squared_error: 514.8281 - val_loss: 10379.5087 - val_mean_squared_error: 10379.5088
Epoch 99/100
30/30 [==============================] - 0s 2ms/sample - loss: 504.0919 - mean_squared_error: 504.0919 - val_loss: 10301.3304 - val_mean_squared_error: 10301.3311
Epoch 100/100
30/30 [==============================] - 0s 2ms/sample - loss: 532.7809 - mean_squared_error: 532.7809 - val_loss: 10325.1699 - val_mean_squared_error: 10325.1709
```
To evaluate the performance of a regression model on the test set, one of the most commonly used metrics is root mean squared error. We can find the mean squared error between the predicted and actual values via the mean_squared_error function of the sklearn.metrics module and then take the square root of the result. Look at the following script:
```python
from sklearn.metrics import mean_squared_error

# Root mean squared error on the training set
pred_train = model.predict(X_train)
print(np.sqrt(mean_squared_error(y_train, pred_train)))

# Root mean squared error on the test set
pred = model.predict(X_test)
print(np.sqrt(mean_squared_error(y_test, pred)))
```
The output shows the root mean squared error for both the training and test sets. The results show that the model performs better on the training set, since its root mean squared error value is lower; in other words, our model is overfitting. The reason is obvious: we have only 48 records in the dataset. Try training regression models with a larger dataset to get better results.
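You could also try regularizing the network, for example with the Dropout class we imported earlier in the classification section. Here is a minimal sketch; the dropout rate of 0.2 is an arbitrary starting point, not a tuned value:

```python
# A regularized variant of the regression model using Dropout (rate is arbitrary)
input_layer = Input(shape=(X.shape[1],))
dense_layer_1 = Dense(100, activation='relu')(input_layer)
drop_1 = Dropout(0.2)(dense_layer_1)  # randomly drop 20% of activations during training
dense_layer_2 = Dense(50, activation='relu')(drop_1)
drop_2 = Dropout(0.2)(dense_layer_2)
output = Dense(1)(drop_2)

model = Model(inputs=input_layer, outputs=output)
model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mean_squared_error"])
```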
TensorFlow 2.0 is the latest version of Google's TensorFlow library for deep learning. This article briefly covers how to create classification and regression models with TensorFlow 2.0. To gain hands-on experience, I suggest that you practice the examples given in this article and try to create simple regression and classification models with TensorFlow 2.0 using some other datasets.