Original article was published on Deep Learning on Medium

Predictive maintenance matters to manufacturers and maintainers alike: by addressing problems before they cause equipment failures, it lowers maintenance costs, extends equipment life, reduces downtime and improves production quality.

“Predictive maintenance techniques are designed to help determine the condition of in-service equipment in order to estimate when maintenance should be performed” — source Wikipedia

In this post, I would like to demonstrate that an RNN (Recurrent Neural Network)/LSTM (Long Short-Term Memory) architecture is not only more accurate but also classifies the results better than the earlier CNN (Convolutional Neural Network) approach by Marco Cerliani (read here).

# Dataset

This post uses the **C-MAPSS dataset** for predictive maintenance of a turbofan engine. The challenge here is to determine the **Remaining Useful Life (RUL)**, i.e. how long the engine has until the next fault occurs.

The dataset can be found (**here**); here’s a brief description of the dataset:

“The engine is operating normally at the start of each time series, and develops a fault at some point during the series. In the training set, the fault grows in magnitude until system failure. In the test set, the time series ends some time prior to system failure.”

The following are the conditions of the engine used in training the model:

Train trajectories: 100

Test trajectories: 100

Conditions: ONE (Sea Level)

Fault Modes: ONE (HPC Degradation)

**Understanding the Dataset**

Once we load the dataset, we obtain time series data for 100 engines, containing the operational settings and sensor readings of each engine under different scenarios in which the fault occurs, for a total of 20631 training examples. To illustrate, below are the first 5 rows of our training dataset.

`train_df.head()`

To further understand the given data, Fig 2 shows, for each engine, how many cycles are left before the next fault occurs.

*Example 1 : Engine id number 69 (farthest left) approximately has 360 cycles remaining before fault.*

*Example 2 : Engine id number 39 (farthest right) approximately has 110 cycles remaining before fault.*

`train_df.id.value_counts().plot.bar()`

The following (Fig 3 and Fig 4) show the time series data for the engine with id 69:

engine_id = train_df[train_df['id'] == 69]
ax1 = engine_id[train_df.columns[2:]].plot(subplots=True, sharex=True, figsize=(20, 30))

*The images (Fig 2, Fig 3 and Fig 4) are obtained using the source code from the GitHub notebook (**here**) by Marco Cerliani.*

# Data preprocessing

Data preprocessing is the most important step before training any neural network. Networks such as RNNs are very sensitive to the scale of the input data, which should lie in the range -1 to 1 or 0 to 1. This range is usually chosen because *tanh* (see Fig 5) is the activation function used in the hidden layers of the network. Thus, the data must be normalized before training the model.

Using the **MinMaxScaler** provided by sklearn’s preprocessing library, we normalize our training data to the range **0 to 1**. Theoretically we could also normalize the data to -1 to 1 and experiment with that; however, this post only demonstrates scaling to the range **0 to 1**.
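As a quick illustration, min-max scaling maps each column through (x − min) / (max − min). A toy sketch with a single hypothetical sensor column:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# toy column of hypothetical sensor readings
x = np.array([[10.0], [15.0], [20.0]])

sc = MinMaxScaler(feature_range=(0, 1))
x_scaled = sc.fit_transform(x)  # per column: (x - min) / (max - min)
print(x_scaled.ravel())  # [0.  0.5 1. ]
```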

from sklearn.preprocessing import MinMaxScaler

sc = MinMaxScaler(feature_range=(0, 1))
train_df[train_df.columns[2:26]] = sc.fit_transform(train_df[train_df.columns[2:26]])
train_df = train_df.dropna(axis=1)

The reason for using column numbers 2 to 26 (see Fig 1) is that we take operational setting 1 (column number 2), setting 2, setting 3, and sensor 1 up to sensor 21 (column 25); since a Python range excludes its upper limit, the upper bound is 26. For illustration, here are the first 5 training examples after normalization of the training data (Fig 6).
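As a sanity check of this half-open slicing, here is a toy frame with hypothetical column names mimicking the layout (id, cycle, 3 settings, 21 sensors):

```python
import pandas as pd

# hypothetical empty frame mimicking the column layout: id, cycle, 3 settings, 21 sensors
cols = ["id", "cycle", "setting1", "setting2", "setting3"] + [f"s{i}" for i in range(1, 22)]
df = pd.DataFrame(columns=cols)

selected = df.columns[2:26]  # half-open slice: positions 2 through 25
print(len(selected), selected[0], selected[-1])  # 24 setting1 s21
```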

Once we have our data normalized, we employ a classification approach to predict the RUL. We do this by adding new labels for our classification approach to the dataset, as follows.

*The following source code is used, from **GitHub**.*

import numpy as np

w1 = 45
w0 = 15
train_df['class1'] = np.where(train_df['RUL'] <= w1, 1, 0)
train_df['class2'] = train_df['class1']
train_df.loc[train_df['RUL'] <= w0, 'class2'] = 2

This snippet of code creates the labels (see Fig 7) for our classification problem. The classification approach is as follows:

label 0 : more than 45 cycles are left until fault.

label 1 : between 16 and 45 cycles are left until fault.

label 2 : between 0 and 15 cycles are left until fault.
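As a quick, self-contained check of this labelling logic, here is a toy sketch using the thresholds w1 = 45 and w0 = 15 from the snippet above on a handful of made-up RUL values:

```python
import numpy as np
import pandas as pd

# toy RUL values spanning all three bands (thresholds w1=45, w0=15 as above)
df = pd.DataFrame({"RUL": [100, 45, 30, 15, 5]})
w1, w0 = 45, 15

df["class1"] = np.where(df["RUL"] <= w1, 1, 0)
df["class2"] = df["class1"]
df.loc[df["RUL"] <= w0, "class2"] = 2
print(df["class2"].tolist())  # [0, 1, 1, 2, 2]
```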

Great! Now we need to further prepare our data so that the neural network can efficiently process the time series; we do this by specifying the time step size (or window size). Neural networks such as RNNs or CNNs require the input data in 3-dimensional form, hence we now need to transform our 2-dimensional data into 3-dimensional data.

To demonstrate this process of transformation (see Fig 8), we simply run through the time series data with the specified time step size (window size). This process is also known as the *Sliding Window technique*.
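Before applying it to the full dataset, the sliding-window idea can be sketched on a toy array (a minimal illustration, not the article’s exact code):

```python
import numpy as np

def sliding_windows(data, time_steps):
    """Yield overlapping windows of length time_steps over a 2-D array."""
    for start in range(data.shape[0] - time_steps):
        yield data[start:start + time_steps, :]

# toy series: 8 time steps, 2 features
data = np.arange(16).reshape(8, 2)
windows = np.asarray(list(sliding_windows(data, time_steps=5)))
print(windows.shape)  # (3, 5, 2): 8 - 5 = 3 overlapping windows
```

Each 2-D series of shape (time, features) becomes a stack of 3-D windows of shape (num_windows, time_steps, features), which is exactly the input shape an LSTM expects.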

For our time series data, we apply the *Sliding Window technique* to all of the sensors and operating settings, specifying the time steps (window size) as 50, although the window size can be set arbitrarily. The following code snippet transforms our 2-dimensional data into a 3-dimensional numpy array of size 15631x50x17, which is suitable as input to the neural network.

*The following source code is modified from **GitHub**.*

time_steps = 50

def gen_sequence(id_df):
    # feature columns (operational settings and sensors) as a numpy array
    data_matrix = id_df.iloc[:, 2:26].values
    num_elements = data_matrix.shape[0]
    for start, stop in zip(range(0, num_elements - time_steps), range(time_steps, num_elements)):
        yield data_matrix[start:stop, :]

def gen_labels(id_df, label):
    data_matrix = id_df[label].values
    num_elements = data_matrix.shape[0]
    return data_matrix[time_steps:num_elements, :]

x_train, y_train = [], []
for engine_id in train_df.id.unique():
    for sequence in gen_sequence(train_df[train_df.id == engine_id]):
        x_train.append(sequence)
    for label in gen_labels(train_df[train_df.id == engine_id], ['class2']):
        y_train.append(label)

x_train = np.asarray(x_train)
y_train = np.asarray(y_train).reshape(-1, 1)

# one-hot encode the labels for the categorical_crossentropy loss used later
from keras.utils import to_categorical
y_train = to_categorical(y_train)

*For further reading on time series data, please read the article (**here**).*

# Deep Learning Model

RNNs/LSTMs have proven to be among the best architectures for handling time series data, and there are plenty of articles on the web demonstrating their effectiveness across a broad range of applications. Hence, we employ an RNN/LSTM architecture.

Now that our data is ready in 3-dimensional form, we can define the RNN/LSTM neural network architecture, which comprises 2 hidden layers, each with the **tanh** activation function (see Fig 5), followed by a **softmax classifier** layer.

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

model = Sequential()
# input layer
model.add(LSTM(units=50, return_sequences=True, activation='tanh',
               input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(Dropout(0.2))
# hidden layer 1
model.add(LSTM(units=60, return_sequences=True, activation='tanh'))
model.add(Dropout(0.2))
# hidden layer 2
model.add(LSTM(units=60, activation='tanh'))
model.add(Dropout(0.2))
# output layer
model.add(Dense(units=3, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())

Here’s an output of the model summary(see fig 9).

**Training the RNN/LSTM Model**

The RNN/LSTM model has been trained for a total of 30 epochs; I also tried training the model for 40 epochs, but it was seen to overfit.

`history = model.fit(x_train, y_train,batch_size=32, epochs=30)`
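The overfitting observed at 40 epochs is the usual motivation for early stopping, which Keras provides out of the box as the `EarlyStopping` callback. As a minimal, self-contained sketch (not Keras’s implementation), the stopping rule itself boils down to: stop once the best validation loss is more than `patience` epochs old.

```python
def should_stop(val_losses, patience):
    """True when the best (lowest) loss is at least `patience` epochs old."""
    best = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best >= patience

# loss improves until epoch 2, then worsens for 3 epochs -> stop
print(should_stop([1.0, 0.8, 0.7, 0.72, 0.75, 0.74], patience=3))  # True
# loss still improving -> keep training
print(should_stop([1.0, 0.8, 0.7], patience=3))                    # False
```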