AutoEncoder in Spectrogram

Source: Deep Learning on Medium


An AutoEncoder is a type of neural network that receives an input image X, learns its features, and generates an output image that approximates X. In this article, we share how to create an AutoEncoder that learns the features of spectrogram images and reproduces similar spectrogram images, as a step toward a better network architecture.

The first step is to read the spectrogram images (in this case, each image is resized to 28×28) with this code:

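import glob
import os
import re

import cv2
import numpy as np

# [FILE_NAME] and [SPECTROGRAM_PATH] are placeholders to fill in.
# Strip the trailing file name to get the containing directory.
path = os.path.abspath('[FILE_NAME]')
path = re.sub(r'[a-zA-Z\s._]+$', '', path)
dirs = os.listdir(path+[SPECTROGRAM_PATH])
label = 0
im_arr = []  # one sample image per class
lb_arr = []  # class (directory) names, indexed by label
X = []
y = []
# Each sub-directory holds the spectrograms of one class.
for i in dirs:
    count = 0
    for pic in glob.glob(path+[SPECTROGRAM_PATH]+i+'/*.png'):
        im = cv2.imread(pic)  # 3-channel image in BGR order
        im = cv2.resize(im, (28, 28))
        im = np.array(im)
        count = count + 1
        X.append(im)
        y.append(label)
        # Keep the third image of each class as a sample.
        if count == 3:
            im_arr.append({str(i): im})
    # "Jumlah" is Indonesian for "count".
    print("Jumlah "+str(i)+" : "+str(count))
    label = label + 1
    lb_arr.append(i)
X = np.array(X)
y = np.array(y)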
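
The regular expression in the loading code removes a trailing run of letters, whitespace, dots, and underscores, so it only strips the whole file name when the name contains no digits or hyphens. As a hedged alternative, the sketch below compares it with `os.path.dirname`; the helper names and the example paths are hypothetical and not part of the original code.

```python
import os
import re


def strip_file_name_regex(path):
    # The article's approach: drop trailing letters, whitespace, dots, underscores.
    return re.sub(r'[a-zA-Z\s._]+$', '', path)


def parent_dir(path):
    # Alternative: keep everything up to the last separator,
    # and end the result with a separator, as the regex version does.
    return os.path.join(os.path.dirname(path), '')


# Hypothetical example paths.
simple = os.path.join('data', 'project', 'train.py')
tricky = os.path.join('data', 'project', 'train_01.py')

print(strip_file_name_regex(simple), parent_dir(simple))
print(strip_file_name_regex(tricky), parent_dir(tricky))
```

For the first path both helpers return the same directory; for the second, the regex stops at the digit and leaves part of the file name in place.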

The second step is to set up the neural network architecture (we use a CNN):

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

input_img = Input(shape=(28, 28, 3))

# Encoder: three conv + pooling stages, 28x28 -> 14x14 -> 7x7 -> 4x4
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)  # shape (4, 4, 8)

# Decoder: three conv + upsampling stages, 4x4 -> 8x8 -> 16x16 -> 28x28
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
# No padding here on purpose: 16x16 -> 14x14, so the last upsampling gives 28x28
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
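
One detail of this architecture is easy to miss: the second-to-last convolution has no `padding='same'`, so Keras falls back to its default `'valid'` padding. The plain-Python sketch below traces the spatial size through the layers to show why that matters for getting a 28×28 output. The `trace_shapes` helper and the layer tuples are hypothetical names used only for illustration; they are not part of Keras.

```python
import math


def trace_shapes(size, layers):
    """Return the spatial size of a square input after each layer."""
    sizes = [size]
    for kind, padding in layers:
        if kind == 'conv3':
            # A 3x3 convolution keeps the size with 'same' padding
            # and trims one pixel per border with 'valid' padding.
            size = size if padding == 'same' else size - 2
        elif kind == 'pool2':
            # 2x2 max pooling with 'same' padding rounds up.
            size = math.ceil(size / 2)
        elif kind == 'up2':
            # 2x2 upsampling doubles the size.
            size = size * 2
        sizes.append(size)
    return sizes


layers = [
    ('conv3', 'same'), ('pool2', 'same'),
    ('conv3', 'same'), ('pool2', 'same'),
    ('conv3', 'same'), ('pool2', 'same'),   # encoded
    ('conv3', 'same'), ('up2', None),
    ('conv3', 'same'), ('up2', None),
    ('conv3', 'valid'), ('up2', None),      # the convolution without padding
    ('conv3', 'same'),                      # decoded
]

sizes = trace_shapes(28, layers)
print(sizes)

# Same stack, but with 'same' padding on every convolution.
all_same = [(k, 'same' if k == 'conv3' else p) for k, p in layers]
print(trace_shapes(28, all_same)[-1])
```

With the unpadded convolution the trace ends at 28, matching the input. With `'same'` padding everywhere it ends at 32, and the output would no longer line up with the 28×28 target.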

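The last step is to prepare the images to be fed forward through the architecture: split them into training and test sets, then scale the pixel values to [0, 1]:

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
# Scale pixels to [0, 1] to match the sigmoid output of the decoder.
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.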
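
The code above stops before any image is actually passed through the network. A minimal sketch of that step follows, assuming the compiled `autoencoder` and the scaled `x_train` and `x_test` arrays from the earlier steps. The helper name `train_and_reconstruct` and the `epochs` and `batch_size` values are assumptions chosen for illustration, not settings taken from the original article.

```python
def train_and_reconstruct(autoencoder, x_train, x_test, epochs=50, batch_size=32):
    """Fit the autoencoder to reproduce its input, then reconstruct x_test."""
    # Input and target are the same array: the network learns X -> X.
    # The labels y_train and y_test are not needed for this.
    autoencoder.fit(
        x_train, x_train,
        epochs=epochs,            # assumed value
        batch_size=batch_size,    # assumed value
        shuffle=True,
        validation_data=(x_test, x_test),
    )
    # Forward pass: one reconstructed image per test image.
    return autoencoder.predict(x_test)
```

Calling `decoded_imgs = train_and_reconstruct(autoencoder, x_train, x_test)` would then give the reconstructed spectrograms to compare against `x_test`.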

We can use Matplotlib Pyplot to show the original spectrogram and the reconstructed one as follows:

Original Spectrogram
Spectrogram After Passing Through the AutoEncoder