What is Tensorflow?

When you are building neural networks, convolutional layers, or anything else related to machine learning, it is recommended to use libraries. For beginners, the problem is the sheer number of these libraries. "Which one should I use?" is a common question, and there are as many answers as there are libraries.

Basically, it doesn't matter much which library you use, because they all work in fundamentally similar ways. I personally started with Tensorflow (tf) because there were a lot of examples and tutorials for it. I recommend it to beginners for two reasons. Firstly, as I already said, there is a lot of material on the web. It is also used in many companies, so if you are interested in working for someone, they might appreciate your tf knowledge. The second reason is that Tensorflow is Google's open source library. As long as Google is behind the project, it will be updated often and there will be tutorials and other material from Google. Tf's documentation is also very easy to read, and I usually find what I'm looking for.

On top of Tensorflow there is a library called Keras. Keras is like drag-and-drop programming: you don't learn much by using it, and when you want to build something more advanced it gets more complicated. So I don't recommend using libraries that are built on top of other libraries. People say code is easier to read when there are fewer lines, but I think it is the opposite. It is annoying to dig through source code to find out what a method is doing when you could just write it out in the main code.

Tensorflow Hello World!

Python version:

a = 5
b = 6
c = a*b

Tensorflow version:

import tensorflow as tf
# first we define the parameters
a = tf.constant(5)
b = tf.constant(6)
# then we define the calculation. It is not computed yet. This is called
# defining the graph. In tf we first define the graph and calculate it later.
c = a*b
# this starts the Tensorflow environment.
with tf.Session() as sess:
    # this computes the calculation which we defined earlier
    result =
    # and finally we print the result
    print(result)

From these examples we saw how Python and Tensorflow compute the same result differently. Tensorflow can run the calculation on a GPU (if one is available), while plain Python runs it on the CPU. For a calculation this simple it doesn't matter, but when we are computing big matrix multiplications it is a lot faster on a GPU. So in principle we could use plain Python to calculate neural networks, but we don't, because it is much slower.
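The define-then-run idea can be sketched in plain Python (a conceptual analogy only, not the actual Tensorflow API): building the graph is like composing functions, and running the session is like finally calling them.

```python
# Conceptual analogy: "graph nodes" are zero-argument functions.
def constant(value):
    return lambda: value

def multiply(left, right):
    # nothing is computed here; we only describe the computation
    return lambda: left() * right()

a = constant(5)
b = constant(6)
c = multiply(a, b)   # like defining the graph: c is not 30 yet

result = c()         # like inside a Session: now it actually computes
print(result)        # 30
```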

Simple example

import tensorflow as tf

We set our input and output data

x = tf.constant([1,3,8,5,20,4],tf.float32)
y = tf.constant([2,6,16,10,40,8],tf.float32)

Randomly chosen starting points (so 1 and 2 are just arbitrary numbers)

a = tf.Variable(1.0)
b = tf.Variable(2.0)

Defining the graph. This is a linear function.

y_hat = a*x+b

Then we define our loss function and let the optimizer follow its gradient. In the second line we set the learning rate, which is normally something less than 1, but since we have a linear function and integer-valued data, using 1 as the learning rate works better here than a smaller number.

loss = tf.losses.mean_squared_error(y,y_hat)
optimizer = tf.train.AdamOptimizer(1).minimize(loss)

Finally, we run our graph. On the last line we print the values of a and b, which should be a = 2 and b = 0. Why these values? If you remember, we defined our function to be a*x+b, and 2*x+0 = 2x. Try multiplying every x value by 2 and you should get the corresponding y value.

model = tf.global_variables_initializer()
with tf.Session() as sess:
    for i in range(100):
    a_value, b_value =[a, b])
    print("a:", a_value, "b:", b_value)
OUTPUT: a: 2.0047016 b: -0.005181805
(Your output can be a little bit different)


  • First we load our data and save it as tf.constant
  • Then we randomly choose our parameters, which we plug into our function
  • After that we build our loss function and feed it into the optimizer
  • Finally we run the optimizer in a loop; on every iteration it changes the parameters in the direction that reduces the loss between y_hat (the predicted result) and y.
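To see what the optimizer does under the hood, here is the same linear fit written as plain gradient descent in pure Python (no Tensorflow; the learning rate and iteration count are just values I picked so it converges, since plain gradient descent is less forgiving than Adam):

```python
# the same data as in the example above
xs = [1, 3, 8, 5, 20, 4]
ys = [2, 6, 16, 10, 40, 8]

a, b = 1.0, 2.0      # same arbitrary starting points
lr = 0.005           # small learning rate for plain gradient descent
n = len(xs)

for _ in range(20000):
    # gradients of the mean squared error with respect to a and b
    grad_a = sum(2 * (a * xi + b - yi) * xi for xi, yi in zip(xs, ys)) / n
    grad_b = sum(2 * (a * xi + b - yi) for xi, yi in zip(xs, ys)) / n
    a -= lr * grad_a
    b -= lr * grad_b

print(round(a, 3), round(b, 3))  # converges close to a = 2, b = 0
```

Each step nudges a and b downhill on the loss surface; Adam does the same thing but adapts the step size per parameter, which is why a learning rate of 1 worked in the Tensorflow version.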

Now you might wonder what constant and Variable are.

Variable: These are like normal variables in Python. We can change them, and we have to give them an initial value first. Our parameters are variables because our optimizer has to be able to change them.

Placeholder: This is the third kind of parameter Tensorflow has. Placeholders are like variables, but you set their values inside the session by writing feed_dict={placeholder_name: value}. They are a very popular way to feed in data. I use placeholders in later examples, so you will understand them better then.

Constant: This is like a variable, but you can't change its value. So you can think of it as a static value which is set once and can't be modified later. In our example I made the data constant because we never change it.

Predicting the result of football game

First we import all important packages.

import pandas as pd
from math import floor
import tensorflow as tf
import numpy as np
from sklearn import preprocessing

Then we read the data from four different csv files and save them to data, which is a DataFrame.

data1 = pd.read_csv('premier_14_15.csv')
data2 = pd.read_csv('premier_15_16.csv')
data3 = pd.read_csv('premier_16_17.csv')
data4 = pd.read_csv('premier_17_18.csv')
dataCon = [data1,data2,data3,data4]
data = pd.concat(dataCon)

These functions just modify our data to make it easier to use.

def normalization(raw_data):
    for col_num in range(raw_data.shape[1]):
        # mean-normalize every numeric column (the original dtype check was
        # truncated; np.int64 as the second dtype is an assumption)
        if raw_data.iloc[:,col_num].dtype == np.float64 or raw_data.iloc[:,col_num].dtype == np.int64:
            raw_data.iloc[:,col_num] = (raw_data.iloc[:,col_num] - raw_data.iloc[:,col_num].mean()) / (raw_data.iloc[:,col_num].max() - raw_data.iloc[:,col_num].min())
    return raw_data

def embedding_matrix(column):
    # one small random embedding vector per unique team name
    labels = np.unique(column)
    embeddings = np.array([])
    num_of_uniques = len(labels)
    for i in range(num_of_uniques):
        if embeddings.size == 0:
            embeddings = np.random.uniform(low=-0.01,high=0.01,size=(min(50,(num_of_uniques+1)//2),1))
        else:
            embeddings = np.append(embeddings,np.random.uniform(low=-0.01,high=0.01,size=(min(50,(num_of_uniques+1)//2),1)),axis=1)
    return pd.DataFrame(data=embeddings,columns=labels)

em = embedding_matrix(data['HomeTeam'])
data = normalization(data)

We put the Xs into one array and the Ys into another DataFrame.

# the original line was truncated; stacking the home- and away-team
# embeddings is an assumption about what the remaining columns were
x_data = np.column_stack((np.transpose(em[data['HomeTeam']].values),
                          np.transpose(em[data['AwayTeam']].values)))
y_data = data['FTR']
y_data = pd.get_dummies(y_data)

Then we split our data into train, validation and test sets.

train_size = 0.9
valid_size = 0.3
train_cnt = floor(x_data.shape[0] * train_size)
x_train = x_data[0:train_cnt]
y_train = y_data.iloc[0:train_cnt].values
valid_cnt = floor((x_data.shape[0] - train_cnt) * valid_size)
x_valid = x_data[train_cnt:train_cnt+valid_cnt]
y_valid = y_data.iloc[train_cnt:train_cnt+valid_cnt].values
x_test = x_data[train_cnt+valid_cnt:]
y_test = y_data.iloc[train_cnt+valid_cnt:]
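The arithmetic of that split is easy to check by hand; with a hypothetical row count of 1520 matches it works out like this:

```python
from math import floor

n = 1520                                  # hypothetical number of matches
train_cnt = floor(n * 0.9)                # 90% for training -> 1368
valid_cnt = floor((n - train_cnt) * 0.3)  # 30% of the remaining 152 -> 45
test_cnt = n - train_cnt - valid_cnt      # everything left -> 107

print(train_cnt, valid_cnt, test_cnt)     # 1368 45 107
```

Note that valid_size = 0.3 is a fraction of the leftover 10%, not of the whole dataset, so the validation set ends up being about 3% of the data.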

Now we define the x and y variables as tf.placeholder so we can set their values later. We already have the data, so we could define x and y right now, but this is the more common approach.

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

Hyperparameters. You can play with these and get a better result than I did. I chose these more or less at random, so picking at least a better learning rate (=alpha) might help.

# Parameters
ALPHA = 1e-3
num_epochs = 50
batch_size = 128
display_step = 5

More parameters…

# Network Parameters
num_input = x_data.shape[1]
num_classes = y_data.shape[1]
num_hidden_1 = 50
num_hidden_2 = 50

Then we write a function which builds our model.

def neural_network(x,weights,biases,keep_prob):
    layer_1 = tf.add(tf.matmul(x,weights['w1']),biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_1 = tf.nn.dropout(layer_1,keep_prob)

    layer_2 = tf.add(tf.matmul(layer_1,weights['w2']),biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    layer_2 = tf.nn.dropout(layer_2,keep_prob)

    layer_out = tf.add(tf.matmul(layer_2, weights['out']), biases['out'])
    return layer_out

Weights and biases for the model.

# Store layers weight & bias
weights = {
    'w1': tf.Variable(tf.random_normal([num_input,num_hidden_1])),
    'w2': tf.Variable(tf.random_normal([num_hidden_1,num_hidden_2])),
    'out': tf.Variable(tf.random_normal([num_hidden_2, num_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([num_hidden_1])),
    'b2': tf.Variable(tf.random_normal([num_hidden_2])),
    'out': tf.Variable(tf.random_normal([num_classes]))
}
keep_prob = tf.placeholder("float")

Then we build our model and save it to predictions. After that we define the cost and the optimizer.

predictions = neural_network(x, weights, biases, keep_prob)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=predictions, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=ALPHA).minimize(cost)

And finally we run the graph.

with tf.Session() as sess:

    for epoch in range(num_epochs):
        avg_cost = 0.0
        total_batch = int(len(x_train) / batch_size)
        x_batches = np.array_split(x_train, total_batch)
        y_batches = np.array_split(y_train, total_batch)

        for i in range(total_batch):
            batch_x, batch_y = x_batches[i], y_batches[i]
            # the original call was truncated; keep_prob 0.8 is an assumed
            # dropout value for training
            _, c =[optimizer, cost],
                            feed_dict={x: batch_x,
                                       y: batch_y,
                                       keep_prob: 0.8})
            avg_cost += c / total_batch

        if epoch % display_step == 0:
            print("Train: Epoch:", '%04d' % (epoch+display_step), "cost=", "{:.9f}".format(avg_cost))
            # only evaluate the cost on the validation set; running the
            # optimizer here would train on the validation data
            valid_cost =, feed_dict={x: x_valid, y: y_valid, keep_prob: 1})
            print("Valid: Epoch:", '%04d' % (epoch+display_step), "cost=", "{:.9f}".format(valid_cost))

            correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
            print("Accuracy:", accuracy.eval({x: x_test, y: y_test, keep_prob: 1.0}))

    print("Optimization Finished!")
    correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Accuracy:", accuracy.eval({x: x_test, y: y_test, keep_prob: 1.0}))
Train: Epoch: 0005 cost= 14.639687252
Valid: Epoch: 0005 cost= 7.217744350
Accuracy: 0.45794392
Train: Epoch: 0010 cost= 8.123380804
Valid: Epoch: 0010 cost= 3.757234097
Accuracy: 0.39252338
Train: Epoch: 0015 cost= 7.011940718
Valid: Epoch: 0015 cost= 2.546033621
Accuracy: 0.44859812
Train: Epoch: 0020 cost= 6.122287321
Valid: Epoch: 0020 cost= 2.085423470
Accuracy: 0.48598132
Train: Epoch: 0025 cost= 5.233317709
Valid: Epoch: 0025 cost= 2.182467699
Accuracy: 0.57009345
Train: Epoch: 0030 cost= 4.904307556
Valid: Epoch: 0030 cost= 2.305070400
Accuracy: 0.5420561
Train: Epoch: 0035 cost= 4.445039535
Valid: Epoch: 0035 cost= 2.415178537
Accuracy: 0.5607477
Train: Epoch: 0040 cost= 3.930668139
Valid: Epoch: 0040 cost= 2.189312696
Accuracy: 0.5607477
Train: Epoch: 0045 cost= 3.636997890
Valid: Epoch: 0045 cost= 2.088778257
Accuracy: 0.5607477
Train: Epoch: 0050 cost= 3.385137343
Valid: Epoch: 0050 cost= 1.808984399
Accuracy: 0.5420561
Optimization Finished!
Accuracy: 0.5607477
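The accuracy printed above is nothing mysterious: it is the fraction of rows where the predicted class (the argmax of the network output) matches the true class (the argmax of the one-hot label). In plain Python, with made-up numbers for four matches:

```python
# hypothetical network outputs and one-hot labels (e.g. home win / draw / away win)
preds  = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
labels = [[0, 1, 0],       [1, 0, 0],       [0, 0, 1],       [1, 0, 0]]

def argmax(row):
    # index of the largest value, like tf.argmax along axis 1
    return max(range(len(row)), key=row.__getitem__)

correct = [argmax(p) == argmax(t) for p, t in zip(preds, labels)]
accuracy = sum(correct) / len(correct)
print(accuracy)  # 0.75: three of the four predictions match
```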

Classifying images with Tensorflow


import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
import os
import random
from PIL import Image

We import our data. This comes with Tensorflow, and it is just handwritten digits from 0 to 9.

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Hyperparameters.

img_shape = np.array([28,28])
num_input = img_shape[0] * img_shape[1]
num_classes = 10

drop_out_prob = 0.5
learning_rate = 0.001
epochs = 5
batch_size = 128
display_step = 100


X = tf.placeholder(tf.float32, [None, num_input])
Y = tf.placeholder(tf.float32, [None, num_classes])
drop_out = tf.placeholder(tf.float32)

Then we write a function which builds the model.

def conv_net(x, drop_out):
    x = tf.reshape(x, shape=[-1, img_shape[0], img_shape[1], 1])

    # the conv layer arguments were truncated in the original; 32 filters and
    # a 5x5 kernel are typical assumed choices ('same' padding and 64 filters
    # in the second layer are implied by the 7*7*64 reshape below)
    conv1 = tf.layers.conv2d(inputs=x, filters=32, kernel_size=[5,5],
                             padding="same", activation=tf.nn.relu)
    pool1 = tf.layers.max_pooling2d(inputs=conv1,pool_size=[2,2],strides=2)

    conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[5,5],
                             padding="same", activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(inputs=conv2,pool_size=[2,2],strides=2)

    pool2_flat = tf.reshape(pool2,[-1,7*7*64])
    dense = tf.layers.dense(inputs=pool2_flat,units=1024,activation=tf.nn.relu)
    # training=True so the drop_out placeholder controls the dropout rate;
    # we feed 0 at test time to disable it
    dropout = tf.layers.dropout(inputs=dense, rate=drop_out, training=True)

    return tf.layers.dense(inputs=dropout,units=num_classes)
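The 7*7*64 in the reshape follows from the pooling: with 'same' padding the convolutions keep the 28x28 spatial size, and each 2x2 max pooling with stride 2 halves it. A quick sanity check:

```python
size = 28            # MNIST images are 28x28
size //= 2           # after pool1: 14x14
size //= 2           # after pool2: 7x7
flat = size * size * 64   # 64 feature maps from the second conv layer
print(size, flat)    # 7 3136, i.e. the 7*7*64 in the reshape
```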

Again we build the model, calculate the cost, and finally define the optimizer.

logits = conv_net(X,drop_out)
prediction = tf.nn.softmax(logits)

cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(prediction), reduction_indices=[1]))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
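What the cost line computes can be checked by hand: softmax turns the logits into probabilities, and cross-entropy is minus the log of the probability the model gives to the correct class. With made-up logits for one image:

```python
import math

logits = [2.0, 1.0, 0.1]   # hypothetical network outputs for one image
label  = [1, 0, 0]         # one-hot: class 0 is the correct digit

exps  = [math.exp(v) for v in logits]
probs = [e / sum(exps) for e in exps]          # softmax
cross_entropy = -sum(t * math.log(p) for t, p in zip(label, probs))
print(round(cross_entropy, 3))  # about 0.417: the model gives class 0 probability ~0.66
```

The cost in the example above is just this value averaged over the batch.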

Then we train the model.

with tf.Session() as sess:

    test_images, test_labels = mnist.test.next_batch(batch_size)

    for epoch in range(epochs):
        num_steps = int(len(mnist.train.labels)/batch_size)
        for step in range(num_steps):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
  , feed_dict={X: batch_x, Y: batch_y, drop_out: drop_out_prob})
            if step % display_step == 0 or step == 1:
                pred =, feed_dict={X: test_images, drop_out: 0})
                print("y predicts:", np.argmax(pred, axis=1)[:5])
                print("y actuals: ", np.argmax(test_labels, axis=1)[:5])
                print("Accuracy:",, feed_dict={X: test_images, Y: test_labels, drop_out: 0}))

        print(num_steps,, feed_dict={X: test_images, Y: test_labels, drop_out: 0}))

    print("Optimization Finished!")

    # Calculate accuracy for the MNIST test images
    print("Testing Accuracy:",
, feed_dict={X: mnist.test.images,
                                             Y: mnist.test.labels,
                                             drop_out: 0}))
And finally we got 99.1% accuracy.

I recommend bookmarking my Github, because you can copy most of this code from there. I hope this helped, and if some example wasn't explained well, leave a comment and I will explain it to you.


Source: Deep Learning on Medium