Source: Deep Learning on Medium
SecureSVM, Boosting, Bagging, Clustering, LSTM, CNN, GAN
In continuation of my previous blogs, “Traditional vs Deep Learning in Retail Industry” and “Deep Learning Vs Deep Reinforcement Learning Algorithms in Retail Industry”, this blog highlights different ML algorithms used in blockchain transactions, with a special emphasis on bitcoin in retail payments. This blog is structured as follows:
- Overview of the role of blockchain in the retail industry.
- Different traditional algorithms (SecureSVM, Bagging, Boosting, Clustering) vs deep learning algorithms (LSTM, CNN and GAN) used in bitcoin retail payments.
Blockchain in Retail
Blockchain technology addresses retail supply chain challenges by providing authenticity to the supply chain. The potential of blockchain to solve retail supply chain problems manifests in three areas.
Provenance: Both the retailer and the customer can track the entire product life cycle along the supply chain.
Smart contracts: Transactions among disparate partners that are prone to lag can be automated for more efficiency.
IoT backbone: Supports low-powered mesh networks for IoT devices, reducing the need for a central server and enhancing the reliability of sensor data.
Use of Traditional Algorithms
- Support vector machine (SVM) enables efficient data classification and thereby finds applications in the retail industry in scenarios like detecting customer loyalty, anomaly detection and customer behaviour classification. SecureSVM, a privacy-preserving variant of the SVM algorithm, was developed to train an SVM over blockchain-based encrypted IoT data. The blockchain design and architecture help to build a secure and reliable data sharing platform among multiple retail stores of the same brand, where IoT data is encrypted and then recorded on a distributed ledger. The removal of a trusted party and the introduction of enhanced security ensure the confidentiality of the sensitive data for each data provider as well as of the SVM model parameters. The figure below represents feature extraction and template creation for the training data used to model SecureSVM.
- Supervised learning algorithms based on boosting and bagging (e.g. GradientBoost, AdaBoost, Random Forest, ExtraTrees) can be used for uncovering Bitcoin blockchain anonymity in retail payments. Bitcoin is a cryptocurrency whose transactions are recorded on a distributed, openly accessible ledger. As Bitcoin provides a high degree of anonymity, with an entity’s real-world identity hidden behind a pseudonym, supervised learning algorithms can be applied to reveal the identity of buyers across different transactions by classifying clusters of bitcoin addresses. The figure below illustrates a mechanism to identify possible owners of a bitcoin cluster by predicting the category of a yet-unidentified cluster.
- K-means clustering is used in retail payments to identify malicious activity. As retailers accept bitcoin as a form of payment, the adoption of bitcoin in different apps is set to change how payments are made. Because bitcoin uses blockchain technology, which allows data to be shared and processed between multiple parties over a network, it becomes essential to detect anomalous nodes that may be malicious and involved in illegal activity. The figure below illustrates the use of a clustering technique to accept transactions between genuine users and reject all others.
The nodes of a blockchain can be clustered into groups based on their behavioral patterns. To analyse the behavior, the parameters selected are the time taken for one transaction and the amount transferred from one node to another. These parameters are selected because the transaction amount is usually the most important and predominant feature of a node. The algorithm used for this is K-means.
The algorithm is applied after extracting sequence data that represents node behaviors, and it clusters the nodes into categories. After clustering, representative behavior patterns for each category can be selected as behavior templates, which makes it possible to identify strange behavior patterns that do not conform to any template. Moreover, clustering behavior patterns into categories may both lead to deeper insights into the blockchain network and help maintainers manage and organize the nodes.
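As a minimal sketch of the clustering step, the following pure-Python k-means groups nodes by the two parameters mentioned above (time per transaction and transaction amount). The sample values, the function names `nearest` and `kmeans`, and the choice of k = 2 are illustrative assumptions, not data or code from a real blockchain.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))

def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means over tuples of equal length; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical nodes: (seconds per transaction, amount transferred).
nodes = [(10.0, 0.5), (12.0, 0.7), (11.0, 0.6), (9.0, 0.4),
         (60.0, 40.0), (65.0, 42.0), (58.0, 39.0)]
centroids, labels = kmeans(nodes, k=2)
print(labels)
```

In practice the two features would usually be scaled to comparable ranges first, since k-means is sensitive to the units of each dimension. A node whose distance to every centroid is unusually large could then be treated as not conforming to any behavior template.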
The above diagram depicts how the cluster objects interact with each other, and how this helps in securing the blockchain. Assume that there is a malicious node M whose cluster centroid is Cm. If M initiates two transactions, T1 and T2, at the same time to two different nodes (a double-spending attack), Cm will learn about it because the transactions will be added to the block. When Cm learns that node M is performing malicious activity, it discards both transactions and shows an error message. If the transaction is between two genuine users and the node is not doing anything malicious, the transaction is sent from Cm to the cluster centroid of the receiving node, and that centroid forwards the transaction to the node. After all this, the transaction, whether valid or invalid, is added to the distributed ledger, which keeps a record of every transaction that has ever happened.
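The check performed by the centroid Cm can be sketched as follows. This is only an illustration under stated assumptions: the transaction fields (`id`, `sender`, `receiver`, `coin`, `time`) and the function name `flag_double_spends` are hypothetical, and a double spend is taken to mean that the same sender uses the same coin in more than one transaction at the same time.

```python
from collections import defaultdict

def flag_double_spends(transactions):
    """Return the ids of all transactions that take part in a double spend.

    Each transaction is a dict with the hypothetical keys
    'id', 'sender', 'receiver', 'coin' and 'time'.
    """
    groups = defaultdict(list)
    for tx in transactions:
        # Transactions that reuse the same coin from the same sender
        # at the same time end up in the same group.
        groups[(tx["sender"], tx["coin"], tx["time"])].append(tx)

    rejected = set()
    for group in groups.values():
        if len(group) > 1:
            # As in the description above, every conflicting transaction
            # is discarded, not only the later ones.
            rejected.update(tx["id"] for tx in group)
    return rejected

pending = [
    {"id": "T1", "sender": "M", "receiver": "A", "coin": "c1", "time": 100},
    {"id": "T2", "sender": "M", "receiver": "B", "coin": "c1", "time": 100},
    {"id": "T3", "sender": "G", "receiver": "A", "coin": "c2", "time": 100},
]
print(sorted(flag_double_spends(pending)))  # ['T1', 'T2']
```

Here T1 and T2 are rejected together, while T3 from the genuine node G would be forwarded to the receiver's centroid.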
To evaluate the similarity of the clustered sequences, the k-means algorithm is implemented first and the features can then be enhanced. To measure the similarity between two sequences, there are various techniques, such as Euclidean distance, Edit Distance on Real sequence (EDR), Dynamic Time Warping (DTW), and Longest Common Sub-Sequence (LCSS). DTW is a better choice than the others because it can compare sequences of different lengths, which suits a blockchain system in which the blocks are all of different lengths.
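To make the point about sequences of different lengths concrete, here is a minimal sketch of the classic dynamic-programming form of DTW. The function name `dtw_distance` and the use of the absolute difference as the local cost are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: a step in a only, a step in b only,
            # or a step in both sequences.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]

# The second sequence repeats one value, so the lengths differ,
# but the shapes match and the distance is zero.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

A plain Euclidean distance is not defined for these two inputs because they have different lengths, which is the situation the paragraph above describes.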
Use of Deep Learning Algorithms
Credit Evaluation System Based on Blockchain
A blockchain-based credit evaluation system can be used to strengthen the effectiveness of supervision and management in the retail supply chain. As illustrated in the following figure, the system gathers credit evaluation text from traders through smart contracts on the blockchain. The gathered text is then analyzed for different sentiments by a deep learning network named Long Short-Term Memory (LSTM). Finally, the credit results of traders (such as farmers, production factories, distributors, retailers and consumers) are used as a reference for supervision and management by regulators. By applying blockchain, traders can be held accountable for their actions in the process of transaction and credit evaluation. Regulators (who manage authentication, authorization, monitoring, transactions and credit evaluation of traders) can gather more reliable, authentic and sufficient information about traders.
GAN and LSTM in Price Prediction
Bitcoin and other cryptocurrencies have gained much attention as they offer an opportunity to bypass expensive forms of payment in favor of something much cheaper. Blockchain also makes country-of-origin labeling and product safety tracking easier in the supply chain by maintaining the authenticity of goods: it records every touchpoint in the lifespan of a product as it moves through the supply chain, from the producer to the consumer.
This popularity has reached the retail industry, where blockchain payments have been enabled across multiple devices with the BitPay Checkout app.
To predict the price of Bitcoin in the retail industry using machine learning, some of the popular models that can be used are the Bayesian-optimized recurrent neural network (RNN), the Long Short-Term Memory (LSTM) network and the ARIMA model.
Traditional time series prediction methods such as Holt-Winters exponential smoothing models rely on linear assumptions and require data that can be broken down into trend, seasonal and noise components to be effective. This type of methodology is more suitable for a task such as forecasting sales, where seasonal effects are present. Due to the lack of seasonality in the Bitcoin market and its high volatility, these methods are not very effective for this task.
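For orientation, Holt-Winters builds on simple exponential smoothing by adding trend and seasonal terms. The sketch below shows only the basic level recursion, not the full Holt-Winters model; the function name `exponential_smoothing`, the smoothing factor `alpha` and the sample prices are assumptions for illustration.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    The first smoothed value is initialised to the first observation.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [float(series[0])]
    for x in series[1:]:
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical price series with one sudden jump.
prices = [100, 100, 100, 160, 100, 100]
print(exponential_smoothing(prices, alpha=0.5))
```

Because each smoothed value is a weighted average of past observations, a sudden jump is only partly followed and then decays slowly, which hints at why such methods struggle with a highly volatile series.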
Price prediction uses data of a sequential nature. The recurrent neural network (RNN) and the long short-term memory (LSTM) network are therefore favored over the traditional multilayer perceptron (MLP), because these more advanced algorithms are temporal in nature.
One limitation of the MLP, and similarly of the RNN, is that they are affected by the vanishing gradient problem. Because the layers and time steps of the network relate to each other through multiplication, the derivatives are susceptible to exploding or vanishing. Another limitation of the MLP is that its signals only pass forward through the network in a static way. As a result, it does not capture the temporal element of a time series task effectively. The recurrent neural network, also known as a dynamic neural network, addresses some of these limitations.
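The effect of repeated multiplication can be seen in a deliberately simplified case. The sketch below assumes a scalar linear recurrence h_t = w * h_(t-1), which is not a real RNN but has the same chain-rule structure; the function name `gradient_through_time` is hypothetical.

```python
def gradient_through_time(w, steps):
    """Derivative of h_T with respect to h_0 for h_t = w * h_(t-1).

    The chain rule contributes one factor of w per time step,
    so the result equals w ** steps.
    """
    grad = 1.0
    for _ in range(steps):
        grad *= w
    return grad

for w in (0.5, 1.0, 1.5):
    print(w, gradient_through_time(w, steps=50))
```

With a factor below 1 the gradient shrinks towards zero over 50 steps, and with a factor above 1 it grows very large, which is the vanishing and exploding behaviour described above.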
Generative adversarial network (GAN) for Anomaly Detection
In the supply chain, the transfer of goods between suppliers, the suppliers’ suppliers, logistics providers, carriers, cargo and ports involves monitoring the real-time status and visibility of all transactions, which requires the ability to:
- Evaluate patterns and trends over time
- Identify partners which contribute to the anomaly
- Find date patterns and/or event-driven occurrences.
The pricing and availability of supply chain items in retail can be represented as time series, hence a combination of LSTM and GAN can be applied. A generative adversarial network (GAN) is a framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model that captures the data distribution and a discriminative model that estimates the probability that a sample truly came from the training data.
The LSTM-GAN combination can learn the characteristics of retail commodity stocks and pricing trends based on the availability of the right number of products, product demand, promotions by the retailer, promotions by the competition, and changes in customer tastes and preferences.
The generator G takes random noise from the latent space at each time step. Sample outputs produced by the generator G are mixed with the real samples to form the inputs of the discriminator D. The discriminator D receives the mixed samples, performs a classification task, and gives a single output (real or fake).
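The two objectives behind this setup can be sketched with plain binary cross-entropy on the discriminator's output probabilities. This is an illustration only: the function names are hypothetical, the probabilities are made-up values rather than outputs of a trained network, and the generator uses the common non-saturating form of the loss, which is an assumption about the variant.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Cross-entropy for D: real samples should score 1, generated ones 0.

    d_real: D's probabilities on real samples, each in (0, 1).
    d_fake: D's probabilities on generated samples, each in (0, 1).
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating loss for G: low when D scores generated samples as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# A discriminator that separates the two sources well has a low loss,
# while the generator's loss is high in the same situation.
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]))
print(generator_loss([0.1, 0.2]))
```

Training alternates between lowering the first quantity by updating D and lowering the second by updating G, which is the adversarial process described above.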
LSTM has been applied as the base structure of GANs to identify anomaly patterns in the supply chain. An LSTM network is composed of a forget gate f, an input gate i, a memory cell update C, and an output gate o. The forget gate decides how much of the historical data is kept. The input gate decides which values will be updated. The memory cell update turns the old cell state into the new state. The output gate chooses the parts of the cell state that are going to be sent out.
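The four components can be written out for a single time step. The sketch below assumes scalar inputs and states to keep it readable (real LSTM layers use vectors and weight matrices); the function name `lstm_step` and the `params` layout are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step on scalars.

    params maps each of 'f', 'i', 'c', 'o' to a tuple
    (input weight, recurrent weight, bias).
    Returns the new hidden state h and the new cell state c.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("f"))            # forget gate: how much old state to keep
    i = sigmoid(pre_activation("i"))            # input gate: how much new content to write
    c_candidate = math.tanh(pre_activation("c"))  # candidate cell content
    o = sigmoid(pre_activation("o"))            # output gate: how much state to expose

    c = f * c_prev + i * c_candidate            # memory cell update
    h = o * math.tanh(c)                        # output sent to the next step
    return h, c

# Hypothetical untrained parameters, used only to run the step.
params = {name: (0.5, 0.5, 0.0) for name in ("f", "i", "c", "o")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, params=params)
print(h, c)
```

Because the cell state is updated by addition (`f * c_prev + i * c_candidate`) rather than by repeated multiplication alone, the LSTM is less prone to the vanishing gradient problem than a plain RNN.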
In the following figure, long short-term memory (LSTM) is used as the base structure of the GAN, which learns normal market behaviors in an unsupervised way. After training, the discriminator network of the GAN is used as a detector to discriminate between normal supply and outliers in the supply chain, to assist in effectively managing waste and availability.
Just as LSTMs can be used within GANs, a combination of LSTM and CNN (convolutional neural network) can also be used to build a GAN. A CNN consists of an input layer and an output layer, as well as multiple hidden layers. The hidden layers of a CNN typically consist of convolutional layers, pooling layers, fully connected layers, and normalization layers.
The figure below illustrates such a combination (the GAN-FD architecture), where the generator is built on LSTM and is used to predict Yt+1. The discriminator is based on CNN and estimates the probability that a sequence is real (Y) rather than predicted (Ŷ).
The discriminative model is based on the CNN architecture and performs convolution operations on the one-dimensional input sequence generated by the LSTM generator, in order to estimate the probability that a sequence comes from the dataset rather than from the generative model.
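A heavily reduced sketch of such a discriminator is shown below: one one-dimensional convolution, a ReLU, global average pooling and a sigmoid output. It is not the GAN-FD network itself; the function names, the single hand-picked kernel and the `weight` and `bias` values are assumptions, and nothing here is trained.

```python
import math

def conv1d(sequence, kernel):
    """Valid (no padding) one-dimensional convolution as used in CNN layers,
    that is, a sliding dot product without flipping the kernel."""
    k = len(kernel)
    return [sum(sequence[i + j] * kernel[j] for j in range(k))
            for i in range(len(sequence) - k + 1)]

def discriminator_score(sequence, kernel, weight, bias):
    """Probability-like score in (0, 1) for a one-dimensional sequence."""
    features = [max(0.0, v) for v in conv1d(sequence, kernel)]  # ReLU
    pooled = sum(features) / len(features)                      # global average pooling
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))    # sigmoid

# Hypothetical price sequence and a kernel that responds to upward steps.
sequence = [1.0, 2.0, 4.0, 3.0, 5.0]
print(conv1d(sequence, kernel=[-1.0, 1.0]))  # [1.0, 2.0, -1.0, 2.0]
print(discriminator_score(sequence, kernel=[-1.0, 1.0], weight=1.0, bias=0.0))
```

In a trained discriminator the kernel, weight and bias would be learned so that real sequences receive scores near 1 and generated sequences receive scores near 0.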
The main objective behind using GANs is that the adversarial loss can simulate the pricing characteristics of commodity items. Retailers can forecast the pricing of stocked items from the available indicator data with the generative model, and then judge the probability that their forecast is correct against previous stock data with the help of the discriminative model.