Simplify the Machine Learning Development Process with Apache PredictionIO


Model Training and Deployment

I know you already want to jump to the main part, so you can start with the Docker image I have already prepared. Just run the commands below:

docker run -it -p 8001:8000 -p 7071:7070 adrian3ka/pio:0.0.2 /bin/bash
pio-start-all

Please wait 10 to 30 seconds to let the engine warm up first. If you check it too early, you may get an error message. To check whether the engine is ready to go, run:

pio status

The expected result from the command above is:

[INFO] [Management$] Your system is all ready to go.

After the pio engine has started, we can begin importing the data. Before we proceed, it's better to have a grasp of the data that will be imported. The data resides in the data/data.txt file. Every row represents one event and its result. For example:

FRAUDSTER,10 165000000 1 3 GOLD

From the data above, you can say that the transaction with the attributes:

  • transactionVelocity 10
  • gtv 165000000
  • relatedAccount 1
  • accountAge 3
  • cardType GOLD

is a transaction that came from a FRAUDSTER user.
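To make the format concrete, here is a rough sketch of how a row like this could be parsed into a label and a numeric feature vector for an MLlib-style classifier. This is only an illustration: the real parsing lives in the template's DataSource and Preparator, and the cardType and label encodings below are assumptions.

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// Hypothetical encodings for the categorical fields; the template may use others.
val cardTypeIndex = Map("SILVER" -> 0.0, "GOLD" -> 1.0, "PLATINUM" -> 2.0)
val labelIndex = Map("SAFE" -> 0.0, "SUSPICIOUS" -> 1.0, "FRAUDSTER" -> 2.0)

// Parse one line, e.g. "FRAUDSTER,10 165000000 1 3 GOLD".
def parseLine(line: String): LabeledPoint = {
  val Array(label, attributes) = line.split(",")
  val Array(velocity, gtv, relatedAccount, accountAge, cardType) =
    attributes.trim.split(" ")
  LabeledPoint(
    labelIndex(label),
    Vectors.dense(
      velocity.toDouble,
      gtv.toDouble,
      relatedAccount.toDouble,
      accountAge.toDouble,
      cardTypeIndex(cardType)
    )
  )
}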

Before we proceed to the source code explanation and dataset import, we should prepare the environment first:

export FRAUD_MODEL_KEY=fraud-detection-model
pio app new FraudDetectionModel --access-key=${FRAUD_MODEL_KEY}
pio app list

Going back to the host (your computer), open the project directory. First of all, we need to build the Scala jar with the command below:

sbt package
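For reference, the jar is built from the template's build.sbt. Roughly, it declares something like the sketch below; the versions here are assumptions, so keep whatever the template ships with.

name := "fraud-detection-model"

scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  "org.apache.predictionio" %% "apache-predictionio-core" % "0.14.0" % "provided",
  "org.apache.spark"        %% "spark-core"               % "2.1.1"  % "provided",
  "org.apache.spark"        %% "spark-mllib"              % "2.1.1"  % "provided"
)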

Copy all the source code, including the jar, to the Docker container with the commands below:

docker container ls # copy paste the container id to the next line
export SPARK_CONTAINER_ID=bc07c00d3370
docker cp ./ ${SPARK_CONTAINER_ID}:/fraud-detection-model

To import all the data into the event server, we can run the commands below:

cd fraud-detection-model
python data/import_eventserver.py --access_key $FRAUD_MODEL_KEY

To check whether the import succeeded, you can fetch all of the data with curl:

curl  -X GET "http://localhost:7070/events.json?accessKey=$FRAUD_MODEL_KEY&limit=-1" | jq
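Inside the engine, the DataSource reads these stored events back through PEventStore. Here is a minimal sketch of that lookup, assuming the import script stores each row as an event on a "transaction" entity with the label as the event name; the entity type and property names are assumptions, not the template's exact code.

import org.apache.predictionio.data.store.PEventStore
import org.apache.spark.SparkContext

// Read the imported events back inside the engine's DataSource.
def readTraining(sc: SparkContext) =
  PEventStore.find(
    appName = "FraudDetectionModel",
    entityType = Some("transaction")
  )(sc).map { event =>
    val props = event.properties
    (
      event.event,                          // assumed label, e.g. "FRAUDSTER"
      props.get[Int]("transactionVelocity"),
      props.get[Long]("gtv"),
      props.get[Int]("relatedAccount"),
      props.get[Int]("accountAge"),
      props.get[String]("cardType")
    )
  }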

After we make sure all the events are fetched correctly, we can move forward to the next step: training and deploying the model. You can do that simply by typing the following commands:

pio build --verbose
pio train
pio deploy
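Once deployed, the engine listens on queries.json, deserializes each request body into the engine's Query class, and serializes a PredictedResult back. As a rough sketch (the field names follow the curl examples below; the result field name is an assumption):

// Query fields match the JSON bodies sent in the curl examples below.
case class Query(
  transactionVelocity: Int,
  gtv: Long,
  relatedAccount: Int,
  accountAge: Int,
  cardType: String
) extends Serializable

// The predicted label returned by the engine, e.g. "SAFE" or "FRAUDSTER".
case class PredictedResult(label: String) extends Serializable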

To check whether the engine was built and trained correctly, open a new terminal tab and run the curl commands below from your laptop / server, outside of the Docker container.

The curl below should return FRAUDSTER:

curl -H "Content-Type: application/json" -d \
'{ "transactionVelocity":10, "gtv":165000000, "relatedAccount":1, "accountAge": 3, "cardType": "GOLD" }'\
http://localhost:8001/queries.json

The curl below should return SAFE:

curl -H "Content-Type: application/json" -d \
'{ "transactionVelocity":1, "gtv":450000, "relatedAccount":1, "accountAge": 9, "cardType": "SILVER" }' \
http://localhost:8001/queries.json

The curl below should return SUSPICIOUS:

curl -H "Content-Type: application/json" -d \
'{ "transactionVelocity":4, "gtv":135000000, "relatedAccount":1, "accountAge": 96, "cardType": "PLATINUM" }' \
http://localhost:8001/queries.json

The curl below should return SUSPICIOUS. The model itself actually predicts SAFE, but we added some hard-coded rules in the Serving component (see the sketch after this command):

curl -H "Content-Type: application/json" -d \
'{ "transactionVelocity":6, "gtv":2000000001, "relatedAccount":6, "accountAge": 108, "cardType": "PLATINUM" }' \
http://localhost:8001/queries.json
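To illustrate what such a hard-coded rule can look like, here is a minimal sketch of a Serving component, reusing the Query and PredictedResult classes sketched earlier. The thresholds are assumptions chosen to match this example, not the template's exact values.

import org.apache.predictionio.controller.LServing

class Serving extends LServing[Query, PredictedResult] {

  override def serve(query: Query,
                     predictedResults: Seq[PredictedResult]): PredictedResult = {
    // Flag very large or heavily linked transactions regardless of the model output.
    if (query.gtv > 2000000000L || query.relatedAccount > 5) {
      PredictedResult("SUSPICIOUS")
    } else {
      // Otherwise pass the algorithm's prediction through unchanged.
      predictedResults.head
    }
  }
}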