Bye to Trial and Error Activation Functions of Neural Networks III: Proposed Jameel’s ANNAF Deterministic Criterion

“You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete.”

Buckminster Fuller

“Your brain does not manufacture thoughts. Your thoughts shape neural networks.”

Deepak Chopra

In the previous article in this series, “Bye to Trial and Error Activation Functions of Neural Networks II: Proposed Jameel’s ANNAF Stochastic Criterion”, a stochastic criterion was discussed, and eight (8) stock-price activation functions were proposed. In this article, using the proposed “Jameel’s ANNAF Deterministic Criterion”, we can obtain at least TWO THOUSAND (2,000) ACTIVATION FUNCTIONS EMANATING from our sample AI-ML-PURIFIED DATA SET.

Proposed Jameel’s ANNAF Deterministic Criterion

ANNAF means Artificial Neural Network Activation Functions.

Any Neural Network that requires DETERMINISTIC ACTIVATION FUNCTIONS can use functions satisfying the following proposed criterion:

(1) The function f(x) shall EMANATE from our referenced AI-ML-Purified Data Set. The essence of requiring f(x) to EMANATE from the referenced AI-ML-Purified Data is to build incredible and sophisticated Activation Function(s) that BEST MATCH AND/OR TUNE to our referenced AI-ML-Purified Data Set, since a neural network is a system made to learn a function from data. The Activation Functions obtained from the referenced AI-ML-Purified Data can then be used to build an extraordinary Neural Network Artificial Intelligence system.

(2) A curve fitting for the Best-Fitted Deterministic Function shall be carried out; select the function f(x) whose:

(a) Rank is Unity (1);

(b) Fit Standard Error is smaller than that of any other function on the list;

(3) The function f(x) shall be Nonlinear;

(4) The function f(x) shall have a Range;

(5) The function f(x) shall be Continuously Differentiable;

(6) The function f(x) shall be Monotonic;

(7) The function f(x) shall be Smooth Function with a Monotonic Derivative;

(8) The function f(x) shall Approximate the Identity near the Origin.

If any of these fail, discard the first-ranked function f(x) and repeat (1) to (8) until a qualified Deterministic Activation Function EMANATES from our referenced AI-ML-Purified Data. A minimal numerical sketch of checks (3) to (8) is given below.
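
For illustration only (this is my minimal numerical sketch, not the author’s software), the Python snippet below probes a candidate function f on a finite grid and tests axioms (3) to (8); the grid bounds, tolerances, and the near-origin test are illustrative assumptions.

```python
import numpy as np

def axiom_checks(f, lo=-5.0, hi=5.0, n=2001, tol=1e-6):
    """Numerically probe a candidate activation f against axioms (3)-(8)."""
    x = np.linspace(lo, hi, n)          # probe grid (illustrative choice)
    y = f(x)
    dy = np.gradient(y, x)              # numerical first derivative
    d2y = np.gradient(dy, x)            # numerical second derivative
    i0 = np.argmin(np.abs(x))           # grid point closest to the origin
    return {
        "(3) nonlinear": bool(np.ptp(dy) > tol),
        "(4) has a range (finite on the grid)": bool(np.isfinite(y).all()),
        "(5) continuously differentiable (proxy)": bool(np.isfinite(dy).all()),
        "(6) monotonic": bool(np.all(dy >= -tol) or np.all(dy <= tol)),
        "(7) monotonic derivative": bool(np.all(d2y >= -tol) or np.all(d2y <= tol)),
        "(8) identity near origin": bool(abs(y[i0]) < 1e-3 and abs(dy[i0] - 1.0) < 1e-2),
    }

# Example: tanh passes most checks but its derivative is not monotonic,
# while softplus has a monotonic derivative but is not the identity near 0.
for name, fn in [("tanh", np.tanh), ("softplus", lambda z: np.log1p(np.exp(z)))]:
    print(name, axiom_checks(fn))
```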

NOTE: Deep Learning Artificial Neural Network’s Hidden and output Layers consist of at least one, two or more Best fitted Activation Functions EMANATED from our AI-Data Set, therefore, the RANK: UNITY (ONE) in (2)-(a) and Fattiness Standard Error (2)-(b) of the criterion means when a function whose Real “Rank =1” was chosen and it satisfied (1) to (8) then the next function on list whose Real “Rank=2” will assume “New Rank=1” and will be tested to satisfy all the eight (8) axioms until we have the required number of BEST (EXCELLENT) Activation Functions needed to carry out our Deep Learning Artificial Neural Network.
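
A minimal sketch of this rank-promotion loop, assuming a list ranked_candidates already sorted by Fit Standard Error (best first) and the axiom_checks helper from the previous sketch; all names here are illustrative, not the author’s software:

```python
def select_activations(ranked_candidates, needed, checker):
    """Walk the fit-ranked list: whenever the current Rank-1 candidate is
    accepted or rejected, the next candidate is promoted to Rank 1, until
    `needed` activation functions have passed every axiom check."""
    selected = []
    for f in ranked_candidates:          # real Rank 1, 2, 3, ... in turn
        if all(checker(f).values()):     # axioms (3)-(8), e.g. axiom_checks
            selected.append(f)           # accepted: one more EXCELLENT function
        if len(selected) == needed:      # stop once the network has enough
            break
    return selected
```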

Proposed Jameel’s Deterministic Lemma

All TOP-RANKED Nonlinear, Monotonic, Continuously Differentiable Deterministic Functions that EMANATE from the referenced AI-ML-Purified Data and satisfy the proposed “Jameel’s ANNAF Deterministic Criterion” are EXCELLENT DETERMINISTIC ACTIVATION FUNCTIONS for performing well-informed Forward and Backward Propagations of an Artificial Neural Network.

Example

Jamilu (2019) employed the “TableCurve 2D” curve-fitting software from SYSTAT. The software automatically fits 3,665 BUILT-IN EQUATIONS FROM ALL DISCIPLINES to discover the ideal MODELS that describe the data: “TableCurve 2D is the first and only program that completely eliminates endless trial and error by automating the curve-fitting process.” This suits the “Jameel’s ANNAF Deterministic Criterion” of Deep Learning Artificial Neural Networks proposed above, because the software statistically RANKS the LIST of candidate equations.
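
TableCurve 2D is proprietary, but the fit-and-rank step it automates can be sketched in a few lines of Python. The following toy analogue (my illustration, not the author’s workflow or SYSTAT code) fits three hypothetical candidate models to made-up (temperature, conductance) pairs with SciPy and ranks them by fit standard error, the statistic behind “Rank” above.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical (temperature, conductance) pairs; the real sample data set
# ships with TableCurve 2D and is shown in the figure below.
T = np.array([40.0, 60.0, 80.0, 100.0, 120.0, 140.0, 160.0])
G = np.array([0.12, 0.21, 0.35, 0.52, 0.71, 0.90, 1.08])

# Three illustrative candidate models (TableCurve tries 3,665 of them).
candidates = [
    ("linear",    lambda x, a, b:    a + b * x),
    ("quadratic", lambda x, a, b, c: a + b * x + c * x**2),
    ("power law", lambda x, a, b:    a * np.power(x, b)),
]

ranking = []
for name, model in candidates:
    n_params = model.__code__.co_argcount - 1          # parameters after x
    params, _ = curve_fit(model, T, G, p0=[1.0] * n_params, maxfev=10000)
    resid = G - model(T, *params)
    dof = len(T) - n_params                            # degrees of freedom
    fit_std_err = np.sqrt(np.sum(resid**2) / dof)      # analogue of "Fit Std Err"
    ranking.append((fit_std_err, name))

for rank, (err, name) in enumerate(sorted(ranking, key=lambda t: t[0]), start=1):
    print(f"Rank {rank}: {name:10s} Fit Std Err = {err:.4g}")
```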

The author used the SAMPLE DATA of “TEMPERATURE VS CONDUCTANCE” provided with the TableCurve 2D software, as shown below:

Source: https://systatsoftware.com/products/tablecurve-2d/tablecurve-2d-curve-fitting/

Deterministic Activation Functions for Temperature vs Conductance

Now let us find the BEST-fitting functions for the above AI DATA SET:

The FIRST Ranked Function is:

We can view all the FUNCTIONS fitted to our Sample AI DATA as follows:

The software automatically fitted and ranked 2,224 functions in 1 minute 45 seconds, as follows:

The SECOND Ranked Function is:

The ranks of about 2,153 fitted functions EMANATING from our sample AI data:

This means we can have up to about 2,224 FUNCTIONS (mostly deterministic) EMANATING from our SAMPLE AI DATA that can serve as ACTIVATION FUNCTIONS for a Deep Learning Neural Network on “TEMPERATURE vs CONDUCTANCE”. Thus, these satisfy the first axiom of “Jameel’s ANNAF Deterministic Criterion”, which says: “(1) The function shall EMANATE from the referenced AI-ML-Purified Data Set”.

Now we will test whether all 2,224 candidate Activation Functions satisfy the remaining SEVEN (7) AXIOMS of “Jameel’s ANNAF Deterministic Criterion” for the successful conduct of our Deep Learning Neural Network processes; a screening sketch is given below. Note that any qualified Stochastic Activation Function on this list shall instead satisfy “Jameel’s ANNAF Stochastic Criterion”.
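
A minimal screening sketch, reusing the hypothetical axiom_checks helper from the earlier sketch; the two stand-in candidates below merely illustrate how all 2,224 fitted functions would be filtered:

```python
import numpy as np

# `fitted` stands in for the 2,224 fitted candidate functions; only two
# illustrative stand-ins are shown here.
fitted = {
    "tanh-like": np.tanh,
    "cubic":     lambda x: x + 0.1 * x**3,
}

# Keep only candidates passing axioms (3)-(8); axiom_checks is the helper
# defined in the sketch after the criterion above.
passing = {name: f for name, f in fitted.items()
           if all(axiom_checks(f).values())}
print(f"{len(passing)} of {len(fitted)} candidates pass axioms (3) to (8)")
```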

The author shows only the First Derivatives of the THREE TOP-RANKED Activation Functions, as follows:

Backward Propagation: First Derivatives of the Three Top-Ranked Activation Functions

First Derivative of the First (1ST) Ranked Activation Function:

First Derivative of the Second (2ND) Ranked Activation Function:

Now if a=b=c=d=e=f=g=h=i=j=k=1 then we have:

First Derivative of the Third (3RD) Ranked Activation Function:

Now if a=b=c=d=e=f=g=h=i=j=1 then we have:

Thanks to the “TableCurve 2D” curve-fitting software from SYSTAT and the online “Derivative Calculator”, the Three Top-Ranked Activation Functions have now been DIFFERENTIATED.
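
The ranked equations themselves appear only as images above, so the sketch below differentiates a stand-in rational model (my assumption, of the general kind TableCurve 2D fits) with SymPy, which here plays the role of the online Derivative Calculator, and then applies the a = b = … = 1 substitution used above:

```python
import sympy as sp

x, a, b, c, d, e = sp.symbols("x a b c d e")

# Hypothetical rational model standing in for a top-ranked fitted equation.
f = (a + b * x + c * x**2) / (1 + d * x + e * x**2)

fprime = sp.simplify(sp.diff(f, x))   # first derivative used in backpropagation
print(fprime)

# Mirror the a = b = ... = 1 substitution applied to the 2nd- and 3rd-ranked
# functions above.
print(sp.simplify(fprime.subs({a: 1, b: 1, c: 1, d: 1, e: 1})))
```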

Thus, they satisfy the following AXIOMS of “Jameel’s ANNAF Deterministic Criterion”:

(1) The THREE (3) functions EMANATED from our referenced (Temperature vs Conductance) AI-ML-PURIFIED DATA SET.

(2) The THREE (3) functions’ curve fittings have:

(a) Rank = 1

(b) Fit Standard Errors smaller than those of any other functions on the list;

(3) The functions are Nonlinear;

(4) The functions all have Ranges;

(5) The functions are Continuously Differentiable;

(6) The functions are Monotonic;

(7) The functions are Smooth Functions with Monotonic Derivatives;

And so on.

Therefore, if the Three Top-Ranked Activation Functions satisfy the remaining axiom of “Jameel’s ANNAF Deterministic Criterion”, they are ready for the successful conduct of a Deep Learning Neural Network for “TEMPERATURE vs CONDUCTANCE”; otherwise, the iterations shall be repeated until all the Activation Functions satisfy the Deterministic Criterion.

We can guarantee that the first three deterministic functions satisfying Jameel’s ANNAF Deterministic Criterion are excellent Activation Functions for successfully conducting a “TEMPERATURE vs CONDUCTANCE” Deep Learning Neural Network. Subsequent functions on the list are also good Advanced Optimized Activation Functions.

This research REVEALED that Advanced Activation Functions satisfying Jameel’s ANNAF Stochastic and/or Deterministic Criterion will henceforth depend on the REFERENCED PURIFIED AI DATA SET, TIME CHANGE and AREA OF APPLICATION (acronym DTA), as shown in the figure below:

My next article will work towards achieving “SUPER-INTELLIGENT NEURAL NETWORKS” using Jameel’s ANNAF Stochastic Criterion and Jameel’s ANNAF Deterministic Criterion.

AUTHOR

Jamilu Auwalu Adamu, FIMC, CMC, FIMS (UK), FICA (in view)

Associate Editor, Risk and Financial Management Journal, USA

Editor, Journal of Economics and Management Sciences, USA

Former Associate Editor, Journal of Risk Model Validation, UK

PEER-REVIEWER, RISK.NET Journals, London

Former, Steering Committee Member, PRMIA Nigeria Chapter

Books Author

Correspondence: Mathematics Programme Building, 118 National Mathematical Centre, Small Sheda, Kwali, 904105, FCT-Abuja, Nigeria. Tel: +2348038679094. E-mail: whitehorseconsult@yahoo.com

References

TABLECURVE 2D SOFTWARE, SYSTAT (2019) available online: https://systatsoftware.com/products/tablecurve-2d/tablecurve-2d-curve-fitting/

Derivative Calculator (2019) available on: https://www.derivative-calculator.net/

Jamilu Auwalu Adamu (2019), Advanced Stochastic Optimization Algorithm for Deep Learning Artificial Neural Networks in Banking and Finance Industries, Risk and Financial Management Journal, USA, Vol 1 No1 (2019), DOI: https://doi.org/10.30560/rfm.v1n1p8 and available online: https://j.ideasspread.org/index.php/rfm/article/view/387

Jamilu Auwalu Adamu (2019), Superintelligent Deep Learning Artificial Neural Networks, accepted for publication in the International Journal of Applied Science, IDEAS SPREAD. INC, USA (https://j.ideasspread.org/index.php/ijas), preprint available on https://www.preprints.org/manuscript/201912.0263/v1 with doi: 10.20944/preprints201912.0263.v1

Jamilu Auwalu Adamu (2019), Advanced Deterministic Optimization Algorithm for Deep Learning Artificial Neural Networks, accepted for publication in the International Journal of Applied Science, IDEAS SPREAD. INC, USA (https://j.ideasspread.org/index.php/ijas)

Jamilu Auwalu Adamu (2019), Deterministic and Stochastic Superintelligent Digital Brains, Distinct Biological Neurons: Distinct Activation Functions implying Distinct Artificial Neurons, preprints available on researchgate.net and academia.edu respectively via https://www.researchgate.net/publication/338170126_Deterministic_and_Stochastic_Superintelligent_Digital_Brains_Distinct_Biological_Neurons_Distinct_Activation_Functions_implying_Distinct_Artificial_Neurons with DOI: 10.13140/RG.2.2.31550.23368 and https://www.academia.edu/41430249/Deterministic_and_Stochastic_Superintelligent_Digital_Brains

Jamilu Auwalu Adamu (2019), Backward Propagation of Artificial Neural Network: First Derivatives of Advanced Optimized Stochastic Activation Functions, preprints available on researchgate.net via https://www.researchgate.net/publication/337907110_Backward_Propagation_of_Artificial_Neural_Network_First_Derivatives_of_Advanced_Optimized_Stochastic_Activation_Functions , DOI: 10.13140/RG.2.2.14004.60803

Jamilu Auwalu Adamu (2015), Banking and Economic Advanced Stressed Probability of Default Models, Asian Journal of Management Sciences, 03(08), 2015, 10–18.

Jamilu A. Adamu (2015), Estimation of Probability of Default using Advanced Stressed Probability of Default Models, Ongoing Ph.D Thesis, Ahmadu Bello University (ABU), Zaria, Nigeria.

Nair, V. and Hinton, G. E. (2010), Rectified Linear Units Improve Restricted Boltzmann Machines, ICML’10: Proceedings of the 27th International Conference on Machine Learning, pages 807–814, Haifa, Israel, June 21–24, 2010.

Djork-Arne Clevert, Thomas Unterthiner & Sepp Hochreiter (2016), FAST AND ACCURATE DEEP NETWORK LEARNING BY EXPONENTIAL LINEAR UNITS (ELUS), Published as a conference paper at ICLR 2016

Klambauer et al. (2017), Self-Normalizing Neural Networks, Institute of Bioinformatics, Johannes Kepler University Linz, Austria.

Lichman, M. (2013), UCI Machine Learning Repository, available online: http://archive.ics.uci.edu/ml

Aman Dureja and Payal Pahwa (2019), Analysis of Non-Linear Activation Functions for Classification Tasks Using Convolutional Neural Networks, Recent Patents on Computer Science Journal, Volume 12 , Issue 3 , 2019, DOI : 10.2174/2213275911666181025143029

Chigozie Enyinna Nwankpa et al. (2018), Activation Functions: Comparison of Trends in Practice and Research for Deep Learning, available online: https://arxiv.org/pdf/1811.03378.pdf

Soufiane Hayou et al. (2019), On the Impact of the Activation Function on Deep Neural Networks Training, Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019, available online: https://arxiv.org/pdf/1902.06853.pdf

Schoenholz et al. (2017), Deep Neural Networks as Gaussian Processes, published as a conference paper at ICLR 2018, available online: https://arxiv.org/abs/1711.00165


Casper Hansen (2019) says “Better optimized neural network; choose the right activation function, and your neural network can perform vastly better”, available online: https://mlfromscratch.com/neural-networks-explained/#/

Artist Hans Hoffman wrote, “The ability to simplify means to eliminate the unnecessary so that the necessary may speak.” available online: https://www.brainyquote.com/quotes/hans_hofmann_107805

Barnaby Black et al. (2016), Complying with IFRS 9 Impairment Calculations for Retail Portfolios, Moody’s Analytics Risk Perspectives: The Convergence of Risk, Finance, and Accounting, Volume VII, June 2016.

Ben Steiner (2019), Model Risk Management for Deep Learning and Alpha Strategies, BNP Paribas Asset Management, Quant Summit 2019

Bellotti T. and Crook J. (2012), Loss Given Default Models Incorporating Macroeconomic Variables for Credit Cards, International Journal of Forecasting, 28(1), 171–182, DOI: 10.1016/j.ijforecast.2010.08.005

Burton G. Malkiel (Princeton University), Atanu Saha (AlixPartners) and Alex Grecu (Huron Consulting Group) (2009), The Clustering of Extreme Movements: Stock Prices and the Weather, CEPS Working Paper No. 186, February 2009.


Nassim N. Taleb (2011), The Future Has Thicker Tails than the Past: Model Error as Branching Counterfactuals, presented in honor of Benoit Mandelbrot at his scientific memorial, Yale University, April 2011.

Nassim N. Taleb (2011), A Map and Simple Heuristic to Detect Fragility, Antifragility, and Model Error, First Version, 2011

Nassim N. Taleb (2010), Why Did the Crisis of 2008 Happen, Draft, 3rd Version, August, 2010.

Nassim N. Taleb (2009), Errors, Robustness, and Fourth Quadrant, New York University Polytechnic Institute and Universa Investment, United States, International Journal of Forecasting 25 (2009) 744–759

Nassim N. Taleb (2010), Convexity, Robustness, and Model Error inside the “Black Swan Domain”, Draft Version, September, 2010.

Nassim N. Taleb et al. (2009), Risk Externalities and Too Big to Fail, New York University Polytechnic Institute, 11201, New York, United States.

Nassim N. Taleb (2012), The Illusion of Thin Tails under Aggregation, NYU-Poly, January 2012.

Nassim N. Taleb (2007), Black Swans and the Domains of Statistics, The American Statistician, August 2007, Vol. 61, No. 3.


Tidaruk Areerak (2014), Mathematical Model of Stock Prices via a Fractional Brownian Motion Model with Adaptive Parameters.

Ton Dieker (2004), Simulation of Fractional Brownian Motion, Thesis, University of Twente, Department of Mathematical Sciences, P.O. BOX 217, 7500 AE Enschede, Netherlands

Wenyu Zhang (2015), Introduction to Ito’s Lemma, Lecture Note, Cornell University, Department of Statistical Sciences, May 6, 2015.

https://www.stoodnt.com/blog/scopes-of-machine-learning-and-artificial-intelligence-in-banking-financial-services-ml-ai-the-future-of-fintechs/

https://medium.com/datadriveninvestor/neural-networks-activation-functions-e371202b56ff

https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/

http://www.datastuff.tech/machine-learning/why-do-neural-networks-need-an-activation-function/

https://medium.com/the-theory-of-everything/understanding-activation-functions-in-neural-networks-9491262884e0

https://www.youthkiawaaz.com/2019/07/future-of-artificial-intelligence-in-banks/

https://news.efinancialcareers.com/uk-en/328299/ai-in-trading-buy-side

https://ai.stackexchange.com/questions/7609/is-nassim-taleb-right-about-ai-not-being-able-to-accurately-predict-certain-type/7610

https://towardsdatascience.com/adam-latest-trends-in-deep-learning-optimization-6be9a291375c

The U.S. Hearing Youtube (Technology Companies & Algorithms): https://www.youtube.com/watch?v=vtw4e68CkwU

https://www.commerce.senate.gov/2019/6/optimizing-for-engagement-understanding-the-use-of-persuasive-technology-on-internet-platforms

https://ai.stackexchange.com/questions/7088/how-to-choose-an-activation-function

https://mlfromscratch.com/activation-functions-explained/#/

https://github.com/nadavo/mood

Backward propagation and Activation Functions:

https://www.youtube.com/watch?v=q555kfIFUCM

https://www.youtube.com/watch?v=-7scQpJT7uo

https://towardsdatascience.com/analyzing-different-types-of-activation-functions-in-neural-networks-which-one-to-prefer-e11649256209

http://vision.stanford.edu/teaching/cs231n-demos/linear-classify/

http://vision.stanford.edu/teaching/cs231n-demos/knn/

https://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.41357&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false