Market Generators
Blanka Horvath
King's College London
7CCMFM18 Machine Learning
March 23, 2020
Transforming horizons of mathematical modelling
Changes induced by the increase in available computing power
1) Currently used stochastic market models: Black-Scholes model, Heston model, stochastic volatility models, rough volatility models
(few parameters; simplified but tractable models of "reality"; well understood; approximate solutions, often with a full error analysis available)
2) DNN models: based on generative modelling techniques adapted to financial applications and settings
(fully flexible: many parameters; an ongoing challenge to deliver approximation and convergence properties)
1.5) Augmenting and extending currently prevalent stochastic models
(models that adapt to market environments by creating mixture models)
1) ⇒ 2) From models (Quant) to Market Generators (DS): what's the difference?
Stylised facts
There is a long history of financial modelling, and of matching stylised facts of markets, from well before Market Generators emerged. These stylised facts include:
– Heavy tails and aggregational Gaussianity: asset returns have (power-law-like) heavier tails than the normal distribution, and a distribution that is more peaked than the normal. However, as the time scale increases, the distribution looks more and more Gaussian.
– Multifractal scaling structure.
– Slow decay of autocorrelation in absolute returns: the autocorrelation decays slowly, as a function of the time lag, following a power law.
– Volatility clustering: phases of high/low activity tend to be followed by phases of high/low activity.
– Leverage effect: a negative correlation between the volatility of asset returns and the returns process.
– Non-stationarity: financial time series are typically non-stationary, that is, past distributions are not necessarily like future distributions.
– …
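The following is a minimal Python sketch (not part of the original slides) of two of these checks on a synthetic GARCH(1,1) series: excess kurtosis under temporal aggregation, and the slow decay of the autocorrelation of absolute returns. All parameter values are illustrative.

```python
# Minimal sketch: checking two stylised facts -- aggregational Gaussianity and
# slow decay of |return| autocorrelation -- on a synthetic GARCH(1,1) series.
import numpy as np

rng = np.random.default_rng(0)

def garch_returns(n, omega=1e-6, alpha=0.09, beta=0.9):
    """Simulate GARCH(1,1) log-returns, a simple model with volatility clustering."""
    r = np.empty(n)
    sig2 = omega / (1 - alpha - beta)          # start at the stationary variance
    for t in range(n):
        r[t] = np.sqrt(sig2) * rng.standard_normal()
        sig2 = omega + alpha * r[t] ** 2 + beta * sig2
    return r

def excess_kurtosis(x):
    x = x - x.mean()
    return (x ** 4).mean() / (x ** 2).mean() ** 2 - 3.0

def acf(x, lag):
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).mean() / x.var()

r = garch_returns(100_000)

# Heavy tails at fine scales, closer to Gaussian after aggregation:
for scale in (1, 5, 20):
    agg = r[: len(r) // scale * scale].reshape(-1, scale).sum(axis=1)
    print(f"scale {scale:3d}: excess kurtosis = {excess_kurtosis(agg):.2f}")

# Slow (power-law-like) decay of the autocorrelation of absolute returns:
for lag in (1, 10, 100):
    print(f"lag {lag:4d}: acf(|r|) = {acf(np.abs(r), lag):.3f}")
```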
1) A modern example: Rough Volatility Models
Closely matches observations of financial time series:
1. Volatility clustering
2. Leverage effect
3. Multifractal scaling structure
⇒ a good model, as it reflects many stylised facts (as opposed to Black-Scholes, for example); see notebook.
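Below is a minimal, illustrative Python sketch (not the course notebook) of a rough Bergomi-type variance path, using a naive Riemann-sum discretisation of the Volterra kernel; a proper implementation would use, e.g., the hybrid scheme. All parameter values (H, eta, xi0, rho) are illustrative.

```python
# Minimal sketch of a rough Bergomi-type variance path and the driven log-price.
import numpy as np

rng = np.random.default_rng(1)

H, eta, xi0, rho = 0.1, 1.5, 0.04, -0.9   # Hurst index, vol-of-vol, initial variance, leverage
T, n = 1.0, 500
dt = T / n
t = np.arange(1, n + 1) * dt              # grid points t_1, ..., t_n
s = np.arange(n) * dt                     # left endpoints of the increments

dW = rng.standard_normal(n) * np.sqrt(dt)

# Riemann-Liouville fBm: W~_t = sqrt(2H) * int_0^t (t-u)^(H-1/2) dW_u,
# approximated with left-point increments (the kernel stays finite: t_i > s_j).
W_tilde = np.array([
    np.sqrt(2 * H) * np.sum((t[i] - s[: i + 1]) ** (H - 0.5) * dW[: i + 1])
    for i in range(n)
])

# rBergomi-type variance: v_t = xi0 * exp(eta * W~_t - 0.5 * eta^2 * t^(2H)).
v = xi0 * np.exp(eta * W_tilde - 0.5 * eta ** 2 * t ** (2 * H))

# Log-price driven by a Brownian motion correlated with dW (leverage effect).
dZ = rng.standard_normal(n) * np.sqrt(dt)
dB = rho * dW + np.sqrt(1 - rho ** 2) * dZ
logS = np.cumsum(np.sqrt(v) * dB - 0.5 * v * dt)
```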
1.5) Example:
A first step towards mixture-of-experts models using the Deep Learning Volatility framework
Training in the DLV framework is done by solving
$$\hat{w} = \operatorname{argmin}_{w \in \Omega} L\big(\{f^{NN}(w, x_i)\}_{i=1}^{N},\ \{P(x_i)\}_{i=1}^{N}\big),$$
where $f^{NN}(w,\cdot)$ is the network output for weights $w$, and $P(x_i)$ are the model prices for the $N$ parameter samples $x_i$.
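A minimal PyTorch sketch of this training step follows. The pricing map P below is a toy stand-in (in the DLV framework it is a model price, e.g. from Monte Carlo), and the squared-error loss, the architecture and the sampling of the x_i are all illustrative choices.

```python
# Minimal sketch of the DLV training step: learn weights w so that
# f_NN(w, x) approximates a pricing map x -> P(x).
import torch
import torch.nn as nn

torch.manual_seed(0)

def P(x):                        # toy stand-in for the model pricing map P(x_i)
    return torch.sin(x).sum(dim=1, keepdim=True)

f_NN = nn.Sequential(            # the "w" are these layers' weights
    nn.Linear(4, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
opt = torch.optim.Adam(f_NN.parameters(), lr=1e-3)

x = torch.rand(10_000, 4)        # sampled model parameters x_i
y = P(x)                         # their prices P(x_i): the training labels

for step in range(2_000):        # w_hat = argmin_w L({f_NN(w, x_i)}, {P(x_i)})
    opt.zero_grad()
    loss = ((f_NN(x) - y) ** 2).mean()   # here L is the mean squared error
    loss.backward()
    opt.step()
```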
We take a mixture of two (or more) classical stochastic volatility models and calibrate it to data, as demonstrated in the Deep Learning Volatility example.
First step towards more sophisticated mixture-of-experts models.
We see that, after learning, calibrating many parameters is fast
⇒ learn several models at the same time + choose a mixture of "best fit" models.
New learning procedure:
Train the generator on several models at the same time (here Heston and rBergomi) in Monte Carlo experiments as before, with a new mixture parameter a:
a × Heston + (1 − a) × rBergomi.
Calibrate the mixture model ⇒ determine the best-fit mixture of the two models for given data.
Controlled experiments: train on both rBergomi and Heston ⇒ test on data generated by Heston.
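The sketch below illustrates this calibration step: with fast (already learned) approximations of the two pricing maps in hand, a standard optimiser determines the best-fit weight a. Both surface maps are toy stand-ins for the trained networks, and the synthetic "market" smile is illustrative.

```python
# Minimal sketch of the mixture calibration: find the weight a (and the model
# parameters) so that a * F_heston + (1 - a) * F_rbergomi best fits the quotes.
import numpy as np
from scipy.optimize import minimize

grid = np.linspace(0.8, 1.2, 11)           # strikes of the quoted smile

def F_heston(theta):                       # toy stand-in for the learned map
    level, skew = theta
    return level + skew * (grid - 1.0)

def F_rbergomi(theta):                     # toy stand-in for the learned map
    level, curv = theta
    return level + curv * (grid - 1.0) ** 2

sigma_mkt = 0.2 + 0.05 * (grid - 1.0) ** 2          # synthetic "market" smile

def objective(p):
    a, th_h, th_b = p[0], p[1:3], p[3:5]
    model = a * F_heston(th_h) + (1 - a) * F_rbergomi(th_b)
    return np.sum((model - sigma_mkt) ** 2)

res = minimize(objective, x0=[0.5, 0.2, 0.0, 0.2, 0.0],
               bounds=[(0, 1)] + [(None, None)] * 4)
print(f"best-fit mixture weight a = {res.x[0]:.2f}")   # -> close to 0 here
```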
2) Market Generators: New horizons? New challenges?
With increased computational power, "DNN-based generative models" can be used to model financial markets.
Types of existing generative models in Deep Learning:
1. Restricted Boltzmann Machine (RBM) ⇒ see appendix for the network architecture
2. Generative Adversarial Network (GAN) ⇒ consisting of a Generator-Discriminator pair (a minimal sketch follows below)
3. Variational Autoencoder (VAE)
See the background material on KEATS for more on their properties, and the Deep Learning book for more details on each generative modelling technique in classical ML. See also "Deep Fake" and https://thispersondoesnotexist.com/ for images, produced by generative models, of people who do not exist.
This is what we mean by generating fake data that "looks like the original". But what does this mean for financial data? We return to this later.
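As a concrete illustration of item 2 above, here is a minimal PyTorch GAN sketch on synthetic heavy-tailed "returns". The architectures, Student-t stand-in data and hyperparameters are all illustrative, not a recipe for a production market generator.

```python
# Minimal GAN sketch: a generator maps noise to fake return vectors, a
# discriminator tries to tell them from "real" (here: synthetic Student-t) ones.
import torch
import torch.nn as nn

torch.manual_seed(0)
dim, latent = 20, 8                        # length of return window, noise size

G = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, dim))
D = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCELoss()

real_data = torch.distributions.StudentT(df=4.0).sample((5_000, dim)) * 0.01

for step in range(1_000):
    real = real_data[torch.randint(0, 5_000, (128,))]
    fake = G(torch.randn(128, latent))
    # Discriminator step: push real -> 1, fake -> 0
    opt_d.zero_grad()
    loss_d = (bce(D(real), torch.ones(128, 1))
              + bce(D(fake.detach()), torch.zeros(128, 1)))
    loss_d.backward()
    opt_d.step()
    # Generator step: fool the discriminator (fake -> 1)
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(128, 1))
    loss_g.backward()
    opt_g.step()
```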
How is the question of Market Generators relevant? Why not stick to traditional models?
New deep learning applications allow us to approximate solutions that were previously not possible to derive by usual means.
Why not stick to traditional models for their training?
Example: Deep Hedging [BGTW18]
Reinforcement learning on a feedforward NN, with an objective function derived from hedging equations; a minimal sketch follows below.
Observation 1: If the hedging engine is trained on paths of the Heston (Black-Scholes) model, it identifies the known optimal hedging strategies under Heston (Black-Scholes).
Observation 2: Training the hedging engine on more complex paths (e.g. including transaction costs) yields insights into scenarios where solutions were so far only available in special cases.
But Heston & Black-Scholes are not realistic enough. ⇒ What if "more realistic", directly data-driven market generators are used for training?
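A minimal sketch of the idea (not the [BGTW18] implementation) follows: a feedforward network maps (time, spot) to a hedge ratio and is trained to minimise the variance of the hedged P&L of a short call on Black-Scholes paths. The quadratic objective and all parameters are illustrative; the paper works with more general convex risk measures.

```python
# Minimal Deep Hedging sketch: train a NN hedge ratio on simulated paths.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_paths, n_steps, S0, sigma, T, K = 5_000, 30, 1.0, 0.2, 30 / 365, 1.0
dt = T / n_steps

net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(),
                    nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for epoch in range(100):
    # Simulate Black-Scholes paths: the training "market scenarios"
    dW = torch.randn(n_paths, n_steps) * dt ** 0.5
    S = S0 * torch.exp(torch.cumsum(sigma * dW - 0.5 * sigma ** 2 * dt, dim=1))
    S = torch.cat([torch.full((n_paths, 1), S0), S], dim=1)

    pnl = -torch.clamp(S[:, -1] - K, min=0.0)       # short call payoff
    for t in range(n_steps):
        state = torch.stack([torch.full((n_paths,), t * dt), S[:, t]], dim=1)
        delta = net(state).squeeze(1)                # hedge ratio from the NN
        pnl = pnl + delta * (S[:, t + 1] - S[:, t])  # gains from rehedging
    opt.zero_grad()
    loss = pnl.var()             # a simple quadratic hedging objective
    loss.backward()
    opt.step()
```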
2) Modern Market Generators: New horizons? New challenges?
With increased computational power, "DNN-based generative models" can be used to model financial markets.
+ More modelling flexibility (a high number of parameters and no "restrictions" imposed by classical models)
+ "Direct" data-driven modelling
– Time-series aspect: a challenge (while DNNs are well developed for static images, generating time series was a challenge for a long time)
– Generative models are also more difficult to risk-manage (reliance on training data, explainability, robustness or sensitivity to training data, data privacy)
What is a (good) model in the ML era?
The concept of "model" (here, in the form of a numerical program):
Classical: (Program; Data) ⇒ Output
Now: (Architecture, OF; TrData) ⇒ Program
(Program, TestData) ⇒ Output
⇒ We need to redefine the concept of model governance too: Model = (Architecture, OF; Dataset)
Training data has become a part of the model!
⇒ Exposing the network to more/different datasets changes the "model": which datasets?
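A toy PyTorch sketch of this point, assuming nothing beyond the standard API: the same architecture and objective function, exposed to two different training datasets, yield two different "programs".

```python
# Minimal sketch of Model = (Architecture, OF; Dataset).
import torch
import torch.nn as nn

torch.manual_seed(0)

def train(architecture_fn, objective, data, steps=500):
    """(Architecture, OF; TrData) => Program"""
    net = architecture_fn()
    opt = torch.optim.Adam(net.parameters())
    x, y = data
    for _ in range(steps):
        opt.zero_grad()
        objective(net(x), y).backward()
        opt.step()
    return net                                   # the resulting "program"

arch = lambda: nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
mse = nn.MSELoss()
x = torch.linspace(-1, 1, 100).unsqueeze(1)

model_a = train(arch, mse, (x, x ** 2))          # trained on one dataset ...
model_b = train(arch, mse, (x, x ** 3))          # ... and on another
# Same architecture and objective, different data => different models:
print(model_a(torch.tensor([[0.5]])).item(),
      model_b(torch.tensor([[0.5]])).item())
```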
What is a (good) "model" in the ML era?
Important concerns regarding training data:
Issue 1: data privacy (data is often proprietary and not accessible, making it difficult to compare/regulate models' performance consistently across the industry)
Issue 2: availability (some data is simply scarce: not enough to train the network)
Issue 3: data quality (noise, biases, outliers, expressiveness)
A Key Question: Evaluating the "Quality" of a MG
How do we evaluate the quality of a market generator, i.e. whether the generated paths reflect the key properties of the market paths?
In other words: what is the right objective function for stochastic processes?
Usually, so far:
– Matching the distribution at a selected number of time points
– Checking whether the generated data reflects key stylised facts of financial markets
– Matching key properties of the paths via signatures (see the sketch after this list)
⇒ See some basic background in the extra material on KEATS.
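A minimal numpy sketch of the first and third criteria follows: marginal statistics at selected time points, and low-order terms of the expected path signature, computed by hand up to level 2 on the time-augmented path. The "generated" paths here are synthetic stand-ins.

```python
# Minimal sketch: compare generated and reference paths by (i) marginals at
# selected time points and (ii) low-order expected-signature terms.
import numpy as np

rng = np.random.default_rng(0)

def signature_level2(path, times):
    """Levels 1 and 2 of the signature of the 2D path t -> (t, X_t)."""
    inc = np.stack([np.diff(times), np.diff(path)], axis=1)  # increments of (t, X)
    level1 = inc.sum(axis=0)
    # Level 2 via Chen's identity for piecewise-linear paths:
    level2 = np.zeros((2, 2))
    partial = np.zeros(2)
    for d in inc:
        level2 += np.outer(partial, d) + 0.5 * np.outer(d, d)
        partial += d
    return level1, level2

times = np.linspace(0, 1, 101)
real = np.cumsum(rng.standard_normal((500, 100)) * 0.1, axis=1)
real = np.hstack([np.zeros((500, 1)), real])                  # reference paths
fake = real + rng.standard_normal(real.shape) * 0.01          # "generated" stand-ins

# (i) marginal check at selected time points (here t = 0.5 and t = 1):
for j in (50, 100):
    print(f"t={times[j]:.2f}: real std {real[:, j].std():.3f}"
          f" vs fake std {fake[:, j].std():.3f}")

# (ii) expected signature as a distributional fingerprint of the path law:
sig_real = np.mean([signature_level2(p, times)[1] for p in real], axis=0)
sig_fake = np.mean([signature_level2(p, times)[1] for p in fake], axis=0)
print("level-2 expected-signature gap:", np.abs(sig_real - sig_fake).max())
```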
Market Generators
Market Generators: help to address issues 1), 2) and 3) with regard to training data as part of "models" of the new era.
1. Data Anonymity
2. Fighting Overfitting
3. Outlier Detection
Applications of Market Generators: Data Anonymity
1. Standard methods so far: de-identification methods (relying on perturbations of the original dataset)
2. Measures of the degree of data anonymisation so far: k-anonymity, l-diversity, t-closeness and differential privacy (a toy k-anonymity check is sketched below)
3. The inherent problem is that a good balance is difficult to keep: either sensitive information can still be extracted from the perturbed data, or many important patterns of the data are lost.
⇒ A promising way out: generative models.
However, caution is needed here too. This is an active field of research.
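As a toy illustration of the first of these measures, the sketch below checks k-anonymity of a small table as the minimum group size over the quasi-identifier columns; the column names and records are hypothetical.

```python
# Minimal sketch: k-anonymity = smallest group size over quasi-identifiers.
from collections import Counter

records = [
    {"age_band": "30-40", "zip3": "SE1", "pnl": 1.2},
    {"age_band": "30-40", "zip3": "SE1", "pnl": -0.4},
    {"age_band": "40-50", "zip3": "EC2", "pnl": 0.7},
]
quasi_identifiers = ("age_band", "zip3")

groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
k = min(groups.values())
print(f"table is {k}-anonymous")   # -> 1-anonymous: the EC2 record is unique
```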
Applications are closely interlinked:
1. Data Anonymity
2. Fighting Overfitting (if the ML application is overfitted to the training data, attacker models can extract information: anonymity is compromised; the more complex the ML application, the more pronounced this problem is)
3. Outlier Detection (outliers are at greater risk of being identified even after small "perturbations": outliers' anonymity is compromised)
⇒ Use a generative model with a lean structure (to combat 2.) and with a bottleneck structure (to combat 3.). Such generative models are, for example, RBM and VAE, both with a built-in bottleneck structure: properties of the data are stored in a few parameters, and new data samples are built from this reduced stored information. The bottleneck structure ensures that the newly generated samples are similar to the original data in its most important features only (recall Principal Component Analysis), but not exactly the same as the original data (recall https://thispersondoesnotexist.com/).
Appendix
Example: Network architecture of a Restricted Boltzmann Machine (RBM)
[Figure: a two-layer network with visible units v_1, …, v_N (visible layer biases a_1, …, a_N) and hidden units h_1, …, h_M (hidden layer biases b_1, …, b_M), fully connected by network weights w_11, …, w_NM.]
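A minimal numpy sketch of training this RBM with one step of contrastive divergence (CD-1) on toy binary data follows; the notation (v, a, h, b, w) matches the figure labels, while the data and hyperparameters are illustrative.

```python
# Minimal RBM sketch trained with one step of contrastive divergence (CD-1):
# visible units v (biases a), hidden units h (biases b), weights w.
import numpy as np

rng = np.random.default_rng(0)
N, M = 12, 6                                   # visible / hidden units

w = rng.normal(0, 0.01, (N, M))                # network weights w_11 ... w_NM
a = np.zeros(N)                                # visible biases a_1 ... a_N
b = np.zeros(M)                                # hidden biases  b_1 ... b_M

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

data = (rng.random((1_000, N)) < 0.3).astype(float)   # toy binary training data
lr = 0.05

for epoch in range(50):
    for v0 in data:
        # Positive phase: sample hidden units given the data
        p_h0 = sigmoid(b + v0 @ w)
        h0 = (rng.random(M) < p_h0).astype(float)
        # Negative phase: one Gibbs step back to a reconstructed visible state
        p_v1 = sigmoid(a + h0 @ w.T)
        v1 = (rng.random(N) < p_v1).astype(float)
        p_h1 = sigmoid(b + v1 @ w)
        # CD-1 parameter update
        w += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
        a += lr * (v0 - v1)
        b += lr * (p_h0 - p_h1)
```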