Lecture 11: Online Learning
Prof. Michael R. Lyu
Computer Science & Engineering Dept.
The Chinese University of Hong Kong
CMSC5741 Big Data Tech. & Apps.
1
A Motivating Example: Spam Filtering
2
(Figure: incoming emails pass through a spam filter, which separates spam from non-spam.)
Traditional Method: Training
3
A batch of emails
Feature extraction: x
Linear classifier: w^T x
Model selection
Traditional Method: Test
4
A new mail
Feature extraction: x
Linear classifier: w^T x
Determination
Prediction: non-spam, y' = +1; spam, y' = -1
Online Protocol
5
An online mail
Feature extraction: x
Linear classifier: w^T x
Prediction: non-spam, y' = +1; spam, y' = -1
The user judges the prediction
Update the model
Outline
Introduction
Learning paradigms
Online learning and its applications
Online learning algorithms
Perceptron
Online non-sparse learning
Online sparse learning
Online unsupervised learning
Conclusion
6
Learning Paradigms Overview
Learning paradigms
Supervised learning
Semisupervised learning
Transductive learning
Unsupervised learning
Universum learning
Transfer learning
9
Supervised Learning
Train on labeled data
Test on test data
(Figure: horse and donkey; labeled training images marked X and -, test images marked ?)
11
Semisupervised Learning
Train on labeled and unlabeled data
Test on test data
(Figure: horse and donkey; labeled images marked X and -, unlabeled images marked o, test images marked ?)
12
Transductive Learning
Train on labeled and test data
Test on test data
(Figure: horse and donkey; labeled images marked X and -, unlabeled test images marked o and ?)
13
Unsupervised Learning
Train on unlabeled data
Test on test data (Test reconstruction error)
(Figure: unlabeled data plotted along "jump" and "height" axes, test points marked ?)
14
Universum Learning
Train on labeled and universum data
Test on test data
(Figure: train on horse and donkey labeled X and -; the universum class is mule; test images marked ?)
15
Transfer Learning
Train on labeled data from source and target domains
Test on test data
(Figure: source domain is goat and sheep; target domain is horse and donkey; test images marked ?)
16
Outline
Introduction
Learning paradigms
Online learning and its applications
Online learning algorithms
Perceptron
Online non-sparse learning
Online sparse learning
Online unsupervised learning
Conclusion
17
What is Online Learning?
Batch/Offline learning
Observe a batch of training data
Learn a model from them
Predict new samples accurately
Online learning
Observe a sequence of data
Learn a model incrementally as instances come
Make the sequence of online predictions accurately
18
(Figure: online loop - make a prediction, the user gives the true response, update the model.)
Online learning is the process of answering a sequence of questions given (maybe partial) knowledge of the correct answers to previous questions and possibly additional available information. [Shal11]
Difference between online and incremental learning:
Incremental learning includes informationally incremental learning and operationally incremental learning.
Informationally incremental learning: works incrementally and is not permitted to look back at the whole history of information presented.
Operationally incremental learning: is permitted to look back, but cannot exploit the past information in an effective way.
In short, incremental learning works incrementally without looking back.
Online learning: the key defining characteristic is that soon after a prediction is made, the true label of the instance is revealed. This information can then be used to refine the prediction hypothesis used by the algorithm. The goal of the algorithm is to make predictions that are close to the true labels.
Stochastic gradient descent: in machine learning, the problem is usually to minimize an objective function.
Online Prediction Algorithm
An initial prediction rule f_0(·)
For t = 1, 2, …
We observe x_t and make a prediction f_{t-1}(x_t)
We observe the true outcome y_t and then compute a loss l(f_{t-1}(x_t), y_t)
The online algorithm updates the prediction rule using the new example and constructs f_t
19
(Figure: online loop - predict, the user reveals y_t, update f_t.)
Online Prediction Algorithm
The total error of the method is Σ_{t=1}^T l(f_{t-1}(x_t), y_t)
Goal: make this error as small as possible (see the code sketch below)
Predicting the unknown future one step at a time: similar to generalization error
20
Regret Analysis
f*: the optimal prediction function from a class H, e.g., the class of linear classifiers,
with minimum error after seeing all examples: f* = argmin_{f ∈ H} Σ_{t=1}^T l(f(x_t), y_t)
Regret for the online learning algorithm:
regret = (1/T) Σ_{t=1}^T [ l(f_{t-1}(x_t), y_t) - l(f*(x_t), y_t) ]
21
We want the regret to be as small as possible
Why Low Regret?
Regret for the online learning algorithm: regret = (1/T) Σ_{t=1}^T [ l(f_{t-1}(x_t), y_t) - l(f*(x_t), y_t) ]
Advantages
We do not lose much from not knowing future events
We can perform almost as well as someone who observes the entire sequence and picks the best prediction strategy in hindsight
We can also compete with changing environment
22
Advantages of Online Learning
Meets many applications where data arrives sequentially and predictions are required on the fly
Avoids re-training when new data arrives
Applicable in adversarial and competitive environments (e.g., spam filtering, the stock market)
Strong adaptability to changing environment
High efficiency and excellent scalability
Simple to understand and easy to implement
Easy to be parallelized
Theoretical guarantees
23
Where to Apply Online Learning?
24
Online Learning
Social Media
Finance
Internet Security
Robot Motion Planning
Online Learning for Social Media
Recommendation, sentiment/emotion analysis
25
Online Learning for Robot Motion Planning
Tasks
Exploring an unknown terrain
Finding a destination
27
(Figure examples: Rock-Paper-Scissors, you vs. the computer; a robot dog)
Online Learning for Internet Security
Electronic business sectors
Spam email filtering
Fraudulent credit card transaction detection
Network intrusion detection systems, etc.
29
Online Learning for Financial Decision
Financial decision
Online portfolio selection
Sequential investment, etc.
31
Outline
Introduction
Learning paradigms
Online learning and its applications
Online learning algorithms
Perceptron
Online non-sparse learning
Online sparse learning
Online unsupervised learning
Conclusion
32
Perceptron Algorithm (F. Rosenblatt, 1958)
One of the oldest machine learning algorithms
Online algorithm for learning a linear threshold function with small error
34
Perceptron Algorithm (F. Rosenblatt, 1958)
Goal: find a linear classifier with small error
35
If there is no error, keep the same weights; otherwise, update: w ← w + y_t x_t.
Intuition Explanation
Want a positive margin: y_t (w^T x_t) > 0
Effect of the Perceptron update on the margin: y_t (w + y_t x_t)^T x_t = y_t (w^T x_t) + ||x_t||^2
So the margin increases, since ||x_t||^2 > 0 (see the code sketch below)
36
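A minimal sketch of the Perceptron just described, assuming the data arrive as rows of a numpy array X with labels y in {-1, +1}; the epochs cap is an implementation convenience, not part of the original algorithm.

import numpy as np

def perceptron(X, y, epochs=10):
    # Update w <- w + y_t x_t only on mistakes, i.e., when y_t (w.x_t) <= 0.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for x_t, y_t in zip(X, y):
            if y_t * np.dot(w, x_t) <= 0:      # misclassified (or on the boundary)
                w += y_t * x_t                 # update increases the margin y_t (w.x_t)
                mistakes += 1
        if mistakes == 0:                      # separable data: stop once error-free
            break
    return w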
Geometric View
Initialization: start with w0
t=1: (x1, y1 = +1) is misclassified; update: w1 = w0 + y1 x1
t=2: (x2, y2 = -1) is misclassified; update: w2 = w1 + y2 x2
Continue…
41
In-class Practice
Go to practice
42
Perceptron Mistake Bound
Consider a w* that separates the data: y_i (w*^T x_i) > 0 for all i
Define the margin γ = min_i y_i (w*^T x_i) / ||w*||, and let R = max_i ||x_i||
The number of mistakes the Perceptron makes is at most (R/γ)^2
43
The larger the norm of the inputs (R), the larger the mistake bound
The larger the margin γ, the smaller the bound (the more confident the separation)
Proof of Perceptron Mistake Bound [Novikoff, 1963]
Proof: Let v_k be the hypothesis before the k-th mistake, and let u = w*/||w*|| (a unit vector), so y_i (x_i^T u) ≥ γ for all i. Assume the k-th mistake occurs on the input example (x_i, y_i).
First, v_{k+1}^T u = (v_k + y_i x_i)^T u = v_k^T u + y_i (x_i^T u) ≥ v_k^T u + γ; hence after k mistakes, v_{k+1}^T u ≥ k γ.
Second, ||v_{k+1}||^2 = ||v_k||^2 + 2 y_i (v_k^T x_i) + ||x_i||^2 ≤ ||v_k||^2 + R^2, since y_i (v_k^T x_i) ≤ 0 on a mistake and ||x_i||^2 ≤ R^2; hence ||v_{k+1}||^2 ≤ k R^2.
Hence, k γ ≤ v_{k+1}^T u ≤ ||v_{k+1}|| ≤ √k R, which gives k ≤ (R/γ)^2.
A small numerical check of this bound follows below.
44
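A small numerical check of the bound above, under the assumption that the data are generated to be separable with some margin relative to a chosen unit vector u; the dimension, sample size, and the 0.2 margin cutoff are arbitrary choices.

import numpy as np

rng = np.random.default_rng(1)
d = 10
u = rng.normal(size=d)
u /= np.linalg.norm(u)                       # unit-norm separator w*/||w*||

# Sample points and keep only those at distance >= 0.2 from the hyperplane.
X = rng.normal(size=(2000, d))
X = X[np.abs(X @ u) >= 0.2]
y = np.sign(X @ u)

R = np.max(np.linalg.norm(X, axis=1))        # bound on the input norms
gamma = np.min(y * (X @ u))                  # margin achieved by u
bound = (R / gamma) ** 2                     # Novikoff mistake bound

w = np.zeros(d)
mistakes = 0
for _ in range(100):                         # cycle through the data until error-free
    errors = 0
    for x_t, y_t in zip(X, y):
        if y_t * np.dot(w, x_t) <= 0:
            w += y_t * x_t
            mistakes += 1
            errors += 1
    if errors == 0:
        break

print(f"mistakes = {mistakes}, bound = {bound:.1f}")   # mistakes never exceed the bound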
Outline
Introduction
Learning paradigms
Online learning and its applications
Online learning algorithms
Perceptron
Online non-sparse learning
Online sparse learning
Online unsupervised learning
Conclusion
45
Overview
46
Online Learning
Non-sparse learning
Sparse learning
Unsupervised learning
Online/Stochastic Gradient Descent
Online gradient descent
Stochastic gradient descent
47
(Figure: online loop - make a prediction, observe y_t from the user, update f_t.)
In stochastic (or "online") gradient descent, the true gradient of the objective is approximated by the gradient at a single example: w ← w - η ∇Q_i(w), where Q_i denotes the loss on example i (see the sketch below).
As the algorithm sweeps through the training set, it performs the above update for each training example. Several passes can be made over the training set until the algorithm converges. If this is done, the data can be shuffled for each pass to prevent cycles.
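A sketch of stochastic gradient descent with per-pass shuffling as the note above describes, assuming a linear model with squared loss; the function name and parameters are illustrative.

import numpy as np

def sgd(X, y, eta=0.01, passes=5, seed=0):
    # For each example i: w <- w - eta * gradient of 0.5 * (w.x_i - y_i)^2.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    idx = np.arange(len(X))
    for _ in range(passes):
        rng.shuffle(idx)                      # reshuffle each pass to prevent cycles
        for i in idx:
            grad = (np.dot(w, X[i]) - y[i]) * X[i]
            w -= eta * grad
    return w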
Outline
Introduction
Learning paradigms
Online learning and its applications
Online learning algorithms
Perceptron
Online non-sparse learning
Online sparse learning
Online unsupervised learning
Conclusion
48
Online Non-Sparse Learning
The decision function can be linear (e.g., f(x) = w^T x) or non-linear (e.g., kernel-based)
First order learning methods
Online gradient descent (Zinkevich, 2003)
Passive aggressive learning (Crammer et al., 2006)
Others (including but not limited to)
ALMA: A New Approximate Maximal Margin Classification Algorithm (Gentile, 2001)
ROMMA: Relaxed Online Maximum Margin Algorithm (Li and Long, 2002)
MIRA: Margin Infused Relaxed Algorithm (Crammer and Singer, 2003)
DUOL: A Double Updating Approach for Online Learning (Zhao et al., 2009)
49
Online margin-based prediction algorithms
Online Gradient Descent (OGD)
(Zinkevich, 2003)
Online convex optimization
Consider a convex objective function f_t(w), where w ∈ S and S is a bounded convex set
Update by Online Gradient Descent (OGD) or Stochastic Gradient Descent (SGD):
w_{t+1} = Π_S(w_t - η_t ∇f_t(w_t)), where η_t is a learning rate
(the inner step is gradient descent; Π_S is the projection back onto S)
50
Provides a framework to prove regret bounds for online convex optimization.
Online learning does not necessarily need gradient descent.
But classification/regression problems have an objective function to minimize, and gradient descent is one typical solution; ∇f_t(w_t) is the gradient (derivative) with respect to w_t.
Stochastic gradient descent is equivalent to online gradient descent in this case.
The contribution of this work triggered the investigation of online convex optimization, where the objective is a convex function and optimization is performed in the online mode.
Online Gradient Descent (OGD) (Zinkevich, 2003)
For t = 1, 2, …
An unlabeled sample x_t arrives
Make a prediction based on the existing weights: ŷ_t = sgn(w_t^T x_t)
Observe the true class label y_t ∈ {+1, -1}
Update the weights by w_{t+1} = w_t - η_t ∇l(w_t; (x_t, y_t)), where η_t is a learning rate (see the sketch below)
51
(Figure: online loop - predict, the user reveals y_t, update w_{t+1}.)
This is the first generalized treatment of online convex optimization; an O(\sqrt{T}) regret bound is established for the gradient descent algorithm.
51
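A sketch of OGD with projection, assuming the feasible set S is an L2 ball of radius B and the per-round loss is the hinge loss; the decaying rate eta_t = eta0 / sqrt(t) is a common choice, not something prescribed by the slide.

import numpy as np

def project_l2_ball(w, B):
    # Projection onto S = {w : ||w||_2 <= B}.
    norm = np.linalg.norm(w)
    return w if norm <= B else (B / norm) * w

def ogd(stream, d, B=10.0, eta0=0.5):
    w = np.zeros(d)
    for t, (x_t, y_t) in enumerate(stream, start=1):
        eta_t = eta0 / np.sqrt(t)                      # decaying learning rate
        if y_t * np.dot(w, x_t) < 1.0:                 # subgradient of the hinge loss
            grad = -y_t * x_t
        else:
            grad = np.zeros(d)
        w = project_l2_ball(w - eta_t * grad, B)       # gradient step, then projection
    return w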
Passive-Aggressive Online Learning (Crammer et al., 2006)
Each example defines a set of consistent hypotheses: C_t = {w : l(w; (x_t, y_t)) = 0}
The new vector w_{t+1} is set to be the projection of w_t onto C_t
52
Classification
Regression
Uniclass
Assume there exist a weight vector w* and an insensitivity parameter ε for which the data are perfectly realizable. Denote the consistent set by C.
For classification, C is a half-space: C = {w : -y_t (w · x_t) ≤ ε}.
For regression, C is an ε-hyper-slab: C = {w : |w · x_t - y_t| ≤ ε}.
For uniclass, C is a ball of radius ε centered at x_t: C = {w : ||w - x_t|| ≤ ε}.
The optimization problem attempts to keep w_{t+1} as close to w_t as possible, while forcing w_{t+1} to achieve zero loss on the most recent example.
The resulting algorithm is passive whenever the loss is zero.
Passive-Aggressive
53
Passive: when the loss is zero, keep w unchanged. Aggressive: otherwise, update w to achieve zero loss on the most recent example.
Passive Aggressive Online Learning
(Crammer et al., 2006)
PA (Binary classification)
PA-I (C-SVM)
PA-II (Relaxed C-SVM)
Closed-form solution
54
PA: hinge loss for binary classification
PA-I: adds a slack variable with a linear (hinge-like) penalty
PA-II: the slack variable need not be non-negative; it is penalized quadratically (squared penalty)
Passive Aggressive Online Learning
(Crammer et al., 2006)
Algorithm
Objective: w_{t+1} = argmin_w (1/2)||w - w_t||^2 s.t. l(w; (x_t, y_t)) = 0 (PA-I and PA-II add a slack term with parameter C)
Closed-form solutions: w_{t+1} = w_t + τ_t y_t x_t, with τ_t = l_t/||x_t||^2 (PA), τ_t = min{C, l_t/||x_t||^2} (PA-I), τ_t = l_t/(||x_t||^2 + 1/(2C)) (PA-II), where l_t is the hinge loss at round t (see the code sketch after the notes)
55
The Perceptron corresponds to τ_t = 1.
A margin-based online learning framework covering binary classification, C-SVM-style formulations, and regression.
The core of the construction can be viewed as solving a support vector machine on a single example, while replacing the norm constraint of the SVM with a proximity constraint to the current classifier. The benefit is twofold: first, a closed-form solution for the next classifier; second, a unified analysis of the cumulative loss for various online algorithms (regression, uniclass prediction, multiclass problems, and sequence prediction tasks).
The resulting algorithm is passive whenever the hinge loss is zero. If the loss is positive, the algorithm aggressively forces w_{t+1} to satisfy the constraint regardless of the required step size.
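A sketch of the PA, PA-I, and PA-II closed-form updates for binary classification, following the step sizes τ_t stated above (Crammer et al., 2006); the function signature is illustrative.

import numpy as np

def pa_update(w, x, y, variant="PA", C=1.0):
    # One passive-aggressive step for binary classification, y in {-1, +1}.
    loss = max(0.0, 1.0 - y * np.dot(w, x))            # hinge loss on (x, y)
    if loss == 0.0:
        return w                                       # passive: constraint already met
    sq_norm = np.dot(x, x)
    if variant == "PA":
        tau = loss / sq_norm
    elif variant == "PA-I":
        tau = min(C, loss / sq_norm)
    else:                                              # "PA-II"
        tau = loss / (sq_norm + 1.0 / (2.0 * C))
    return w + tau * y * x                             # aggressive: move onto the zero-loss set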
Online Non-Sparse Learning
First order methods
Learn a linear weight vector of the model (first-order information only)
Pros and Cons
Simple and easy to implement
Efficient and scalable for high-dimensional data
Relatively slow convergence rate
56
Online Non-Sparse Learning
Second order online learning methods
Update the weight vector w by maintaining and exploiting both first-order and second-order information
Some representative methods (not an exhaustive list)
SOP: Second Order Perceptron (Cesa-Bianchi et al., 2005)
CW: Confidence Weighted learning (Dredze et al., 2008)
AROW: Adaptive Regularization of Weights (Crammer et al., 2009)
IELLIP: Online Learning by Ellipsoid Method (Yang et al., 2009)
NHERD: Gaussian Herding (Crammer & Lee 2010)
NAROW: New variant of AROW algorithm (Orabona & Crammer 2010)
SCW: Soft Confidence Weighted (SCW) (Hoi et al., 2012)
Pros and Cons
Faster convergence rate
Expensive for high-dimensional data
Relatively sensitive to noise
57
Outline
Introduction
Learning paradigms
Online learning and its applications
Online learning algorithms
Perceptron
Online non-sparse learning
Online sparse learning
Online unsupervised learning
Conclusion
58
Sparse Learning
59
Natural Images
Learned bases (f1 , …, f64): “Edges”
Test example: x ≈ 0.8 * f36 + 0.3 * f42 + 0.5 * f63 (a sparse combination of the learned bases)
Online Sparse Learning
Motivation
Space constraint: RAM overflow
Test-time constraint
How to induce sparsity in the weights of online learning algorithms?
60
Online Sparse Learning
Objective function: an L1-regularized loss, e.g., min_w Σ_t L(w; z_t) + g ||w||_1
Problem in online learning
Standard stochastic gradient descent does not yield a sparse solution
Some representative work
Truncated gradient (Langford et al., 2009)
FOBOS: Forward Looking Subgradients (Duchi and Singer, 2009)
Dual averaging (Xiao, 2010)
etc.
61
Truncated Gradient (Langford et al., 2009)
Objective function: the L1-regularized loss above
Stochastic gradient descent: w_{i+1} = w_i - η ∇L(w_i; z_i)
Simple coefficient rounding: set a coefficient to 0 when it is small (|w_j| ≤ θ)
62
Truncated gradient: impose sparsity by modifying stochastic gradient descent. SGD alone does not attain a sparse solution; simple rounding truncates w_j to 0 whenever |w_j| falls below θ.
Simple Coefficient Rounding vs. Less aggressive truncation
Truncated Gradient (Langford et al., 2009)
63
If i/K is an integer, update by rounding. How to select K is a main drawback of the method:
If K is too large, the model keeps too many non-zero elements.
If K = 1 and the step size is small, rounding pulls the weights to zero prematurely.
Directly rounding to zero is too aggressive.
A less aggressive version shrinks the coefficients toward zero by a smaller amount.
Truncated Gradient (Langford et al., 2009)
The amount of shrinkage is measured by a gravity parameter g ≥ 0 (see the sketch below)
When g = 0, the update rule is identical to standard SGD
The truncation can be performed every K online steps
Loss functions:
Logistic
SVM (hinge)
Least square
64
The larger the parameters g and \theta are, the more sparsity is incurred.
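A sketch of the truncated-gradient update: a plain SGD step (here on the hinge loss), followed every K steps by the truncation operator with gravity g and threshold θ; the loss choice and the parameter values are illustrative.

import numpy as np

def truncate(w, alpha, theta):
    # T1 operator: shrink coefficients with |w_j| <= theta toward zero by at most alpha.
    out = w.copy()
    pos = (w >= 0) & (w <= theta)
    neg = (w < 0) & (w >= -theta)
    out[pos] = np.maximum(0.0, w[pos] - alpha)
    out[neg] = np.minimum(0.0, w[neg] + alpha)
    return out                                         # entries with |w_j| > theta are untouched

def truncated_gradient(stream, d, eta=0.1, g=0.05, theta=0.5, K=10):
    w = np.zeros(d)
    for i, (x, y) in enumerate(stream, start=1):
        if y * np.dot(w, x) < 1.0:                     # hinge-loss subgradient step (plain SGD)
            w += eta * y * x
        if i % K == 0:                                 # truncate every K online steps
            w = truncate(w, K * g * eta, theta)        # larger g (or theta) => more sparsity
    return w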
Theoretical result (T: No. of samples)
With an appropriately chosen learning rate, the regret is O(\sqrt{T})
Truncated Gradient (Langford et al., 2009)
65
An O(\sqrt{T}) regret bound is established.
Truncated gradient is consistently competitive with the other two online algorithms and significantly outperforms them on some problems, which suggests its effectiveness.
It is interesting to observe that the qualitative behavior of truncated gradient is often similar to LASSO, especially when very sparse weight vectors are allowed; LASSO usually performs worse when the allowed number of nonzero weights is set too large.
These experiments shed light on the relative strengths of the algorithms in terms of feature sparsification. For large data sets such as Big_Ads, only truncated gradient, coefficient rounding, and the sub-gradient algorithms are applicable.
Dual Averaging (Xiao, 2010)
Objective function: a regularized loss Σ_t L(w; z_t) + λ ||w||_1 (the same sparse-learning setting as above)
Problem: truncated gradient does not produce truly sparse weights when the learning rate is small
Fix: dual averaging, which keeps two state representations:
the parameter w_t
the average (sub)gradient vector ḡ_t = (1/t) Σ_{i=1}^t ∇L(w_i; z_i)
66
Dual Averaging (Xiao, 2010)
w_{t+1} has an entry-wise closed-form solution (see the sketch below)
Advantage: produces truly sparse weights
Disadvantage: must keep a (generally non-sparse) average subgradient vector
67
Convergence and Regret
Algorithm
Average regret
Theoretical bound: similar to gradient descent
68
An average regret bound of O(1/\sqrt{T}) is established, matching (stochastic) gradient descent.
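A sketch of L1-regularized dual averaging with the simple auxiliary function h(w) = ||w||^2 / 2, which yields the entry-wise closed form (soft-thresholding of the average gradient) mentioned above; the hinge-loss subgradient and the parameters lam, gamma are illustrative choices.

import numpy as np

def l1_rda(stream, d, lam=0.1, gamma=1.0):
    # Keep the running average subgradient g_bar; w_{t+1} has a closed form from g_bar.
    w = np.zeros(d)
    g_bar = np.zeros(d)
    for t, (x, y) in enumerate(stream, start=1):
        g = -y * x if y * np.dot(w, x) < 1.0 else np.zeros(d)   # hinge subgradient at w_t
        g_bar = ((t - 1) * g_bar + g) / t                       # average of all subgradients
        shrunk = np.sign(g_bar) * np.maximum(0.0, np.abs(g_bar) - lam)   # soft-threshold
        w = -(np.sqrt(t) / gamma) * shrunk                      # exactly 0 where |g_bar_j| <= lam
    return w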
Variants of Online Sparse Learning Models
Online feature selection (OFS)
A variant of sparse online learning
The key difference is that OFS focuses on selecting a fixed-size subset of features during the online learning process
Could be used as an alternative tool for batch feature selection when dealing with big data
Other existing work
Online learning for group lasso (Yang et al., 2010) and online learning for multi-task feature selection (Yang et al., 2013), which select features in a grouped manner or across similar tasks
69
Online Sparse Learning
Objective
Induce sparsity in the weights of online learning algorithms
Pros and Cons
Simple and easy to implement
Efficient and scalable for high-dimensional data
Relatively slow convergence rate
No perfect way to attain a sparse solution yet
70
Outline
Introduction
Learning paradigms
Online learning and its applications
Online learning algorithms
Perceptron
Online non-sparse learning
Online sparse learning
Online unsupervised learning
Conclusion
71
Online Unsupervised Learning
Assumption: the data are generated from some underlying parametric probability density function
Goal: estimate the parameters of the density to give a suitable compact representation
72
Online Unsupervised Learning
Some representative work
Online singular value decomposition (SVD) (Brand, 2003)
Online principal component analysis (PCA) (Warmuth and Kuzmin, 2006)
Online dictionary learning for sparse coding (Mairal et al. 2009)
Online learning for latent Dirichlet allocation (LDA) (Hoffman et al., 2010)
Online variational inference for the hierarchical Dirichlet process (HDP) (Wang et al. 2011)
Online Learning for Collaborative Filtering (Ling et al. 2012)
…
73
SVD: Definition
A: input data matrix, an m × n matrix (e.g., m documents, n terms)
U: left singular vectors, an m × r matrix (documents, topics)
Σ: singular values, an r × r diagonal matrix (strength of each "topic"); r is the rank of matrix A, and the singular values are nonnegative and sorted
V: right singular vectors, an n × r matrix (terms, topics)
A = U Σ V^T = Σ_i σ_i u_i v_i^T, where each σ_i is a scalar and u_i, v_i are vectors (see the numpy example below)
74
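A small numpy illustration of the definition on a toy 4 × 3 matrix; the numbers are arbitrary.

import numpy as np

A = np.array([[1., 0., 2.],
              [0., 3., 1.],
              [2., 1., 0.],
              [1., 1., 1.]])                          # m x n "document-term" matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)      # thin SVD: A = U diag(s) Vt
print(s)                                              # singular values: nonnegative, sorted
print(np.allclose(A, U @ np.diag(s) @ Vt))            # exact reconstruction -> True

k = 2                                                 # keep the k strongest "topics"
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]           # best rank-k approximation of A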
Online SVD (Brand, 2003)
Challenges: storage and computation
Idea: an incremental algorithm computes the principal eigenvectors of a matrix without storing the entire matrix in memory
75
Online SVD (Brand, 2003)
76
Online SVD (Brand, 2003)
Complexity: each update works only with the current SVD factors and the new data, not the full matrix
Store: only the factors U, Σ, V
The online SVD has more error, but it is comparable to batch PCA (SVD)
77
Online SVD
Unsupervised learning: minimize the reconstruction error (an incremental-update sketch follows below)
Each update will increase the rank by at most one, until a user-specified ceiling is reached
Pros and Cons
Simple to implement
Fast computation
Comparable performance
Lack of theoretical guarantee
78
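A sketch of a Brand-style incremental update that appends one column to an existing thin SVD without ever storing the full matrix; this is a simplified rendition (no down-dating, naive rank control), not the exact 2003 algorithm.

import numpy as np

def svd_append_column(U, s, Vt, c, max_rank=None):
    # Given A ~ U @ diag(s) @ Vt, return a thin SVD of [A, c] using only the factors.
    m, r = U.shape
    p = U.T @ c                                  # coordinates of c inside the current subspace
    resid = c - U @ p                            # component of c orthogonal to that subspace
    k = np.linalg.norm(resid)
    j = resid / k if k > 1e-10 else np.zeros(m)  # new left basis direction (if any)

    # Small (r+1) x (r+1) core matrix K = [[diag(s), p], [0, k]]; [A, c] = [U, j] K [[Vt, 0], [0, 1]].
    K = np.zeros((r + 1, r + 1))
    K[:r, :r] = np.diag(s)
    K[:r, r] = p
    K[r, r] = k if k > 1e-10 else 0.0
    Uk, sk, Vkt = np.linalg.svd(K)               # SVD of a tiny matrix: cheap

    U_new = np.hstack([U, j[:, None]]) @ Uk      # rotate the enlarged left basis
    n = Vt.shape[1]
    V_big = np.zeros((n + 1, r + 1))             # old right vectors, padded for the new column
    V_big[:n, :r] = Vt.T
    V_big[n, r] = 1.0
    Vt_new = (V_big @ Vkt.T).T

    if max_rank is not None and len(sk) > max_rank:    # enforce a user-specified rank ceiling
        U_new, sk, Vt_new = U_new[:, :max_rank], sk[:max_rank], Vt_new[:max_rank, :]
    return U_new, sk, Vt_new

# Sanity check: rebuild the SVD of a random matrix one column at a time.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))
U, s, Vt = np.linalg.svd(A[:, :1], full_matrices=False)
for col in range(1, 4):
    U, s, Vt = svd_append_column(U, s, Vt, A[:, col])
print(np.allclose(A, U @ np.diag(s) @ Vt))             # True up to numerical error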
Outline
Introduction
Learning paradigms
Online learning and its applications
Online learning algorithms
Perceptron
Online non-sparse learning
Online sparse learning
Online unsupervised learning
Conclusion
79
One-slide Takeaway
Basic concepts
What is online learning?
What is regret analysis?
Online learning algorithms
Perceptron
Online gradient descent
Passive aggressive
Truncated gradient
Dual averaging
Online SVD
80
Resources
Book and Video:
Prediction, Learning, and Games. N. Cesa-Bianchi and G. Lugosi. Cambridge University Press, 2006.
[Shal11] Online Learning and Online Convex Optimization. Shai Shalev-Shwartz. Foundations and Trends in Machine Learning, Vol. 4, No. 2, 2011, 107-194, DOI: 10.1561/2200000018
http://videolectures.net/site/search/?q=online+learning
Software:
Pegasos: http://www.cs.huji.ac.il/~shais/code/index.html
VW: hunch.net/~vw/
SGD by Leon Bottou: http://leon.bottou.org/projects/sgd
81
References
M. Brand. Fast online svd revisions for lightweight recommender systems. In SDM, 2003.
N. Cesa-Bianchi, A. Conconi, and C. Gentile. A second-order perceptron algorithm. SIAM J. Comput., 34(3):640–668, 2005.
K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive-aggressive algorithms. Journal of Machine Learning Research, 7:551–585, 2006.
K. Crammer, A. Kulesza, and M. Dredze. Adaptive regularization of weight vectors. In NIPS, pages 414–422, 2009.
K. Crammer and D. D. Lee. Learning via gaussian herding. In NIPS, pages 451–459, 2010.
K. Crammer and Y. Singer. Ultraconservative online algorithms for multiclass problems. Journal of Machine Learning Research, 3:951–991, 2003.
M. Dredze, K. Crammer, and F. Pereira. Confidence-weighted linear classification. In ICML, pages 264–271, 2008.
C. Gentile. A new approximate maximal margin classification algorithm. Journal of Machine Learning Research, 2:213–242, 2001.
M. D. Hoffman, D. M. Blei, and F. R. Bach. Online learning for latent dirichlet allocation. In NIPS, pages 856–864, 2010.
S. C. H. Hoi, J. Wang, and P. Zhao. Exact soft confidence-weighted learning. In ICML, 2012.
82
References
J. Langford, L. Li, and T. Zhang. Sparse online learning via truncated gradient. Journal of Machine Learning Research, 10:777–801, 2009.
Y. Li and P. M. Long. The relaxed online maximum margin algorithm. Machine Learning, 46(1-3):361–387, 2002.
Guang Ling, Haiqin Yang, Irwin King and M.R. Lyu. Online Learning for Collaborative Filtering, IJCNN, Brisbane, Australia, 2012.
P. L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal., 16(6):964–979, 1979.
J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In ICML, page 87, 2009.
Y. Nesterov. Gradient methods for minimizing composite objective function. CORE Discussion Paper 2007/76, Catholic University of Louvain, Center for Operations Research and Econometrics, 2007.
F. Orabona and K. Crammer. New adaptive algorithms for online classification. In NIPS, pages 1840–1848, 2010.
F. Rosenblatt. The Perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386–408, 1958.
83
References
C. Wang, J. W. Paisley, and D. M. Blei. Online variational inference for the hierarchical dirichlet process. Journal of Machine Learning Research – Proceedings Track, 15:752–760, 2011.
M. K. Warmuth and D. Kuzmin. Randomized pca algorithms with regret bounds that are logarithmic in the dimension. In NIPS, pages 1481–1488, 2006.
S. J. Wright, R. D. Nowak, and M. A. T. Figueiredo. Sparse reconstruction by separable approximation. IEEE Transactions on Signal Processing, 57(7):2479–2493, 2009.
L. Xiao. Dual averaging methods for regularized stochastic learning and online optimization. Journal of Machine Learning Research, 11:2543–2596, 2010.
H. Yang, M. R. Lyu, and I. King. Efficient online learning for multi-task feature selection. ACM Transactions on Knowledge Discovery from Data, 2013.
H. Yang, Z. Xu, I. King, and M. R. Lyu. Online learning for group lasso. In ICML, pages 1191–1198, Haifa, Israel, 2010.
L. Yang, R. Jin, and J. Ye. Online learning by ellipsoid method. In ICML, page 145, 2009.
P. Zhao, S. C. H. Hoi, and R. Jin. Duol: A double updating approach for online learning. In NIPS, pages 2259–2267, 2009.
M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In ICML, pages 928–936, 2003.
84
In-class Practice
We have two data points; how can we get a classifier by the Perceptron learning rule?
Assume
the first data point is in one class
the second data point is in the other class
The data points are linearly separable and can be presented repeatedly (for validation).
85