
Prof. Dr. Matei Demetrescu University of Kiel Institute for Statistics and Econometrics Summer 2020
Data Mining
Course description
The course provides a statistical introduction to methods designed for analyzing large and complex data sets and relations. The focus is on regression and classification methods. We start in a parametric setup with linearity, but move on relatively quickly to issues that arise in practice, such as regressor (feature) selection, and look at model selection techniques like cross-validation. The course concludes with a glimpse at specific nonparametric techniques such as regression and decision trees. Selected case studies are discussed in a computer class using R. After completing the course, you will be able to conduct complex data analyses on your own.
Prerequisites
• (Advanced) Statistics I+II or equivalent
Outline
1. Statistical learning
2. Prediction and classification
3. Using linear models
4. Model selection and error estimation
5. Dealing with many features: Shrinkage and dimensionality reduction
6. Getting nonlinear: Local regression, trees and more
7. Ensemble methods: Bagging, boosting, and model averaging
8. Interpretable models
9. Unsupervised learning

Schedule
• The course will begin on April 8th as an OLAT-based online course; it is not yet clear when or whether we can revert to normal. Please see the following generic information regarding online teaching at the Institute for Statistics and Econometrics.
Concerning this course, the plan is to upload slides and video tutorials each weekend, followed by live Q&A sessions during the original time slot (i.e. Thursday 2:15pm to 3:45pm) using suitable video conference software.
Please note that plans may change as we go. Any changes will be communicated via OLAT.
• The PC tutorials will be re-scheduled towards the end of the semester; we will let you know as soon as we have more information.
Materials
• Slides and lecture notes will be made available in due time via OLAT
• The basic textbook is
– Hastie, T., R. Tibshirani and J. Friedman (2009, 2nd ed.) The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer¹
(A pdf copy is freely available from the authors: https://web.stanford.edu/~hastie/ElemStatLearn/download.html)
• More focused:
– Bishop, C. (2006) Pattern Recognition and Machine Learning, Springer
• A different perspective:
– Han, J., M. Kamber and J. Pei (2012, 3rd ed.) Data Mining: Concepts and Techniques, Elsevier
Exam
• written exam
• you may use the slides
• you can earn some bonus points by solving R assignments
¹ An introductory version we may sometimes use is James, G., D. Witten, T. Hastie and R. Tibshirani (2013) An Introduction to Statistical Learning: With Applications in R, Springer. A pdf copy is also freely available from the authors; feel free to search for it.

Contact:
• mdeme@stat-econ.uni-kiel.de, mokuneva@stat-econ.uni-kiel.de
Office hours:
• Office hours are only available online: by email and video call (the latter by appointment).