[1502.03167] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Computer Science > Machine Learning
arXiv:1502.03167 (cs)
[Submitted on 11 Feb 2015 (v1), last revised 2 Mar 2015 (this version, v3)]
Title: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Authors: Sergey Ioffe, Christian Szegedy
Abstract: Training Deep Neural Networks is complicated by the fact that the
distribution of each layer’s inputs changes during training, as the parameters
of the previous layers change. This slows down the training by requiring lower
learning rates and careful parameter initialization, and makes it notoriously
hard to train models with saturating nonlinearities. We refer to this
phenomenon as internal covariate shift, and address the problem by normalizing
layer inputs. Our method draws its strength from making normalization a part of
the model architecture and performing the normalization for each training
mini-batch. Batch Normalization allows us to use much higher learning rates and
be less careful about initialization. It also acts as a regularizer, in some
cases eliminating the need for Dropout. Applied to a state-of-the-art image
classification model, Batch Normalization achieves the same accuracy with 14
times fewer training steps, and beats the original model by a significant
margin. Using an ensemble of batch-normalized networks, we improve upon the
best published result on ImageNet classification: reaching 4.9% top-5
validation error (and 4.8% test error), exceeding the accuracy of human raters.
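The transform the abstract describes (normalizing each layer input over the current mini-batch, then restoring representational capacity with learned scale and shift parameters) can be sketched as follows. This is a minimal NumPy illustration of the training-time forward pass only, not the paper's full algorithm; the function name, the `eps` constant, and the example shapes are illustrative assumptions.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch-normalize a mini-batch x of shape (batch, features).

    Each feature is normalized to zero mean and unit variance over
    the mini-batch, then scaled by gamma and shifted by beta (the
    learned parameters the paper introduces so the transform can
    represent the identity). eps is an illustrative constant for
    numerical stability.
    """
    mu = x.mean(axis=0)          # per-feature mini-batch mean
    var = x.var(axis=0)          # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Example: a mini-batch of 4 samples, 3 features, far from zero mean / unit scale.
x = np.random.randn(4, 3) * 10 + 5
y = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=0))  # each feature's batch mean is now ~0
print(y.std(axis=0))   # each feature's batch std is now ~1
```

At inference time the paper replaces the mini-batch statistics with population estimates accumulated during training, so the output becomes a deterministic function of the input; that step is omitted here.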
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:1502.03167 [cs.LG] (or arXiv:1502.03167v3 [cs.LG] for this version)
Submission history
From: Sergey Ioffe
[v1] Wed, 11 Feb 2015 01:44:18 UTC (30 KB)
[v2] Fri, 13 Feb 2015 17:31:36 UTC (30 KB)
[v3] Mon, 2 Mar 2015 20:44:12 UTC (30 KB)