[1502.03167] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift


Computer Science > Machine Learning

arXiv:1502.03167 (cs)

[Submitted on 11 Feb 2015 (v1), last revised 2 Mar 2015 (this version, v3)]

Title: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Authors: Sergey Ioffe, Christian Szegedy

Abstract: Training Deep Neural Networks is complicated by the fact that the
distribution of each layer’s inputs changes during training, as the parameters
of the previous layers change. This slows down the training by requiring lower
learning rates and careful parameter initialization, and makes it notoriously
hard to train models with saturating nonlinearities. We refer to this
phenomenon as internal covariate shift, and address the problem by normalizing
layer inputs. Our method draws its strength from making normalization a part of
the model architecture and performing the normalization for each training
mini-batch. Batch Normalization allows us to use much higher learning rates and
be less careful about initialization. It also acts as a regularizer, in some
cases eliminating the need for Dropout. Applied to a state-of-the-art image
classification model, Batch Normalization achieves the same accuracy with 14
times fewer training steps, and beats the original model by a significant
margin. Using an ensemble of batch-normalized networks, we improve upon the
best published result on ImageNet classification: reaching 4.9% top-5
validation error (and 4.8% test error), exceeding the accuracy of human raters.
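
To make the core idea concrete, here is a minimal sketch of the per-mini-batch normalization the abstract describes, written in NumPy. The function name, tensor shapes, and epsilon value are illustrative assumptions rather than the paper's exact notation; the learnable scale and shift (gamma, beta) reflect how the transform is typically made part of the model architecture.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """One training-time Batch Normalization step over a mini-batch.

    x:     (batch_size, features) pre-activation inputs to a layer.
    gamma: (features,) learnable scale.
    beta:  (features,) learnable shift.
    eps:   small constant for numerical stability (illustrative value).
    """
    mu = x.mean(axis=0)                     # per-feature mean over the mini-batch
    var = x.var(axis=0)                     # per-feature variance over the mini-batch
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero-mean, unit-variance activations
    return gamma * x_hat + beta             # scale and shift preserve representational power

# Illustrative usage on random data (shapes chosen arbitrarily):
x = np.random.randn(32, 100)                # mini-batch of 32 examples, 100 features
gamma, beta = np.ones(100), np.zeros(100)
y = batch_norm_train(x, gamma, beta)
```

At inference time one would normalize with running estimates of the mean and variance accumulated during training rather than per-batch statistics; that bookkeeping is omitted here for brevity.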

Subjects: Machine Learning (cs.LG)
Cite as: arXiv:1502.03167 [cs.LG] (or arXiv:1502.03167v3 [cs.LG] for this version)

Submission history
From: Sergey Ioffe

[v1] Wed, 11 Feb 2015 01:44:18 UTC (30 KB)
[v2] Fri, 13 Feb 2015 17:31:36 UTC (30 KB)
[v3] Mon, 2 Mar 2015 20:44:12 UTC (30 KB)
