[1502.03167] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Computer Science > Machine Learning
arXiv:1502.03167 (cs)
[Submitted on 11 Feb 2015 (v1), last revised 2 Mar 2015 (this version, v3)]
Title: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Authors: Sergey Ioffe, Christian Szegedy
Abstract: Training Deep Neural Networks is complicated by the fact that the
distribution of each layer’s inputs changes during training, as the parameters
of the previous layers change. This slows down the training by requiring lower
learning rates and careful parameter initialization, and makes it notoriously
hard to train models with saturating nonlinearities. We refer to this
phenomenon as internal covariate shift, and address the problem by normalizing
layer inputs. Our method draws its strength from making normalization a part of
the model architecture and performing the normalization for each training
mini-batch. Batch Normalization allows us to use much higher learning rates and
be less careful about initialization. It also acts as a regularizer, in some
cases eliminating the need for Dropout. Applied to a state-of-the-art image
classification model, Batch Normalization achieves the same accuracy with 14
times fewer training steps, and beats the original model by a significant
margin. Using an ensemble of batch-normalized networks, we improve upon the
best published result on ImageNet classification: reaching 4.9% top-5
validation error (and 4.8% test error), exceeding the accuracy of human raters.
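The transform the abstract describes (normalizing each layer input over the current mini-batch, then restoring representational capacity with learned scale and shift parameters) can be sketched as follows. This is a minimal NumPy illustration of the training-time forward pass only, not the paper's full algorithm; the function name, the `eps` constant, and the example shapes are illustrative assumptions.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch-normalize a mini-batch x of shape (batch, features).

    Each feature is normalized to zero mean and unit variance over
    the mini-batch, then scaled by gamma and shifted by beta (the
    learned parameters the paper introduces so the transform can
    represent the identity). eps is an illustrative constant for
    numerical stability.
    """
    mu = x.mean(axis=0)          # per-feature mini-batch mean
    var = x.var(axis=0)          # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Example: a mini-batch of 4 samples, 3 features, far from zero mean / unit scale.
x = np.random.randn(4, 3) * 10 + 5
y = batch_norm(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=0))  # each feature's batch mean is now ~0
print(y.std(axis=0))   # each feature's batch std is now ~1
```

At inference time the paper replaces the mini-batch statistics with population estimates accumulated during training, so the output becomes a deterministic function of the input; that step is omitted here.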
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:1502.03167 [cs.LG] (or arXiv:1502.03167v3 [cs.LG] for this version)
Submission history
From: Sergey Ioffe
[v1] Wed, 11 Feb 2015 01:44:18 UTC (30 KB)
[v2] Fri, 13 Feb 2015 17:31:36 UTC (30 KB)
[v3] Mon, 2 Mar 2015 20:44:12 UTC (30 KB)