Week 7-audio-video

Advanced Network Technologies
Multimedia 1/2

School of Computer Science
Dr. Wei Bao| Lecturer

Multimedia

› Multimedia

› Streaming stored video

› Voice-over-IP

› RTP/SIP

Multimedia

Multimedia networking: 3 application types

› streaming, stored audio, video
– streaming: can begin playout before downloading entire file
– stored (at server): can transmit faster than audio/video will be rendered (implies storing/buffering at client)
– e.g., YouTube, Netflix, Hulu

› conversational voice/video over IP
– interactive nature of human-to-human conversation limits delay tolerance
– e.g., Skype

› streaming live audio, video
– e.g., live sporting event

Multimedia audio

› analog audio signal sampled at constant rate
– telephone: 8,000 samples/sec
– CD music: 44,100 samples/sec

› each sample quantized, i.e., rounded
– e.g., 2^8 = 256 possible quantized values
– each quantized value represented by bits, e.g., 8 bits for 256 values

[Figure: audio signal amplitude vs. time, showing the analog signal, the quantized value of each analog value, the quantization error, and the sampling rate (N samples/sec)]

Rate=44100 samples/sec * 8bit/sample = 352800 bps

› Video: sequence of images displayed at constant rate
– e.g., 24 images/sec

› Each image: array of pixels; resolution: e.g., 480*640
– each pixel: 3 colors
– Red, Green, Blue (RGB)
– Each color has 2^8 = 256 possible quantized values (8 bits)
– Data rate: 8*3*480*640*24 bps ≈ 177 Mbps. Too large!
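The audio and video rates above are plain products of the sampling parameters. A minimal sketch that reproduces them; the function and variable names are illustrative and not part of the lecture:

```python
def pcm_bit_rate(samples_per_sec: int, bits_per_sample: int) -> int:
    """Uncompressed audio bit rate in bits per second."""
    return samples_per_sec * bits_per_sample


def raw_video_bit_rate(rows: int, cols: int, colors: int,
                       bits_per_color: int, frames_per_sec: int) -> int:
    """Uncompressed video bit rate in bits per second."""
    return rows * cols * colors * bits_per_color * frames_per_sec


cd_audio = pcm_bit_rate(44_100, 8)               # CD sampling rate, 8-bit samples
telephone = pcm_bit_rate(8_000, 8)               # telephone sampling rate, 8-bit samples
video = raw_video_bit_rate(480, 640, 3, 8, 24)   # 480*640 RGB at 24 images/sec

print(f"CD audio:  {cd_audio} bps")
print(f"Telephone: {telephone} bps")
print(f"Raw video: {video / 1e6:.0f} Mbps")
```

The telephone case gives 64,000 bps, the same 64 kbps figure used for VoIP talk spurts later in these slides.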

Video

› coding: use redundancy within and between images to decrease # bits used to encode image
– spatial (within image)
– temporal (from one image to next)
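The two kinds of redundancy can be illustrated with a toy encoder: run-length coding for spatial redundancy and frame differencing for temporal redundancy. This is only a sketch under simplifying assumptions (a frame is a flat list of pixel values, and all function names are made up for illustration); real MPEG codecs are far more elaborate.

```python
from itertools import groupby


def run_length_encode(pixels):
    """Spatial coding: collapse each run of equal values into (value, count)."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original pixel list."""
    return [value for value, count in pairs for _ in range(count)]


def frame_delta(prev_frame, next_frame):
    """Temporal coding: keep only (index, new_value) where the frames differ."""
    return [(i, new)
            for i, (old, new) in enumerate(zip(prev_frame, next_frame))
            if old != new]


def apply_delta(prev_frame, delta):
    """Rebuild the next frame from the previous frame plus the differences."""
    frame = list(prev_frame)
    for i, new in delta:
        frame[i] = new
    return frame


row = ["purple"] * 6 + ["white"] * 2
encoded_row = run_length_encode(row)       # 2 pairs instead of 8 values

frame_i = [0, 0, 5, 5, 0, 0]
frame_i1 = [0, 0, 5, 7, 0, 0]
delta = frame_delta(frame_i, frame_i1)     # only the one changed pixel

print(encoded_row)
print(delta)
```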

› examples:
– MPEG 1 (CD-ROM): 1.5 Mbps
– MPEG2 (DVD): 3-6 Mbps
– MPEG4 (often used in Internet): < 1 Mbps
– MPEG: Moving Picture Experts Group

› spatial coding example: instead of sending N values of same color (all purple), send only two values: color value (purple) and number of repeated values (N)

› temporal coding example: instead of sending complete frame at i+1, send only differences from frame i

Streaming Stored Video

Streaming stored video

[Figure: cumulative data vs. time for frames 1, 2, 3, 4, …; network delay fixed in this example]
1. video recorded (e.g., 30 frames/sec)
2. video sent
3. video received
4. video played out

streaming: at this time, client playing out early part of video, while server still sending later part of video

Streaming stored video: challenges

› continuous playout constraint: once client playout begins, playback must match original timing
– … but network delays are variable (jitter), so will need client-side buffer to match playout requirements

› other challenges:
– client interactivity: pause, fast-forward, jump through video
– video packets may be lost, retransmitted

Streaming stored video: revisited

› client-side buffering and playout delay: compensate for network-added delay, delay jitter

[Figure: cumulative data vs. time, showing constant bit rate video transmission, variable network delay, client video reception, buffered video, client playout delay, and constant bit rate video playout at client]

Streaming stored video: revisited

[Figure: the same curves with a short client playout delay]
Buffer underflow!
Cannot be played on time

[Figure: cumulative data vs. time, showing constant bit rate (CBR) video transmission, variable network delay, client video reception, and constant bit rate video playout at client after a larger client playout delay]

Streaming stored video: revisited

› Increase playout delay: fewer buffer underflows
› initial playout delay tradeoff

Client-side buffering, playout

[Figure: video server sends at variable fill rate x(t) into client application buffer of size B; buffer fill level Q(t); playout rate r, e.g., CBR]

1. Initial fill of buffer until playout begins at tp
2. playout begins at tp
3. buffer fill level Q(t) varies over time as fill rate x(t) varies and playout rate r is constant
4. Q(t+1) = Q(t) + x(t), t ≤ tp; Q(t+1) = max[Q(t) + x(t) - r, 0], t > tp
5. Q(t) + x(t) - r < 0: buffer underflow

Client-side buffering, playout

playout buffering: average fill rate E(x), playout rate r
› E(x) < r: buffer eventually empties (causing freezing of video playout until buffer fills again)
› E(x) ≥ r: buffer will not empty, provided initial playout delay is large enough to absorb variability in x(t)
– initial playout delay tradeoff: buffer starvation less likely with larger delay, but larger delay until user begins watching

Streaming multimedia: UDP

› server sends at rate appropriate for client
– often: send rate = encoding rate = constant rate
– transmission rate can be oblivious to congestion levels
› short playout delay (2-5 seconds) to remove network jitter
› error recovery: application-level, time-permitting
› RTP [RFC 3550]: multimedia payload types
› UDP may not go through firewalls

Streaming multimedia: HTTP

› multimedia file retrieved via HTTP GET
› send at maximum possible rate under TCP
› fill rate fluctuates due to TCP congestion control, retransmissions (in-order delivery)
› larger playout delay: smooth TCP delivery rate
› HTTP/TCP passes more easily through firewalls

[Figure: server side: video file, TCP send buffer; variable rate x(t) across the network; client side: TCP receive buffer, application playout buffer]

Streaming multimedia: DASH

› DASH: Dynamic, Adaptive Streaming over HTTP
› server:
– divides video file into multiple chunks
– each chunk stored, encoded at different rates
– manifest file: provides URLs for different chunks
› client:
– periodically measures server-to-client bandwidth
– consulting manifest, requests one chunk at a time
– chooses maximum coding rate sustainable given current bandwidth
– can choose different coding rates at different points in time (depending on current available bandwidth)
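The DASH client rule "choose the maximum coding rate sustainable given current bandwidth" can be sketched as follows. The manifest layout, the URL templates, and the fallback to the lowest rate when nothing is sustainable are assumptions made for this illustration; they are not taken from the DASH specification or from the lecture.

```python
def choose_encoding_rate(available_rates_bps, measured_bandwidth_bps):
    """Pick the highest encoding rate that does not exceed the measured bandwidth.

    Assumption: if even the lowest rate is not sustainable, fall back to it.
    """
    sustainable = [r for r in available_rates_bps if r <= measured_bandwidth_bps]
    return max(sustainable) if sustainable else min(available_rates_bps)


def next_chunk_url(manifest, chunk_number, measured_bandwidth_bps):
    """Consult the manifest and return the URL of the chunk to request next."""
    rate = choose_encoding_rate(list(manifest), measured_bandwidth_bps)
    return manifest[rate].format(n=chunk_number)


# Hypothetical manifest: encoding rate (bps) -> URL template for its chunks.
manifest = {
    300_000: "video/low/chunk{n}.mp4",
    1_000_000: "video/mid/chunk{n}.mp4",
    3_000_000: "video/high/chunk{n}.mp4",
}

# Made-up bandwidth measurements taken before each chunk request.
bandwidth_samples_bps = [2_500_000, 800_000, 4_000_000]
urls = [next_chunk_url(manifest, n, bw)
        for n, bw in enumerate(bandwidth_samples_bps, start=1)]
print(urls)
```

Because the rate is chosen per chunk, the quality can change from one chunk to the next as the measured bandwidth changes.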
Streaming multimedia: DASH

[Figure: the same video stored as Chunk1, Chunk2, Chunk3, …, ChunkN at several encodings from low quality to high quality; the quality requested for each chunk follows the available bandwidth]

Streaming multimedia: DASH

› DASH: Dynamic, Adaptive Streaming over HTTP
› “intelligence” at client: client determines
– when to request chunk (so that buffer starvation does not occur)
– what encoding rate to request (higher quality when more bandwidth available)
– where to request chunk (can request from URL server that is “close” to client or has high available bandwidth)

Content distribution network

› challenge: how to stream content (selected from millions of videos) to hundreds of thousands of simultaneous users?
› option 1: single, large “mega-server”
– single point of failure
– point of network congestion
– long path to distant clients
– multiple copies of video sent over outgoing link
… quite simply: this solution doesn’t scale

Content distribution network

› option 2: store/serve multiple copies of videos at multiple geographically distributed sites (CDN)

CDN: “simple” content access scenario

Bob (client) requests video http://netcinema.com/6Y7B23V
– video stored in CDN at http://KingCDN.com/NetC6y&B23V

[Figure: Bob, Bob’s local DNS, netcinema.com, netcinema’s authoritative DNS, KingCDN.com, KingCDN authoritative DNS]

0. Store the video in CDN
1. Bob gets URL for video http://netcinema.com/6Y7B23V from netcinema.com web page
2. resolve http://netcinema.com/6Y7B23V via Bob’s local DNS
3. netcinema’s DNS returns URL http://KingCDN.com/NetC6y&B23V
4&5. Resolve http://KingCDN.com/NetC6y&B23 via KingCDN’s authoritative DNS, which returns IP address of best KingCDN server with video
6. request video from KingCDN server, streamed via HTTP

CDN cluster selection strategy

› challenge: how does CDN DNS select “good” CDN node to stream to client
– pick CDN node geographically closest to client
– pick CDN node with shortest delay (or min # hops) to client (CDN nodes periodically ping access ISPs, reporting results to CDN DNS)
› alternative: let client decide
– give client a list of several CDN servers
– client pings servers, picks “best”
– Netflix approach

Case study: Netflix

› 30% downstream US traffic in 2011
› Owns very little infrastructure, uses 3rd party services:
– own registration, payment servers
– Amazon (3rd party) cloud services:
  • Create multiple versions of movie (different encodings) in Amazon cloud
  • Upload versions from cloud to CDNs
  • Cloud hosts Netflix web pages for user browsing
– three 3rd party CDNs host/stream Netflix content: Akamai, Limelight, Level-3

Case study: Netflix

[Figure: Bob; Netflix registration, accounting servers; Amazon cloud (homepage; master version -> different formats); DNS; Akamai CDN, Limelight CDN, Level-3 CDN]

– upload copies of multiple versions of video to CDNs
1. Bob manages Netflix account
2. Bob browses Netflix video
3. Manifest file returned for requested video
4. DASH streaming
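The "let client decide" strategy (client pings several CDN servers and picks the best) can be sketched as below. The server names and RTT values are invented, and using the median of several pings as the definition of "best" is an assumption of this sketch, not something the slides specify.

```python
from statistics import median


def pick_cdn_server(rtt_samples_ms):
    """Return the server whose median measured round-trip time is smallest.

    rtt_samples_ms maps a server name to a list of ping RTTs in msec.
    """
    return min(rtt_samples_ms, key=lambda server: median(rtt_samples_ms[server]))


# Hypothetical ping results collected by the client for each candidate server.
measurements = {
    "cdn-a.example": [42.0, 40.5, 44.1],
    "cdn-b.example": [18.2, 95.0, 19.0],   # one outlier, still the lowest median
    "cdn-c.example": [25.3, 24.9, 26.0],
}

best = pick_cdn_server(measurements)
print(best)
```

The median is used so that a single slow ping does not disqualify an otherwise close server; a client could just as well use the minimum or a moving average.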

Voice over IP

Voice-over-IP (VoIP)

› VoIP end-to-end delay requirement: needed to maintain “conversational” aspect
– higher delays noticeable, impair interactivity
– < 150 msec: good
– > 400 msec: bad
– includes application-level (playout), network delays

› session initialization: how does callee advertise IP address, port
number, encoding algorithms?

› value-added services: call forwarding, screening, recording
› emergency services: 911/000

VoIP characteristics

› speaker’s audio: alternating talk spurts, silent periods
– 64 kbps during talk spurt
– chunks generated only during talk spurts
– 20 msec chunks at 8 Kbytes/sec: 160 bytes of data per chunk

› application-layer header added to each chunk

› chunk+header encapsulated into UDP or TCP segment

› application sends segment into socket every 20 msec during talk spurt

VoIP: packet loss, delay

› network loss: IP datagram lost due to network congestion (router
buffer overflow)

› delay loss: IP datagram arrives too late for playout at receiver
– delays: processing, queueing in network, transmission, propagation

– typical maximum tolerable delay: 400 ms

› loss tolerance: depending on voice encoding, loss concealment,
packet loss rates between 1% and 10% can be tolerated

[Figure: cumulative data vs. time, showing variable network delay (jitter), client reception, client playout delay, sum delay, and buffered data]

Delay jitter

VoIP: fixed playout delay

› receiver attempts to play out each chunk exactly q msecs after chunk was generated
– chunk has time stamp t: play out chunk at t+q
– chunk arrives after t+q: data arrives too late for playout: data “lost”

› tradeoff in choosing q:
– large q: less packet loss
– small q: better interactive experience

[Figure: packets vs. time, showing packets generated, packets received (first at time r), a loss, and two playout schedules with playout delays p – r and p’ – r]

VoIP: fixed playout delay

› sender generates packets every 20 msec during talk spurt.

› first packet received at time r
› first playout schedule: begins at p

› second playout schedule: begins at p’
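The tradeoff in choosing q can be checked numerically: a chunk with timestamp t counts as lost when it arrives after t + q. The sketch below uses invented network delays for one talk spurt; only the 20 msec spacing comes from the slides.

```python
def late_loss_fraction(send_times_ms, arrival_times_ms, q_ms):
    """Fraction of chunks that arrive after their scheduled playout time t + q."""
    late = sum(1 for t, r in zip(send_times_ms, arrival_times_ms) if r > t + q_ms)
    return late / len(send_times_ms)


# Hypothetical talk spurt: one chunk every 20 msec, with made-up network delays.
send_times = [0, 20, 40, 60, 80, 100, 120, 140]
delays = [60, 75, 130, 90, 210, 80, 70, 150]
arrivals = [t + d for t, d in zip(send_times, delays)]

for q in (100, 200):
    fraction = late_loss_fraction(send_times, arrivals, q)
    print(f"q = {q} msec: late-loss fraction {fraction:.3f}")
```

With these sample delays the larger q loses fewer chunks, at the cost of 100 msec more delay before every chunk is heard.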

Adaptive playout delay
› goal: low playout delay, low late loss rate

[Figure: cumulative data vs. time, showing variable network delay (jitter) and client reception, with three candidate playout delays: too many losses, unnecessary delay, and best]

Adaptive playout delay
› goal: low playout delay, low late loss rate

› approach: adaptive playout delay adjustment:
– estimate network delay, adjust playout delay at beginning of each talk spurt
– silent periods compressed and elongated

› adaptively estimate packet delay (EWMA: exponentially weighted moving average, recall TCP RTT estimate):

$d_i = (1-a)\,d_{i-1} + a\,(r_i - t_i)$

– $d_i$: delay estimate after ith packet
– $a$: small constant, e.g., 0.01
– $r_i$: time received; $t_i$: time sent (timestamp)
– $r_i - t_i$: measured delay of ith packet

› also useful to estimate average deviation of delay, $v_i$:

$v_i = (1-b)\,v_{i-1} + b\,|r_i - t_i - d_i|$

Adaptive playout delay (cont’d)

› estimates $d_i$, $v_i$ calculated for every received packet, but used only at start of talk spurt

› for first packet in talk spurt, playout time is:

$\text{playout-time}_i = t_i + d_i + K v_i$

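[Figure: cumulative data vs. time with delay jitter across talk spurt 1 and talk spurt 2; the measured delays r_i – t_i determine d_i + K v_i, and the playout delay is adjusted to d_i + K v_i at the start of talk spurt 2]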
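The two EWMA estimates and the playout rule can be combined into a small scheduler. This is a sketch under several assumptions that the slides do not state: the estimates are initialised from the first packet (d = measured delay, v = 0), b takes the same default as a, K defaults to 4, and the remaining packets of a talk spurt reuse the offset computed for its first packet.

```python
def adaptive_playout(packets, a=0.01, b=0.01, K=4.0):
    """Return a playout time for every packet.

    packets: list of (t, r, first_in_spurt) with sender timestamp t and
    receive time r in msec. d and v are updated on every packet, but the
    playout offset d + K*v is only re-read at the start of a talk spurt.
    """
    d = v = None
    offset = None
    schedule = []
    for t, r, first_in_spurt in packets:
        delay = r - t
        if d is None:
            d, v = float(delay), 0.0          # assumed initialisation
        else:
            d = (1 - a) * d + a * delay       # d_i = (1-a) d_{i-1} + a (r_i - t_i)
            v = (1 - b) * v + b * abs(delay - d)
        if first_in_spurt or offset is None:
            offset = d + K * v                # playout-time_i = t_i + d_i + K v_i
        schedule.append(t + offset)
    return schedule


# Hypothetical trace: two talk spurts, chunks 20 msec apart within a spurt.
trace = [(0, 50, True), (20, 90, False), (40, 85, False),
         (200, 260, True), (220, 275, False)]
print(adaptive_playout(trace))
```

Within a spurt the chunks keep their 20 msec spacing; only the silent period between spurts is stretched or shortened when the offset changes.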

Adaptive playout delay (cont’d)

Q: How does receiver determine whether packet is first in a talkspurt?

› if no loss, receiver looks at successive timestamps
– difference of successive stamps > 20 msec ⇒ talk spurt begins.

› with loss possible, receiver must look at both time stamps and
sequence numbers
– difference of successive stamps > 20 msec and sequence numbers without gaps ⇒ talk spurt begins.

Adaptive playout delay (cont’d)

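Example (chunks generated every 20 ms):

– no loss: timestamps 0, 20, 40, 100, 120. The gap from 40 to 100 exceeds 20 ms, so spurt 1 is the chunks at 0, 20, 40 and spurt 2 begins at 100.

– packet 2 lost: sequence numbers 1, 3, 4, 5 with timestamps 0, 40, 100, 120. The gap from 0 to 40 comes with a sequence-number gap (1 to 3), so it is a loss, not a new spurt. The gap from 40 to 100 has consecutive sequence numbers (3 to 4), so spurt 2 begins at 100.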
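The rule "timestamp difference > 20 msec and sequence numbers without gaps" can be written as a short check over the received packets. Treating the very first received packet as the start of a spurt is an assumption of this sketch.

```python
def first_in_spurt_flags(packets, chunk_ms=20):
    """Flag each received packet that starts a new talk spurt.

    packets: list of (sequence_number, timestamp_ms) in order of arrival.
    A packet starts a spurt when its sequence number directly follows the
    previous one and the timestamp gap exceeds one chunk interval.
    """
    flags = []
    prev = None
    for seq, ts in packets:
        if prev is None:
            flags.append(True)                 # assumption: first packet opens a spurt
        else:
            prev_seq, prev_ts = prev
            no_gap = seq == prev_seq + 1
            flags.append(no_gap and ts - prev_ts > chunk_ms)
        prev = (seq, ts)
    return flags


no_loss = [(1, 0), (2, 20), (3, 40), (4, 100), (5, 120)]
with_loss = [(1, 0), (3, 40), (4, 100), (5, 120)]   # packet 2 lost

print(first_in_spurt_flags(no_loss))
print(first_in_spurt_flags(with_loss))
```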

VoIP: recovery from packet loss

Challenge: recover from packet loss given small tolerable
delay between original transmission and playout

› each ACK/NAK takes ~ one RTT
› alternative: Forward Error Correction (FEC)

– send enough bits to allow recovery without retransmission

simple FEC
› for every group of n chunks, create redundant chunk by exclusive OR-ing n original chunks
› send n+1 chunks, increasing bandwidth by factor 1/n
› can reconstruct original n chunks if at most one lost chunk from n+1 chunks
› Send x1, x2, x3, …, xn, and y = x1 xor x2 xor x3 xor … xor xn
› If x3 is lost, can re-compute x3 from x1, x2, x4, …, xn, and y

Example (n = 3): sent x1 = 1, x2 = 0, x3 = 1, y = 0; received 1, 0, ?, 0. From 1 XOR 0 XOR x3 = 0, x3 = 1.
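The XOR scheme is short enough to write out in full. A minimal sketch, assuming all chunks in a group have the same length and a lost chunk is represented by None; the function names are illustrative.

```python
from functools import reduce


def xor_chunks(chunks):
    """Byte-wise XOR of equal-length chunks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*chunks))


def add_parity(chunks):
    """Return the n original chunks followed by the redundant chunk y."""
    return list(chunks) + [xor_chunks(chunks)]


def recover(received):
    """Rebuild the n data chunks from n+1 received chunks.

    At most one entry of received may be None (the lost chunk).
    """
    missing = [i for i, chunk in enumerate(received) if chunk is None]
    if len(missing) > 1:
        raise ValueError("simple XOR FEC can repair at most one lost chunk per group")
    repaired = list(received)
    if missing:
        present = [chunk for chunk in received if chunk is not None]
        repaired[missing[0]] = xor_chunks(present)   # XOR of the rest gives the lost chunk
    return repaired[:-1]                             # drop the parity chunk


group = [b"\x01", b"\x00", b"\x01"]       # x1, x2, x3 from the example above
sent = add_parity(group)                  # y = 1 xor 0 xor 1 = 0
received = [sent[0], sent[1], None, sent[3]]
print(recover(received))
```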

another FEC scheme:
• “piggyback lower quality stream”
• send lower resolution audio stream as redundant information
• e.g., nominal stream at 64 kbps and redundant stream at 13 kbps

VoIP: recovery from packet loss (cont’d)

VoIP: recovery from packet loss (cont’d)

interleaving to conceal loss:
› audio chunks divided into smaller units, e.g., four 5 msec units per 20 msec audio chunk

› packet contains small units from different chunks

› if packet lost, still have most of every original chunk

› no redundancy overhead, but worse delay performance

[Figure: original stream of units 1 to 16 in four chunks (1-4, 5-8, 9-12, 13-16); losing a whole chunk means, e.g., a word is missing, while losing one unit per chunk means, e.g., a syllable is missing, which is acceptable]

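VoIP: recovery from packet loss (cont’d)

Original stream, one word per chunk:
Word 1: 1 2 3 4
Word 2: 5 6 7 8
Word 3: 9 10 11 12
Word 4: 13 14 15 16

Without interleaving, one lost packet removes a whole word:
1 2 3 4 | 5 6 7 8 | (lost) | 13 14 15 16

With interleaving, one lost packet removes one unit from each word:
1 2 _ 4 | 5 6 _ 8 | 9 10 _ 12 | 13 14 _ 16

Subjective feeling is improved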
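The interleaving pattern in the 16-unit example can be reproduced with two small helpers. This sketch assumes the number of units is a multiple of the chunk size and marks a lost packet as None; the helper names are made up.

```python
def interleave(units, units_per_chunk=4):
    """Packet k carries the k-th unit of every chunk."""
    chunks = [units[i:i + units_per_chunk]
              for i in range(0, len(units), units_per_chunk)]
    return [list(packet) for packet in zip(*chunks)]


def deinterleave(packets):
    """Rebuild the chunks; a lost packet (None) leaves one None in each chunk."""
    n_chunks = next(len(p) for p in packets if p is not None)
    filled = [p if p is not None else [None] * n_chunks for p in packets]
    return [list(chunk) for chunk in zip(*filled)]


units = list(range(1, 17))            # units 1..16, four per chunk
packets = interleave(units)           # e.g. first packet carries units 1, 5, 9, 13
packets[2] = None                     # lose the third packet
print(deinterleave(packets))
```

The receiver must collect a full group of packets before it can rebuild the chunks, which is where the worse delay performance comes from.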

Real-time Conversational Applications

Real-Time Transport Protocol (RTP)

› RTP specifies packet
structure for packets
carrying audio, video
data

› RFC 3550
› RTP packet provides
– payload type identification
– packet sequence numbering
– time stamping

› RTP runs in end systems
› RTP packets
encapsulated in UDP
segments

› interoperability: if two
VoIP applications run
RTP, they may be able to
work together

RTP libraries provide transport-layer interface
that extends UDP:

• port numbers, IP addresses (already existing)
• payload type identification
• packet sequence numbering
• time-stamping

RTP runs on top of UDP

RTP example

example: sending 64 kbps PCM µ-law encoded voice over RTP
› PCM: pulse-code modulation, a method used to digitally represent sampled analog signals
› µ-law: a special quantization scheme
› sample rate: 8000 samples/second
› quantization: 8 bits/sample
› application collects encoded data in chunks, e.g., every 20 msec = 160 bytes in a chunk
› audio chunk + RTP header form RTP packet, which is encapsulated in UDP segment
› RTP header indicates type of audio/video encoding in each packet
– sender can change encoding during conference
› RTP header also contains sequence numbers, timestamps

RTP and QoS

› RTP does not provide any mechanism to ensure timely data
delivery or other QoS guarantees

› RTP encapsulation only seen at end systems (not by intermediate
routers)
– routers provide best-effort service, making no special effort to ensure that RTP packets arrive at destination in timely manner

• payload type (7 bits): indicates type of encoding currently
being used. If sender changes encoding during call,
sender informs receiver via payload type field

• Payload type 0: PCM µ-law, 64 kbps
• Payload type 3: GSM, 13 kbps
• Payload type 7: LPC, 2.4 kbps
• Payload type 26: Motion JPEG

• Payload type 31: H.261
• Payload type 33: MPEG2 video

• sequence # (16 bits): increment by one for each RTP
packet sent
• detect packet loss, restore packet sequence

RTP header

[Figure: RTP header fields: payload type, sequence number, time stamp, Synchronization Source ID (SSRC), miscellaneous fields]

RTP header

› timestamp field (32 bits long): sampling instant of first byte in this RTP data packet
– for audio, timestamp clock increments by one for each sampling period (e.g., each 125 usecs for 8 KHz sampling clock)
– if application generates chunks of 160 encoded samples (20 ms / 125 us = 160), timestamp increases by 160 for each RTP packet when source is active. Timestamp clock continues to increase at constant rate when source is inactive.

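[Figure: RTP header (payload type, sequence number, time stamp, Synchronization Source ID (SSRC), miscellaneous fields). Timestamp example: consecutive packets carry X and X+160 (160 samples, 20 ms); a packet with timestamp X+480 was sampled 480*125 us = 60 ms after X]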

RTP header

› sequence # + timestamp: together let the receiver recognize the start of a new talk spurt

› SSRC field (32 bits long): identifies source of RTP stream.
Each stream in RTP session has distinct SSRC
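The header fields described above (7-bit payload type, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC) fit in the 12-byte fixed RTP header of RFC 3550. The sketch below packs and unpacks that header; it sets the "miscellaneous" bits to their simplest values (version 2, no padding, no extension, no CSRC entries, marker bit 0), and the starting timestamp and SSRC value are arbitrary choices for illustration.

```python
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type, sequence_number, timestamp, ssrc):
    """Build the 12-byte fixed RTP header with all optional bits cleared."""
    first_byte = RTP_VERSION << 6              # version in the top two bits
    second_byte = payload_type & 0x7F          # marker bit 0, 7-bit payload type
    return struct.pack("!BBHII", first_byte, second_byte,
                       sequence_number & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


def unpack_rtp_header(header):
    """Parse the fixed header fields from the first 12 bytes of a packet."""
    first_byte, second_byte, seq, ts, ssrc = struct.unpack("!BBHII", header[:12])
    return {
        "version": first_byte >> 6,
        "payload_type": second_byte & 0x7F,
        "sequence_number": seq,
        "timestamp": ts,
        "ssrc": ssrc,
    }


# Three consecutive PCM µ-law packets (payload type 0) in one talk spurt:
# the sequence number grows by 1 and the timestamp by 160 per packet.
X = 1000                                       # arbitrary starting timestamp
headers = [pack_rtp_header(0, n, X + 160 * n, 0x12345678) for n in range(3)]
for h in headers:
    print(unpack_rtp_header(h))
```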


SIP: Session Initiation Protocol [RFC 3261]

long-term vision:

› all telephone calls, video conference calls take place over Internet
› people identified by names or e-mail addresses, rather than by phone numbers

› can reach callee (if callee so desires), no matter where callee
roams, no matter what IP device callee is currently using

SIP services

› SIP provides
mechanisms for call
setup:
– for caller to let callee
know she wants to
establish a call

– so caller, callee can
agree on media type,
encoding

– to end call

› determine current IP
address of callee:
– maps mnemonic identifier to current IP address
› call management:

– add new media streams
during call

– change encoding during
call

– invite others
– transfer, hold calls

Example: setting up call to known IP address

› Alice’s SIP invite message indicates her port number, IP address, encoding she prefers to receive (PCM µ-law)

› Bob’s 200 OK message indicates his port number, IP address, preferred encoding (GSM)

› SIP messages can be sent over TCP or UDP; here sent over RTP/UDP

› Default SIP port # is 5060

› In practice, Bob and Alice talk simultaneously

› SIP is out-of-band

Message sequence (Alice at 167.180.112.24, Bob at 193.64.210.89):

1. Alice → Bob, port 5060:
   INVITE .210.89
   c=IN IP4 167.180.112.24
   m=audio 38060 RTP/AVP 0
   Bob’s terminal rings.
2. Bob → Alice, port 5060:
   200 OK
   c=IN IP4 193.64.210.89
   m=audio 48753 RTP/AVP 3
3. Alice → Bob, port 5060:
   ACK
4. Media: Bob sends µ-law audio to Alice’s port 38060; Alice sends GSM audio to Bob’s port 48753.
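The c= and m= lines in the INVITE and 200 OK carry the address, port, and RTP payload type each side wants to receive on. A minimal parsing sketch for just these two line types; it is not a full SDP parser, and the function name and the small payload-name table are assumptions for illustration.

```python
def parse_sdp_media(sdp_lines):
    """Extract address, port, transport, and payload type from c= and m= lines."""
    info = {}
    for line in sdp_lines:
        if line.startswith("c="):
            # c=<network type> <address type> <address>
            _net_type, _addr_type, address = line[2:].split()
            info["address"] = address
        elif line.startswith("m="):
            # m=<media> <port> <transport> <payload type> ...
            media, port, transport, payload_type = line[2:].split()[:4]
            info.update(media=media, port=int(port),
                        transport=transport, payload_type=int(payload_type))
    return info


# Payload type numbers as listed on the RTP header slide.
PAYLOAD_NAMES = {0: "PCM µ-law", 3: "GSM"}

alice = parse_sdp_media(["c=IN IP4 167.180.112.24", "m=audio 38060 RTP/AVP 0"])
bob = parse_sdp_media(["c=IN IP4 193.64.210.89", "m=audio 48753 RTP/AVP 3"])

for name, side in (("Alice", alice), ("Bob", bob)):
    codec = PAYLOAD_NAMES[side["payload_type"]]
    print(f"{name} receives {codec} at {side['address']}:{side['port']}")
```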

Setting up a call (cont’d)

› codec negotiation:
– suppose Bob doesn’t have PCM µ-law encoder
– Bob will instead reply with 606 Not Acceptable Reply, listing his encoders. Alice can then send new INVITE message, advertising different encoder

› rejecting a call
– Bob can reject with replies “busy,” “gone,” “payment required,” “forbidden”

› media can be sent over RTP or some other protocol

Name translation, user location

› caller wants to call callee,
but only has callee’s
name or e-mail address.

› need to get IP address of callee’s current host:
– user moves around
– DHCP protocol (dynamically assign IP address)
– user has different IP devices (PC, smartphone, car device)

› result can be based on:
– time of day (work, home)
– caller (don’t want boss to call you at home)
– status of callee (calls sent to voicemail when callee is already talking to someone)

SIP registrar

› one function of SIP server: registrar
› when Bob starts SIP client, client sends SIP REGISTER message to Bob’s registrar server

register message:

REGISTER sip:domain.com SIP/2.0
Via: SIP/2.0/UDP 193.64.210.89
From: sip:
To: sip:
Expires: 3600

SIP proxy

› another function of SIP server: proxy

› Alice sends invite message to her proxy server
– contains address sip:

– proxy responsible for routing SIP messages to callee, possibly through
multiple proxies

› Bob sends response back through same set of SIP proxies

› proxy returns Bob’s SIP response message to Alice
– contains Bob’s IP address

› SIP proxy analogous to local DNS server

[Figure: Alice (128.119.40.186), UMass SIP proxy, Poly SIP registrar, Eurecom SIP registrar, Bob (197.87.54.21)]

1. Alice sends INVITE message to UMass SIP proxy.
2. UMass proxy forwards request to Poly registrar server.
3. Poly server returns redirect response, indicating that it should try
4. UMass proxy forwards request to Eurecom registrar server.
5. Eurecom registrar forwards INVITE to 197.87.54.21, which is running Bob’s SIP client.
6-8. SIP response returned to Alice.
9. Data flows between clients.

SIP example: calls