The Cloud
Syllabus
This module will cover the following
• What the cloud is
• Virtualization
• Cloud interfacing protocols
• Big data processing
• Device management platforms
2 © 2020 Arm Limited
What is the Cloud?
Definition
“A network of remote servers hosted on the Internet and used to store, manage, and process data in place of local servers or personal computers.” (Lexico)
The role of the cloud in IoT
• Management and configuration of IoT devices
• Aggregating, processing, storing, analyzing, and visualizing data
• Sharing of computing infrastructure
• Service customization
Virtualization
Virtualization
• In general, virtualization is the process by which a virtual version of a platform/process is created based on a single physical resource
• Virtualization can be applied to hardware, networks, storage, or operating systems
• In cloud computing, virtualization enables running separate workloads under strict partitioning of resources
• Resource separation allows the CPU and memory to be used more efficiently, and also offers certain security guarantees
• Two main approaches are adopted: Virtual Machines and Containers
Virtual Machines (VM) based virtualization
Infrastructure as a Service (IaaS)
• Multiple instances of (potentially different) operating systems running on the same physical computing platform
• VMs created, run, and monitored by a (software/hardware) hypervisor
• Two types:
  • Type 1 (bare-metal) – runs directly on the host hardware (HW), e.g., Microsoft Hyper-V
  • Type 2 (hosted) – runs on the OS of the host, e.g., VMware, VirtualBox
Containers
Platform as a Service (PaaS)
• OS-level virtualization – different applications running on isolated user space partitions
• OS can control access to peripherals or certain CPU/network resources
• Isolation is enforced by the kernel, so even applications with security issues can be run with limited risk to the rest of the system (safe execution)
Advantages of containers over VMs
Fast to instantiate
Can be destroyed as needed
Hypervisor not required
Possible to share OS libraries among containers
Easy to scale
Q. Compared to VMs, what disadvantages of containers can you imagine?
Example: Running a web server in a Docker container
Dockerfile to define the image
# Use an official Python runtime as a parent image
FROM python:3.6
# Set the working directory to /app
WORKDIR /app
# Copy the dependency list into the image so pip can read it
COPY requirements.txt /app/
# Install app dependencies
RUN pip install -r requirements.txt
# Bundle app source
COPY src /app
# Make port 80 available to the world outside this container
EXPOSE 80
# Run app.py when the container launches
CMD ["python", "app.py"]
Example: Running a web server in a Docker container
Dependencies and app
• requirements.txt (load the Flask micro web framework to be able to serve HTML pages)
flask
• src/app.py (listen on port 80, respond with a custom message to any connection)
from flask import Flask
# Define app variable:
app = Flask(__name__)
# Set app route and define what to return on connection
@app.route("/")
def hello():
    return "IoTSSC – Module 7. The Cloud"
# Listen on port 80 on all interfaces
app.run(host="0.0.0.0", port=80)
Example: Running a web server in a Docker container
Building the image and running the app
• Build the image (the trailing "." is the build context, i.e., the directory containing the Dockerfile)
$ docker build -t python-docker-example .
• Run the app, generating a container
$ docker run -p 8080:80 python-docker-example
• App listens on port 80 → -p maps port 80 of the container to port 8080 of the localhost
• In the browser, on port 8080 of the localhost:
IoTSSC – Module 7. The Cloud
Serverless computing
Function as a Service (FaaS)
Applications broken up into functions, each of these hosted by a cloud provider
Developers do not need to worry about backend infrastructure (servers)
No specific machine assigned to a function
Even faster to deploy than containers
Charging based on the amount of time each function runs
Key advantages: low cost, fast to instantiate, high scalability
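As a rough illustration of treating a function as the unit of deployment, the sketch below shows a stateless handler written in Python. The handler name, the event fields, and the threshold are assumptions made for this example; each provider defines its own function signature and event format.

```python
import json

# Hypothetical threshold, chosen only for this illustration.
HIGH_HEART_RATE = 120


def handle(event):
    """Stateless function: everything it needs arrives in the event."""
    record = json.loads(event["body"])
    rate = record["heart-rate"]
    status = "alert" if rate > HIGH_HEART_RATE else "ok"
    return {"statusCode": 200, "body": json.dumps({"status": status})}


# Example invocation with a made-up event.
response = handle({"body": json.dumps({"heart-rate": 67})})
```

Because the handler keeps no state between calls, the platform is free to run each invocation on any machine.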
Q. Compared to containers, what might be some of the disadvantages of serverless computing?
Protocols for IoT cloud interfacing
Putting the “things” online
By now, you will have learned how to:
Program embedded devices
Connect devices to a gateway using different wireless technologies
Instantiate apps in the cloud that could handle data produced by IoT devices
Question to be answered: How to integrate embedded devices with cloud logic?
Cloud communication protocols
HTTP-based RESTful API
Lightweight HTTP service with REpresentational State Transfer (REST) architectural style
• Resources are accessible via Uniform Resource Identifiers (URIs)
• Representations are exchanged using a set of standard methods (e.g., POST, GET, etc.)
• Resources are decoupled from representations; content can be accessed as HTML, XML, JSON, etc.
• Apart from explicit CRUD (Create, Read, Update, Delete) operations, resource interactions are stateless
Example:
• IoT device measures heart rate and represents the sampled values in JavaScript Object Notation (JSON)
record = {
    "date": "2020-02-28",
    "time": "09:05:32",
    "heart-rate": 67
}
Cloud communication protocols
HTTP-based RESTful API
Example (continued):
• Representations sent to the server end-point using HTTP POST method
• Python code snippet:
import requests
import json
…
url = "http://
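The snippet on the slide is cut off after the URL. As a hedged illustration of the same step, the sketch below builds an HTTP POST request carrying the JSON record. It uses Python's standard urllib instead of the requests package named on the slide, and the end-point URL is a placeholder, not the one from the original example.

```python
import json
import urllib.request

# Placeholder end-point (assumption): replace with the real server URL.
URL = "http://example.com/api/heart-rate"

record = {"date": "2020-02-28", "time": "09:05:32", "heart-rate": 67}


def build_post(url, payload):
    """Wrap a JSON representation in an HTTP POST request (not yet sent)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_post(URL, record)
# To actually send it (needs a reachable server):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```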
Authorization
Only devices that have permission should be able to exchange data
• OAuth authorization framework (RFC 6749, 6750)
• User never passes their ID/password to the resource server
• IoT devices operate unattended, often with no keyboard → token-based access is useful
• Additional step required to be able to authenticate to the authorization server (AS)
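RFC 6750 describes how a client presents an access token as a bearer token in the Authorization header. A minimal sketch, assuming the device has already obtained a token; the function name, token value, and URL below are placeholders for illustration.

```python
import urllib.request


def authorized_request(url, access_token):
    """Attach an OAuth 2.0 bearer token (RFC 6750) to a request."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )


# Both values are placeholders, not real credentials or end-points.
req = authorized_request("http://example.com/api/heart-rate", "example-token")
```

The resource server only ever sees the token, never the user's ID or password.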
MQTT
• MQ Telemetry Transport (MQTT) is a publish/subscribe protocol that enables short message distribution between (IoT) devices
• Standardized by the Organization for the Advancement of Structured Information Standards (OASIS) and the International Organization for Standardization (ISO)
• Operates on top of TCP/IP
• Transport Layer Security (TLS) encryption may be used to protect connections
• Particularly suitable for low bandwidth, high latency networks
• Protocol currently at version 5
MQTT architecture
Broker-centered → clients do not know each other’s IP addresses
• Broker dispatches messages received from publishers (different IoT devices)
• Message distribution based on topics, e.g., /house/living-room/temperature
• Subscribers register interest in topics, creating “virtual channels”
• Several topics can be matched at once through topic levels and wildcards, e.g., /house/living-room/# covers every topic below /house/living-room
• Scalable solution, but broker is a single point of failure
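Topic-based dispatch can be sketched as a function that compares a subscription filter with a published topic, level by level. The version below is a simplified illustration, not a broker implementation: it handles the multi-level wildcard # and the single-level wildcard + that MQTT also defines, and it ignores special cases such as topics starting with $. The function name is an assumption.

```python
def topic_matches(topic_filter, topic):
    """Return True if a published topic is covered by a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True              # matches all remaining levels
        if i >= len(topic_levels):
            return False             # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False             # literal level differs
    # No wildcard consumed the rest: both must have the same depth.
    return len(filter_levels) == len(topic_levels)


matched = topic_matches("/house/living-room/#", "/house/living-room/temperature")
```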
Q. How could the MQTT architecture be made resilient to the Broker failing?
MQTT Quality of Service (QoS)
Three levels defined referring to message delivery guarantees
QoS 0
• Each message transmitted at most once
• Best-effort delivery → message not acknowledged by broker
• Message not stored and delivery not reattempted if message lost in transit to broker
QoS 1
• Each message guaranteed to be delivered at least once
• A message may be delivered more than once
• Messages stored by sender until acknowledged by broker via a PUBACK
QoS 2
• Each message received by broker exactly once
• Four-way handshake for each message
• Highest delivery guarantee but longest delivery time
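The at-least-once behaviour of QoS 1 can be illustrated with a toy simulation: the sender keeps the message and retransmits until a PUBACK arrives, so a lost acknowledgement leads to a duplicate at the broker. The class and function below are hypothetical and only model the logic; they do not use a real MQTT library or a network.

```python
class ToyBroker:
    """Stores every PUBLISH it sees; drops the first `lost_acks` PUBACKs."""

    def __init__(self, lost_acks=0):
        self.received = []
        self.lost_acks = lost_acks

    def publish(self, packet_id, payload):
        self.received.append((packet_id, payload))
        if self.lost_acks > 0:
            self.lost_acks -= 1
            return None          # PUBACK lost in transit
        return packet_id         # PUBACK carries the packet identifier


def send_qos1(broker, packet_id, payload, max_attempts=5):
    """Retransmit until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if broker.publish(packet_id, payload) == packet_id:
            return attempt       # acknowledged: sender may discard the message
    raise TimeoutError("no PUBACK received")


broker = ToyBroker(lost_acks=1)
attempts = send_qos1(broker, packet_id=1, payload="21.5")
```

With one lost PUBACK the sender transmits twice, so the broker sees the same message two times, which is the duplicate delivery that QoS 1 permits.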
Retained messages
QoS applies to publisher-broker communication → publisher has no guarantees that message was received by subscribers
Subscriber has no knowledge about when the first message on a topic will be received → this depends strictly on publisher
MQTT messages with the “retain” flag set address these issues
Broker stores the last message with a retain flag set, together with the QoS for the corresponding topic
Clients receive the retained messages immediately after subscribing
Retained messages
A subscriber also receives a retained message if its subscription covers the topic through a topic level (not necessarily the exact topic)
Example:
Client A publishes a retained message to /home/living-room/temperature
Client B receives the last temperature reading for that room as soon as it subscribes to /home/#
A publisher must send a new retained message with zero payload to delete an earlier retained message
Q. For what kind of messages would setting the retain flag not be appropriate?
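The retained-message rules can be sketched with a toy in-memory broker: it keeps the last retained payload per topic, hands it to new subscribers, and forgets it when a zero-length retained payload arrives. The class is hypothetical, handles only exact filters and a trailing /# wildcard, and leaves out QoS and live delivery.

```python
def _covers(topic_filter, topic):
    """Exact match, or match through a trailing multi-level wildcard."""
    if topic_filter.endswith("/#"):
        prefix = topic_filter[:-2]
        return topic == prefix or topic.startswith(prefix + "/")
    return topic_filter == topic


class RetainingBroker:
    def __init__(self):
        self.retained = {}                      # topic -> last retained payload

    def publish(self, topic, payload, retain=False):
        if not retain:
            return                              # live delivery not modelled
        if payload == b"":
            self.retained.pop(topic, None)      # zero payload clears the topic
        else:
            self.retained[topic] = payload

    def subscribe(self, topic_filter):
        """Return the retained messages a new subscriber gets immediately."""
        return {t: p for t, p in self.retained.items() if _covers(topic_filter, t)}


broker = RetainingBroker()
broker.publish("/home/living-room/temperature", b"21.5", retain=True)
on_subscribe = broker.subscribe("/home/#")
```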
Constrained Application Protocol (CoAP)
• Specified by RFC 7252 – similar to HTTP → works with URIs (coap://), but specialized for constrained devices and low-power, lossy networks
• UDP-based transport
  • faster
  • asynchronous messaging, establishing a session not required
  • reliability left to application layer
• Low-overhead 4-byte packet header
• Ability to cache requests and responses
• Support for sending messages to groups of recipients (multicast)
• Incorporates mechanism for resource discovery
CoAP features
Device can act both as client and server – no need for a central coordinator
Tokens inside packets used to match responses to requests
“Observe” option can be used with a request to register interest in receiving updates on a specific resource; useful for monitoring when the state of a device has changed
Block-wise transfers used to transmit large payloads, e.g., firmware updates
Datagram TLS (DTLS) used to protect communication from eavesdropping
CoAP packet format
• Version – 2 bits, indicating protocol version
• T – Type, 2 bits: confirmable (0), non-confirmable (1), acknowledgement (2), or reset (3)
• TKL – 4 bits, token length
• Code – 8 bits (3-bit class, 5-bit detail); response codes are similar to HTTP status codes (e.g., 4.04 – not found, 5.03 – service unavailable, etc.)
• Message ID – 16 bits, used to detect duplicates
• Token – optional 0–8 bytes, request/response matching
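The fixed 4-byte header and the token can be decoded with a few bit operations. The sketch below follows the field layout of RFC 7252; the function name and the keys of the returned dictionary are choices made for this example, and options and payload are not parsed.

```python
import struct

TYPE_NAMES = {0: "CON", 1: "NON", 2: "ACK", 3: "RST"}


def parse_coap_header(packet):
    """Decode version, type, token length, code, message ID and token."""
    if len(packet) < 4:
        raise ValueError("CoAP header is 4 bytes long")
    first, code, message_id = struct.unpack("!BBH", packet[:4])
    token_length = first & 0x0F                 # low 4 bits
    if token_length > 8:
        raise ValueError("token length must be 0-8 bytes")
    token = packet[4:4 + token_length]
    if len(token) != token_length:
        raise ValueError("packet shorter than declared token length")
    return {
        "version": first >> 6,                  # top 2 bits
        "type": TYPE_NAMES[(first >> 4) & 0x03],
        "token_length": token_length,
        "code": f"{code >> 5}.{code & 0x1F:02d}",   # class.detail
        "message_id": message_id,
        "token": token,
    }


# Hand-built sample: version 1, confirmable, 2-byte token, code 0.01 (GET).
SAMPLE = bytes([0x42, 0x01, 0x12, 0x34, 0xAA, 0xBB])
header = parse_coap_header(SAMPLE)
```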
Lightweight Machine-to-Machine (LwM2M)
An IoT device management protocol
• Application layer communications protocol
• Simple resource model → set of objects defined for interacting with clients
• Used together with CoAP to keep bandwidth consumption low
• Transport agnostic (UDP, TCP, and SMS)
• Support for end-to-end security (via Object Security for Constrained RESTful Environments – OSCORE)
• Used for device bootstrap, configuration, firmware update, and diagnostics
Data processing in the cloud
Data processing pipelines
Series of elements that act on data, with the output of one being the input to another
On-device/edge pre-processing – noise removal, averaging, summarization
Real-time computation on multiple data streams (e.g., Storm), stream processing (e.g., Kafka), or aggregation into non-relational distributed databases (e.g., HBase)
Distributed storage (e.g., Hadoop DFS) and offline analysis (e.g., MapReduce)
Data warehousing, query, and analysis engines (e.g., Apache Hive)
Workflow scheduling (e.g., Apache Oozie)
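The idea of chaining elements so that the output of one is the input to the next can be sketched with Python generators. The two stages below stand in for on-device pre-processing (noise removal, then averaging); the stage names, the valid range, and the window size are assumptions made for this example.

```python
from statistics import mean


def remove_outliers(samples, low, high):
    """Stage 1: drop readings outside a plausible range (noise removal)."""
    for sample in samples:
        if low <= sample <= high:
            yield sample


def window_average(samples, size):
    """Stage 2: summarize every `size` readings by their mean."""
    window = []
    for sample in samples:
        window.append(sample)
        if len(window) == size:
            yield mean(window)
            window = []


def summarize(readings, low=30, high=220, size=3):
    """Connect the stages: the output of one is the input to the next."""
    return list(window_average(remove_outliers(readings, low, high), size))


summary = summarize([67, 68, 250, 66, 70, 0, 69, 71])
```

Only the two summary values leave the device, instead of all eight raw readings.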
Distributed data storage
Scaling storage to big data
• Distributed file systems have been developed to deal with vast volumes of data (which may also be generated by millions of IoT devices)
• The idea is to use a network protocol to coordinate access to storage and locate files
• Clients remain unaware of the exact location of the content accessed, but see DFS just as regular storage (access transparency)
• The namespace used covers both local and remote files, but does not give a file’s location (location transparency)
• Clients reading/modifying content have the same shared view of the file system (concurrency transparency)
The Hadoop Distributed File System (HDFS)
• NameNode: tracks files, manages FS, and stores metadata
• DataNode: stores actual data in blocks
• TCP/IP sockets used for communication
• Client-NameNode interaction via a JobTracker
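The split between metadata and data can be illustrated with a toy in-memory model: a file is cut into fixed-size blocks, the blocks are spread over DataNodes, and the NameNode only remembers which block lives where. This is not the HDFS API; the class names, the round-robin placement, and the tiny block size are assumptions, and replication is left out.

```python
class ToyDataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}                    # block id -> bytes


class ToyNameNode:
    def __init__(self, datanodes, block_size):
        self.datanodes = datanodes
        self.block_size = block_size
        self.metadata = {}                  # file name -> [(block id, node name)]

    # In real HDFS the client moves block data to and from DataNodes itself;
    # the toy NameNode does it here only to keep the sketch short.
    def write(self, filename, data):
        locations = []
        for start in range(0, len(data), self.block_size):
            index = start // self.block_size
            node = self.datanodes[index % len(self.datanodes)]   # round robin
            block_id = f"{filename}#{index}"
            node.blocks[block_id] = data[start:start + self.block_size]
            locations.append((block_id, node.name))
        self.metadata[filename] = locations

    def read(self, filename):
        nodes = {node.name: node for node in self.datanodes}
        return b"".join(
            nodes[name].blocks[block_id]
            for block_id, name in self.metadata[filename]
        )


namenode = ToyNameNode([ToyDataNode("dn0"), ToyDataNode("dn1")], block_size=4)
namenode.write("log.txt", b"abcdefghij")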
Workflow schedulers
• Workflows can be seen as directed acyclic graphs (DAGs) comprising control flow and action nodes.
• Control flow nodes specify the start/end of workflows and mechanisms to control execution paths
• Action nodes trigger the execution of processing and computation tasks
• Possible to run multiple jobs in parallel
• Example: Apache Oozie
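Running a workflow expressed as a DAG amounts to executing each action only after everything it depends on has finished. The sketch below uses the standard-library graphlib module (Python 3.9 or later) to obtain such an order. It is an illustration of the DAG idea only and does not reflect how Oozie itself is configured; the node names and actions are made up.

```python
from graphlib import TopologicalSorter


def run_workflow(dependencies, actions):
    """Run each action after all of its predecessors; return the order used."""
    order = []
    for node in TopologicalSorter(dependencies).static_order():
        actions[node]()
        order.append(node)
    return order


# Each key maps to the set of nodes that must finish before it starts.
dependencies = {
    "clean": {"ingest"},
    "stats": {"ingest"},
    "report": {"clean", "stats"},
}
log = []
actions = {name: (lambda n=name: log.append(n)) for name in
           ("ingest", "clean", "stats", "report")}

order = run_workflow(dependencies, actions)
```

Here "clean" and "stats" do not depend on each other, so a scheduler would be free to run them in parallel.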
Device lifecycle management
Device management platform
Abstracting away IoT complexity
• Enables connecting a wide range of trusted IoT devices easily
• Device administration (including updating)
• Ingest real-time data from devices and extract insights that can be acted upon
• Flexible cloud-based/on-premises deployment
Connectivity management
Automating connectivity lifecycle
• Rapid activation of SIM cards and virtual eUICC profiles to bootstrap connectivity.
• Integration with mobile network operators to allow real-time monitoring
• RESTful APIs to permit integration with other systems
• Use cases: smart metering, asset tracking, industrial IoT
Device management
Provisioning, deployment, and remote updating of devices
• Supports diverse device profiles
• Enables configuring devices with unique cryptographic identities
• Facilitates the distribution of firmware updates to devices and recovery in case of failure
Data management
Access, integrate, and act on insights derived from IoT data
• Collect and unify structured/unstructured data from heterogeneous devices, enterprise systems, and third parties
• Machine learning and query engine to extract insights from data
• Orchestration of workflows integrated with customized services
Summary
Cloud computing has become the primary paradigm for scalable compute and storage supply, powering most modern web applications and connected-device platforms from global data centres. IoT devices may be supported by vendor-operated cloud platforms.
• Virtualization is the key underlying enabling technology, migrating existing OSes/apps
  • move towards increasingly smaller components: VMs → containers → functions
• IoT interfacing protocols support local and cloud communication
  • RESTful HTTP (cloud platform upload); MQTT (local/infrastructure comms); CoAP, LwM2M (device)
• Big data processing allows cloud computing to harness the data-gathering opportunity of IoT
  • wisdom gained from millions of thermostats (Nest), voices (Alexa), …
• Device management platforms enable this, uniformly
  • ways to manage devices, deployments/users, data