
AI PRIMER (IMDA PUBLICATIONS)

TYPES OF DATA

ARTIFICIAL INTELLIGENCE DESCRIBED ON A SINGLE CHART

Source: Schulte Research Estimates

[Chart: physical data (IoT, 40bn sensors) and digital data (5bn mobile devices) flow through infrastructure (cloud, quantum computing) into neural networks / machine learning, producing AI and its application areas: 1. Financial services; 2. Cognitive services; 3. Lifestyle/health; 4. Autonomous cars; 5. Robotics; 6. Advertising.]

DATA SCIENCE VS. BIG DATA VS. DATA ANALYTICS

 Data Science: Dealing with both unstructured and structured data, Data Science is a field that comprises everything related to data cleansing, preparation, and analysis. It combines statistics, mathematics, programming, problem-solving, capturing data in ingenious ways, the ability to look at things differently, and the activity of cleansing, preparing, and aligning data. In simple terms, it is the umbrella of techniques used to extract insights and information from data.

 Big Data: Big Data refers to enormous volumes of data that cannot be processed effectively with traditional applications. Its processing begins with raw data that is not aggregated and is most often impossible to store in the memory of a single computer. Gartner defines Big Data as "high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation."

 Data Analytics: the science of examining raw data in order to draw conclusions from that information. Data Analytics involves applying an algorithmic or mechanical process to derive insights, for example running through several data sets to look for meaningful correlations between them. It is used across industries to allow organizations and companies to make better decisions, and to verify or disprove existing theories or models. The focus of Data Analytics is inference, the process of deriving conclusions based solely on what the researcher already knows.

https://www.simplilearn.com/data-science-vs-big-data-vs-data-analytics-article


DATA STRUCTURE

STRUCTURED VS. UNSTRUCTURED DATA

https://www.prowebscraper.com/blog/structured-vs-unstructured-data-best-thing-you-need-to-know/

WHAT IS STRUCTURED DATA?

 Structured data refers to any data that resides in a fixed field within a record or file. This includes data contained in relational databases and spreadsheets (a minimal code sketch follows the examples below).

 Structured data examples:

 Metadata (time and date of creation, file size, author, etc.)

 Library catalogues (date, author, place, subject, etc.)

 Census records (birth, income, employment, place, etc.)

 Economic data (GDP, PPI, ASX, etc.)

 Facebook like button

 Phone numbers (and the phone book)

 Databases (structuring fields)
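To make the "fixed field" idea concrete, here is a minimal, illustrative Python sketch (not from the primer) using the standard library's sqlite3 module; the table and values are invented for illustration.

```python
# A minimal sketch: structured data lives in fixed, predefined fields,
# so every record shares the same schema.
import sqlite3

conn = sqlite3.connect(":memory:")  # throw-away in-memory database
cur = conn.cursor()

# Fixed fields: every row must supply a name, a birth year and an income.
cur.execute("CREATE TABLE census (name TEXT, birth_year INTEGER, income REAL)")
cur.executemany(
    "INSERT INTO census VALUES (?, ?, ?)",
    [("Alice", 1985, 72000.0), ("Bob", 1990, 54000.0)],
)

# Because the schema is known in advance, structured queries
# (filters, joins, aggregates) are straightforward.
for row in cur.execute("SELECT name, income FROM census WHERE birth_year > 1986"):
    print(row)  # ('Bob', 54000.0)
```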

WHAT IS UNSTRUCTURED DATA?

 Unstructured data (or unstructured information) is information that either does not have a predefined data model or is not organized in a predefined manner.

 Unstructured data examples are as follows:

 Text files (word processing, spreadsheets, presentations, etc.)

 Email body

 Social media (data from Facebook, Twitter, LinkedIn)

 Websites (YouTube, Instagram, photo-sharing sites)

 Mobile data (text messages)

 Communications (chat, instant messaging, phone recordings, collaboration software)

 Media (MP3 (MPEG Audio Layer-3) compressed files, digital photos, audio and video files)

COMPUTER OR MACHINE-GENERATED

Machine-generated structured data sources:

 Sensor data: Radio frequency ID tags, smart meters, medical devices, and Global Positioning System data are machine-generated structured data. Supply chain management and inventory control are what get companies interested in this.

 Web log data: When systems and mechanisms such as servers, applications and networks operate, they capture all kinds of data about those operations, producing enormous piles of data of diverse kinds. Based on this data, you can monitor service-level agreements or predict security breaches.

 Point-of-sale data: When digital transactions take place over the counter of a shopping mall, the machine captures a lot of data. This is machine-generated structured data covering the barcode and other relevant details of the product.

 Financial data: Computer programs are used with financial data far more now, and processes are automated with their help. Take the case of stock trading: it carries structured data such as the company symbol and dollar value. Part of this data is machine generated and some of it is human generated.

Machine-generated unstructured data sources:

 Satellite images: Weather data, or the data government agencies procure through their satellite surveillance imagery, is machine-generated unstructured data. Google Earth and similar mechanisms aptly illustrate the point.

 Scientific data: Scientific data such as seismic imagery, atmospheric data, high-energy physics and so forth is machine-generated unstructured data.

 Photographs and video: When machines capture images and video for security, surveillance and traffic purposes, the data produced is machine-generated unstructured data.

 Radar or sonar data: This includes vehicular, meteorological, and oceanographic seismic profiles.

HUMAN-GENERATED

Human-generated structured data sources:

 Input data: When a human user enters input such as name, age, income, or non-free-form survey responses into a computer, it is human-generated structured data. Companies can find this type of data quite useful in studying customer behavior.

 Clickstream data: This is the type of data generated when a user clicks a link on a website. Businesses like this type of data because it allows them to study customer behavior and purchase patterns.

 Gaming-related data: Every move a human user makes in a game on a virtual platform produces a piece of information. How users navigate a gaming portfolio is a source of a lot of interesting data.

Human-generated unstructured data sources:

 Text internal to your company: This is the type of data that is restricted to a given company, such as documents, logs, survey results, and emails. Such enterprise information forms a big part of the unstructured text information in the world.

 Social media data: This kind of data is generated when human users interact with social media platforms such as Facebook, Twitter, Flickr, YouTube, and LinkedIn.

 Mobile data: This type of data includes information such as text messages and location information.

 Website content: This type of data comes from sites that deliver unstructured content, such as YouTube, Flickr, and Instagram.

CHARACTERISTICS

 Flexibility: structured data is schema-dependent (rigorous schema); unstructured data has no schema and is very flexible.

 Scalability: scaling a database schema is difficult; unstructured data is highly scalable.

 Robustness: structured data is robust; unstructured data less so.

 Query performance: structured data allows complex queries and joins; for unstructured data only textual queries are possible.

 Accessibility: structured data is easy to access; unstructured data is hard to access.

 Availability: structured data is percentage-wise the smaller share; unstructured data the larger.

 Association: structured data is organized; unstructured data is scattered and dispersed.

 Analysis: structured data is efficient to analyse; unstructured data needs additional preprocessing.

 Appearance: structured data is formally defined; unstructured data is free-form.

STRUCTURED DATA STORAGE TECHNIQUE

 This type of data storage is used in the context of storage-area network (SAN) environments. In such environments, data is stored in volumes, also referred to as blocks.

 An arbitrary identifier is assigned to every block. It allows the block to be stored and retrieved, but there is no metadata providing further context.

 Virtual machine file system volumes and structured database storage are typical use cases of block storage.

 With block storage, raw storage volumes are created on the device. With the aid of a server-based system, the volumes are connected and each one is treated as an individual hard drive.

UNSTRUCTURED DATA STORAGE TECHNIQUE

 This technique is basically a way of storing, organizing and accessing data on disk. The difference, however, is that it does so in a more scalable and cost-effective manner.

 This kind of storage system makes it possible to retain huge volumes of unstructured data. When it comes to storing photos on Facebook, songs on Spotify, or files in collaboration services such as Dropbox, object storage comes into play.

 Each object incorporates the data itself, a lot of metadata and a singularly unique identifier (see the sketch below). This kind of storage can be done at different levels, such as the device level, system level and interface level.

 Since objects are robust, this kind of storage works well for long-term storage of data archives, analytics data and service provider storage with SLAs (service-level agreements) linked to data delivery.
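As an illustration of the object model (data plus metadata plus a unique identifier), here is a hedged sketch using the boto3 client for AWS S3; the bucket name is hypothetical and the calls assume credentials are already configured in your environment.

```python
# A hedged object-storage sketch with boto3 (AWS S3).
import uuid
import boto3

s3 = boto3.client("s3")
object_key = str(uuid.uuid4())  # the object's singularly unique identifier

# Each object bundles the data itself, rich metadata and a unique key.
s3.put_object(
    Bucket="example-archive-bucket",   # hypothetical bucket name
    Key=object_key,
    Body=b"...binary photo, song or document bytes...",
    Metadata={"uploaded-by": "primer-demo", "content-kind": "photo"},
)

# Retrieval is by identifier, not by a file-system path.
obj = s3.get_object(Bucket="example-archive-bucket", Key=object_key)
print(obj["Metadata"])
```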

https://www.prowebscraper.com/blog/structured-vs-unstructured-data-best-thing-you-need-to-know/

Source for the last 9 slides.

8 VITAL ALTERNATIVE DATA TYPES

 App Usage: behavioural data from purchases, etc.

 Credit/Debit Card: buying patterns and choices.

 Geo-Location: tracking via Wi-Fi or Bluetooth beacons.

 Public Data: patents, government contracts, import/export data, etc.

 Satellite: satellite feeds and low-level drones for supply chains, tracking agricultural yields, oil and gas storage, etc.

 Social or Sentiment: social media, news, management communications; comments, shares and likes on social media.

 Web Data: data scraped from websites for product descriptions, flight bookings, real estate listings, etc.

 Web Traffic: demographics of visitors to a particular website, for travel bookings and e-commerce as examples.

THE 8 V'S OF BIG DATA (EXPANDED FROM THE ORIGINAL 3 V'S: NOS. 1, 5 AND 6 IN THE CHART)

https://www.m-brain.com/home/technology/big-data-with-8-vs/

https://www.educba.com/small-data-vs-big-data/

STORAGE

DISTRIBUTED VS CENTRALIZED NETWORKS FOR STORAGE

 Centralized data networks are those that maintain all the data on a single computer in a single location; to access the information, you must access the main computer of the system, known as the "server".

 A distributed data network, on the other hand, works as a single logical data network, installed on a series of computers (nodes) in different geographic locations that are not connected to a single processing unit but are fully connected among themselves, providing integrity and accessibility of the information from any point. In this system all the nodes contain information and all the clients of the system are on an equal footing. In this way, distributed data networks can perform autonomous processing. The clearest example is the blockchain, but there are others, such as Spanner, a distributed database created by Google.

https://icommunity.io/en/redes-centralizadas-vs-distribuidas/ (source for the next six slides)

Figure: distributed vs centralized networks.

ADVANTAGES AND DISADVANTAGES OF CENTRALIZED, DECENTRALIZED AND DISTRIBUTED DATA NETWORKS

 Centralized and distributed networks have different characteristics, and with them different advantages and disadvantages. For example, centralized networks are the easiest to maintain, since they have only a single point of failure; this is not the case for distributed networks, which are in theory more difficult to maintain.

Figure: centralised, decentralised and distributed network topologies.

BLOCKCHAIN IS A DISTRIBUTED DATA NETWORK

 There are other types of distributed data networks besides the blockchain. In fact, consensus and the immutability of data are not unique characteristics of the blockchain; other distributed data networks also have these characteristics, such as Paxos, Raft, Google HDFS, Zebra, CouchDB and Datomic, among others.

 But two characteristics really differentiate the blockchain from the rest of the data networks: access control for writing and reading data is truly decentralized, unlike other distributed data networks where it is logically centralized; and it can secure transactions without the need for trusted third parties in a competitive environment.

 The blockchain has unique characteristics compared with the rest of the available data networks. However, this does not mean that the blockchain is always the best option for every data storage case. It really depends on the needs and requirements of a company or organization when using a database.

COMPARATIVE SUMMARY

 1. Security:

 CENTRALIZED: If someone gains access to the server holding the information, any data can be added, modified or deleted.

 DISTRIBUTED: All data is distributed between the nodes of the network. If something is added, edited or deleted on any computer, it will be reflected on all computers in the network. If a legitimate amendment is accepted, the new information is disseminated to the other users throughout the network; otherwise, the data is copied back to match the other nodes. The system is therefore self-sufficient and self-regulating, and the databases are protected against deliberate attacks or accidental changes to information.

 2. Availability:

 CENTRALIZED: If there are many requests, the server can break down and stop responding.

 DISTRIBUTED: Can withstand significant pressure on the network. All the nodes in the network hold the data, so requests are distributed among the nodes. The pressure therefore does not fall on one computer but on the entire network, and the total availability of the network is much greater than in the centralized case.

COMPARATIVE SUMMARY

 3. Accessibility:

 CENTRALIZED: If the central storage has problems, you will not be able to obtain your information until the problems are solved. In addition, different users have different needs, but the processes are standardized and can be inconvenient for customers.

 DISTRIBUTED: Given that the number of computers in a distributed network is large, DDoS attacks are possible only if their capacity is much greater than that of the network, which would make the attack very expensive. In this case the response time is very similar to that of a centralized model. Distributed networks can therefore be considered secure.

COMPARATIVE SUMMARY

 4. Data transfer rates:

 CENTRALIZED: If the nodes are located in different countries or continents, the connection to the server can become a problem.

 DISTRIBUTED: The client can choose the node and work with all the required information.

 5. Scalability:

 CENTRALIZED: Centralized networks are difficult to scale because the capacity of the server is limited and traffic cannot be infinite. In a centralized model, all clients are connected to the server, and only the server stores all the data, so all requests to receive, change, add or delete data go through the main computer. But server resources are finite; as a result, the server can work effectively only for a specific number of participants. If the number of clients is greater, the server load may exceed its limit at peak times.

 DISTRIBUTED: Distributed models do not have this problem, since the load is shared among several computers.

ARTIFICIAL INTELLIGENCE

TECHNOLOGY STUDY – AI PRIMER

 Definition of AI

 History of AI

 • Symbolic AI

 • Machine Learning

 • Deep Learning

 Primer Knowledge of AI

 • Machine Learning

 Supervised Learning

 Unsupervised Learning

 Semi-Supervised Learning

 Reinforcement Learning

 • Deep Learning

DEFINITION OF AI

 AI originated more than 50 years ago, and it is generally agreed that John McCarthy coined the phrase "artificial intelligence" in a written proposal for a workshop at Dartmouth in 1956. AI is now commonly understood as the study and engineering of computations that make it possible to perceive, reason, act, learn and adapt.

 In the widely referenced book "Artificial Intelligence: A Modern Approach", Dr Stuart Russell and Dr Peter Norvig define AI as:

 "The study of agents that receive percepts from the environment and perform actions."

DEFINITIONS OF AI ALONG TWO DIMENSIONS

 The various definitions of AI, laid out along two dimensions, are also discussed.

 The definitions on top are concerned with thought processes and reasoning, whereas the ones on the bottom address behaviour.

 The definitions on the left measure success in terms of fidelity to human performance, whereas the ones on the right measure against an ideal performance measure: rationality.

• SYMBOLIC AI

 In the 1940s and early 1950s, a handful of scientists from a variety of fields, including mathematics, psychology, engineering, economics, and political science, began to discuss the possibility of creating an artificial brain.

 The term "Artificial Intelligence" was coined at a Dartmouth conference, and AI research was founded as an academic discipline in 1956.

 At the early stage, teaching machines how to play chess was one of the main research focuses of AI.

 Chess has explicit playing rules, and many experts in AI believed that AI could be achieved by having programmers handcraft a sufficiently large set of explicit rules for manipulating knowledge; these rules are human-readable representations of problems and logic.

 This is known as "Symbolic AI", and it was the dominant paradigm in AI from the 1950s to the late 1980s. Figure 2 illustrates how Symbolic AI works.

ILLUSTRATION OF SYMBOLIC AI

 Symbolic AI reached its peak popularity during the "Expert Systems" boom of the 1980s.

 Expert systems are a logical and knowledge-based approach.

 Their power came from the expert knowledge they contained, but that knowledge also limited the further development of expert systems.

 The knowledge-acquisition problem and the difficulty of growing and updating the knowledge base were the major challenges of expert systems.

 A new type of AI approach was needed to go beyond the rule-based technologies of that time (see the sketch of the rule-based idea below).
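A toy sketch of the Symbolic AI idea, invented for illustration: a human handcrafts explicit, readable rules and the program merely applies them; nothing is learned from data.

```python
# Handcrafted, human-readable rules: the essence of Symbolic AI.
RULES = [
    (lambda facts: "has_feathers" in facts and "lays_eggs" in facts, "bird"),
    (lambda facts: "has_fur" in facts and "gives_milk" in facts, "mammal"),
]

def classify(facts):
    """Apply the handcrafted rules in order; no learning takes place."""
    for condition, conclusion in RULES:
        if condition(facts):
            return conclusion
    return "unknown"

print(classify({"has_feathers", "lays_eggs"}))  # bird
print(classify({"has_fur", "gives_milk"}))      # mammal
```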

• MACHINE LEARNING

 Machine learning, reorganized as a subfield of AI, started to flourish in the 1990s.

 Unlike Symbolic AI, machine learning does not require humans to encode the rules in advance.

 It arises from this question: could a computer go beyond "what we know how to order it to perform" (Symbolic AI), and learn on its own how to perform a specified task?

 With machine learning, humans input data as well as the expected answers for that data, and the machine "learns" by itself and outputs the rules.

 These learned rules can then be applied to new data to produce new answers.

 Figure 3 on the next page illustrates the simple structure of machine learning, and a code sketch of the idea follows it.

ILLUSTRATION OF MACHINE LEARNING (FRANCOIS, 2017)
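A minimal sketch of this structure, assuming scikit-learn is available: we supply data and the expected answers, and the machine outputs the rule, which can then be applied to new data. The tiny dataset is invented for illustration.

```python
# Machine learning: data + expected answers in, learned rule out.
from sklearn.tree import DecisionTreeClassifier

data = [[1], [2], [3], [7], [8], [9]]   # inputs
answers = [0, 0, 0, 1, 1, 1]            # expected answers for each input

model = DecisionTreeClassifier()
model.fit(data, answers)                # the machine "learns" the rule itself

# The learned rule can now be applied to new data to produce new answers.
print(model.predict([[4], [6]]))        # e.g. [0 1]
```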

THE DIFFERENCE

 Starting from the 1990s, machine learning changed its goal from achieving AI to tackling solvable problems of a practical nature.

 It shifted focus away from the symbolic approaches it had inherited from AI, and toward methods and models borrowed from statistics and probability theory (Langley 2011).

• DEEP LEARNING

 AI has gone through a series of ups and downs, often referred to as "AI summers and winters", as interest in AI has alternately grown and diminished.

 This is illustrated in Figure 4. In this evolution roadmap, we can see that AI is a general field, which covers machine learning.

 Deep learning is a prominent branch of machine learning and the symbol of the current AI boom, which began about eight years ago.

EVOLUTION OF AI

LEARNING – MACHINE VERSUS DEEP

 Compared to machine learning, deep learning automates the feature engineering of the input data (the process of learning the optimal features of the data to create the best outcome), and allows algorithms to automatically discover complex patterns and relationships in the input data.

 Deep learning is based on Artificial Neural Networks (ANNs), which were inspired by information processing and distributed communication nodes in biological systems, like the human brain.

 Figure 5 shows the information-processing framework of the human brain and of ANNs.

 An ANN imitates the human brain's process by using multiple layers to progressively extract different levels of features/interpretations from raw input data (each hidden layer represents one feature/interpretation of the data).

 In essence, deep learning algorithms "learn how to learn".

ALTHOUGH AI RESEARCH STARTED IN THE 1950S, ITS EFFECTIVENESS AND PROGRESS HAVE BEEN MOST SIGNIFICANT OVER THE LAST DECADE, DRIVEN BY THREE MUTUALLY REINFORCING FACTORS:

 The availability of big data: from various sources including businesses, e-commerce, social media, science, wearable devices, government, etc.

 Dramatic improvement of machine learning algorithms: the sheer amount of available data accelerates algorithm innovation.

 More powerful computing and cloud-based services: these make it possible to realize and implement advanced AI algorithms, like deep neural networks.

 Significant progress in algorithms, hardware, and big data technology, combined with the financial incentives to find new products, has also contributed to the AI technology renaissance.

 Today, AI has moved from "let the machine know what we know" to "let the machine learn what we may not know" to "let the machine automatically learn how to learn".

 Researchers are working on much wider applications of AI that will revolutionize the ways in which people work, communicate, study and entertain themselves.

 Products and services incorporating such innovation will become part of people's day-to-day lives in the near future.

HUMAN BRAIN AND ARTIFICIAL NEURAL NETWORKS

Dendrites are the segments of the neuron that receive stimulation in order for the cell to become active. They conduct electrical messages to the neuron cell body for the cell to function.

An axon, also called a nerve fibre, is the portion of a nerve cell (neuron) that carries nerve impulses away from the cell body. A neuron typically has one axon that connects it with other neurons or with muscle or gland cells. Some axons may be quite long, reaching, for example, from the spinal cord down to a toe.

The function of the synapse is to transfer electric activity (information) from one cell to another. The transfer can be from nerve to nerve (neuro-neuro), or nerve to muscle (neuro-myo). The region between the pre- and postsynaptic membranes is very narrow, only 30-50 nm.

ACTIVATION

 Activation functions are mathematical equations that determine the output of a neural network.

 The function is attached to each neuron in the network, and determines whether the neuron should be activated ("fired") or not, based on whether each neuron's input is relevant for the model's prediction.

 A cost function is then adopted to measure the "error", that is, the difference between the true output value and the predicted output value.

 It basically judges how wrong or bad the learned model is in its current form.

 The ideal goal is to have zero cost.

 Usually, a minimum cost value is set as a stopping criterion. A minimal sketch of both ideas follows.
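A minimal numpy sketch of both ideas for a single neuron: a sigmoid activation that decides how strongly the neuron "fires", and a mean-squared-error cost that measures how wrong the current model is. Names and values are illustrative only.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: squashes any input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def mse_cost(y_true, y_pred):
    """Mean squared error: zero only when every prediction is perfect."""
    return np.mean((y_true - y_pred) ** 2)

x = np.array([0.5, -1.2])           # inputs arriving at one neuron
w = np.array([0.8, 0.3])            # connection weights
activation = sigmoid(np.dot(w, x))  # the neuron "fires" with this strength
error = mse_cost(np.array([1.0]), np.array([activation]))
print(activation, error)            # training would now try to reduce error
```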

FEED-FORWARD AND BACKPROPAGATION LEARNING

1. BACKPROPAGATION

2. COST FUNCTION = ERROR

3. ADJUST THE WEIGHTS

4. MINIMISE THE COST FUNCTION

5. GET THE "OPTIMUM" WEIGHTS LAYER BY LAYER

ERROR, BACKPROPAGATION, GRADIENT DESCENT

 After getting the "error", the backpropagation process follows, to reduce the current error cost.

 Backpropagation tweaks the weights of the previous layer, aiming to get the value we want in the current layer.

 We do this recursively throughout however many layers are in the network.

 Gradient descent is usually used to tweak the weights.

 It is a first-order iterative optimization algorithm for finding a minimum (ideally the global minimum) of the cost function.

 In general, when we adjust the current weight, we consider moving to the left or right of the current value, figure out which direction produces a slope with a lower value than the current one, take a small step in that direction, and then try again (Figure 9; a one-dimensional sketch follows).
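A one-dimensional sketch of gradient descent, invented for illustration: the cost is (w - 3)^2, so the minimum is at w = 3, and each iteration takes a small step against the slope.

```python
# Gradient descent on cost(w) = (w - 3)**2, whose minimum is at w = 3.
def gradient(w):
    return 2 * (w - 3)        # derivative of (w - 3)**2

w = 0.0                       # arbitrary starting weight
learning_rate = 0.1           # size of each small step
for step in range(50):
    w -= learning_rate * gradient(w)   # step in the downhill direction

print(round(w, 4))            # approaches 3.0, the minimum of the cost
```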

GRADIENT
DESCENT

FORWARD AND BACKWARD

 Feed-forward and backpropagation form a cyclic learning process.

 We may need to repeat it thousands or even millions of times before we can find the global minimum value of the cost function.

 Once a neural network is trained, it may be used to analyze new data.

 That is, the practitioner stops the training and allows the network to function in forward-propagation mode only.

 The forward-propagation output is the predicted model used to interpret and make sense of previously unknown input data. A compact end-to-end sketch of this cycle follows.
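A compact numpy sketch, not the primer's demo, of the full cycle on the XOR problem: feed-forward, cost, backpropagation and weight updates, repeated many times; once trained, the forward pass alone makes predictions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)     # XOR targets

W1 = rng.normal(size=(2, 4))    # input -> hidden weights (random at first)
W2 = rng.normal(size=(4, 1))    # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(10000):                 # repeat the learning cycle
    # feed-forward
    hidden = sigmoid(X @ W1)
    output = sigmoid(hidden @ W2)
    # backpropagation: push the error back layer by layer
    d_output = (output - y) * output * (1 - output)
    d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)
    # gradient-descent weight updates
    W2 -= 0.5 * hidden.T @ d_output
    W1 -= 0.5 * X.T @ d_hidden

print(output.round(2))   # close to [[0], [1], [1], [0]] after training
```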

DEMO 1: LET US DO A HANDS-ON ON A NEURAL NETWORK

Open the file Demo 1 Neural Network_David Lee

Watch the Videos

FURTHER ON DEEP LEARNING

PRIMER KNOWLEDGE OF AI

 To further understand how current AI works, we will introduce the primer knowledge of deep learning in this section.

 As machine learning is the basis of deep learning, a general introduction to some basic knowledge about machine learning comes first.

MACHINE LEARNING

 Machine learning involves the creation of algorithms that can modify/adjust themselves without human intervention to produce the desired output, by learning from the input data they are fed.

 Through this learning process, the machine can categorize similar people or things, discover or identify hidden or unknown patterns and relationships, and detect anomalous behaviors in the given data, which allows it to predict or estimate possible outcomes or actions for future data.

 Therefore, to do machine learning, we usually follow five steps, from data collection and data preparation to modelling, understanding and delivering the results (as shown in Figure 6).

MACHINE LEARNING WORKFLOW

 Steps 1 and 2 are data preparation: they transform the raw data into structured data that the machine can read.

 For example, to do image classification ("dog" or "cat"), we should know what kinds of image features we need to extract, and how to extract them, such as texture, edges, and shape.

 We call these features the input data, usually represented as a vector or matrix (𝑥1, 𝑥2, 𝑥3, … , 𝑥𝑛), where 𝑥𝑖 is one structured feature.

 The output data is the corresponding label ("dog" or "cat"). Steps 4 and 5 are straightforward and easily understandable.

 Step 3, model building, is the key process of machine learning.

 The processes machines use to learn are known as algorithms. Based on the different algorithms used at this step, machine learning can be further categorized into four big types: supervised learning, unsupervised learning, semi-supervised learning and reinforcement learning.

• SUPERVISED LEARNING

 A supervised learning algorithm is, as the name suggests, trained/taught using given examples.

 The examples are labelled, meaning the desired output for each input is known.

 For example, a credit card application can be labelled either as approved or rejected.

 The algorithm receives a set of inputs (the applicants' information) along with the corresponding outputs (whether the application was approved or not) to foster learning.

 The model building, or algorithm learning, is a process of minimizing the error between the estimated output and the correct output.

 Learning stops when the algorithm achieves an acceptable level of performance, for example when the error is smaller than a pre-defined minimum error.

 The trained algorithm is then applied to unlabelled data to predict the possible output value, such as whether a new credit card application should be approved or not.

 This is helpful for what we are familiar with as Know Your Customer (KYC) in the banking business.

 There are multiple supervised learning algorithms: Bayesian statistics, regression analysis, decision trees, random forests, support vector machines (SVM), ensemble models and so on.

 Practical applications include risk assessment, fraud detection, and image, speech and text recognition. A sketch of the credit-card example follows.
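A hedged scikit-learn sketch of the credit-card example above; the applicant features and labels are invented purely for illustration, and logistic regression stands in for any of the algorithms listed.

```python
from sklearn.linear_model import LogisticRegression

# Applicant features: [annual income (k$), years employed]; label 1 = approved.
X = [[30, 1], [80, 10], [25, 0], [95, 12], [40, 3], [70, 8]]
y = [0, 1, 0, 1, 0, 1]   # the known "right answers" used for training

model = LogisticRegression()
model.fit(X, y)          # learning minimises error on the labelled examples

# The trained model predicts outcomes for new, unlabelled applications.
print(model.predict([[85, 9], [28, 1]]))   # e.g. [1 0]
```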

• UNSUPERVISED LEARNING

 Different from supervised learning, in unsupervised learning the algorithm is not trained/taught on the "right answer". The algorithm tries to explore the given data and detect or mine the hidden patterns and relationships within it. In this case, there is no answer key. Learning is based on the similarity/distance among the given data points.

 Take bank customer understanding as an example: unsupervised learning can be used to identify several groups of bank customers. The customers in a specific group have similar demographic information or the same bank product selections. The learned homogeneous groups can help the bank figure out the hidden relationship between customers' demographics and their bank product selections.

 This provides useful insights for customer targeting when the bank would like to promote a product to new customers. Unsupervised learning also works well with transactional data, in that it can be used to identify a group of individuals with similar purchase behavior, who can then be treated as a single homogeneous unit during marketing promotions.

 Association rule mining, clustering (like K-means), nearest-neighbour mapping, self-organizing maps, and dimensionality reduction (like principal component analysis) are all common and popular unsupervised learning algorithms.

 Practical applications cover market basket analysis, customer segmentation, anomaly detection and so on. A K-means sketch of the segmentation idea follows.
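A minimal K-means sketch of the customer-segmentation idea, assuming scikit-learn; the customer data points are invented for illustration.

```python
from sklearn.cluster import KMeans

# Customer features: [age, monthly spend]; note there are no labels at all.
X = [[22, 150], [25, 180], [24, 160],      # young, low spend
     [55, 900], [60, 950], [58, 880]]      # older, high spend

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)                 # e.g. [0 0 0 1 1 1]: two segments found
print(kmeans.predict([[23, 170]]))    # assign a new customer to a segment
```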

DEMO 2: LET US DO A HANDS-ON ON K-MEANS AND K-NEAREST NEIGHBOUR

 Open Demo 2 K Mean and K NN_David Lee

 Exercise in Data_k Mean_David Lee Exercise (1).xls

 Answers in Data_k Mean_David Lee Exercise (2).xls


• SEMI-SUPERVISED LEARNING

 Semi-supervised learning is used to address similar problems as supervised learning.

 However, in semi-supervised learning, the machine is provided with both labelled and unlabelled data.

 A small amount of labelled data is combined with a large amount of unlabelled data.

 Semi-supervised learning is normally utilized when the cost associated with labelling is too high to allow for a fully labelled training process.

 Semi-supervised learning algorithms first train on the labelled data, then use the trained model to label the large amount of unlabelled data.

 A new model is then further trained using the newly labelled data set.

 For example, suppose an online news portal wants to classify or label web pages.

 Say the requirement is to classify web pages into different categories (i.e. Sports, Politics, Business, Entertainment, etc.).

 In this case, it is prohibitively expensive to go through hundreds of millions of web pages and manually label them.

 Therefore the intent of semi-supervised learning is to take as much advantage of the unlabelled data as possible, to improve the trained model.

 Image classification and text classification are good practical applications of semi-supervised learning. A self-training sketch follows.
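A hedged sketch of the self-training flavour of semi-supervised learning using scikit-learn's SelfTrainingClassifier, where unlabelled examples are marked with -1; the data is invented for illustration.

```python
import numpy as np
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X = np.array([[1.0], [1.2], [0.9], [4.8], [5.1],      # a few labelled points
              [1.1], [0.8], [5.0], [4.9], [5.2]])     # ...and unlabelled ones
y = np.array([0, 0, 0, 1, 1, -1, -1, -1, -1, -1])     # -1 means "no label"

# The base model is trained on the small labelled set, then iteratively
# labels the unlabelled data it is most confident about and retrains.
model = SelfTrainingClassifier(SVC(probability=True))
model.fit(X, y)
print(model.predict([[1.05], [4.95]]))                # e.g. [0 1]
```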

• REINFORCEMENT LEARNING

 The intent of reinforcement learning is to find the best actions that lead to the maximum reward or drive the most optimal outcome.

 The machine is provided with a set of allowed actions, rules, and potential end states. In other words, the rules of the game are defined. By applying the rules, exploring different actions and observing the resulting reactions, the machine learns to exploit the rules to create the desired outcome.

 It thus determines what series of actions, in what circumstances, will lead to an optimal or optimized result.

 Reinforcement learning is the equivalent of teaching someone to play a game. The rules and objectives are clearly defined.

 However, the outcome of any single game depends on the judgment of the player, who must adjust their approach in response to the environment and the skill and actions of a given opponent. Reinforcement learning is often utilized in gaming and robotics. A tabular Q-learning sketch follows.
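A toy tabular Q-learning sketch, invented for illustration (deep reinforcement learning, discussed next, replaces the table with a neural network): an agent on a five-cell track learns that moving right reaches the reward.

```python
import random

n_states, actions = 5, [0, 1]          # action 0 = left, 1 = right
Q = [[0.0, 0.0] for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

for episode in range(500):
    state = 0
    while state < n_states - 1:        # the rightmost cell ends the episode
        # explore sometimes, otherwise exploit the best known action
        a = random.choice(actions) if random.random() < epsilon else \
            max(actions, key=lambda act: Q[state][act])
        next_state = max(0, state - 1) if a == 0 else state + 1
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # the Q-learning update rule
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

print([max(actions, key=lambda act: Q[s][act]) for s in range(n_states - 1)])
# learned policy: [1, 1, 1, 1], i.e. always move right toward the reward
```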

A BEGINNER'S GUIDE TO DEEP REINFORCEMENT LEARNING

 https://pathmind.com/wiki/deep-reinforcement-learning

 Deep reinforcement learning combines artificial neural networks with a reinforcement learning architecture that enables software-defined agents to learn the best actions possible in a virtual environment in order to attain their goals.

 While neural networks are responsible for recent AI breakthroughs in problems like computer vision, machine translation and time series prediction, they can also be combined with reinforcement learning algorithms to create something astounding like DeepMind's AlphaGo, an algorithm that beat the world champion of the Go board game.

 Google DeepMind’s Deep Q-learning playing Atari Breakout

 https://youtu.be/V1eYniJ0Rnk

 MarI/O – Machine Learning for Video Games

 https://youtu.be/qv6UVOQ0F44


GOOGLE DEEPMIND'S DEEP Q-LEARNING

 https://www.youtube.com/watch?v=V1eYniJ0Rnk&feature=youtu.be

 The algorithm plays Atari Breakout.

 The most important thing to know is that all the agent is given is sensory input (what you see on the screen), and it was ordered to maximize the score on the screen.

 No domain knowledge is involved! This means that the algorithm does not know the concept of a ball or what the controls exactly do.

 Starting out, after 10 minutes of training: the algorithm tries to hit the ball back, but it is still too clumsy to manage.

 After 120 minutes of training: it plays like an expert.

 After 240 minutes of training: this is where the magic happens. It realizes that digging a tunnel through the wall is the most effective technique to beat the game.

MORE – ALPHAGO

 Reinforcement Learning – Ep. 30 (Deep Learning SIMPLIFIED)

 https://www.youtube.com/watch?v=e3Jy2vShroE

 Simulation and Automated Deep Learning

 https://youtu.be/EHP47tM6ctc

 Data is to machine learning what life experience is to human learning. The output of a machine learning algorithm is entirely dependent on the input data it is exposed to.

 Therefore, to train a good machine learning model, experts need to do good data preparation beforehand. To some extent, machine learning performance depends on the quality of the input data.

 Deep learning follows a similar workflow to machine learning, but its main advantage is that it does not necessarily need structured data as input. Imitating the way the human brain solves problems, by passing queries through various hierarchies of concepts and related questions to find an answer, deep learning uses artificial neural networks to hierarchically define specific features via multiple layers (as Figure 5 shows).

 Deep learning weakens the dependence of machine learning on feature engineering, which makes it more general and easier to apply to more fields. The following section illustrates the primer knowledge of how deep learning works.

DEEP LEARNING

 We know deep learning does the mapping of input to output via a sequence of simple data transformations (layers) in an Artificial Neural Network.

 Take face recognition as an example. As shown in Figure 7, data (a face image) is presented to the network via the input layer, which connects to one or more hidden layers. The hidden layers further connect to an output layer.

 Each hidden layer represents one level of face-image features (greyscale, eye shape, facial contours, etc.). Every node on each layer is connected to the nodes on the neighboring layer with a weight value.

 The actual processing of deep learning is done by adjusting the weights of each connection to realize the input-output mapping. A minimal layered-model sketch follows.
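A minimal Keras sketch of such a layered mapping, in the spirit of Francois (2017); the input size, layer widths and face-matching task are illustrative assumptions only.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(64 * 64,)),            # flattened face image
    keras.layers.Dense(128, activation="relu"),      # low-level features
    keras.layers.Dense(64, activation="relu"),       # higher-level features
    keras.layers.Dense(1, activation="sigmoid"),     # match / no match
])

# Training adjusts the weights on every connection between layers to
# realize the input-to-output mapping described above.
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```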

EXAMPLE OF AN ANN USED FOR FACE RECOGNITION

https://fortune.com/longform/ai-artificial-intelligence-deep-machine-learning/

Source for the next 4 slides.

 GOOGLE

 Google launched the deep-learning-focused Google Brain project in 2011, introduced

neural nets into its speech-recognition products in mid-2012, and retained neural nets

pioneer Geoffrey Hinton in March 2013. It now has more than 1,000 deep-learning projects

underway, it says, extending across search, Android, Gmail, photo, maps, translate,

YouTube, and self-driving cars. In 2014 it bought DeepMind, whose deep reinforcement

learning project, AlphaGo, defeated the world’s go champion, Lee Sedol, in March,

achieving an artificial intelligence landmark.

 MICROSOFT

 Microsoft introduced deep learning into its commercial speech-recognition products,

including Bing voice search and X-Box voice commands, during the first half of 2011. The

company now uses neural nets for its search rankings, photo search, translation systems,

and more. “It’s hard to convey the pervasive impact this has had,” says Lee. Last year it won

the key image-recognition contest, and in September it scored a record low error rate on a

speech-recognition benchmark: 6.3%.

 FACEBOOK

 In December 2013, Facebook hired French neural nets innovator Yann LeCun to direct its new

AI research lab. Facebook uses neural nets to translate about 2 billion user posts per day in more

than 40 languages, and says its translations are seen by 800 million users a day. (About half its

community does not speak English.) Facebook also uses neural nets for photo search and photo

organization, and it’s working on a feature that would generate spoken captions for untagged

photos that could be used by the visually impaired.

 BAIDU

 In May 2014, Baidu hired Andrew Ng, who had earlier helped launch and lead the Google Brain

project, to lead its research lab. China’s leading search and web services site, Baidu uses neural

nets for speech recognition, translation, photo search, and a self-driving car project, among

others. Speech recognition is key in China, a mobile-first society whose main language, Mandarin,

is difficult to type into a device. The number of customers interfacing by speech has tripled in the

past 18 months, Baidu says.

ARTIFICIAL NEURAL NETWORKS (ANN)

 Most ANNs contain a learning scheme that modifies the connection weights based on the input patterns and connection types presented to them. This gives rise to different deep neural networks, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

 Here we take a simple ANN to illustrate the learning process, feed-forward and backpropagation, which is similar to biological neural networks. Human brains learn to do complex things, such as recognizing objects, not by processing exhaustive rules but through experience, feedback, adjustment and learning. Figure 8 gives an illustration of this process.

 In the beginning, all the connections are randomly assigned weight values. In the feed-forward step, all the input nodes receive their respective values from the given input and pass on a combination, like a linear transformation, to the nodes in the hidden layers.

 Upon receiving the initial input, the hidden layers make a random guess as to what the pattern might be, using the assigned weights. There are various activation functions for the calculation at the hidden and output layers; the sigmoid or logistic function remains the most popular among users.

DEMO 3: LET US MOVE TO THE DEMO FOR CNN, GAN AND VAE

 Open Demo 3 Deep Learning Demo_David Lee

APPLICATIONS: MORE EXAMPLES

TESLA AUTOPILOT

 https://www.tesla.com/autopilot?redirect=no

 https://medium.com/@tomyuz/a-sentiment-analysis-approach-to-predicting-stock-returns-d5ca8b75a42

DEMO 4: NATURAL LANGUAGE PROCESSING

 Open Demo 4 NLP_David Lee

REFERENCES

 Langley, P. (2011). The changing science of machine learning. Machine Learning, 82(3), 275-279.

 Francois, C. (2017). Deep Learning with Python. Manning Publications Co., NY, USA.

 The rest of the materials were prepared with the SUSS FinTech and Blockchain Team (Prof Reng Jin, Wang Yu, Low Swee Won) and a joint book with IMDA based on https://www.imda.gov.sg/infocomm-media-landscape/services-40

DEMO 5: DEEP FAKES

 Open Demo 5 Deepfakes_David Lee