ESS 116
Introduction to Data Analysis in Earth Science
Image Credit: NASA
Instructor: Mathieu Morlighem
E-mail: mmorligh@uci.edu (include ESS116 in subject line)
Office Hours: 3218 Croul Hall, Friday 2:00 pm – 3:00 pm
This content is protected and may not be shared, uploaded, or distributed
Lecture 2 quick review
Matrices/Vectors
Plot
I/O
Lecture 3 – MATLAB Programming
Program vs script
Algorithm example (script)
fprintf
User defined function
Today’s lecture
Lecture 2 – review
Defining matrices in MATLAB: use square brackets
>> M = [1 3 5; 2 4 6]
>> M = [1,3,5; 2,4,6] %same matrix: spaces or commas separate columns, semicolons separate rows
M =
     1     3     5
     2     4     6
Matrix Operations: use a dot in front of the operator for element-by-element (or array) operation:
>> A * B %matrix multiplication
>> A .* B %array multiplication
Vectors and Matrices
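To make the difference between the two operators concrete, here is a minimal sketch using two small matrices whose values are chosen only for illustration:

```matlab
% Two small 2x2 matrices (hypothetical values chosen for illustration)
A = [1 2; 3 4];
B = [5 6; 7 8];

% Matrix multiplication: rows of A times columns of B
P = A * B;      % [19 22; 43 50]

% Array multiplication: each element of A times the matching element of B
Q = A .* B;     % [5 12; 21 32]

disp(P);
disp(Q);
```

The two results differ in every element, so forgetting the dot silently gives the wrong answer whenever both matrices are square.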
Colon operator begin:increment:end
>> 0:3:61 %[0, 3, 6, 9,…..,60]
>> 0:60 %If not specified, the increment is assumed to be 1
“linspace” function: linspace(begin,end,n)
>> linspace(0,60,21) %Gives 21 evenly spaced values between 0-60
“zeros”, “ones” and “rand”
>> A = zeros(5,4); %Create a 5×4 matrix of zeros, and assign to variable A
>> B = ones(5,6); %Create a 5×6 matrix of ones, and assign to variable B
>> Z = rand(10,1); %Create a 10×1 matrix of random numbers between 0 and 1
“size” (Matrices) and “length” (vectors)
>> size(A)
>> [m, n] = size(A);
>> n = length(A(1,:)); %length of the first row = number of columns
Vectors and Matrices
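As a quick check of the commands above, the following sketch builds the same vector with the colon operator and with linspace, then queries sizes. The variable names v1, v2, n1 and n2 are assumptions made for this example.

```matlab
% Colon operator: start at 0, step by 3, stop at or before 60
v1 = 0:3:60;
% linspace: 21 evenly spaced values from 0 to 60 (same spacing of 3)
v2 = linspace(0,60,21);
% Both vectors have 21 elements
n1 = length(v1);
n2 = length(v2);
% size returns the number of rows, then the number of columns
A = zeros(5,4);
[m, n] = size(A);   % m = 5 rows, n = 4 columns
disp([n1 n2 m n]);
```

Use the colon operator when you know the increment, and linspace when you know how many points you want.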
If A is the name of a MATLAB matrix:
>> A(3,1) %returns the value in the 3rd row and 1st column
>> A(2,:) %returns the 2nd row of A
>> A(3,1)=5; %changes the value in the 3rd row and 1st column of A to 5
>> A(:,4)=2; %changes the entire 4th column of A to 2
Linear index:
(row,col) notation vs. linear index (elements are numbered down each column):
>> A(7) %returns the same value as A(3,2) when A has 4 rows
Matrix Elements
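The correspondence between the two notations can be checked directly. This is a minimal sketch on a 4×3 matrix whose values are assumed for illustration; sub2ind is the built-in function that converts (row,col) subscripts to a linear index.

```matlab
% A 4x3 matrix (values assumed for illustration)
A = [9 4 2; 3 7 5; 11 1 10; 8 6 12];
% (row,col) notation: 3rd row, 2nd column
a = A(3,2);
% Linear index: 4 elements down column 1, then 3 more down column 2
b = A(7);
% sub2ind converts (row,col) to the linear index for a matrix of this size
k = sub2ind(size(A),3,2);
fprintf('A(3,2) = %d, A(7) = %d, linear index = %d\n',a,b,k);
```

The linear index depends on the number of rows, so A(7) only matches A(3,2) for a matrix with 4 rows.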
Finding Matrix Elements
A = B =
To find the locations of “2” in A (linear indices):
>> pos = find(A==2);
To find the locations where B>5 (linear indices):
>> pos = find(B>5);
Combining find and replace:
>> posAgt5 = find(A>5);
>> A(posAgt5) = 5;
Combining find and remove:
>> posBgt5 = find(B>5);
>> B(posBgt5) = [];
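The two patterns above can be tried on sample data. The values of A and B below are assumptions made for this sketch, not necessarily the matrices shown on the slide.

```matlab
% Sample data (hypothetical values)
A = [9 4 2; 3 7 5; 11 1 10; 8 6 12];
B = [3 8 2 -4 6 1];

% Find and replace: set every element of A larger than 5 to 5
posAgt5 = find(A>5);      % linear indices where the condition is met
A(posAgt5) = 5;

% Find and remove: delete every element of B larger than 5
posBgt5 = find(B>5);      % [2 5]
B(posBgt5) = [];          % B is now [3 2 -4 1]

disp(A);
disp(B);
```

B is a vector here on purpose: deleting elements of a matrix by linear index collapses the matrix into a row vector, because the result can no longer be rectangular.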
How would you replace the 5 of A by 42?
A =
A(6) = 42
A(3,2) = 42
a=find(A==5); A(a)=42;
All of the above
i>Clicker question
plot(x,y,'-r')
Possible colors:
'b' (blue), 'g' (green), 'k' (black), 'r' (red),…
Line types:
-- (dashed), -. (dash-dot), : (dotted), - (solid)
Markers:
o (circle), + (plus), * (asterisk), s (square), ^ (triangle),…
Plot
plot(x,y,'-.b');
plot(x,y,'g*');
plot(x,y,'*:r');
Adding labels and title
xlabel('x label (unit)');
ylabel('y label (unit)');
title('This is the plot title');
Multiple lines on the same plot:
hold on % After the first plot
Plot on multiple figures
call figure(1), figure(2),… before plot command
Multiple plots on the same figure
use subplot(nrows,ncols,number) before plot command
example: subplot(2,2,3) is the lower left subplot of 2×2
Plot options
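The plot options above can be combined in one short script. This is a sketch with made-up data (one period of a sine and a cosine); the labels and titles are placeholders.

```matlab
% Sample data (hypothetical): one period of a sine and a cosine
x = linspace(0,2*pi,50);
y1 = sin(x);
y2 = cos(x);

figure(1)
% Upper panel of a 2x1 grid: two curves on the same axes
subplot(2,1,1)
plot(x,y1,'-r');          % solid red line
hold on                   % keep the first curve
plot(x,y2,'-.b');         % dash-dot blue line
xlabel('x (rad)');
ylabel('amplitude');
title('sin and cos');

% Lower panel: markers only
subplot(2,1,2)
plot(x,y1,'g*');          % green asterisks, no line
xlabel('x (rad)');
ylabel('sin(x)');
title('Markers only');
```

Without hold on, the second plot command in the upper panel would replace the first curve instead of adding to it.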
Loading data from a file
Is it an ASCII (text) file?
  no: File has a .mat extension?
    yes: load('filename')
    no: not covered in this course
  yes: Only contains numbers? Consistent number of columns?
    yes: load('filename')
    no (and yes): fopen, textscan, fclose
“load” cannot open the following file:
Rocks.dat
1 81.472 granite a
2 90.579 rhyolite b
3 12.699 diorite c
4 91.338 andesite d
5 63.236 basalt e
6 9.754 gabbro f
Column formats: %d %f %s %s
%open file
fid = fopen('Rocks.dat');
%read file
cellMat = textscan(fid,'%d %f %s %s');
%close file
fclose(fid);
%extract data
number = cellMat{1};
density = cellMat{2};
name = cellMat{3};
letter = cellMat{4};
% From now on, do not use cellMat
File Input: textscan
fopen: opens a file and returns a “File Identifier” used to identify the file later on in your code (-1 signifies an error in opening the file)
fclose: After you are finished with a file, you should close it (prevents errors and unwanted behavior).
textscan: reads all the lines of the file and populates the variable “cellMat” (one element per column)
%d: integer (e.g., 235)
%f: floating-point number (e.g., 20.5)
%s: string (e.g., word)
Use option ‘HeaderLines’ to skip first lines if needed
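The full open/read/close sequence, including the 'HeaderLines' option and the fopen error check, could look like the sketch below. To stay self-contained it first writes a small temporary file; the file contents and variable names are assumptions made for this example.

```matlab
% Create a small sample file (hypothetical content) so the example is self-contained
fname = [tempname '.dat'];
fid = fopen(fname,'w');
fprintf(fid,'id density rock\n');      % header line to skip
fprintf(fid,'1 2.75 granite\n');
fprintf(fid,'2 2.95 basalt\n');
fclose(fid);

% Open the file and check that fopen succeeded (-1 signals an error)
fid = fopen(fname);
if fid == -1
    error('Could not open %s',fname);
end
% Read one integer, one float and one string per line, skipping 1 header line
cellMat = textscan(fid,'%d %f %s','HeaderLines',1);
fclose(fid);

% Extract the columns into ordinary variables
id = cellMat{1};
density = cellMat{2};
rock = cellMat{3};
fprintf('Read %d rows; first rock is %s\n',length(id),rock{1});

% Remove the temporary file
delete(fname);
```

String columns come back as cell arrays, which is why the first rock name is accessed with curly braces, rock{1}.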
How should you be reading the following file?
textscan(fid, %s %f %s);
textscan(fid,'%s %f %s');
textscan(fid,'%s %f %s','HeaderLines',2);
textscan(fid,'%f %f %s','HeaderLines',2);
All of the above
i>Clicker question
#Weatherstation data: date/temp/weather
#from Uumannaq Greenland
5/12/2016 1.42 sunny
5/13/2016 5.28 rain
5/14/2016 0.08 snow
…
Lecture 3 – MATLAB programming
Program vs script
Interactive MATLAB (command line)
only good if problem is simple
Often, many steps are needed
We also want to be able to automate repeated tasks
Automated data processing is common in Earth science!
MATLAB Programming
Automated Earthquake Detection (Caltech/USGS)
Automated Stream Discharge Monitoring (USGS)
MATLAB is an interpreted language
Code is read line by line by an interpreter (not a compiler)
Each line of code is translated into machine language and executed on the fly
No .exe file is generated (e.g. firefox.exe, Word.exe, etc.)
Because MATLAB code is not compiled…
source code is referred to as a script
Also called M-files (end in .m)
Advantages
Don’t need to spend time compiling the code to use it
Don’t need to recompile after you make changes
Same script will work on any operating system with MATLAB
Disadvantages
Because code is interpreted on the fly, some tasks can be slow*
Others can change your code*
Program vs. Script
Writing a script
Before starting to write any code,
you should break the problem down into a simple algorithm
Algorithm: A sequence of steps to solve a problem
Example : Calculate Volume of a Sphere
Get the input: radius of sphere
Set the radius
Store the radius in a variable
Calculate the result: volume of sphere
Plug radius into volume equation
Store result in variable
Display the result
We’ll do this later…
Algorithm Example
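The three steps of the algorithm map onto three lines of MATLAB. This is a minimal sketch; the radius value is an arbitrary assumption, and disp is used for the display step.

```matlab
%get the input: set the radius and store it in a variable (hypothetical value)
radius = 2;
%calculate the result: plug the radius into the volume equation V = 4/3*pi*r^3
volume = (4/3) * pi * radius^3;
%display the result
disp(volume);
```

Writing the algorithm as comments first, then filling in the code under each comment, keeps the script documented from the start.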
The top of all scripts should contain commented documentation:
H1 line: a short comment on what the script does
Subsequent lines
Script name
Author info
Date
Details of what the code does
Usage (if a function)
Read by “help”
Simple Script Example
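A script with the documentation block described above might look like the following sketch. The file name circleArea.m and the placeholder author and date fields are assumptions made for this example.

```matlab
%CIRCLEAREA Compute the area of a circle from its radius
% Script name: circleArea.m (hypothetical file name)
% Author: (your name)
% Date: (date written)
% Details: sets a radius, computes the area pi*r^2 and displays it

%set the radius value
rad = 23;
%compute the area
Area = pi * (rad ^ 2);
%display the result
disp(Area);
```

If this were saved as circleArea.m, typing help circleArea would print the comment block at the top of the file, up to the first blank line.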
To execute a script, enter its name in the command window (don’t include the “.m”)
Any line of MATLAB code that begins with “%” is ignored by the interpreter
Referred to as: “comments”
Does not slow down execution of your code
MATLAB doesn’t even read comments, but people do
Comments allow you to tell yourself and others what you did
Typically you can fit most comments into a single line
In this class:
uncommented code gets a zero (sorry… that’s the rule)
Every section of code must have a brief comment
Documenting Your Code
%set the radius value
rad = 23;
%compute the area
Area = pi * (rad ^ 2);
%MATLAB ignores commented lines. Use them!!!
Output statement (fprintf)
To be useful, a script must be able to output a result
Simplest output: Print to the command window
Use either “disp” or “fprintf”
Output Statements: disp
“disp” can print strings or numbers
“disp” can print variables
“disp” can only print one thing at a time.
For this reason, MATLAB also provides “fprintf”
“fprintf” has a somewhat confusing syntax
Most programming languages have fprintf (or printf)
Syntax was inherited from C / C++
Output Statements: fprintf
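A short comparison shows why fprintf is usually more convenient than disp. This sketch reuses the variable rad from the earlier example.

```matlab
rad = 23;
% disp prints one thing per call, each on its own line
disp('The radius is');
disp(rad);
% fprintf prints the text and the value in a single call
fprintf('The radius is %d\n',rad);
```

The disp version takes two calls and two output lines to say what fprintf says in one.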
“fprintf” does not include a new line (“\n”) after a string, unless you tell it to do so.
Use the new line special character, “\n”
You do not need a space before or after a special character, but adding a space makes code easier to read.
“fprintf” also recognizes these special characters
For more info, see “doc fprintf” and click on the “formatting strings” link near the bottom of the page
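The special characters are easiest to understand by trying them. In this sketch, the printed text is made up for illustration.

```matlab
% Without \n the next output continues on the same line
fprintf('Hello');
fprintf(' world\n');          % completes the line "Hello world"
% Other special characters
fprintf('col1\tcol2\n');      % \t inserts a tab
fprintf('Humidity: 85%%\n');  % %% prints a single percent sign
fprintf('It''s raining\n');   % two single quotes print an apostrophe
```

The percent sign and the apostrophe need special treatment because they otherwise mark a placeholder and the end of the string.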
“fprintf” can also be used to print variables
Can apply special formatting to numeric variable (Very Useful!!)
Output Statements: fprintf
When you use fprintf to print a variable, a variable is indicated by a place holder, in this case “%d”
The variable name must be given after the string.
“%f” indicates that a floating point number is to be printed.
Can print multiple variables in one line! (“disp” can’t do this)
“fprintf” and %f can be used to control just about every aspect of the formatting of a floating point number
Output Statements: fprintf
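Placeholders are filled in order by the variables listed after the string. A minimal sketch, with hypothetical variable names and values:

```matlab
% Hypothetical values for illustration
nStations = 12;      % an integer, printed with %d
meanTemp = 4.257;    % a floating-point number, printed with %f

% One placeholder, one variable
fprintf('There are %d stations\n',nStations);
% Two placeholders, two variables, in the same order
fprintf('%d stations, mean temperature %f degrees C\n',nStations,meanTemp);
```

The second call prints both values on one line, which disp cannot do in a single call.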
By default, 7 digits are shown, even though MATLAB variables are stored with 17 digits of precision (if needed)
Want to print π rounded to two decimal places?
Want to print a variable in scientific notation with 5 decimal places?
MATLAB double variables can hold only 17 digits. Anything beyond the 17th place gets displayed as a zero and is not stored.
MATLAB uses a compact format by default, so numbers get displayed in scientific notation.
“fprintf” can override this!
“fprintf” is also great for printing out numbers so they are in neat columns.
Output Statements: fprintf
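The two formatting questions above can be answered with a precision field after the percent sign. This is a sketch; the variable dist and its value are assumptions.

```matlab
% Default %f shows 6 decimal places
fprintf('%f\n',pi);            % 3.141593
% Pi rounded to two decimal places
fprintf('%.2f\n',pi);          % 3.14
% Scientific notation with 5 decimal places (hypothetical value)
dist = 149597870.7;
fprintf('%.5e\n',dist);
```

The number after the dot sets how many digits follow the decimal point, for both %f and %e.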
Note the different results. How does this work?
%6.2f
Leave at least 6 total spaces for each number
includes decimals, exponents, and negative signs
Round to 2 decimal places
%06.2f
Same as above, but show leading zeros
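The effect of the width field can be seen by printing the same numbers with both formats. A minimal sketch with illustrative values:

```matlab
% Width 6, 2 decimals: padded on the left with spaces
fprintf('%6.2f\n',pi);          % "  3.14"
% Same, but padded with leading zeros
fprintf('%06.2f\n',pi);         % "003.14"
% The minus sign counts toward the 6 characters
fprintf('%6.2f\n',-12.5);       % "-12.50"
% The width is a minimum: larger numbers are not truncated
fprintf('%6.2f\n',1234.5678);   % "1234.57"
```

Using the same width for every number in a column is what keeps printed columns aligned.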
I define the following variable in MATLAB:
EarthRadius = 6371;
I would like MATLAB to print:
The radius of the earth is 6371.0
What is the correct command?
fprintf('The radius of the earth is %d\n',EarthRadius);
fprintf('The radius of the earth is %.1f\n',EarthRadius);
fprintf('The radius of the earth is %7.2f\n',EarthRadius);
All of the above
i>Clicker question
Writing a function
You have already used many built in functions in MATLAB
plot, sin, size, int8, fprintf, linspace, etc…
For example “linspace”
Calling a User-Defined Function
The “Call”, or “calling” the function
“Arguments”
Where the “returned” value is stored
Writing a User-Defined Function
x = myfunction(a,b)
myfunction: function name (same as file name)
x: output (returned)
a, b: inputs (arguments)
The function is written in one MATLAB file (*.m)
filename = function name + .m
Compared to a script, a function
takes a number of inputs (arguments)
returns a number of variables
acts like a “black box”
does not have access to the workspace!
Let’s look at the general function setup:
Example: Converting mph to m/s
Writing a User-Defined Function
The “function header” (required for all functions)
The reserved word, “function” (1st line must start with this)
Name of function (identical to name of m-file without .m)
Input arguments (must be provided by the user)
Value that is “returned” (not all functions need to return something)
Function must be in your pwd or MATLAB path
Call by function name
no .m at end
Same as scripts!
All variables inside a function are local variables
Only exist inside the function!!
May be confusing at first, but keeps things tidy
User doesn’t care about all of the intermediate variables
Only wants the returned value
User-Defined Functions
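A function for the mph to m/s example could be written as below. This is a sketch: the function name mph2ms is an assumption (the slides only name the variables v and mph), and the file would have to be saved as mph2ms.m.

```matlab
function v = mph2ms(mph)
%MPH2MS Convert a speed from miles per hour to meters per second
% Usage: v = mph2ms(mph)
% The function name is hypothetical; save this file as mph2ms.m

%conversion factors: 1 mile = 1609.344 m and 1 hour = 3600 s
metersPerMile = 1609.344;
secondsPerHour = 3600;
%convert and store the result in the returned variable
v = mph * metersPerMile / secondsPerHour;
end
```

Calling speed = mph2ms(60) from the command window stores the result in speed. The variables v, mph, metersPerMile and secondsPerHour are local, so they never appear in the workspace.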
Why are “v” and “mph” not defined in the command window?
Functions can use any number of variables
User-Defined Functions
All of these variables are local!
The user won’t be aware of them unless he/she opens the m-file
Returned value was stored in “ans”
Not illegal, but bad programming
Local variables should not be printed to the screen
Confusing!!
Not accessible to the user
Printing variables is slow
Another Poorly Written Function
Functions can
Accept more than one argument (separated by commas)
Make plots
Functions That Make Plots
Functions can return multiple values
Or even matrices!
Functions That Return Multiple Values
If you only specify one variable, only the first returned value is stored
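A function with several arguments and two returned values might look like the sketch below. The header matches the one used in these slides, but the body is a hypothetical reconstruction that assumes amp is the amplitude and lam the wavelength of the sine wave.

```matlab
function [x,y] = plotSinWave2(amp,lam,xmin,xmax,numPts)
%PLOTSINWAVE2 Plot a sine wave and return its coordinates
% Usage: [x,y] = plotSinWave2(amp,lam,xmin,xmax,numPts)
% Hypothetical body: assumes amp is the amplitude and lam the wavelength

%x coordinates: numPts evenly spaced values between xmin and xmax
x = linspace(xmin,xmax,numPts);
%sine wave of amplitude amp and wavelength lam
y = amp * sin(2*pi*x/lam);
%plot the wave
plot(x,y,'-b');
xlabel('x');
ylabel('y');
end
```

Calling [x,y] = plotSinWave2(2,4,0,8,100) stores both vectors, while a = plotSinWave2(2,4,0,8,100) stores only x, the first returned value.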
How could we return both “x” and “y” as one matrix “xy”?
How could we return both “x” and “y” as a 2-row matrix?
Change the first line from:
function [x,y]=plotSinWave2(amp,lam,xmin,xmax,numPts)
To
function xy = plotSinWave2(amp,lam,xmin,xmax,numPts)
And add one line to the file:
i>Clicker question
xy = x,y;
xy = [x;y];
xy = [x,y];
xy = {x,y};
MATLAB Commands to remember
Lab 3: MATLAB programming
Lecture 4: Selection statements and loops
What’s next?
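Figure content (matrices shown on the Lecture 2 review slides):
M = [1 3 5; 2 4 6]
(row,col) notation for a 4×3 matrix:
(1,1) (1,2) (1,3)
(2,1) (2,2) (2,3)
(3,1) (3,2) (3,3)
(4,1) (4,2) (4,3)
Linear index for the same matrix:
(1) (5) (9)
(2) (6) (10)
(3) (7) (11)
(4) (8) (12)
Example 4×3 matrix:
9 4 2
3 7 5
11 1 10
8 6 12
Example row vector: [3 8 2 -4 6 1]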
ESS116: MATLAB Cheat Sheet
1 Path and file operations
cd Change Directory (followed by absolute or relative path of a directory)
cd ../../Shared (relative path)
cd /Users/Shared (absolute path)
pwd display current directory’s absolute path (Print Working Directory)
ls display list of files and directories in the current directory
(can be followed by a path and/or file name pattern with *)
ls ../file*mat
ls *.txt
ls /Users/mmorligh/Desktop/
copyfile copy existing file into a new directory, and/or rename a file
copyfile('/Users/Shared/foo.txt','.');
copyfile('foo.txt','bar.txt');
mkdir create a directory
mkdir Lab1
2 Fundamental MATLAB classes
double floating point number (1.52, pi, …) → MATLAB’s default type
int8 Integer between -128 and 127 (8 bits, saves memory)
uint8 Unsigned integer between 0 and 255 (used primarily for images)
int16 Integer between -32768 and 32767 (16 bits)
logical true/false
string data type for text (str = 'This is a string';)
cell cell array, used by textscan
3 Matrices
Use square [] to create a matrix, and ; to separate rows
A=[1 2 3;4 5 6;7 8 9];
ones, zeros create a matrix full of ones or zeros
A=ones(5,2);
' transpose a matrix
B=A';
length return length of a vector (do not use for matrices)
size returns the size of a matrix
(number of rows then columns, then 3rd dimension if 3D, etc)
[nrows,ncols]=size(A); [nrows,ncols,nlayers]=size(A3D);
linspace and : to create vectors
A=2:3:100;
A=linspace(2,100,10);
find return the linear indices where a condition on the elements of a matrix is met
pos=find(A==-9999);
pos=find(A>100);
Extract the first 10 even columns of a matrix
B=A(:,2:2:20);
Removing elements: use empty brackets
A(:,2)= [];
Concatenate matrices
A='This is '; B=[A 'an example'];
Replacing elements in a matrix (use either linear or row,col notation)
A(10,3)=5.5;
pos=find(A==-9999);
A(pos)= 0;
Element-by-element operation: use a dot (.) before the operator
A= C.*D;
4 I/O
load loads a MATLAB file (*.mat) into the workspace, or a text file with only numbers
and consistent number of columns
load('data.mat');
data=load('data.txt');
textscan loads a text file into a cell array (as many elements as there are columns in the file)
Use %d for integers, %f for floating point numbers, %s for strings
fid = fopen('filename');
data = textscan(fid,'%d %f %s %s','Headerlines',5);
fclose(fid);
%Put first column in A, and second column in B
A = data{1}; B = data{2};
5 fprintf
fprintf print text (and variables) to the screen. First argument is a string with placeholders.
fprintf('The radius is %7.2f and A = %d !!\n',EarthRadius,10);
– Special characters: \n (new line) %% (percent sign) '' (apostrophe, typed as two single quotes)
– Variable specifiers: %s (string) %d (integer) %e (exponential) %f (float)
– %010.3f: leading 0, 10 total spaces, 3 decimals. Ex: 000003.142
6 Visualization
plot displays a list of points (x,y)
plot(x,y,'-r');
plot(x,y,'r+:','MarkerFaceColor','g','MarkerSize',5,'LineWidth',2);
axis controls x and y axes
axis([xmin xmax ymin ymax]);
axis equal tight
legend adds a legend to previously plotted curves
legend('First curve','second curve')
figure creates a new figure window
figure(2)
xlabel/ylabel/title control x/y axis labels and plot title
xlabel('Distance (km)');
hold on keep current plot so that whatever follows is plotted on the same plot
subplot divide figure into several subplots
subplot(2,3,1)
histogram make a histogram for a vector
histogram(tmax,20);
histogram(tmax,round(sqrt(length(tmax))));
7 Relational and Logical operators
== equal to, ~= not equal to, > greater than, >= greater than or equal to,
< less than, <= less than or equal to.
&& and, || or, ~ not.
A=( (1>10)|| (3~=4));
8 If/elseif/else and for loops (examples)
Counting algorithm
%Initialize counters
counter1 = 0;
counter2 = 0;
%Go over all of the elements of T and increment counters when a condition is met
for i=1:length(T)
    if T(i)>100
        counter1 = counter1+1;
    elseif T(i)<0
        counter2 = counter2+1;
    end
end
fprintf('Found %d days with T>%f, and %d with T<%f\n',counter1,100,counter2,0)
Extracting (after counting!)
%You first need to count how many times T>100 (for example), then: allocate memory
hotdays = zeros(counter1,1);
%Go through T, again, and store temperatures>100 in hotdays
count = 1;
for i=1:length(T)
    if T(i)>100
        hotdays(count)=T(i);
        count = count+1;
    end
end
9 Statistics
mean computes mean of a vector
median computes median of a vector
std computes standard deviation
min returns minimum value in a vector
max returns maximum value in a vector
skewness returns skewness
kurtosis returns kurtosis
normcdf/tcdf/chi2cdf cumulative distribution function for normal, t and χ2 distributions
norminv/tinv/chi2inv inverse of the cumulative distribution function
p0 = normcdf(x0,mu,sigma);
x0 = norminv(p0,mu,sigma);
p0 = tcdf(x0,V);
x0 = tinv(p0,V);
10 Polynomials and interpolations
polyval returns the value of a polynomial (represented by its coefficients) for some x
coeff = [3 2.7 1 -5.7];
x=0:0.2:0.6;
y=polyval(coeff,x)
polyfit returns the coefficients of the polynomial that best fits the data points
coeff = polyfit(datax,datay,3); %3 means cubic polynomial
interp1 interpolates between data points (spline or linear) at the query points x
y1=interp1(datax,datay,x,'linear');
y2=interp1(datax,datay,x,'spline');
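The three functions can be compared on a small data set. In this sketch the data points are made up and lie exactly on y = x^2, so the fitted values are easy to check by hand.

```matlab
% Hypothetical data points lying exactly on y = x^2
datax = 0:4;
datay = datax.^2;

% Fit a 2nd-degree polynomial; coeff is approximately [1 0 0]
coeff = polyfit(datax,datay,2);
% Evaluate the fitted polynomial at new x values
x = [0.5 1.5 2.5];
yfit = polyval(coeff,x);

% Interpolate at the same x values
ylin = interp1(datax,datay,x,'linear');   % straight lines between data points
yspl = interp1(datax,datay,x,'spline');   % smooth cubic spline through the points

disp([yfit; ylin; yspl]);
```

The fit recovers the parabola (0.25, 2.25, 6.25), while linear interpolation gives the midpoints of the neighboring data values (0.5, 2.5, 6.5).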
11 Image processing
imread loads an image (as a matrix) into the workspace
A=imread('image.png')
image display an image (2D or 3D)
imagesc display a 2D image and scale indices to use all the colors in the color map
colormap set a colormap (only for indexed images)
Indexed Images (2D)
Indexed image, need two matrices:
iMat a 2D matrix with indices
cMap an n×3 matrix with the RGB code for each index
To display this image:
image(iMat);
colormap(cMap);
If the colormap is not consistent with indices, you need to use imagesc(iMat).
True-color image (3D, RGB)
No need to prescribe a colormap:
iMat(:,:,1) Red matrix (between 0–255 if uint8, or 0–1 if double)
iMat(:,:,2) Green matrix
iMat(:,:,3) Blue matrix
To display this image: image(iMat).
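The two image types can be built by hand to see how they differ. This sketch uses tiny made-up images; the variable rgb is an assumption for the true-color case.

```matlab
% Indexed image: each entry of iMat is a row number in cMap (hypothetical 2x3 image)
iMat = [1 2 3; 3 1 2];
cMap = [1 0 0; 0 1 0; 0 0 1];   % row 1 = red, row 2 = green, row 3 = blue
figure(1)
image(iMat);
colormap(cMap);

% True-color image: a 2x3x3 array of doubles between 0 and 1
rgb = zeros(2,3,3);
rgb(:,:,1) = 1;                 % red layer at full intensity gives an all-red image
figure(2)
image(rgb);
```

The indexed image stores one number per pixel plus a shared color table, while the true-color image stores three numbers per pixel and needs no colormap.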
12 Miscellaneous
rand returns a random floating point number between 0 and 1
x=rand
x=rand(10,2)
round round input to closest integer
A=round(rand*10);
whos displays list of all variables in MATLAB workspace
tic/toc displays elapsed time for a chunk of code
sqrt square root
13 Functions
Calling a function: [output1,output2] = functionname(arg1,arg2);
Function header (top lines of the file that implements this function):
function [output1,output2] = functionname(arg1,arg2)
% H1 line: describe what the function does
ESS116, M. Morlighem, Updated: March 14, 2019