14.8. Temperature Data Project
This project will demonstrate:
- Importing data into MATLAB
- Fixing out-of-range data
- Working with data containing dates
- Using statistical moving window functions
The instructions are given partly as a tutorial, but they are also my notes from working through the project for the first time. Depending on the data that you download, some adjustments may be needed.
The output of the assignment will be a plot. Save the plot to a ‘png’ picture file and upload the picture on Canvas.
14.8.1. Part 1: Getting the Data
Search the Internet for hourly temperature data:
- Quality Controlled Datasets - National Climatic Data Center - NOAA https://www.ncdc.noaa.gov/crn/qcdatasets.html
- The hourly02 directory has what we want. The folders and files are organized by year and reporting station. Two stations are in KS (Manhattan, Oakley). Download the Manhattan data for the year of your choice.
- From the documentation:
  - field 4 - The Local Standard Time (LST) date of the observation.
  - field 5 - The Local Standard Time (LST) time of the observation.
  - field 10 - Average air temperature, in degrees C, for the entire hour.
- Note that each yearly file is organized by UTC time, so in local time it begins with the last few hours of the previous year.
- Use the Import Data tool. Rename and import fields 4, 5, and 10 to Date, Hour, and Temp.
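For readers who prefer a script to the Import Data tool, the same three fields can probably be read with readtable. This is only a sketch: it assumes the file has whitespace-separated columns and no header row, and it writes a two-line stand-in file with placeholder values so that it runs on its own. The names fname, raw, and imported are made up for the illustration.

```matlab
% Write a tiny stand-in file. Only fields 4, 5, and 10 mimic the sample
% shown in this section; every other field is a placeholder, not NOAA data.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '0 20160101 0100 20151231 1900 0 0.0 0.0 0.0 -1.8\n');
fprintf(fid, '0 20160101 0200 20151231 2000 0 0.0 0.0 0.0 -3.3\n');
fclose(fid);

% Read space-separated columns, treating runs of spaces as one delimiter.
raw = readtable(fname, 'FileType', 'text', 'Delimiter', ' ', ...
    'MultipleDelimsAsOne', true, 'ReadVariableNames', false);

% Keep fields 4, 5, and 10 and give them the names used in this project.
imported = raw(:, [4 5 10]);
imported.Properties.VariableNames = {'Date', 'Hour', 'Temp'};

delete(fname);   % remove the stand-in file
```

With a real download, fname would be the path to the station file, and the fopen, fprintf, fclose, and delete lines would be dropped.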
Give the table a more manageable name and view a sample of the data:
>> tempData = CRNH02032016KSManhattan6SSW;
>> tempData(1:6,:)
ans =
6x3 table
Date Hour Temp
__________ ____ _____
2.0151e+07 1900 -1.8
2.0151e+07 2000 -3.3
2.0151e+07 2100 -4.4
2.0151e+07 2200 -5.7
2.0151e+07 2300 -3.7
2.016e+07 0 -3.7
Remove the original table, which is no longer needed under its long name:
>> clear CRNH02032016KSManhattan6SSW
Take a quick look at the data:
>> plot(tempData.Temp)
The data has some very large negative values.
Check for missing data. No entries are flagged as missing:
>> any(ismissing(tempData.Temp))
ans =
logical
0
For convenience, copy the temperature column to a shorter name:
>> temp = tempData.Temp;
>> min(temp)
ans =
-9999
>> max(temp)
ans =
38.8000 -- about 101.8 F (probably okay)
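It looks like there are a few clusters of -9999 temperatures, probably when no reading was taken. Because these are stored as ordinary numbers, ismissing did not flag them. Once the bad readings are marked as NaN, we can use the fillmissing function to take care of the missing data. Linear interpolation with the surrounding valid data will suit our needs.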
%% Fill missing data
badData = tempData.Temp < -30;
tempData.Temp(badData) = NaN;
tempData.Temp = fillmissing(tempData.Temp, 'linear');
Test the data now:
>> min(tempData.Temp)
ans =
-27
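To see what the fill step does on a small scale, here is a toy example. The vector x is invented for the illustration and is not station data. The same -30 cutoff marks the sentinel values as NaN, and linear interpolation fills the gap from the neighboring valid readings.

```matlab
x = [10; -9999; -9999; 16; 18];   % toy readings with two sentinel values

bad = x < -30;                    % same cutoff as used for the real data
x(bad) = NaN;                     % mark the sentinel values as missing

x = fillmissing(x, 'linear');     % interpolate across the gap
% x is now [10; 12; 14; 16; 18]
```

The two filled values lie on the straight line between the readings of 10 and 16 on either side of the gap.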
Save the clean data
>> writetable(tempData, 'EastKS_temperatures16.csv')
14.8.2. Part 2: The Plotted X-axis
Now for the date and time. The following function returns a datetime value from the dates and times given:
function dt = get_date_time( dateNum, timeNum )
%GET_DATE_TIME Convert numeric date and time to datetime
% dateNum such as: 20151231
% timeNum such as: 1900
% Variable names avoid shadowing the built-in year, month, day, and hour functions.
yr = floor(dateNum/10000);
mo = floor((dateNum - yr*10000)/100);
dy = dateNum - yr*10000 - mo*100;
hr = floor(timeNum/100);
dt = datetime(yr,mo,dy,hr,0,0);
end
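Make an initial plot. Plot the cleaned tempData.Temp column; the temp variable was copied before the -9999 values were replaced, so it still contains them:
>> dates = get_date_time(tempData.Date, tempData.Hour);
>> plot(dates, tempData.Temp)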
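As an alternative to the hand-written arithmetic, datetime can convert a yyyymmdd number directly with the 'ConvertFrom' option. The sketch below is a different technique from the function in this section, shown with two sample values; the variable names are only for illustration.

```matlab
dateNum = [20151231; 20160101];   % dates as yyyymmdd numbers
timeNum = [1900; 0];              % times as hhmm numbers, whole hours only

% Convert the date part, then add the hour of the day as a duration.
d = datetime(dateNum, 'ConvertFrom', 'yyyymmdd') + hours(timeNum/100);
```

Both approaches accept column vectors, so either one can be applied to the whole Date and Hour columns in one call.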
14.8.3. Part 3: Statistical Analysis
The code below finds the daily mean (average), maximum, and minimum temperatures using a 24-hour moving window over the hourly data. Then, to smooth out random fluctuations, moving averages are taken over a six-week window (6 weeks x 7 days x 24 hours = 1008 samples). This is probably a larger window than is needed. Experiment with shortening the window size.
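Before running the moving window functions on a full year of data, it may help to see them on a toy vector. The values and the window length of 3 are invented for the illustration. By default the window is centered on each element and shrinks at the ends of the data, so the first and last results use only two samples.

```matlab
x = [3 6 0 9 3];        % toy data, not temperatures
k = 3;                  % window length in samples

xMean = movmean(x, k);  % [4.5 3 5 4 6]
xMax  = movmax(x, k);   % [6 6 9 9 9]
xMin  = movmin(x, k);   % [3 0 0 0 3]

% Window length for six weeks of hourly samples
win6w = 6 * 7 * 24;     % 1008
```

To try a shorter smoothing window, say two weeks, the same arithmetic gives 2 * 7 * 24 = 336 samples.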
% convert to Fahrenheit
temps = tempData.Temp*9/5 + 32;
% plot(dates,temps);
%% Daily stats
dAve = movmean(temps, 24);
dMax = movmax(temps, 24);
dMin = movmin(temps, 24);
% 6 week filters
ave6w = movmean(dAve, 1008);
max6w = movmean(dMax, 1008);
min6w = movmean(dMin, 1008);
plot(dates, ave6w, 'LineWidth', 3);
hold on
plot(dates, max6w, 'r', 'LineWidth', 2);
plot(dates, min6w, 'g', 'LineWidth', 2);
hold off
legend('Daily Average','Daily Max','Daily Min','Location','northwest');
ylabel('Temperature, {}^{\circ}F');
title('Smoothed Temperature Data');
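The assignment asks for the plot as a png file. One way to do that from a script is saveas, sketched here with a placeholder plot so the example runs on its own. The file name temperature_plot.png and the use of the temporary folder are assumptions for the illustration.

```matlab
fig = figure('Visible', 'off');     % placeholder figure, not the project plot
plot(1:10, (1:10).^2);
title('Placeholder plot');

% saveas picks the png format from the file extension.
outFile = fullfile(tempdir, 'temperature_plot.png');
saveas(fig, outFile);
close(fig);
```

For the project figure, calling saveas(gcf, 'temperature_plot.png') after the plotting commands should write the current figure to the working folder.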