
Guided: Analyze Weather Data with Kotlin
In this lab, you'll learn how to build a command-line application capable of processing and analyzing datasets of weather information. You'll explore concepts such as data filtering and processing, statistical analysis, trend detection, and anomaly identification, all while building a practical application that can be easily extended and customized.

Introduction
Welcome to the lab Guided: Analyze Weather Data with Kotlin.
In this lab, you'll develop a Command Line Interface (CLI) application to analyze weather data and identify patterns and trends.
The application, named ClimateAnalyzer, will have the following functionality:
- Data ingestion: The application takes the path of a CSV file containing weather data from a command-line argument. It reads and parses the file to extract the weather information.
- Data filtering: Users can filter the imported data by specifying a date range. The application will return the filtered data based on the provided start and end dates.
- Statistical analysis: The application calculates the average, minimum, and maximum values for temperature, humidity, and pressure within the filtered data range.
- Anomaly detection: The application flags any temperature readings that are more than two standard deviations away from the mean. It displays the date and location of the anomalous readings.
- Trend detection: The application identifies linear trends in temperature, humidity, and pressure over the specified date range. It calculates the slope and intercept of the trend line for each weather parameter.
- Data persistence: The imported weather data is stored in memory. The application provides an option to save the filtered data and analysis results to new CSV and text files.
Here's a sample input CSV file:
    date,location,temperature,humidity,pressure
    2024-01-01,New York,45.2,45,1018.3
    2024-01-15,New York,42.8,50,1022.1
    2024-02-01,New York,43.5,55,1016.9
The application does not differentiate between measurement units. For example, temperature data can be provided in either °F or °C.
Upon launch, the application will read the CSV file, display the total number of records read, analyze the data, detect trends and anomalies, and present the user with a menu. This menu will offer options to filter the data by date, view the analysis results, and save both the filtered data and the results. Each time the user filters the data, the application will perform all analyses on the filtered data.
Familiarizing with the Program Structure
Here's a description of the application's files:
- src/WeatherData.kt: Defines a data class to represent a single record of weather data, containing properties such as date, location, temperature, humidity, and pressure.
- src/DataProcessor.kt: Responsible for reading the CSV file, parsing the weather data, filtering the data based on a date range, and performing analysis on the data by calling other classes.
- src/DataAnalyzer.kt: Contains methods to perform statistical analysis on the weather data, calculating average, minimum, and maximum values for temperature, humidity, and pressure.
- src/TrendDetector.kt: Implements methods to detect trends in the weather data, calculating the slope and intercept of the trend line for each weather parameter.
- src/AnomalyDetector.kt: Provides methods to detect anomalies in the weather data, identifying temperature readings that are more than two standard deviations away from the mean.
- src/UserInterface.kt: Implements the command-line user interface, displaying menus, prompts, and messages to the user, handling user input, and validating input format.
- src/DataSavingsUtils.kt: Contains two functions, one to save the filtered data to a CSV file, and another to save the results of the weather data analysis to a text file.
- src/main.kt: Contains the main function that serves as the entry point of the application, handling command-line arguments and initializing the necessary objects.
Your primary focus will be on the DataProcessor.kt, DataAnalyzer.kt, TrendDetector.kt, AnomalyDetector.kt, and DataSavingsUtils.kt files. As you progress, you'll see how the other files interact with these classes. Each file is extensively commented to help you understand what it does.
You can compile and run the existing program using the Run button located at the bottom-right corner of the Terminal tab. Initially, it will compile with some warnings and won't produce functional output, but you will be able to navigate through the menus and options.
Begin by familiarizing yourself with the setup. When you're ready, dive into the coding process. If you have problems, remember that a solution directory is available for reference or to verify your code.
Step 1: Reading and Parsing CSV Data
Reading a CSV File in Kotlin
In Kotlin, you can read the contents of a Comma-Separated Values (CSV) file using the File class and its forEachLine method.
First, you create a File object by providing the file path as a string:
    val file = File("path/to/file.csv")
Then you can use the forEachLine method to iterate over each line of the file:
    file.forEachLine { line ->
        // Process each line
    }
Inside the forEachLine block, you can split the line into fields using the split method and a comma (,) as the delimiter:
    val fields = line.split(',')
In this case, the fields variable will be a list of strings containing the individual fields from the CSV line, so you can access each field by its index:
    val field1 = fields[0]
    val field2 = fields[1]
    // ...
The index starts from 0, so fields[0] represents the first field, fields[1] represents the second field, and so on.
Once you get the value of a field, you can convert it to the appropriate data type (using toInt(), toDouble(), or LocalDate.parse(), for example) or perform any necessary validations or checks on the fields.
In the application, the main() function in the main.kt file is responsible for handling the command-line arguments, checking if the file exists, creating an instance of the DataProcessor class, and calling the readCSVFile() function to read the weather data from the specified CSV file. All of this happens before the user interface is displayed.
In the next task, you'll implement the readCSVFile function.
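The steps above (split the line, index into the fields, convert each field) can be combined into a small parser. The following is a minimal sketch, not the lab's solution: the WeatherRecord type and the parseLine and readRecords functions are hypothetical names introduced for illustration, and it assumes the five-column layout of the sample CSV file with a header row.

```kotlin
import java.io.File
import java.time.LocalDate
import java.time.format.DateTimeParseException

// Hypothetical record type; the lab's WeatherData class may be declared differently.
data class WeatherRecord(
    val date: LocalDate,
    val location: String,
    val temperature: Double,
    val humidity: Double,
    val pressure: Double
)

// Parses one CSV line such as "2024-01-01,New York,45.2,45,1018.3".
// Returns null when the line does not contain five well-formed fields.
fun parseLine(line: String): WeatherRecord? {
    val fields = line.split(',').map { it.trim() }
    if (fields.size != 5) return null
    return try {
        WeatherRecord(
            date = LocalDate.parse(fields[0]),
            location = fields[1],
            temperature = fields[2].toDouble(),
            humidity = fields[3].toDouble(),
            pressure = fields[4].toDouble()
        )
    } catch (e: DateTimeParseException) {
        null // the date is not in ISO format (yyyy-MM-dd)
    } catch (e: NumberFormatException) {
        null // one of the numeric fields is not a number
    }
}

// Skips the header row and keeps only the lines that parse successfully.
fun readRecords(path: String): List<WeatherRecord> =
    File(path).readLines().drop(1).mapNotNull { parseLine(it) }
```

Returning null for a malformed line is one possible design choice; it lets the caller decide whether to skip bad rows silently or report them.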
Step 2: Filtering Data by Date Range
Filtering a List by Date Range
In Kotlin, you can use the filter function to create a new list containing only the elements that satisfy a given predicate.
The filter function takes a lambda expression as an argument, which is applied to each element of the list. This lambda expression should return a Boolean value, indicating whether or not the element should be included in the resulting list.
Here's the general syntax for using filter with a lambda expression:
    val filteredList = originalList.filter { element ->
        // Predicate condition
        // Return true if the element should be included,
        // false otherwise
    }
For example, say you have a list of Order objects, and each Order has a date property representing the date when the order was placed. If you want to filter the list to include only the orders placed within a specific date range, you can use the filter function as follows:
    val filteredOrders = orderList.filter { order ->
        order.date >= startDate && order.date <= endDate
    }
In this example, orderList is the original list of Order objects, and startDate and endDate represent the desired date range. The lambda expression { order -> ... } is applied to each Order object in the list. Inside the lambda, you can access the properties of each Order object, such as date, and define the filtering condition.
The condition order.date >= startDate && order.date <= endDate checks if the date property of each Order object falls within the specified date range. If the condition is true, the order is included in the resulting filteredOrders list; otherwise, it's excluded.
In the application, after reading and processing the CSV file, the main() function in the main.kt file creates an instance of the UserInterface class and calls its start() function to display the main menu. When the user selects the option to filter data by date range from the menu, the filterDataByDateRange() function of the UserInterface class prompts the user to enter a start date and an end date for the desired date range, which are passed to the filterDataByDateRange() function of the DataProcessor class.
In the next task, you'll implement this function to filter the weather data by a date range.
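The same pattern carries over to weather records. Here is a short sketch, assuming dates are stored as java.time.LocalDate and that both ends of the range are inclusive; the Reading type and the filterByDateRange function are hypothetical names used only for this illustration.

```kotlin
import java.time.LocalDate

// Hypothetical record type used only for this sketch.
data class Reading(val date: LocalDate, val temperature: Double)

// Keeps the readings whose date lies in the inclusive range [start, end].
fun filterByDateRange(
    readings: List<Reading>,
    start: LocalDate,
    end: LocalDate
): List<Reading> = readings.filter { it.date >= start && it.date <= end }
```

Because filter returns a new list, the original data stays untouched and can be filtered again with a different range.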
Step 3: Performing Statistical Analysis
Calculating Average, Minimum, and Maximum Values in Kotlin
The application needs to calculate the average, minimum, and maximum values from the list of weather data. Next, you will see how to perform these calculations and handle cases where the list might be empty.
1. Calculating the Average. To calculate the average of values in a list, you can use the average() function. However, if the list is empty, calling average() will result in NaN (Not-a-Number). NaN is a special floating-point value that represents an undefined or unrepresentable result. In most cases, having NaN as a result is not desirable because it can propagate through further calculations and lead to unexpected behavior. To handle the case of an empty list and provide a more meaningful result, you can use an if-else expression to check if the list is empty and provide a default value.
Here's an example:
    val numbers = listOf<Double>()
    val avg = if (numbers.isEmpty()) 0.0 else numbers.average()
    println("Average: $avg") // Output: Average: 0.0
In this example, if the numbers list is empty, the average will be set to 0.0. Otherwise, it will calculate the average using the average() function. By checking for an empty list and providing a default value, you can avoid dealing with NaN and ensure that your code behaves predictably.
2. Finding the Minimum Value. To find the minimum value in a list, you can use the minOrNull() function. This function returns the minimum value if the list is not empty, or null if the list is empty. To provide a default value when the list is empty, you can use the Elvis operator (?:).
Here's an example:
    val numbers = listOf(10.0, 20.0, 30.0, 40.0, 50.0)
    val min = numbers.minOrNull() ?: 0.0
    println("Minimum: $min") // Output: Minimum: 10.0
In this example, if the numbers list is empty, the minimum value will be set to 0.0. Otherwise, it will find the minimum value using the minOrNull() function.
3. Finding the Maximum Value. Similar to finding the minimum value, you can use the maxOrNull() function to find the maximum value in a list. If the list is empty, maxOrNull() returns null. You can use the Elvis operator (?:) to provide a default value in case the list is empty.
Here's an example:
    val numbers = listOf(10.0, 20.0, 30.0, 40.0, 50.0)
    val max = numbers.maxOrNull() ?: 0.0
    println("Maximum: $max") // Output: Maximum: 50.0
In this example, if the numbers list is empty, the maximum value will be set to 0.0. Otherwise, it will find the maximum value using the maxOrNull() function.
By using these functions, you can calculate the average, minimum, and maximum values from a list in Kotlin while handling cases where the list might be empty.
In the application, after the CSV file is processed and whenever a new filter is applied, the performAnalysis() function of the DataProcessor class calls the analyze() function of the DataAnalyzer class to extract the temperatures, humidity values, and pressure from the list of WeatherData. It then calculates statistical summaries (average, min, max) for each weather attribute by calling the calculateStats() function.
In the next task, you'll complete the implementation of the calculateStats() function by using the above concepts.
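The three techniques above can be combined into a single helper. This is only a sketch under stated assumptions: the Stats type and the summarize function are hypothetical names, and the 0.0 default for an empty list follows the examples in this step rather than any confirmed detail of the lab's calculateStats() function.

```kotlin
// Hypothetical holder for the three summary values.
data class Stats(val average: Double, val min: Double, val max: Double)

// Summarizes a list of values, falling back to 0.0 for an empty list
// so that neither NaN nor null reaches the caller.
fun summarize(values: List<Double>): Stats = Stats(
    average = if (values.isEmpty()) 0.0 else values.average(),
    min = values.minOrNull() ?: 0.0,
    max = values.maxOrNull() ?: 0.0
)
```

Note that a 0.0 default is indistinguishable from a real measurement of zero, so a caller that needs to tell the two apart may prefer nullable results instead.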
Step 4: Detecting Trends
Identifying Trends with Linear Regression
To identify trends in the dataset, you'll use linear regression.
Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables. In the case of a simple linear regression with one independent variable, the goal is to find the best-fitting straight line through the data points.
The equation of a straight line is typically written as:
y = mx + b
Where:
- y is the dependent variable
- x is the independent variable
- m is the slope of the line
- b is the y-intercept (the value of y when x is zero)
To find the best-fitting line, you calculate the values of m (slope) and b (the y-intercept, or simply, intercept) that minimize the sum of the squared differences between the observed and predicted values.
The formulas for calculating the slope (m) and the intercept (b) using linear regression are:
    m = (n * Σ(x_i * y_i) - Σx_i * Σy_i) / (n * Σ(x_i^2) - (Σx_i)^2)
    b = (Σy_i - m * Σx_i) / n
Where:
- n is the number of data points
- x_i and y_i are the values of the independent and dependent variables for the i-th data point
- Σ represents the sum of the values
For example, say you have the following data points representing the number of hours studied (x) and the corresponding test scores (y) for a group of students:
- Hours studied (x): 2, 4, 6, 8, 10
- Test scores (y): 60, 75, 85, 90, 95
To find the trend line, you calculate:
    n = 5
    Σx_i = 2 + 4 + 6 + 8 + 10 = 30
    Σy_i = 60 + 75 + 85 + 90 + 95 = 405
    Σ(x_i * y_i) = (2 * 60) + (4 * 75) + (6 * 85) + (8 * 90) + (10 * 95) = 2600
    Σ(x_i^2) = 2^2 + 4^2 + 6^2 + 8^2 + 10^2 = 220
Using the formulas above, you can calculate the slope (m) and intercept (b):
    m = (5 * 2600 - 30 * 405) / (5 * 220 - 30^2) = 4.25
    b = (405 - 4.25 * 30) / 5 = 55.5
Therefore, the equation of the trend line is:
y = 4.25x + 55.5
This means that, on average, for each additional hour studied, the test score is expected to increase by 4.25 points, and a student who studies for 0 hours is expected to score 55.5 points.
Your task is to implement the calculateTrend() function in the TrendDetector.kt file using the formulas provided above. This function takes two lists of Double values representing the x and y coordinates of the data points and returns a Trend object containing the calculated slope and intercept.
Now, you might be thinking, where do these Double lists come from?
In the application, after the CSV file is processed and whenever a new filter is applied, the performAnalysis() function of the DataProcessor class is called. This function then invokes the detectTrends() function of the TrendDetector class. The purpose of this function is to convert the dates to numerical values (epoch days), extract the temperatures, humidity values, and pressure from the list of WeatherData, and then pass the dates as numerical values (as x) along with the values for each weather attribute (as y) to the calculateTrend() function.
The choice of converting the dates to the number of days since the Unix epoch (1970-01-01) is somewhat arbitrary. You could also use the number of days from the first day of the year or the number of days since 0001-01-01, for example. Your choice of the base date affects the intercept, but not the slope, because it shifts every x value (date) used in the regression formula by the same amount. However, any method of converting dates to numerical values can be considered technically correct and valid. The choice depends on your specific needs and consistency across your data processing and analysis pipeline. The key is to be aware of which method you are using and the implications of that choice, especially regarding data interchange and comparison with other datasets.
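The slope and intercept formulas from this step translate almost directly into code. The sketch below is one possible shape, not the lab's solution: TrendLine, fitLine, and toEpochDays are hypothetical names, and returning a flat line at zero for empty, mismatched, or degenerate input is an assumption made for this example.

```kotlin
import java.time.LocalDate

// Hypothetical result type; the lab's Trend class may be declared differently.
data class TrendLine(val slope: Double, val intercept: Double)

// Least-squares fit of y = m * x + b using the summation formulas.
// Assumption: returns TrendLine(0.0, 0.0) when the lists are empty, have
// different sizes, or when all x values are equal (zero denominator).
fun fitLine(x: List<Double>, y: List<Double>): TrendLine {
    if (x.isEmpty() || x.size != y.size) return TrendLine(0.0, 0.0)
    val n = x.size.toDouble()
    val sumX = x.sum()
    val sumY = y.sum()
    val sumXY = x.zip(y).map { (xi, yi) -> xi * yi }.sum()
    val sumXX = x.map { it * it }.sum()
    val denominator = n * sumXX - sumX * sumX
    if (denominator == 0.0) return TrendLine(0.0, 0.0)
    val slope = (n * sumXY - sumX * sumY) / denominator
    val intercept = (sumY - slope * sumX) / n
    return TrendLine(slope, intercept)
}

// Converts dates to days since 1970-01-01 so they can serve as x values.
fun toEpochDays(dates: List<LocalDate>): List<Double> =
    dates.map { it.toEpochDay().toDouble() }
```

Calling fitLine with the hours-studied data from the worked example (x = 2, 4, 6, 8, 10 and y = 60, 75, 85, 90, 95) reproduces the slope of 4.25 and the intercept of 55.5.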
Step 5: Detecting Anomalies
Standard Deviation and Anomaly Detection
Standard deviation is a statistical measure that quantifies the amount of variation or dispersion in a dataset. It tells you how much the data points deviate, on average, from the mean (average) of the dataset. A low standard deviation indicates that the data points tend to be close to the mean, while a high standard deviation indicates that the data points are spread out over a wider range.
To calculate the standard deviation:
- Calculate the mean of the dataset.
- For each data point, calculate the difference between the data point and the mean (deviation).
- Square each deviation to make them all positive.
- Calculate the average of the squared deviations (variance).
- Take the square root of the variance to get the standard deviation.
For example, say you have a dataset of exam scores: 85, 90, 92, 88, 95. You can perform the following calculations:
- Mean: (85 + 90 + 92 + 88 + 95) / 5 = 90
- Deviations: -5, 0, 2, -2, 5
- Squared deviations: 25, 0, 4, 4, 25
- Variance: (25 + 0 + 4 + 4 + 25) / 5 = 11.6
- Standard deviation: sqrt(11.6) ≈ 3.4
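The five steps listed above can be sketched as a short function. It computes the population standard deviation (dividing by the number of values, as in the worked example); the function name and the 0.0 result for an empty list are assumptions made for this illustration.

```kotlin
import kotlin.math.sqrt

// Population standard deviation: the square root of the mean squared deviation.
// Assumption: returns 0.0 for an empty list.
fun standardDeviation(values: List<Double>): Double {
    if (values.isEmpty()) return 0.0
    val mean = values.average()
    val variance = values.map { (it - mean) * (it - mean) }.average()
    return sqrt(variance)
}
```

For the exam scores 85, 90, 92, 88, 95, this returns the square root of 11.6, which is approximately 3.4.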
Standard deviation is helpful in detecting anomalies or outliers in a dataset. Anomalies are data points that significantly deviate from the normal behavior or pattern of the data.
A common approach is to define a threshold based on the standard deviation. Data points that are more than a certain number of standard deviations away from the mean (that is, above mean + threshold or below mean - threshold) are considered anomalies.
Following the above example, if you define the threshold for detecting anomalies as two times the standard deviation and the exam scores are 85, 90, 92, 88, 95, 65, you can perform the following calculations:
- Mean score ≈ 85.8
- Standard deviation ≈ 9.8
- Threshold: 2 * 9.8 = 19.6
- Anomaly range: (85.8 - 19.6, 85.8 + 19.6) = (66.2, 105.4)
The score of 65 is below the lower limit and can be considered an anomaly.
In the application, after the CSV file is processed and whenever a new filter is applied, the performAnalysis() function of the DataProcessor class calls the detectAnomalies() function of the AnomalyDetector class to identify anomalies in the temperature data. This function calls calculateStandardDeviation() in the same class to calculate the standard deviation using the steps mentioned above.
In the next tasks, you'll complete the implementation of both functions, calculateStandardDeviation() and detectAnomalies().
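The threshold rule described in this step can be sketched as a filter over the values. This is an illustration only: findOutliers is a hypothetical name, the function works on plain Double values rather than the lab's weather records, and the multiplier k is exposed as a parameter that defaults to 2.0.

```kotlin
import kotlin.math.abs
import kotlin.math.sqrt

// Returns the values lying more than k standard deviations from the mean.
// Uses the population standard deviation; k defaults to 2.0.
fun findOutliers(values: List<Double>, k: Double = 2.0): List<Double> {
    if (values.isEmpty()) return emptyList()
    val mean = values.average()
    val stdDev = sqrt(values.map { (it - mean) * (it - mean) }.average())
    val threshold = k * stdDev
    return values.filter { abs(it - mean) > threshold }
}
```

With the six exam scores from the example (85, 90, 92, 88, 95, 65), only 65 is returned, matching the manual calculation.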
Step 6: Saving Data and Analysis Results
Writing Data to Files in Kotlin
In Kotlin, you can write data to files using the functions bufferedWriter() and write(). These functions provide a convenient way to write text content to files efficiently.
To write data to a file, you first need to create a File object representing the file you want to write to. You can specify the file path as a parameter when creating the File object:
    val file = File("path/to/file.txt")
Once you have the File object, you can use the bufferedWriter() function to create a BufferedWriter instance. The BufferedWriter is a high-level writer that provides buffering capabilities, which can improve performance when writing large amounts of data:
    val writer = file.bufferedWriter()
After getting the BufferedWriter, you can use the write() function to write text content to the file. The write() function takes a string as a parameter, which represents the data you want to write.
Here's an example of writing a line of text to a file:
    writer.write("Hello, World!\n")
In the above example, the string "Hello, World!" is written to the file, followed by a newline character (\n) to start a new line.
You can write multiple lines of text by calling the write() function multiple times or by using a loop to iterate over a collection of data.
After you've finished writing data to the file, it's important to close the writer. This ensures that all the data is flushed and the file is properly closed. You can use the close() function to close the writer:
    writer.close()
Here's a complete example that demonstrates writing data to a file:
    val file = File("path/to/file.txt")
    val writer = file.bufferedWriter()
    writer.write("Hello, World!\n")
    writer.write("This is a sample text.\n")
    writer.write("Writing data to a file in Kotlin is easy!")
    writer.close()
In this example, three lines of text are written to the file using the write() function. Finally, the writer is closed using the close() function.
In the application, when the user selects the option to save the filtered data and analysis results, and specifies the directory path for these files, two functions are invoked. The saveFilteredDataToCsv() function saves the filtered data to a file named filtered_data.csv, and the saveAnalysisResultsToText() function saves the analysis results to the analysis_results.txt file. Both functions are defined in the DataSavingUtils.kt file.
In the next tasks, you'll implement these functions.
The saveAnalysisResultsToText() Function
In the file DataSavingUtils.kt, the function saveAnalysisResultsToText() takes four parameters:
- analysisFile: a File object representing the file to write the analysis results to.
- analysisResults: an AnalysisResults object containing the statistical analysis results.
- trendResults: a TrendResults object containing the trend analysis results.
- anomalies: a List<Anomaly> containing the detected anomalies.
Inside the function, a variable of type BufferedWriter, named analysisWriter, is already created using analysisFile.bufferedWriter(). In the next tasks, you'll use the parameters and the BufferedWriter to write the statistical analysis results, the trend analysis results, and the detected anomalies.
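As a complement to calling close() manually, Kotlin's standard library offers the use function, which closes the writer automatically even when an exception is thrown while writing. The sketch below shows that alternative; writeLines is a hypothetical helper name, and the lab's own functions may be structured differently.

```kotlin
import java.io.File

// Writes each string on its own line. The use block closes the writer
// when it finishes, whether normally or because of an exception.
fun writeLines(file: File, lines: List<String>) {
    file.bufferedWriter().use { writer ->
        for (line in lines) {
            writer.write(line)
            writer.newLine() // platform-specific line separator
        }
    }
}
```

This avoids leaving a file handle open if one of the write() calls fails partway through.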
Conclusion
Congratulations on successfully completing this Code Lab!
To compile and run your program, you'll need to use the Run button. This is located at the bottom-right corner of the Terminal. Here's how to proceed:
- Compilation: Clicking Run will compile all the files in the src directory into a JAR file named ClimateAnalyzer.jar.
- Running the Program: After compilation, the program will automatically execute using the command:
    java -jar ClimateAnalyzer.jar data.csv
There is a data.csv file containing some sample weather data. Follow the prompts in the menu to filter the data and view the analysis, trend, and anomaly detection results. Then, you can save the filtered data in one file and all the analysis, trend, and anomaly detection results in another file.
Extending the Program
Consider exploring these ideas to further enhance your skills and expand the capabilities of the program:
- Improve error handling. For simplicity, the application doesn't implement error handling in many areas, such as for cases where the file format might not be as expected (incorrect date format, non-numeric values for temperature, humidity, or pressure). Enhance the program by adding robust error handling mechanisms to gracefully handle and recover from errors, providing informative error messages to the user.
- Implement data visualization. Integrate a data visualization library to create charts, graphs, or plots that visually represent the weather data, analysis results, trends, and anomalies. This will enable users to gain insights and spot patterns more easily.
- Support multiple data formats. Extend the program to support reading weather data from different file formats such as JSON, XML, or databases. This will increase the flexibility and interoperability of the application, allowing it to work with various data sources.
- Add more options for data filtering. Enhance the user interface to allow interactive filtering of weather data based on multiple criteria such as location, temperature range, humidity range, or pressure range. This will provide users with more control over the data they want to analyze and visualize.
- Add support for configuration options. For example, the anomaly detection threshold is hardcoded as two times the standard deviation. Consider making this a configurable parameter, allowing for flexibility in anomaly detection sensitivity.
Related Courses on Pluralsight's Library
If you're interested in further honing your Kotlin skills or exploring more topics, Pluralsight offers several excellent courses in the following path:
These courses cover many aspects of Kotlin programming. You should check them out to continue your learning journey in Kotlin.