官术网_书友最值得收藏!

Weather forecasting with Markov chains

We want to build a statistical model to predict the weather. To simplify the model, we will assume that there are only two states: sunny and rainy. Let's further assume that we have made some calculations and discovered that tomorrow's time is somehow based on today's time, according to the following transition matrix:

Recall that this matrix contains the conditional probabilities of the type expressed as P (A | B)—that is, the probability of A given B. So, this matrix contains the following conditional probabilities:

In the preceding matrix, Su = sunny and Ra = rainy. Each row must consist of a complete distribution; thus, all the numbers must be non-negative and sum to 1. Weather has a tendency to resist change—for instance, P(Sunny | Sunny) is more likely than P(Sunny | Rainy). Tomorrow's value is not directly related to yesterday's (or an earlier time's) value, so the process is Markovian.

The previous matrix corresponds to the following transition diagram:

At this point, the following questions come to mind:

  • If today is sunny, how can we calculate the probability that it is rainy in the next few days?
  • After a certain number of days, what will be the proportion of sunny and rainy days?

Both questions, as well as many others that can come to mind, can be answered through the tools that make Markov chains available to us. The following is the Python code that provides for the alternation between sunny and rainy days starting from a specific initial condition:

import numpy as np
import time
from matplotlib import pyplot

np.random.seed(1)
states = ["Sunny","Rainy"]

TransStates = [["SuSu","SuRa"],["RaRa","RaSu"]]
TransnMatrix = [[0.75,0.25],[0.30,0.70]]

if sum(TransnMatrix[0])+sum(TransnMatrix[1]) != 2:
print("Warning! Probabilities MUST ADD TO 1. Wrong transition matrix!!")
raise ValueError("Probabilities MUST ADD TO 1")

WT = list()
NumberDays = 200
WeatherToday = states[0]
print("Weather initial condition =",WeatherToday)

i = 0
while i < NumberDays:
if WeatherToday == "Sunny":
TransWeather = np.random.choice(TransStates[0],
replace=True,p=TransnMatrix[0])
if TransWeather == "SuSu":
pass
else:
WeatherToday = "Rainy"
elif WeatherToday == "Rainy":
TransWeather = np.random.choice(TransStates[1],
replace=True,p=TransnMatrix[1])
if TransWeather == "RaRa":
pass
else:
WeatherToday = "Sunny"
print(WeatherToday)
WT.append(WeatherToday)
i += 1
time.sleep(0.2)

pyplot.plot(WT)
pyplot.show()

pyplot.hist(WT)
pyplot.show()

We will analyze this code line by line. The first lines load the libraries:

import numpy as np
import time
from matplotlib import pyplot

The first line of this imports the numpy library: numpy is a library for the Python programming language, adding support for large, multidimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. Two functions will be used: random.seed() and random.choise(). The second line imports the time module which provides various time-related functions. Finally, the pyplot module is imported from the matplotlib library.

Let's continue analyzing the code:

np.random.seed(1)

The random.seed() function sets the seed of a random number generator, which is useful for creating simulations or random objects that can be reproduced. You have to use this function every time you want to get a reproducible random result. In this case, the random numbers are the same, and they would continue to be the same no matter how far out in the sequence we go. Each seed value will correspond to a sequence of values that are generated for a given random number generator. In the following line, we define the state of the weather condition:

states = ["Sunny","Rainy"]

As we said, only two states are provided: Sunny and Rainy. At this point, we have to define the possible transitions of weather conditions:

TransStates = [["SuSu","SuRa"],["RaRa","RaSu"]]

Only two states allow only four types of transitions: sunny to sunny, sunny to rainy, rainy to rainy, and rainy to sunny. Let's move on to define the transition matrix according to what was established at the beginning of the section:

TransnMatrix = [[0.75,0.25],[0.30,0.70]]

Remember that this matrix contains the conditional probabilities of the type expressed as P (A | B)—that is, the probability of A given B. As already mentioned, the rows of this matrix add up to 1. Then we insert the following check to verify that we did not make mistakes in defining the transition matrix:

if sum(TransnMatrix[0])+sum(TransnMatrix[1]) != 2:
print("Warning! Probabilities MUST ADD TO 1. Wrong transition matrix!!")
raise ValueError("Probabilities MUST ADD TO 1")

Let's move on to creating the main variable using the following code:

WT = list()

The WT phrase will be the variable that will contain the sequence of values that are representative of the weather forecast. This variable will be of a list type. Let's now fix the number of days for which we want to forecast the weather conditions using the following code:

NumberDays = 200

If we wanted to fix another interval, it would be enough to change this value. Let's now fix the initial conditions using the following code:

WeatherToday = states[0]

In this way, we have selected the first element of the states variable. Remember that in Python the first element of a list has an index equal to 0. At this point, we can move on to the weather forecast. But first, we set the initial conditions using the following code:

print("Weather initial condition =",WeatherToday)

We can now predict the weather conditions for each of the days set by the NumberDays variable. To do this, we will use a while loop. A while loop is different from a for loop, but basically, it performs the same functions. It consists of a control condition and a loop body. At the entrance of the cycle and every time that all the instructions contained in the body are executed, the validity of the control condition is verified. The cycle ends when the condition, consisting of a Boolean expression, returns false. To start, we initialize a counter to allow us to perform the check, as follows:

i = 0

The exit condition that we impose will be i <NumberDays. This means that when this Boolean expression is false (and this will happen when i = NumberDays), then the loop will not be executed, and it will go to the first statement after the while block, as shown in the following code:

while i < NumberDays:
if WeatherToday == "Sunny":
TransWeather = np.random.choice(TransStates[0],replace=True,p=TransnMatrix[0])
if TransWeather == "SuSu":
pass
else:
WeatherToday = "Rainy"
elif WeatherToday == "Rainy":
TransWeather = np.random.choice(TransStates[1],replace=True,p=TransnMatrix[1])
if TransWeather == "RaRa":
pass
else:
WeatherToday = "Sunny"
print(WeatherToday)
WT.append(WeatherToday)
i += 1
time.sleep(0.2)

This is the main part of the whole program, so you will need to pay close attention to it. Within the while loop, the prediction of the time for each day in succession occurs through a further conditional structure: the if statement. Starting from a weather condition (remember that we established at the beginning that the weather conditions are sunny), we must provide the weather condition of the next day. We can therefore start from two conditions: Sunny or Rainy.

In fact, the first if statement provides two control conditions, as shown in the following screenshot:

Starting from the weather condition (Sunny or Rainy), the random.choice() function is used for the forecast. As already mentioned, this function generates a random sample from a given 1-D array. The syntax of the function is as follows:

numpy.random.choice (a, size = None, replace = True, p = None)

Let's look at this code in more detail:

  • a: 1-D array-like or intif this is an ndarray, a random sample is generated from its elements. If this is an int, the random sample is generated as if a were np.arange (a).

  • size: int or tuple of int instances. This has an optional output shape. If the given shape is, for example, (m, n, k), then m ? n ? k samples are drawn. The default is None, in which case a single value is returned.

  • replace: Boolean. This is optional, whether the sample is with or without a replacement.

  • p: 1-D array-like. This is optional. This denotes the probabilities associated with each entry in a. If not given the sample, then it is assumed that there is a uniform distribution over all entries in a.

The generated random samples are returned—in our case, as one of the following values: SuSu, SuRa, RaRa, and RaSu. The first two start from the Sunny condition and the other two start from the Rainy condition. These values are stored in the TransWeather variable.

Within each condition (the if and elif clauses), there is a further if statement. We need this to determine whether to make the transition or to leave the time unchanged (leaving it set to the previous day). For example, in the case of sunny starting conditions, we use the following:

if TransWeather == "SuSu":
pass
else:
WeatherToday = "Rainy"

If the TransWeather variable contains the value SuSu, then the content of the WeatherToday variable (which contains the weather condition of the current day) remains unchanged; otherwise, it is replaced by the Rainy value. Of course, this applies when starting from sunny conditions. For the elif clause, a similar line of reasoning can be made. The last piece of code within the while loop allows us to update the values of the current iteration, as shown in the following code:

print(WeatherToday)
WT.append(WeatherToday)
i += 1
time.sleep(0.2)

The first line displays the current weather condition on video. The second line stores the current status in a list to memorize the evolution of the weather conditions. These value will return useful information with which to draw a diagram. The third line increases the cycle counter by one unit. The final line adjusts the printing time of the values to make it readable.

At this point, we have generated forecasts for the next 200 days. Let's plot the chart using the following code:

pyplot.plot(WT)
pyplot.show()

The pyplot.plot phrase plots WT (a list that contains the weather condition for the next 200 days) on the y axis using an array 0 ... N-1 as the x axis. The plot() phrase is a versatile command, and will take an arbitrary number of arguments. Finally, pyplot.show displays the diagram that was created. The following graph shows the weather conditions for the next 200 days, starting from the sunny condition:

At first sight, it seems that Sunny days prevail over rainy ones. To be sure, we can draw a histogram. In fact, a histogram will return the number of values for each of the classes that were counted in the distribution, as shown in the following code:

pyplot.hist(WT)
pyplot.show()

The following diagram shows a histogram of the weather conditions for the next 200 days:

As anticipated, we have confirmation of the prevalence of Sunny days. This result does not seem unexpected because it is based on the transition matrix. In fact, the forecasts are obtained on the basis of this information. If we analyze the transition matrix from which we started again, we can see that the probability of persistence of a Sunny condition is greater than that of a Rainy condition. Furthermore, the starting condition has been set as a Sunny condition. It can be a useful exercise to see what happens when we start from a Rainy condition. To do this, just modify the initial conditions.

主站蜘蛛池模板: 许昌县| 普兰店市| 宜宾县| 高密市| 那曲县| 太保市| 望都县| 广灵县| 正定县| 通州市| 霍州市| 靖安县| 济源市| 舒城县| 千阳县| 桃源县| 城口县| 定结县| 墨竹工卡县| 陇西县| 黔江区| 三台县| 乌拉特中旗| 忻州市| 措勤县| 宁武县| 博罗县| 家居| 义马市| 翁源县| 徐水县| 舒城县| 松潘县| 景宁| 宜章县| 措美县| 交城县| 蚌埠市| 会同县| 永善县| 石楼县|