用于多元时间序列的 Keras 递归神经网络

时间：2023-04-28

本文介绍了用于多元时间序列的 Keras 递归神经网络的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

限时送ChatGPT账号..

我一直在阅读有关 Keras RNN 模型(LSTM 和 GRU)的文章，作者似乎主要关注语言数据或使用由先前时间步长组成的训练实例的单变量时间序列.我的数据有点不同.

I have been reading about Keras RNN models (LSTMs and GRUs), and authors seem to largely focus on language data or univariate time series that use training instances composed of previous time steps. The data I have is a bit different.

我有 20 个变量在 10 年内每年测量 100,000 人作为输入数据，并将第 11 年测量的 20 个变量作为输出数据.我想做的是预测第 11 年的其中一个变量(而不是其他 19 个)的值.

I have 20 variables measured every year for 10 years for 100,000 persons as input data, and the 20 variables measured for year 11 as output data. What I would like to do is predict the value of one of the variables (not the other 19) for the 11th year.

我的数据结构为 X.shape = [persons, years, variables] = [100000, 10, 20] 和 Y.shape = [persons, variable] = [100000, 1].下面是我的 LSTM 模型的 Python 代码.

I have my data structured as X.shape = [persons, years, variables] = [100000, 10, 20] and Y.shape = [persons, variable] = [100000, 1]. Below is my Python code for a LSTM model.

## LSTM model.

# Define model.

network_lstm = models.Sequential()
network_lstm.add(layers.LSTM(128, activation = 'tanh', 
     input_shape = (X.shape[1], X.shape[2])))
network_lstm.add(layers.Dense(1, activation = None))

# Compile model.

network_lstm.compile(optimizer = 'adam', loss = 'mean_squared_error')

# Fit model.

history_lstm = network_lstm.fit(X, Y, epochs = 25, batch_size = 128)

我有四个(相关)问题，请:

I have four (related) questions, please:

我是否为我拥有的数据结构正确编码了 Keras 模型?我从全连接网络(使用扁平数据)和 LSTM、GRU 和 1D CNN 模型获得的性能几乎相同，我不知道我是否在 Keras 中犯了错误，或者循环模型是否只是简单的在这种情况下没有帮助.

Have I coded the Keras model correctly for the data structure I have? The performance I get from a fully-connected network (using flattened data) and from LSTM, GRU, and 1D CNN models are nearly identical, and I don't know if I have made an error in Keras or if a recurrent model is simply not helpful in this case.

我是否应该将 Y 作为一个形状为 Y.shape = [persons, years] = [100000, 11] 的系列，而不是将变量包含在 X 中，这样就会有形状 X.shape = [人、年、变量] = [100000, 10, 19]?如果是这样，我怎样才能让 RNN 输出预测的序列?当我使用 return_sequences = True 时，Keras 返回错误.

Should I have Y as a series with shape Y.shape = [persons, years] = [100000, 11], rather than including the variable in X, which would then have shape X.shape = [persons, years, variables] = [100000, 10, 19]? If so, how can I get the RNN to output the predicted sequence? When I use return_sequences = True, Keras returns an error.

这是用我拥有的数据进行预测的最佳方法吗?Keras RNN 模型甚至其他模型中是否有更好的选项可供选择?

Is this the best way to predict with the data I have? Are there better option choices available in the Keras RNN models, or even other models?

我如何模拟类似于现有数据结构的数据，以使 RNN 模型的性能优于全连接网络?

How could I simulate data resembling the data structure I have so that a RNN model would outperform a fully-connected network?

更新:

我尝试了一个模拟，我希望这是一个非常简单的案例，RNN 应该有望胜过 FNN.

I have tried a simulation, with what I hope is a very simple case where an RNN should be expected to outperform a FNN.

虽然当 LSTM 的隐藏层较少 (4) 时，LSTM 的性能往往优于 FNN，但在隐藏层较多 (8+) 的情况下，性能变得相同.谁能想到一个更好的模拟，其中 RNN 有望胜过具有相似数据结构的 FNN?

While the LSTM tends to outperform the FNN when both have less hidden layers (4), the performance becomes identical with more hidden layers (8+). Can anyone think of a better simulation where a RNN would be expected to outperform a FNN with a similar data structure?

from keras import models
from keras import layers

from keras.layers import Dense, LSTM

import numpy as np
import matplotlib.pyplot as plt

下面的代码模拟了 10,000 个实例、10 个时间步长和 2 个变量的数据.如果第二个变量在第一个时间步为 0，则 Y 是最后一个时间步的第一个变量的值乘以 3.如果第二个变量在第一个时间步为 1，则 Y 为最后一个时间步的第一个变量的值乘以 9.

The code below simulates data for 10,000 instances, 10 time steps, and 2 variables. If the second variable has a 0 in the very first time step, then Y is the value of the first variable for the very last time step multiplied by 3. If the second variable has a 1 in the very first time step, then Y is the value of the first variable for the very last time step multiplied by 9.

我希望 RNN 将第一个时间步的第二个变量的值保留在内存中，并使用它来知道哪个值(3 或 9)与最后一个时间步的第一个变量相乘.

My hope was that the RNN would keep the value of second variable at the very first time step in memory and use that to know which value (3 or 9) to multiply the the first variable for the very last time step.

## Simulate data.

instances = 10000

sequences = 10

X = np.zeros((instances, sequences * 2))

X[:int(instances / 2), 1] = 1

for i in range(instances):

    for j in range(0, sequences * 2, 2):

        X[i, j] = np.random.random()

Y = np.zeros((instances, 1))

for i in range(len(Y)):

    if X[i, 1] == 0:

        Y[i] = X[i, -2] * 3

    if X[i, 1] == 1:

        Y[i] = X[i, -2] * 9

下面是 FNN 的代码:

Below is code for a FNN:

## Densely connected model.

# Define model.

network_dense = models.Sequential()
network_dense.add(layers.Dense(4, activation = 'relu', 
     input_shape = (X.shape[1],)))
network_dense.add(Dense(1, activation = None))

# Compile model.

network_dense.compile(optimizer = 'rmsprop', loss = 'mean_absolute_error')

# Fit model.

history_dense = network_dense.fit(X, Y, epochs = 100, batch_size = 256, verbose = False)

plt.scatter(Y[X[:, 1] == 0, :], network_dense.predict(X[X[:, 1] == 0, :]), alpha = 0.1)
plt.plot([0, 3], [0, 3], color = 'black', linewidth = 2)
plt.title('FNN, Second Variable has a 0 in the Very First Time Step')
plt.xlabel('Actual')
plt.ylabel('Predicted')

plt.show()

plt.scatter(Y[X[:, 1] == 1, :], network_dense.predict(X[X[:, 1] == 1, :]), alpha = 0.1)
plt.plot([0, 9], [0, 9], color = 'black', linewidth = 2)
plt.title('FNN, Second Variable has a 1 in the Very First Time Step')
plt.xlabel('Actual')
plt.ylabel('Predicted')

plt.show()

下面是 LSTM 的代码:

Below is code for a LSTM:

## Structure X data for LSTM.

X_lstm = X.reshape(X.shape[0], X.shape[1] // 2, 2)

X_lstm.shape

## LSTM model.

# Define model.

network_lstm = models.Sequential()
network_lstm.add(layers.LSTM(4, activation = 'relu', 
     input_shape = (X_lstm.shape[1], 2)))
network_lstm.add(layers.Dense(1, activation = None))

# Compile model.

network_lstm.compile(optimizer = 'rmsprop', loss = 'mean_squared_error')

# Fit model.

history_lstm = network_lstm.fit(X_lstm, Y, epochs = 100, batch_size = 256, verbose = False)

plt.scatter(Y[X[:, 1] == 0, :], network_lstm.predict(X_lstm[X[:, 1] == 0, :]), alpha = 0.1)
plt.plot([0, 3], [0, 3], color = 'black', linewidth = 2)
plt.title('LSTM, FNN, Second Variable has a 0 in the Very First Time Step')
plt.xlabel('Actual')
plt.ylabel('Predicted')

plt.show()

plt.scatter(Y[X[:, 1] == 1, :], network_lstm.predict(X_lstm[X[:, 1] == 1, :]), alpha = 0.1)
plt.plot([0, 9], [0, 9], color = 'black', linewidth = 2)
plt.title('LSTM, FNN, Second Variable has a 1 in the Very First Time Step')
plt.xlabel('Actual')
plt.ylabel('Predicted')

plt.show()

用于多元时间序列的 Keras 递归神经网络

问题描述

推荐答案

相关文章

最新文章