Python - Applying LSTM to datasets with different time stamps



























What I want to achieve



I am working with two datasets. Dataset #1 contains Natural Gas futures for different contract months, starting with the most current one and running to 100 rows. Dataset #2 is Natural Gas storage data, which is refreshed every week. I have included both datasets below.



Dataset #2
[image: Natural Gas storage data]

Dataset #1
[image: Natural Gas futures data]



The prices in dataset #1 (e.g. NGLast, NGOpen, NGHigh) fluctuate every 10 minutes while active trading continues. Dataset #2, however, is published once a week and reports Natural Gas storage, and the prices in dataset #1 can move depending on how much storage was reported.



Code



import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Dense
from keras.models import Sequential
from keras.layers import LSTM
import datetime
from keras import metrics
from sklearn.preprocessing import MinMaxScaler

data = pd.read_excel("C:FuturesFutures.xls")

data['Contract'] = pd.to_datetime(data['Contract'],unit='s').dt.date
# Strip the trailing 's' marker from the last-price columns
data['NG Last'] = data['NG Last'].str.rstrip('s')
data['CO Last'] = data['CO Last'].str.rstrip('s')

L=len(data)

# Wrapping each column in a list gives arrays of shape (1, n_rows)
COHigh = np.array([data.iloc[:,8]])
COLow = np.array([data.iloc[:,9]])
NGLast = np.array([data.iloc[:,1]])
NGOpen = np.array([data.iloc[:,2]])
NGLow = np.array([data.iloc[:,4]])
COOpen = np.array([data.iloc[:,7]])
NGHigh = np.array([data.iloc[:,3]])
COLast = np.array([data.iloc[:,10]])
NGP = np.array([data.iloc[:,5]])
NGVolumes = np.array([data.iloc[:,6]])
COVolumes = np.array([data.iloc[:,12]])
COP = np.array([data.iloc[:,11]])

# Stack the 11 input columns, then transpose to (n_rows, 11)
X = np.concatenate([COHigh,COLow, NGLast,NGOpen,COOpen,COLast, NGHigh,NGVolumes,COVolumes, COP,NGP], axis =0)
X = np.transpose(X)

# Target: NG Low, shape (n_rows, 1)
Y = NGLow
Y = np.transpose(Y)

# Separate scalers, so the target scaler can invert the predictions later
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()
X = x_scaler.fit_transform(X)
Y = y_scaler.fit_transform(Y)

# LSTM input is (samples, timesteps, features); here timesteps = 1
X = np.reshape(X,(X.shape[0],1,X.shape[1]))
print(X.shape)

model = Sequential()
model.add(LSTM(100,activation='tanh',input_shape=(1,11), recurrent_activation='hard_sigmoid'))
model.add(Dense(1))

# 'accuracy' is a classification metric; mean absolute error fits this regression
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics = ['mae'])
model.fit(X,Y,epochs = 100,batch_size=1,verbose=2)

Predict = model.predict(X,verbose=1)

# Map the predictions back to the original price scale
inversed = y_scaler.inverse_transform(Predict)
loss, mae = model.evaluate(X, Y, verbose=2)
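The reshape in the code gives every sample a single time step, so the LSTM sees one row at a time and no history. If the rows were ordered in time (an assumption; it is not clear from the description of dataset #1), one possible way to feed several consecutive rows per sample is sketched below. The helper name `make_windows` and the toy data are hypothetical, and the label here is the target of the row following each window, which differs from the same-row prediction in the code above.

```python
import numpy as np

def make_windows(features, target, timesteps):
    """Group `timesteps` consecutive rows into one sample.

    The label for each sample is the target value of the row that
    immediately follows the window.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(features) - timesteps
    if n_samples <= 0:
        raise ValueError("need more rows than timesteps")
    # Resulting shape: (n_samples, timesteps, n_features)
    X = np.stack([features[i:i + timesteps] for i in range(n_samples)])
    y = target[timesteps:timesteps + n_samples]
    return X, y

# Toy data: 6 rows, 2 features (values are placeholders, not market data)
features = np.arange(12).reshape(6, 2)
target = np.arange(6) * 10.0

X_win, y_win = make_windows(features, target, timesteps=3)
print(X_win.shape)  # (3, 3, 2)
print(y_win)        # [30. 40. 50.]
```

With windows like these, the `input_shape` passed to the LSTM layer would be `(timesteps, n_features)` rather than `(1, 11)`.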


What has been done thus far



The code above predicts "NG Low" from 11 input parameters in dataset #1, and the results look accurate.
However, I do not see how to include parameters from dataset #2, whose data is released on a weekly basis:
Dataset #1 updates every 10 minutes.
Dataset #2 updates every week.
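One common way to put the two frequencies on the same rows is an as-of join: every 10-minute row picks up the most recent weekly value that had already been published at that time. Below is a minimal sketch with `pandas.merge_asof`. The column names (`timestamp`, `released_at`, `storage_level`) and all values are made up for illustration, and it assumes both datasets carry real timestamps.

```python
import pandas as pd

# Hypothetical 10-minute price rows (values are placeholders)
prices = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2018-12-20 10:20", "2018-12-20 10:30", "2018-12-20 10:40",
        "2018-12-27 10:20", "2018-12-27 10:30",
    ]),
    "NGLow": [3.60, 3.58, 3.55, 3.40, 3.45],
})

# Hypothetical weekly storage reports, stamped with their publication time
storage = pd.DataFrame({
    "released_at": pd.to_datetime(["2018-12-20 10:30", "2018-12-27 10:30"]),
    "storage_level": [100, 90],
})

# direction="backward": each price row takes the latest report published
# at or before its own timestamp, so no future information leaks in
merged = pd.merge_asof(
    prices.sort_values("timestamp"),
    storage.sort_values("released_at"),
    left_on="timestamp",
    right_on="released_at",
    direction="backward",
)

# Rows before the first report have no storage value; drop them here
merged = merged.dropna(subset=["storage_level"]).reset_index(drop=True)

# Optional extra feature: how old the latest report is, in hours
merged["hours_since_release"] = (
    (merged["timestamp"] - merged["released_at"]).dt.total_seconds() / 3600
)
print(merged[["timestamp", "NGLow", "storage_level", "hours_since_release"]])
```

The weekly value is simply repeated until the next report arrives, and `hours_since_release` gives the model a way to tell a fresh report from a stale one.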



Additional information I plan to include



I also intend to bring in weekly weather data to make my model more reliable.
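If the weekly weather series is indexed by date, another option is to forward-fill it onto the 10-minute index of the price data with `DataFrame.reindex`. The sketch below assumes both frames use a `DatetimeIndex`; the column name `avg_temp_f`, the dates, and the values are placeholders.

```python
import pandas as pd

# Hypothetical weekly weather observations (made-up values)
weather = pd.DataFrame(
    {"avg_temp_f": [40.0, 35.0]},
    index=pd.to_datetime(["2018-12-17", "2018-12-24"]),
)

# Hypothetical 10-minute index standing in for dataset #1's timestamps
price_index = pd.date_range("2018-12-23 23:40", periods=4, freq="10min")
prices = pd.DataFrame({"NGLow": [3.5, 3.4, 3.6, 3.7]}, index=price_index)

# Carry the latest weekly observation forward onto every 10-minute row
weather_10min = weather.reindex(price_index, method="ffill")

# Combine the columns and read the feature count from the data
features = prices.join(weather_10min)
n_features = features.shape[1]
print(features)
print(n_features)  # 2
```

Each weekly column added this way increases the feature count, so the last dimension of the LSTM `input_shape` (11 in the code above) would need to match the new number of input columns.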










Tags: python, lstm

asked Dec 29 '18 at 15:44 by Siddharth Kulkarni























