Recording data on a daily basis onto Excel using DataFrames



























Good morning. Right now I'm working on an RPi project that captures some data every second and translates it into a numerical figure. I want the RPi to save this number/time data to an Excel file, and I also want this file to be accessible, with the data being graphable depending on user input (e.g. the user wants data from the past 3 days > output a graph of the data from the past 3 days). I understand that this can be divided into two parts: (1) read and save the data, and (2) pull and graph the data. For this question I would like to focus on (1).



For reading data from the sensor and saving it to an Excel file that acts as a database, I was thinking of using pandas' DataFrame. Why, you may ask: I am basing my code on code left by a predecessor, which already has similar read/write logic. However, the scale is very different (~50 entries vs. ~38,000 entries).



I was thinking of recording the data as:



[Image: "Data in Excel" — sample of the intended layout, one column per date with one row per second]



As you can see, with data for every second I would end up with 86,400 entries for a single day.
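That intended shape can be sketched directly in pandas (this is my illustration, not code from the question; the column name follows the question's `%y%b%d` format and the values are placeholders):

```python
import pandas as pd

# Sketch of the intended layout: one row per second of the day,
# one column per date, so a full day is 86400 rows.
seconds = pd.date_range("00:00:00", "23:59:59", freq="1s").strftime("%H:%M:%S")
day = pd.DataFrame({"19Jan04": [0] * len(seconds)}, index=seconds)
print(day.shape)  # (86400, 1)
```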



Right now, my code for saving this data is as follows. I made a class for the data to be saved, with these attributes:



import datetime

import pandas as pd  # used by saveToDf below

class ShiftDataSet:

    def __init__(self):
        self.effDataList = []   # per-second 1/0 readings
        self.timeDataList = []  # per-second timestamp strings
        self.dateTimeToday = datetime.datetime.now()
        self.date = self.dateTimeToday.strftime("%y%b%d")  # str
        # %y is year without century, %b is month abbv, %d is day of month
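For context, a minimal sketch of the per-second recording loop (the question doesn't show this part; `record` and `read_sensor` are hypothetical placeholders I've made up for illustration):

```python
import datetime
import time

class ShiftData:
    # minimal stand-in mirroring the attributes above
    def __init__(self):
        self.effDataList = []   # 1/0 readings
        self.timeDataList = []  # timestamp strings

def record(shift, n_seconds, read_sensor, pause=time.sleep):
    # take one reading per second; read_sensor() stands in for the real sensor
    for _ in range(n_seconds):
        shift.effDataList.append(read_sensor())
        shift.timeDataList.append(datetime.datetime.now().strftime("%H%M%S"))
        pause(1)

shift = ShiftData()
record(shift, 3, read_sensor=lambda: 1, pause=lambda s: None)  # pause stubbed for the demo
print(shift.effDataList)  # [1, 1, 1]
```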


This data is then (attempted to be) recorded into a DataFrame, and from there into an Excel file, as follows:



def saveToDf(self):
    dataToSave = {self.date: self.effDataList}
    # dictionary of data to save. effDataList is the list of 1's and 0's read each second.

    dfToSave = pd.DataFrame(dataToSave, index=self.timeDataList)
    # create DataFrame to save as Excel, using timeDataList as the index.
    # timeDataList is a list of strings, one per second a reading is taken.

    print("Attempting to save data")

    # combine dfToSave with the old df record
    oldDf = pd.read_excel("/home/pi/Sensor/FS Days/Shift Record Template.xlsx")
    # oldDf is the database template, structured the same way as the "Data in Excel" image above

    result = dfToSave.combine_first(oldDf)
    # combine the new data with the template database

    writer = pd.ExcelWriter("/home/pi/Sensor/FS Days/Shift Record Template.xlsx")
    result.to_excel(writer, 'Sheet 1')

    writer.save()
    print("Save Complete")

    return


I modeled this code on my predecessor's code for smaller-scale recordings. Running this code, however, the data is not written properly into the Excel file, which ends up as follows:
[Image: "Messed Data" — the incorrectly written output spreadsheet]



So my questions are:

1) How can I record the data taken each second into the proper "seconds" row of the index?

2) Would it be better to save the data every second, or to pool it into a larger list and save it once or twice a day?

3) Is pandas' DataFrame the best solution for what I want to do, or are there better methods out there?



Thank you very much for the help.







































Tags: excel, database, pandas, dataframe, python-3.5






      asked Jan 4 at 18:23









Carlo Guan
























1 Answer






































1) Just like you defined the index for dfToSave, you have to define it when you read_excel() (e.g. with index_col=0); otherwise the template is read back with a default integer index and combine_first can't align the rows.



2) This depends mostly on your hardware, I guess; it is difficult to make that judgement without more background info.



3) I would use pandas as well, but that doesn't mean it's the best way. You could also look at other Excel libraries for Python (openpyxl, for example).
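Point 1 can be illustrated without touching a real spreadsheet (the frames and values below are made up for the demo): when `read_excel` is called without `index_col`, the template comes back with a default integer index, so `combine_first` has nothing to align on.

```python
import pandas as pd

times = ["00:00:00", "00:00:01", "00:00:02"]
new_df = pd.DataFrame({"19Jan05": [1, 0, 1]}, index=times)

# Template read back without index_col=0: default 0..2 integer index.
old_bad = pd.DataFrame({"19Jan04": [0, 1, 1]})
print(len(new_df.combine_first(old_bad)))   # 6 -- rows stacked, not aligned

# Template read back with the shared time-string index: rows line up.
old_good = pd.DataFrame({"19Jan04": [0, 1, 1]}, index=times)
print(len(new_df.combine_first(old_good)))  # 3 -- aligned by index
```

In the question's code that would mean `pd.read_excel(path, index_col=0)`.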






























• 1) Thank you very much for the tip! That would mean adding index_col=0 to the Excel read, yes? It also seems that when pandas reads the DataFrame back, it reparses the indices as ints (052930 becomes 52930), so I added a letter to the time stamps. 2) The hardware is a Raspberry Pi 3B+. I noticed it only ever uses a single core, though. 3) I'll stick with pandas then, if that's what someone with more experience would use.

  – Carlo Guan
  Jan 6 at 5:21














          answered Jan 4 at 20:51









jorijnsmit












