Confusion between Operational and Analytical Big Data, and which category Hadoop falls into












I can't wrap my head around the basic theoretical concept of 'Operational and Analytical Big Data'.

As I understand it:

  1. Operational Big Data: the branch where we perform read/write operations on big data using specially designed databases (NoSQL). Somewhat similar to ETL in an RDBMS.

  2. Analytical Big Data: the branch where we analyse data in retrospect and make predictions using techniques like MPP and MapReduce. Somewhat similar to reporting in an RDBMS.

(Please feel free to correct me wherever I'm wrong; this is just my understanding.)

So, as I understand it, Hadoop is used for Analytical Big Data, where we just process data for analysis but don't tamper with the original data, and hence it is not an ideal choice for ETL.
But recently I came across this article, which advocates using Hadoop for ETL: https://www.datanami.com/2014/09/01/five-steps-to-running-etl-on-hadoop-for-web-companies/










      hadoop bigdata






edited Jan 12 at 6:50 by cricket_007
asked Dec 31 '18 at 10:43 by Kajal_T
























          1 Answer














Hadoop (MapReduce) is not an efficient processing layer without adequate tweaking, IMO, so out of the box the answer is neither. Sure, MapReduce could be used, and under the hood that API is what many higher-level tools depend on, but since those tools exist, you wouldn't want to write ETL jobs in plain MapReduce.
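To make the "plain MapReduce" point concrete, here is a minimal sketch of the map, shuffle, and reduce phases in plain Python rather than the Hadoop API. The log format, the sample records, and the function names (`map_record`, `shuffle`, `reduce_group`, `run_job`) are all assumptions made up for illustration. The job reads the raw records and returns a new result without modifying the input, which is roughly the shape a batch ETL job takes.

```python
from collections import defaultdict

# Hypothetical raw input: "user_id,page,bytes_sent" lines, one of them malformed.
RAW_LINES = [
    "u1,/home,120",
    "u2,/home,80",
    "not a valid line",
    "u1,/cart,300",
]


def map_record(line):
    """Extract and transform: parse one line and emit (key, value) pairs."""
    parts = line.split(",")
    if len(parts) != 3 or not parts[2].isdigit():
        return []  # drop malformed records instead of failing the whole job
    user_id, _page, bytes_sent = parts
    return [(user_id, int(bytes_sent))]


def shuffle(pairs):
    """Group values by key, standing in for what the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Aggregate all values collected for one key."""
    return key, sum(values)


def run_job(lines):
    """Run map -> shuffle -> reduce and return a new dict; the input list is not modified."""
    mapped = [pair for line in lines for pair in map_record(line)]
    grouped = shuffle(mapped)
    return dict(reduce_group(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(RAW_LINES))
```

Even this toy job needs three hand-written phases for a single "sum per user", which is the kind of query a higher-level tool expresses in one statement.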



You can combine Hadoop with Spark, Presto, HBase, Hive, etc. to unlock these Operational or Analytical layers; some are useful for reporting use cases, and others are useful for ETL. Again, there are plenty of knobs to turn before you get useful results in a reasonable time compared to an RDBMS (or other NoSQL tools). Plus, it takes several attempts to learn how best to store data in Hadoop to begin with (hint: not plaintext, and not lots of small files).
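The "not lots of small files" hint can be sketched as a compaction step: group many small files into a few batches close to a target size before storing them. This is plain Python working on file sizes only; the function name `plan_compaction`, the 128 MB target, and the sample file names are assumptions for illustration, not part of any Hadoop API.

```python
def plan_compaction(file_sizes, target_bytes):
    """Greedily pack (name, size) pairs into batches of at most target_bytes.

    A single file larger than the target ends up in a batch of its own.
    """
    batches = []
    current, current_size = [], 0
    for name, size in file_sizes:
        # Start a new batch when adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    MB = 1024 * 1024
    # Hypothetical input: seven 40 MB text files.
    small_files = [(f"part-{i:05d}.txt", 40 * MB) for i in range(7)]
    for batch in plan_compaction(small_files, target_bytes=128 * MB):
        print(batch)
```

Each batch would then be rewritten as one larger file, ideally in a columnar or otherwise splittable format rather than plaintext.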



That link dates from 2014 and references Flume and Sqoop. Other "web scale" technologies have shown their worth since then; meanwhile, Flume and Sqoop have shown their age and can be difficult to configure and manage compared to tools like Apache NiFi.






edited Jan 12 at 6:51
answered Jan 12 at 6:46 by cricket_007













• Thank you! I am just starting to learn about Big Data analysis and was unconsciously comparing every concept to relational DB processing. I'm gaining more clarity on this every day as I continue learning. I truly appreciate your answer!
  – Kajal_T, Jan 22 at 9:51


















