Why does the `updatedb` program run so fast?












22















Programs that do a full disk scan and go over every file on the system usually take a very long time to run. Why does updatedb run so fast in comparison?

















      performance updatedb














      edited Jan 3 at 11:49 by Jeff Schaller

      asked Jan 2 at 16:14 by hugomg






















          2 Answers


          The answer depends on which implementation of locate you’re using, but there’s a fair chance it’s mlocate, whose updatedb runs quickly by avoiding full disk scans:




          mlocate is a locate/updatedb implementation. The 'm' stands for "merging":
          updatedb reuses the existing database to avoid rereading most of the file
          system, which makes updatedb faster and does not trash the system caches as
          much.




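          (The database stores each directory’s timestamp, ctime or mtime, whichever is newer. If that timestamp is unchanged on the next run, updatedb reuses the directory’s stored entries instead of rereading the directory.)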
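To make the "merging" idea concrete, here is a minimal Python sketch of a scan that reuses a previous database. It is not mlocate's code: the function names `list_dir` and `scan` and the dictionary-shaped database are assumptions made up for illustration, and it ignores permission errors and timestamp-granularity subtleties.

```python
import os

def list_dir(path):
    """Read one directory: sorted (name, is_directory) pairs, symlinks not followed."""
    with os.scandir(path) as entries:
        return sorted((e.name, e.is_dir(follow_symlinks=False)) for e in entries)

def scan(root, old_db):
    """Build {directory: (timestamp, entries)} for the tree under root.

    old_db is the result of a previous run (hypothetical in-memory format).
    Returns the new database and the list of directories that were re-read.
    """
    new_db = {}
    reread = []
    pending = [root]
    while pending:
        path = pending.pop()
        st = os.stat(path)
        # Newer of ctime and mtime, as described in the answer above
        stamp = max(st.st_ctime_ns, st.st_mtime_ns)
        cached = old_db.get(path)
        if cached is not None and cached[0] == stamp:
            entries = cached[1]        # unchanged: reuse the stored listing
        else:
            entries = list_dir(path)   # new or changed: read the directory
            reread.append(path)
        new_db[path] = (stamp, entries)
        # Subdirectories are still visited, but only to check their timestamps
        pending.extend(os.path.join(path, name) for name, is_dir in entries if is_dir)
    return new_db, reread
```

On a second run over an unchanged tree, `reread` comes back empty: every directory costs one `stat` call and no directory listing, which is roughly where the time saving would come from.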



          Like most implementations of updatedb, mlocate’s will also skip file systems and paths which it is configured to ignore. By default there are none in mlocate’s case, but distributions typically provide a basic updatedb.conf which ignores networked file systems, virtual file systems etc. (see Debian’s configuration file for example; this is standard practice in Debian, so GNU’s updatedb is configured similarly).















          edited Jan 3 at 8:25

          answered Jan 2 at 16:20 by Stephen Kitt













          • Fairly good question and answer; I did not even know there was such a thing as "differential" scanning.

            – Rui F Ribeiro
            Jan 2 at 16:25








          • Thanks! I had never noticed that modifying a file also changes the ctime and mtime of all its parent directories.

            – hugomg
            Jan 2 at 17:21






          • @hugomg I don't think it actually does. It should only change the mtime of its immediate parent.

            – Kusalananda
            Jan 2 at 17:33
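The point made in the comment above can be checked with a small experiment. The following is only a sketch: the function name `demo_parent_mtime` and the directory names are made up, and it assumes a file system that updates a directory's mtime when an entry is added to it, as Linux file systems normally do.

```python
import os
import tempfile

def demo_parent_mtime():
    """Create a file two levels down and report which directory mtimes changed."""
    with tempfile.TemporaryDirectory() as root:
        outer = os.path.join(root, "outer")
        inner = os.path.join(outer, "inner")
        os.makedirs(inner)

        # Push both directories' mtimes into the past so any update is visible
        old = 1_000_000_000  # arbitrary old timestamp (seconds since the epoch)
        os.utime(inner, (old, old))
        os.utime(outer, (old, old))

        # Adding a directory entry modifies `inner` itself, nothing above it
        with open(os.path.join(inner, "new-file"), "w") as f:
            f.write("hello")

        return {
            "inner_changed": os.stat(inner).st_mtime != old,
            "outer_changed": os.stat(outer).st_mtime != old,
        }
```

Only the immediate parent should report a change, which is why a merging updatedb still has to check every directory's timestamp rather than stopping at the first unchanged one.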











          • So if I understand correctly, mlocate looks only at each directory’s ctime and mtime, which means it cares only about whether the list of directory entries is still the same (no files added or removed) and not about the files themselves. Is that correct?

            – Sergiy Kolodyazhnyy
            Jan 3 at 1:12











          • @Sergiy: Of course. locate isn't grep -R. It does not read file content.

            – Kevin
            Jan 3 at 1:21
































          In addition to checking modification times, mlocate also skips subtrees of the file system that contain many uninteresting or potentially duplicate files, as configured in /etc/updatedb.conf (and described in man updatedb.conf):




          • Bind mounts

          • Some kinds of file systems (9p, afs, bdev, etc.)

          • VCS repository databases (.git, .hg, etc.)

          • Some explicitly listed directories (/media, /tmp, /var/spool/cups, etc.).
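As a rough illustration of how such exclusions keep a scan short, here is a Python sketch of a directory walk that never descends into excluded subtrees. The function `walk_pruned` is hypothetical; its two parameters are loosely modelled on the PRUNENAMES and PRUNEPATHS settings in updatedb.conf, and leaving the pruned directories themselves out of the result is a choice made for this sketch, not a statement about what mlocate records.

```python
import os

def walk_pruned(root, prune_names=frozenset(), prune_paths=frozenset()):
    """List every path under root, skipping excluded subtrees entirely.

    prune_names: directory names to skip wherever they occur (e.g. ".git")
    prune_paths: full directory paths to skip (e.g. "/tmp")
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from entering those directories
        dirnames[:] = [
            d for d in dirnames
            if d not in prune_names and os.path.join(dirpath, d) not in prune_paths
        ]
        for name in dirnames + filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)
```

For example, `walk_pruned("/", {".git", ".hg"}, {"/tmp", "/media"})` would skip every repository database and both listed directories without reading anything inside them.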

















          answered Jan 2 at 23:27 by hugomg













          • This isn’t the case by default though, so the base behaviour depends on the distribution being used. (Other updatedb implementations also support configured exclusions.)

            – Stephen Kitt
            Jan 3 at 8:17











          • Indeed. I was describing the defaults for Fedora.

            – hugomg
            Jan 3 at 15:21


















