Training a network to Deep Dream specific classes into images



























DISCLAIMERS: Google Deep Dream - Use classes to "control dreams" is related, but (I hope) it doesn't invalidate this question, since I (naively) want to retrain the network, I'm more confused than that asker, and the question there is still open. By specific classes I mean cats; by images I mean images of people (but if it works in general, that's even better). I've heard AlexNet is simpler, so I'm going with it.



(Windows 7, a sub-par GPU (GTX 860M, but compute capability 5.0, so don't discourage me), lots of time, machine-learning noob.)



What I'd like to do is have Deep Dream do what it usually does, but instead of buildings, fish, people, etc., I only want it to create cats in images (ideally, to be able to catify photos with people in them).



What I did: I trained a network on exclusively cat pictures (different sets for training and testing, but only class 0 in train.txt and test.txt, and num_output: 1 in train_val.prototxt), and I'm scared to see what it would do in Deep Dream (probably just produce random noise?).



My guess is that I need to train the network to identify both cats and people and then maximize the activations of cats, which I assume involves changing the weights of the neurons associated with cats.



Anaconda, CUDA (with cuDNN), Caffe, and Deep Dream all seem to be working OK; however, I have close to no idea how to use them properly. Following this tutorial, which I gather only covers differentiating between dogs and cats, I managed to run caffe train without any fatal errors... but the network isn't actually learning anything useful (I think). Anyway, my questions are:



Is there a way to just change a value somewhere so that bvlc_googlenet only finds cats in whatever images I feed it? (Not the main question, though.)



To train a network that is usable with Deep Dream and recognizes cats (but not people, fish, buildings, or anything else) in photographs with people in them, and mainly just those kinds of images [by the way, it just so happens that I have a few hundred photos of me and my friends, which is what I'm really interested in catifying], what kind of training and testing data do I need: what kinds of images (people, cats, dogs, landscapes), how many images, what ratio of cats to non-cats in train and test, etc.?



What should train.txt and test.txt look like? Should they have the same number of classes? Should I skip any class(es) in one of them? What are the rules for setting up the classes in train.txt and test.txt?
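For reference, my understanding is that a Caffe image-list file is just one image path and one integer label per line. A minimal two-class sketch (cat = 0, person = 1; the file names and folders are made up):

    cats/cat_0001.jpg 0
    cats/cat_0002.jpg 0
    people/img_0001.jpg 1
    people/img_0002.jpg 1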



How should I change num_output? (The tutorial told me to modify this value only for layer fc8.)
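To be concrete, this is the kind of fc8 fragment I mean in train_val.prototxt (a sketch, assuming two classes, cats and people; the param/lr_mult and weight-filler settings are omitted):

    layer {
      name: "fc8"
      type: "InnerProduct"
      bottom: "fc7"
      top: "fc8"
      inner_product_param {
        num_output: 2   # one output per class: cat = 0, person = 1
      }
    }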



What configuration should a system with 8 GB of RAM and 2 GB of VRAM use in solver.prototxt? The defaults seemed absurd, so I set (not the full file): lr_policy: "step"; test_iter: 100; test_interval: 500; stepsize: 2500 (I'm currently only using 5000 images, all cats); max_iter: 10000; iter_size: 5; weight_decay: 0.004; and batch_size: 10 for both data layers in train_val.prototxt. GPU-Z tells me I'm only using 1 GB of VRAM.
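Put together, the solver I'm describing looks roughly like this (a sketch; the net path, base_lr, and gamma are placeholders I haven't tuned, and the display/snapshot settings are omitted):

    net: "train_val.prototxt"   # placeholder path
    test_iter: 100
    test_interval: 500
    base_lr: 0.001              # placeholder, not tuned
    lr_policy: "step"
    gamma: 0.1                  # placeholder, not tuned
    stepsize: 2500
    max_iter: 10000
    iter_size: 5
    weight_decay: 0.004
    solver_mode: GPU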



Is the test net supposed to have two outputs (accuracy and loss) per class, and the train net just one (loss) per class? (With just cats I got 2 and 1, respectively.)



Do I actually need deploy.prototxt for AlexNet? The tutorial didn't mention it, and I think the docs said it's essentially a copy of train_val.prototxt?



How would I maximize the activations of cats? (This is probably connected to the first question.)
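From the related question linked above, my current understanding is that you don't change any weights: you swap Deep Dream's default L2 objective for one that back-propagates only from the class you care about, then dream against the classifier layer. A minimal sketch, assuming the helpers from Google's deepdream notebook, a variant of make_step/deepdream that accepts an objective argument (as in the "guided dreams" examples), and a hypothetical CAT_CLASS index matching my train.txt labels:

    CAT_CLASS = 0  # hypothetical: the label I gave cats in train.txt

    def objective_cats(dst):
        # The stock objective is dst.diff[:] = dst.data, which amplifies
        # everything the layer already sees. Zeroing the gradient and
        # keeping only the cat output means gradient ascent on the input
        # image boosts whatever looks cat-like.
        dst.diff[:] = 0.0
        dst.diff[0, CAT_CLASS] = 1.0

    # dream against the classifier layer instead of a conv layer:
    # img = deepdream(net, input_image, end='fc8', objective=objective_cats)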



Also, please point out any flagrant mistakes in my setup (besides only having cat images, which I thought was only logical); it's essentially the same as the tutorial's.



My pipeline: a Python script creates train.txt and test.txt (it just lists all the images, divides them between train.txt and test.txt, and adds the proper class number [which, in my case, guarantees 0% loss :D]) ---> convert_imageset.exe creates the LMDB folders ---> compute_image_mean.exe creates the mean image ---> a caffe train call that results in no fatal exceptions, mostly because it's not actually doing anything useful.
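For completeness, the listing script is along these lines (a sketch; the one-folder-per-class layout and the 80/20 split are assumptions):

    import os
    import random

    random.seed(0)
    classes = {"cats": 0, "people": 1}  # assumed layout: one folder per class
    lines = []
    for folder, label in classes.items():
        for name in os.listdir(folder):
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                lines.append("%s/%s %d" % (folder, name, label))

    random.shuffle(lines)
    split = int(0.8 * len(lines))  # assumed 80/20 train/test split
    with open("train.txt", "w") as f:
        f.write("\n".join(lines[:split]) + "\n")
    with open("test.txt", "w") as f:
        f.write("\n".join(lines[split:]) + "\n")

After that I run the stock tools, roughly: convert_imageset --shuffle <root>/ train.txt train_lmdb, then compute_image_mean train_lmdb mean.binaryproto, then caffe train --solver=solver.prototxt.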



(If you need files/more information, just ask.)


























Tags: caffe, training-data, deep-dream






asked Jan 1 at 2:04 by Abe























