attach “1” for the bias unit in neural networks. what does it mean?












1












$begingroup$


In the image below the instructor says attach a "1" for the bias unit in neural networks. what does a "bias unit"mean in neural network?



enter image description here










share|improve this question









$endgroup$



migrated from stackoverflow.com Jan 5 at 6:30


This question came from our site for professional and enthusiast programmers.























    1












    $begingroup$


    In the image below the instructor says attach a "1" for the bias unit in neural networks. what does a "bias unit"mean in neural network?



    enter image description here










    share|improve this question









    $endgroup$



    migrated from stackoverflow.com Jan 5 at 6:30


    This question came from our site for professional and enthusiast programmers.





















      1












      1








      1





      $begingroup$


      In the image below the instructor says attach a "1" for the bias unit in neural networks. what does a "bias unit"mean in neural network?



      enter image description here










      share|improve this question









      $endgroup$




      In the image below the instructor says attach a "1" for the bias unit in neural networks. what does a "bias unit"mean in neural network?



      enter image description here







      neural-network






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Jan 4 at 18:22







      Ney J Torres











      migrated from stackoverflow.com Jan 5 at 6:30


      This question came from our site for professional and enthusiast programmers.









      migrated from stackoverflow.com Jan 5 at 6:30


      This question came from our site for professional and enthusiast programmers.
























          1 Answer
          1






          active

          oldest

          votes


















          0












          $begingroup$

          It allows you to account for some static offset in your learning process. To illustrate, consider the output of your first hidden layer without the bias unit.



          h = sigma(W * x)


          Here W is the matrix of your layer weights, and W * x is the matrix-vector multiplication between these weights and your input. sigma is your nonlinearity operating on each element of the result. Now consider what this looks like With the "bias unit".



          h = sigma(W' * [x, 1]) ~ sigma(W*x + b)


          Here, W' is your weight matrix with new entries associated with the "bias unit", and your new input is your original input with a 1 concatenated to it. You can think of this as being equivalent to your original matrix multiplication W * x plus some bias term b.



          I will have to dig for the sources on this, but conventional wisdom is that you don't need the bias unit in modern architectures. If it's for a class though, I'd say use it if that's what you are instructed to do.






          share|improve this answer









          $endgroup$














            Your Answer








            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "557"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: false,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














            draft saved

            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f43506%2fattach-1-for-the-bias-unit-in-neural-networks-what-does-it-mean%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown
























            1 Answer
            1






            active

            oldest

            votes








            1 Answer
            1






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            0












            $begingroup$

            It allows you to account for some static offset in your learning process. To illustrate, consider the output of your first hidden layer without the bias unit.



            h = sigma(W * x)


            Here W is the matrix of your layer weights, and W * x is the matrix-vector multiplication between these weights and your input. sigma is your nonlinearity operating on each element of the result. Now consider what this looks like With the "bias unit".



            h = sigma(W' * [x, 1]) ~ sigma(W*x + b)


            Here, W' is your weight matrix with new entries associated with the "bias unit", and your new input is your original input with a 1 concatenated to it. You can think of this as being equivalent to your original matrix multiplication W * x plus some bias term b.



            I will have to dig for the sources on this, but conventional wisdom is that you don't need the bias unit in modern architectures. If it's for a class though, I'd say use it if that's what you are instructed to do.






            share|improve this answer









            $endgroup$


















              0












              $begingroup$

              It allows you to account for some static offset in your learning process. To illustrate, consider the output of your first hidden layer without the bias unit.



              h = sigma(W * x)


              Here W is the matrix of your layer weights, and W * x is the matrix-vector multiplication between these weights and your input. sigma is your nonlinearity operating on each element of the result. Now consider what this looks like With the "bias unit".



              h = sigma(W' * [x, 1]) ~ sigma(W*x + b)


              Here, W' is your weight matrix with new entries associated with the "bias unit", and your new input is your original input with a 1 concatenated to it. You can think of this as being equivalent to your original matrix multiplication W * x plus some bias term b.



              I will have to dig for the sources on this, but conventional wisdom is that you don't need the bias unit in modern architectures. If it's for a class though, I'd say use it if that's what you are instructed to do.






              share|improve this answer









              $endgroup$
















                0












                0








                0





                $begingroup$

                It allows you to account for some static offset in your learning process. To illustrate, consider the output of your first hidden layer without the bias unit.



                h = sigma(W * x)


                Here W is the matrix of your layer weights, and W * x is the matrix-vector multiplication between these weights and your input. sigma is your nonlinearity operating on each element of the result. Now consider what this looks like With the "bias unit".



                h = sigma(W' * [x, 1]) ~ sigma(W*x + b)


                Here, W' is your weight matrix with new entries associated with the "bias unit", and your new input is your original input with a 1 concatenated to it. You can think of this as being equivalent to your original matrix multiplication W * x plus some bias term b.



                I will have to dig for the sources on this, but conventional wisdom is that you don't need the bias unit in modern architectures. If it's for a class though, I'd say use it if that's what you are instructed to do.






                share|improve this answer









                $endgroup$



                It allows you to account for some static offset in your learning process. To illustrate, consider the output of your first hidden layer without the bias unit.



                h = sigma(W * x)


                Here W is the matrix of your layer weights, and W * x is the matrix-vector multiplication between these weights and your input. sigma is your nonlinearity operating on each element of the result. Now consider what this looks like With the "bias unit".



                h = sigma(W' * [x, 1]) ~ sigma(W*x + b)


                Here, W' is your weight matrix with new entries associated with the "bias unit", and your new input is your original input with a 1 concatenated to it. You can think of this as being equivalent to your original matrix multiplication W * x plus some bias term b.



                I will have to dig for the sources on this, but conventional wisdom is that you don't need the bias unit in modern architectures. If it's for a class though, I'd say use it if that's what you are instructed to do.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Jan 4 at 18:31









                EngineeroEngineero

                1156




                1156






























                    draft saved

                    draft discarded




















































                    Thanks for contributing an answer to Data Science Stack Exchange!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid



                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.


                    Use MathJax to format equations. MathJax reference.


                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f43506%2fattach-1-for-the-bias-unit-in-neural-networks-what-does-it-mean%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    Mossoró

                    Error while reading .h5 file using the rhdf5 package in R

                    Pushsharp Apns notification error: 'InvalidToken'