Beautiful Soup strangely returning '/photo-missing.png'

I'm trying to automate downloading player images from www.premierleague.com. The problem is that when I use BeautifulSoup to read the img src of a player photo, it returns "Photo-Missing.png".

[Image: de_gea]

When I inspect the HTML in the browser, the file is called p51940.png, not "Photo-Missing.png".

My code is as follows:



import requests
from bs4 import BeautifulSoup

player_page = requests.get('https://www.premierleague.com/players/4330/David-De-Gea/overview')
soup = BeautifulSoup(player_page.text, 'html.parser')
print(soup.find(class_="imgContainer"))


This prints:



<div class="imgContainer"><img alt="David De Gea" class="img" data-player="p51940" data-script="pl_player-image" data-size="250x250" data-widget="player-image" src="//platform-static-files.s3.amazonaws.com/premierleague/photos/players/250x250/Photo-Missing.png"/></div>


Does anyone know why this is happening?










      python-3.x web-scraping beautifulsoup
















      asked Jan 3 at 21:43









      Liam Gower

























          1 Answer














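          The real src is filled in by JavaScript after the page loads, maybe to prevent scraping. requests does not run JavaScript, so the HTML it receives still contains the placeholder. You can rebuild the URL yourself by replacing Photo-Missing with the player ID (p51940 here), which is stored in the img tag's data-player attribute.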



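          import requests
          from bs4 import BeautifulSoup

          player_page = requests.get('https://www.premierleague.com/players/4330/David-De-Gea/overview')
          soup = BeautifulSoup(player_page.text, 'html.parser')
          # select the <img> inside the container using a CSS selector
          img = soup.select_one('.imgContainer img')
          # swap the placeholder file name for the player ID stored in data-player
          img['src'] = img['src'].replace('Photo-Missing', img['data-player'])
          print(img)
          print(img['src'])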
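
The src in the question's output is scheme-relative (it starts with `//`), so it probably needs a scheme before it can be passed to `requests.get`. Below is a minimal sketch of that step using only the standard library. The helper name `build_photo_url`, the `PLACEHOLDER` constant, and the default `base` are illustrative assumptions, not part of the site or of BeautifulSoup, and the sketch assumes the replacement trick described above still matches how the site names its image files.

```python
from urllib.parse import urljoin

# Placeholder file name seen in the raw HTML (taken from the question's output).
PLACEHOLDER = "Photo-Missing"


def build_photo_url(src, player_id, base="https://www.premierleague.com/"):
    """Swap the placeholder for player_id and make the URL absolute.

    `base` only supplies the scheme for scheme-relative URLs such as
    "//host/path"; it is an assumed default, not something the site documents.
    """
    if PLACEHOLDER in src:
        src = src.replace(PLACEHOLDER, player_id)
    # urljoin keeps the host from `src` and borrows the scheme from `base`.
    return urljoin(base, src)


if __name__ == "__main__":
    # Values copied from the output shown in the question.
    raw_src = ("//platform-static-files.s3.amazonaws.com/premierleague/"
               "photos/players/250x250/Photo-Missing.png")
    print(build_photo_url(raw_src, "p51940"))
```

With the parsed tag from the snippet above, this would be called as `build_photo_url(img['src'], img['data-player'])`, and the resulting URL could then be fetched with `requests.get(url).content` and written to a file opened in binary mode.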





                answered Jan 4 at 6:00









                ewwink
