What's the fastest way to split dictionary keys into a string-type tuples and append another string to last...












Given a dictionary with string keys and integer values, what's the fastest way to

  1. split each key into a tuple of single-character strings

  2. then append the special substring </w> to the last item in the tuple


Given:



counter = {'The': 6149,
'Project': 205,
'Gutenberg': 78,
'EBook': 5,
'of': 39169,
'Adventures': 2,
'Sherlock': 95,
'Holmes': 198,
'by': 6384,
'Sir': 30,
'Arthur': 18,
'Conan': 3,
'Doyle': 2,}


The goal is to achieve:



counter = {('T', 'h', 'e</w>'): 6149,
('P', 'r', 'o', 'j', 'e', 'c', 't</w>'): 205,
('G', 'u', 't', 'e', 'n', 'b', 'e', 'r', 'g</w>'): 78,
('E', 'B', 'o', 'o', 'k</w>'): 5,
('o', 'f</w>'): 39169,
('A', 'd', 'v', 'e', 'n', 't', 'u', 'r', 'e', 's</w>'): 2,
('S', 'h', 'e', 'r', 'l', 'o', 'c', 'k</w>'): 95,
('H', 'o', 'l', 'm', 'e', 's</w>'): 198,
('b', 'y</w>'): 6384,
('S', 'i', 'r</w>'): 30,
('A', 'r', 't', 'h', 'u', 'r</w>'): 18,
('C', 'o', 'n', 'a', 'n</w>'): 3,
('D', 'o', 'y', 'l', 'e</w>'): 2,}


One way to do it is to

  • iterate through the counter,

  • convert all but the last character of each key to a tuple,

  • append </w> to the last character, wrap it in a one-element tuple, and concatenate the two tuples,

  • and assign the count to the new tuple key.


I've tried



{(tuple(k[:-1])+(k[-1]+'</w>',) ,v) for k,v in counter.items()}


In more verbose form:



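new_counter = {}
for k, v in counter.items():
    left = tuple(k[:-1])         # all characters except the last
    right = (k[-1] + '</w>',)    # one-element tuple holding the marked last character
    new_k = left + right         # concatenation builds a new tuple
    new_counter[new_k] = v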


Is there a better way to do this?



Regarding adding tuples and wrapping the result in an outer tuple: why is this allowed? Aren't tuples supposed to be immutable?










  • It sounds like your question belongs on CodeReview.

    – DYZ
    Jan 3 at 6:59













  • This is possible because you create a NEW dictionary, and its keys are DIFFERENT tuples. The original keys of the dictionary are indeed immutable and you do not change them.

    – igrinis
    Jan 13 at 7:07
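
To make the point in the comment above concrete, here is a minimal sketch (the variable names are only illustrative) showing that + on tuples builds a new tuple and leaves its operands untouched, while assigning to an element of an existing tuple fails:

```python
left = ('T', 'h')
right = ('e</w>',)

combined = left + right          # builds a brand-new tuple object

print(combined)                  # ('T', 'h', 'e</w>')
print(left)                      # ('T', 'h'), unchanged
print(combined is left)          # False

try:
    left[0] = 't'                # in-place modification is rejected
except TypeError as exc:
    error_message = str(exc)
    print('TypeError:', error_message)
```

So immutability only rules out changing a tuple in place; creating new tuples from old ones is always allowed.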
















python string dictionary tuples






asked Jan 3 at 6:49 by alvas; edited Jan 14 at 16:56 by Borhan Kazimipour













6 Answers


















I would propose a slightly modified version of your solution. Instead of using the tuple constructor, you can use tuple unpacking:

>>> {(*a[:-1], f'{a[-1]}</w>'): b for a, b in counter.items()}


The benefit of tuple unpacking is better performance than the tuple constructor. To illustrate, I timed both versions with timeit on a randomly generated dict: each key consists of 2 randomly chosen lowercase letters and each value is an integer in the range 0-100. All of these benchmarks were run on Python 3.7.0.



Benchmark with 100 elements in dict



$ python -m timeit -s "import random" -s "import string" -s "counter = {''.join(random.sample(string.ascii_lowercase,2)): random.randint(0,100) for _ in range(100)}" "{(*a[:-1],f'{a[-1]}</w>',):b for a,b in counter.items()}"
10000 loops, best of 5: 36.6 usec per loop

$ python -m timeit -s "import random" -s "import string" -s "counter = {''.join(random.sample(string.ascii_lowercase,2)): random.randint(0,100) for _ in range(100)}" "{tuple(key[:-1])+(key[-1]+'</w>',):value for key,value in counter.items()}"
5000 loops, best of 5: 59.7 usec per loop


Benchmark with 1000 elements in dict



$ python -m timeit -s "import random" -s "import string" -s "counter = {''.join(random.sample(string.ascii_lowercase,2)): random.randint(0,100) for _ in range(1000)}" "{(*a[:-1],f'{a[-1]}</w>',):b for a,b in counter.items()}"
1000 loops, best of 5: 192 usec per loop

$ python -m timeit -s "import random" -s "import string" -s "counter = {''.join(random.sample(string.ascii_lowercase,2)): random.randint(0,100) for _ in range(1000)}" "{tuple(key[:-1])+(key[-1]+'</w>',):value for key,value in counter.items()}"
1000 loops, best of 5: 321 usec per loop


Benchmark with dict posted in question



$ python -m timeit -s "import random" -s "import string" -s "counter = {'The': 6149, 'Project': 205, 'Gutenberg': 78, 'EBook': 5, 'of': 39169, 'Adventures': 2, 'Sherlock': 95, 'Holmes': 198, 'by': 6384, 'Sir': 30, 'Arthur': 18, 'Conan': 3,'Doyle': 2}" "{(*a[:-1],f'{a[-1]}</w>',):b for a,b in counter.items()}"
50000 loops, best of 5: 7.28 usec per loop

$ python -m timeit -s "import random" -s "import string" -s "counter = {'The': 6149, 'Project': 205, 'Gutenberg': 78, 'EBook': 5, 'of': 39169, 'Adventures': 2, 'Sherlock': 95, 'Holmes': 198, 'by': 6384, 'Sir': 30, 'Arthur': 18, 'Conan': 3,'Doyle': 2}" "{tuple(key[:-1])+(key[-1]+'</w>',):value for key,value in counter.items()}"
20000 loops, best of 5: 11 usec per loop
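
If you would rather rerun this comparison from a script than from the shell, something along these lines should work. It is only a sketch: the function names with_unpacking and with_constructor, the fixed seed, and the loop counts are my own choices, and the timings will differ by machine and Python version.

```python
import random
import string
import timeit

random.seed(0)  # assumption: fixed seed so repeated runs use the same keys

# Random two-letter keys, as in the benchmarks above. Duplicate keys collapse,
# so the dict may end up with fewer than 1000 entries.
counter = {
    ''.join(random.sample(string.ascii_lowercase, 2)): random.randint(0, 100)
    for _ in range(1000)
}

def with_unpacking(d):
    return {(*k[:-1], f'{k[-1]}</w>'): v for k, v in d.items()}

def with_constructor(d):
    return {tuple(k[:-1]) + (k[-1] + '</w>',): v for k, v in d.items()}

# Both variants must build the same dictionary before timings are compared.
assert with_unpacking(counter) == with_constructor(counter)

loops = 200
for func in (with_unpacking, with_constructor):
    best = min(timeit.repeat(lambda: func(counter), number=loops, repeat=5))
    print(f'{func.__name__}: {best / loops * 1e6:.1f} usec per loop')
```

Checking that both variants return equal dictionaries first guards against timing an expression that silently produces different keys.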





You are close; your code only needs a small change. You cannot modify the elements of a tuple, but you can replace one tuple with another:

{tuple(key[:-1])+(key[-1]+'</w>',):value for key,value in counter.items()}

{('T', 'h', 'e</w>'): 6149,
('P', 'r', 'o', 'j', 'e', 'c', 't</w>'): 205,
('G', 'u', 't', 'e', 'n', 'b', 'e', 'r', 'g</w>'): 78,
('E', 'B', 'o', 'o', 'k</w>'): 5,
('o', 'f</w>'): 39169,
('A', 'd', 'v', 'e', 'n', 't', 'u', 'r', 'e', 's</w>'): 2,
('S', 'h', 'e', 'r', 'l', 'o', 'c', 'k</w>'): 95,
('H', 'o', 'l', 'm', 'e', 's</w>'): 198,
('b', 'y</w>'): 6384,
('S', 'i', 'r</w>'): 30,
('A', 'r', 't', 'h', 'u', 'r</w>'): 18,
('C', 'o', 'n', 'a', 'n</w>'): 3,
('D', 'o', 'y', 'l', 'e</w>'): 2}





Or join the characters with spaces using str.join, append '</w>', and then split the result with str.split:

>>> counter = {'The': 6149,
...     'Project': 205,
...     'Gutenberg': 78,
...     'EBook': 5,
...     'of': 39169,
...     'Adventures': 2,
...     'Sherlock': 95,
...     'Holmes': 198,
...     'by': 6384,
...     'Sir': 30,
...     'Arthur': 18,
...     'Conan': 3,
...     'Doyle': 2,}
>>> {tuple((' '.join(k)+'</w>').split()):v for k,v in counter.items()}
{('T', 'h', 'e</w>'): 6149, ('P', 'r', 'o', 'j', 'e', 'c', 't</w>'): 205, ('G', 'u', 't', 'e', 'n', 'b', 'e', 'r', 'g</w>'): 78, ('E', 'B', 'o', 'o', 'k</w>'): 5, ('o', 'f</w>'): 39169, ('A', 'd', 'v', 'e', 'n', 't', 'u', 'r', 'e', 's</w>'): 2, ('S', 'h', 'e', 'r', 'l', 'o', 'c', 'k</w>'): 95, ('H', 'o', 'l', 'm', 'e', 's</w>'): 198, ('b', 'y</w>'): 6384, ('S', 'i', 'r</w>'): 30, ('A', 'r', 't', 'h', 'u', 'r</w>'): 18, ('C', 'o', 'n', 'a', 'n</w>'): 3, ('D', 'o', 'y', 'l', 'e</w>'): 2}


Timings:

import timeit
print('bro-grammer:',timeit.timeit(lambda: [{(*a[:-1],f'{a[-1]}</w>',):b for a,b in counter.items()} for i in range(1000)],number=10))
print('Sandeep Kadapa:',timeit.timeit(lambda: [{tuple(key[:-1])+(key[-1]+'</w>',):value for key,value in counter.items()} for i in range(1000)],number=10))
print('U9-Forward:',timeit.timeit(lambda: [{tuple((' '.join(k)+'</w>').split()):v for k,v in counter.items()} for i in range(1000)],number=10))

Output:

bro-grammer: 0.1293355557653911
Sandeep Kadapa: 0.20885866344797197
U9-Forward: 0.3026948357193003
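
One caveat worth checking before adopting the join/split approach: str.split() with no arguments splits on any run of whitespace, so a key that itself contains a space or tab would lose that character. The sketch below uses a made-up key to show the difference from the unpacking approach; the keys in the question contain no whitespace, so they are unaffected.

```python
key = 'New York'   # hypothetical key containing a space

via_split = tuple((' '.join(key) + '</w>').split())
via_unpacking = (*key[:-1], f'{key[-1]}</w>')

print(via_split)       # the space character is gone
print(via_unpacking)   # the space character is kept as its own item
```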





I would go for something like this:

def f(string):
    l = list(string)           # one list item per character
    l[-1] = l[-1] + '</w>'     # mark the last character
    return tuple(l)

dict((f(k), v) for k, v in counter.items())

output:

{('A', 'd', 'v', 'e', 'n', 't', 'u', 'r', 'e', 's</w>'): 2,
('A', 'r', 't', 'h', 'u', 'r</w>'): 18,
('C', 'o', 'n', 'a', 'n</w>'): 3,
('D', 'o', 'y', 'l', 'e</w>'): 2,
('E', 'B', 'o', 'o', 'k</w>'): 5,
('G', 'u', 't', 'e', 'n', 'b', 'e', 'r', 'g</w>'): 78,
('H', 'o', 'l', 'm', 'e', 's</w>'): 198,
('P', 'r', 'o', 'j', 'e', 'c', 't</w>'): 205,
('S', 'h', 'e', 'r', 'l', 'o', 'c', 'k</w>'): 95,
('S', 'i', 'r</w>'): 30,
('T', 'h', 'e</w>'): 6149,
('b', 'y</w>'): 6384,
('o', 'f</w>'): 39169}
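
Note that indexing with [-1] assumes every key has at least one character; an empty-string key would raise IndexError here and in the indexing-based comprehensions above. If that can happen in your data, a guarded variant might look like the sketch below. The helper name to_symbols is hypothetical, and mapping '' to ('</w>',) is just one possible convention, not something the question specifies.

```python
def to_symbols(word, end='</w>'):
    """Split word into characters and mark the last one with `end`."""
    if not word:
        # Assumption: represent an empty word by the end marker alone.
        return (end,)
    return (*word[:-1], word[-1] + end)

counter = {'of': 39169, 'by': 6384, '': 1}
new_counter = {to_symbols(k): v for k, v in counter.items()}
print(new_counter)
# {('o', 'f</w>'): 39169, ('b', 'y</w>'): 6384, ('</w>',): 1}
```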





With Python 3.5 or later, you can use a starred expression inside a tuple.

You can try:

>>> {(*key[:-1], key[-1] + '</w>'): value for key, value in counter.items()}
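
As a small illustration of what the starred expression does (the word is just one of the sample keys): the * spreads the characters of the slice into the enclosing tuple, so no separate tuple() call or concatenation is needed.

```python
key = 'Holmes'

# *key[:-1] expands to 'H', 'o', 'l', 'm', 'e' inside the tuple display
new_key = (*key[:-1], key[-1] + '</w>')
print(new_key)   # ('H', 'o', 'l', 'm', 'e', 's</w>')

# Same result with the constructor and concatenation
assert new_key == tuple(key[:-1]) + (key[-1] + '</w>',)
```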





You can also drop .items() and iterate over the keys directly:

{tuple(i[:-1]) + (i[-1]+'</w>',):counter[i] for i in counter}

This is a little faster.

>>> import timeit
>>> timeit.timeit(lambda: {tuple(i[:-1]) + (i[-1]+'</w>',):counter[i] for i in counter}, number=10)
0.000192291005179286





The tuple-unpacking answer with the timeit benchmarks was answered Jan 10 at 5:10 by bro-grammer.

























                      3














                      You are close making a little changes to your code using tuple. You cannot modify the elements of a tuple, but you can replace one tuple with another::



                      {tuple(key[:-1])+(key[-1]+'</w>',):value for key,value in counter.items()}

                      {('T', 'h', 'e</w>'): 6149,
                      ('P', 'r', 'o', 'j', 'e', 'c', 't</w>'): 205,
                      ('G', 'u', 't', 'e', 'n', 'b', 'e', 'r', 'g</w>'): 78,
                      ('E', 'B', 'o', 'o', 'k</w>'): 5,
                      ('o', 'f</w>'): 39169,
                      ('A', 'd', 'v', 'e', 'n', 't', 'u', 'r', 'e', 's</w>'): 2,
                      ('S', 'h', 'e', 'r', 'l', 'o', 'c', 'k</w>'): 95,
                      ('H', 'o', 'l', 'm', 'e', 's</w>'): 198,
                      ('b', 'y</w>'): 6384,
                      ('S', 'i', 'r</w>'): 30,
                      ('A', 'r', 't', 'h', 'u', 'r</w>'): 18,
                      ('C', 'o', 'n', 'a', 'n</w>'): 3,
                      ('D', 'o', 'y', 'l', 'e</w>'): 2}















                          answered Jan 3 at 7:02









                          Sandeep Kadapa























                              2














                              Or join the characters with spaces using str.join, append '</w>', and then split the result with str.split:



                              >>> counter = {'The': 6149,
                              ...            'Project': 205,
                              ...            'Gutenberg': 78,
                              ...            'EBook': 5,
                              ...            'of': 39169,
                              ...            'Adventures': 2,
                              ...            'Sherlock': 95,
                              ...            'Holmes': 198,
                              ...            'by': 6384,
                              ...            'Sir': 30,
                              ...            'Arthur': 18,
                              ...            'Conan': 3,
                              ...            'Doyle': 2,}
                              >>> {tuple((' '.join(k)+'</w>').split()):v for k,v in counter.items()}
                              {('T', 'h', 'e</w>'): 6149, ('P', 'r', 'o', 'j', 'e', 'c', 't</w>'): 205, ('G', 'u', 't', 'e', 'n', 'b', 'e', 'r', 'g</w>'): 78, ('E', 'B', 'o', 'o', 'k</w>'): 5, ('o', 'f</w>'): 39169, ('A', 'd', 'v', 'e', 'n', 't', 'u', 'r', 'e', 's</w>'): 2, ('S', 'h', 'e', 'r', 'l', 'o', 'c', 'k</w>'): 95, ('H', 'o', 'l', 'm', 'e', 's</w>'): 198, ('b', 'y</w>'): 6384, ('S', 'i', 'r</w>'): 30, ('A', 'r', 't', 'h', 'u', 'r</w>'): 18, ('C', 'o', 'n', 'a', 'n</w>'): 3, ('D', 'o', 'y', 'l', 'e</w>'): 2}
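One thing to watch with this approach: it uses whitespace as the separator, so it only matches the other answers when no key contains whitespace. A short sketch, where the key 'New York' is a made-up example and not data from the question, and the helper names are hypothetical:

```python
def via_split(word):
    # Join characters with spaces, add the marker, then split on whitespace.
    return tuple((' '.join(word) + '</w>').split())

def via_concat(word):
    # Slice-and-concatenate version used in the other answers.
    return tuple(word[:-1]) + (word[-1] + '</w>',)

print(via_split('The') == via_concat('The'))  # True for ordinary words
print(via_split('New York'))    # the space character is dropped
print(via_concat('New York'))   # the space is kept as its own element
```

For a word-count dictionary like the one in the question this is unlikely to matter, but it is a difference in behaviour and not only in speed.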


                              Timings:



                              import timeit
                              print('bro-grammer:',timeit.timeit(lambda: [{(*a[:-1],f'a[-1]</w>',):b for a,b in counter.items()} for i in range(1000)],number=10))
                              print('Sandeep Kadapa:',timeit.timeit(lambda: [{tuple(key[:-1])+(key[-1]+'</w>',):value for key,value in counter.items()} for i in range(1000)],number=10))
                              print('U9-Forward:',timeit.timeit(lambda: [{tuple((' '.join(k)+'</w>').split()):v for k,v in counter.items()} for i in range(1000)],number=10))


                              Output:



                              bro-grammer: 0.1293355557653911
                              Sandeep Kadapa: 0.20885866344797197
                              U9-Forward: 0.3026948357193003
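A benchmark is only meaningful when every candidate computes the same result, so it can help to assert that before timing. The sketch below is one way to do it; the function names are hypothetical, and the starred candidate uses f'{a[-1]}</w>' with braces, which is an assumption about what was intended. No timings are quoted because they depend on the machine and the Python version.

```python
import timeit

counter = {'The': 6149, 'Project': 205, 'Gutenberg': 78, 'EBook': 5,
           'of': 39169, 'Adventures': 2, 'Sherlock': 95, 'Holmes': 198,
           'by': 6384, 'Sir': 30, 'Arthur': 18, 'Conan': 3, 'Doyle': 2}

def starred(c):
    # Assumed intent of the starred answer: braces around a[-1].
    return {(*a[:-1], f'{a[-1]}</w>'): b for a, b in c.items()}

def concat(c):
    return {tuple(k[:-1]) + (k[-1] + '</w>',): v for k, v in c.items()}

def join_split(c):
    return {tuple((' '.join(k) + '</w>').split()): v for k, v in c.items()}

candidates = {'starred': starred, 'concat': concat, 'join_split': join_split}
expected = concat(counter)

for name, func in candidates.items():
    # Refuse to time a candidate that builds a different dictionary.
    assert func(counter) == expected, name
    elapsed = timeit.timeit(lambda: func(counter), number=1000)
    print(f'{name}: {elapsed:.4f} s for 1000 runs')
```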













                                  edited Jan 13 at 2:56

























                                  answered Jan 11 at 9:05









                                  U9-Forward























                                      0














                                      I would go for something like this:



                                      def f(string):
                                          l = list(string)
                                          l[-1] = l[-1] + '</w>'
                                          return tuple(l)

                                      dict((f(k), v) for k, v in counter.items())


                                      output:



                                      {('A', 'd', 'v', 'e', 'n', 't', 'u', 'r', 'e', 's</w>'): 2,
                                      ('A', 'r', 't', 'h', 'u', 'r</w>'): 18,
                                      ('C', 'o', 'n', 'a', 'n</w>'): 3,
                                      ('D', 'o', 'y', 'l', 'e</w>'): 2,
                                      ('E', 'B', 'o', 'o', 'k</w>'): 5,
                                      ('G', 'u', 't', 'e', 'n', 'b', 'e', 'r', 'g</w>'): 78,
                                      ('H', 'o', 'l', 'm', 'e', 's</w>'): 198,
                                      ('P', 'r', 'o', 'j', 'e', 'c', 't</w>'): 205,
                                      ('S', 'h', 'e', 'r', 'l', 'o', 'c', 'k</w>'): 95,
                                      ('S', 'i', 'r</w>'): 30,
                                      ('T', 'h', 'e</w>'): 6149,
                                      ('b', 'y</w>'): 6384,
                                      ('o', 'f</w>'): 39169}
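Note that f indexes l[-1], so an empty key would raise IndexError. The question's data has no empty keys, so this may never come up; if it could, one possible guard is sketched below. The name f_safe and the choice to return ('</w>',) for an empty string are assumptions, not part of the answer above.

```python
def f_safe(string):
    chars = list(string)
    if not chars:
        # Assumption: an empty word becomes just the end-of-word marker.
        return ('</w>',)
    chars[-1] += '</w>'
    return tuple(chars)

counter = {'of': 39169, 'by': 6384}
print({f_safe(k): v for k, v in counter.items()})
print(f_safe(''))
```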















                                          answered Jan 8 at 8:28









                                          Farzad Vertigo























                                              0














                                              With Python 3.5 or later, you can use a starred expression inside a tuple display.



                                              You can try:



                                              >>> {(*key[:-1], key[-1] + '</w>'): value for key, value in counter.items()}
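The starred form works because *key[:-1] unpacks the characters of the slice directly into the enclosing tuple display (iterable unpacking as described in PEP 448). A quick sketch on a single word, with variable names chosen only for illustration:

```python
key = 'Holmes'

# Unpack all but the last character, then add the marked last character.
unpacked = (*key[:-1], key[-1] + '</w>')
print(unpacked)  # ('H', 'o', 'l', 'm', 'e', 's</w>')

# Same result as building the two pieces separately and concatenating them.
print(unpacked == tuple(key[:-1]) + (key[-1] + '</w>',))  # True
```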















                                                  answered Jan 13 at 8:52









                                                  Laurent LAPORTE























                                                      0














                                                      You can also drop .items() and iterate over the keys directly, looking up each value:



                                                      {tuple(i[:-1]) + (i[-1]+'</w>',):counter[i] for i in counter}


                                                      This was a little faster in my test:



                                                      timeit.timeit(lambda: {tuple(i[:-1]) + (i[-1]+'</w>',):counter[i] for i in counter}, number=10)
                                                      0.000192291005179286
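Whether key iteration really beats .items() is worth measuring on your own data, since counter[i] adds a hash lookup per key while .items() yields the pairs directly. A sketch of a like-for-like comparison; the function names are illustrative and no timings are shown because they vary by machine and Python version:

```python
import timeit

counter = {'The': 6149, 'Project': 205, 'Gutenberg': 78, 'EBook': 5,
           'of': 39169, 'Adventures': 2, 'Sherlock': 95, 'Holmes': 198,
           'by': 6384, 'Sir': 30, 'Arthur': 18, 'Conan': 3, 'Doyle': 2}

def with_items(c):
    return {tuple(k[:-1]) + (k[-1] + '</w>',): v for k, v in c.items()}

def with_lookup(c):
    return {tuple(k[:-1]) + (k[-1] + '</w>',): c[k] for k in c}

# Both variants must build the same dictionary before timing them.
assert with_items(counter) == with_lookup(counter)

for func in (with_items, with_lookup):
    elapsed = timeit.timeit(lambda: func(counter), number=10_000)
    print(f'{func.__name__}: {elapsed:.4f} s for 10000 runs')
```

Using the same number of runs for every candidate keeps the figures comparable with each other.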















                                                          answered Jan 14 at 11:31









                                                          Mehrdad Pedramfar





























