Map unique raw words to a list of code words

Problem

Write a function that replaces the words in raw with the words in code_words such that the first occurrence of each word in raw is assigned the first unassigned word in code_words. If the code_words list is too short, raise an error. code_words may contain duplicates, in which case the function should ignore/skip them.

Examples:

    encoder(["a"], ["1", "2", "3", "4"]) → ["1"]
    encoder(["a", "b"], ["1", "2", "3", "4"]) → ["1", "2"]
    encoder(["a", "b", "a"], ["1", "1", "2", "3", "4"]) → ["1", "2", "1"]

Solution

    def encoder(raw, code_words):
        cw = iter(code_words)
        code_by_raw = {}  # map of raw item to code item
        result = []
        seen = set()  # for ignoring duplicate code_words
        for r in raw:
            if r not in code_by_raw:
                for code in cw:  # cw is iter(code_words), "persistent pointer"
                    if code not in seen:
                        seen.add(code)
                        break
                else:  # nobreak; ran out of code_words
                    raise ValueError("not enough code_words")
                code_by_raw[r] = code
            result.append(code_by_raw[r])
        return result

Questions

My main concern is the use of cw as a "persistent pointer". Specifically, might people be confused when they see for code in cw?

What should be the typical best practices in this case?

Might it be better if I used the following instead?

    try:
        code = next(cw)
        while code in seen:
            code = next(cw)
    except StopIteration:
        raise ValueError("not enough code_words")
    else:
        seen.add(code)





















Tags: python, iterator, iteration, generator

asked Sep 30 at 6:31 by nehcsivart
edited Sep 30 at 16:07 by Mast
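For readers less familiar with the pattern the question asks about, here is a minimal sketch of what the "persistent pointer" amounts to: a for loop over an explicit iterator resumes where the previous loop stopped, while a for loop over the list itself starts over each time. The variable names first_pass, second_pass and restart are illustrative only and do not come from the post.

```python
code_words = ["1", "1", "2", "3"]

# Looping over an explicit iterator: a later loop continues from the
# position where the earlier loop stopped.
cw = iter(code_words)
first_pass = []
for code in cw:
    first_pass.append(code)
    if code == "2":
        break
second_pass = list(cw)  # whatever the iterator has left after the break

# Looping over the list itself: every new loop begins again at index 0.
restart = []
for code in code_words:
    restart.append(code)
    break
```

After this runs, second_pass holds only the items that followed the break, whereas restart begins again with the first element of the list.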























          1 Answer

















> My main concern is the use of cw as a "persistent pointer". Specifically, might people be confused when they see for code in cw?

No. Instead, you can just remove the line cw = iter(code_words) as long as it's a native iterable. (The seen set still skips codes that were already handed out, although a list is then rescanned from the start for every new word.) "Persistent pointer" isn't a thing in Python, because all Python knows are names.

> What should be the typical best practices in this case?

That would be building a dictionary and using it for the actual translation. You're basically already doing this with your code_by_raw, if a bit more verbosely than others might. The only real difference is that, in my opinion, it would be better to first establish the translation and then create the result.

Except for your premature result generation, I would say your current function isn't bad. It does what it needs to do, and does it without wasted work, but it's not very readable. As is often said, I think you need to factor out a bit of code: specifically, the bit that handles the fact that your inputs don't have to yield unique values and that duplicates have to be skipped.

I would suggest a generator to handle that. This simplifies the main function a ton. (A comment pointed me towards the unique_everseen recipe, which is a slightly broader function. We don't quite need all its functionality, but it might be worth the effort if you need some more flexibility.)

    def unique(iterable):
        """Generator that "uniquifies" an iterable.

        Subsequent values equal to values already yielded are ignored.
        """
        past = set()
        for entry in iterable:
            if entry in past:
                continue
            past.add(entry)
            yield entry

    def encoder(raw_words, code_words):
        # Create the mapping dictionary; zip stops at the shorter input.
        code_by_raw = dict(zip(unique(raw_words), unique(code_words)))
        # Check that every distinct raw word received a code word.
        # (Compare against the distinct words, not len(raw_words),
        # otherwise repeated raw words would trigger the error.)
        if len(code_by_raw) < len(set(raw_words)):
            raise ValueError("not enough code_words")
        # Do the translation and return the result.
        return [code_by_raw[raw] for raw in raw_words]

I can't completely tell your experience level with Python. For result creation, I'm using a list comprehension here.

> Might it be better if I used the following instead?

It would not be bad functionally to use a structure like that, but it's still ugly (opinions may differ). It basically does the same as my unique() generator up there.
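The answer mentions that the unique_everseen recipe is slightly broader because it accepts an optional key. As a rough sketch of that idea (a simplified adaptation, not the recipe itself), the helper below takes a key function that decides what counts as a duplicate. The name unique_by and the example inputs are hypothetical.

```python
def unique_by(iterable, key=None):
    """Yield items in first-seen order, skipping repeats.

    Hypothetical helper modelled loosely on the unique_everseen recipe:
    `key`, if given, maps each item to the value used for the
    uniqueness test, while the original item is what gets yielded.
    """
    seen = set()
    for item in iterable:
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            yield item


# Without a key, items are compared directly.
plain = list(unique_by(["1", "1", "2", "3", "2"]))

# With a key, "A" and "a" count as the same word; the first one seen wins.
folded = list(unique_by(["A", "a", "b", "B", "c"], key=str.lower))
```

For the encoder in this thread no key is needed, so the plain unique() generator is enough; the key only matters if, say, code words should be treated as duplicates regardless of case.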













answered Sep 30 at 7:06 by Gloweye
edited Sep 30 at 19:08














• Also, it might be worth having a look at the unique_everseen function in the itertools recipes, which has some performance improvements and an optional key by which to determine uniqueness (but is otherwise the same as your unique function). – Graipher, Sep 30 at 7:40

• Yeah, that's worth mentioning. I put it in. I'll keep my unique() around so it's easy to spot what it does. – Gloweye, Sep 30 at 7:44

• Beware that it is just a recipe, though. Unfortunately you cannot just do from itertools import unique_everseen. – Graipher, Sep 30 at 7:48

• Ah, OK. Didn't pay attention to the header. – Gloweye, Sep 30 at 7:50

• I think dict.fromkeys(iterable) serves more or less the same functionality (for Python version >= 3.6) as unique(iterable). – GZ0, Sep 30 at 21:23
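The last comment's suggestion can be sketched as follows. This is one possible reading of it, not code from the thread: the function name encoder_fromkeys is hypothetical, and the sketch relies on dicts preserving insertion order, which the language guarantees from Python 3.7 (in CPython 3.6 it is an implementation detail).

```python
def encoder_fromkeys(raw_words, code_words):
    """Sketch of the encoder using dict.fromkeys for de-duplication."""
    # dict.fromkeys keeps the first occurrence of each key, in order.
    unique_raw = list(dict.fromkeys(raw_words))
    unique_codes = list(dict.fromkeys(code_words))
    if len(unique_codes) < len(unique_raw):
        raise ValueError("not enough code_words")
    # zip stops after the last distinct raw word has been paired.
    code_by_raw = dict(zip(unique_raw, unique_codes))
    return [code_by_raw[raw] for raw in raw_words]


example = encoder_fromkeys(["a", "b", "a"], ["1", "1", "2", "3", "4"])
```

One difference from the generator version: dict.fromkeys consumes its whole input up front, so it cannot stop early on a long or unbounded stream of code words.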










































