Comparison of values in dictionaryPerforming calculations with a catalog of starsUnique dictionary values - print keysCopying a Dictionary and Replace its ValuesProduct of dictionary valuesSearch dictionary for matching valuesReverse dictionary where values are listsFinding dictionary keys whose values are duplicates
being overqualified as a barrier for getting a job
Has any person or company ever become a sovereign state?
How do I escape from a hanging Ubuntu OS?
Is verificationism dead?
Broken 26'' wheel
Can salted butter be used to make lemon curd?
How do I self-answer "What does this say?"
Is the weight of the aircraft flying in the sky transferred to the ground?
Vulcan Anatomy (Heart)
Examples of Lewis base nature of sulfur dioxide
How does the bypass air provide thrust?
Why are the 4th and 7th scale degrees removed from the major scale to make the Pentatonic scale?
Reconstructing the results of a 6-team soccer tournament
Where should I place my fictional continent in the South Pacific?
Select specific rows of a dataset
Why doesn't Stockfish (DroidFish) try to flag me?
Behaviour of verb that looks pronomial but doesn't have a "se" entry in the dictionary?
Aligning under and overbraces
Is it appropriate to rewrite and republish another author's useful but very badly written paper?
How to explain to people that I don't know very well about my complicated family background?
Why is wired Ethernet losing its speed advantage over wireless?
Could a German insult Hitler without being arrested?
Sometimes the updated rows are not locked within instead of update trigger
What is the PDF for the minimum difference between a random number and a set of random numbers
Comparison of values in dictionary
Performing calculations with a catalog of starsUnique dictionary values - print keysCopying a Dictionary and Replace its ValuesProduct of dictionary valuesSearch dictionary for matching valuesReverse dictionary where values are listsFinding dictionary keys whose values are duplicates
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;
$begingroup$
I have the following dictionary:
results_dict = 'Current model': 'Recall': 0.77,
'Precision': 0.1,
'F1_score': 0.18,
'Frauds': 94,
'New model': 'Recall': 0.96,
'Precision': 0.17,
'F1_score': 0.29,
'Frauds': 149
What I want is to make a comparison for each metric between the two models, and in case the one belonging to the new model is better, adding +1 to a variable i. So far, I have made this:
recall_new = results_dict['New model']['Recall']
recall_current = results_dict['Current model']['Recall']
precision_new = results_dict['New model']['Precision']
precision_current = results_dict['Current model']['Precision']
F1_new = results_dict['New model']['F1_score']
F1_current = results_dict['Current model']['F1_score']
frauds_new = results_dict['New model']['Frauds']
frauds_current = results_dict['Current model']['Frauds']
i = 0
if recall_new > recall_current or recall_new > (recall_current-(0.1)):
i+=1
if precision_new > precision_current or precision_new > (precision_current-(0.1)):
i+=1
if F1_new > F1_current or F1_new > (precision_current-(0.1)):
i+=1
if frauds_new > frauds_current or frauds_new > int(round(frauds_current*1.2,0)):
i+=1
The code makes what is intended to do but it is very verbose and repetitive and I was wondering whether it could be simplified or not. Thank you!
python dictionary
$endgroup$
add a comment
|
$begingroup$
I have the following dictionary:
results_dict = 'Current model': 'Recall': 0.77,
'Precision': 0.1,
'F1_score': 0.18,
'Frauds': 94,
'New model': 'Recall': 0.96,
'Precision': 0.17,
'F1_score': 0.29,
'Frauds': 149
What I want is to make a comparison for each metric between the two models, and in case the one belonging to the new model is better, adding +1 to a variable i. So far, I have made this:
recall_new = results_dict['New model']['Recall']
recall_current = results_dict['Current model']['Recall']
precision_new = results_dict['New model']['Precision']
precision_current = results_dict['Current model']['Precision']
F1_new = results_dict['New model']['F1_score']
F1_current = results_dict['Current model']['F1_score']
frauds_new = results_dict['New model']['Frauds']
frauds_current = results_dict['Current model']['Frauds']
i = 0
if recall_new > recall_current or recall_new > (recall_current-(0.1)):
i+=1
if precision_new > precision_current or precision_new > (precision_current-(0.1)):
i+=1
if F1_new > F1_current or F1_new > (precision_current-(0.1)):
i+=1
if frauds_new > frauds_current or frauds_new > int(round(frauds_current*1.2,0)):
i+=1
The code makes what is intended to do but it is very verbose and repetitive and I was wondering whether it could be simplified or not. Thank you!
python dictionary
$endgroup$
$begingroup$
At your fourth comparison, you're using a variable "frauds" that isn't defined. I assume that should be frauds_current?
$endgroup$
– Gloweye
Oct 1 at 13:25
$begingroup$
Correct. I have edited the question
$endgroup$
– Javier López Tomás
Oct 1 at 13:26
2
$begingroup$
Why do you compareF1_new > (precision_current-(0.1))
. Shouldn't it beF1_new > (F1_current-(0.1))
?
$endgroup$
– Graipher
Oct 1 at 13:56
$begingroup$
Another mistake
$endgroup$
– Javier López Tomás
Oct 1 at 15:20
add a comment
|
$begingroup$
I have the following dictionary:
results_dict = 'Current model': 'Recall': 0.77,
'Precision': 0.1,
'F1_score': 0.18,
'Frauds': 94,
'New model': 'Recall': 0.96,
'Precision': 0.17,
'F1_score': 0.29,
'Frauds': 149
What I want is to make a comparison for each metric between the two models, and in case the one belonging to the new model is better, adding +1 to a variable i. So far, I have made this:
recall_new = results_dict['New model']['Recall']
recall_current = results_dict['Current model']['Recall']
precision_new = results_dict['New model']['Precision']
precision_current = results_dict['Current model']['Precision']
F1_new = results_dict['New model']['F1_score']
F1_current = results_dict['Current model']['F1_score']
frauds_new = results_dict['New model']['Frauds']
frauds_current = results_dict['Current model']['Frauds']
i = 0
if recall_new > recall_current or recall_new > (recall_current-(0.1)):
i+=1
if precision_new > precision_current or precision_new > (precision_current-(0.1)):
i+=1
if F1_new > F1_current or F1_new > (precision_current-(0.1)):
i+=1
if frauds_new > frauds_current or frauds_new > int(round(frauds_current*1.2,0)):
i+=1
The code makes what is intended to do but it is very verbose and repetitive and I was wondering whether it could be simplified or not. Thank you!
python dictionary
$endgroup$
I have the following dictionary:
results_dict = 'Current model': 'Recall': 0.77,
'Precision': 0.1,
'F1_score': 0.18,
'Frauds': 94,
'New model': 'Recall': 0.96,
'Precision': 0.17,
'F1_score': 0.29,
'Frauds': 149
What I want is to make a comparison for each metric between the two models, and in case the one belonging to the new model is better, adding +1 to a variable i. So far, I have made this:
recall_new = results_dict['New model']['Recall']
recall_current = results_dict['Current model']['Recall']
precision_new = results_dict['New model']['Precision']
precision_current = results_dict['Current model']['Precision']
F1_new = results_dict['New model']['F1_score']
F1_current = results_dict['Current model']['F1_score']
frauds_new = results_dict['New model']['Frauds']
frauds_current = results_dict['Current model']['Frauds']
i = 0
if recall_new > recall_current or recall_new > (recall_current-(0.1)):
i+=1
if precision_new > precision_current or precision_new > (precision_current-(0.1)):
i+=1
if F1_new > F1_current or F1_new > (precision_current-(0.1)):
i+=1
if frauds_new > frauds_current or frauds_new > int(round(frauds_current*1.2,0)):
i+=1
The code makes what is intended to do but it is very verbose and repetitive and I was wondering whether it could be simplified or not. Thank you!
python dictionary
python dictionary
edited Oct 2 at 11:05
Eric Duminil
3,4811 gold badge10 silver badges19 bronze badges
3,4811 gold badge10 silver badges19 bronze badges
asked Oct 1 at 13:05
Javier López TomásJavier López Tomás
1415 bronze badges
1415 bronze badges
$begingroup$
At your fourth comparison, you're using a variable "frauds" that isn't defined. I assume that should be frauds_current?
$endgroup$
– Gloweye
Oct 1 at 13:25
$begingroup$
Correct. I have edited the question
$endgroup$
– Javier López Tomás
Oct 1 at 13:26
2
$begingroup$
Why do you compareF1_new > (precision_current-(0.1))
. Shouldn't it beF1_new > (F1_current-(0.1))
?
$endgroup$
– Graipher
Oct 1 at 13:56
$begingroup$
Another mistake
$endgroup$
– Javier López Tomás
Oct 1 at 15:20
add a comment
|
$begingroup$
At your fourth comparison, you're using a variable "frauds" that isn't defined. I assume that should be frauds_current?
$endgroup$
– Gloweye
Oct 1 at 13:25
$begingroup$
Correct. I have edited the question
$endgroup$
– Javier López Tomás
Oct 1 at 13:26
2
$begingroup$
Why do you compareF1_new > (precision_current-(0.1))
. Shouldn't it beF1_new > (F1_current-(0.1))
?
$endgroup$
– Graipher
Oct 1 at 13:56
$begingroup$
Another mistake
$endgroup$
– Javier López Tomás
Oct 1 at 15:20
$begingroup$
At your fourth comparison, you're using a variable "frauds" that isn't defined. I assume that should be frauds_current?
$endgroup$
– Gloweye
Oct 1 at 13:25
$begingroup$
At your fourth comparison, you're using a variable "frauds" that isn't defined. I assume that should be frauds_current?
$endgroup$
– Gloweye
Oct 1 at 13:25
$begingroup$
Correct. I have edited the question
$endgroup$
– Javier López Tomás
Oct 1 at 13:26
$begingroup$
Correct. I have edited the question
$endgroup$
– Javier López Tomás
Oct 1 at 13:26
2
2
$begingroup$
Why do you compare
F1_new > (precision_current-(0.1))
. Shouldn't it be F1_new > (F1_current-(0.1))
?$endgroup$
– Graipher
Oct 1 at 13:56
$begingroup$
Why do you compare
F1_new > (precision_current-(0.1))
. Shouldn't it be F1_new > (F1_current-(0.1))
?$endgroup$
– Graipher
Oct 1 at 13:56
$begingroup$
Another mistake
$endgroup$
– Javier López Tomás
Oct 1 at 15:20
$begingroup$
Another mistake
$endgroup$
– Javier López Tomás
Oct 1 at 15:20
add a comment
|
3 Answers
3
active
oldest
votes
$begingroup$
Index less
Also, you're indexing a lot. Perhaps it's better to seperate the two dicts:
old, new = results_dict["Current model"], results_dict["New model"]
(And what's up with having an underscore in Current_model
and a space in New model
? That's asking for typos. You're not even consistent with it - in your first code they're both spaces, but in the assignment you use 3 underscores and a space for Current
...)
Also, with how you make the checks, one of your conditions always implies the other. You should remove the redundant comparisons. Change:
if recall_new > recall_current or recall_new > (recall_current-(0.1)):
# Into:
if recall_new > recall_current - 0.1:
The additional braces don't do anything, and if recall_new
is bigger than current-0.1
, then it is also bigger than current
.
Loop
If you look closely, you'll see you're doing the same thing multiple times. So just make it a loop.
Arguably you should make an outside variable for the keys to iterate over, or iterate over the keys of either new or old dict. But if hard-coding is appropriate, it could look a lot like:
new_better = 0 # i is a bad variable name. Names should have meaning.
for key in ("Recall", "Precision", "F1_score"):
if new[key] > old[key]-0.1:
new_better += 1
if new["Frauds"] > old["Frauds"]*1.2:
new_better += 1
Note that I removed your rounding. Python has no issues transforming a float to the closest integer by means of int()
, but neither does it have a problem with comparing ints to floats. I did notice that your adjustments make it easier to score an increment for the first three variables, but harder for the fourth. Is this intentional?
$endgroup$
add a comment
|
$begingroup$
I'm not convinced your code is too long, but it is verbose in the sense that it's a bit hard to read. To make sense of it one must read many lines in detail and identify the repeated pattern between them. Wrapping things in functions can improve clarity just as much as using a more concise syntax or pattern
Gloweye's concern about your choices of dictionary keys is sound. I'd go further and suggest that this is a good time to write a small class.
There are different ways to think about classes and objects. Without getting too deep into the weeds, the thing they offer us here is the ability to express the structure of a "dictionary" variable in our code.
from typing import Dict, Union
class MyModel:
def __init__(self, recall: float, precision: float, f1: float, frauds: int):
self.recall = recall
self.precision = precision
self.f1 = f1
self.frauds = frauds
def from_dict(data: Dict[str, Union[int, float]]) -> 'MyModel':
return MyModel(data['Recall'], data['Precision'], data['F1_score'], data['Frauds'])
def recall_surpassed(self, new: float) -> bool:
return new > self.recall - 0.1
def precision_surpassed(self, new: float) -> bool:
return new > self.precision - 0.1
def f1_surpassed(self, new: float) -> bool:
return new > self.f1 - 0.1
def frauds_surpassed(self, new: float) -> bool:
return new > self.frauds
def get_improvement_score(self, new: 'MyModel') -> int:
return (
int(self.recall_surpassed(new.recall))
+ int(self.precision_surpassed(new.precision))
+ int(self.f1_surpassed(new.f1))
+ int(self.frauds_surpassed(new.frauds))
)
This isn't any more concise than what you'd written, but here the verbosity serves a purpose: it's easier to make sense of the behavior and to find any particular detail because the pieces are split up and labeled. For example, did I get the frauds
check right, and if I didn't then how would you fix it?
To use this with your existing nested dicts would be easy enough, because I included from_dict
as a helper method:
i = MyModel.from_dict(
results_dict['Current model']
).get_improvement_score(
MyModel.from_dict(results_dict['New model'])
)
print(i)
$endgroup$
1
$begingroup$
I agree that a class might be a good solution, but be on guard for premature optimization. Still +1.
$endgroup$
– Gloweye
Oct 1 at 17:58
add a comment
|
$begingroup$
As others have pointed out, your comparison operations are skewed between the Frauds
fields and all other fields. Given that the data type appears to be different, I'm going to assume that you're doing the comparison correctly, and it's working as intended.
That said, the numbers you are using to subtract (0.1
) appear surprisingly large in relation to the values you are comparing them with. In one case, your initial value is 0.1, so subtracting 0.1 would result in comparing the new value > 0, which might not be what you intended.
Iterating the keys
You can use the dict.keys()
iterable to get all the keys for one of the dictionaries, and you can use Python's conditional expressions to evaluate your score.
Combine those together and you get:
def score_results(old, new):
""" Return a score based on the number of
'improved' fields of the new results over
the old.
"""
score = 0
for k in new.keys():
if k == 'Frauds':
score += 1 if new[k] > int(old[k] * 1.2) else 0
else:
score += 1 if new[k] > old[k] - 0.1 else 0
# NOTE: did you mean > old[k] * 0.9 ???
return score
Lambdas and defaults and iterables, oh my!
You can shorten this by putting your brain in Python-mode and treating code as data using Python's first-class functions. In this case, we'll make use of the lambda expression syntax, since the things we're doing are so short:
def score_results(old, new):
""" Idem. """
minor_change = lambda o: o - 0.1 # could be o * 0.9??
change_funcs = 'Frauds': lambda o: int(o * 1.2)
return sum((1 if new[k] > change_funcs.get(k, minor_change)(old[k]) else 0)
for k in new.keys())
In this version, I used the conditional-expression syntax from above to evaluate either 0 or 1 for each key k
. I used the sum()
built-in to add up the scores. This replaces the for k in new.keys()
loop with an iterable. The iterable I chose was the generator expression that looped over the k in new.keys()
.
I could have used an if
clause in the generator expression to skip over the Frauds
key. But we don't want to skip it, we want to compute it differently. So instead I built a dictionary where I could look up the keys. Every key would have a default behavior, except for special keys, by using the dict.get(key, default)
method. The special keys in this case are Frauds
:
change_funcs.get(k, minor_change)
Once I had the special function (for Frauds) or the default function (minor_change
for everything except Frauds) I could call it:
change_funcs.get(...)(old[k])
And then put it into the comparison with the new key as part of the conditional-expression.
int(bool) -> 0, 1
Another "shortening" that could be made is to note that Python converts Boolean values to integers by mapping False
to 0
and True
to 1
. So instead of the conditional expression:
1 if cond else 0
we could simply use:
int(cond)
This converts our sum expression to:
return sum(int(new[k] > change_funcs.get(k, minor_change)(old[k])) for k in new.keys())
$endgroup$
1
$begingroup$
You can even do without theint
and just add booleans directly. bool is a subclass of int.
$endgroup$
– Alex Hall
Oct 2 at 7:45
add a comment
|
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f229972%2fcomparison-of-values-in-dictionary%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Index less
Also, you're indexing a lot. Perhaps it's better to seperate the two dicts:
old, new = results_dict["Current model"], results_dict["New model"]
(And what's up with having an underscore in Current_model
and a space in New model
? That's asking for typos. You're not even consistent with it - in your first code they're both spaces, but in the assignment you use 3 underscores and a space for Current
...)
Also, with how you make the checks, one of your conditions always implies the other. You should remove the redundant comparisons. Change:
if recall_new > recall_current or recall_new > (recall_current-(0.1)):
# Into:
if recall_new > recall_current - 0.1:
The additional braces don't do anything, and if recall_new
is bigger than current-0.1
, then it is also bigger than current
.
Loop
If you look closely, you'll see you're doing the same thing multiple times. So just make it a loop.
Arguably you should make an outside variable for the keys to iterate over, or iterate over the keys of either new or old dict. But if hard-coding is appropriate, it could look a lot like:
new_better = 0 # i is a bad variable name. Names should have meaning.
for key in ("Recall", "Precision", "F1_score"):
if new[key] > old[key]-0.1:
new_better += 1
if new["Frauds"] > old["Frauds"]*1.2:
new_better += 1
Note that I removed your rounding. Python has no issues transforming a float to the closest integer by means of int()
, but neither does it have a problem with comparing ints to floats. I did notice that your adjustments make it easier to score an increment for the first three variables, but harder for the fourth. Is this intentional?
$endgroup$
add a comment
|
$begingroup$
Index less
Also, you're indexing a lot. Perhaps it's better to seperate the two dicts:
old, new = results_dict["Current model"], results_dict["New model"]
(And what's up with having an underscore in Current_model
and a space in New model
? That's asking for typos. You're not even consistent with it - in your first code they're both spaces, but in the assignment you use 3 underscores and a space for Current
...)
Also, with how you make the checks, one of your conditions always implies the other. You should remove the redundant comparisons. Change:
if recall_new > recall_current or recall_new > (recall_current-(0.1)):
# Into:
if recall_new > recall_current - 0.1:
The additional braces don't do anything, and if recall_new
is bigger than current-0.1
, then it is also bigger than current
.
Loop
If you look closely, you'll see you're doing the same thing multiple times. So just make it a loop.
Arguably you should make an outside variable for the keys to iterate over, or iterate over the keys of either new or old dict. But if hard-coding is appropriate, it could look a lot like:
new_better = 0 # i is a bad variable name. Names should have meaning.
for key in ("Recall", "Precision", "F1_score"):
if new[key] > old[key]-0.1:
new_better += 1
if new["Frauds"] > old["Frauds"]*1.2:
new_better += 1
Note that I removed your rounding. Python has no issues transforming a float to the closest integer by means of int()
, but neither does it have a problem with comparing ints to floats. I did notice that your adjustments make it easier to score an increment for the first three variables, but harder for the fourth. Is this intentional?
$endgroup$
add a comment
|
$begingroup$
Index less
Also, you're indexing a lot. Perhaps it's better to seperate the two dicts:
old, new = results_dict["Current model"], results_dict["New model"]
(And what's up with having an underscore in Current_model
and a space in New model
? That's asking for typos. You're not even consistent with it - in your first code they're both spaces, but in the assignment you use 3 underscores and a space for Current
...)
Also, with how you make the checks, one of your conditions always implies the other. You should remove the redundant comparisons. Change:
if recall_new > recall_current or recall_new > (recall_current-(0.1)):
# Into:
if recall_new > recall_current - 0.1:
The additional braces don't do anything, and if recall_new
is bigger than current-0.1
, then it is also bigger than current
.
Loop
If you look closely, you'll see you're doing the same thing multiple times. So just make it a loop.
Arguably you should make an outside variable for the keys to iterate over, or iterate over the keys of either new or old dict. But if hard-coding is appropriate, it could look a lot like:
new_better = 0 # i is a bad variable name. Names should have meaning.
for key in ("Recall", "Precision", "F1_score"):
if new[key] > old[key]-0.1:
new_better += 1
if new["Frauds"] > old["Frauds"]*1.2:
new_better += 1
Note that I removed your rounding. Python has no issues transforming a float to the closest integer by means of int()
, but neither does it have a problem with comparing ints to floats. I did notice that your adjustments make it easier to score an increment for the first three variables, but harder for the fourth. Is this intentional?
$endgroup$
Index less
Also, you're indexing a lot. Perhaps it's better to seperate the two dicts:
old, new = results_dict["Current model"], results_dict["New model"]
(And what's up with having an underscore in Current_model
and a space in New model
? That's asking for typos. You're not even consistent with it - in your first code they're both spaces, but in the assignment you use 3 underscores and a space for Current
...)
Also, with how you make the checks, one of your conditions always implies the other. You should remove the redundant comparisons. Change:
if recall_new > recall_current or recall_new > (recall_current-(0.1)):
# Into:
if recall_new > recall_current - 0.1:
The additional braces don't do anything, and if recall_new
is bigger than current-0.1
, then it is also bigger than current
.
Loop
If you look closely, you'll see you're doing the same thing multiple times. So just make it a loop.
Arguably you should make an outside variable for the keys to iterate over, or iterate over the keys of either new or old dict. But if hard-coding is appropriate, it could look a lot like:
new_better = 0 # i is a bad variable name. Names should have meaning.
for key in ("Recall", "Precision", "F1_score"):
if new[key] > old[key]-0.1:
new_better += 1
if new["Frauds"] > old["Frauds"]*1.2:
new_better += 1
Note that I removed your rounding. Python has no issues transforming a float to the closest integer by means of int()
, but neither does it have a problem with comparing ints to floats. I did notice that your adjustments make it easier to score an increment for the first three variables, but harder for the fourth. Is this intentional?
edited Oct 1 at 14:07
Toby Speight
35.2k8 gold badges46 silver badges146 bronze badges
35.2k8 gold badges46 silver badges146 bronze badges
answered Oct 1 at 13:24
GloweyeGloweye
1,7265 silver badges19 bronze badges
1,7265 silver badges19 bronze badges
add a comment
|
add a comment
|
$begingroup$
I'm not convinced your code is too long, but it is verbose in the sense that it's a bit hard to read. To make sense of it one must read many lines in detail and identify the repeated pattern between them. Wrapping things in functions can improve clarity just as much as using a more concise syntax or pattern
Gloweye's concern about your choices of dictionary keys is sound. I'd go further and suggest that this is a good time to write a small class.
There are different ways to think about classes and objects. Without getting too deep into the weeds, the thing they offer us here is the ability to express the structure of a "dictionary" variable in our code.
from typing import Dict, Union
class MyModel:
def __init__(self, recall: float, precision: float, f1: float, frauds: int):
self.recall = recall
self.precision = precision
self.f1 = f1
self.frauds = frauds
def from_dict(data: Dict[str, Union[int, float]]) -> 'MyModel':
return MyModel(data['Recall'], data['Precision'], data['F1_score'], data['Frauds'])
def recall_surpassed(self, new: float) -> bool:
return new > self.recall - 0.1
def precision_surpassed(self, new: float) -> bool:
return new > self.precision - 0.1
def f1_surpassed(self, new: float) -> bool:
return new > self.f1 - 0.1
def frauds_surpassed(self, new: float) -> bool:
return new > self.frauds
def get_improvement_score(self, new: 'MyModel') -> int:
return (
int(self.recall_surpassed(new.recall))
+ int(self.precision_surpassed(new.precision))
+ int(self.f1_surpassed(new.f1))
+ int(self.frauds_surpassed(new.frauds))
)
This isn't any more concise than what you'd written, but here the verbosity serves a purpose: it's easier to make sense of the behavior and to find any particular detail because the pieces are split up and labeled. For example, did I get the frauds
check right, and if I didn't then how would you fix it?
To use this with your existing nested dicts would be easy enough, because I included from_dict
as a helper method:
i = MyModel.from_dict(
results_dict['Current model']
).get_improvement_score(
MyModel.from_dict(results_dict['New model'])
)
print(i)
$endgroup$
1
$begingroup$
I agree that a class might be a good solution, but be on guard for premature optimization. Still +1.
$endgroup$
– Gloweye
Oct 1 at 17:58
add a comment
|
$begingroup$
I'm not convinced your code is too long, but it is verbose in the sense that it's a bit hard to read. To make sense of it one must read many lines in detail and identify the repeated pattern between them. Wrapping things in functions can improve clarity just as much as using a more concise syntax or pattern
Gloweye's concern about your choices of dictionary keys is sound. I'd go further and suggest that this is a good time to write a small class.
There are different ways to think about classes and objects. Without getting too deep into the weeds, the thing they offer us here is the ability to express the structure of a "dictionary" variable in our code.
from typing import Dict, Union
class MyModel:
def __init__(self, recall: float, precision: float, f1: float, frauds: int):
self.recall = recall
self.precision = precision
self.f1 = f1
self.frauds = frauds
def from_dict(data: Dict[str, Union[int, float]]) -> 'MyModel':
return MyModel(data['Recall'], data['Precision'], data['F1_score'], data['Frauds'])
def recall_surpassed(self, new: float) -> bool:
return new > self.recall - 0.1
def precision_surpassed(self, new: float) -> bool:
return new > self.precision - 0.1
def f1_surpassed(self, new: float) -> bool:
return new > self.f1 - 0.1
def frauds_surpassed(self, new: float) -> bool:
return new > self.frauds
def get_improvement_score(self, new: 'MyModel') -> int:
return (
int(self.recall_surpassed(new.recall))
+ int(self.precision_surpassed(new.precision))
+ int(self.f1_surpassed(new.f1))
+ int(self.frauds_surpassed(new.frauds))
)
This isn't any more concise than what you'd written, but here the verbosity serves a purpose: it's easier to make sense of the behavior and to find any particular detail because the pieces are split up and labeled. For example, did I get the frauds
check right, and if I didn't then how would you fix it?
To use this with your existing nested dicts would be easy enough, because I included from_dict
as a helper method:
i = MyModel.from_dict(
results_dict['Current model']
).get_improvement_score(
MyModel.from_dict(results_dict['New model'])
)
print(i)
$endgroup$
1
$begingroup$
I agree that a class might be a good solution, but be on guard for premature optimization. Still +1.
$endgroup$
– Gloweye
Oct 1 at 17:58
add a comment
|
$begingroup$
I'm not convinced your code is too long, but it is verbose in the sense that it's a bit hard to read. To make sense of it one must read many lines in detail and identify the repeated pattern between them. Wrapping things in functions can improve clarity just as much as using a more concise syntax or pattern
Gloweye's concern about your choices of dictionary keys is sound. I'd go further and suggest that this is a good time to write a small class.
There are different ways to think about classes and objects. Without getting too deep into the weeds, the thing they offer us here is the ability to express the structure of a "dictionary" variable in our code.
from typing import Dict, Union
class MyModel:
def __init__(self, recall: float, precision: float, f1: float, frauds: int):
self.recall = recall
self.precision = precision
self.f1 = f1
self.frauds = frauds
def from_dict(data: Dict[str, Union[int, float]]) -> 'MyModel':
return MyModel(data['Recall'], data['Precision'], data['F1_score'], data['Frauds'])
def recall_surpassed(self, new: float) -> bool:
return new > self.recall - 0.1
def precision_surpassed(self, new: float) -> bool:
return new > self.precision - 0.1
def f1_surpassed(self, new: float) -> bool:
return new > self.f1 - 0.1
def frauds_surpassed(self, new: float) -> bool:
return new > self.frauds
def get_improvement_score(self, new: 'MyModel') -> int:
return (
int(self.recall_surpassed(new.recall))
+ int(self.precision_surpassed(new.precision))
+ int(self.f1_surpassed(new.f1))
+ int(self.frauds_surpassed(new.frauds))
)
This isn't any more concise than what you'd written, but here the verbosity serves a purpose: it's easier to make sense of the behavior and to find any particular detail because the pieces are split up and labeled. For example, did I get the frauds
check right, and if I didn't then how would you fix it?
To use this with your existing nested dicts would be easy enough, because I included from_dict
as a helper method:
i = MyModel.from_dict(
results_dict['Current model']
).get_improvement_score(
MyModel.from_dict(results_dict['New model'])
)
print(i)
$endgroup$
I'm not convinced your code is too long, but it is verbose in the sense that it's a bit hard to read. To make sense of it one must read many lines in detail and identify the repeated pattern between them. Wrapping things in functions can improve clarity just as much as using a more concise syntax or pattern
Gloweye's concern about your choices of dictionary keys is sound. I'd go further and suggest that this is a good time to write a small class.
There are different ways to think about classes and objects. Without getting too deep into the weeds, the thing they offer us here is the ability to express the structure of a "dictionary" variable in our code.
from typing import Dict, Union
class MyModel:
def __init__(self, recall: float, precision: float, f1: float, frauds: int):
self.recall = recall
self.precision = precision
self.f1 = f1
self.frauds = frauds
def from_dict(data: Dict[str, Union[int, float]]) -> 'MyModel':
return MyModel(data['Recall'], data['Precision'], data['F1_score'], data['Frauds'])
def recall_surpassed(self, new: float) -> bool:
return new > self.recall - 0.1
def precision_surpassed(self, new: float) -> bool:
return new > self.precision - 0.1
def f1_surpassed(self, new: float) -> bool:
return new > self.f1 - 0.1
def frauds_surpassed(self, new: float) -> bool:
return new > self.frauds
def get_improvement_score(self, new: 'MyModel') -> int:
return (
int(self.recall_surpassed(new.recall))
+ int(self.precision_surpassed(new.precision))
+ int(self.f1_surpassed(new.f1))
+ int(self.frauds_surpassed(new.frauds))
)
This isn't any more concise than what you'd written, but here the verbosity serves a purpose: it's easier to make sense of the behavior and to find any particular detail because the pieces are split up and labeled. For example, did I get the frauds
check right, and if I didn't then how would you fix it?
To use this with your existing nested dicts would be easy enough, because I included from_dict
as a helper method:
i = MyModel.from_dict(
results_dict['Current model']
).get_improvement_score(
MyModel.from_dict(results_dict['New model'])
)
print(i)
answered Oct 1 at 15:24
ShapeOfMatterShapeOfMatter
1,0101 silver badge13 bronze badges
1,0101 silver badge13 bronze badges
1
$begingroup$
I agree that a class might be a good solution, but be on guard for premature optimization. Still +1.
$endgroup$
– Gloweye
Oct 1 at 17:58
add a comment
|
1
$begingroup$
I agree that a class might be a good solution, but be on guard for premature optimization. Still +1.
$endgroup$
– Gloweye
Oct 1 at 17:58
1
1
$begingroup$
I agree that a class might be a good solution, but be on guard for premature optimization. Still +1.
$endgroup$
– Gloweye
Oct 1 at 17:58
$begingroup$
I agree that a class might be a good solution, but be on guard for premature optimization. Still +1.
$endgroup$
– Gloweye
Oct 1 at 17:58
add a comment
|
$begingroup$
As others have pointed out, your comparison operations are skewed between the Frauds
fields and all other fields. Given that the data type appears to be different, I'm going to assume that you're doing the comparison correctly, and it's working as intended.
That said, the numbers you are using to subtract (0.1
) appear surprisingly large in relation to the values you are comparing them with. In one case, your initial value is 0.1, so subtracting 0.1 would result in comparing the new value > 0, which might not be what you intended.
Iterating the keys
You can use the dict.keys()
iterable to get all the keys for one of the dictionaries, and you can use Python's conditional expressions to evaluate your score.
Combine those together and you get:
def score_results(old, new):
""" Return a score based on the number of
'improved' fields of the new results over
the old.
"""
score = 0
for k in new.keys():
if k == 'Frauds':
score += 1 if new[k] > int(old[k] * 1.2) else 0
else:
score += 1 if new[k] > old[k] - 0.1 else 0
# NOTE: did you mean > old[k] * 0.9 ???
return score
Lambdas and defaults and iterables, oh my!
You can shorten this by putting your brain in Python-mode and treating code as data using Python's first-class functions. In this case, we'll make use of the lambda expression syntax, since the things we're doing are so short:
def score_results(old, new):
""" Idem. """
minor_change = lambda o: o - 0.1 # could be o * 0.9??
change_funcs = 'Frauds': lambda o: int(o * 1.2)
return sum((1 if new[k] > change_funcs.get(k, minor_change)(old[k]) else 0)
for k in new.keys())
In this version, I used the conditional-expression syntax from above to evaluate either 0 or 1 for each key k
. I used the sum()
built-in to add up the scores. This replaces the for k in new.keys()
loop with an iterable. The iterable I chose was the generator expression that looped over the k in new.keys()
.
I could have used an if
clause in the generator expression to skip over the Frauds
key. But we don't want to skip it, we want to compute it differently. So instead I built a dictionary where I could look up the keys. Every key would have a default behavior, except for special keys, by using the dict.get(key, default)
method. The special keys in this case are Frauds
:
change_funcs.get(k, minor_change)
Once I had the special function (for Frauds) or the default function (minor_change
for everything except Frauds) I could call it:
change_funcs.get(...)(old[k])
And then put it into the comparison with the new key as part of the conditional-expression.
int(bool) -> 0, 1
Another "shortening" that could be made is to note that Python converts Boolean values to integers by mapping False
to 0
and True
to 1
. So instead of the conditional expression:
1 if cond else 0
we could simply use:
int(cond)
This converts our sum expression to:
return sum(int(new[k] > change_funcs.get(k, minor_change)(old[k])) for k in new.keys())
$endgroup$
1
$begingroup$
You can even do without theint
and just add booleans directly. bool is a subclass of int.
$endgroup$
– Alex Hall
Oct 2 at 7:45
add a comment
|
$begingroup$
As others have pointed out, your comparison operations are skewed between the Frauds
fields and all other fields. Given that the data type appears to be different, I'm going to assume that you're doing the comparison correctly, and it's working as intended.
That said, the numbers you are using to subtract (0.1
) appear surprisingly large in relation to the values you are comparing them with. In one case, your initial value is 0.1, so subtracting 0.1 would result in comparing the new value > 0, which might not be what you intended.
Iterating the keys
You can use the dict.keys()
iterable to get all the keys for one of the dictionaries, and you can use Python's conditional expressions to evaluate your score.
Combine those together and you get:
def score_results(old, new):
""" Return a score based on the number of
'improved' fields of the new results over
the old.
"""
score = 0
for k in new.keys():
if k == 'Frauds':
score += 1 if new[k] > int(old[k] * 1.2) else 0
else:
score += 1 if new[k] > old[k] - 0.1 else 0
# NOTE: did you mean > old[k] * 0.9 ???
return score
Lambdas and defaults and iterables, oh my!
You can shorten this by putting your brain in Python-mode and treating code as data using Python's first-class functions. In this case, we'll make use of the lambda expression syntax, since the things we're doing are so short:
def score_results(old, new):
""" Idem. """
minor_change = lambda o: o - 0.1 # could be o * 0.9??
change_funcs = 'Frauds': lambda o: int(o * 1.2)
return sum((1 if new[k] > change_funcs.get(k, minor_change)(old[k]) else 0)
for k in new.keys())
In this version, I used the conditional-expression syntax from above to evaluate either 0 or 1 for each key k
. I used the sum()
built-in to add up the scores. This replaces the for k in new.keys()
loop with an iterable. The iterable I chose was the generator expression that looped over the k in new.keys()
.
I could have used an if
clause in the generator expression to skip over the Frauds
key. But we don't want to skip it, we want to compute it differently. So instead I built a dictionary where I could look up the keys. Every key would have a default behavior, except for special keys, by using the dict.get(key, default)
method. The special keys in this case are Frauds
:
change_funcs.get(k, minor_change)
Once I had the special function (for Frauds) or the default function (minor_change
for everything except Frauds) I could call it:
change_funcs.get(...)(old[k])
And then put it into the comparison with the new key as part of the conditional-expression.
int(bool) -> 0, 1
Another "shortening" that could be made is to note that Python converts Boolean values to integers by mapping False
to 0
and True
to 1
. So instead of the conditional expression:
1 if cond else 0
we could simply use:
int(cond)
This converts our sum expression to:
return sum(int(new[k] > change_funcs.get(k, minor_change)(old[k])) for k in new.keys())
$endgroup$
1
$begingroup$
You can even do without theint
and just add booleans directly. bool is a subclass of int.
$endgroup$
– Alex Hall
Oct 2 at 7:45
add a comment
|
$begingroup$
As others have pointed out, your comparison operations are skewed between the Frauds
fields and all other fields. Given that the data type appears to be different, I'm going to assume that you're doing the comparison correctly, and it's working as intended.
That said, the numbers you are using to subtract (0.1
) appear surprisingly large in relation to the values you are comparing them with. In one case, your initial value is 0.1, so subtracting 0.1 would result in comparing the new value > 0, which might not be what you intended.
Iterating the keys
You can use the dict.keys()
iterable to get all the keys for one of the dictionaries, and you can use Python's conditional expressions to evaluate your score.
Combine those together and you get:
def score_results(old, new):
""" Return a score based on the number of
'improved' fields of the new results over
the old.
"""
score = 0
for k in new.keys():
if k == 'Frauds':
score += 1 if new[k] > int(old[k] * 1.2) else 0
else:
score += 1 if new[k] > old[k] - 0.1 else 0
# NOTE: did you mean > old[k] * 0.9 ???
return score
Lambdas and defaults and iterables, oh my!
You can shorten this by putting your brain in Python-mode and treating code as data using Python's first-class functions. In this case, we'll make use of the lambda expression syntax, since the things we're doing are so short:
def score_results(old, new):
""" Idem. """
minor_change = lambda o: o - 0.1 # could be o * 0.9??
change_funcs = 'Frauds': lambda o: int(o * 1.2)
return sum((1 if new[k] > change_funcs.get(k, minor_change)(old[k]) else 0)
for k in new.keys())
In this version, I used the conditional-expression syntax from above to evaluate either 0 or 1 for each key k
. I used the sum()
built-in to add up the scores. This replaces the for k in new.keys()
loop with an iterable. The iterable I chose was the generator expression that looped over the k in new.keys()
.
I could have used an if
clause in the generator expression to skip over the Frauds
key. But we don't want to skip it, we want to compute it differently. So instead I built a dictionary where I could look up the keys. Every key would have a default behavior, except for special keys, by using the dict.get(key, default)
method. The special keys in this case are Frauds
:
change_funcs.get(k, minor_change)
Once I had the special function (for Frauds) or the default function (minor_change
for everything except Frauds) I could call it:
change_funcs.get(...)(old[k])
And then put it into the comparison with the new key as part of the conditional-expression.
int(bool) -> 0, 1
Another "shortening" that could be made is to note that Python converts Boolean values to integers by mapping False
to 0
and True
to 1
. So instead of the conditional expression:
1 if cond else 0
we could simply use:
int(cond)
This converts our sum expression to:
return sum(int(new[k] > change_funcs.get(k, minor_change)(old[k])) for k in new.keys())
$endgroup$
As others have pointed out, your comparison operations are skewed between the Frauds
fields and all other fields. Given that the data type appears to be different, I'm going to assume that you're doing the comparison correctly, and it's working as intended.
That said, the numbers you are using to subtract (0.1
) appear surprisingly large in relation to the values you are comparing them with. In one case, your initial value is 0.1, so subtracting 0.1 would result in comparing the new value > 0, which might not be what you intended.
Iterating the keys
You can use the dict.keys()
iterable to get all the keys for one of the dictionaries, and you can use Python's conditional expressions to evaluate your score.
Combine those together and you get:
def score_results(old, new):
""" Return a score based on the number of
'improved' fields of the new results over
the old.
"""
score = 0
for k in new.keys():
if k == 'Frauds':
score += 1 if new[k] > int(old[k] * 1.2) else 0
else:
score += 1 if new[k] > old[k] - 0.1 else 0
# NOTE: did you mean > old[k] * 0.9 ???
return score
Lambdas and defaults and iterables, oh my!
You can shorten this by putting your brain in Python-mode and treating code as data using Python's first-class functions. In this case, we'll make use of the lambda expression syntax, since the things we're doing are so short:
def score_results(old, new):
""" Idem. """
minor_change = lambda o: o - 0.1 # could be o * 0.9??
change_funcs = 'Frauds': lambda o: int(o * 1.2)
return sum((1 if new[k] > change_funcs.get(k, minor_change)(old[k]) else 0)
for k in new.keys())
In this version, I used the conditional-expression syntax from above to evaluate either 0 or 1 for each key k
. I used the sum()
built-in to add up the scores. This replaces the for k in new.keys()
loop with an iterable. The iterable I chose was the generator expression that looped over the k in new.keys()
.
I could have used an if
clause in the generator expression to skip over the Frauds
key. But we don't want to skip it, we want to compute it differently. So instead I built a dictionary where I could look up the keys. Every key would have a default behavior, except for special keys, by using the dict.get(key, default)
method. The special keys in this case are Frauds
:
change_funcs.get(k, minor_change)
Once I had the special function (for Frauds) or the default function (minor_change
for everything except Frauds) I could call it:
change_funcs.get(...)(old[k])
And then put it into the comparison with the new key as part of the conditional-expression.
int(bool) -> 0, 1
Another "shortening" that could be made is to note that Python converts Boolean values to integers by mapping False
to 0
and True
to 1
. So instead of the conditional expression:
1 if cond else 0
we could simply use:
int(cond)
This converts our sum expression to:
return sum(int(new[k] > change_funcs.get(k, minor_change)(old[k])) for k in new.keys())
answered Oct 1 at 18:34
Austin HastingsAustin Hastings
10k15 silver badges40 bronze badges
10k15 silver badges40 bronze badges
1
$begingroup$
You can even do without theint
and just add booleans directly. bool is a subclass of int.
$endgroup$
– Alex Hall
Oct 2 at 7:45
add a comment
|
1
$begingroup$
You can even do without theint
and just add booleans directly. bool is a subclass of int.
$endgroup$
– Alex Hall
Oct 2 at 7:45
1
1
$begingroup$
You can even do without the
int
and just add booleans directly. bool is a subclass of int.$endgroup$
– Alex Hall
Oct 2 at 7:45
$begingroup$
You can even do without the
int
and just add booleans directly. bool is a subclass of int.$endgroup$
– Alex Hall
Oct 2 at 7:45
add a comment
|
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f229972%2fcomparison-of-values-in-dictionary%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
At your fourth comparison, you're using a variable "frauds" that isn't defined. I assume that should be frauds_current?
$endgroup$
– Gloweye
Oct 1 at 13:25
$begingroup$
Correct. I have edited the question
$endgroup$
– Javier López Tomás
Oct 1 at 13:26
2
$begingroup$
Why do you compare
F1_new > (precision_current-(0.1))
. Shouldn't it beF1_new > (F1_current-(0.1))
?$endgroup$
– Graipher
Oct 1 at 13:56
$begingroup$
Another mistake
$endgroup$
– Javier López Tomás
Oct 1 at 15:20