SHA3-255, one bit lessHash functions and the Avalanche effectI need a 64-bit cryptographic hash for 96 bits of dataDoes this guarantee a unique 32 bit Hash?Floyd's cycle finding algorithm on SHA256Hash algorithm for converting n-bit number into another n-bit numberCreate a CR hash function where truncating one bit leads to collisionsHow many hex digits do I need to compare when manually checking hash functions?Hash Collision ProbabilitiesPartial collisions of hashes, why is this important?Integrity check of a binary firmware file with hash function
50K what to do with it?
What should be done when the theory behind a PhD thesis turns out to be wrong?
If I buy a super off-peak ticket, can I travel on a off-peak train and pay the difference?
Team members' and manager's behaviour is indifferent after I announce my intention to leave in 8 months
Why do right-wing parties generally oppose the legalization of marijuana?
Why are green parties so often opposed to nuclear power?
Evil plans - how do you come up with interesting ones?
Between while and do in shell script
In Germany, why does the burden of proof fall on authorities rather than the company or individual when it comes to possible illegal funds?
Is the use of ellipsis "..." dismissive or rude?
What is the purpose of the rules in counterpoint composition?
Is there a material or method to allow "swimmable" coins?
Is Jupiter still an anomaly?
How to run " $('div').clone().appendTo('.wrapper'); " in LWC?
Why do Russian names transliterated into English have unpronounceable 'k's before 'h's (e.g. 'Mikhail' instead of just 'Mihail')?
How much tech advancement could be made out of modern processor appearing in 1980s?
What's the best way to keep cover of a pan slightly opened?
Who verifies the trust of certificate authorities?
Template not provided using create-react-app
Mostly One Way Travel : Says Grandpa
Why try to impeach Trump now?
Is it academically dishonest to submit the same project to two different classes in the same semester?
What are the factors that decide on whether you die instantly or get knocked out in PUBG?
Should there be a comma between "Wonderful" and "Counselor" in Isa. 9:6?
SHA3-255, one bit less
Hash functions and the Avalanche effectI need a 64-bit cryptographic hash for 96 bits of dataDoes this guarantee a unique 32 bit Hash?Floyd's cycle finding algorithm on SHA256Hash algorithm for converting n-bit number into another n-bit numberCreate a CR hash function where truncating one bit leads to collisionsHow many hex digits do I need to compare when manually checking hash functions?Hash Collision ProbabilitiesPartial collisions of hashes, why is this important?Integrity check of a binary firmware file with hash function
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;
.everyonelovesstackoverflowposition:absolute;height:1px;width:1px;opacity:0;top:0;left:0;pointer-events:none;
$begingroup$
I need a SHA3-255 or 511. What if I simply truncate a standard SHA3-256 or 512? Apart from the doubled probability of hash collision, are there any other things I should be aware of? I could also truncate one byte instead of one bit, if useful.
What I need is being able to store something different than the hash, in the same 32 bytes, or 64 bytes, so I need to sacrifice one bit to mark when the bytes represent a hash or something else.
Alternatively I could say that if the first byte, not bit, is 0xff, then the remaining ones represent something else. This should reduce the probability of hash collision, but what I would have a probability of 1/256 that a hash starts with 0xff, generating ambiguity in my encoding. I could say that if the for 4 bytes are 0xffffffff, then I would end up with probability of generating an ambiguous encoding of 1/2^32, but I would prefer having a well-defined encoding under any circumstances.
Any well known approach I'm not aware of?
hash collision-resistance preimage-resistance sha-3 2nd-preimage-resistance
$endgroup$
add a comment
|
$begingroup$
I need a SHA3-255 or 511. What if I simply truncate a standard SHA3-256 or 512? Apart from the doubled probability of hash collision, are there any other things I should be aware of? I could also truncate one byte instead of one bit, if useful.
What I need is being able to store something different than the hash, in the same 32 bytes, or 64 bytes, so I need to sacrifice one bit to mark when the bytes represent a hash or something else.
Alternatively I could say that if the first byte, not bit, is 0xff, then the remaining ones represent something else. This should reduce the probability of hash collision, but what I would have a probability of 1/256 that a hash starts with 0xff, generating ambiguity in my encoding. I could say that if the for 4 bytes are 0xffffffff, then I would end up with probability of generating an ambiguous encoding of 1/2^32, but I would prefer having a well-defined encoding under any circumstances.
Any well known approach I'm not aware of?
hash collision-resistance preimage-resistance sha-3 2nd-preimage-resistance
$endgroup$
14
$begingroup$
I'm intensely curious to know why you need this.
$endgroup$
– chrylis -on strike-
Sep 28 at 8:19
$begingroup$
I need to identify a set or messages by a fixed length key, that must be calculated out of the messages themselves. Clearly, a hash does the job (assuming no hash collisions). Messages can be multiple kb or few bytes. Call it premature optimization or just curiosity, I'm thinking if it's possible to avoid storing the actual message in case it's smaller than the hash, so using the hashes only for longer messages and messages themselves for the short ones. But I need one bit to identify which case it is.
$endgroup$
– ragazzojp
Sep 28 at 23:17
2
$begingroup$
I'd definitely call that premature optimization, FWIW. Writing secure code is as much about writing simple, understandable code as it is writing code that's actually secure -- because the simple, understandable code will be easy to read and understand later, and easy to fix bugs in. Adding a condition based on the first bit will be ugly no matter your language, and you're not going to save much of anything -- one SHA3 call, when SHA3 is already designed to be fast. Doubly so if you ever want to interoperate with other stuff, and since you'll need 256 more bits anyway... what's the point?
$endgroup$
– Fund Monica's Lawsuit
Sep 29 at 0:02
2
$begingroup$
Sometime the point is just the discussion, yes it's probably premature optimization, but I like to thing about problems and finding solutions. For example, answers pushed me to read the NIST documents about SHA3 and I learned new things ...
$endgroup$
– ragazzojp
Sep 29 at 0:20
add a comment
|
$begingroup$
I need a SHA3-255 or 511. What if I simply truncate a standard SHA3-256 or 512? Apart from the doubled probability of hash collision, are there any other things I should be aware of? I could also truncate one byte instead of one bit, if useful.
What I need is being able to store something different than the hash, in the same 32 bytes, or 64 bytes, so I need to sacrifice one bit to mark when the bytes represent a hash or something else.
Alternatively I could say that if the first byte, not bit, is 0xff, then the remaining ones represent something else. This should reduce the probability of hash collision, but what I would have a probability of 1/256 that a hash starts with 0xff, generating ambiguity in my encoding. I could say that if the for 4 bytes are 0xffffffff, then I would end up with probability of generating an ambiguous encoding of 1/2^32, but I would prefer having a well-defined encoding under any circumstances.
Any well known approach I'm not aware of?
hash collision-resistance preimage-resistance sha-3 2nd-preimage-resistance
$endgroup$
I need a SHA3-255 or 511. What if I simply truncate a standard SHA3-256 or 512? Apart from the doubled probability of hash collision, are there any other things I should be aware of? I could also truncate one byte instead of one bit, if useful.
What I need is being able to store something different than the hash, in the same 32 bytes, or 64 bytes, so I need to sacrifice one bit to mark when the bytes represent a hash or something else.
Alternatively I could say that if the first byte, not bit, is 0xff, then the remaining ones represent something else. This should reduce the probability of hash collision, but what I would have a probability of 1/256 that a hash starts with 0xff, generating ambiguity in my encoding. I could say that if the for 4 bytes are 0xffffffff, then I would end up with probability of generating an ambiguous encoding of 1/2^32, but I would prefer having a well-defined encoding under any circumstances.
Any well known approach I'm not aware of?
hash collision-resistance preimage-resistance sha-3 2nd-preimage-resistance
hash collision-resistance preimage-resistance sha-3 2nd-preimage-resistance
edited Sep 29 at 16:27
kelalaka
13.5k4 gold badges34 silver badges61 bronze badges
13.5k4 gold badges34 silver badges61 bronze badges
asked Sep 27 at 19:04
ragazzojpragazzojp
3431 silver badge7 bronze badges
3431 silver badge7 bronze badges
14
$begingroup$
I'm intensely curious to know why you need this.
$endgroup$
– chrylis -on strike-
Sep 28 at 8:19
$begingroup$
I need to identify a set or messages by a fixed length key, that must be calculated out of the messages themselves. Clearly, a hash does the job (assuming no hash collisions). Messages can be multiple kb or few bytes. Call it premature optimization or just curiosity, I'm thinking if it's possible to avoid storing the actual message in case it's smaller than the hash, so using the hashes only for longer messages and messages themselves for the short ones. But I need one bit to identify which case it is.
$endgroup$
– ragazzojp
Sep 28 at 23:17
2
$begingroup$
I'd definitely call that premature optimization, FWIW. Writing secure code is as much about writing simple, understandable code as it is writing code that's actually secure -- because the simple, understandable code will be easy to read and understand later, and easy to fix bugs in. Adding a condition based on the first bit will be ugly no matter your language, and you're not going to save much of anything -- one SHA3 call, when SHA3 is already designed to be fast. Doubly so if you ever want to interoperate with other stuff, and since you'll need 256 more bits anyway... what's the point?
$endgroup$
– Fund Monica's Lawsuit
Sep 29 at 0:02
2
$begingroup$
Sometime the point is just the discussion, yes it's probably premature optimization, but I like to thing about problems and finding solutions. For example, answers pushed me to read the NIST documents about SHA3 and I learned new things ...
$endgroup$
– ragazzojp
Sep 29 at 0:20
add a comment
|
14
$begingroup$
I'm intensely curious to know why you need this.
$endgroup$
– chrylis -on strike-
Sep 28 at 8:19
$begingroup$
I need to identify a set or messages by a fixed length key, that must be calculated out of the messages themselves. Clearly, a hash does the job (assuming no hash collisions). Messages can be multiple kb or few bytes. Call it premature optimization or just curiosity, I'm thinking if it's possible to avoid storing the actual message in case it's smaller than the hash, so using the hashes only for longer messages and messages themselves for the short ones. But I need one bit to identify which case it is.
$endgroup$
– ragazzojp
Sep 28 at 23:17
2
$begingroup$
I'd definitely call that premature optimization, FWIW. Writing secure code is as much about writing simple, understandable code as it is writing code that's actually secure -- because the simple, understandable code will be easy to read and understand later, and easy to fix bugs in. Adding a condition based on the first bit will be ugly no matter your language, and you're not going to save much of anything -- one SHA3 call, when SHA3 is already designed to be fast. Doubly so if you ever want to interoperate with other stuff, and since you'll need 256 more bits anyway... what's the point?
$endgroup$
– Fund Monica's Lawsuit
Sep 29 at 0:02
2
$begingroup$
Sometime the point is just the discussion, yes it's probably premature optimization, but I like to thing about problems and finding solutions. For example, answers pushed me to read the NIST documents about SHA3 and I learned new things ...
$endgroup$
– ragazzojp
Sep 29 at 0:20
14
14
$begingroup$
I'm intensely curious to know why you need this.
$endgroup$
– chrylis -on strike-
Sep 28 at 8:19
$begingroup$
I'm intensely curious to know why you need this.
$endgroup$
– chrylis -on strike-
Sep 28 at 8:19
$begingroup$
I need to identify a set or messages by a fixed length key, that must be calculated out of the messages themselves. Clearly, a hash does the job (assuming no hash collisions). Messages can be multiple kb or few bytes. Call it premature optimization or just curiosity, I'm thinking if it's possible to avoid storing the actual message in case it's smaller than the hash, so using the hashes only for longer messages and messages themselves for the short ones. But I need one bit to identify which case it is.
$endgroup$
– ragazzojp
Sep 28 at 23:17
$begingroup$
I need to identify a set or messages by a fixed length key, that must be calculated out of the messages themselves. Clearly, a hash does the job (assuming no hash collisions). Messages can be multiple kb or few bytes. Call it premature optimization or just curiosity, I'm thinking if it's possible to avoid storing the actual message in case it's smaller than the hash, so using the hashes only for longer messages and messages themselves for the short ones. But I need one bit to identify which case it is.
$endgroup$
– ragazzojp
Sep 28 at 23:17
2
2
$begingroup$
I'd definitely call that premature optimization, FWIW. Writing secure code is as much about writing simple, understandable code as it is writing code that's actually secure -- because the simple, understandable code will be easy to read and understand later, and easy to fix bugs in. Adding a condition based on the first bit will be ugly no matter your language, and you're not going to save much of anything -- one SHA3 call, when SHA3 is already designed to be fast. Doubly so if you ever want to interoperate with other stuff, and since you'll need 256 more bits anyway... what's the point?
$endgroup$
– Fund Monica's Lawsuit
Sep 29 at 0:02
$begingroup$
I'd definitely call that premature optimization, FWIW. Writing secure code is as much about writing simple, understandable code as it is writing code that's actually secure -- because the simple, understandable code will be easy to read and understand later, and easy to fix bugs in. Adding a condition based on the first bit will be ugly no matter your language, and you're not going to save much of anything -- one SHA3 call, when SHA3 is already designed to be fast. Doubly so if you ever want to interoperate with other stuff, and since you'll need 256 more bits anyway... what's the point?
$endgroup$
– Fund Monica's Lawsuit
Sep 29 at 0:02
2
2
$begingroup$
Sometime the point is just the discussion, yes it's probably premature optimization, but I like to thing about problems and finding solutions. For example, answers pushed me to read the NIST documents about SHA3 and I learned new things ...
$endgroup$
– ragazzojp
Sep 29 at 0:20
$begingroup$
Sometime the point is just the discussion, yes it's probably premature optimization, but I like to thing about problems and finding solutions. For example, answers pushed me to read the NIST documents about SHA3 and I learned new things ...
$endgroup$
– ragazzojp
Sep 29 at 0:20
add a comment
|
3 Answers
3
active
oldest
votes
$begingroup$
With all well-regarded hash functions, the bits of the hash all have equal worth: as far as anyone knows (unless they aren't telling), the bits are not correlated. If you take $k$ bits of an $n$-bit hash, you get a $k$-bit hash function. Truncating SHA-256 to 255 bits gives you a hash that's almost as good as SHA-256: it has $2^255$ strength against preimage attacks and $2^127.5$ strength against collision attacks.
There are precedents for taking certain bits of a hash. SHA-224 and SHA-384 are obtained with calculations that are essentially the same as SHA-256 and SHA-512 respectively, just with different initial values (which shouldn't matter to the strength of the algorithm) and with the output truncated to the smaller size. Another precedent is that UUID can be constructed from 122 bits of MD5 or SHA-1 hashes (out of 128 or 160 respectively).
For SHA3, there's a cleaner construction. Instead of taking SHA3-256 and cutting off one bit, take the SHAKE256 255-bit or 248-bit output. SHAKE256 is exactly the same as SHA3-256 except for two bits near the beginning of the calculation, so it has the same security properties, but it's explicitly designed to have variable-length output. You can even use SHAKE128 instead of SHAKE256, which is slightly cheaper to compute with no meaningful security loss.
It's unlikely that there's an actual security problem with truncating a hash, but using SHAKE gives you greater confidence that nothing will go wrong. Even better, to avoid domain collisions (where two parts of the system calculate the hash of the same string for different reasons), calculate the cSHAKE output with a unique string as the customization string.
$endgroup$
add a comment
|
$begingroup$
Apart from the slightly reduced resistances, there is no problem:
Resistances for SHA3-512;
Pre-image resistance decreased to $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^511 = sqrt2cdot 2^255$ or $sqrt2^504 = 2^252$, if 1 bit or 1 byte trimmed, respectively.
Similarly resistances for SHA3-256;
Pre-image resistance decreased to $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^255 = sqrt2cdot 2^127$ or $sqrt2^248 = 2^124$, if 1 bit or 1 byte trimmed, respectively.
The trimmed case resistance should be good enough for your application.
Actually, this is not uncommon, for example, SHA-224 is the truncated version of SHA-256 with different initial values that provides domain separation.
Trimming is secure as long as the generic attacks are in good bounds. We require the hash functions to have avalanche criteria on the output bits, that is a change in the any of input bits must randomly affect half of the output bits. Each bit of the hash function must depend on the input bits; removing 1 bit or 1 byte doesn't affect the results of other bits.
$endgroup$
add a comment
|
$begingroup$
Lets get one thing out of the way: forcing one bit to 0 or 1 does not change the output size of the hash. A hash output is not a number, so the output size would not be affected.
Reducing hash output is common practice. Although maybe not a direct requirement, generally the output of a hash is considered indistinguishable from random - if the input is unknown, of course. So basically you can do with it what you want. The common thing to do is to take the leftmost bits or bytes of the hash output. You're taking the rightmost bits or bytes, and that's OK too.
As you already found out, trying to use one value out of 256 to indicate a special case is tricky. You can of course use set of bytes to escape values, but since your output size is static, you'll have to sacrifice more security for the special cases: the hashes starting with 0xFF in your case. As SHA3-512 has plenty of security, I'd just sacrifice a bit or even byte.
Finally, there is one rather odd issue with taking the leftmost bytes. You might get in trouble if you have other full hashes over the same data (the domain collision in Gilles' answer). To compensate for this, most hash functions have special bits or other constants when generating a shorter hash. Usually publishing the shorter hash is not a problem though. You could play it safe though and start off by hashing a specific magic value in advance, or by using SHAKE instead.
$endgroup$
$begingroup$
Do you know a good domain collision weakness example? Is it about beeing able to guess the pre-image if you know the input to a full hash somewhere else?
$endgroup$
– eckes
Sep 28 at 15:14
$begingroup$
Yes, or likewise knowing the partial hash beforehand. If you're a bit careful then you can easily avoid issues with this, but crypto is about avoiding possible weaknesses...
$endgroup$
– Maarten Bodewes♦
Sep 29 at 16:45
$begingroup$
@eckes One component of an application uses SHA-256(file) as the index for a content-addressed storage system. Another component of the application uses SHA-256(file) as the encryption key to encrypt the file in a deterministic encryption scheme to keep the file content secret from the storage server (except for plaintext confirmation attacks). By combining these components, the storage server can decrypt any file! (Consider Tahoe-LAFS for an example of an application that could make this mistake, but carefully avoids it by domain separation.)
$endgroup$
– Squeamish Ossifrage
Oct 21 at 2:38
add a comment
|
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "281"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcrypto.stackexchange.com%2fquestions%2f74646%2fsha3-255-one-bit-less%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
With all well-regarded hash functions, the bits of the hash all have equal worth: as far as anyone knows (unless they aren't telling), the bits are not correlated. If you take $k$ bits of an $n$-bit hash, you get a $k$-bit hash function. Truncating SHA-256 to 255 bits gives you a hash that's almost as good as SHA-256: it has $2^255$ strength against preimage attacks and $2^127.5$ strength against collision attacks.
There are precedents for taking certain bits of a hash. SHA-224 and SHA-384 are obtained with calculations that are essentially the same as SHA-256 and SHA-512 respectively, just with different initial values (which shouldn't matter to the strength of the algorithm) and with the output truncated to the smaller size. Another precedent is that UUID can be constructed from 122 bits of MD5 or SHA-1 hashes (out of 128 or 160 respectively).
For SHA3, there's a cleaner construction. Instead of taking SHA3-256 and cutting off one bit, take the SHAKE256 255-bit or 248-bit output. SHAKE256 is exactly the same as SHA3-256 except for two bits near the beginning of the calculation, so it has the same security properties, but it's explicitly designed to have variable-length output. You can even use SHAKE128 instead of SHAKE256, which is slightly cheaper to compute with no meaningful security loss.
It's unlikely that there's an actual security problem with truncating a hash, but using SHAKE gives you greater confidence that nothing will go wrong. Even better, to avoid domain collisions (where two parts of the system calculate the hash of the same string for different reasons), calculate the cSHAKE output with a unique string as the customization string.
$endgroup$
add a comment
|
$begingroup$
With all well-regarded hash functions, the bits of the hash all have equal worth: as far as anyone knows (unless they aren't telling), the bits are not correlated. If you take $k$ bits of an $n$-bit hash, you get a $k$-bit hash function. Truncating SHA-256 to 255 bits gives you a hash that's almost as good as SHA-256: it has $2^255$ strength against preimage attacks and $2^127.5$ strength against collision attacks.
There are precedents for taking certain bits of a hash. SHA-224 and SHA-384 are obtained with calculations that are essentially the same as SHA-256 and SHA-512 respectively, just with different initial values (which shouldn't matter to the strength of the algorithm) and with the output truncated to the smaller size. Another precedent is that UUID can be constructed from 122 bits of MD5 or SHA-1 hashes (out of 128 or 160 respectively).
For SHA3, there's a cleaner construction. Instead of taking SHA3-256 and cutting off one bit, take the SHAKE256 255-bit or 248-bit output. SHAKE256 is exactly the same as SHA3-256 except for two bits near the beginning of the calculation, so it has the same security properties, but it's explicitly designed to have variable-length output. You can even use SHAKE128 instead of SHAKE256, which is slightly cheaper to compute with no meaningful security loss.
It's unlikely that there's an actual security problem with truncating a hash, but using SHAKE gives you greater confidence that nothing will go wrong. Even better, to avoid domain collisions (where two parts of the system calculate the hash of the same string for different reasons), calculate the cSHAKE output with a unique string as the customization string.
$endgroup$
add a comment
|
$begingroup$
With all well-regarded hash functions, the bits of the hash all have equal worth: as far as anyone knows (unless they aren't telling), the bits are not correlated. If you take $k$ bits of an $n$-bit hash, you get a $k$-bit hash function. Truncating SHA-256 to 255 bits gives you a hash that's almost as good as SHA-256: it has $2^255$ strength against preimage attacks and $2^127.5$ strength against collision attacks.
There are precedents for taking certain bits of a hash. SHA-224 and SHA-384 are obtained with calculations that are essentially the same as SHA-256 and SHA-512 respectively, just with different initial values (which shouldn't matter to the strength of the algorithm) and with the output truncated to the smaller size. Another precedent is that UUID can be constructed from 122 bits of MD5 or SHA-1 hashes (out of 128 or 160 respectively).
For SHA3, there's a cleaner construction. Instead of taking SHA3-256 and cutting off one bit, take the SHAKE256 255-bit or 248-bit output. SHAKE256 is exactly the same as SHA3-256 except for two bits near the beginning of the calculation, so it has the same security properties, but it's explicitly designed to have variable-length output. You can even use SHAKE128 instead of SHAKE256, which is slightly cheaper to compute with no meaningful security loss.
It's unlikely that there's an actual security problem with truncating a hash, but using SHAKE gives you greater confidence that nothing will go wrong. Even better, to avoid domain collisions (where two parts of the system calculate the hash of the same string for different reasons), calculate the cSHAKE output with a unique string as the customization string.
$endgroup$
With all well-regarded hash functions, the bits of the hash all have equal worth: as far as anyone knows (unless they aren't telling), the bits are not correlated. If you take $k$ bits of an $n$-bit hash, you get a $k$-bit hash function. Truncating SHA-256 to 255 bits gives you a hash that's almost as good as SHA-256: it has $2^255$ strength against preimage attacks and $2^127.5$ strength against collision attacks.
There are precedents for taking certain bits of a hash. SHA-224 and SHA-384 are obtained with calculations that are essentially the same as SHA-256 and SHA-512 respectively, just with different initial values (which shouldn't matter to the strength of the algorithm) and with the output truncated to the smaller size. Another precedent is that UUID can be constructed from 122 bits of MD5 or SHA-1 hashes (out of 128 or 160 respectively).
For SHA3, there's a cleaner construction. Instead of taking SHA3-256 and cutting off one bit, take the SHAKE256 255-bit or 248-bit output. SHAKE256 is exactly the same as SHA3-256 except for two bits near the beginning of the calculation, so it has the same security properties, but it's explicitly designed to have variable-length output. You can even use SHAKE128 instead of SHAKE256, which is slightly cheaper to compute with no meaningful security loss.
It's unlikely that there's an actual security problem with truncating a hash, but using SHAKE gives you greater confidence that nothing will go wrong. Even better, to avoid domain collisions (where two parts of the system calculate the hash of the same string for different reasons), calculate the cSHAKE output with a unique string as the customization string.
edited Sep 28 at 6:00
answered Sep 27 at 21:40
Gilles 'SO- stop being evil'Gilles 'SO- stop being evil'
11.3k3 gold badges31 silver badges62 bronze badges
11.3k3 gold badges31 silver badges62 bronze badges
add a comment
|
add a comment
|
$begingroup$
Apart from the slightly reduced resistances, there is no problem:
Resistances for SHA3-512;
Pre-image resistance decreased to $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^511 = sqrt2cdot 2^255$ or $sqrt2^504 = 2^252$, if 1 bit or 1 byte trimmed, respectively.
Similarly resistances for SHA3-256;
Pre-image resistance decreased to $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^255 = sqrt2cdot 2^127$ or $sqrt2^248 = 2^124$, if 1 bit or 1 byte trimmed, respectively.
The trimmed case resistance should be good enough for your application.
Actually, this is not uncommon, for example, SHA-224 is the truncated version of SHA-256 with different initial values that provides domain separation.
Trimming is secure as long as the generic attacks are in good bounds. We require the hash functions to have avalanche criteria on the output bits, that is a change in the any of input bits must randomly affect half of the output bits. Each bit of the hash function must depend on the input bits; removing 1 bit or 1 byte doesn't affect the results of other bits.
$endgroup$
add a comment
|
$begingroup$
Apart from the slightly reduced resistances, there is no problem:
Resistances for SHA3-512;
Pre-image resistance decreased to $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^511 = sqrt2cdot 2^255$ or $sqrt2^504 = 2^252$, if 1 bit or 1 byte trimmed, respectively.
Similarly resistances for SHA3-256;
Pre-image resistance decreased to $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^255 = sqrt2cdot 2^127$ or $sqrt2^248 = 2^124$, if 1 bit or 1 byte trimmed, respectively.
The trimmed case resistance should be good enough for your application.
Actually, this is not uncommon, for example, SHA-224 is the truncated version of SHA-256 with different initial values that provides domain separation.
Trimming is secure as long as the generic attacks are in good bounds. We require the hash functions to have avalanche criteria on the output bits, that is a change in the any of input bits must randomly affect half of the output bits. Each bit of the hash function must depend on the input bits; removing 1 bit or 1 byte doesn't affect the results of other bits.
$endgroup$
add a comment
|
$begingroup$
Apart from the slightly reduced resistances, there is no problem:
Resistances for SHA3-512;
Pre-image resistance decreased to $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^511 = sqrt2cdot 2^255$ or $sqrt2^504 = 2^252$, if 1 bit or 1 byte trimmed, respectively.
Similarly resistances for SHA3-256;
Pre-image resistance decreased to $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^255 = sqrt2cdot 2^127$ or $sqrt2^248 = 2^124$, if 1 bit or 1 byte trimmed, respectively.
The trimmed case resistance should be good enough for your application.
Actually, this is not uncommon, for example, SHA-224 is the truncated version of SHA-256 with different initial values that provides domain separation.
Trimming is secure as long as the generic attacks are in good bounds. We require the hash functions to have avalanche criteria on the output bits, that is a change in the any of input bits must randomly affect half of the output bits. Each bit of the hash function must depend on the input bits; removing 1 bit or 1 byte doesn't affect the results of other bits.
$endgroup$
Apart from the slightly reduced resistances, there is no problem:
Resistances for SHA3-512;
Pre-image resistance decreased to $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^511$ or $2^504$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^511 = sqrt2cdot 2^255$ or $sqrt2^504 = 2^252$, if 1 bit or 1 byte trimmed, respectively.
Similarly resistances for SHA3-256;
Pre-image resistance decreased to $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Secondary preimage resistance decreased to o $2^255$ or $2^248$, if 1 bit or 1 byte trimmed, respectively.
Collision resistance decreased to o $sqrt2^255 = sqrt2cdot 2^127$ or $sqrt2^248 = 2^124$, if 1 bit or 1 byte trimmed, respectively.
The trimmed case resistance should be good enough for your application.
Actually, this is not uncommon, for example, SHA-224 is the truncated version of SHA-256 with different initial values that provides domain separation.
Trimming is secure as long as the generic attacks are in good bounds. We require the hash functions to have avalanche criteria on the output bits, that is a change in the any of input bits must randomly affect half of the output bits. Each bit of the hash function must depend on the input bits; removing 1 bit or 1 byte doesn't affect the results of other bits.
edited Sep 28 at 18:36
answered Sep 27 at 20:11
kelalakakelalaka
13.5k4 gold badges34 silver badges61 bronze badges
13.5k4 gold badges34 silver badges61 bronze badges
add a comment
|
add a comment
|
$begingroup$
Lets get one thing out of the way: forcing one bit to 0 or 1 does not change the output size of the hash. A hash output is not a number, so the output size would not be affected.
Reducing hash output is common practice. Although maybe not a direct requirement, generally the output of a hash is considered indistinguishable from random - if the input is unknown, of course. So basically you can do with it what you want. The common thing to do is to take the leftmost bits or bytes of the hash output. You're taking the rightmost bits or bytes, and that's OK too.
As you already found out, trying to use one value out of 256 to indicate a special case is tricky. You can of course use set of bytes to escape values, but since your output size is static, you'll have to sacrifice more security for the special cases: the hashes starting with 0xFF in your case. As SHA3-512 has plenty of security, I'd just sacrifice a bit or even byte.
Finally, there is one rather odd issue with taking the leftmost bytes. You might get in trouble if you have other full hashes over the same data (the domain collision in Gilles' answer). To compensate for this, most hash functions have special bits or other constants when generating a shorter hash. Usually publishing the shorter hash is not a problem though. You could play it safe though and start off by hashing a specific magic value in advance, or by using SHAKE instead.
$endgroup$
$begingroup$
Do you know a good domain collision weakness example? Is it about beeing able to guess the pre-image if you know the input to a full hash somewhere else?
$endgroup$
– eckes
Sep 28 at 15:14
$begingroup$
Yes, or likewise knowing the partial hash beforehand. If you're a bit careful then you can easily avoid issues with this, but crypto is about avoiding possible weaknesses...
$endgroup$
– Maarten Bodewes♦
Sep 29 at 16:45
$begingroup$
@eckes One component of an application uses SHA-256(file) as the index for a content-addressed storage system. Another component of the application uses SHA-256(file) as the encryption key to encrypt the file in a deterministic encryption scheme to keep the file content secret from the storage server (except for plaintext confirmation attacks). By combining these components, the storage server can decrypt any file! (Consider Tahoe-LAFS for an example of an application that could make this mistake, but carefully avoids it by domain separation.)
$endgroup$
– Squeamish Ossifrage
Oct 21 at 2:38
add a comment
|
$begingroup$
Lets get one thing out of the way: forcing one bit to 0 or 1 does not change the output size of the hash. A hash output is not a number, so the output size would not be affected.
Reducing hash output is common practice. Although maybe not a direct requirement, generally the output of a hash is considered indistinguishable from random - if the input is unknown, of course. So basically you can do with it what you want. The common thing to do is to take the leftmost bits or bytes of the hash output. You're taking the rightmost bits or bytes, and that's OK too.
As you already found out, trying to use one value out of 256 to indicate a special case is tricky. You can of course use set of bytes to escape values, but since your output size is static, you'll have to sacrifice more security for the special cases: the hashes starting with 0xFF in your case. As SHA3-512 has plenty of security, I'd just sacrifice a bit or even byte.
Finally, there is one rather odd issue with taking the leftmost bytes. You might get in trouble if you have other full hashes over the same data (the domain collision in Gilles' answer). To compensate for this, most hash functions have special bits or other constants when generating a shorter hash. Usually publishing the shorter hash is not a problem though. You could play it safe though and start off by hashing a specific magic value in advance, or by using SHAKE instead.
$endgroup$
$begingroup$
Do you know a good domain collision weakness example? Is it about beeing able to guess the pre-image if you know the input to a full hash somewhere else?
$endgroup$
– eckes
Sep 28 at 15:14
$begingroup$
Yes, or likewise knowing the partial hash beforehand. If you're a bit careful then you can easily avoid issues with this, but crypto is about avoiding possible weaknesses...
$endgroup$
– Maarten Bodewes♦
Sep 29 at 16:45
$begingroup$
@eckes One component of an application uses SHA-256(file) as the index for a content-addressed storage system. Another component of the application uses SHA-256(file) as the encryption key to encrypt the file in a deterministic encryption scheme to keep the file content secret from the storage server (except for plaintext confirmation attacks). By combining these components, the storage server can decrypt any file! (Consider Tahoe-LAFS for an example of an application that could make this mistake, but carefully avoids it by domain separation.)
$endgroup$
– Squeamish Ossifrage
Oct 21 at 2:38
add a comment
|
$begingroup$
Lets get one thing out of the way: forcing one bit to 0 or 1 does not change the output size of the hash. A hash output is not a number, so the output size would not be affected.
Reducing hash output is common practice. Although maybe not a direct requirement, generally the output of a hash is considered indistinguishable from random - if the input is unknown, of course. So basically you can do with it what you want. The common thing to do is to take the leftmost bits or bytes of the hash output. You're taking the rightmost bits or bytes, and that's OK too.
As you already found out, trying to use one value out of 256 to indicate a special case is tricky. You can of course use set of bytes to escape values, but since your output size is static, you'll have to sacrifice more security for the special cases: the hashes starting with 0xFF in your case. As SHA3-512 has plenty of security, I'd just sacrifice a bit or even byte.
Finally, there is one rather odd issue with taking the leftmost bytes. You might get in trouble if you have other full hashes over the same data (the domain collision in Gilles' answer). To compensate for this, most hash functions have special bits or other constants when generating a shorter hash. Usually publishing the shorter hash is not a problem though. You could play it safe though and start off by hashing a specific magic value in advance, or by using SHAKE instead.
$endgroup$
Lets get one thing out of the way: forcing one bit to 0 or 1 does not change the output size of the hash. A hash output is not a number, so the output size would not be affected.
Reducing hash output is common practice. Although maybe not a direct requirement, generally the output of a hash is considered indistinguishable from random - if the input is unknown, of course. So basically you can do with it what you want. The common thing to do is to take the leftmost bits or bytes of the hash output. You're taking the rightmost bits or bytes, and that's OK too.
As you already found out, trying to use one value out of 256 to indicate a special case is tricky. You can of course use set of bytes to escape values, but since your output size is static, you'll have to sacrifice more security for the special cases: the hashes starting with 0xFF in your case. As SHA3-512 has plenty of security, I'd just sacrifice a bit or even byte.
Finally, there is one rather odd issue with taking the leftmost bytes. You might get in trouble if you have other full hashes over the same data (the domain collision in Gilles' answer). To compensate for this, most hash functions have special bits or other constants when generating a shorter hash. Usually publishing the shorter hash is not a problem though. You could play it safe though and start off by hashing a specific magic value in advance, or by using SHAKE instead.
edited Sep 27 at 21:58
answered Sep 27 at 21:52
Maarten Bodewes♦Maarten Bodewes
64.2k7 gold badges94 silver badges224 bronze badges
64.2k7 gold badges94 silver badges224 bronze badges
$begingroup$
Do you know a good domain collision weakness example? Is it about beeing able to guess the pre-image if you know the input to a full hash somewhere else?
$endgroup$
– eckes
Sep 28 at 15:14
$begingroup$
Yes, or likewise knowing the partial hash beforehand. If you're a bit careful then you can easily avoid issues with this, but crypto is about avoiding possible weaknesses...
$endgroup$
– Maarten Bodewes♦
Sep 29 at 16:45
$begingroup$
@eckes One component of an application uses SHA-256(file) as the index for a content-addressed storage system. Another component of the application uses SHA-256(file) as the encryption key to encrypt the file in a deterministic encryption scheme to keep the file content secret from the storage server (except for plaintext confirmation attacks). By combining these components, the storage server can decrypt any file! (Consider Tahoe-LAFS for an example of an application that could make this mistake, but carefully avoids it by domain separation.)
$endgroup$
– Squeamish Ossifrage
Oct 21 at 2:38
add a comment
|
$begingroup$
Do you know a good domain collision weakness example? Is it about beeing able to guess the pre-image if you know the input to a full hash somewhere else?
$endgroup$
– eckes
Sep 28 at 15:14
$begingroup$
Yes, or likewise knowing the partial hash beforehand. If you're a bit careful then you can easily avoid issues with this, but crypto is about avoiding possible weaknesses...
$endgroup$
– Maarten Bodewes♦
Sep 29 at 16:45
$begingroup$
@eckes One component of an application uses SHA-256(file) as the index for a content-addressed storage system. Another component of the application uses SHA-256(file) as the encryption key to encrypt the file in a deterministic encryption scheme to keep the file content secret from the storage server (except for plaintext confirmation attacks). By combining these components, the storage server can decrypt any file! (Consider Tahoe-LAFS for an example of an application that could make this mistake, but carefully avoids it by domain separation.)
$endgroup$
– Squeamish Ossifrage
Oct 21 at 2:38
$begingroup$
Do you know a good domain collision weakness example? Is it about beeing able to guess the pre-image if you know the input to a full hash somewhere else?
$endgroup$
– eckes
Sep 28 at 15:14
$begingroup$
Do you know a good domain collision weakness example? Is it about beeing able to guess the pre-image if you know the input to a full hash somewhere else?
$endgroup$
– eckes
Sep 28 at 15:14
$begingroup$
Yes, or likewise knowing the partial hash beforehand. If you're a bit careful then you can easily avoid issues with this, but crypto is about avoiding possible weaknesses...
$endgroup$
– Maarten Bodewes♦
Sep 29 at 16:45
$begingroup$
Yes, or likewise knowing the partial hash beforehand. If you're a bit careful then you can easily avoid issues with this, but crypto is about avoiding possible weaknesses...
$endgroup$
– Maarten Bodewes♦
Sep 29 at 16:45
$begingroup$
@eckes One component of an application uses SHA-256(file) as the index for a content-addressed storage system. Another component of the application uses SHA-256(file) as the encryption key to encrypt the file in a deterministic encryption scheme to keep the file content secret from the storage server (except for plaintext confirmation attacks). By combining these components, the storage server can decrypt any file! (Consider Tahoe-LAFS for an example of an application that could make this mistake, but carefully avoids it by domain separation.)
$endgroup$
– Squeamish Ossifrage
Oct 21 at 2:38
$begingroup$
@eckes One component of an application uses SHA-256(file) as the index for a content-addressed storage system. Another component of the application uses SHA-256(file) as the encryption key to encrypt the file in a deterministic encryption scheme to keep the file content secret from the storage server (except for plaintext confirmation attacks). By combining these components, the storage server can decrypt any file! (Consider Tahoe-LAFS for an example of an application that could make this mistake, but carefully avoids it by domain separation.)
$endgroup$
– Squeamish Ossifrage
Oct 21 at 2:38
add a comment
|
Thanks for contributing an answer to Cryptography Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcrypto.stackexchange.com%2fquestions%2f74646%2fsha3-255-one-bit-less%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
14
$begingroup$
I'm intensely curious to know why you need this.
$endgroup$
– chrylis -on strike-
Sep 28 at 8:19
$begingroup$
I need to identify a set or messages by a fixed length key, that must be calculated out of the messages themselves. Clearly, a hash does the job (assuming no hash collisions). Messages can be multiple kb or few bytes. Call it premature optimization or just curiosity, I'm thinking if it's possible to avoid storing the actual message in case it's smaller than the hash, so using the hashes only for longer messages and messages themselves for the short ones. But I need one bit to identify which case it is.
$endgroup$
– ragazzojp
Sep 28 at 23:17
2
$begingroup$
I'd definitely call that premature optimization, FWIW. Writing secure code is as much about writing simple, understandable code as it is writing code that's actually secure -- because the simple, understandable code will be easy to read and understand later, and easy to fix bugs in. Adding a condition based on the first bit will be ugly no matter your language, and you're not going to save much of anything -- one SHA3 call, when SHA3 is already designed to be fast. Doubly so if you ever want to interoperate with other stuff, and since you'll need 256 more bits anyway... what's the point?
$endgroup$
– Fund Monica's Lawsuit
Sep 29 at 0:02
2
$begingroup$
Sometime the point is just the discussion, yes it's probably premature optimization, but I like to thing about problems and finding solutions. For example, answers pushed me to read the NIST documents about SHA3 and I learned new things ...
$endgroup$
– ragazzojp
Sep 29 at 0:20