codeql/python/ql/src/Security/CWE-327/WeakSensitiveDataHashing.qhelp

<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
     <overview>
          <p>
               Using a broken or weak cryptographic hash function can leave data
               vulnerable, and should not be used in security related code.
          </p>

          <p>
               A strong cryptographic hash function should be resistant to:
          </p>
          <ul>
               <li>
                    pre-image attacks: if you know a hash value <code>h(x)</code>,
                    you should not be able to easily find the input <code>x</code>.
               </li>
               <li>
                    collision attacks: if you know a hash value <code>h(x)</code>,
                    you should not be able to easily find a different input <code>y</code>
                    with the same hash value <code>h(x) = h(y)</code>.
               </li>
          </ul>
          <p>
               In cases with a limited input space, such as for passwords, the hash
               function also needs to be computationally expensive to be resistant to
               brute-force attacks. Passwords should also have an unique salt applied
               before hashing, but that is not considered by this query.
          </p>

          <p>
               As an example, both MD5 and SHA-1 are known to be vulnerable to collision attacks.
          </p>

          <p>
               Since it's OK to use a weak cryptographic hash function in a non-security
               context, this query only alerts when these are used to hash sensitive
               data (such as passwords, certificates, usernames).
          </p>

          <p>
               Use of broken or weak cryptographic algorithms that are not hashing algorithms, is
               handled by the <code>py/weak-cryptographic-algorithm</code> query.
          </p>

     </overview>
     <recommendation>

          <p>
               Ensure that you use a strong, modern cryptographic hash function:
          </p>

          <ul>
               <li>
                    such as Argon2, scrypt, bcrypt, or PBKDF2 for passwords and other data with limited input space.
               </li>
               <li>
                    such as SHA-2, or SHA-3 in other cases.
               </li>
          </ul>

     </recommendation>
     <example>

          <p>
               The following example shows two functions for checking whether the hash
               of a certificate matches a known value -- to prevent tampering.

               The first function uses MD5 that is known to be vulnerable to collision attacks.

               The second function uses SHA-256 that is a strong cryptographic hashing function.
          </p>

          <sample src="examples/weak_certificate_hashing.py" />

     </example>
     <example>
          <p>
               The following example shows two functions for hashing passwords.

               The first function uses SHA-256 to hash passwords. Although SHA-256 is a
               strong cryptographic hash function, it is not suitable for password
               hashing since it is not computationally expensive.
          </p>

          <sample src="examples/weak_password_hashing_bad.py" />


          <p>
               The second function uses Argon2 (through the <code>argon2-cffi</code>
               PyPI package), which is a strong password hashing algorithm (and
               includes a per-password salt by default).
          </p>

          <sample src="examples/weak_password_hashing_good.py" />

     </example>

     <references>
          <li>OWASP: <a href="https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html">Password Storage Cheat Sheet</a></li>
     </references>

</qhelp>