When a class subclasses multiple base classes, attribute lookup is performed from left to right amongst the base classes, following a scheme called the "method resolution order". This scheme is a solution to the diamond inheritance problem, in which several base classes override a method in a shared superclass.

Unfortunately, it means that if more than one base class defines the same attribute, the leftmost base class will effectively override the attribute of the rightmost base class, even though the leftmost base class is not a subclass of the rightmost base class. Unless the methods in question are designed for cooperative inheritance using super(), this implicit overriding may not be the desired behavior. Even if it is the desired behavior, it makes the code harder to understand and maintain.
There are a number of ways to address this issue.

In this example the class ThreadingTCPServer inherits from ThreadingMixIn and from TCPServer. However, both of these classes implement process_request, which means that ThreadingTCPServer will inherit process_request from ThreadingMixIn. Consequently, the implementation of process_request in TCPServer will be ignored, which may not be the correct behavior.

This can be fixed either by overriding the method, as shown in the class ThreadingTCPServerOverriding, or by ensuring that the functionality provided by the two base classes does not overlap, as shown in the class ThreadingTCPServerChangedHierarchy.
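The conflict described above can be sketched with minimal stand-in classes. These mirror the shapes of socketserver's ThreadingMixIn and TCPServer but are not the real library classes; the string return values are placeholders for real request handling.

```python
class TCPServer(object):
    def process_request(self, request):
        return "tcp:" + request        # would normally handle the request

class ThreadingMixIn(object):
    def process_request(self, request):
        return "threaded:" + request   # would normally spawn a thread

class ThreadingTCPServer(ThreadingMixIn, TCPServer):
    # The leftmost base wins: TCPServer.process_request is silently ignored.
    pass

class ThreadingTCPServerOverriding(ThreadingMixIn, TCPServer):
    def process_request(self, request):
        # An explicit override makes the intended combination visible.
        return ThreadingMixIn.process_request(self, request)
```

The explicit override in the second subclass behaves identically, but a reader no longer has to know the MRO rules to see which implementation wins.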
A class that defines attributes that are not present in its superclasses may need to override the __eq__() method (__ne__() should also be defined). Adding additional attributes without overriding __eq__() means that the additional attributes will not be accounted for in equality tests.

Override the __eq__ method. In the following example the ColorPoint class subclasses the Point class and adds a new attribute, but does not override the __eq__ method.
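A minimal sketch of the problem and its fix follows; the class and attribute names are illustrative.

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x and self.y == other.y

class ColorPoint(Point):
    # BAD: adds a 'color' attribute but inherits Point.__eq__, which ignores it.
    def __init__(self, x, y, color):
        Point.__init__(self, x, y)
        self.color = color

class FixedColorPoint(Point):
    def __init__(self, x, y, color):
        Point.__init__(self, x, y)
        self.color = color
    def __eq__(self, other):
        # Account for the new attribute in equality tests.
        return (isinstance(other, FixedColorPoint) and
                Point.__eq__(self, other) and self.color == other.color)
```

With the inherited __eq__, two ColorPoint instances with different colors compare equal; FixedColorPoint distinguishes them.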
In order to conform to the object model, classes that define their own equality method should also define their own hash method, or be unhashable. If the hash method is not defined, then the hash method of the superclass is used. This is unlikely to result in the expected behavior.

A class can be made unhashable by setting its __hash__ attribute to None. In Python 3, if you define a class-level equality method and omit a __hash__ method then the class is automatically marked as unhashable.

When you define an __eq__ method for a class, remember to implement a __hash__ method or set __hash__ = None.

In the following example the Point class defines an equality method but no hash method. If hash() is called on this class then the hash method defined for object is used, which is unlikely to give the required behavior. The PointUpdated class is better as it defines both an equality and a hash method. If Point were not to be used in dicts or sets, then it could be defined as UnhashablePoint below.
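The two recommended shapes can be sketched as follows; the coordinate attributes are illustrative.

```python
class PointUpdated(object):
    # Defines both __eq__ and a __hash__ consistent with it.
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):
        return isinstance(other, PointUpdated) and (self.x, self.y) == (other.x, other.y)
    def __hash__(self):
        return hash((self.x, self.y))

class UnhashablePoint(object):
    # Defines __eq__ and explicitly opts out of hashing.
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):
        return isinstance(other, UnhashablePoint) and (self.x, self.y) == (other.x, other.y)
    __hash__ = None
```

Equal PointUpdated instances hash equal, so they collapse to one entry in a set; attempting to hash an UnhashablePoint raises a TypeError.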
To comply fully with the object model this class should also define an inequality method (identified by a separate rule).

In order to conform to the object model, classes should define either no equality methods, or both an equality and an inequality method. If only one of __eq__ or __ne__ is defined, then the method from the superclass is used. This is unlikely to result in the expected behavior.

When you define an equality or an inequality method for a class, remember to implement both an __eq__ method and an __ne__ method.

In the following example the PointOriginal class defines an equality method but no inequality method. If this class is tested for inequality then a TypeError will be raised. The PointUpdated class is better as it defines both an equality and an inequality method. To comply fully with the object model this class should also define a hash method (identified by a separate rule).
A class that implements an ordering operator (__lt__, __gt__, __le__ or __ge__) should implement all four, so that ordering between two objects is consistent and obeys the usual mathematical rules. If the ordering is inconsistent with default equality, then __eq__ and __ne__ should also be implemented.

Ensure that all four ordering comparisons are implemented, as well as __eq__ and __ne__ if required. It is not necessary to implement all four comparisons by hand; the functools.total_ordering class decorator can be used.

In this example only the __lt__ operator has been implemented, which could lead to inconsistent behavior. __gt__, __le__, __ge__, and in this case __eq__ and __ne__, should be implemented.
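The decorator approach can be sketched as follows. Given __eq__ and one ordering method, functools.total_ordering supplies the remaining three; the Version class here is a hypothetical example.

```python
from functools import total_ordering

@total_ordering
class Version(object):
    # With __eq__ and __lt__ supplied, total_ordering fills in
    # __le__, __gt__ and __ge__ consistently.
    def __init__(self, major, minor):
        self.key = (major, minor)
    def __eq__(self, other):
        return self.key == other.key
    def __lt__(self, other):
        return self.key < other.key
```

All six comparisons then agree with the single underlying key comparison.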
New-style classes (classes inheriting from object) were introduced in Python 2.2, and from Python 2.3 they use the C3 linearization method to determine a method resolution order (MRO) for each class. The C3 linearization method ensures that for a class C, if a class C1 precedes a class C2 in the MRO of C, then C1 also precedes C2 in the MRO of all subclasses of C. It is possible to create a class hierarchy in which this consistency cannot be achieved, and this guarantees that a TypeError will be raised at runtime.

Use a class hierarchy that is not ambiguous.

The MRO of class X is just X, object. The program will fail when the MRO of class Y needs to be calculated, because object precedes X in the definition of Y but the opposite is true in the MRO of X.
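The failing hierarchy described above can be reproduced directly; catching the TypeError here only serves to show the failure mode.

```python
class X(object):
    pass

# The MRO of X is [X, object]. Listing object before X in Y's bases
# demands the opposite order, so no consistent MRO exists and the
# class statement itself raises a TypeError.
try:
    class Y(object, X):
        pass
    failure = None
except TypeError as err:
    failure = str(err)
```

Reversing the base list to `class Y(X, object)` gives a consistent linearization and the definition succeeds.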
When an instance of a class is initialized, the superclass state should be fully initialized before it becomes visible to the subclass. Calling methods of the subclass in the superclass' __init__ method violates this important invariant.

Do not use methods that are overridden in subclasses during the construction of an object. For simpler cases, move the initialization into the superclass' __init__ method, preventing it from being overridden; additional initialization of the subclass should be done in the __init__ method of the subclass. For more complex cases, it is advisable to use a static method or function to manage object creation.

Alternatively, avoid inheritance altogether and use composition instead.
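The invariant violation can be sketched as follows; the class and attribute names are illustrative. The superclass calls an overridable method before the subclass has set up the state that the override depends on.

```python
class Base(object):
    def __init__(self):
        # BAD: calls an overridable method before the subclass has
        # finished initializing its own state.
        self.value = self.compute()
    def compute(self):
        return 10

class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.scale = 3          # set only after Base.__init__ has run
    def compute(self):
        # When called from Base.__init__, self.scale does not exist yet;
        # getattr with a default is used here only to expose the bug
        # without raising AttributeError.
        return 10 * getattr(self, "scale", 0)
```

The value computed during construction differs from the value computed once the object is fully initialized.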
A possibly undefined attribute of self is accessed in a method. The attribute is set in another method of the class, but may be unset if the method that uses the attribute is called before the one that sets it. This may result in an AttributeError at runtime.

Ensure that all attributes are initialized in the __init__ method.
Python, unlike statically typed languages such as Java, allows complete freedom when calling methods during object destruction. However, standard object-oriented principles apply to Python classes using deep inheritance hierarchies. Therefore the developer has responsibility for ensuring that objects are properly cleaned up when there are multiple __del__ methods that need to be called.

If the __del__ method of a superclass is not called during object destruction, it is likely that resources will be leaked.

A call to the __del__ method of a superclass during object destruction may be omitted:
- when a class calls the __del__ method of the wrong class, or
- when a call to the __del__ method of one of its base classes is omitted.

Either be careful to explicitly call the __del__ method of the correct base class, or use super() throughout the inheritance hierarchy. Alternatively, refactor one or more of the classes to use composition rather than inheritance.

In this example, explicit calls to __del__ are used, but SportsCar erroneously calls Vehicle.__del__. This is fixed in FixedSportsCar by calling Car.__del__.
Python, unlike statically typed languages such as Java, allows complete freedom when calling methods during object initialization. However, standard object-oriented principles apply to Python classes using deep inheritance hierarchies. Therefore the developer has responsibility for ensuring that objects are properly initialized when there are multiple __init__ methods that need to be called.

If the __init__ method of a superclass is not called during object initialization, it is likely that the object will end up in an incorrect state.

A call to the __init__ method of a superclass during object initialization may be omitted:
- when a class calls the __init__ method of the wrong class,
- when a call to the __init__ method of one of its base classes is omitted, or
- when multiple inheritance is used and a base class does not use super() in its own __init__ method.

Either be careful to explicitly call the __init__ of the correct base class, or use super() throughout the inheritance hierarchy. Alternatively, refactor one or more of the classes to use composition rather than inheritance.

In this example, explicit calls to __init__ are used, but SportsCar erroneously calls Vehicle.__init__. This is fixed in FixedSportsCar by calling Car.__init__.
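A minimal reconstruction of the Vehicle/Car/SportsCar example follows; the attribute names are illustrative stand-ins for real initialization work.

```python
class Vehicle(object):
    def __init__(self):
        self.mobile = True

class Car(Vehicle):
    def __init__(self):
        Vehicle.__init__(self)
        self.wheels = 4

class SportsCar(Car):
    def __init__(self):
        # BAD: calls the __init__ of the wrong class, skipping
        # Car.__init__, so self.wheels is never set.
        Vehicle.__init__(self)
        self.top_speed = 200

class FixedSportsCar(Car):
    def __init__(self):
        # GOOD: calls the direct base class, which in turn
        # initializes Vehicle.
        Car.__init__(self)
        self.top_speed = 200
```

A SportsCar instance is missing its wheels attribute; the fixed class is fully initialized.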
The descriptor protocol allows user-programmable attribute access; it is what enables class methods, static methods, properties and super().

Descriptor objects are class attributes that control the behavior of instance attributes. Consequently, a single descriptor is shared by all instances of a class and should not be mutated when instance attributes are accessed.

Do not mutate the descriptor object; instead, create a new object that contains the necessary state.

In this example the descriptor class MutatingDescriptor stores a reference to obj in an attribute. In the following example, the descriptor class NonMutatingDescriptor returns a new object every time __get__ is called.
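A sketch of both descriptor classes follows; the Named class and its lambda-based methods are hypothetical, chosen only to make the shared-state bug observable.

```python
import functools

class MutatingDescriptor(object):
    # BAD: stores per-access state on the descriptor, which is shared
    # by every instance of the owning class.
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        self.obj = obj
        return self
    def __call__(self, *args):
        return self.func(self.obj, *args)

class NonMutatingDescriptor(object):
    # GOOD: returns a new bound callable on every access, leaving the
    # descriptor itself untouched.
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        return functools.partial(self.func, obj)

class Named(object):
    def __init__(self, name):
        self.name = name
    get_name_bad = MutatingDescriptor(lambda self: self.name)
    get_name_good = NonMutatingDescriptor(lambda self: self.name)
```

With the mutating version, a bound method saved from one instance silently rebinds when the attribute is accessed on another instance.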
Subclasses should not set attributes that are set in the superclass; doing so may violate invariants in the superclass.

If you did not intend to override the attribute value set in the superclass, then rename the subclass attribute. If you do want to be able to set a new value for the attribute of the superclass, then convert the superclass attribute to a property. Otherwise, the value should be passed as a parameter to the superclass' __init__ method.
Property descriptors are only supported by the new-style classes that were introduced in Python 2.2, so properties should only be used in new-style classes.

If you want to define properties in a class, then ensure that the class is a new-style class. You can convert an old-style class to a new-style class by inheriting from object.

In the following example all the classes attempt to set a property for x. However, only the third and fourth classes are new-style classes. Consequently, the x property is only available for the NewStyle and InheritNewStyle classes. If the OldStyle class were defined as inheriting from a new-style class, then the x property would be available for both the OldStyle and InheritOldStyle classes.
If a class has a close() or similar method to release resources, then it should be made a context manager. Using a context manager allows instances of the class to be used in the with statement, improving code size and readability. This is a simpler and more reliable approach than implementing just a __del__ method.

A context manager requires an __enter__ and an __exit__ method:
- the __enter__ method acquires the resource, or does nothing if the resource is acquired in the __init__ method;
- the __exit__ method releases the resource; this can just be a simple wrapper around the close method.

The following example shows how a class definition that implements __del__ can be updated to use a context manager.
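The pattern can be sketched as follows; the Resource class and its closed flag are illustrative stand-ins for a class holding a real external resource.

```python
class Resource(object):
    def __init__(self, name):
        self.name = name
        self.closed = False     # stands in for holding a real resource
    def close(self):
        self.closed = True
    # Context-manager protocol: __enter__ does nothing extra because the
    # resource is acquired in __init__; __exit__ wraps close().
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, traceback):
        self.close()
        return False            # do not suppress exceptions
```

The with statement then guarantees close() runs, whether or not the body raises.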
The ability to override the class dictionary using a __slots__ declaration is supported only by new-style classes. When you add a __slots__ declaration to an old-style class, it just creates a class attribute called '__slots__'.

If you want to override the dictionary for a class, then ensure that the class is a new-style class. You can convert an old-style class to a new-style class by inheriting from object.

In the following example the KeyedRef class is an old-style class (no inheritance). The __slots__ declaration in this class creates a class attribute called '__slots__'; the class dictionary is unaffected. The KeyedRef2 class is a new-style class, so the __slots__ declaration causes special compact attributes to be created for each name in the slots list, saving space by not creating attribute dictionaries.
Subclass shadowing occurs when an instance attribute of a superclass has the same name as a method of a subclass, or vice versa. The semantics of Python attribute lookup mean that the instance attribute set by the superclass hides the method in the subclass.

Rename the method in the subclass or rename the attribute in the superclass.

The following code includes an example of subclass shadowing. When you call Cow().milk(), an error is raised because Cow().milk is interpreted as the 'milk' attribute set in Mammal.__init__, not the 'milk' method defined within Cow. This can be fixed by changing the name of either the 'milk' attribute or the 'milk' method.
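The Cow/Mammal clash can be reconstructed as a short sketch; FixedCow shows one of the two possible renames.

```python
class Mammal(object):
    def __init__(self):
        self.milk = 0              # instance attribute

class Cow(Mammal):
    def milk(self):                # hidden by the attribute set above
        return "milking"

class FixedCow(Mammal):
    def produce_milk(self):        # renamed method avoids the clash
        return "milking"
```

On a Cow instance, the attribute wins the lookup, so calling it raises a TypeError (an int is not callable).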
The ability to access inherited methods that have been overridden in a class using super() is supported only by new-style classes; when you use the super() function in an old-style class, it fails.

If you want to access inherited methods using the super() built-in, then ensure that the class is a new-style class. You can convert an old-style class to a new-style class by inheriting from object. Alternatively, you can call the __init__ method of the superclass directly from an old-style class using: BaseClass.__init__(...).

In the following example, PythonModule is an old-style class as it inherits from another old-style class. If the _ModuleIteratorHelper class cannot be converted into a new-style class, then the call to super() must be replaced. The PythonModule2 class demonstrates the correct way to call a superclass from an old-style class.
Python, unlike statically typed languages such as Java, allows complete freedom when calling methods during object destruction. However, standard object-oriented principles apply to Python classes using deep inheritance hierarchies. Therefore the developer has responsibility for ensuring that objects are properly cleaned up when there are multiple __del__ methods that need to be called.

Calling a __del__ method more than once during object destruction risks resources being released multiple times, and the relevant __del__ method may not be designed to be called more than once.

There are a number of ways that a __del__ method may be called more than once:
- there may be more than one explicit call to the method;
- a class using multiple inheritance may directly call the __del__ methods of its base types, while one or more of those base types uses super() to pass calls down the inheritance chain.

Either be careful not to explicitly call a __del__ method more than once, or use super() throughout the inheritance hierarchy. Alternatively, refactor one or more of the classes to use composition rather than inheritance.

In the first example, explicit calls to __del__ are used, but SportsCar erroneously calls both Vehicle.__del__ and Car.__del__. This can be fixed by removing the call to Vehicle.__del__, as shown in FixedSportsCar.

In the second example, there is a mixture of explicit calls to __del__ and calls using super(). To fix this example, super() should be used throughout.
Python, unlike statically typed languages such as Java, allows complete freedom when calling methods during object initialization. However, standard object-oriented principles apply to Python classes using deep inheritance hierarchies. Therefore the developer has responsibility for ensuring that objects are properly initialized when there are multiple __init__ methods that need to be called.

Calling an __init__ method more than once during object initialization risks the object being incorrectly initialized, as it is unlikely that the relevant __init__ method is designed to be called more than once.

There are a number of ways that an __init__ method may be called more than once:
- there may be more than one explicit call to the method;
- a class using multiple inheritance may directly call the __init__ methods of its base types, while one or more of those base types uses super() to pass calls down the inheritance chain.

Either be careful not to explicitly call an __init__ method more than once, or use super() throughout the inheritance hierarchy. Alternatively, refactor one or more of the classes to use composition rather than inheritance.

In the first example, explicit calls to __init__ are used, but SportsCar erroneously calls both Vehicle.__init__ and Car.__init__. This can be fixed by removing the call to Vehicle.__init__, as shown in FixedSportsCar.

In the second example, there is a mixture of explicit calls to __init__ and calls using super(). To fix this example, super() should be used throughout.
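The double-initialization bug and the super()-based fix can be sketched as follows; a module-level list records which __init__ methods run, purely for illustration.

```python
calls = []

class Vehicle(object):
    def __init__(self):
        calls.append("Vehicle")
        super(Vehicle, self).__init__()

class Car(Vehicle):
    def __init__(self):
        calls.append("Car")
        super(Car, self).__init__()

class SportsCar(Car):
    def __init__(self):
        # BAD: Car.__init__ already reaches Vehicle via super(), so the
        # extra explicit call runs Vehicle.__init__ twice.
        Car.__init__(self)
        Vehicle.__init__(self)

class FixedSportsCar(Car):
    def __init__(self):
        # GOOD: super() used consistently runs each __init__ exactly once.
        super(FixedSportsCar, self).__init__()
```

Constructing SportsCar runs Vehicle's initializer twice; constructing FixedSportsCar runs each initializer once, in MRO order.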
A non-existent attribute of self is accessed in a method. An attribute is treated as non-existent if it is not a class attribute and is not set in any method of the class. This may result in an AttributeError at runtime.

Ensure that all attributes are initialized in the __init__ method.
If a class has only one public method (other than its __init__ method), then it should be replaced with a function.

Convert the single public method into a function. If there is an __init__ method that sets attributes on self, then rename the __init__ method, remove the self parameter, make the public method an inner function, and return that inner function. Finally, delete the class.

In this example the class has only a single method. This method does not need to be in its own class; it should be a standalone function that takes a and b as parameters.
Using a named argument whose name does not correspond to a parameter of the __init__ method of the class being instantiated will result in a TypeError at runtime.

Check for typos in the names of the arguments and fix those. If a name is clearly different, then this suggests a logical error. The change required to correct the error will depend on whether the wrong argument has been specified or whether the wrong class has been specified.
A call to the __init__ method of a class must supply an argument for each parameter that does not have a default value defined, so:
- the minimum number of arguments is the number of parameters without default values;
- the maximum number of arguments is the total number of parameters, unless the __init__ method takes a varargs (starred) parameter, in which case there is no upper limit.

If there are too few arguments, then check to see which arguments have been omitted and supply values for those. If there are too many arguments, then check to see if any have been added by mistake and remove those. Also check whether a comma has been inserted instead of an operator or a dot; for example, the code is obj,attr when it should be obj.attr.

If it is not clear which arguments are missing or surplus, then this suggests a logical error. The fix will then depend on the nature of the error.
All exception classes in Python derive from BaseException. BaseException has three important subclasses: Exception, from which all errors and normal exceptions derive; KeyboardInterrupt, which is raised when the user interrupts the program from the keyboard; and SystemExit, which is raised by the sys.exit() function to terminate the program.

Since KeyboardInterrupt and SystemExit are special, they should not be grouped together with other Exception classes. Catching BaseException, rather than its subclasses, may prevent proper handling of KeyboardInterrupt or SystemExit. It is easy to catch BaseException accidentally, as it is caught implicitly by an empty except: clause.

Handle Exception, KeyboardInterrupt and SystemExit separately. Do not use the plain except: form.

In these examples, a function application.main() is called that might raise SystemExit. In the first two functions, BaseException is caught, but this will discard KeyboardInterrupt. In the third function, call_main_program_fixed, only SystemExit is caught, leaving KeyboardInterrupt to propagate.

In these examples KeyboardInterrupt is accidentally ignored.
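A reduced sketch of the problem follows; main() here simulates the user pressing Ctrl-C instead of calling a real application entry point.

```python
def main():
    # Stands in for application.main(); simulates the user pressing Ctrl-C.
    raise KeyboardInterrupt

def call_main_bad():
    try:
        main()
    except BaseException:      # BAD: also swallows KeyboardInterrupt/SystemExit
        return "handled"

def call_main_fixed():
    try:
        main()
    except SystemExit:         # GOOD: only the exit request is handled
        return "exit handled"
```

The broad handler silently discards the interrupt; the narrow handler lets it propagate as intended.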
Ignoring exceptions that should be dealt with in some way is almost always a bad idea. The loss of information can lead to hard-to-debug errors and incomplete log files. It is even possible that ignoring an exception can cause a security vulnerability. An empty except block may be an indication that the programmer intended to handle the exception but never wrote the code to do so.

Ensure all exceptions are handled correctly.

In this example, the program keeps running with the same privileges if it fails to drop to lower privileges.
If the class specified in an except handler (within a try statement) is not a legal exception class, then it will never match a raised exception and the handler will never be executed.

Legal exception classes are classes deriving from BaseException. However, it is recommended that you only use subclasses of the builtin class Exception (which is itself a subclass of BaseException).

Ensure that the specified class is the one intended. If it is not, then replace it with the correct one; otherwise the entire except block can be deleted.
If the object raised is not a legal exception class or an instance of one, then a TypeError will be raised instead.

Legal exception classes are classes deriving from BaseException. However, it is recommended that you only use subclasses of the builtin class Exception (which is itself a subclass of BaseException).

Change the expression in the raise statement to be a legal exception.
When handling an exception, Python searches the except blocks in source-code order until it finds a matching block. An except block, except E:, specifies a class E and will match any exception that is an instance of E.

If a more general except block precedes a more specific except block, then the more general block is always executed and the more specific block is never executed. An except block, except A:, is more general than another except block, except B:, if A is a superclass of B. For example, except Exception: is more general than except Error: if Exception is a superclass of Error.

Reorganize the except blocks so that the more specific except block is defined first. Alternatively, if the more specific except block is no longer required, then it should be deleted.

In this example the except Exception: block will handle AttributeError, preventing the subsequent handler from ever executing.
NotImplemented is not an exception, but it is often mistakenly used in place of NotImplementedError. Executing raise NotImplemented or raise NotImplemented() raises a TypeError. When raise NotImplemented is used to mark code that is genuinely never called, this mistake is benign. However, should the code be called, a TypeError will be raised rather than the expected NotImplementedError, which might make debugging the issue difficult.

The correct use of NotImplemented is in implementing binary operators. Code that is not intended to be called should raise NotImplementedError.

Replace uses of raise NotImplemented with raise NotImplementedError. In the example below, the method wrong will incorrectly raise a TypeError when called. The method right will raise a NotImplementedError.
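The wrong/right pair described above can be reconstructed in a few lines:

```python
class Abstract(object):
    def wrong(self):
        raise NotImplemented        # BAD: NotImplemented is not an exception
    def right(self):
        raise NotImplementedError   # GOOD: a real exception class
```

Calling wrong() raises a TypeError (exceptions must derive from BaseException), which obscures the original intent; right() raises the exception the caller expects.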
In Python 2, if a tuple is raised then all elements but the first are ignored and only the first element is raised; if the first element is itself a tuple, then its first element is used, and so on. This is unlikely to be the intended effect and most likely indicates some sort of error. Note that the exception in raise Exception, message is not a tuple, whereas the exception in ex = Exception, message; raise ex is a tuple. In Python 3, raising a tuple is an error.

Given that all but the first element of the tuple is ignored, the tuple should be replaced with its first element to improve the clarity of the code. If the subsequent parts of the tuple were intended to form the message, then they should be passed as an argument when creating the exception.

In the following example the intended error message is mistakenly used to form a tuple. This can be fixed either by using the message to create the exception or by using the message in the raise statement, as shown below.
The function next() raises a StopIteration exception if the underlying iterator is exhausted. Normally this is fine, but in a generator it may cause problems: since StopIteration is an exception, it will propagate out of the generator, terminating it. This is unlikely to be the expected behavior and may mask errors.

This problem was considered sufficiently serious that PEP 479 was accepted to modify the handling of StopIteration in generators. Consequently, code that does not handle StopIteration properly is likely to fail in newer versions of Python.

Each call to next() in a generator should be wrapped in a try-except to explicitly handle StopIteration exceptions.

In the first example, an empty file part way through iteration will silently truncate the output, as the StopIteration exception propagates to the top level. In the second example, the StopIteration exception is explicitly handled, allowing all the files to be processed.
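The explicit-handling pattern can be sketched as follows. Lists of lines stand in for open file objects so the sketch is self-contained; the function name is illustrative.

```python
def first_lines(files):
    # 'files' stands in for a sequence of open file objects.
    for f in files:
        lines = iter(f)
        try:
            yield next(lines)
        except StopIteration:
            # An empty file: skip it rather than letting StopIteration
            # escape (under PEP 479 an escaping StopIteration becomes
            # a RuntimeError).
            continue
```

The empty "file" in the middle is skipped and the remaining files are still processed.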
super() should be called with the enclosing class as its first argument and self as its second argument.

Passing a different class may work correctly, provided the class passed is a superclass of the enclosing class and the enclosing class does not define an __init__ method. However, it may result in incorrect object initialization if the enclosing class is later subclassed using multiple inheritance.

Ensure that the first argument to super() is the enclosing class.

In this example the call to super(Vehicle, self) in Car.__init__ is incorrect, as it passes Vehicle rather than Car as the first argument to super(). As a result, super(SportsCar, self).__init__() in the SportsCar.__init__ method will not call all the __init__() methods, because the call to super(Vehicle, self).__init__() skips StatusSymbol.__init__().
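A reconstruction of this failure mode follows. The diamond hierarchy here (Car and StatusSymbol both inheriting from Vehicle) is an assumption chosen to make the skipped call observable; a module-level list records which __init__ methods run.

```python
record = []

class Vehicle(object):
    def __init__(self):
        record.append("Vehicle")
        super(Vehicle, self).__init__()

class Car(Vehicle):
    def __init__(self):
        record.append("Car")
        # BAD: first argument should be Car. Starting the MRO lookup
        # after Vehicle jumps past every class between Car and Vehicle.
        super(Vehicle, self).__init__()

class StatusSymbol(Vehicle):
    def __init__(self):
        record.append("StatusSymbol")
        super(StatusSymbol, self).__init__()

class SportsCar(Car, StatusSymbol):
    def __init__(self):
        record.append("SportsCar")
        super(SportsCar, self).__init__()

class FixedCar(Vehicle):
    def __init__(self):
        record.append("Car")
        super(FixedCar, self).__init__()   # GOOD: the enclosing class

class FixedSportsCar(FixedCar, StatusSymbol):
    def __init__(self):
        record.append("SportsCar")
        super(FixedSportsCar, self).__init__()
```

With the wrong first argument, the chain stops after Car, skipping StatusSymbol and Vehicle; with the correct argument, every class in the MRO is initialized once.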
When two constants are compared it is typically an indication of a mistake, since the Boolean value of the comparison will always be the same. In very old code this may be used to initialize True and False.

It is never good practice to compare a value with itself. If the constant behavior is indeed required, use the Boolean literals True or False, rather than encoding them obscurely as 1 == 1 or similar. If there is a mistake, ascertain the desired behavior and correct it.

In this example, old code uses 1 == 1 to initialize __builtins__.True. This code has been unnecessary in all versions of Python released since 2003 and can be deleted.
When two identical expressions are compared it is typically an indication of a mistake, since the Boolean value of the comparison will always be the same, unless the value is the floating-point value float('nan').

It is not good practice to compare a value with itself, as it makes the code hard to read and can hide errors in classes that do not correctly implement equality. To test whether a floating-point value is not-a-number, use math.isnan(). If the value may be a complex number, use cmath.isnan() instead.

In this example f == f is used to check for float('nan'). This makes the code difficult to understand, as the reader may not be familiar with this pattern.
When two identical expressions are compared it is typically an indication of a mistake, since the Boolean value of the comparison will always be the same. Often, it can indicate that self has been omitted.

It is never good practice to compare a value with itself. If self has been omitted, then insert it. If the constant behavior is indeed required, use the Boolean literals True or False, rather than encoding them obscurely as x == x or similar.
The result of certain comparisons can sometimes be inferred from their context and the results of other comparisons. This can be an indication of faulty logic and may result in dead code or infinite loops if, for example, a loop condition never changes its value.

Inspect the code to check whether the logic is correct, and consider simplifying the logical expression.

In the following (real-world) example the test obj1 < obj2 is repeated; the second test will therefore always be false, and the function _compare will only ever return 0 or -1.
A membership test, that is, a binary expression with in or not in as the operator, expects the expression to the right of the operator to be a container. As well as standard containers such as list, tuple, dict or set, a container can be an instance of any class that has a __contains__, __iter__ or __getitem__ method.

Ensure that the right-hand side of the expression is a container, or add a guard clause for other cases. For example, if the right side may be a container or None, then change if x in seq: to if seq is not None and x in seq:.

In this example the NotAContainer class has no __contains__, __iter__ or __getitem__ method. Consequently, when the line if 2 in cont: is executed, a TypeError is raised. Adding a __getitem__ method to the NotAContainer class would solve the problem.
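The NotAContainer example and the __getitem__ fix can be sketched as follows:

```python
class NotAContainer(object):
    def __init__(self, *items):
        self.items = items

class FixedContainer(NotAContainer):
    # Raising IndexError past the end is enough for 'in' to work
    # via the sequence protocol.
    def __getitem__(self, index):
        return self.items[index]
```

Membership tests on NotAContainer raise a TypeError; on FixedContainer they work as expected.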
Dictionary literals are constructed in the order given in the source. This means that if a key is duplicated, the second key-value pair will overwrite the first, as a dictionary can only have one value per key.

Check for typos to make sure the keys really are supposed to be the same. If they are, then decide which value is wanted and delete the other one.

This example will output "c" because the mapping from 2 to "b" is overwritten by the mapping from 2 to "c". The programmer may have meant to map 3 to "c" instead.
When you compare an object to None, use is rather than ==. None is a singleton object; comparing with == invokes the __eq__ method on the object in question, which may be slower than an identity comparison. Comparing to None using the is operator is also easier for other programmers to read.

Replace == with is.

The filter2 function is likely to be more efficient than the filter1 function because it uses an identity comparison.
If a format string includes conversion specifiers of the form %(name)s, then the right-hand side of the operation must be a mapping. A string is a format string if it appears on the left of a modulo (%) operator, the right-hand side being the value to be formatted. If the right-hand side is not a mapping then a TypeError will be raised. Mappings are usually dicts, but can be any type that implements the mapping protocol.

Change the format to match the arguments and ensure that the right-hand side is always a mapping.

In the following example the right-hand side of the formatting operation can be a tuple, which is not a mapping. To fix this example, ensure that args is a mapping when unlikely_condition occurs.
The __del__ special method is designed to be called by the Python virtual machine when an object is no longer reachable, but before it is destroyed. Calling a __del__ method explicitly may cause an object to enter an unsafe state.

If explicit clean-up of an object is required, a close() method should be called or, better still, the use of the object should be wrapped in a with statement.

In the first example, rather than closing the zip file in a conventional manner, the programmer has called __del__. A safer alternative is shown in the second example.
A formatting expression, that is, an expression of the form the_format.format(args) or format(the_format, args), can use explicitly numbered fields, like {1}, or implicitly numbered fields, such as {}, but it cannot use both. Doing so will raise a ValueError.

Use either explicitly numbered fields or implicitly numbered fields, but be consistent.

In the following example the format string uses both implicit, {}, and explicit, {1}, numbering for fields, which is illegal.
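The failure and both consistent alternatives can be shown in a few lines; the string values are arbitrary.

```python
try:
    "{} and {1}".format("spam", "eggs")   # mixes implicit and explicit fields
    message = None
except ValueError as err:
    message = str(err)

implicit = "{} and {}".format("spam", "eggs")     # all implicit: fine
explicit = "{0} and {1}".format("spam", "eggs")   # all explicit: fine
```

Either consistent form produces the same result; only the mixture raises.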
A formatting expression, that is, an expression of the form the_format.format(args) or format(the_format, args), can have any number of arguments, provided that there are enough to match the format. However, surplus arguments are redundant and clutter the code, making it harder to read. It is also possible that surplus arguments indicate a mistake in the format string.

Check that the format string is correct and then remove any surplus arguments.

In the following example there are three arguments for the call to the str.format() method, but the format string only requires two. The third argument should be deleted.
A formatting expression, that is, an expression of the form the_format.format(args) or format(the_format, args), can have keyword arguments of any name, as long as all the required names are provided. However, surplus keyword arguments, those with names that are not in the format, are redundant. These surplus arguments clutter the code, making it harder to read.

It is also possible that surplus keyword arguments indicate a mistake in the format string.

Check that the format string is correct, and then remove any surplus keyword arguments.

In the following example, the comment indicates that the chips keyword argument is no longer required and should be deleted.
A formatting expression, that is, an expression of the form the_format.format(args) or format(the_format, args), can use named fields. If it does, then keyword arguments must be supplied for all named fields. If any of the keyword arguments are missing, then a KeyError will be raised.

Change the format to match the arguments, and ensure that the arguments have the correct names.

In the following example, if unlikely_condition() is true, then a KeyError will be raised, as the keyword argument eggs is missing. Adding a keyword argument named eggs would fix this.
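A minimal sketch of the missing-name failure and its fix (the field names spam and eggs follow the example above):

```python
template = "{spam} and {eggs}"

# No value is supplied for the named field {eggs}, so KeyError is raised.
try:
    template.format(spam="spam")
    raised = False
except KeyError:
    raised = True

# Supplying every named field fixes the call.
fixed = template.format(spam="spam", eggs="eggs")
```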
A formatting expression, that is, an expression of the form the_format.format(args) or format(the_format, args), must have sufficient arguments to match the format. Otherwise, an IndexError will be raised.

Either change the format to match the arguments, or ensure that there are sufficient arguments.

In the following example, only 2 arguments are provided for the call to the str.format method, which is insufficient for the format string used. To fix this, a third argument should be provided on line 4.
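A minimal sketch of the failure (the format string here is illustrative):

```python
template = "{}, {} and {}"

# Only two arguments for three fields: the third field cannot be
# matched, so an IndexError is raised.
try:
    template.format("a", "b")
    raised = False
except IndexError:
    raised = True

# Supplying an argument for every field fixes the call.
fixed = template.format("a", "b", "c")
```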
If an object is used as a key in a dictionary or as a member of a set, then it must be hashable; that is, it must define a __hash__ method. All built-in immutable types are hashable, but mutable ones are not. Common hashable types include all numbers, strings (both unicode and bytes) and tuples. Common unhashable types include list, dict and set.

In order to store a key in a dict or set, a hash value is needed. To determine this value, the built-in function hash() is called, which in turn calls the __hash__ method on the object. If the object's class does not define the __hash__ method, then a TypeError will be raised.

Since this problem usually indicates a logical error, it is not possible to give a general recipe for fixing it. Mutable collections can be converted into immutable equivalents where appropriate. For example, sets can be hashed by converting any instances of set into frozenset instances.

Lists are not hashable. In this example, an attempt is made to use a list as a key in a mapping, which will fail with a TypeError.
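A minimal sketch of the failure and the immutable-conversion workaround (the key values are illustrative):

```python
key = ["spam", "eggs"]

# A list is mutable, hence unhashable: using it as a dict key fails.
try:
    {key: 1}
    raised = False
except TypeError:
    raised = True

# Convert mutable collections to immutable equivalents before hashing.
lookup = {tuple(key): 1}
members = {frozenset({"a", "b"}): "pair"}
```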
When you compare two values using the is or is not operator, it is the object identities of the two values that are tested, rather than their equality. If the class of either of the values in the comparison redefines equality, then the is operator may return False even though the objects compare as equal. Equality is defined by the __eq__ or, in Python 2, __cmp__ method. To compare two objects for equality, use the == or != operator instead.

When you want to compare the values of two objects, use the comparison operator == or != in place of is or is not. If the uniqueness property or performance are important, then use an object that does not redefine equality.

In the first line of the following example, the programmer tests the value of value against DEFAULT using the is operator. Unfortunately, this may fail when the function is called with the string "default". To function correctly, change the expression value is DEFAULT to value == DEFAULT. Alternatively, if the uniqueness property is desirable, then change the definition of DEFAULT to either of the alternatives below.
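A minimal sketch of the two comparison styles (the names check_broken, check_fixed and SENTINEL are hypothetical; whether the identity test happens to succeed for equal strings is a CPython implementation detail, not a guarantee):

```python
DEFAULT = "default"

def check_broken(value):
    # Identity test: may be False for an equal but distinct string object.
    return value is DEFAULT

def check_fixed(value):
    # Value test: always True for any string equal to "default".
    return value == DEFAULT

# Alternatively, a unique sentinel makes an identity test reliable:
SENTINEL = object()

# A string equal to DEFAULT, built at runtime as a separate object.
candidate = "".join(["def", "ault"])
```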
If an object is called, as in obj(), then that object must be a callable, or a TypeError will be raised. A callable object is any object whose class defines the __call__ special method. Callable objects include functions, methods and classes. The callable(object) built-in function determines whether an object is callable or not.

When the Python interpreter attempts to evaluate a call such as func(arg), it will invoke the __call__ special method on func. Thus, func(arg) is roughly equivalent to type(func).__call__(func, arg), which means that the class must define the attribute __call__; merely adding it to the instance is not sufficient.

Since this problem usually indicates a logical error, it is not possible to give a general recipe for fixing it.

Lists are not callable. In this example, an attempt is made to call a list, which will fail with a TypeError.
When you compare two values using the is or is not operator, it is the object identities of the two values that are tested, rather than their equality. If the class of either of the values in the comparison redefines equality, then the is operator may return False even though the objects compare as equal.

CPython interns a number of commonly used values, such as small integers, which means that using is instead of == will appear to work correctly. However, this might not be portable to other implementations such as PyPy, IronPython, Jython or MicroPython.

When you want to compare the values of two literals, use the comparison operator == or != in place of is or is not. If the uniqueness property or performance are important, then use an object that does not redefine equality.

The function equals_to_twelve() relies on CPython interning small integers. To function correctly for all implementations, change the expression x is CONSTANT to x == CONSTANT.
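A portable version of the check described above can be sketched as follows (the constant value is illustrative):

```python
CONSTANT = 12

def equals_to_twelve(x):
    # == compares values, which is portable across implementations.
    # `x is CONSTANT` would rely on CPython's small-integer caching,
    # an implementation detail rather than a language guarantee.
    return x == CONSTANT
```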
The meaning of the \b escape sequence inside a regular expression depends on its syntactic context: inside a character class, it matches the backspace character; outside of a character class, it matches a word boundary. This context dependency makes regular expressions hard to read, so the \b escape sequence should not be used inside character classes.

Replace \b in character classes with the semantically identical escape sequence \x08.

In the following example, the regular expression contains two uses of \b: in the first case, it matches a word boundary; in the second case, it matches a backspace character.

You can make the regular expression easier for other developers to interpret by rewriting it as r"\b[\t\x08]".
Character classes in regular expressions represent sets of characters, so there is no need to specify the same character twice in one character class. Duplicate characters in character classes are at best useless, and may even indicate a latent bug.

Determine whether a character is simply duplicated or whether the character class was in fact meant as a group. If it is just a duplicate, then remove the duplicate character. If it was supposed to be a group, then replace the square brackets with parentheses.

In the following example, the character class [password|pwd] contains two instances each of the characters d, p, s, and w. The programmer most likely meant to write (password|pwd) (a pattern that matches either the string "password" or the string "pwd"), and accidentally mistyped the enclosing brackets.

To fix this problem, the regular expression should be rewritten as r"(password|pwd)".
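The difference between the character class and the intended group can be sketched as follows:

```python
import re

# A character class: matches any ONE character from {a,d,o,p,r,s,w,|}.
wrong = re.compile(r"[password|pwd]")

# A group with alternation: matches "password" or "pwd".
fixed = re.compile(r"(password|pwd)")

in_class = wrong.match("w")     # matches: 'w' is in the character class
whole = fixed.match("pwd")      # matches the whole alternative "pwd"
```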
One of the problems with using regular expressions is that almost any sequence of characters is a valid pattern. This means that it is easy to omit a necessary character and still have a valid regular expression. Omitting a character in a named capturing group is a specific case which can dramatically change the meaning of a regular expression.

Examine the regular expression to find and correct any typos.

In the following example, the regular expression for matcher, r"(P<name>[\w]+)", is missing a "?" and will match only strings of letters that start with "P<name>", instead of matching any sequence of letters and placing the result in a named group. The fixed version, fixed_matcher, includes the "?" and will work as expected.
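A minimal sketch of the broken and fixed patterns:

```python
import re

# Missing "?": this matches the literal text "P<name>" followed by
# word characters, which is almost certainly not what was intended.
broken = re.compile(r"(P<name>[\w]+)")

# With "?", this is a named capturing group matching word characters.
fixed = re.compile(r"(?P<name>[\w]+)")

no_match = broken.match("hello")   # None: no literal "P<name>" prefix
m = fixed.match("hello")
```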
The caret character ^ anchors a regular expression to the beginning of the input, or (for multi-line regular expressions) to the beginning of a line. If it is preceded by a pattern that must match a non-empty sequence of (non-newline) input characters, then the entire regular expression cannot match anything.

Examine the regular expression to find and correct any typos.

In the following example, the regular expression r"\[^.]*\.css" cannot match any string, since it contains a caret assertion preceded by an escape sequence that matches an opening bracket.

In the second regular expression, r"[^.]*\.css", the caret is part of a character class, and will not match the start of the string.
A dollar assertion $ in a regular expression only matches at the end of the input, or (for multi-line regular expressions) at the end of a line. If it is followed by a pattern that must match a non-empty sequence of (non-newline) input characters, it cannot possibly match, rendering the entire regular expression unmatchable.

Examine the regular expression to find and correct any typos.

In the following example, the regular expression r"\.\(\w+$\)" cannot match any string, since it contains a dollar assertion followed by an escape sequence that matches a closing parenthesis.

The second regular expression, r"\.\(\w+\)$", has the dollar at the end and will work as expected.
In Python 2, the result of dividing two integers is silently truncated to an integer. This may lead to unexpected behavior.

If the division should never be truncated, add from __future__ import division to the beginning of the file. If the division should always be truncated, replace the division operator / with the truncated division operator //.

The first example shows a function for calculating the average of a sequence of numbers. When the function runs under Python 2, and the sequence contains only integers, an incorrect result may be returned because the result is truncated. The second example corrects this error by following the recommendation listed above.
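The two behaviors can be sketched in a way that runs identically under Python 3 and under Python 2 with the future import (the function names are illustrative):

```python
from __future__ import division

def average(values):
    # True division: the default in Python 3, enabled in Python 2
    # by the __future__ import above.
    return sum(values) / len(values)

def average_truncated(values):
    # Truncated division, explicit in both Python 2 and Python 3.
    return sum(values) // len(values)
```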
When two string literals abut each other, the Python interpreter implicitly concatenates them into a single string. On occasion this can be useful, but it is more commonly misleading or incorrect.

If the concatenation is deliberate, then use + to join the strings. This has no runtime overhead, and makes the intention clear.

In the first function below, unclear, implicit string concatenation is used twice: once deliberately and once by accident. In the second function, clarified, the first concatenation is made explicit and the second is removed.
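A minimal sketch of the accidental and deliberate cases (the list contents are illustrative):

```python
# Accidental: a missing comma silently joins two list elements,
# so this list has two elements, not three.
unclear = ["first", "second" "third"]

# Deliberate: explicit + makes the intent obvious. Joining two
# literals with + is folded at compile time, so there is no
# runtime cost.
clarified = "a long message, " + "split across two literals"
```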
A lambda that simply calls a function, without modifying any of its parameters, is unnecessary. Python functions are first-class objects and can be passed around in the same way as the resulting lambda.

Remove the lambda and use the function directly.

In this example, a lambda is used unnecessarily to pass a method as an argument to call_with_x_squared. This is not necessary, as methods can be passed directly; they behave as callable objects.
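A minimal sketch (the names call_with_x_squared and is_even are illustrative stand-ins for the original example's functions):

```python
def call_with_x_squared(x, func):
    return func(x * x)

def is_even(n):
    return n % 2 == 0

# Unnecessary: the lambda only forwards its argument.
redundant = call_with_x_squared(3, lambda n: is_even(n))

# Better: pass the function directly; it is already callable.
direct = call_with_x_squared(3, is_even)
```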
A format string, that is, the string on the left hand side of an expression like fmt % arguments, must consist of legal conversion specifiers. Otherwise, a ValueError will be raised.

Choose a legal conversion specifier.

In format_as_tuple_incorrect, "t" is not a legal conversion specifier.
The apply function is deprecated, and makes code harder to read, as most Python programmers will not be familiar with it (it has been deprecated since 2003).

Replace apply(function, args) with function(*args).

Replace apply(function, args, keywords) with function(*args, **keywords).
In Python 2, a call to the input() function, input(prompt), is equivalent to eval(raw_input(prompt)). Evaluating user input without any checking can be a serious security flaw.

Get user input with raw_input(prompt) and then validate that input before evaluating it. If the expected input is a number or string, then ast.literal_eval() can always be used safely.
Using a named argument whose name does not correspond to a parameter of the called function (or method) will result in a TypeError at runtime.

Check for typos in the names of the arguments and fix those. If the name is clearly different, then this suggests a logical error. The change required to correct the error will depend on whether the wrong argument has been specified or whether the wrong function (or method) has been specified.
A formatting expression, that is, an expression of the form fmt % arguments, must have the correct number of arguments on the right hand side of the expression. Otherwise, a TypeError will be raised.

Change the format to match the arguments, and ensure that the right hand argument always has the correct number of elements.

In the following example, the right hand side of the formatting operation can be of length 2, which does not match the format string.
A function call must supply an argument for each parameter that does not have a default value defined, so:

If there are too few arguments, then check to see which arguments have been omitted and supply values for those.

If there are too many arguments, then check to see if any have been added by mistake and remove those. Also check whether a comma has been inserted instead of an operator or a dot. For example, the code is obj,attr when it should be obj.attr.

If it is not clear which are the missing or surplus arguments, then this suggests a logical error. The fix will then depend on the nature of the error.
When a function contains both explicit returns (return value) and implicit returns (where code falls off the end of a function), this often indicates that a return statement has been forgotten. It is best to return an explicit value even when returning None, because this makes it easier for other developers to read your code.

Add an explicit return at the end of the function.

In the check_state1 function, the developer probably did intend to use an implicit return value of None, as this equates to False. However, the function in check_state2 is easier to read.
The __getslice__, __setslice__ and __delslice__ methods have been deprecated since Python 2.0. In general, no class should implement these methods.

The only exceptions to this rule are classes that inherit from list and override __getitem__, __setitem__ or __delitem__. Since list implements the slicing methods, any class inheriting from list must implement the slicing methods to ensure correct behavior of __getitem__, __setitem__ and __delitem__. These exceptions to the rule will not be treated as violations.

Delete the slicing method. Any functionality should be moved to the equivalent __xxxitem__ method:
- __getslice__ should be replaced with __getitem__
- __setslice__ should be replaced with __setitem__
- __delslice__ should be replaced with __delitem__

The __init__ method of a class is used to initialize new objects, not create them. As such, it should not return any value. Returning None is correct in the sense that no runtime error will occur, but it suggests that the returned value is meaningful, which it is not.
Convert the return expr statement to a plain return statement, or omit it altogether if it is at the end of the method.

In this example, the __init__ method attempts to return the newly created object. This is an error, and the return statement should be removed.
User-defined classes interact with the Python virtual machine via special methods (also called "magic methods"). For example, for a class to support addition it must implement the __add__ and __radd__ special methods. When the expression a + b is evaluated, the Python virtual machine will call type(a).__add__(a, b), and if that is not implemented it will call type(b).__radd__(b, a).

Since the virtual machine calls these special methods for common expressions, users of the class will expect these operations to raise standard exceptions. For example, users would expect that the expression a.b might raise an AttributeError if the object a does not have an attribute b. If a KeyError were raised instead, then this would be unexpected and might break code that expected an AttributeError, but not a KeyError.

Therefore, if a method is unable to perform the expected operation, then its response should conform to the standard protocol, described below:
- a.b: Raise AttributeError.
- a + b: Do not raise an exception; return NotImplemented instead.
- a[b]: Raise KeyError.
- hash(a): Use __hash__ = None to indicate that an object is unhashable.
- a != b: Never raise an exception; always return True or False.
- a < b: Raise a TypeError if the objects cannot be ordered.
- In general, raise TypeError to indicate that an operation is unsupported.

If the method is meant to be abstract, then declare it so using the @abstractmethod decorator. Otherwise, either remove the method or ensure that the method raises an exception of the correct type.
This example shows two unhashable classes. The first class is unhashable in a non-standard way, which may cause maintenance problems. The second, corrected, class uses the standard idiom for unhashable classes.

In this example, the first class is implicitly abstract; the __add__ method is unimplemented, presumably with the expectation that it will be implemented by sub-classes. The second class makes this explicit with an @abstractmethod decoration on the unimplemented __add__ method.

In this last example, the first class implements a collection backed by a file store. However, should an IOError be raised in __getitem__, it will propagate to the caller. The second class handles any IOError by re-raising it as a KeyError, which is the standard exception for the __getitem__ method.
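Two of these patterns can be sketched as follows (the class names Unhashable and FileBackedMapping are hypothetical; a real file-backed class would perform I/O where this sketch uses a plain dict):

```python
class Unhashable:
    # Standard idiom: mark the class as unhashable, so hash()
    # raises TypeError rather than some non-standard exception.
    __hash__ = None

class FileBackedMapping:
    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        try:
            return self._data[key]
        except OSError:
            # Translate low-level I/O failures into the exception
            # that callers of __getitem__ expect.
            raise KeyError(key)

try:
    hash(Unhashable())
    hashable = True
except TypeError:
    hashable = False

value = FileBackedMapping({"a": 1})["a"]
```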
There is a call to the overridden method, and potentially the overriding method, with arguments that are not legal for the overriding method. This will cause an error if the overriding method is called, and is a violation of the Liskov substitution principle.

Ensure that the overriding method accepts all the arguments that are legal for the overridden method.

In this example there is a mismatch between the legal parameters for the base class method (self, source, filename, symbol) and the extension method (self, source). The extension method can be used to override the base method as long as values are not specified for the filename and (optional) symbol parameters. If the extension method were passed the additional parameters accepted by the base method, then an error would occur.

The extension method should be updated to support the filename and symbol parameters supported by the overridden method.
There is a call to the overriding method, and potentially the overridden method, with arguments that are not legal for the overridden method. This will cause an error if the overridden method is called, and is a violation of the Liskov substitution principle.

Ensure that the overridden method accepts all the arguments that are legal for the overriding method(s).

In this example there is a mismatch between the legal parameters for the base class method (self, source, filename) and the extension method (self, source). Since there is a call that uses the signature of the extension method, it can be inferred that the base signature is erroneous and should be updated to match that of the extension method.

The base method should be updated to either remove the filename parameter, or add a default value for it.
The __init__ method of a class is used to initialize new objects, not create them. As such, it should not return any value. Including a yield expression in the method turns it into a generator method; calling it will return a generator, resulting in a runtime error.

The presence of a yield expression in an __init__ method suggests a logical error, so it is not possible to suggest a general fix.

In this example, the __init__ method contains a yield expression. This is not logical in the context of an initializer.
The __iter__ method of a class should return an iterator.

Iteration in Python relies on this behavior, and attempting to iterate over an instance of a class with an incorrect __iter__ method will raise a TypeError.

Make __iter__ return a new iterator, either as an instance of a separate class or as a generator.

In this example, the MyRange class's __iter__ method does not return an iterator. This will cause the program to fail when anyone attempts to use the instance in a for loop or an in statement.

The fixed version implements the __iter__ method as a generator function.
The __iter__ method of an iterator should return self. This is important so that iterators can be used as sequences in any context that expects a sequence. Doing so requires that __iter__ be idempotent on iterators.

Note that sequences and mappings should return a new iterator; it is just the returned iterator that must obey this constraint.

Make __iter__ return self unless the class should not be an iterator, in which case rename the next (Python 2) or __next__ (Python 3) method to something else.

In this example, the Counter class's __iter__ method does not return self (or even an iterator). This will cause the program to fail when anyone attempts to use the iterator in a for loop or an in statement.
The default value of a parameter is computed once, when the function is created, not for every invocation. The "pre-computed" value is then used for every subsequent call to the function. Consequently, if you modify the default value for a parameter, this "modified" default value is used for the parameter in future calls to the function. This means that the function may not behave as expected in future calls, and also makes the function more difficult to understand.

If a parameter has a default value, do not modify the default value. When you use a mutable object as a default value, you should use a placeholder value instead of modifying the default value. This is a particular problem when you work with lists and dictionaries, but there are standard methods of avoiding modifying the default parameter (see References).

In the following example, the parameter is defined with a default value of an empty list. Other commands in the function then append values to the list. The next time the function is called, the list will contain values, which may not have been intended.

The recommended workaround is to use a placeholder value. That is, define the function with a default of default=None, check if the parameter is None, and then set the parameter to a list.
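The problem and the placeholder workaround can be sketched as follows (the function names are illustrative):

```python
def append_broken(item, items=[]):
    # Mutates the single list created when the function was defined.
    items.append(item)
    return items

def append_fixed(item, items=None):
    if items is None:
        items = []          # placeholder: build a fresh list per call
    items.append(item)
    return items

append_broken("a")
broken = append_broken("b")   # the shared default has accumulated both
fixed = append_fixed("b")     # a fresh list each call
```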
The first argument of a class method, a __new__ method, or any metaclass method should be called cls. This makes the purpose of the argument clear to other developers.

Change the name of the first argument to cls, as recommended by the style guidelines in PEP 8.

In the example, the first parameter to make() is klass, which should be changed to cls for ease of comprehension.
Normal methods should have at least one parameter, and the first parameter should be called self. This makes the purpose of the parameter clear to other developers.

If there is at least one parameter, then change the name of the first parameter to self, as recommended by the style guidelines in PEP 8. If there are no parameters, then it cannot be a normal method. It may need to be marked as a staticmethod, or it could be moved out of the class as a normal function.

The following methods can both be used to assign values to variables in a point object. The second method makes the association clearer, because the self parameter is used.
The __del__ method exists to release any resources held by an object when that object is deleted. It is called only by the garbage collector, which may call it after an indefinite delay, or never.

Consequently, the __del__ method should not be relied on to release resources, such as file descriptors. Rather, these resources should be released explicitly.

The existence of a complex __del__ method suggests that this is the main or only way to release the resources associated with the object.

In order to ensure correct cleanup of the object, add an explicit close(), or similar, method. Possibly make the object a context manager. The __del__ method should just call close().

The first example below shows a class which relies on __del__ to release resources. The second example shows an improved version of the class where __del__ simply calls close().
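The improved pattern can be sketched as follows (the class name Resource and its closed flag are hypothetical stand-ins for real resource handling):

```python
class Resource:
    def __init__(self):
        self.closed = False

    def close(self):
        # Deterministic, explicit cleanup; safe to call more than once.
        self.closed = True

    def __del__(self):
        # Keep __del__ trivial: just delegate to close().
        self.close()

    # Making the object a context manager guarantees cleanup.
    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()

with Resource() as r:
    pass                    # use the resource here
```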
A common pattern for functions returning multiple values is to return a single tuple containing those values. If the function has multiple return points, care must be taken to ensure that the tuples returned have the same length.

Ensure that the function returns tuples of the same length at every return point.

In this example, the sum_length_product1 function simultaneously calculates the sum, length, and product of the values in the given list. For empty lists, however, the returned tuple only contains the sum and length of the list. In sum_length_product2 this error has been corrected.
When a function returns a non-trivial value, that value should not be ignored. Doing so may result in errors being ignored or information being thrown away.

A return value is considered to be trivial if it is None or it is a parameter (parameters, usually self, are often returned to assist with method chaining, but can be ignored). A return value is also assumed to be trivial if it is ignored for 75% or more of calls.

Act upon all non-trivial return values, either propagating each value or recording it. If a return value should be ignored, then ensure that it is ignored consistently.

If you have access to the source code of the called function, then consider modifying it so that it does not return pointless values.

In the ignore_error function, the error condition is ignored. Ideally the Resource.initialize() function would raise an exception if it failed, but as it does not, the caller must deal with the error. The do_not_ignore_error function checks the error condition and raises an exception if Resource.initialize() fails.
One or more parameters that are legal for an overridden method are not legal for an overriding method. This will cause an error when the overriding method is called with a number of arguments that is legal for the overridden method. This violates the Liskov substitution principle.

Ensure that the overriding method accepts all the parameters that are legal for the overridden method.

In this example there is a mismatch between the legal parameters for the base class method (self, source, filename, symbol) and the extension method (self, source). The extension method can be used to override the base method as long as values are not specified for the filename and symbol parameters. If the extension method were passed the additional parameters accepted by the base method, then an error would occur.

The extension method should be updated to support the filename and symbol parameters supported by the overridden method.
Special methods (sometimes also called magic methods) are how user-defined classes interact with the Python virtual machine. For example, for a class to support addition it must implement the __add__ and __radd__ special methods. When the expression a + b is evaluated, the Python virtual machine will call type(a).__add__(a, b), and if that is not implemented it will call type(b).__radd__(b, a). Since these special methods are always called by the virtual machine with a fixed number of arguments, if the method is implemented with a different number of parameters it will fail at runtime with a TypeError.

Ensure that the method has the correct number of parameters.

In the example, the __str__ method has an extra parameter. This means that if str(p) is called when p is a Point, then it will fail with a TypeError.
All functions in Python return a value. If a function has no return statements, or none of the return statements return a value, then the function will return None. However, this value has no meaning and should be ignored. Using the return value of such a 'procedure' is confusing to the reader, as it suggests that the value is significant.

Do not use the return value of a procedure: replace x = proc() with proc(), and replace any use of the value with None.

In this example, the my_print function is a procedure, as it returns no value of any meaning. Using the return value is misleading in subsequent code.
A cyclic import is an import which imports another module, and that module imports (possibly indirectly) the module which contains the import statement.

Cyclic imports indicate that two modules are circularly dependent. This means that the modules cannot be tested independently, and it makes it harder to understand the architecture of the system.

The cycle may be broken by removing any one import. If only one function or method requires the import, then consider moving that function or method to the other module and deleting the import. If the two modules are more intimately connected, then move the inter-dependent parts into a third module and have both of the original modules import that.
A module is deprecated when it cannot or will not be maintained indefinitely in the standard library. Deprecated modules may not receive security fixes or other important updates. See PEP 4 for a list of all deprecated modules.

Do not import the deprecated module. Replace uses of it with uses of a better maintained module.
Encoding errors prevent a module from being evaluated, and thus imported. An attempt to import a module with an invalid encoding will fail: a SyntaxError will be raised. Note that in Python 2, the default encoding is ASCII.

The existence of an encoding error in a module may suggest other problems as well. Either the module is never imported in practice and could be deleted, or a try statement around the import is mistakenly discarding the SyntaxError.

Fixing the encoding error is the obvious fix. However, it is worth investigating why a module containing an encoding error was able to persist, and to address that problem as well.

If a different encoding should be used for the file, specify it explicitly by putting an encoding specification at the top of the file. For instance, to specify UTF-8 encoding, add the line # coding=utf-8.
Explicitly importing an attribute from a module into the current namespace means that the value of that attribute will not be updated if the value in the original module changes.

This can mean that changes in global state are not observed locally, which may lead to inconsistencies and possible errors.

Instead of using from module import attr, simply import the module using import module, and replace all uses of attr with module.attr.

In the first of the two modules shown below, from sys import stdout is used to import the stdout attribute, rather than using import sys to import the module. Then stdout is used in the main() function.

In the second module, below, a function, redirect_to_file, is defined to collect the output from sys.stdout and save it to a file. However, redirect_to_file will not work correctly when passed the main() function. This is because the main() function will not see the change to sys.stdout, as it uses its own version of stdout that was bound when the module was loaded.

The problem can be fixed by rewriting the first module to import the sys module and write to sys.stdout, as shown below.
This is defined as an error in PyFlakes.
+ + Using from xxx import * makes it difficult to determine what has
+been defined by the import statement. This may hide errors and introduce
+unexpected dependencies.
+Use explicit imports. For example from xxx import a, b, c
+
Importing a module twice using the import xxx and
+from xxx import yyy is confusing.
+
Remove the from xxx import yyy statement.
+Add yyy = xxx.yyy if required.
Code is easier to read when each import statement is defined on a separate line. +
+ +Update the code so that each import is defined on a separate line. PEP8 notes that it is +acceptable to import several names from a single module in one statement, for example from subprocess import Popen, PIPE.
+ +The import statement:
+should be changed to:
+There is no need for a module to import itself. A module importing itself may lead to errors as +the module may be in an incomplete state when imported by itself. +
+ +Remove the import statement.
+Convert all expressions of the form mod.name where "mod" is the name
+of the current module to name.
In this example the module, ModuleImportsItself imports itself and has an expression
+referencing the module it is in as well.
The import can be removed and the reference can be corrected.
+A cyclic import is an import of a module
+that itself imports (possibly indirectly) the module containing the
+import statement.
+If all imports in a cyclic import occur at module level, then a module will be
+imported when it is part way through its initialization. This may result in
+surprising errors, as parts of the module being imported may not yet exist.
+
In addition to the possible errors, cyclic imports indicate that two modules +are circularly dependent. This means that the modules cannot be tested +independently, and it makes it harder to understand the architecture of the system. +
+ +The cycle may be broken by removing any one import. If only one function or +method requires the import, then consider moving that to the other module and +deleting the import. If the two modules are more intimately connected, then move +the inter-dependent parts into a third module and have both the original modules +import that. +
+ + +Importing the same module more than once has no effect as each module is only loaded once. It also +confuses readers of the code.
+ +Remove the second import.
+ + Syntax errors prevent a module being evaluated and thus imported.
+An attempt to import a module with invalid syntax will fail; a SyntaxError will be raised.
A common cause of syntax errors is the difference in syntax between Python 2 +and Python 3. In particular, a syntax error may be reported if a Python 3 file is +assumed to be compatible with Python 2 (or vice versa). Explicitly specifying +the expected Python version can help prevent this. +
+ +The existence of a syntax error in a module may suggest other problems as well.
+Either the module is never imported in practice and could be deleted or a
+try statement around the import is mistakenly discarding the SyntaxError.
+
Fixing the syntax error is the obvious remedy. +However, it is also worth investigating why a module containing a syntax error +was able to persist, and addressing that problem as well. +
+If you suspect that the syntax error is caused by the analysis using the
+wrong version of Python, consider specifying the version explicitly. For
+LGTM.com, you can customize extraction using an lgtm.yml file as
+described here.
+
When you import a module using from xxx import * all public names defined in the
+module are imported and bound in the local namespace of the import statement. The
+public names are determined by checking the __all__ variable for the module. If
+__all__ is not defined then all names within the module that do not start with an underscore
+character are imported. This pollutes the current namespace with names that are not part of the
+public API for the module.
+
There are two ways to address this problem:
+__all__
+ to restrict the names to be importedThe following simple example shows how __all__ controls the public names for the
+module finance.
If the finance module did not include a definition of __all__, then you
+could replace from finance import * with from finance import tax1, tax2.
+
A module is imported (using the import statement) but that module
+is never used. This creates a dependency that does not need to exist and makes the code
+more difficult to read.
+
Delete the import statement.
+ +
+Remove the commented-out code, or reinstate it if necessary. If you want to include a snippet
+of example code in a comment, consider adding an @example tag or enclosing the code
+in a code or pre element.
+
+In the following example, a print statement, originally used
+for debugging, is left in the code, but commented out. It should be removed altogether.
+
+Octal literals starting with 0 are easily misread as a decimal, +particularly by those programmers who do not have a C or Java background. +
+ ++The new literal syntax for non-decimal numbers is more distinct and is thus less likely to be misunderstood. +
+ ++Use the 0oXXX form instead of the 0XXX form. Alternatively use binary or hexadecimal format if that would be clearer. +
+ +A comment that includes the word TODO often marks a part of
+the code that is incomplete or broken, or highlights ambiguities in the
+software's specification.
For example, this list of comments is typical of those found in real +programs:
+ +TODO: move this code somewhere elseTODO: find a better solution to this workaroundTODO: test thisIt is very important that TODO comments are
+not just removed from the code. Each of them must be addressed in some way.
Simpler comments can usually be immediately addressed by fixing the code, +adding a test, doing some refactoring, or clarifying the intended behavior of +a feature.
+ +In contrast, larger issues may require discussion, and a significant amount +of work to address. In these cases it is a good idea to move the comment to an +issue-tracking system, so that the issue can be tracked +and prioritized relative to other defects and feature requests.
+ +The following example shows a function where a TODO comment indicates a known limitation in the +existing implementation. The function should be reviewed, the limitation addressed and then the +comment deleted.
+ +This metric measures the number of lines of code in a function. This excludes comments and blank lines.
+ +Having too many lines of code in a function is an indication that it can be split into several functions of more manageable size.
+ +Long functions should be examined to see if they can be split into smaller, more cohesive functions.
+ ++This metric measures the number of incoming dependencies for each +class, that is the number of other classes that depend on it. +
+ ++Classes that are depended upon by many other classes typically require a lot of +effort to change, because changing them will force their dependents to change +as well. This is not necessarily a bad thing -- indeed, most systems will have +some such classes (one example might be a string class). However, classes with a high number +of incoming dependencies +and a high number of outgoing dependencies are hard to maintain. A class with both high afferent +coupling and high efferent coupling is referred to as a hub class. +Such classes can be problematic, because on the one hand they are hard to +change (high afferent coupling), yet on the other they have many reasons to +change (high efferent coupling). This contradiction yields code that is very +hard to maintain or test. +
+ +
+Conversely, some classes may only be depended on by very few other classes. Again,
+this is not necessarily a problem -- we would expect, for example, that the
+top-level classes of a system would meet this criterion. When lower-level
+classes have very few incoming dependencies, however, it can be an indication
+that a class is not pulling its weight. In extreme cases, classes may even
+have an afferent coupling of 0, indicating that they are dead
+code.
+
+It is unwise to refactor a class based purely on its high or low number of +incoming dependencies -- a class's afferent coupling value only makes sense +in the context of its role in the system as a whole. However, when combined +with other metrics such as efferent coupling, it is possible to make some +general recommendations: +
+Classes with an afferent coupling of 0 may be dead code --
+in this situation, they can often be deleted.
++Efferent coupling is the number of outgoing dependencies for each class. In other words, it is the +number of other classes on which each class depends. +
+ ++A class that depends on many other classes is quite brittle, because if any of +its dependencies change, the class itself may have to change as well. Furthermore, the +reason for the high number of dependencies is often that different parts of +the class depend on different groups of other classes, so it is common to +find that classes with high efferent coupling also lack cohesion. +
+ ++You can reduce efferent coupling by splitting up a class so that each part depends on fewer classes. +
+ +In the following example, class X depends on both Y and
+Z.
+
However, the methods that use Y do not use Z, and the methods
+that use Z do not use Y. Therefore, the class can be split into
+two classes, one of which depends only on Y and the other only on Z.
+Although this is a slightly artificial example, this sort of situation +does tend to occur in more complicated classes, +so the general technique is quite widely applicable. +
+ +This metric measures the percentage of lines in a file that contain a comment or are part of a +multi-line comment. Note that this metric ignores docstrings.
+ +The percentage of comment lines should always be considered with the value for the related metric +"Percentage of docstrings". For public modules, functions, classes and methods docstrings are the +preferred method of documentation because the information can be inspected by the program at runtime, +for example, as an interactive help system or as metadata for a function.
+ +Having a low percentage of comments and docstrings is an indication that a file does not have +sufficient documentation. Undocumented code is difficult to understand, modify, and reuse.
+ +Add documentation to files with a low comment and docstring ratio. Use docstrings to document +public modules, functions, classes and methods.
+ +This metric measures the total cyclomatic complexity for the functions in a file. +
+ ++Cyclomatic complexity approximates the number of paths that can be taken during the execution of a +function (and hence, the minimum number of tests cases necessary to test it thoroughly). Straight-line +code has zero cyclomatic complexity, while branches and loops increase cyclomatic complexity.
+ +Files that contain too many complex functions can be difficult to test, understand, and maintain.
+ +Try to simplify overly-complex code. For example:
+ +This metric measures the number of modules that are directly imported by each module (file). +Modules that import many other modules often have too many responsibilities and are not well-focused. +This makes it difficult to understand and maintain the module. +
+ +Split and/or refactor files with too many responsibilities to create modules with a single, +well-defined role.
+ + +This metric measures the percentage of lines in a file that contain a docstring. Note that this +metric ignores comments. + +
Docstrings are a good way to associate documentation with a specific object in Python. For public +modules, functions, classes and methods docstrings are the preferred method of documentation because +the information can be inspected by the program at runtime, for example, as an interactive help system +or as metadata for a function.
+ +Having a low percentage of docstrings is often an indication that a file has insufficient +documentation. However, the value for the related metric "Percentage of comments" should also be +considered because packages and non-public methods may be documented using comments. Undocumented +code is difficult to understand, modify, and reuse.
+ +Add documentation to files with a low docstring ratio. It is most useful to start documenting +the public functions first.
+ ++Duplicated code increases overall code size, making the code base +harder to maintain and harder to understand. It also becomes harder to fix bugs, +since a programmer applying a fix to one copy has to always remember to update +other copies accordingly. Finally, code duplication is generally an indication of +a poorly designed or hastily written code base, which typically suffers from other +problems as well. +
+ + +This metric measures the number of classes in each file.
+ +There are advantages and disadvantages associated with defining multiple classes in the same file. +However, if you define unrelated classes in one file then the resulting module API is difficult for +other developers to understand and use.
+ +The disadvantages of putting multiple classes in the same file include:
+Sometimes there are advantages of putting multiple classes in the same file, for example:
+Each module should have a single, well-defined role. Consequently, only logically-related classes +should be grouped together in the same file. If your code defines unrelated classes in the same file +then you should refactor the code and create new files, each containing logically related classes.
+ +This metric measures the number of functions and methods in each file.
+ +Tracking this metric over time will indicate which parts of the system are under active development. +Cross-referencing with the other metrics "Cyclomatic Complexity" and "Lines of Code" is recommended, +because files with high values for all three metrics are very likely to be too big and unwieldy; such +files should be split up.
+ +If a file is too big, identify the different tasks that are carried out by its functions and split +the file according to these tasks.
+ +This metric measures the number of lines of code in each file. The value excludes docstrings, comments and +blank lines.
+ +Organizing source into very large files is not recommended because:
+The solution depends on the underlying cause:
+This metric measures the number of comment lines per file. A low number of comments may indicate files that are difficult to understand due to poor documentation.
+ +Consider if the file needs more documentation. Most files should have at least a comment explaining their purpose.
+ ++A file that contains many lines that are similar to other code within the code base is +problematic for the same reasons as a file that contains a lot of (exactly) +duplicated code. +
+ ++Refactor similar code snippets by extracting common functionality into functions +that can be reused across modules. +
+ ++ This metric measures the number of tests below this location in the tree. + At a file level, this would just be the number of tests in the file. +
+ ++ A function or method is considered to be a "test" if one of the major + testing frameworks would invoke it as part of a test run. + Recognized frameworks include unittest, pytest, doctest and nose. +
+ ++ In general, having many test cases is a good thing rather than a bad + thing. However, at the file level, tests should typically be grouped + by the functionality they relate to, which makes a file with an + exceptionally high number of tests a strong candidate for splitting + up. At a higher level, this metric makes it possible to compare the + number of tests in different components, potentially flagging + functionality that is comparatively under-tested. +
++ Since it is typically not a problem to have too many tests, this + metric is usually included for the purposes of collecting + information, rather than finding problematic areas in the code. With + that in mind, it is usually a good idea to avoid an excessive number + of tests in a single file, and to maintain a broadly comparable + level of testing across components. +
+ ++ When assessing the thoroughness of a code base's test suite, the number + of tests provides only part of the story. Test coverage + statistics allow a more detailed examination of which parts of the + code deserve improvements in this area. +
+If a function (or method) makes a high number of calls to other functions, +it can be difficult to +understand, because you have to read through all the functions that it calls +to fully understand what it does. There are various reasons why +a function may make a high number of calls, including: +
+ ++The appropriate action depends on the reason why the function +makes a high number of calls: +
+ +
+A method that contains a high level of nesting can be very difficult to understand. As noted in
+[McConnell], the human brain cannot easily handle more than three levels of nested if
+statements.
+Extract the control flow into a separate generator and use that to control iteration.
+ +
+Use early exits to move nested statements out of conditions. For example:
+
+def func(x):
+ if x:
+ long_complex_block()
+
+can be replaced by
+
+def func(x):
+    if not x:
+ return
+ long_complex_block()
+
+
+Extract nested statements into new functions, for example by using the 'Extract Method' refactoring +from [Fowler].
+ ++For more ways to reduce the level of nesting in a method, see [McConnell]. +
+ ++Furthermore, a method that has a high level of nesting often indicates that its design can be +improved in other ways, as well as dealing with the nesting problem itself. +
+ ++In the following example, the code has four levels of nesting and is unnecessarily difficult to read. +
+ +
+In the following modified example, three different approaches to reducing the nesting depth are shown.
+The first, print_character_codes_early_exit, uses early exits, either return
+or continue.
+The second, print_character_codes_use_gen, extracts the control flow into a generator.
+The third, print_character_codes_extracted, uses a separate function for the inner loop.
+
+This metric measures the number of lines of text that have been added, deleted +or modified in files below this location in the tree. +
+ ++Code churn is known to be a good (if not the best) predictor of defects in a +code component (see e.g. [Nagappan] or [Khoshgoftaar]). The intuition is that +files, packages or projects that have experienced a disproportionately high +amount of churn for the amount of code involved may have been harder to write, +and are thus likely to contain more bugs. +
+ ++It is a fact of life that some code is going to be changed more than the rest, +and little can be done to change this. However, bearing in mind code churn's +effectiveness as a defect predictor, code that has been repeatedly changed +should be subjected to vigorous testing and code review. +
+ +This metric measures the number of different authors (by examining the +version control history) +for files below this location in the tree. +
+ ++Files that have been changed by a large number of different authors are +by definition the product of many minds. New authors working on a file +may be less familiar with the design and implementation of the code than +the original authors, which can be a potential source of bugs. Furthermore, +code that has been worked on by many people, if not carefully maintained, +often ends up lacking conceptual integrity. For both of these reasons, any +code that has been worked on by an unusually high number of different people +merits careful inspection in code reviews. +
+ ++There is clearly no way to reduce the number of authors that have worked +on a file - it is impossible to rewrite history. However, files highlighted +by this metric should be given special attention in a code review, and may +ultimately be good candidates for refactoring/rewriting by an individual, +experienced developer. +
+ + + ++This metric measures the average number of co-committed files for the files +below this location in the tree. +
+ +
+A co-committed file is one that is committed at the same time as a given file.
+For instance, if you commit files A, B and C together, then B and C would be
+the co-committed files of A for that commit. The value of the metric for an
+individual file is the average number of such co-committed files over all
+commits. The value of the metric for a directory is the aggregation of these
+averages - for instance, if we are using max as our aggregation
+function, the value would be the maximum of the average number of co-commits
+over all files in the directory.
+
+An unusually high value for this metric may indicate that the file in question +is too tightly-coupled to other files, and it is difficult to change it in +isolation. Alternatively, it may just be an indication that you commit lots of +unrelated changes at the same time. +
+ ++Examine the file in question to see what the problem is. +
+ ++This metric measures the total number of commits made to files +below this location in the tree. For an individual file, it measures the +number of commits that have affected that file. For a directory of files, it +measures the total number of commits affecting files below that +directory. +
+ ++This metric measures the number of file re-commits that have occurred below +this location in the tree. A re-commit is taken to mean a commit to a file +that was touched less than five days ago. +
+ ++In a system that is being developed using a controlled change process (where +changes are not committed until they are in some sense 'complete'), re-commits +can be (but are not always) an indication that an initial change was not +successful and had to be revisited within a short time period. The intuition +is that the original change may have been difficult to get right, and hence +the code in the file may be more than usually defect-prone. The concept is +somewhat similar to that of 'change bursts', as described in [Nagappan]. +
+ ++High numbers of re-commits can be addressed on two levels: preventative and +corrective. +
+ +
+This metric measures the number of recent commits to files that have occurred
+below this location in the tree. A recent commit is taken to mean a commit
+that has occurred in the last 180 days.
+
+All code that has changed a great deal may be more than usually prone to +defects, but this is particularly true of code that has been changing +dramatically in the recent past, because it has not yet had a chance to be +properly field-tested in order to iron out the bugs. +
+ ++There is more than one reason why a file may have been changing a lot +recently: +
+ ++A cohesive class is one in which most methods access the same fields. A class that +lacks cohesion is usually one that has multiple responsibilities. +
+ ++Various measures of lack of cohesion have been proposed. The Chidamber and Kemerer +version of lack of cohesion inspects pairs of methods. If there are many pairs that +access the same data, the class is cohesive. If there are many pairs that do not access +any common data, the class is not cohesive. More precisely, if:
+ +n1 is the number of pairs of distinct methods in a class that
+ do not have at least one commonly-accessed field, andn2 is the number of pairs of distinct methods in a class that
+ do have at least one commonly-accessed field,the lack of cohesion measure (LCOM) can be defined as: +
+ ++LCOM = max((n1 - n2) / 2, 0) +
+ ++High values of LCOM indicate a significant lack of cohesion. As a rough +indication, an LCOM of 500 or more may give you cause for concern. +
+ ++Classes generally lack cohesion because they have more responsibilities +than they should (see [Martin]). In general, the solution is to identify each +of the different responsibilities that the class has, and split them +into multiple classes, using the 'Extract Class' refactoring from [Fowler], for +example. +
+ + + ++A cohesive class is one in which most methods access the same fields. A class that +lacks cohesion is usually one that has multiple responsibilities. +
+ ++Various measures of lack of cohesion have been proposed. A measure proposed by Hitz and Montazeri +counts the number of strongly connected components, that is disjoint subgraphs, +in the graph of method and attribute dependencies. +This can be thought of as the number of possible classes that a single class could be split into. +
+ ++Values of LCOM above 1 indicate a lack of cohesion in that there are several +disjoint subgraphs in a graph of intra-class dependencies. +
+ ++Classes generally lack cohesion because they have more responsibilities +than they should (see [Martin]). In general, the solution is to identify each +of the different responsibilities that the class has, and split them +into multiple classes, using the 'Extract Class' refactoring from [Fowler], for +example. +
+ + + ++This metric measures the number of incoming dependencies for each +module, that is the number of other modules that depend on it. +
+ ++Modules that are depended upon by many other modules typically require a lot of +effort to change, because changing them will force their dependents to change +as well. This is not necessarily a bad thing -- indeed, most systems will have +some such modules (one example might be an I/O module). However, modules with a high number +of incoming dependencies and a high number of outgoing dependencies are hard to maintain. +A module with both high afferent coupling and high efferent coupling can be problematic +because, on the one hand they are hard to change (high afferent coupling), yet on the other they +have many reasons to change (high efferent coupling). This contradiction yields code that is very +hard to maintain or test. +
+ +
+Conversely, some modules may only be depended on by very few other modules. Again,
+this is not necessarily a problem -- we would expect, for example, that the
+top-level modules of a system would meet this criterion. When lower-level
+modules have very few incoming dependencies, however, it can be an indication
+that a module is not pulling its weight. In extreme cases, modules may even
+have an afferent coupling of 0, indicating that they are dead
+code.
+
+It is unwise to refactor a module based purely on its high or low number of +incoming dependencies -- a module's afferent coupling value only makes sense +in the context of its role in the system as a whole. However, when combined +with other metrics such as efferent coupling, it is possible to make some +general recommendations: +
+Modules with an afferent coupling of 0 may be dead code --
+in this situation, they can often be deleted.
++Efferent coupling is the number of outgoing dependencies for each module. In other words, it is the +number of other modules on which each module depends. +
+ ++A module that depends on many other modules is quite brittle, because if any of +its dependencies change, the module itself may have to change as well. Furthermore, the +reason for the high number of dependencies is often that different parts of +the module depend on different groups of other modules, so it is common to +find that modules with high efferent coupling also lack cohesion. +
+ ++You can reduce efferent coupling by splitting up a module so that each part depends on fewer modules. +
+ + ++A function (or method) that uses a high number of parameters makes maintenance more difficult: +
+ ++Restrict the number of parameters for a function, according to the reason for the high number: +
+ +When a function is part of a published interface, one possible solution is to add a new, wrapper +function to the interface that has a tidier signature. Alternatively, you can publish a new version of +the interface that has a better design. Clearly, however, neither of these solutions is ideal, +so you should take care to design interfaces the right way from the start.
+ +The practice of adding parameters for future extensibility is especially +bad. It is confusing to other programmers, who are uncertain what values they should pass +in for these unnecessary parameters, and it adds unused code that is potentially difficult to remove +later.
+ +In the following example, although the parameters are logically related, they are passed into the
+print_annotation function separately.
In the following modified example, the print_annotation function is simplified by logically grouping
+the related parameters into a single class.
+An instance of the class can then be passed into the function instead, as shown below.
+
In the following example, the print_membership function has too many responsibilities,
+and so needs to be passed four arguments.
In the following modified example, print_membership has been broken into four functions.
+(For brevity, only one function is shown.) As a result, each new function needs to be passed only one
+of the original four arguments.
+This metric measures the number of statements that occur in a module. +
+ ++If there are too many statements in a module, it is generally +for one of two reasons: +
+ ++As described above, modules reported as violations by this rule contain one +or more classes or functions with too many statements, or the module itself contains +too many classes or functions.
+ +This metric measures the number of modules that are imported by each module (file) - either directly +by an import statement or indirectly (that is, imported by a module that is imported). Modules that +import many other modules often have too many responsibilities and are not well-focused. +This makes it difficult to understand and maintain the module. +
+ +Split and/or refactor files with too many responsibilities to create modules with a single, +well-defined role.
+ +If a file is opened then it should always be closed again, even if an +exception is raised. +Failing to ensure that all files are closed may result in failure due to too +many open files.
+ +Ensure that if you open a file it is always closed on exiting the method.
+Wrap the code between the open() and close()
+functions in a with statement or use a try...finally
+statement. Using a with statement is preferred as it is shorter
+and more readable.
The following code shows examples of different ways of closing a file. In the first example, the +file is closed only if the method is exited successfully. In the other examples, the file is always +closed on exiting the method.
+ ++Accessing files using paths constructed from user-controlled data can allow an attacker to access +unexpected resources. This can result in sensitive information being revealed or deleted, or an +attacker being able to influence behavior by modifying unexpected files. +
+
+Validate user input before using it to construct a file path, either using an off-the-shelf library function
+like werkzeug.utils.secure_filename, or by performing custom validation.
+
+Ideally, follow these rules: +
+ +
+In the first example, a file name is read from an HTTP request and then used to access a file.
+However, a malicious user could enter a file name that is an absolute path, such as
+"/etc/passwd".
+
+In the second example, it appears that the user is restricted to opening a file within the
+"user" home directory. However, a malicious user could enter a file name containing
+special characters. For example, the string "../../../etc/passwd" will result in the code
+reading the file located at "/server/static/images/../../../etc/passwd", which is the system's
+password file. This file would then be sent back to the user, giving them access to all the
+system's passwords.
+
+In the third example, the path used to access the file system is normalized before being checked against a +known prefix. This ensures that regardless of the user input, the resulting path is safe. +
+
+Code that passes user input directly to
+exec, eval, or some other library
+routine that executes a command, allows the user to execute malicious
+code.
If possible, use hard-coded string literals to specify the command to run
+or the library to load. Instead of passing the user input directly to the
+process or library function, examine the user input and then choose
+among hard-coded string literals.
+
+If the applicable libraries or commands cannot be determined in
+advance, then add code to verify that the user input string is
+safe before using it.
+
+The following example shows two functions. The first is unsafe as it takes a shell script that can be changed
+by a user, and passes it straight to subprocess.call() without examining it first.
+The second is safe as it selects the command from a predefined whitelist.
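+The whitelist approach can be sketched as follows; the script names and paths are hypothetical
+stand-ins for the commands a real application would offer:

```python
import subprocess

# Hypothetical whitelist mapping user-visible names to fixed commands.
ALLOWED_SCRIPTS = {
    "backup": "/usr/local/bin/backup.sh",
    "cleanup": "/usr/local/bin/cleanup.sh",
}

def run_script(name):
    # GOOD: the user merely selects among hard-coded commands; an
    # arbitrary string such as "rm -rf /" is rejected, never executed.
    script = ALLOWED_SCRIPTS.get(name)
    if script is None:
        raise ValueError("unknown script: %r" % name)
    return subprocess.call([script])
```

+Passing the command as a list, rather than a single string with shell=True, also avoids
+shell interpretation of the argument.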
+Directly writing user input (for example, an HTTP request parameter) to a webpage
+without properly sanitizing the input first allows for a cross-site scripting vulnerability.
+
+
+To guard against cross-site scripting, consider escaping the input before writing user input to the page.
+The standard library provides escaping functions: html.escape() for Python 3.2 upwards
+or cgi.escape() for older versions of Python.
+Most frameworks also provide their own escaping functions, for example flask.escape().
+
+The following example is a minimal Flask app which shows a safe and an unsafe way to render the given name back to the page.
+The first view is unsafe as first_name is not escaped, leaving the page vulnerable to cross-site scripting attacks.
+The second view is safe as first_name is escaped, so it is not vulnerable to cross-site scripting attacks.
+
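+The escaping itself can be shown with the standard library alone; the page fragment below is
+illustrative rather than Flask-specific:

```python
import html

def greeting_unsafe(first_name):
    # BAD: user input flows into the page unescaped.
    return "<h1>Hello %s!</h1>" % first_name

def greeting_safe(first_name):
    # GOOD: html.escape() neutralizes <, >, & and quote characters,
    # so injected markup is rendered as text rather than executed.
    return "<h1>Hello %s!</h1>" % html.escape(first_name)
```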
+If a database query (such as a SQL or NoSQL query) is built from
+user-provided data without sufficient sanitization, a user
+may be able to run malicious database queries.
+
+Most database connector libraries offer a way of safely
+embedding untrusted data into a query by means of query parameters
+or prepared statements.
+
+In the following snippet, from an example Django app,
+a name is stored in the database using two different queries.
+
+In the first case, the query string is built by
+directly using string formatting from a user-supplied request attribute.
+The parameter may include quote characters, so this
+code is vulnerable to a SQL injection attack.
+
+In the second case, the user-supplied request attribute is passed
+to the database using query parameters.
+ +
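+The Django snippet itself is not shown here; the same contrast can be sketched with the standard
+library's sqlite3 module (which uses ? placeholders, where Django's database API uses %s):

```python
import sqlite3

def save_name_unsafe(conn, username, name):
    # BAD: quote characters in name or username break out of the
    # string literal and become part of the SQL text.
    conn.execute(
        "UPDATE users SET name = '%s' WHERE username = '%s'" % (name, username))

def save_name_safe(conn, username, name):
    # GOOD: placeholders keep the data separate from the SQL text;
    # the connector quotes the values itself.
    conn.execute(
        "UPDATE users SET name = ? WHERE username = ?", (name, username))
```

+Even a perfectly innocent value such as "O'Brien" breaks the formatted query, while the
+parameterized version handles it correctly.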
+Directly evaluating user input (for example, an HTTP request parameter) as code without properly
+sanitizing the input first allows an attacker to execute arbitrary code. This can occur when user
+input is passed to code that interprets it as an expression to be
+evaluated, such as eval or exec.
+
+Avoid including user input in any expression that may be dynamically evaluated. If user input must
+be included, use context-specific escaping before including it.
+It is important that the correct escaping is used for the type of evaluation that will occur.
+
+
+The following example shows two functions setting a name from a request.
+The first function uses exec to execute the setname function.
+This is dangerous as it can allow a malicious user to execute arbitrary code on the server.
+For example, the user could supply the value "' + subprocess.call('rm -rf') + '"
+to destroy the server's file system.
+The second function calls the setname function directly and is thus safe.
+
+
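+The setname example can be sketched as follows; the in-memory user store is a hypothetical
+stand-in for the real request-handling code:

```python
users = {}  # hypothetical in-memory store standing in for the real model

def setname(uid, name):
    users[uid] = name

def set_name_unsafe(uid, name):
    # BAD: the user-controlled name is pasted into source code that exec
    # then runs, so a value like "'); <anything> #" escapes the literal
    # and executes arbitrary statements.
    exec("setname(%r, '%s')" % (uid, name))

def set_name_safe(uid, name):
    # GOOD: the input is passed as plain data and is never evaluated.
    setname(uid, name)
```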
+
+Software developers often add stack traces to error messages, as a
+debugging aid. Whenever that error message occurs for an end user, the
+developer can use the stack trace to help identify how to fix the
+problem. In particular, stack traces can tell the developer more about
+the sequence of events that led to a failure, as opposed to merely the
+final state of the software when the error occurred.
+
+
+Unfortunately, the same information can be useful to an attacker.
+The sequence of class names in a stack trace can reveal the structure
+of the application as well as any internal components it relies on.
+Furthermore, the error message at the top of a stack trace can include
+information such as server-side file names and SQL code that the
+application relies on, allowing an attacker to fine-tune a subsequent
+injection attack.
+
+Send the user a more generic error message that reveals less information.
+Either suppress the stack trace entirely, or log it only on the server.
+
+In the following example, an exception is handled in two different
+ways. In the first version, labeled BAD, the exception is sent back to
+the remote user by returning it from the function. As such,
+the user is able to see a detailed stack trace, which may contain
+sensitive information. In the second version, the error message is
+logged only on the server, and a generic error message is displayed to
+the user. That way, the developers can still access and use the error
+log, but remote users will not see the information.
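+A sketch of the two approaches; the failing operation and its error message are invented for
+illustration:

```python
import logging
import traceback

logger = logging.getLogger(__name__)

def do_computation():
    # Hypothetical operation that fails; the message is illustrative.
    raise RuntimeError("database unreachable")

def handle_request_bad():
    # BAD: the full stack trace is returned to the remote user.
    try:
        return do_computation()
    except Exception:
        return traceback.format_exc()

def handle_request_good():
    # GOOD: details go to the server-side log only; the user sees
    # a generic message.
    try:
        return do_computation()
    except Exception:
        logger.exception("computation failed")
        return "An internal error has occurred."
```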
+
+ Using broken or weak cryptographic algorithms can leave data
+ vulnerable to being decrypted or forged by an attacker.
+
+ Many cryptographic algorithms provided by cryptography
+ libraries are known to be weak or flawed. Using such an
+ algorithm means that encrypted or hashed data is less
+ secure than it appears to be.
+
+ Ensure that you use a strong, modern cryptographic
+ algorithm. Use at least AES-128 or RSA-2048 for
+ encryption, and SHA-2 or SHA-3 for secure hashing.
+ +
+ The following code uses the pycrypto
+ library to encrypt some secret data. When you create a cipher using
+ pycrypto you must specify the encryption
+ algorithm to use. The first example uses DES, which is an
+ older algorithm that is now considered weak. The second
+ example uses Blowfish, which is a stronger, more modern algorithm.
+
+ WARNING: Although the second example above is more robust,
+ pycrypto is no longer actively maintained, so we recommend using cryptography instead.
+
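+ The pycrypto example itself is not reproduced here, but the hashing half of the
+ recommendation can be illustrated with the standard library alone; hashlib exposes
+ both broken and modern algorithms, and the safe choice is a single line:

```python
import hashlib

def weak_digest(data):
    # BAD: MD5 is broken; collisions can be produced at will.
    return hashlib.md5(data).hexdigest()

def strong_digest(data):
    # GOOD: SHA-256 is a member of the SHA-2 family recommended above.
    return hashlib.sha256(data).hexdigest()
```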