
Understanding readability

The readability of a software system is closely tied to its modifiability. Writing well-documented code that follows the standard or adopted practices of the programming language tends to produce simple, concise code that is easy to read and modify.

Readability is not only a matter of following good coding guidelines; it also depends on how clear the logic is, how much the code uses standard features of the language, how modular the functions are, and so on.

In fact, we can summarize the different aspects of readability as follows:

  • Well-written: A piece of code is well-written if it uses simple syntax and well-known features and idioms of the language, if the logic is clear and concise, and if it uses variables, functions, and class/module names meaningfully, that is, they express what they do.
  • Well-documented: Documentation usually refers to the inline comments in the code. A well-documented piece of code tells what it does, what its input arguments are, and what its return value is (if any), along with the logic or algorithm in some detail. It also documents any external library or API usage and any configuration required for running the code, either inline or in separate files.
  • Well-formatted: Most programming languages, especially the open source languages like Python, developed over the internet via distributed but closely-knit programming communities, tend to have well-documented style guidelines. A piece of code that keeps up with these guidelines on aspects such as indentation and formatting will tend to be more readable than something that doesn't.

Lack of readability affects modifiability, and hence, maintainability of the code, thereby incurring ever-increasing costs for the organization in terms of resources—mainly people and time—in maintaining the system in a useful state.

Python and readability

Python is a language that has been designed from the ground up for readability. To borrow a line from the well-known Zen of Python, we can say:

Readability counts

Tip

The Zen of Python is a set of 20 principles that influence the design of the Python programming language, 19 of which have been written down. You can see the Zen of Python by opening the Python interpreter prompt and typing this:

>>> import this

Python, as a language, emphasizes readability. It achieves this by clear, concise keywords, which mimic their English language counterparts, using minimal operators, and using the following philosophy:

There should be one—and preferably only one—obvious way to do it.

For example, here is one way to iterate through a sequence in Python while also printing its index:

for idx in range(len(seq)):
    item = seq[idx]
    print(idx, '=>', item)

However, a more common idiom in Python is the enumerate() built-in, which accepts any iterable and returns a two-tuple of (idx, item) for each item in the sequence:

for idx, item in enumerate(seq):
    print(idx, '=>', item)

In many other programming languages, such as C++ or Java, the first version would be considered to have the same merit as the second. However, in Python, certain idioms of writing code keep up with the language's principles (the Zen) better than certain others.

In this case, the second version is closer to the way Python programmers would write code to solve the problem. The first way would be considered less Pythonic than the second one.
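
To make the comparison concrete, the following minimal sketch runs both loops over the same list and checks that they produce identical output. The sample data in seq and the helper names format_by_index and format_with_enumerate are illustrative assumptions, not part of the original example.

```python
# Hypothetical sample data, used only for illustration.
seq = ['alpha', 'beta', 'gamma']


def format_by_index(seq):
    """Build 'idx => item' lines by indexing into the sequence."""
    lines = []
    for idx in range(len(seq)):
        item = seq[idx]
        lines.append('{} => {}'.format(idx, item))
    return lines


def format_with_enumerate(seq):
    """Build the same lines using the enumerate() idiom."""
    return ['{} => {}'.format(idx, item) for idx, item in enumerate(seq)]


# Both approaches yield the same lines for a list.
print(format_by_index(seq) == format_with_enumerate(seq))
```

One practical difference is that enumerate() works with any iterable, including generators, whereas the index-based loop needs an object that supports len() and indexing.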

The term "Pythonic" is something you would commonly encounter when interacting with the Python community. It means that the code not just solves the problem, but follows the conventions and idioms the Python community generally follows, and uses the language in the way it is intended to be used.

Note

The definition of Pythonic is subjective, but you can think of it as Python code keeping up with the Zen of Python, or in general, following well-known idiomatic programming practices adopted by the community.

Python, by its design principles and clean syntax, makes writing readable code easy. However, it is a common trap for programmers migrating to Python from other more pedantic and less-idiomatic languages to write Python code in a less Pythonic way.

It is important for a Python programmer to understand this aspect early, so that they write more idiomatic, or Pythonic, code as they get used to the language. You can be more productive with Python in the long term if you familiarize yourself with its coding principles and idioms.

Readability – antipatterns

Python, in general, encourages and facilitates writing readable code. However, it would be very unrealistic to say that any code written in Python is highly readable. Even with all of its readability DNA, Python has its fair share of difficult-to-read, badly written, or unreadable code, as is evident from spending some time scanning through the public, open source Python code on the web.

There are certain practices that tend to produce difficult-to-read or unreadable code in a programming language. These can be thought of as antipatterns, which are a bane, not just in programming with Python, but in any programming language:

  • Code with little or no comments: Lack of code comments is often the primary reason for producing code that is unreadable. More often than not, programmers don't do a very good job of documenting their thoughts, which led to a particular implementation, in code. When the same code is read by another programmer or by the same programmer a few months later (this happens quite a lot!), it is not easy to figure out why the specific implementation approach was followed. This makes it difficult to reason about the pros and cons of an alternate approach.

    This also makes taking decisions on modifying the code—perhaps for a customer fix—difficult, and in general, affects code modifiability in the long term. The commenting of code is often an indicator of the discipline and rigor of the programmer who wrote the code and of the organization in enforcing such practices.

  • Code that breaks best practices of the language: Best practices of a programming language typically evolve from years of experience in using the language by a community of developers, and the efficient feedback that it generates. They capture the best way of putting the programming language to good use to solve problems, and typically, capture the idioms and common patterns for using the language.

    For example, in Python, the Zen can be considered as a shining torch to its best practices and the set of common programming idioms adopted by the community.

    Often, programmers who are inexperienced, or who migrate from other programming languages or environments, tend to produce code that is not in keeping with these practices, and hence end up writing code that is low on readability.

  • Programming antipatterns: There are a large number of coding or programming antipatterns, which tend to produce difficult-to-read, and hence, difficult-to-maintain code. Here are some of the well-known ones:
    • Spaghetti code: This is a piece of code with no discernible structure or control-flow. It is typically produced by following complex logic with a lot of unconditional jumps and unstructured exception handling, badly written concurrent code and so on.
    • Big ball of mud: This is a system with pieces of code that show no overall structure or goal. Big ball of mud typically consists of many pieces of spaghetti code and is usually a sign of code that has been worked on by multiple people, patched-up multiple times with little or zero documentation.
    • Copy-Paste programming: Often produced in organizations where expediency of delivery is favored over thoughtful design, copy/paste coding produces long, repetitive chunks of code, which essentially do the same thing again and again with minor modifications. This leads to code-bloat and, in the long term, the code becomes unmaintainable.

      A similar antipattern is cargo-cult programming, where programmers follow the same design or programming pattern over and over again without a thought as to whether it fits the specific scenarios or problems that they are trying to solve.

    • Ego programming: Ego programming is where a programmer, often an experienced one, favors their personal style over the documented best practices or the organizational style of coding. This sometimes creates code that is cryptic and difficult to read for other programmers, usually younger or less experienced ones. An example is the tendency to use functional programming constructs in Python to write everything as a one-liner.

Coding antipatterns can be circumvented by adopting practices of structured programming in your organization, and by enforcing the use of coding guidelines and best practices.
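
As a small illustration of the copy-paste antipattern described above, the sketch below shows two near-identical functions and one possible refactoring into a single parameterized helper. All function names and the sample data are hypothetical and chosen only for this example.

```python
# Copy-paste style: two functions that differ only in the condition.
def count_long_names(names):
    count = 0
    for name in names:
        if len(name) > 8:
            count += 1
    return count


def count_short_names(names):
    count = 0
    for name in names:
        if len(name) < 4:
            count += 1
    return count


# Refactored: the varying condition becomes a parameter.
def count_matching(items, predicate):
    """Count the items for which predicate(item) is true."""
    return sum(1 for item in items if predicate(item))


# Hypothetical sample data.
names = ['al', 'bob', 'christopher', 'dave']
print(count_matching(names, lambda name: len(name) > 8))
print(count_matching(names, lambda name: len(name) < 4))
```

With the refactored helper, a new counting rule needs only a new predicate rather than another copy of the loop.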

The following are some antipatterns that are specific to Python:

  • Mixed indentation: Python uses indentation to separate blocks of code, as it lacks braces or other syntactical constructs of languages such as C/C++ or Java, which separate code blocks. However, we need to be careful when indenting code in Python. A common antipattern is where people mix both tabs (the \t character) and spaces in their Python code. This can be fixed by using editors that always use either tabs or spaces to indent code. Python 3 rejects indentation that mixes tabs and spaces inconsistently with a TabError.

    Python comes with built-in modules such as tabnanny, which can be used to check your code for indentation issues.

  • Mixing string literal types: Python provides three different ways to create string literals: either by using the single quote ('), the double quote ("), or Python's own special triple quote (''' or """). Code that mixes these three types of literals in the same block of code or functional unit becomes more difficult to read.
  • Overuse of functional constructs: Python, being a mixed paradigm language, provides support for functional programming via its lambda keyword and its map(), filter(), and reduce() functions (in Python 3, reduce() lives in the functools module). However, experienced programmers, or programmers coming to Python from a functional programming background, sometimes overuse these constructs, producing code that is too cryptic and, hence, unreadable to other programmers.
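
The last antipattern can be illustrated with a short sketch. Both expressions below compute the sum of the squares of the even numbers in a list; the list nums and the variable names are assumptions made for this example only.

```python
from functools import reduce

# Hypothetical sample data.
nums = [1, 2, 3, 4, 5, 6]

# Cryptic: filter(), map(), and reduce() chained into one expression.
total_cryptic = reduce(lambda a, b: a + b,
                       map(lambda x: x * x,
                           filter(lambda x: x % 2 == 0, nums)), 0)

# More readable: a generator expression passed to the built-in sum().
total_readable = sum(x * x for x in nums if x % 2 == 0)

print(total_cryptic, total_readable)
```

The two results are equal, but the second form reads close to its plain-English description, which is usually what other readers of the code need.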

Techniques for readability

Now that we have a good understanding of what helps the readability of code, let's look at the approaches that we can adopt in order to improve the readability of code in Python.

Document your code

A simple and effective way to improve the readability of your code is to document what it does. Documentation is important for the readability and long-term modifiability of your code.

Code documentation can be categorized as follows:

  • Inline documentation: The programmer documents their code by using code comments, function documentation, module documentation, and others as part of the code itself. This is the most effective and useful type of code documentation.
  • External documentation: This is additional documentation captured in separate files, which usually covers aspects such as usage of the code, code changes, install steps, deployment, and the like. Examples are the README, INSTALL, or CHANGELOG files usually found with open source projects keeping up with the GNU build principles.
  • User manuals: These are formal documents, usually by a dedicated person or team, using pictures and text that is usually targeted toward users of the system. Such documentation is usually prepared and delivered toward the end of a software project when the product is stable and is ready to ship. We are not concerned with this type of documentation in our discussion here.

Python is a language that is designed for smart inline code documentation from the ground up. In Python, inline documentation can be done at the following levels:

  • Code comments: This is the text inline with code, prefixed by the hash (#) character. They can be used liberally inside your code explaining what each step of the code does.

    Here is an example:

    # This loop performs a network fetch of the URL, retrying up to 3
    # times in case of errors. In case the URL can't be fetched,
    # an error is returned.

    # Initialize all state
    count, ntries, result, error = 0, 3, None, None
    while count < ntries:
        try:
            # NOTE: We are using an explicit timeout of 30s here
            result = requests.get(url, timeout=30)
            # Success, so stop retrying
            break
        except Exception as exc:
            # Keep the last error; Python 3 unbinds 'exc' after this block
            error = exc
            print('Caught exception', error, 'trying again after a while')
            # increment count
            count += 1
            # sleep 1 second every time
            time.sleep(1)

    if result is None:
        print("Error, could not fetch URL", url)
        # Return a tuple of (<return code>, <lasterror>)
        return (2, error)

    # Return data of URL
    return result.content

    Notice the liberal use of comments even in places it may be deemed superfluous. We will look at some general rules of thumb in commenting your code later.

  • Function docstrings: Python provides a simple way to document what a function does by using a string literal as the first statement in the function body, just below the def line. This can be done by using any of the three styles of string literals.

    Here is an example:

    import time

    import requests

    def fetch_url(url, ntries=3, timeout=30):
        " Fetch a given url and return its contents "

        # This loop performs a network fetch of the URL, retrying
        # up to 'ntries' times in case of errors. In case the URL
        # can't be fetched, an error is returned.

        # Initialize all state
        count, result, error = 0, None, None
        while count < ntries:
            try:
                result = requests.get(url, timeout=timeout)
                # Success, so stop retrying
                break
            except Exception as exc:
                # Keep the last error; 'exc' is unbound after this block
                error = exc
                print('Caught exception', error, 'trying again after a while')
                # increment count
                count += 1
                # sleep 1 second every time
                time.sleep(1)

        if result is None:
            print("Error, could not fetch URL", url)
            # Return a tuple of (<return code>, <lasterror>)
            return (2, error)

        # Return data of URL
        return result.content

    The function docstring is the line that says Fetch a given url and return its contents. Though useful, it is limited, since it only says what the function does and doesn't explain its parameters. Here is an improved version:

    import time

    import requests

    def fetch_url(url, ntries=3, timeout=30):
        """ Fetch a given url and return its contents.

        @params
            url - The URL to be fetched.
            ntries - The maximum number of retries.
            timeout - Timeout per call in seconds.

        @returns
            On success - Contents of URL.
            On failure - (error_code, last_error)
        """

        # This loop performs a network fetch of the URL,
        # retrying up to 'ntries' times in case of errors.
        # In case the URL can't be fetched, an error is returned.

        # Initialize all state
        count, result, error = 0, None, None
        while count < ntries:
            try:
                result = requests.get(url, timeout=timeout)
                # Success, so stop retrying
                break
            except Exception as exc:
                # Keep the last error; 'exc' is unbound after this block
                error = exc
                print('Caught exception', error, 'trying again after a while')
                # increment count
                count += 1
                # sleep 1 second every time
                time.sleep(1)

        if result is None:
            print("Error, could not fetch URL", url)
            # Return a tuple of (<return code>, <lasterror>)
            return (2, error)

        # Return data of the URL
        return result.content

    In the preceding code, the function usage has become much clearer to the programmer. Note that such extended documentation would usually span more than one line, and hence, it is a good idea to always use triple quotes with your function docstrings.

  • Class docstrings: These work just like a function docstring except that they provide documentation for a class directly. This is provided just below the class statement that defines the class.

    Here is an example:

    import time

    import requests

    class UrlFetcher(object):
        """ Implements the steps of fetching a URL.

            Main methods:
            fetch - Fetches the URL.
            get - Return the URL's data.
        """

        def __init__(self, url, timeout=30, ntries=3, headers=None):
            """ Initializer.
            @params
                url - URL to fetch.
                timeout - Timeout per connection (seconds).
                ntries - Max number of retries.
                headers - Optional request headers.
            """
            self.url = url
            self.timeout = timeout
            self.ntries = ntries
            # Avoid sharing one mutable default dict across instances
            self.headers = headers if headers is not None else {}
            # Encapsulated result object
            self.result = None

        def fetch(self):
            """ Fetch the URL and save the result """

            # This loop performs a network fetch of the URL,
            # retrying up to 'ntries' times in case of errors.

            count, result = 0, None
            while count < self.ntries:
                try:
                    result = requests.get(self.url,
                                          timeout=self.timeout,
                                          headers=self.headers)
                    # Success, so stop retrying
                    break
                except Exception as error:
                    print('Caught exception', error, 'trying again after a while')
                    # increment count
                    count += 1
                    # sleep 1 second every time
                    time.sleep(1)

            if result is not None:
                # Save result
                self.result = result

        def get(self):
            """ Return the data for the URL """

            if self.result is not None:
                return self.result.content

    See how the class docstring defines some of the main methods of the class. This is a very useful practice, as it gives the programmer useful information at the top level without having to go and inspect each function's documentation separately.

  • Module docstrings: Module docstrings capture information at the module level, usually about the functionality of the module and some detail about what each member of the module (function, class, and others) does. The syntax is the same as the class or function docstring. The information is usually captured at the top of the module, before any code.

    Module documentation can also capture any specific external dependencies of a module:

    """
        urlhelper - Utility classes and functions to work with URLs.

        Members:

            UrlFetcher - A class which encapsulates the action of
                         fetching the content of a URL.
            get_web_url - Converts URLs so they can be used on the web.
            get_domain - Returns the domain (site) of the URL.
    """

    import time
    import urllib.parse

    import requests

    def get_domain(url):
        """ Return the domain name (site) for the URL"""

        urlp = urllib.parse.urlparse(url)
        return urlp.netloc

    def get_web_url(url, default='http'):
        """ Make a URL useful for fetch requests
        -  Prefix network scheme in front of it if not present already
        """

        urlp = urllib.parse.urlparse(url)
        if urlp.scheme == '' and urlp.netloc == '':
            # No scheme, prefix default
            return default + '://' + url

        return url

    class UrlFetcher(object):
        """ Implements the steps of fetching a URL.

            Main methods:
            fetch - Fetches the URL.
            get - Return the URL's data.
        """

        def __init__(self, url, timeout=30, ntries=3, headers=None):
            """ Initializer.
            @params
                url - URL to fetch.
                timeout - Timeout per connection (seconds).
                ntries - Max number of retries.
                headers - Optional request headers.
            """
            self.url = url
            self.timeout = timeout
            self.ntries = ntries
            # Avoid sharing one mutable default dict across instances
            self.headers = headers if headers is not None else {}
            # Encapsulated result object
            self.result = None

        def fetch(self):
            """ Fetch the URL and save the result """

            # This loop performs a network fetch of the URL, retrying
            # up to 'ntries' times in case of errors.

            count, result = 0, None
            while count < self.ntries:
                try:
                    result = requests.get(self.url,
                                          timeout=self.timeout,
                                          headers=self.headers)
                    # Success, so stop retrying
                    break
                except Exception as error:
                    print('Caught exception', error, 'trying again after a while')
                    # increment count
                    count += 1
                    # sleep 1 second every time
                    time.sleep(1)

            if result is not None:
                # Save result
                self.result = result

        def get(self):
            """ Return the data for the URL """

            if self.result is not None:
                return self.result.content
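
Docstrings written at any of these levels are stored on the object itself, so they can be read back at runtime. The following minimal sketch uses a hypothetical Greeter class (not part of the examples above) to show the __doc__ attribute and the standard library's inspect.getdoc() function.

```python
import inspect


class Greeter:
    """ Builds greeting messages.

        Main methods:
        greet - Return a greeting for a name.
    """

    def greet(self, name):
        """ Return a greeting for the given name """
        return 'Hello, ' + name


# The raw docstring is stored in the __doc__ attribute.
print(Greeter.greet.__doc__)

# inspect.getdoc() returns the docstring with its indentation cleaned up.
print(inspect.getdoc(Greeter))
```

The built-in help() function displays the same docstrings in the interactive interpreter, for example help(Greeter).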

Follow coding and style guidelines

Most programming languages have a relatively well-known set of coding and/or style guidelines. These are either developed over many years of use as a convention, or come as a result of discussions in the online community of that programming language. C/C++ is a good example of the former, and Python is a good example of the latter.

It is also a common practice for companies to specify their own guidelines—mostly, by adopting existing standard guidelines and customizing them for the company's own specific development environment and requirements.

For Python, there is a clear set of coding style guidelines published by the Python programming community. This guideline, known as PEP-8, is available online as part of the Python Enhancement Proposal (PEP) set of documents.

Note

You can find PEP-8 at the following URL: https://www.python.org/dev/peps/pep-0008/.

PEP-8 was first created in 2001 and has undergone multiple revisions since then. The primary author is the creator of Python, Guido van Rossum, with input from Barry Warsaw and Nick Coghlan.

PEP-8 was created by adapting Guido's original Python Style Guide essay with additions from Barry's style guide.

We will not go deep into PEP-8 in this book, as the goal of this section is not to teach you PEP-8. However, we will discuss the general principles underlying PEP-8.

The philosophy underlying PEP-8 can be summarized as follows:

  • Code is read more than it is written. Hence, providing a guideline would make code more readable and make it consistent across a full spectrum of Python code.
  • Consistency within a project is important. However, consistency within a module or package is more important. Consistency within a unit of code, such as a class or function, is the most important.
  • Know when to ignore a guideline. For example, this may happen if adopting the guideline makes your code less readable, breaks the surrounding code, or breaks backward compatibility of the code. Study examples, and choose what is best.
  • If a guideline is not directly applicable or useful for your organization, customize it. If you have any doubts about a guideline, get clarification by asking the Python community.
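
As a brief illustration of how such guidelines affect readability, the sketch below shows the same function written twice. The function names and logic are hypothetical; the second version applies a few well-known PEP-8 conventions (lowercase names with underscores, spaces around operators and after commas, and one statement per line).

```python
# Before: runs correctly, but ignores several PEP-8 conventions.
def CalcArea(W,H):
    if W<0 or H<0: return 0
    return W*H


# After: same behavior, following the conventions listed above.
def calc_area(width, height):
    """Return the area of a rectangle, or 0 for negative dimensions."""
    if width < 0 or height < 0:
        return 0
    return width * height


print(CalcArea(3, 4) == calc_area(3, 4))
```

Both functions behave identically; only the second can be read without decoding the names and spacing.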

Review and refactor code

Code requires maintenance. Unmaintained code that is used in production can become a problem if not tended to periodically.

Periodically scheduled reviews of code can be very useful in keeping the code readable and in good health, which aids modifiability and maintainability. Code that is central to a system or an application in production tends to get a lot of quick fixes over time, as it is customized or enhanced for different use cases or patched for issues. Programmers generally don't document such quick fixes, as the situations demand expedited testing and deployment over good engineering practices such as documentation.

Over time, such patches can accumulate, thereby causing code bloat and creating future engineering debt for the team, which can become a costly affair. The solution is periodic reviews.

Reviews should be done with engineers who are familiar with the application, but ideally, who are not working on the same code. This gives the code a fresh set of eyes, which is often useful in detecting bugs that the original author(s) may have overlooked. It is a good idea to get large changes reviewed by a couple of reviewers who are experienced developers.

This can be combined with the general refactoring of code to improve implementation, reduce coupling, or increase cohesion.

Commenting the code

We are coming toward the end of our discussions on readability of code, and it is a good time to introduce some general rules of thumb to follow when writing code comments. These can be listed as follows:

  • Comments should be descriptive, and should explain the code. A comment that simply repeats what is obvious from the function name is not very useful.

    Here is an example. Both of the following listings show the same implementation of a Root-Mean-Squared (RMS) velocity calculation, but the second version has a much more useful docstring than the first:

    def rms(varray):
        """ RMS velocity """

        squares = map(lambda x: x*x, varray)
        return pow(sum(squares) / len(varray), 0.5)

    def rms(varray):
        """ Root mean squared velocity. Returns
        square root of mean of squares of velocities """

        squares = map(lambda x: x*x, varray)
        return pow(sum(squares) / len(varray), 0.5)
  • Code comments should be written above the block they describe, as follows:
    # This code calculates the squares of velocities
    squares = map(lambda x: x*x, varray)

    The preceding version is much clearer than the following version, which uses comments below the code:

    squares = map(lambda x: x*x, varray)
    # The above code calculates the squares of velocities
  • Inline comments should be used as little as possible. This is because it is very easy to get these confused as part of the code itself, especially if the separating comment character is accidentally deleted, causing bugs:
    squares = map(lambda x: x*x, varray)   # Calculate squares of velocities
  • Try to avoid comments that are superfluous and add little value:
    # The following code iterates through odd numbers
    for num in nums:
        # Skip if number is even
        if num % 2 == 0: continue

    The second comment in the last piece of code adds little value and can be omitted.
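
Putting these rules of thumb together, here is a minimal sketch of a small function whose comments sit above the block they describe and explain intent rather than restating the code. The function name sum_of_odd_squares and its logic are assumptions made for illustration.

```python
def sum_of_odd_squares(nums):
    """Return the sum of the squares of the odd numbers in nums."""

    total = 0
    # Even numbers do not contribute to the result, so skip them up
    # front instead of nesting the real work inside an 'if' block.
    for num in nums:
        if num % 2 == 0:
            continue
        total += num * num
    return total


print(sum_of_odd_squares([1, 2, 3, 4, 5]))
```

The docstring states what the function returns, while the single comment records why the loop is structured the way it is.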
