- Software Architecture with Python
- Anand Balachandran Pillai
Understanding testability
Testability can be defined as follows:
"The degree of ease with which a software system exposes its faults through execution-based testing".
A software system with a high level of testability provides a high degree of exposure of its faults through testing, thereby giving the developers higher accessibility to the system's issues and allowing them to find and fix bugs faster. A less testable system, on the other hand, would make it difficult for developers to figure out issues with it and can often lead to unexpected failures in production.
Testability is, hence, an important aspect in ensuring the quality, stability, and predictability of the software system in production.
Software testability and related attributes
A software system is testable if it gives up (exposes) its faults easily to the tester. Not only that, the system should also behave in a predictable way for the tester to develop useful tests. An unpredictable system gives varying outputs for a fixed input at different times and is, hence, not testable (or very useful, for that matter!).
Apart from unpredictability, complex or chaotic systems are also less amenable to testing. For example, a system whose behavior varies wildly across a spectrum under load doesn't make a good candidate for load testing. Hence, deterministic behavior is also important to assure the testability of a system.
Another aspect is the amount of control that the tester has over the substructures of the system. In order to design meaningful tests, a system should be easily decomposable into subsystems with well-defined APIs, for which tests can be written. A software system that is complex and doesn't provide easy access to its subsystems is, by definition, much less testable than one that does.
This means that systems that are more structurally complex are more difficult to test than ones that aren't.
Let's list this in an easy-to-read table:

| System determinism | System complexity | Testability |
| --- | --- | --- |
| High | Low | High |
| Low | High | Low |
Testability – architectural aspects
Software testing generally implies that the software artifact being tested is assessed for its functionality. However, in practical software testing, functionality is just one of the aspects that can fail. Testing also implies assessing the software for other quality attributes, such as performance, security, and robustness.
Due to these different aspects of testing, software testability is usually grouped at different levels. We will take a look at these from the point of view of software architecture.
Here is a brief listing of the different aspects that usually fall under software testing:
- Functional testing: This involves testing the software to verify its functionality. A unit of software passes its functional test if it behaves exactly the way it is supposed to as per its development specifications. Functional testing is usually of two types:
- White-box testing: These are tests usually implemented by the developers themselves, who have visibility into the software code. The units being tested here are the individual functions, methods, classes, or modules that make up the software, rather than end user functionality. The most basic form of white-box testing is unit testing. Other types are integration testing and system testing.
- Black-box testing: This type of testing is usually performed by someone outside the development team. The tests have no visibility into the software code and treat the entire system as a black box. Black-box testing exercises the end user functionality of the system without bothering about its internal details. Such tests are usually performed by dedicated testing or QA engineers. However, nowadays, a lot of black-box tests on web-based applications can be automated by using testing frameworks such as Selenium, as sketched below.
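As a minimal, hedged sketch of such automation (the URL and the expected page title here are placeholders, not taken from the original text), a Selenium-driven black-box check could look like this:

```python
from selenium import webdriver

# Drive a real browser against the running application and assert
# on what an end user would actually see - no knowledge of the
# application's internals is needed.
driver = webdriver.Firefox()
try:
    driver.get('http://localhost:8000/login')
    assert 'Login' in driver.title
finally:
    # Always close the browser, even if the assertion fails
    driver.quit()
```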
Other than functional testing, there are a lot of testing methodologies that are used to assess the various architectural quality attributes of a system. We will discuss these next.
- Performance testing: Tests that measure how the software performs with respect to its responsiveness and robustness (stability) under high workloads come within this category. Performance tests are usually categorized into the following:
- Load testing: These are tests that assess how a system performs under a certain specific load, either in terms of the number of concurrent users, input data, or transactions.
- Stress testing: These tests assess the robustness and response of the system when some of its inputs grow suddenly or at a high rate, up to extreme limits. Stress tests typically push the system slightly beyond its prescribed design limits. A variation of stress testing is running the system under a certain specified load for an extended period of time and measuring its responsiveness and stability.
- Scalability testing: These tests measure how much the system can scale out or scale up when the load is increased. For example, if a system is configured to use a cloud service, such tests can measure horizontal scalability (how the system auto-scales to a certain number of nodes upon increased load) or vertical scalability (the degree of utilization of the CPU cores and/or RAM of the system).
- Security testing: Tests that verify the system's security fall into this category. For web-based applications, this usually involves verifying authorization of roles by checking that a given login or role can only perform a specified set of actions and nothing more (or less). Other tests that fall under security would be to verify proper access to data or static files to make sure that all sensitive data of an application is protected by proper authorization via logins.
- Usability testing: Usability testing involves testing how easy to use, intuitive, and understandable the user interface of a system is for its end users. It is usually done via target groups comprising selected people who fall within the definition of the intended audience or end users of the system.
- Installation testing: For software that is shipped to the customer's location and is installed there, installation testing is important. This tests and verifies that all of the steps involved in building and/or installing the software at the customer's end work as expected. If the development hardware differs from the customer's, then the testing also involves verifying the steps and components in the end user's hardware. Apart from a regular software installation, installation testing is also important when delivering software updates, partial upgrades, and so on.
- Accessibility testing: Accessibility, from a software standpoint, refers to the degree of usability and inclusiveness of a software system for end users with disabilities. This is usually addressed by incorporating support for accessibility tools in the system and by designing the user interface using accessible design principles. A number of standards and guidelines have been developed over the years that allow organizations to develop software that is accessible to such an audience. Examples are the Web Content Accessibility Guidelines (WCAG) of the W3C, Section 508 of the US government, and the like.
Accessibility testing aims to assess the accessibility of software with respect to these standards, wherever applicable.
There are various other types of software testing, which involve different approaches and are invoked at various phases of software development, such as regression testing, acceptance testing, alpha or beta testing, and so on.
However, since our focus of discussion is on the architectural aspects of software testing, we will limit our attention to the topics mentioned in the previous list.
Testability – strategies
We saw in a previous section how testability varies according to the complexity and determinism of the software system under testing.
Being able to isolate and control the artifacts that are being tested is critical to software testing. Separation of concerns in the system being tested, that is, being able to test components independently and without too many external dependencies, is key to this.
Let's look at the strategies a software architect can employ to make sure that the components being subjected to tests provide predictable and deterministic behavior, and hence valid and useful test results.
Reduce system complexity
As mentioned earlier, a complex system has lower testability. System complexity can be reduced by techniques such as splitting the system into subsystems, providing well-defined APIs for the parts to be tested, and so on. Here is a list of these techniques in some detail:
- Reducing coupling: This is to isolate components so that coupling is reduced in the system. Inter-component dependencies should be well defined, and if possible, documented.
- Increasing cohesion: This is to increase cohesion of modules, that is, to make sure that a particular module or class performs only a well-defined set of functions.
- Providing well-defined interfaces: Try to provide well-defined interfaces for getting and setting the state of the components and classes involved. For example, getters and setters allow us to provide specific methods for getting and setting the value of a class's attributes, and a reset method allows the caller to restore the internal state of an object to what it was at creation time. In Python, this can be done by defining properties; see the sketch after this list.
- Reducing class complexity: This means reducing the number of classes a class derives from. A metric called Response For a Class (RFC) is the set of methods of a class C, plus the methods on other classes called by the methods of class C. It is suggested to keep the RFC of a class within manageable limits, usually not more than 50 for small to medium-sized systems.
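Here is a minimal sketch of such an interface in Python (the Stack class below is purely illustrative and not from the original text), using a property as a well-defined getter and a reset() method that restores the creation-time state:

```python
class Stack:
    """ An illustrative class with a well-defined interface
    for inspecting and resetting its internal state """

    def __init__(self):
        self.reset()

    @property
    def items(self):
        """ Getter: expose a copy of the internal state for tests """
        return list(self._items)

    def push(self, item):
        """ Add an item on top of the stack """
        self._items.append(item)

    def pop(self):
        """ Remove and return the top item """
        return self._items.pop()

    def reset(self):
        """ Restore the object to its state at creation time """
        self._items = []
```

A test can then push and pop items, inspect the state via the items property, and call reset() between test cases to guarantee a repeatable starting state.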
Improving predictability
We saw that having a deterministic behavior is very important to design tests that provide predictable results, and hence, can be used to build a test harness for repeatable testing. Here are some strategies to improve the predictability of the code under test:
- Correct exception handling: Missing or improperly written exception handlers are one of the main reasons for bugs, and hence unpredictable behavior, in software systems. It is important to find the places in the code where exceptions can occur and then handle the errors. Most of the time, exceptions occur when code interacts with an external resource, such as performing a database query, fetching a URL, waiting on a shared mutex, and the like.
- Infinite loops and/or blocked waits: When writing loops that depend on specific conditions, such as the availability of an external resource, or on getting a handle to or data from a shared resource, say a shared mutex or queue, it is important to make sure that safe exit or break conditions are always provided in the code. Otherwise, the code can get stuck in infinite loops that never break, or in never-ending blocked waits on resources, causing bugs that are hard to troubleshoot and fix.
- Logic that is time dependent: When implementing logic that depends on certain times of the day (hours or specific weekdays), make sure that the code works in a predictable fashion. When testing such code, we often need to isolate such dependencies by using mocks or stubs, as shown in the sketch after this list.
- Concurrency: When writing code that uses concurrent methods such as multiple threads and/or processes, it is important to make sure that the system logic is not dependent on threads or processes starting in any specific order. The system state should be initialized in a clean and repeatable way via well-defined functions or methods that allow the system behavior to be repeatable, and hence, testable.
- Memory management: A very common reason for software errors and unpredictability is incorrect usage and mismanagement of memory. In modern runtimes with dynamic memory management, such as Python, Java, or Ruby, this is less of a problem. However, memory leaks and unreleased memory leading to bloated software are still very much a reality in modern software systems.
It is important to analyze and be able to predict the maximum memory usage of your software system so that you allocate enough memory for it and run it on the right hardware. Also, software should be periodically evaluated and tested for memory leaks and better memory management, and any major issues should be addressed and fixed.
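As a minimal sketch of isolating a time dependency (the get_now and is_weekend_offer_active functions below are invented for illustration), the wall clock can be wrapped in a small helper that a test then replaces with a stubbed value:

```python
import datetime
from unittest.mock import patch

def get_now():
    """ Thin wrapper around the clock so that tests can replace it """
    return datetime.datetime.now()

def is_weekend_offer_active():
    """ Time-dependent logic: the offer is active only on weekends """
    return get_now().weekday() >= 5   # 5 = Saturday, 6 = Sunday

def test_weekend_offer_active():
    # Stub the clock wrapper so the result does not depend on the
    # day the test happens to run ('__main__' assumes this file is
    # executed directly as a script)
    a_saturday = datetime.datetime(2021, 7, 3)
    with patch('__main__.get_now', return_value=a_saturday):
        assert is_weekend_offer_active()

if __name__ == '__main__':
    test_weekend_offer_active()
    print('test passed')
```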
Control and isolate external dependencies
Tests usually have some sort of external dependency. For example, a test may need to load/save data to/from a database. Another may depend on the test running at specific times of the day. A third may require fetching data from a URL on the web.
However, having external dependencies usually complicates a test scenario. This is because external dependencies are usually not within the control of the test designer. In the aforementioned cases, the database may be in another data center, the connection may fail, or the website may not respond within the configured time or may give a 5xx error.
Isolating such external dependencies is very important in designing and writing repeatable tests. The following are a few techniques for the same:
- Data sources: Most realistic tests require data of some form. More often than not, data is read from a database. However, a database being an external dependency cannot be relied upon. The following are a few techniques to control data source dependencies:
- Using local files instead of a database: Quite often, test files with prefilled data can be used instead of querying a database. Such files could be text, JSON, CSV, or YAML files. Usually, such files are used with mock or stub objects.
- Using an in-memory database: Rather than connecting to a real database, a small in-memory database could be used. A good example is the SQLite DB, a file or memory-based database which implements a good, but minimal, subset of SQL.
- Using a test database: If the test really requires a database, the operation can use a test database that uses transactions. The database is set up in the setUp() method of the test case and rolled back in the tearDown() method, so that no real data remains at the end of the operation, as shown in the sketch below.
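Here is a minimal sketch of that idea (the users table and the TestUserStore class are invented for illustration), combining an in-memory SQLite database with the setUp()/tearDown() lifecycle of a unittest test case:

```python
import sqlite3
import unittest

class TestUserStore(unittest.TestCase):
    """ Uses an in-memory SQLite database as a disposable data source """

    def setUp(self):
        # ':memory:' creates a private, in-memory database that
        # disappears when the connection is closed
        self.conn = sqlite3.connect(':memory:')
        self.conn.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)')
        self.conn.execute("INSERT INTO users (name) VALUES ('alice')")
        self.conn.commit()

    def tearDown(self):
        # Closing the connection discards all the data
        self.conn.close()

    def test_user_count(self):
        count = self.conn.execute('SELECT COUNT(*) FROM users').fetchone()[0]
        self.assertEqual(count, 1)

if __name__ == '__main__':
    unittest.main()
```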
- Resource virtualization: In order to control the behavior of resources that are outside the system, we can virtualize them, that is, build a version of these resources that mimics their APIs, but not their internal implementation. Some common techniques for resource virtualization are as follows:
- Stubs: Stubs provide standard (canned) responses to function calls made during a test. A Stub() function stands in for the details of the function it replaces, returning only the response that is required. For example, here is a function that returns the data for a given URL:

```python
import hashlib
import requests

def get_url_data(url):
    """ Return data for a URL """
    # Fetch the data, then save it in a file whose name
    # is a hash of the URL
    data = requests.get(url).content
    # Hash the URL (encoded to bytes) to build the filename
    filename = hashlib.md5(url.encode('utf-8')).hexdigest()
    # The content is bytes, so write in binary mode
    open(filename, 'wb').write(data)
    return data
```
And the following is the stub that replaces it, which internalizes the external dependency on the URL:

```python
import hashlib
import os

def get_url_data_stub(url):
    """ Stub function replacing get_url_data """
    # No actual web request is made; instead, the previously
    # saved file is opened and its data returned
    filename = hashlib.md5(url.encode('utf-8')).hexdigest()
    if os.path.isfile(filename):
        return open(filename, 'rb').read()
```
A more common way to write such a function is to combine both the original request and the file cache in the same code. The URL is requested just once, the first time the function is called, and in subsequent calls, the data from the file cache is returned:

```python
import hashlib
import os
import requests

def get_url_data(url):
    """ Return data for a URL """
    # First check for a cached file - if it exists, return its
    # contents. Note that we are not checking the age of the
    # file, so the content may be stale.
    filename = hashlib.md5(url.encode('utf-8')).hexdigest()
    if os.path.isfile(filename):
        return open(filename, 'rb').read()
    # First time - fetch the URL and write the data to the file.
    # In subsequent calls, the file contents will be returned.
    data = requests.get(url).content
    open(filename, 'wb').write(data)
    return data
```
- Mocks: Mocks fake the API of the real-world objects they replace. We program mock objects directly in the test by setting expectations—in terms of the type and order of the arguments the functions will expect and the responses they will return. Later, the expectations can be optionally verified in a verification step.
We will see examples of writing unit tests via mocks with Python later; a quick sketch follows the note below.
Note
The main difference between mocks and stubs is that a stub implements just enough behavior for the object under test to execute the test. A mock usually goes beyond by also verifying that the object under test calls the mock as expected—for example, in terms of number and order of arguments.
When using a mock object, part of the test involves verifying that the mock was used correctly. In other words, both mocks and stubs answer the question, "What is the result?", but mocks also answer the question, "How was the result achieved?"
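As a hedged sketch of the idea (assuming the non-caching get_url_data function shown earlier, repeated here so the example is self-contained), a mock can replace the real network call and verify how it was used:

```python
import hashlib
import unittest
from unittest.mock import patch

import requests

def get_url_data(url):
    """ Same as the earlier, non-caching version """
    data = requests.get(url).content
    filename = hashlib.md5(url.encode('utf-8')).hexdigest()
    open(filename, 'wb').write(data)
    return data

class TestGetURLData(unittest.TestCase):
    """ Test get_url_data without making a real web request """

    @patch('requests.get')
    def test_get_url_data(self, mock_get):
        # Program the mock with a canned response object
        mock_get.return_value.content = b'<html>hello</html>'
        data = get_url_data('http://example.com')
        # What is the result? The canned content is returned
        self.assertEqual(data, b'<html>hello</html>')
        # How was the result achieved? requests.get was called
        # exactly once, with the expected URL
        mock_get.assert_called_once_with('http://example.com')

if __name__ == '__main__':
    unittest.main()
```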
- Fakes: Fake objects have working implementations, but fall short of production usage because they have some limitations. A Fake object provides a very lightweight implementation, which goes beyond just stubbing the object. For example, here is a Fake object that implements very minimal logging, mimicking the API of the Logger object of Python's logging module:

```python
import logging

class FakeLogger(object):
    """ A class that fakes the interface of the
    logging.Logger object in a minimalistic fashion """

    def __init__(self):
        self.lvl = logging.INFO

    def setLevel(self, level):
        """ Set the logging level """
        self.lvl = level

    def _log(self, msg, *args):
        """ Perform the actual logging """
        # Since this is a fake object, no actual logging is done.
        # Instead, the message is simply printed to standard output.
        print(msg, end=' ')
        for arg in args:
            print(arg, end=' ')
        print()

    def info(self, msg, *args):
        """ Log at info level """
        if self.lvl <= logging.INFO:
            return self._log(msg, *args)

    def debug(self, msg, *args):
        """ Log at debug level """
        if self.lvl <= logging.DEBUG:
            return self._log(msg, *args)

    def warning(self, msg, *args):
        """ Log at warning level """
        if self.lvl <= logging.WARNING:
            return self._log(msg, *args)

    def error(self, msg, *args):
        """ Log at error level """
        if self.lvl <= logging.ERROR:
            return self._log(msg, *args)

    def critical(self, msg, *args):
        """ Log at critical level """
        if self.lvl <= logging.CRITICAL:
            return self._log(msg, *args)
```
The FakeLogger class in the preceding code implements some of the main methods of the logging.Logger class that it is trying to fake. It is ideal as a fake object for replacing the Logger object when implementing tests, as in the usage sketch below.
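As a hedged usage sketch (the filter_odd function and its logger parameter are invented for illustration and assume the FakeLogger class above is in scope), a test can inject the fake where production code would take a real logging.Logger:

```python
def filter_odd(numbers, logger):
    """ Return only the odd numbers, logging a summary.
    (Illustrative code under test, not from the original text.) """
    odd = [n for n in numbers if n % 2]
    logger.info('kept %d of %d numbers', len(odd), len(numbers))
    return odd

def test_filter_odd():
    # Inject the fake - no logging configuration is needed,
    # and the output simply goes to standard output
    fake = FakeLogger()
    assert filter_odd([1, 2, 3, 4], fake) == [1, 3]

test_filter_odd()
```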