Good Automated Tests are Critical to Agility


Some people new to Agile think it bypasses many of the checks and controls put in place at a waterfall organization.  However, when Agile is done correctly, these controls become automated tests in a Continuous Integration (CI) and Continuous Delivery (CD) pipeline.  This provides the same checks and controls as before, if not more, but at much greater speed and lower cost.  This has a significant impact on your Agile projects.

Agile provides cost savings by delivering projects earlier and more in line with stakeholders’ needs.  These results come from less waste and a focus on delivering the highest-value features with the least effort required.  Stakeholders see the value of the software early on and can provide constructive feedback to guide it forward.  The key to Agile is iteration.

Successful Agile projects iterate quickly.  Iteration allows for adaptability based on internal and external circumstances.  The product owner and stakeholders handle external circumstances, while Scrum sprints help protect the Agile team.  Internal circumstances affect the team on an hourly basis and can greatly impact the speed of an Agile team.  Getting the most agility from your Agile team requires constant, quick feedback.  Achieve this with automated testing from development to production using Continuous Delivery (CD).

Continuous Delivery Feedback Loops

CD is all about providing feedback to developers and project stakeholders.  The faster they get this feedback, the quicker they can correct any issues.  This is great, since the developer still has fresh experience with the faulty code and can quickly fix the issue.  In contrast, a bug that reaches production can take up to 24 times longer to fix, due to the time between introducing the bug and identifying it.

In the example CD pipeline below, we can see feedback coming from each stage of the pipeline.  In the example, feedback goes back to the developer.  However, in an Agile environment, there should be information radiators which anyone can see.  This means stakeholders, management, security, operations, basically anyone in the company, can see how the CD pipeline is doing.  This allows for quick resolution of issues.

CD Pipeline & Feedback 2

Do not boil CD pipeline metrics down to one number.  Some people desire a single metric.  Resist this urge and learn the meaningful value of each metric coming from your CD pipeline.  Then set thresholds for each of these metrics where possible.  As critical as speed is to a CD pipeline, it is even more critical to fill the pipeline with quality automated tests and checks.

Continuous Delivery and Automated Testing

Getting maximum velocity from your CD pipeline requires testing; use automated testing where possible.  In this post, for simplicity, we use the term automated testing to cover a wide swath.  In most cases, the term implies automated tests, analysis or anything else measuring the quality of the software.  However, it may also include any manual or automated process that generates a metric in the CD pipeline.

If your CD pipeline is missing automated tests, then it is not of much value.  At each stage of the CD pipeline, there should be one or more test suites run.  Each test should provide feedback.  This feedback helps people understand CD pipeline status and the quality of code moving through it.  The CD process uses the same feedback to move the process forward.

In the example CD pipeline below, each stage runs one or more automated test suites.  Let’s identify automated tests you might see in each stage.

CD Pipeline

Git Stage

The Git stage is the source version control stage.  Developers commit their source code to the repository, which starts the CD pipeline process.  Code reviews and pull requests can happen here, which may call for automated tests to run before the code is fully committed.  These pre-commit tests are designed to quickly check the code prior to commit and to help with code reviews.

Code reviews are an important manual check and should not be automated.  However, they should be augmented with automated tests, so the code reviewer can focus on things the automation cannot do well, like software design and readability, to mention a couple.  Once the code passes review and is pushed to the repo, it is time to build and package it.

Build Stage

This is where code is built and packaged before moving down the CD pipeline.  Several automated tests can begin once the code is packaged.  Due to the high number of tests, it is best to run them in parallel.  We recommend a pipeline which supports fan-out and fan-in.  Here is a list of automated tests to consider running at this stage:

  • Unit tests
  • Code quality analysis
  • Static Application Security Testing (SAST)
  • Open source scan

None of these automated tests requires an application environment to run.  This means they can run on the same system doing the build/packaging or on a dedicated system for each test.  Once the software package passes its tests, it is versioned and stored in an artifact repo.  This versioned software in the repo is used throughout the CD pipeline.  Now, the software is deployed to QA for automated testing.

QA Stage

The QA stage is an application environment mimicking production for automated testing.  The focus at this stage is end-to-end (E2E) testing of the application.  Another test to run in this stage is Dynamic Application Security Testing (DAST).  Before starting the E2E tests, it is best to run a smoke test.  This ensures the application is connected to all its components and is ready for the E2E tests.

UAT Stage

The UAT stage is where the non-automated E2E tests are conducted.  The environment is similar to the QA environment.  It too uses smoke tests to ensure the environment is ready for testing.  As the automated tests prove their worth, this environment may go away to speed up delivery to production.  Or it may be pulled out of the CD pipeline but still used for manual exploratory testing.  The results from exploratory testing then feed back into automated testing.  How this environment is used is a business decision, based on a strong and successful history of automation.  With successful testing, it is time to deploy to production.

Production Stage

Your application is in production, so no need for automated tests, right?  Wrong!  When deploying to any application environment, you should always run, at a minimum, a smoke test.  This ensures the environment has all its systems integrated and working together as planned.

There are other automated tests to run in production.  One is checkups: automated checks that look for hard-to-find issues like race conditions and data corruption.  These should run in parallel with other types of monitoring.  One type of monitoring, similar to automated E2E testing, is Synthetic Monitoring.  It simulates a user using the application to ensure areas of the application still meet SLA requirements.

Automated Tests vs. Manual Tests

Not all tests can be automated or benefit from automation.  Some tests may be too hard to verify with software, so having a person do them manually is preferred.  Others are too time-consuming to build out upfront, but may be appropriate to automate in the long term.  You need to consider these tradeoffs when automating tests.  In some cases, it may be better not to create the test at all.

When Not to Test

With continuous delivery, some people want to automate everything!  It is easy to get into that mindset, but it is all about ROI.  At some point, it may not be worth adding a test, as it provides little or no value.  Determining this value is the trick.

The pyramid below, based on an article from Google’s Testing Blog, will help determine the value of a test.

When writing automated tests, consider three things:

  1. Likelihood of bugs
  2. Cost of the bugs
  3. Cost of testing

The likelihood of bugs is higher in complex code maintained by larger teams.  As you can see in the examples below, there are two extremes.  The simple getter has no logic in it, so it really does not require a test.  The complex example requires a few automated tests to cover all the branches within it.

Simple code example:

    public Employee getEmployee() {
        return employee;
    }

Complex code example:

    
    // Requires java.util.LinkedList and java.util.Queue imports.
    // Vertices are numbered 1..numberOfNodes; row/column 0 is unused.
    public void bfs(int[][] adjacencyMatrix, int source)
    {
        int numberOfNodes = adjacencyMatrix[source].length - 1;
        int[] visited = new int[numberOfNodes + 1];
        // The original snippet used an undeclared queue; declare it here.
        Queue<Integer> queue = new LinkedList<>();
        visited[source] = 1;
        queue.add(source);

        while (!queue.isEmpty())
        {
            int element = queue.remove();
            // Visit every unvisited neighbor of the current vertex.
            for (int i = 1; i <= numberOfNodes; i++)
            {
                if (adjacencyMatrix[element][i] == 1 && visited[i] == 0)
                {
                    queue.add(i);
                    visited[i] = 1;
                }
            }
        }

        // The graph is connected only if BFS reached every vertex.
        boolean connected = true;
        for (int vertex = 1; vertex <= numberOfNodes; vertex++)
        {
            if (visited[vertex] == 0)
            {
                connected = false;
                break;
            }
        }

        if (connected)
        {
            System.out.println("The graph is connected");
        } else
        {
            System.out.println("The graph is disconnected");
        }
    }
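To see what those tests look like, here is a sketch of unit tests for the connectivity check above.  It assumes the logic is refactored to return a boolean rather than print (an assumption made here to make assertions easy; it is not the original design):

```java
import java.util.LinkedList;
import java.util.Queue;

public class GraphTest {
    // Testable variant of the BFS above: returns whether every vertex
    // (numbered 1..n, as in the example) is reachable from `source`.
    public static boolean isConnected(int[][] adjacencyMatrix, int source) {
        int numberOfNodes = adjacencyMatrix[source].length - 1;
        boolean[] visited = new boolean[numberOfNodes + 1];
        Queue<Integer> queue = new LinkedList<>();
        visited[source] = true;
        queue.add(source);
        while (!queue.isEmpty()) {
            int element = queue.remove();
            for (int i = 1; i <= numberOfNodes; i++) {
                if (adjacencyMatrix[element][i] == 1 && !visited[i]) {
                    visited[i] = true;
                    queue.add(i);
                }
            }
        }
        for (int vertex = 1; vertex <= numberOfNodes; vertex++) {
            if (!visited[vertex]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Path 1-2-3: every vertex reachable, so connected.
        int[][] connected = {
            {0, 0, 0, 0},
            {0, 0, 1, 0},
            {0, 1, 0, 1},
            {0, 0, 1, 0}
        };
        // Vertex 3 isolated: disconnected.
        int[][] disconnected = {
            {0, 0, 0, 0},
            {0, 0, 1, 0},
            {0, 1, 0, 0},
            {0, 0, 0, 0}
        };
        assert isConnected(connected, 1);
        assert !isConnected(disconnected, 1);
        System.out.println("branch tests passed");
    }
}
```

Two tests already exercise both outcomes of the connectivity branch; a thorough suite would also cover empty neighbors and a single-vertex graph.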

Cost of the bugs refers to the severity of their impact on the system.  Payments, security and other high-severity systems should have more automated testing when compared to a function like printing.

Cost of tests takes into account the time and effort to create automated tests.  Unit tests are the cheapest tests to create.  This makes them great for refactoring.  Integration and E2E tests are more expensive to create and run.

Combined, these three factors should influence when a team creates automated tests.  With a solid understanding of where automated tests go within your CD pipeline and when to create tests, it is time to look at all the automated tests in detail.

Types of Automated Tests

This is not a complete list of testing for a CD pipeline, but it covers some of the more common automated tests used in one.  Each of these automated tests provides a set of metrics showing a specific aspect of the quality of the application.  Some metrics are a better reflection of quality than others, so identify those and focus on them.  In this section, you will learn more about the different types of automated tests and some of these key metrics, so you can get the most out of your CD pipeline.

Pre-Commit Tests

Pre-commit tests are the very beginning of the pipeline to production.  Not every commit results in a push to production, but every commit can change the quality of the code.  Code quality has many aspects, from being maintainable by humans to being reliable, along with a number of others.  Pre-commit tests help code reviews before code is committed.

Mozilla, the creators of Firefox and other open source tools, has a stringent code review process for code committed to its projects.  So much so, it warns new contributors upfront, so they are not upset when their code review fails.  Pre-commit tests can help with code reviews or be the initial gatekeeper before committing the code.  Pre-commit tests are quick automated tests and analysis.

For code reviews, run code quality analysis, like lint and style checks.  Together these two automated tests help the reviewer and the committer identify potential issues.  To help reduce broken builds, run unit tests at this point as well.  They are a quick automated test suite.  Pre-commit tests can go beyond code health to include repository and security health.

In 2013, GitHub released a new search service, which allowed people to find passwords and other security details left in repositories.  These should not be in a project’s repository, as they compromise security, especially if they are for production.  Pre-commit tests should check for secrets like passwords and security tokens.  Git Secret is one option to deal with this issue.
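As an illustration of the idea, a minimal secret scanner can flag lines matching common secret patterns.  The patterns below are illustrative assumptions; real tools like Git Secret ship far larger, battle-tested rule sets:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class SecretScan {
    // Patterns that commonly indicate a committed secret (illustrative only).
    private static final List<Pattern> RULES = List.of(
        Pattern.compile("(?i)password\\s*=\\s*\\S+"),     // hard-coded password
        Pattern.compile("AKIA[0-9A-Z]{16}"),              // AWS access key id shape
        Pattern.compile("(?i)api[_-]?key\\s*[:=]\\s*\\S+") // generic API key
    );

    // Returns the lines that look like they contain a secret.
    public static List<String> findSecrets(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            for (Pattern rule : RULES) {
                if (rule.matcher(line).find()) {
                    hits.add(line);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> hits = findSecrets(List.of(
            "String apiKey = \"abc123\";",
            "int port = 8080;"
        ));
        hits.forEach(line -> System.out.println("possible secret: " + line));
    }
}
```

A pre-commit hook would run such a scan over the staged diff and fail the commit on any hit.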

Unit Tests

Unit tests validate the functionality of the application at the lowest level.  They focus on validating the expected behavior of one function/method and its supporting code.  These automated tests are about helping the developer.  They help validate low-level functionality and, even more importantly, help with refactoring.  These automated tests allow for major redesigns/refactoring without breaking functionality.  Redesigns are common when the team has to make changes to support new functionality.  Unit tests help the developer by providing fast feedback.

Unit tests are typically the fastest test suite out of unit, integration and E2E.  This speed is achieved by not running any additional software: just the unit test harness and the code under test.  This means no database and limited to no frameworks or other support software.  That additional software just adds time to the test suite and no other value.  If your testing requires these additional items, we recommend moving it to integration testing.  Now that you have a unit test suite, how do you make sure it is a quality test suite?

Unit Test Feedback

A unit test framework will run your tests and provide limited feedback.  In the figure below, you can see the result of the Maven test command.  This unit test run was a success, along with some additional information.  However, it gives you no insight into the quality of the test suite; code coverage can help.

Code coverage is a great tool for showing one aspect of unit test suite quality.  In addition, it allows the team to target code needing unit tests.  This is a key element in knowing when not to test.  The image below shows a JaCoCo report.  JaCoCo is one of many code coverage tools, so it is very likely you can find one for your project.  The code coverage report shows coverage metrics for a sample project.  It is ordered by package name, with the highest areas of concern first.  It allows the user to drill down into the package and eventually to the source code itself, shown in the following image.

The image below shows code highlighted in red, yellow and green.  Red means not tested.  Yellow means partially tested; this example shows the start of a branch with only one path tested.  Green means fully tested.  These colors allow a developer to quickly focus their unit testing efforts.
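As a hypothetical illustration of partial branch coverage, the method below has a compound condition.  A single test leaves the branch yellow, because the false outcomes never run; two more tests turn it green (the method and values are made up for this example):

```java
public class Discount {
    // 10% discount only for members with large orders.
    public static int discountFor(boolean isMember, int orderTotal) {
        if (isMember && orderTotal > 100) {
            return 10;
        }
        return 0;
    }

    public static void main(String[] args) {
        // One test: the branch reports yellow, as the false paths never run.
        assert discountFor(true, 200) == 10;

        // Two more tests exercise both false outcomes, turning the branch green.
        assert discountFor(false, 200) == 0;
        assert discountFor(true, 50) == 0;
        System.out.println("all branches exercised");
    }
}
```

Running a coverage tool such as JaCoCo after each test run shows exactly which of these paths remain untested.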

Unit tests are great, but they are even better with code coverage, as it gives you more metrics.  Some metrics are great for setting thresholds, while others are great for trends.  This allows teams to improve the coverage of their unit tests and to make better decisions on when to unit test and when not to.

Integration Tests

Following Google’s Testing Blog as a guide, their medium-size tests are a good set of criteria for integration tests.  This is a big step up from unit tests, as there are several software processes in the test.  For speed and reduced complexity, keep integration tests on one system.  This is great for testing the server side of an application, if it is client-server in design.  These automated tests are about making sure key components of the system work together.

In a web application, you would test the server-side components integrated together on one system.  For simplicity, the clients talk to the server via REST, which your integration tests will use to exercise the system under test.  As the system is exercised, the test suite will validate the other system components to ensure the system is behaving correctly.  Here are some examples of these validations:

  • Checks the database to ensure CRUD operations are correct
  • Checks the caching system to ensure the data and its retention time are correct
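The shape of those validations can be sketched as below.  `UserService` and its backing maps are hypothetical stand-ins invented for this example; in a real integration test the service would run as a separate process, the test would drive it over REST, and the assertions would query the actual database and cache:

```java
import java.util.HashMap;
import java.util.Map;

public class IntegrationSketch {
    // Hypothetical server-side service; in a real integration test this would
    // be a separate process exercised over its REST API.
    public static class UserService {
        public final Map<Integer, String> database = new HashMap<>(); // stand-in for the DB
        public final Map<Integer, String> cache = new HashMap<>();    // stand-in for the cache

        public void createUser(int id, String name) {
            database.put(id, name);
            cache.put(id, name);
        }

        public void deleteUser(int id) {
            database.remove(id);
            cache.remove(id);
        }
    }

    public static void main(String[] args) {
        UserService service = new UserService();

        // Exercise the system under test.
        service.createUser(42, "Ada");

        // Validate the backing stores, not just the API response.
        assert "Ada".equals(service.database.get(42)); // CRUD check
        assert "Ada".equals(service.cache.get(42));    // cache check

        service.deleteUser(42);
        assert !service.database.containsKey(42);
        System.out.println("integration-style checks passed");
    }
}
```

The key design point is that the assertions look behind the API at the stores themselves, which is what distinguishes these checks from a unit test.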

Unlike unit tests, code coverage is not used here.  To ensure a quality integration test suite, we recommend code reviews prior to committing code.

End-to-End Tests

E2E tests exercise a full system from the client to the furthest server-side systems.  This testing requires an environment that mimics production.  In some cases, it may not be possible to test at the scale of production.  However, the environment should have all the components of production, just scaled down.

E2E tests can be automated or manual.  We recommend automating as much as possible, but remember the when-not-to-test guidelines.  Your E2E test suite should be your smallest compared to the other automated test suites, so do not go overboard.

E2E tests control the client(s) to exercise the system under test.  Basically, the E2E tests mimic users, but the automated tests themselves may access server-side systems to verify tests ran as expected.  To control the client, a tool like Selenium is great for driving a browser for your web app.  For mobile apps, there are Selendroid and Appium.  With all the client technologies available, there are many client configurations to test on.  This makes it too expensive for a company to buy and maintain a fleet of client devices.  This is where online services can help.

Companies like BrowserStack and Sauce Labs offer the ability to test on a number of clients.  They offer many configurations, from OS and browser combinations to mobile devices.  Your test suite can run in your local network and create a secure connection to these services.  This allows you to keep your software secure in your network but exercise your app with the same clients your users use.  This eliminates the need for companies to maintain a fleet of clients.

Code Quality Analysis

Code quality analysis measures different quality aspects.  Functional, security and performance aspects are handled by other automated tests.  Code quality is about the following:

  • How easy is the code to maintain?
  • Does the code follow the style guide?
  • How complex is the code?
  • Does the code have ‘possible’ errors?
  • Is the code following language guidelines?
  • Are there spelling errors?
  • Is there dead code in the project?

Modern code quality analysis tools can do a lot to improve the quality of a project’s code.  They are typically highly configurable to meet the needs of a project.  Tools like SonarQube, lint tools like TSLint for TypeScript, and a style checker like the Maven Checkstyle plugin fit into this category.  Data from these tools should be put on an information radiator, in histogram form, to show trends.

Smoke Tests

Smoke tests verify that setup and integration are correct.  This suite confirms proper configuration with a short and limited set of tests.  They quickly find problems and stop the pipeline before it goes any further.

Most failing test suites run all the way to completion.  This allows the team to see all the errors, so they can work on them.  This is great behavior.  However, a fundamental issue in a test environment, like a closed database port, can cause havoc.  This could result in a majority of the automated tests in a suite failing.  A smoke test would have identified the issue before the test suite started.  This allows for quick identification and resolution of the issue.
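A minimal sketch of that kind of smoke check is below: it verifies a dependency’s TCP port is reachable before any suite starts.  The host and port would come from the environment’s configuration; the local listener in `main` is just a stand-in so the example is self-contained:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class SmokeCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    public static boolean portReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Demo: a local listener stands in for a real dependency (e.g. a database).
        try (ServerSocket listener = new ServerSocket(0)) {
            boolean up = portReachable("localhost", listener.getLocalPort(), 1000);
            System.out.println("dependency reachable: " + up);
        } catch (IOException e) {
            System.out.println("could not start demo listener: " + e.getMessage());
        }
    }
}
```

In a pipeline, a handful of checks like this run first; if any fails, the suite aborts immediately with a clear environment error instead of hundreds of misleading test failures.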

Smoke tests are required from QA to production in the picture below.  Smoke tests should run right before the test suite, because the environment changes for each new test run.  In a cloud environment, environments can be created on the fly using infrastructure-as-code tools like Terraform, AWS CodeDeploy, Chef and others.  Even in more static environments, systems have to change for the test.  Therefore, it is best to smoke test to get quick peace of mind before testing.  For production, you should also use smoke tests.

CD Pipeline

You have just deployed your latest release to production.  How do you know it will work?  Smoke tests are a great place to start.  They give you a level of confidence immediately.  For a higher level of confidence, move up to running a portion of the E2E tests.

Static Application Security Testing (SAST)

Hackers are always looking for a weak point into your application, and not every developer is a security expert, so security tools can help.  One way they help is with static code analysis.  This can target your project’s code and its dependencies.  Scanning your code is easy with the number of tools available.  However, there are also tools that help beyond your project’s code.

OWASP Dependency Check will look at your dependencies and see if there are known issues with them.  This allows your team and management to know of these issues.  As needed, mitigation plans can deal with them.  This is only one piece of security analysis, so it is best to understand the limits of SAST.
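For example, here is a sketch of wiring the Dependency Check plugin into a Maven build.  The version number is illustrative, so check the current release before using it:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.owasp</groupId>
      <artifactId>dependency-check-maven</artifactId>
      <version>9.0.0</version> <!-- illustrative; use the current release -->
      <executions>
        <execution>
          <goals>
            <goal>check</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With this in place, the scan runs as part of the build stage and its report becomes another metric the pipeline can gate on.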

SAST has its strengths and weaknesses.  Its strength is the ability to look at the source code and find patterns of failure.  Its weakness is that it does not analyze a complete, running system, which can have security issues of its own, such as those related to other systems or configuration.  For further details, look at a write-up by OWASP on static code analysis.  There are a number of SAST tools available; OWASP has a number of tools listed here.

Dynamic Application Security Testing (DAST)

Where SAST has its weaknesses, DAST tools can help fill the gap.  These tools scan a running system for security vulnerabilities.  They behave much like hackers trying to break into your app.

DAST tools can attack your application in a similar fashion to hackers.  They can try SQL injection, cross-site scripting, the OWASP Top Ten threats and much more.  These tools can crawl all over your application or target specific areas.  There are a number of tools listed at the OWASP Vulnerability Scanning Tools page.

Load and Capacity Testing

Great, you built this new web e-commerce system right before Black Friday.  Will it handle the load?  How can you know it will?  You use load and capacity testing.

Load and capacity testing gives you confidence that the system can handle a predicted load on your site.  Typically, people want to know if it can handle the busiest hour of the busiest day.  It can also identify weak spots and break points where the system will not scale to meet demand.  So, it is best to test above your predicted future load and, in some cases, take the system to its max to find its limit.  Doing this gives you confidence in the application and plenty of time to react to scalability issues.
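The core of a load test can be sketched in a few lines: fire concurrent requests at a target and record latencies.  This is a bare-bones illustration, not a substitute for a real load tool; the `fakeRequest` stand-in and the 20 ms delay are assumptions made so the example runs anywhere:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LoadTest {
    // Fires `requests` calls at `target` using `threads` workers and
    // returns the observed latencies in milliseconds, sorted ascending.
    public static List<Long> run(Runnable target, int threads, int requests)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Long> latencies = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                long start = System.nanoTime();
                target.run();
                latencies.add((System.nanoTime() - start) / 1_000_000);
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        Collections.sort(latencies);
        return latencies;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a real HTTP request: ~20 ms of simulated work.
        Runnable fakeRequest = () -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException ignored) {
            }
        };
        List<Long> latencies = run(fakeRequest, 10, 100);
        long p95 = latencies.get((int) (latencies.size() * 0.95));
        System.out.println("p95 latency (ms): " + p95);
    }
}
```

A pipeline would compare percentile latencies like the p95 against an SLA threshold and fail the stage when the threshold is exceeded.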

With the cloud, one can easily add this to a continuous delivery pipeline and scale it to failure.  Failure in this context can be crashing or responding too slowly to users’ requests.

There are a number of load tools available, especially commercial offerings.  Here are a few open source offerings to consider:

  • Apache JMeter
  • Gatling
  • Locust

Open Source Scan

In the SAST section, we covered open source dependencies from a known-bug perspective, but left out one crucial perspective: licensing.  The licenses used in your application can impact how you use the software and whether you can keep your source private.  The type of open source license used can greatly impact your project, so know your dependency licenses!

Some licenses can limit patent use of your source code.  Others require disclosing your source code, or putting your code under the same license.  Other restrictions are less onerous, like license and copyright notification.  For a matrix with the most common licenses and restrictions, go here.

For tools to help identify licenses you do not want in your dependencies, look at the following:

Conclusion

By putting these different automated tests and checks in a continuous delivery pipeline, you will help your Agile team achieve true agility.  Automate tests where possible.  However, know when it is valuable to test and when it is not, as 100% testing is not valuable, especially at the lower levels of testing.  Once these automated tests and checks are in place, your Agile teams should be able to move with new speed and agility.

For more information on Continuous Integration, Continuous Delivery, Agile or Automated Testing, please contact OSG.