December 1, 2019 Jonny Steiner

How a Big Data Tool Like Elasticsearch will Shorten your Test Result Analysis

As we wade into this blog’s topic, we are going to take a close look at continuous testing. We will focus on some of the problems around continuous testing, how it relates to tools like Elasticsearch, and how Elasticsearch is used in a big data environment.

When we talk about continuous testing, there is an evolution that organizations go through. It starts with local test execution, moves on to connecting tests, and then to executing tests with Jenkins. In the end, this journey, and any part of the organization that achieves continuous testing, is all about very high velocity and connecting your testing so that developers receive fast feedback. That way, whenever a developer pushes a change, the tests are triggered and, within minutes, the developer gets feedback on that change and, if everything is green, moves on to the next task.

What we are seeing is that, as part of the continuous testing implementation, you will be executing a growing number of tests, and those tests will run in parallel. The growth will continue until you get to hundreds of thousands of tests running every day, which will introduce new challenges.

The New Bottleneck

After gathering all of this test data, we have discovered a new bottleneck. In the past, the real problem was generating a huge load of data and running parallel tests. Now we see that the bottleneck has become the ability to process and analyze a large number of test results. If you ask our customers today how much time they spend on test result analysis, the answer will be somewhere between 20% and 50%. People are starting to realize that running tests is only part one. Part two, where the real value is generated, is analyzing those tests.

With deep analysis, you will get a high-level view of your project’s progress and see if it is ready for release. You can identify bottlenecks at an early stage, and you can identify the bottleneck that actually prevents you from reducing the time to market. This way, during the analysis phase, you will be able to invest your resources where they will be most effective.

SeeTest Case Studies

The problem with a lot of data is organizing it and gathering useful views of it. In the image below we have examples from several of our customers. One is the largest bank in the US. The next is one of the largest international airlines. The third is an international mobile operator. In all of these cases, we can see a growing number of tests being executed every day. The range is from tens of thousands of tests per day to around one million tests per day.

The challenge in writing automation tests is running tests in parallel while getting feedback as fast as possible. Let’s say that we have executed 1,000 test cases for a build and 100 of those cases failed. How would we handle that? It is important to note that just because you have tens of failed tests does not mean that you have tens of bugs. Usually, you will be able to narrow your issues down to a more manageable number, and in some cases it may be a single issue.
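To make the point about failed tests versus actual bugs concrete, here is a minimal Python sketch that groups failures by their error message. The test names, error messages, and the `group_failures` helper are all hypothetical and only serve as an illustration; real results would come from your own test reports, and grouping by raw error text is just one simple way to narrow things down.

```python
from collections import Counter

# Hypothetical failed-test records. The names and messages are invented
# for illustration and do not come from any real test run.
failed_tests = [
    {"test": "login_ios", "error": "Element 'Sign in' not found"},
    {"test": "login_android", "error": "Element 'Sign in' not found"},
    {"test": "checkout_ios", "error": "Element 'Sign in' not found"},
    {"test": "profile_android", "error": "Element 'Sign in' not found"},
    {"test": "search_ios", "error": "Timeout waiting for results page"},
    {"test": "transfer_android", "error": "Expected balance 100 but got 90"},
]


def group_failures(failures):
    """Count failed tests per distinct error message, most common first."""
    counts = Counter(failure["error"] for failure in failures)
    return counts.most_common()


# Print one line per distinct error, with the number of tests it affected.
for error, count in group_failures(failed_tests):
    print(f"{count:>3}  {error}")
```

In this made-up data, six failed tests collapse into three distinct errors, and most of the failures share a single cause. That is the kind of narrowing the analysis phase is meant to deliver.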

The challenge around test analysis is to shorten the amount of time we spend prioritizing the issues that we are addressing.

Enter Elasticsearch

Elasticsearch is one of the main tools that can help with the problems we have mentioned. It is part of a group of open-source products from Elastic that help users take data from any source, in any format, and search, analyze, and visualize that data in real time. The products are all part of the same package. There is Logstash for log aggregation. There is Elasticsearch itself, a powerful open-source search engine that can deal with many use cases. Then there is Kibana, which is the way to visualize your data. You can index your data and then create a set of relevant visualizations.

The Elastic Stack

What we can all understand is that when you visualize data it is easier to analyze. When we learn more about the solution we also discover the Elastic Stack. See the image below for more.

Kibana is the user interface we mentioned earlier. It enables you to create visualizations in a dashboard for any metrics that you would like, such as pie charts, bar charts, and many more. You can create dashboards for different types of users, such as management and executives, system administrators, testers, R&D, and a lot of other use cases. The stack uses Logstash to handle all of the data, storing it as events that make it easier for you to work with in Elasticsearch later on. Then we have the Elasticsearch engine, so let’s look at how data is stored and what it looks like in Elasticsearch.

The highly available and distributed Elasticsearch engine

The Elasticsearch engine can be thought of as a database. You can see a comparison in the table below.

Elasticsearch - engine

Similar to a database, Elasticsearch stores data and lets you send a query to retrieve it. If we compare a relational database to Elasticsearch, a database is similar to an index: you can create multiple indexes in Elasticsearch, just as you can create multiple databases within PostgreSQL or any other database.

Between relational databases and Elasticsearch, there are certain equivalencies. You can see them in the table above.

Even though there are equivalencies, the differences are quite pronounced. In Elasticsearch, documents with different fields can all live in the same index. You also have a REST API that enables you to enter data into Elasticsearch and query that same data. Also, the data that you are storing is JSON-based.
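As a rough illustration of the REST API and JSON-based storage described here, the following Python sketch builds (but does not send) two HTTP requests: one that would add a document to an index and one that would search it. The node address `http://localhost:9200`, the index name `test-results`, the field names, and both helper functions are assumptions made for this example. The `_doc` and `_search` endpoints and the `match` query are part of the Elasticsearch REST API in version 7 and later.

```python
import json
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: a local Elasticsearch node
INDEX = "test-results"            # hypothetical index name


def build_index_request(document):
    """Build a POST request that would add one JSON document to the index."""
    return Request(
        f"{ES_URL}/{INDEX}/_doc",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_search_request(field, value):
    """Build a POST request that would run a simple match query."""
    body = {"query": {"match": {field: value}}}
    return Request(
        f"{ES_URL}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical test result; the field names are invented for illustration.
index_req = build_index_request({"test": "login_ios", "status": "failed"})
search_req = build_search_request("status", "failed")

print(index_req.get_method(), index_req.full_url)
print(search_req.get_method(), search_req.full_url)
```

The requests are only constructed here so the sketch runs without a cluster. Against a running node, each one could be passed to `urllib.request.urlopen` to actually send it.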

JSON documents are quite easy to manipulate. In a relational database, when you create a table you need to define the different fields within that table. That is not the case with Elasticsearch. In Elastic, you can map fields in advance, but it is not mandatory, so you can just create an index.

Example of a document

Elasticsearch - document

This is an example of a document that has a few fields like first name, last name, and interest. The fields can hold different types of information. They can be strings or numbers, they can be arrays, and they can be objects. So the JSON that you are sending is not limited in its structure, and Elastic enables you to index and query any part of that document.
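For readers who cannot see the image, here is a small Python sketch of what such a document might look like. The name and interest fields follow the description above. The values, the `age` number, and the nested `address` object are hypothetical additions, included only to show each of the field types mentioned.

```python
import json

# Hypothetical document. Only the idea of first name, last name, and
# interest fields comes from the article; everything else is invented.
document = {
    "first_name": "Jane",               # string
    "last_name": "Doe",                 # string
    "age": 32,                          # number (assumed field)
    "interests": ["testing", "music"],  # array
    "address": {"city": "Boston"},      # nested object (assumed field)
}

# Serialize to the JSON text that would be sent to Elasticsearch.
payload = json.dumps(document, indent=2)
print(payload)

# Parsing the text back gives the same structure, including nested parts.
restored = json.loads(payload)
print(restored["address"]["city"], restored["interests"][0])
```

Because the document is plain JSON, a second document in the same index could carry a different set of fields without any table definition being changed first.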

Taking a look at a demo

Now that we have taken the time to explain the relationship between Elasticsearch and SeeTest, we are sure that you would like to see a demo of the solution in real time, so do not hesitate to click here and see the full demo.