Experitest Blog

Boost Your Test Automation with Selenium WebDriver Internals

In a world of web applications that must run on multiple versions of different browsers across thousands of devices, there is one thing you probably want to avoid as much as possible: manual testing. Automating your tests at the unit and integration level is essential, but it is only a start. You also need to rethink your strategy at the top of the testing pyramid. Without Selenium WebDriver, there is a good chance you are not automating the high-level end-to-end tests for your web applications. If that is the case, you are probably performing a lot of repetitive tasks manually. With Selenium WebDriver, you can automate the manual part of your testing and free up your QA team's time for work that is exploratory or requires human judgment.

Selenium Overview

Selenium is an open-source test automation suite used primarily to automate testing of web applications. Selenium is not a single tool; it is a suite of tools, each catering to different testing and development needs. The following are its four key components.

  • Selenium RC (now deprecated)
  • Selenium IDE
  • Selenium Grid
  • Selenium WebDriver

Selenium WebDriver and Selenium RC (Remote Control) have since been merged into a single framework, known as Selenium 2 or simply “WebDriver”.

Selenium offers three modes for running a browser test. Local mode is the simplest: the test runs against a browser on the local machine. Next comes the local Selenium server mode, where a test flows from the Selenium server to a browser driver, and from the browser driver to the actual browser. Grid mode is the most advanced and allows tests to run on remote machines. Selenium Grid facilitates parallel execution through its Hub-Node architecture, allowing you to execute multiple test scripts on multiple machines at the same time.
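To make the Hub-Node idea more concrete, here is a toy sketch in plain Python. It is not Selenium Grid code: the node names, the `NODES` registry, and the helper functions are all hypothetical, and the "tests" only report where they would have run. It simply shows a hub matching each test to a node that offers the requested browser and dispatching the tests in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical node registry: a real Grid hub tracks which browsers each node offers.
NODES = {
    "node-1": "chrome",
    "node-2": "firefox",
}


def pick_node(browser):
    """Return the name of the first node that offers the requested browser."""
    for name, offered in NODES.items():
        if offered == browser:
            return name
    raise LookupError(f"no node offers {browser}")


def run_test(test_name, browser):
    """Pretend to run one test on a matching node and report where it ran."""
    node = pick_node(browser)
    return f"{test_name} ran on {node} ({browser})"


def run_in_parallel(jobs):
    """Dispatch every (test_name, browser) job at the same time, hub-style."""
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(run_test, name, browser) for name, browser in jobs]
        # Results are collected in submission order.
        return [future.result() for future in futures]


results = run_in_parallel([("login_test", "chrome"), ("search_test", "firefox")])
```

In a real Grid the hub forwards WebDriver commands to the chosen node over the network; the thread pool here only stands in for that parallelism.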

[Figure: Selenium WebDriver implementation modes]

Why Selenium WebDriver?

Selenium WebDriver is a natural option if you want to create browser-based automation suites and tests which are robust and scalable. It’s also a great choice if you want to distribute scripts across various development environments.

Selenium WebDriver Architecture

Selenium WebDriver Architecture is all about how the automation framework works internally.

Every browser follows its own logic when performing actions such as loading a page, processing information, or closing a window. For simplicity, we will use Chrome as the example browser and see how Selenium interacts with it to create end-to-end automation tests. The Selenium architecture has four key components:

[Figure: Selenium WebDriver architecture]

  • Selenium Client Library
  • JSON Wire Protocol
  • Browser Drivers
  • Browsers

The chart above shows the architectural flow and interaction of these key components.

Selenium Client Library

This refers to the language bindings that the Selenium developers have created so that the framework supports multiple programming languages. The availability of bindings for languages such as Java, Ruby, and Python makes the platform usable in a wide range of development environments. Testers and developers with expertise in any of the supported languages can use Selenium to automate their web application tests.

JSON Wire Protocol

WebDriver uses a JavaScript Object Notation (JSON) wire protocol to communicate between the client libraries and the different driver implementations. JSON is primarily used to transfer data between a server and a client on the web, and it has become close to an industry standard for REST web services.

Here is how JSON Data is sent to and received from a server:

A client (the machine on which the WebDriver API is being used) converts an object into a JSON string before sending it to a server. The server parses the JSON string and converts it back into a native object. The server then converts its response object into a JSON string and returns it to the client. Finally, the client parses the JSON string into an object it can use.
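The round trip described above can be sketched with Python's standard `json` module. The payload and the response fields are illustrative assumptions (a `status` of 0 is used here to mean success, in the style of JSON Wire Protocol responses), and both "sides" run in the same process purely for demonstration.

```python
import json

# Client side: a native object describing a command (illustrative payload).
command = {"url": "https://www.google.com"}
request_body = json.dumps(command)  # object -> JSON string

# Server side: parse the JSON string back into a native object.
received = json.loads(request_body)

# Server side: build a response object and serialize it to a JSON string.
response_body = json.dumps({"status": 0, "value": None})

# Client side: parse the response string into an object for use.
response = json.loads(response_body)
```

Only strings travel over the wire; each side works with its own native objects before and after the exchange.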

[Figure: JSON exchange between client and server]

Browser Drivers

Every browser has its own browser driver, such as ChromeDriver, FirefoxDriver, InternetExplorerDriver, PhantomJSDriver, and SafariDriver. These drivers communicate with their respective browsers without revealing the internal logic of how the browsers function. Basically, a browser driver implements the WebDriver wire protocol and acts as a bridge between its browser and the language bindings. In other words, browser drivers are the middleman between the browser and your test code.

Browsers

Selenium supports all the major browsers, including Chrome, Firefox, Internet Explorer, and Safari.

Let’s dive into the specifics now and try to figure out exactly how WebDriver functions internally.

Let’s say you have created a Gradle project in the Eclipse IDE and configured it with all the required resources, including the plugin IDs, dependencies, and browser drivers. Now you are ready to write a test script in your IDE using Java (or any of the other languages supported by the Selenium client libraries).

Example:

[Figure: demo test script]

Once your code is ready, you can run it to execute the program. Based on the script above, the Chrome browser will be launched and the text “Selenium is great” will be entered into the text box (the search box).
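For readers who want something to run, here is a minimal sketch of such a script. It uses Selenium's Python bindings rather than Java to keep it short, so it is not the original demo code. It assumes the `selenium` package and a matching ChromeDriver are installed, and that the search box can be located by the name `q`; the function name `run_demo` is hypothetical.

```python
SEARCH_TEXT = "Selenium is great"


def run_demo():
    """Open Google in Chrome and type SEARCH_TEXT into the search box."""
    # Imported inside the function so the sketch loads even where Selenium is absent.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # launches a local Chrome through ChromeDriver
    try:
        driver.get("https://www.google.com")
        # Assumption: the search box is exposed under the name "q".
        search_box = driver.find_element(By.NAME, "q")
        search_box.send_keys(SEARCH_TEXT)
    finally:
        driver.quit()  # always close the browser, even if a step fails
```

Calling `run_demo()` on a machine with Chrome available should open the browser, type the text, and close the browser again.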


Once you run the script, here is what happens internally. Every WebDriver statement in your script is converted into an HTTP request by way of the JSON Wire Protocol. The client library, Java in the case of the demo, serializes the parameters of each command as JSON and sends the request to the ChromeDriver (the driver we are using in the demo case). For the statement that opens Google, the request looks roughly like this, with the JSON payload carried in the request body rather than in the URL itself:

POST http://localhost:21653/session/<session id>/url
{"url":"https://www.google.com"}
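As a rough illustration, the sketch below builds (but does not send) the kind of HTTP request a client library would issue for a "navigate to URL" command, using only Python's standard library. The port is the one from the example above; the session ID is a hypothetical placeholder, since real IDs are issued by the driver when a session is created, and `build_navigate_request` is an illustrative helper rather than part of any Selenium API.

```python
import json
import urllib.request

DRIVER_URL = "http://localhost:21653"  # port taken from the example above
SESSION_ID = "example-session"         # hypothetical; real IDs come from the driver


def build_navigate_request(target_url):
    """Build the POST request that asks the driver to open target_url."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{DRIVER_URL}/session/{SESSION_ID}/url",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


request = build_navigate_request("https://www.google.com")
```

The command is identified by the HTTP method and the URL path, while its parameters travel as JSON in the request body.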


Every browser driver uses an HTTP server to receive HTTP requests. Once the browser driver receives a request, it passes the command on to the real browser. If the request is a GET request, the browser generates the corresponding response and passes it back to the browser driver, which in turn sends it to the test script (running in the Eclipse IDE) using the JSON Wire Protocol over HTTP. If the request is a POST request, however, the action takes place on the browser itself.
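The GET and POST paths can be modeled with a small toy in Python. This is not how ChromeDriver is implemented: `ToyBrowser` and `ToyDriver` are hypothetical classes, no real HTTP server is started, and the status codes (0 for success, 9 for an unknown command) follow the JSON Wire Protocol style only for illustration.

```python
import json


class ToyBrowser:
    """Stand-in for a real browser; it only remembers the current URL."""

    def __init__(self):
        self.current_url = "about:blank"


class ToyDriver:
    """Toy model of a browser driver routing requests to a browser."""

    def __init__(self, browser):
        self.browser = browser

    def handle(self, method, path, body=""):
        """Route one request and return the JSON response string."""
        if method == "GET" and path.endswith("/url"):
            # GET: read state from the browser and return it to the client.
            return json.dumps({"status": 0, "value": self.browser.current_url})
        if method == "POST" and path.endswith("/url"):
            # POST: perform the action on the browser itself.
            self.browser.current_url = json.loads(body)["url"]
            return json.dumps({"status": 0, "value": None})
        return json.dumps({"status": 9, "value": "unknown command"})


driver = ToyDriver(ToyBrowser())
driver.handle("POST", "/session/1/url", json.dumps({"url": "https://www.google.com"}))
reply = json.loads(driver.handle("GET", "/session/1/url"))
```

After the POST changes the browser's state, the GET reads it back, so `reply["value"]` holds the URL that was just opened.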

That is how the Selenium WebDriver internals work, and it is pretty much what you need to know to get things off the ground.

[Figure: Selenium WebDriver flow]

Conclusion

Web applications are increasingly becoming an essential component of business in today’s digital-driven world. They help businesses interact with their customers and achieve their business goals in a faster and more effective way. The usability and effectiveness of web applications are the reason why the demand for them is soaring. They are also the reason why developers are looking for better ways to get them to market faster. Selenium WebDriver is the de facto framework when it comes to automating tests for web applications. It is effective not only for testing purposes, but also for automating any repetitive tasks on the web. Once you learn how to leverage this tool, you can automate tasks in your browser as if a real person were executing them for you.


Jonny Steiner, Content Manager
