Web Automation Testing

This summer I have the privilege of working as an Engineering Intern at Personal Capital. Not only do I have the pleasure of working with really great people, I am also learning about how various engineering teams come together to build an awesome product. There is only so much you can learn in a classroom; this is the real world we’re talking about!

My main project is implementing an automated test suite for Personal Capital’s marketing website and web app. Automated tests make the feedback loop faster, reduce the workload on testers, and allow testers to do more exploratory and higher-value activities. Overall, we’re trying to make the release process more efficient.

Our automated testing stack consists of Selenium WebDriverJS, Mocha + Chai, Selenium Server, and PhantomJS. Tests are run with each build by our continuous integration tool Hudson, and we can mark a build as a success or failure based on the results. Our tests are written in JavaScript since our entire WebUI team is familiar with it.

In an effort to keep our test scripts clean and easily readable, Casey, one of our Web Developers, ingeniously thought of creating helper functions. So instead of having numerous driver.findElement()’s and chai.expect()’s scattered throughout our scripts, these were integrated into single helper functions. An example of one is below.

var expectText = function(selector, text) {
	// Scroll the element into view first, then assert on its text content
	scrollToElement(selector).then(function(el) {
		chai.expect(selector).dom.to.contain.text(text);
	});
};

We were having issues when testing in Chrome (while Hudson runs PhantomJS, our tests are written to work in Firefox, Chrome, and Safari) where elements weren’t visible, so we needed to scroll to their location first. For that we have a scrollToElement() method that is chained with every other helper function.

var scrollToElement = function(selector) {
	var d = webdriver.promise.defer(),
		el;

	// Get element by CSS selector
	driver.findElement(webdriver.By.css(selector))
		// Get top and left offsets of the element
		.then(function(elt) {
			el = elt;
			return elt.getLocation();
		})
		// Execute a script to scroll to the element's top offset,
		// returning the promise so the next step waits for the scroll
		.then(function(loc) {
			return driver.executeScript('window.scrollTo(0,' + loc.y + ')');
		})
		// If successful, fulfill the promise with the element; else reject with an error
		.then(
			function() {
				d.fulfill(el);
			},
			function(err) {
				d.reject('Unable to locate element using selector: ' + selector);
			});

	return d.promise;
};

Then a typical test script would look like this:

helper.clickLink();
helper.expectText();
helper.enterInput();

Super clean, simple, and awesome. Anyone can write an automation script!
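
To make that concrete, here is a hypothetical sketch of what such a script could look like inside a Mocha test, using the describe/it wrappers from selenium-webdriver/testing (which wait for the WebDriverJS command queue to drain between steps). The URL path, selectors, and text are made up, and driver and the helper module are assumed to be set up elsewhere:

var test = require('selenium-webdriver/testing');

test.describe('Pricing page', function() {
	test.it('shows the plan comparison section', function() {
		driver.get(baseUrl + '/pricing');  // baseUrl assumed to be defined during setup
		helper.clickLink('a.compare-plans');  // hypothetical selector
		helper.expectText('h2.section-title', 'Compare Plans');  // hypothetical selector and copy
		helper.enterInput('input[name=email]', 'test@example.com');  // hypothetical selector
	});
});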

One of the main challenges in automation is timing. Some browsers (I’m looking at you, Chrome) are faster than others, and the driver will attempt to execute commands before elements on the page can be interacted with. To overcome this we use a mixture of implicit and explicit waits. There are two ways to do an implicit wait. The first is setting WebDriverJS’s implicitlyWait() by having the following line of code after defining the driver:

driver.manage().timeouts().implicitlyWait(1300);

This is global, so before throwing an error saying an element cannot be found or interacted with, WebDriverJS will wait up to 1.3 seconds. The second method is waiting for an element to be present on the page, with a timeout. This is helpful if we need more than 1.3 seconds for a certain element. We have a helper function called cssWait() that looks like this:

var cssWait = function(selector, timeout) {
	// Poll until the element matching the selector is present, up to `timeout` ms
	return driver.wait(function() {
		return driver.isElementPresent(webdriver.By.css(selector));
	}, timeout);
};

On top of those we use explicit waits that are simply “driver.sleep(<time>)”. Sometimes we need to hard-code a wait to get the timing just right.
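
For example, a step that exercises a slow-loading widget might combine all three (the selectors, copy, and timings here are hypothetical, and cssWait/expectText are assumed to be exposed on the same helper module):

helper.cssWait('.portfolio-chart', 5000);  // wait up to 5 seconds for the chart container to exist
driver.sleep(500);  // hard-coded pause so the chart animation can settle
helper.expectText('.portfolio-chart .title', 'Asset Allocation');  // then assert on its text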

Unfortunately that’s it for this post. If you have any questions feel free to leave a comment and I’ll get back to you. In my next blog post, or one that will be written by Aaron, I will talk more about some of the challenges we faced and how we dealt with them.

To get started with Web Automation, I suggest heading over to SimpleProgrammer.com where John Sonmez put together some instructions on getting your environment set up. While his are for Windows, the Mac version is pretty similar.

Evolving End-User Authentication

EV Certificate Display

The adoption of EV Certificates has rendered the login image obsolete.

This week, Personal Capital discontinued the use of the “login image”, as part of an upgrade to our security and authentication processes. By “login image”, I mean the little personalized picture that is shown to you on our login page, before you enter your password.

Mine was a picture of a starfish.

Several users have asked us about this decision and, beyond the simple assertion that the login image is outmoded, a little more background is offered here.


The founders and technology principals at Personal Capital were responsible for introducing the login image for website authentication a decade ago. In 2004, Personal Capital’s CEO Bill Harris founded, along with Louie Gasparini (now with Cyberflow Analytics), a company called PassMark Security, which invented and patented the login image concept and the associated login flow process. Personal Capital’s CTO, Fritz Robbins, and our VP of Engineering, Ehsan Lavassani, led the engineering at PassMark Security and designed and built the login image technology, as well as additional security and authentication capabilities.

Server login images (or phrases, in some implementations) were a response to the spate of phishing scams that were a popular fraud scheme in the early- and mid-2000s.  When phishing, fraudsters create fake websites that impersonate financial institutions, e-commerce sites, and other secure websites.  The fraudsters send spam email containing links to the fake sites, and unsuspecting users click on the links and end up at the fake site. The user then enters their credentials (username/password), thinking they are at the real site. The hacker running the fake site then has the user’s username/password for the real site and, well, you know what happens next. It’s hard to believe that anyone actually falls for those sorts of things, but plenty of people have. (Phishing is still out there, and has gotten a lot more sophisticated (see spear-phishing for example), but that is a whole other topic).

So, the login image/phrase was a response to the very real question of:  “How can I tell that I am at the legitimate website rather than a fraudulent site?”  With login image/phrase, the user would pick/upload a personalized image or phrase at the secure website. And the login flow changed to a two-step flow: the user enters their username, then the secure site displays the personal image/phrase, and then, assured that they are at the legitimate secure site when they recognize the image/phrase, the user enters their password. The use of login image/phrase was a simple and elegant solution to a vexing problem. And when the FFIEC (U.S. banking regulatory agency) mandated stronger authentication standards for U.S. banking sites in 2005, login image quickly became ubiquitous across financial websites, including Bank of America and many others, during the mid-2000s.

From a security perspective, the login image/phrase is a kind of shared secret between the secure site and the user. Not as important a secret as the password, of course, but important nonetheless, and here’s why: if a hacker posing as the real user enters the username at the secure site, and the site displays the user’s login image/phrase, then the hacker can steal the image/phrase and use it in constructing their fake website. The fake website would then look like the real website (since it would have the image/phrase) and could fool the user into giving up the real prize (the password) at the fake phishing site. So, the issue of “how to protect the security of the login image?” becomes a relevant question.

Device identification is the answer:  If the website is able to recognize the device that is sending a request containing the username, and if the site knows that device has been authorized by the user, then the site can safely show the login image/phrase, and the user feels secure, and enters their password. This is essentially a process of exchanging more information in each step of the authentication conversation, a process of incremental and escalating trust, culminating in the user entering their password and being granted full access to the site.

But the use of device identification to protect the login image is secondary to the real technology advance of this approach: the use of device identification and device forensics as a second factor in authentication. Combining the device identity with the password creates a lightweight form of two-factor authentication, widely recognized as being far superior to single-factor (password only) authentication.

The simplest form of device identification involves placing a web cookie in the user’s browser. Anyone out there not heard of cookies and need an explanation? OK, good, I didn’t think so. Cookies work pretty well for a lot of purposes, but they have a couple of problems when being used for device identification: (1) the user can remove them from the machine; and (2) malware on the user’s machine can steal them.

The technology of device identification quickly evolved, at PassMark and other security companies, to move beyond cookies and to look at inherent characteristics of the web request, the browser, and the device being used: data such as the IP address, the User-Agent header (the browser identity information), other HTTP headers, and so on. Not just the raw data elements, but derived data as well, such as geolocation and ISP data from the IP address. And, looking at patterns and changes in the data across multiple requests, including request velocity, characteristic time-of-day login patterns, and changes in data elements such as the User-Agent string. Some providers started using opt-in plugins or browser extensions to extract deeper intrinsic device characteristics, such as the hardware network (MAC) address, operating system information, and other identifiers.
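
As a rough illustration of the client-side half of this idea (a sketch, not a description of any particular vendor’s implementation), a page script might gather a few browser and device signals and send them along with the login request:

// Hypothetical sketch: collect a few basic device signals in the browser.
// Real device-forensics products gather far more data points and combine them
// with server-side signals (IP address, HTTP headers, geolocation, request velocity).
function collectDeviceSignals() {
	return {
		userAgent: navigator.userAgent,  // browser identity information
		language: navigator.language,
		screen: screen.width + 'x' + screen.height,  // display characteristics
		timezoneOffset: new Date().getTimezoneOffset()  // rough timezone signal
	};
}

// The server compares these signals against what it has previously seen for the
// user to decide whether the device is recognized.
var deviceSignals = collectDeviceSignals();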

“Device forensics” evolved as the practice of assembling large numbers of data points about the device and using sophisticated statistical techniques to create device “fingerprints” with a high degree of accuracy. The whole arena of device identification and device forensics is now leveraged in a variety of authentication and fraud-detection services, including at Personal Capital. This is the real value that grew out of the “login image” effort.

But, while the use of device identification and device forensics was flourishing and becoming a more central tool in the realm of website authentication, the need for the login image itself was becoming less compelling.

Starting in the late 2000s, the major SSL Certificate Authorities (such as Verisign) and the major browsers (IE, Firefox, Chrome, Safari) began adopting Extended Validation (EV) certificates. These certificates require a higher level of validation of the certificate owner (i.e. the website operator, such as Personal Capital), so they are more trusted. And, just as important, the browsers adopted a common user interface idiom for EV certificates, which includes displaying the name of the company that owns the certificate (e.g. “Personal Capital Corporation”) in a distinctive color (generally green) in the browser address bar (see picture). The adoption of EV certificates has essentially answered the original question that led to the use of the login image (i.e. “how does the user know they are at the real website?”).

Which brings us to today. Personal Capital has removed the login image from our authentication flow. It gives our users a simpler and more streamlined experience, and reducing complexity in the login process is itself a security benefit: it is a security truism that, all else being equal, simpler implementations are more secure implementations – fewer attack vectors, fewer states, fewer opportunities for errors. Personal Capital continues to use device identification and device forensics, allowing users to “remember” authorized devices and to de-authorize devices. We also augment device identification with “out of band” authentication, using one-time codes and even voice-response technology to verify user identity when they want to log in from a non-authorized or new device.

I’ll admit that I will miss my little starfish picture when I log in to Personal Capital. But this small loss is offset by my knowledge that we are utilizing best, and current, security practices.

PassMark Security, circa 2005

“Ugly Shirt Fridays” at PassMark Security, circa 2005

Mobile Development: Testing for Multiple Device Configurations

All Android developers should have at least one “old” device running OS 2.3.3 and a current “popular” device. Ideally, one should also have a current device that is considered “maxed out” on specs. A company should additionally have the latest “Google” device (currently the Nexus series), and an HTC, Sony, and Samsung device. (These manufacturers are mentioned because of their popularity and/or significant differences not found when developing on other devices.) Finally, OS 4.2, 4.3, and 4.4, though minor OS increments, offer differences that should be considered.

Though development for iPhone/iPad is more forgiving given the fewer configurations, it still offers challenges. For example, if you are developing on a Mac running OS X Mavericks with a version of Xcode above 5.0 for a product that still needs to support iOS 5.x, you will need a physical device because the iOS 5.x simulator isn’t available for that development configuration.

If testing mobile websites, the configurations can be endless.

At Apps World 2014, Perfecto Mobile (http://www.perfectomobile.com) introduced me to mobile cloud device testing. Their product offers access to real devices (not emulators or simulators) connected to actual carriers physically hosted at one of their sites around the world.

The concept of mobile cloud device testing allows the ability to test on a multitude of configurations of devices, locations/timezones, carriers, and operating systems.

Beyond access to multiple devices, Perfecto Mobile offers automation testing across these platforms via scripts written in Java. I wasn’t able to personally delve as far as I wanted into these automation tests, the recording feature, or the object mapper before my trial ran out, but the demo at Apps World gave me the impression it behaves similarly to Xcode’s Automation instrument, expanded to all devices. The scripts enable your team to target certain device configurations and automatically launch the devices, execute the given tests, clean up and close the devices, and export the test results to your team. I wish I could say more because it looked really promising, but without actual usage I can only describe what I saw during the demo.

It’s impossible to cover every configuration during native Android application development, but after a release, for any platform, if your product is experiencing issues and a crash report doesn’t reveal enough, mobile cloud device testing offers a real option for true coverage.

Below is a listing of some features of interest Perfecto Mobile offers:
- Access to real devices connected to actual carriers (not emulators or simulators) physically hosted at one of Perfecto’s sites around the world. Since these are real devices, you can dial numbers, make calls, send text messages, and install apps.
- The device list UI shows availability, manufacturer, model, OS + version, location, network, phone number, device id, firmware, and resolution.
- Ability to open multiple devices at the same time.
- Requests for devices and configurations not available are responded to in real-time.
- Ability to take screenshots and record sessions to save and/or share results with others.
- Ability to share the device screen in real-time with your team.
- Ability to collect information for a device such as battery level, CPU, memory, and network activity.
- Export of device logs.
- Beta of a MobileCloud for Jenkins plug-in that allows running scripts on actual cloud devices after a build, so you can see reports for a single device (multiple devices are not supported yet).

Automating your javascript unit tests – Karma

why

  • instant feedback while writing your code, eliminating the need to remember to run tests before checking in your code, thus leading to stable builds.
  • continuous integration with our hudson build system.
  • testing on multiple real browsers.

what

  • set up karma to watch your source files and run tests on code changes.
  • set up karma to run the tests during hudson build and validate the build.
  • set up karma to track our code coverage.

how

We use mocha as our test framework. This, along with chai (expect/should – BDD style), worked out great for us, giving us effective yet readable tests. I cannot emphasize the importance of readable tests enough. We had a team member who did a feature walkthrough by running through the tests, which I thought was pretty rad. Product and QA could easily see what the feature set was, what was expected, and what the outcome was. I guess we have to do a write-up sharing more of our excitement.

Before karma, we were running tests using individual test files. More often than not, you are working on multiple files, and remembering to run the tests for all of them manually was becoming cumbersome and error-prone. So we started researching test runners, and karma seemed to fit all our needs: automation, continuous integration, running tests on multiple real browsers, and support for mocha.

set up karma to watch your source files and run tests on code changes

This was fairly straightforward. Karma’s setup is driven by a single configuration file, wherein you provide the location of the files you want to watch for changes, the browsers you want to run tests in, your testing framework, and any preprocessors. Here’s a gist of our configuration file. The only tricky part was preprocessors. We use handlebars along with requirejs-handlebars-plugin for our templating purposes and serve our templates as individual html files. This was causing a problem: karma was converting them into js strings because of its default preprocessor, html2js. It took a bit of reading, but the fix was simple enough. The following additions to the config file fixed the problem.

preprocessors : [{'scripts/**/*.html' : ''}]
files:[...{pattern: 'scripts/**/*.html', served: true, included: false}]
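
Putting those pieces together, a trimmed-down karma.conf.js for the watch setup might look something like the sketch below (the file paths and browser list are illustrative, not our exact configuration):

module.exports = function(config) {
	config.set({
		frameworks: ['mocha'],
		files: [
			'scripts/**/*.js',
			'test/spec/**/*.js',
			{pattern: 'scripts/**/*.html', served: true, included: false}
		],
		// Turn off the default html2js preprocessor for our template files
		preprocessors: {'scripts/**/*.html': []},
		browsers: ['Chrome', 'Firefox'],
		autoWatch: true
	});
};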

set up karma to run the tests during hudson build and validate the build

We created another karma configuration file for this purpose. We added a junitReporter so that we could export the test results in a format that could be interpreted by our hudson setup. The key differences are as follows. We are currently using phantomJS for testing on our build system, but in the near future we want to extend this to real browsers.

reporters: ['progress', 'junit']
junitReporter: {outputFile: "testReports/unit-tests.xml"}
autoWatch: false
browsers: ['PhantomJS']
singleRun: true
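
With that configuration in place, the Hudson job simply runs karma against it (e.g. karma start karma.ci.conf.js, or whatever the config file is named), and the JUnit report publisher is pointed at the generated testReports/unit-tests.xml so that test failures are reflected in the build status.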

set up karma to track our code coverage

Once we were able to configure karma to run in hudson, this was just a natural addition. The only changes to the karma configuration are as follows.

reporters: ['progress', 'junit', 'coverage']
coverageReporter: {
 type : 'cobertura',
 dir : 'coverage/'
}
preprocessors : {
 '**/scripts/**/*.js': 'coverage'
}

As you may have noticed, I used the words simple and straightforward quite a few times, and that is what karmajs is all about.

reads

http://karma-runner.github.io/0.10/index.html

http://googletesting.blogspot.com/2012/11/testacular-spectacular-test-runner-for.html


https://www.npmjs.org/package/karma-handlebars-preprocessor

Incremental Web Performance Improvements

Compression (gzip) of front end resources (js/css)

When we moved to Amazon’s Cloudfront (CDN), we lost the ability to serve gzipped versions of our script files and stylesheets. We are a single-page app with a large javascript and css footprint, and this was greatly affecting our application performance. We had two options to fix this.

  • Upload a gzipped version of each resource along with the original resource and set the content-encoding header for the file to gzip. The CDN would then serve the appropriate resource based on request headers.
  • Use a custom origin server that is capable of compressing resources based on request headers. The default origin server, which is a simple Amazon Simple Storage Service (S3) bucket, is not capable of this, hence the problem.

Fortunately for us, all our application servers use apache as a web server, and we decided to leverage this setup as our custom origin server. We simply had to change our deployment process to deploy front-end resources to our app servers instead of an S3 bucket. This does make the deployment process a tiny bit more complex, but the benefits are huge.


Dividing our main javascript module into smaller modules.

As mentioned earlier, we are a single-page app and have a large javascript footprint. We bundle all our javascript files into a single module and fetch it during our app initialization. As we grow, so will our javascript footprint, and we did not want to run into the long load times during app initialization that Steve Souders demonstrates in his book High Performance Web Sites.

We use requirejs to build all our javascript modules into one single module. Fortunately, requirejs provides for combining modules into more than one output module in a very flexible manner. We package all our “common” modules and the main module we need on loading the application into one module. All other modules are dynamically loaded when they are required, as sketched below. More specific details will be posted soon.
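
As an illustration (the module names are hypothetical, not our actual layout), an r.js build profile that produces a combined main module plus a separately loadable feature module might look like this:

({
	baseUrl: 'scripts',
	dir: 'build',
	modules: [
		{
			// App bootstrap plus the "common" modules it pulls in
			name: 'main'
		},
		{
			// A feature module loaded on demand; anything already in
			// 'main' is excluded so it is not bundled twice
			name: 'modules/portfolio',
			exclude: ['main']
		}
	]
})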

Pre-caching our main javascript module.

I believe this is a very common practice and a simple implementation that reaps a huge performance benefit. We now pre-fetch our main javascript module during our login process using an iframe and the html object tag. The iframe keeps the login page load time independent of the resources being fetched through it. Again, there are many ways to implement this, as mentioned by Steve Souders, but we chose this one for its simplicity.
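
A minimal sketch of the idea (the bundle path is hypothetical): the login page adds a hidden iframe, and inside it an object tag downloads the main javascript module into the browser cache without executing it.

function prefetchMainModule() {
	var iframe = document.createElement('iframe');
	iframe.style.display = 'none';
	document.body.appendChild(iframe);

	var doc = iframe.contentWindow.document;
	doc.open();
	// The object tag fetches the script into the cache without running it, and
	// loading it through the iframe keeps the login page load time independent of the fetch.
	doc.write('<object data="/static/js/main-built.js" width="0" height="0"></object>');
	doc.close();
}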

Additional Links

  • http://stackoverflow.com/questions/5442011/serving-gzipped-css-and-javascript-from-amazon-cloudfront-via-s3
  • http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/ServingCompressedFiles.html
  • http://requirejs.org/docs/optimization.html