The Power of Encouraging Women Who Code

Personal Capital is proud to sponsor Women Who Code, a non-profit organization dedicated to inspiring women to excel in the technology industry.

Most articles about the ratio of women to men in tech focus on the negative, citing differences in population, salary, and leadership roles. I would like to take time to put a spotlight on a couple of positive anecdotes that show how people can make a huge difference by providing a little support.

About a year ago, a female friend of mine pulled me aside at dinner and told me she had made a career change to become a web developer. She said the idea of entering the tech industry first took hold after a discussion we had years back, in which I (as a female developer myself) had encouraged her toward this new direction. The news was such a pleasant surprise because it made me realize how a simple conversation can change the course of a person’s life in a positive way, either through the introduction of an idea or by instilling confidence in one that had been lying dormant.

It reminded me of a conversation that changed the course of my own life. It was my first year at UC Berkeley, and I was taking CS3: Introduction to Symbolic Programming, on a whim. I loved the class and considered leaving the hazy pre-med path my parents had laid out in favor of a computer science focus. I was concerned, though, because so many people in the class had previous programming experience and I was starting from scratch. I asked my sister if I was crazy for pursuing this foreign path, and she told me that if I was good at it and liked it, I should go for it.

Fast forward to now: I’m the Director of Mobile Technology at Personal Capital, a company that I believe in, where I’m doing what I love. It just goes to show how a little inspiration and support can go a long way, and I’m proud that Personal Capital is sponsoring Women Who Code, an organization that is dedicated to providing both.

Our Engineering Values

What We Do

We align our engineering efforts with our company’s mission: Better financial lives, through technology and people.

Just as our financial advisors abide by the fiduciary standard (meaning the financial advice they give must be in the best interest of our clients), we develop products and services that serve the best interests of our users and clients.

We create new business opportunities by tackling complex financial problems facing millions of American families.

Every day we create a more connected and data-driven financial ecosystem for our users.

We take pride in our intuitive user experiences and a high quality of service that delights our users.

We filter out the noise to surface the most important information via data visualizations that show how financial actions create a storyline our users can follow to understand their current and future net worth.

We empower our users with personalized financial planning tools.

How We Do It

We believe the best solutions come from empowered cross-functional teams of engineering, product, marketing and advisory, and we believe in the spirit of collaboration.

We believe an important way to deliver financial advice is through data visualization and intuitive user interfaces.

We believe everything we do should be measurable, transparent and accessible to all.

We value open and streamlined communication, from documentation in code and automated tests to captured product discussions; we speak our minds openly and freely.

We believe quality is everyone’s concern, and together we work hard to create a product that will improve people’s lives.

We embed best security practices in everything we do, at every level of our organization.

We value accountability, with the goal of fostering leadership and focus.

We work smarter, not harder: we use data to create and refine our business rules to get better results and use automation to scale them.

We promote our processes and best practices by automating them.

We let our code speak for itself through documentation, unit tests and test cases.

At Personal Capital, we’re all about changing the face of financial technology, and we are always looking for great talent to join our team. You will join a team that works together to motivate, inspire, and change an industry.

Amazon Machine Learning – Is It Right For Your Company?

At Personal Capital we have been leveraging machine-learning techniques to solve business problems since the early days of our company.  As we’ve grown and faced new challenges that require intelligent and scalable solutions we’ve turned to machine-learning based solutions over and over again.  We firmly believe that a continued investment in systems that help garner insights and solve complex problems for our customers will be critically important.

When Amazon announced their new machine-learning platform, Amazon Machine Learning, we were very excited to evaluate it.  With the promise of scalability, speed, and ease of use, we felt that it could serve as a potential platform upgrade for our machine-learning challenges.

In the spirit of exploration, we spent a few days getting familiar with the AML product.

AML – Current Capabilities:

Create Dataset:

AML lets you consume CSV data from an S3 bucket. Additionally, it will allow you to define a query to run on your Redshift servers in order to create a CSV file usable for machine learning.

After defining an input source, it will then process your file and attempt to assign types to each of your features (categorical, numeric, binary, and text). The UI presents a sampling of the feature values so you can see if an error was made.  It will allow you to re-select the type for any column if it was chosen incorrectly.  You must also choose a column to be trained on (dependent variable).

The interface for pointing to datasets and processing them is clean and intuitive.  In fact, it seems that the entire AML UI has been carefully composed to make AML usable by almost anyone.

Train Model:

Once you’ve created a dataset you can then build a model on that dataset.  Logistic regression is currently the only supported model type for binary classification.

By default AML will split your file into training and testing parts.  It randomly samples 70% of the dataset for training and 30% for testing.  This is par for the course when it comes to machine learning.  AML will also allow you to specify the training and testing sets separately.  This is good because sometimes it’s desirable to split in a different way.  You may want to split by a timestamp or ensure that a given user is only in the training or testing set.
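A split that keeps each user entirely in either the training set or the testing set could be sketched as follows. This is a minimal illustration in plain Python, not something AML provides: the `user_id` field, the hash-based assignment, and the `split_by_user` helper are all assumptions made for the example. The two resulting files would then be supplied to AML separately.

```python
import hashlib


def split_by_user(rows, train_fraction=0.7, key="user_id"):
    """Assign every row of a given user to the same side of the split."""
    train, test = [], []
    for row in rows:
        # Hash the user id to a stable number in [0, 1) so the same user
        # always lands on the same side, run after run.
        digest = hashlib.md5(str(row[key]).encode("utf-8")).hexdigest()
        bucket = int(digest, 16) / 16 ** len(digest)
        (train if bucket < train_fraction else test).append(row)
    return train, test


# Hypothetical data: 500 events spread over 50 users.
rows = [{"user_id": i % 50, "clicks": i} for i in range(500)]
train_rows, test_rows = split_by_user(rows)
```

Because the assignment is made per user rather than per row, the row counts will only approximate a 70/30 split.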

Evaluate Model:

AML will evaluate the model for you as part of the model-building step.  It has a nice interface that allows you to see how accuracy, false positive rate, precision, and recall vary with different thresholds.  You can choose a new threshold for the model if you wish, which will then be used for scoring purposes.
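To make the threshold trade-off concrete, here is a small sketch of how those four metrics can be computed for a given cutoff. It is plain Python with made-up scores and labels, and it says nothing about how AML computes them internally.

```python
def metrics_at_threshold(scores, labels, threshold):
    """Return accuracy, false positive rate, precision and recall."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up model scores and true outcomes, for illustration only.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]
for threshold in (0.25, 0.5, 0.75):
    print(threshold, metrics_at_threshold(scores, labels, threshold))
```

Raising the threshold generally trades recall for precision, which is the relationship the AML interface lets you explore.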

Generate Predictions:

You can generate predictions using your new model in batches (upload a file of observations to score) or in real-time using their prediction API.

Thoughts On AML:

OK, so the system seems to be well thought out and intuitive. How does it work for us, though? Here are our thoughts on each of the various pieces:

Model Type:

Logistic Regression (LR) is the only model type available with AML. LR generally requires more time and care in handcrafting your features than other model types. For example, if you have two features, one called state and the other called age, a logistic regression model will not be able to figure out that older people who live in Florida prefer a specific type of product. By default it will have one weight for state code and one weight for age. You need to create a feature that is a combination of state and age in order to capture the combined signal. AML does provide the ability to create combination variables by combining all possible text values of two variables. This is good, but it still requires you to suspect that two variables interact before performing the combination. It’s not feasible to combine all features with each other, especially when three-way or four-way combinations may contain most of the predictive power. Additionally, adding features to a dataset increases the dimensionality and therefore the amount of data required to train a robust model. Other model types, such as decision trees and neural networks, can figure out these feature combinations for you and drastically reduce the time to build a good model.
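As a rough illustration of the state and age example, a hand-built combination feature might look like the sketch below. It is plain Python, not AML's own combination transform; the age cut points and the `state_x_age` name are arbitrary choices made for the example.

```python
def age_band(age):
    """Bucket a numeric age into a coarse text label (cut points are arbitrary)."""
    if age < 35:
        return "under35"
    if age < 65:
        return "35to64"
    return "65plus"


def add_state_age_cross(row):
    """Return a copy of the row with a combined state x age-band feature."""
    crossed = dict(row)
    crossed["state_x_age"] = f'{row["state"]}_{age_band(row["age"])}'
    return crossed


# Hypothetical observations.
rows = [
    {"state": "FL", "age": 71},
    {"state": "FL", "age": 29},
    {"state": "CA", "age": 68},
]
crossed_rows = [add_state_age_cross(r) for r in rows]
for row in crossed_rows:
    print(row)
```

Treated as a categorical feature, `state_x_age` gets its own weight per combination, so a value like `FL_65plus` can carry a signal that neither state nor age carries alone.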

Logistic Regression cannot handle numeric features that are non-monotonic in relation to the outcome. An example of where this is important would be a feature tracking the number of site logins a user has and assigning a probability of an outcome to that user based entirely on that feature. The likelihood of the outcome could be higher when you’ve not seen the user before, lower when you’ve seen them a normal number of times, and higher again when you’ve seen them a very large number of times. You could solve this problem by binning your numeric variables using a supervised technique. It would be nice if AML supported this use case as one of its feature transformations. Something like the MDLP function within the discretization package in R would be great.
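The sketch below shows the binning idea on the login-count example. Note that it uses fixed, hand-picked cut points rather than a supervised method such as MDLP, which would choose the cut points from the data; the cut points and the `one_hot_bin` helper are assumptions for illustration.

```python
from bisect import bisect_right


def one_hot_bin(value, cut_points):
    """Map a number to a one-hot list with one slot per bin."""
    index = bisect_right(cut_points, value)
    return [1 if i == index else 0 for i in range(len(cut_points) + 1)]


# Hypothetical cut points for a login-count feature: 0, 1-49, 50 or more.
LOGIN_CUTS = [1, 50]

print(one_hot_bin(0, LOGIN_CUTS))    # user never seen before
print(one_hot_bin(12, LOGIN_CUTS))   # typical user
print(one_hot_bin(400, LOGIN_CUTS))  # very heavy user
```

Each bin becomes its own feature with its own weight, so the model can assign a high weight to the first and last bins and a low weight to the middle one.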

Transparency:

Amazon will not expose the model that it built to you. You will never be able to see which features received which weights. Depending on your application and goals, this can be a real deal breaker. You can tout the great AUC and PR curve that your model has, but when the consumers of your model notice something amiss and you state that you “don’t know what’s going on” … well, I certainly would not want to be put in that position. Keep in mind that these metrics for measuring models do not usually take into account the subtleties of errors. It’s great that your model has a recall rate of 95%, but perhaps the missing 5% contains one of the largest and most important cases. You could argue that the impact of each observation should have been modeled, but to be honest these things tend to be iterative. Not being able to see what your model is doing is a HUGE drawback, and in my opinion a black box like this cannot form the basis of a system for any serious type of machine learning, but your mileage may vary.

Cost:

In total it cost me $0.61 to build the sample model that comes built into AML … that represents over an hour of machine time to build a single logistic regression model. If we gloss over the issue of whether this sample model is representative of real-world models, this amount can be either very low or very high, depending on what you’re doing. If you are building a model across your entire dataset which can be leveraged for a few months without retraining, then this cost is extremely low. If, on the other hand, you are building models for subsets of data (each user, each item, each offer, etc.) and you are doing this on an ongoing basis (every few days/hours/minutes), then the costs can really begin to add up.

Scoring:

The ability to create real time scores on your models is what will sell many people on AML immediately.  You can bypass the creation of software required to score models in real time.  Typically this code is not complex but you have to manage models in memory, determine which model to apply to which event, and be very rigorous about testing the accuracy of the scoring system.  All of this can be bypassed by using AML.
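For a sense of what the do-it-yourself alternative involves, here is a minimal sketch of scoring a logistic regression model in plain Python. The weights, intercept, and feature names are invented for the example; since AML does not expose its trained weights, they would have to come from a model you trained elsewhere.

```python
import math


def score_logistic(weights, intercept, features):
    """Return the predicted probability for one observation."""
    # Unknown feature names contribute nothing to the score.
    z = intercept + sum(
        weights.get(name, 0.0) * value for name, value in features.items()
    )
    # Numerically stable sigmoid: avoids overflow for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical model and observation.
weights = {"logins_last_30d": 0.08, "has_linked_account": 1.2}
observation = {"logins_last_30d": 5, "has_linked_account": 1}
print(score_logistic(weights, -2.0, observation))
```

The arithmetic is the easy part; the model management and testing are where the real effort goes.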

They’ll charge you a penny for 100 predictions.  Whether this is expensive again depends on the scale of your classification/scoring problem.  If you’re scoring new user registrations, and there are a few thousand of those a month, then this would likely be low cost.  If however you’re scoring something that happens hundreds of millions of times per month your cost would be large and ongoing.
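The arithmetic behind those two scenarios is easy to check. The sketch below uses the quoted price of a penny per 100 predictions; the monthly volumes (5,000 and 300 million) are hypothetical stand-ins for "a few thousand" and "hundreds of millions", and it ignores any other charges.

```python
PRICE_PER_PREDICTION = 0.01 / 100  # a penny for 100 predictions


def monthly_scoring_cost(predictions_per_month):
    """Return the monthly prediction cost in dollars."""
    return predictions_per_month * PRICE_PER_PREDICTION


for volume in (5_000, 300_000_000):
    print(f"{volume:,} predictions/month: ${monthly_scoring_cost(volume):,.2f}")
```

That works out to roughly fifty cents a month in the first case and roughly $30,000 a month in the second.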

When it comes to latency, here’s what they say: “The Amazon ML system is designed to respond to most online prediction requests within 100 milliseconds.” Depending on your use case, this may not be particularly comforting. If you want your site to appear seamless while scoring events, AML is likely not for you. You can of course write a wrapper around your scoring requests and serve some default when AML does not respond in time, but that’s a bit of a gamble. Generally, scoring a logistic regression model should be lightning-quick … if you code the scoring system yourself.
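Such a wrapper could look roughly like the sketch below. It does not call the real AML API: `score_fn` stands in for whatever function makes the prediction request, and the 100 millisecond budget and the default score of 0.5 are assumptions for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)


def score_with_fallback(score_fn, event, timeout_seconds=0.1, default=0.5):
    """Call score_fn(event), returning default if it is too slow or fails."""
    future = _executor.submit(score_fn, event)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        # The request keeps running in its worker thread; we just stop waiting.
        return default
    except Exception:
        # Any error raised by the scoring call also falls back to the default.
        return default


def fast_score(event):
    """Stand-in for a prediction request that answers immediately."""
    return 0.9


def slow_score(event):
    """Stand-in for a prediction request that misses the latency budget."""
    time.sleep(0.5)
    return 0.9


fast_result = score_with_fallback(fast_score, {"id": 1})
slow_result = score_with_fallback(slow_score, {"id": 2})
print(fast_result, slow_result)
```

The gamble is visible here: the slow request still consumes a worker thread, and the caller acts on a default score that may be badly wrong for that event.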

It would be nice if they made the trained models available through a download in some format like PMML.  If they’ve structured their profits around scoring though, this is not likely to ever happen.  Still, it’s a major flaw in the system design from a usability perspective.

Transformations:

The concept of transformation expressed in a flexible way is a great idea. No doubt many machine learning teams have thought of this and hoped for it but didn’t have the time to build it for their specific application. It makes a lot of sense that a large-scale service provider like Amazon would build something like this. That being said, the set of transformations is light. As Amazon themselves state on their AML website, feature pre-processing is usually the most impactful part of building models. I’d personally take a bad model and a rich feature set over the opposite any day. That is why it’s surprising that their transformation set is so limited. Perhaps they are planning to build it out over time. In any case, this can be overcome by using EMR to transform and create features prior to model building. That raises the question, though, of why you’re using AML at all when EMR comes with Mahout built in. You can do your own feature transformations and build a random forest model with minimal effort. Creating code to score a random forest in production is not too difficult. It’s true that you should be very rigorous around testing it, but once you have it your system will be much more flexible in general and there is no cost per transaction (on top of server costs).

AML is a wonderful proof-of-concept tool. It’s great for showing management that this nebulous thing they’ve heard of called machine learning can be wrestled down and made functional in a few hours. For very simple tasks that do not require much oversight (i.e. anything is better than nothing), AML would work quite well. In general, though, I would imagine that any team solving serious machine learning problems would have to evolve past AML at a very early point. Either that or wait for the evolution of AML, which will no doubt occur.

3 Ways to Build Engaging Forms

Why

The input form is an essential element of our web application and is widely used to gather the key information needed to build personalized features for the user. In many cases, it is the key engagement driver for our conversion points.

What

Build engaging forms that drive conversion.

How

Add personality

Most of the time, web forms are completed by users without any human guidance, so it is essential that we communicate our personality and stay relatable, making the process more enjoyable and human. With our most recent form, we made the language more colloquial in an effort to make it feel like a conversation with a financial advisor. We presented the form over a chart background to make it more contextual and to demonstrate how the form data affects the chart.

[Screenshot: our most recent form presented over a chart background]

Our next step would be to add some quirkiness and make it more fun. TurboTax seems to do this quite well, with a very casual style for form labels and a quirky, fun response to every user input :) For example, check out the responses in gray text below each input in the following screenshot.

[Screenshot: TurboTax form showing responses in gray text below each input]

Add interactivity

In general, people want to interact with elements that feel alive and look familiar. The key to making a form feel alive is providing instant feedback as the user interacts with it. We achieved this in our most recent effort by illustrating how each component of the end feature is built as the user progresses through the form. This way, the user can see how the inputs they have provided helped us build the feature, rather than being overwhelmed by a bunch of questions and only seeing the feature at the end. It should also teach the user what they need to adjust to have the desired effect on the end goal.

Other forms of interaction we have used include context-based hints and tips, smart defaults, instant validations, and minimizing the number of inputs by avoiding unnecessary ones and deriving values from data already provided. Using appropriate input controls also goes a long way toward making the form more interactive. For example, we use a slider for the how-much-you-save input and a text input for retirement age.

Take a look at this short video to see how these all come together.

Personal Capital is uniquely positioned to suggest values for most of the financial data based on the financial accounts that a user has aggregated with us. That means one less entry for the user but, more importantly, one less mental calculation that the user needs to perform. Instead, we use data science to calculate these values more accurately. This will be discussed at length in a different post.

Break up the forms

Last year, we ran an A/B test comparing a long form with a short form. The long form had all the inputs up front and a chart that updated based on the provided inputs. The short form grouped the inputs into smaller sets (2-4 questions per set), presented them as a sequence of steps, and updated the chart at the end of each step as instant feedback on the user’s inputs.

The results of the test were that the long form was more engaging and the short form converted better. We learned that breaking up forms into bite-sized chunks, and building a sense that the user is completing steps and working towards the end goal, is better for conversion and drives users to the next level.

So, when we built our most recent feature, we used these findings to build two different experiences: one for a first-time user and one for an engaged user. Users coming to the feature for the first time are taken through the short form variation, while an engaged (returning) user sees the long form.

This has proved very successful for us in building a complex feature that requires a fair amount of data and presenting it in a way that is engaging and interactive and that provides a path to completion towards the end goal.

Reads

http://www.smashingmagazine.com/2011/06/useful-ideas-and-guidelines-for-good-web-form-design/

http://www.lynda.com/Web-Interactive-User-Experience-tutorials/Web-Form-Design-Best-Practices/83786-2.html

Tips for Interns from a Former Intern

This blog post would have been better timed at the beginning of the summer, but whoops. I’ve been working full-time at Personal Capital for over two months now. Last summer I interned here, and in this post I’d like to share some tips with all current and future interns on how to maximize the internship experience.

tl;dr Summary:

  1. This is your time to shine. Show them what you got!
  2. Always be eager to learn and ask how to improve.
  3. Seek out help if you need it.
  4. Dabble in different things.
  5. Go out to lunch.
  6. If you’re interested in returning, let them know!

1- This is your time to shine. Show them what you got!
Congrats! You made it past the interviews, were offered an internship, and are now spending the next couple of months working for the company. This is an audition for both sides. They’re trying to further evaluate your skills and how you interact with others. From your perspective, you want to see if this is a company that you can see yourself working for after college.

2- Always be eager to learn and ask how to improve.
You’re probably the youngest person in the company. You’re surrounded by people with years of experience who have been through a lot. You can learn so much from them! Also, soliciting feedback on the work you’re doing is very helpful in developing yourself as a professional.

3- Seek out help if you need it.
If you’re stuck on something or just don’t understand the task or problem at hand, be sure to ask questions. There’s no such thing as asking too many questions. Otherwise, you might spend a lot of time on something that was not needed or was wrong. I’m sure your supervisor would rather take a little time in the beginning to ensure you don’t spend unnecessary time down the road.

4- Dabble in different things.
This is your opportunity to try out something you don’t really know about. Companies differ in how much you can explore outside what you’re brought in to do, so your mileage may vary on this one. Every company has many projects that are happening or waiting to happen, and this is your chance to see if one of them is what you’re passionate about.

5- Go out to lunch.
You’re surrounded by many fascinating people, all with different backgrounds and journeys they took to get where they are today. Unfortunately, in the office we’re often so focused on the task at hand that we don’t really get to know the people we work with. Grabbing lunch is an excellent opportunity to really get to know them, be it one-on-one or in a group. I know I really enjoyed “Lunch Crew” when I was an intern.

6- If you’re interested in returning, let them know!
Lastly, if you are interested in working there in the future, communicate that towards the end of your internship. After having a great time last summer, I spoke with Ehsan, our VP of Engineering, and was fortunate to be offered a full-time position after graduating. I’m not alone. According to the National Association of Colleges and Employers (NACE) 2015 survey, employers converted 51.7 percent of their interns into full-time hires. You can be a part of the majority!

That’s it! I hope you enjoyed this post and found it useful!