Continuous Integration and Delivery Magic by Steve Mays
As part of our series of interviews with FinTech CTOs, we talked to Steve Mays, former Chief Technology Officer (CTO) of Trizic, about how he established development and delivery processes at the company.
Steve has a long background in computing and internet technologies, and he also holds an MBA. Before moving to wealth management, he built and managed enterprise-class software platforms. Steve was part of Twitter’s operations team, worked on special effects in movies with George Lucas and Industrial Light & Magic, and was the US government’s representative in Central Asia for electronic governance. In June 2015, Steve was announced as CTO at Trizic.
We talked to Steve about the situation he found when he joined Trizic, and the steps he took to solve the problems.
The situation in the industry
Steve was invited to join Trizic by Drew Sievers, the company’s CEO. At that time the entire industry was facing problems related to automation. When taking on new clients, financial advisors had to look at every trade and every stock in every portfolio, on every account, every day.
“That person could never look at all those things, because there’s simply not enough time in the day.”
Talking to financial advisors, Steve realized that their work required enormous amounts of paperwork and manual processing.
“I thought, this is a place where automation could really create a better experience for everybody. The advisor could actually create advice. The advisor could do what they should be doing, which is building trust and confidence and calling their clients and building portfolios that spot mega-trends in the industry, versus the minutiae of a given portfolio. And let that advisor do that thing, versus being bogged down in the groundwork.”
Steve conducted some research into Trizic and found that the company was on the leading edge of the new generation of technology. He decided he wanted to help Trizic solve its automation and scale problems. Steve’s experience at Twitter allowed him to identify a formula for success.
Step by step, Steve embarked upon the task of enhancing Trizic’s cloud-based digital investment advisory platform.
Step 1. Useful features and rapid iterations
First, Steve decided that the top priority was to create a platform full of the features required in the marketplace. The company had to learn the space, identify its needs, and satisfy those needs as soon as possible.
“[We had to] build the features that the space needs. Iterate rapidly. Build an MVP [minimum viable product] type of technology versus fully polished, super great, scalable technology that nobody wants.”
To deliver valuable features, Steve had to build a new pipeline of continuous integration and continuous deployment. He was excited about having the opportunity to re-architect how the work was done, how the technology would work, using things that he had learned not only at Twitter, but also at Edmunds and other big companies, and to grow and lead a team to solve this big problem using an iterative approach to software development.
“Let’s deliver value every day, every week, versus waiting for a release; [we felt that] that would be well accepted by this space even though it was a new concept.”
Step 2. From monolith to microservices
When Steve came to the company, the platform had a monolithic architecture installed on one single Amazon instance. “It was nominally cloud, but it wasn’t [a] cloud the way I perceive [it].”
Steve knew that the following had to be achieved:
- the platform had to scale from the ground up;
- all system tests had to be automated to keep the platform secure.
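To make the second point concrete, here is a minimal sketch of the kind of automated check that replaces manual verification. The rebalancing rule, figures, and names are illustrative, not Trizic’s actual code.

```java
// Hypothetical automated system test: verify that a drifted portfolio
// generates the correct trades to return to its target allocation.
public class RebalanceTest {
    // Trade needed to move one holding back to its target weight.
    static double tradeFor(double current, double target, double total) {
        return target * total - current;
    }

    public static void main(String[] args) {
        double total = 100_000.0;
        // Portfolio drifted to 70/30; target allocation is 60/40.
        double stockTrade = tradeFor(70_000.0, 0.60, total);
        double bondTrade  = tradeFor(30_000.0, 0.40, total);
        assert Math.abs(stockTrade + 10_000.0) < 1e-6 : "should sell 10k of stock";
        assert Math.abs(bondTrade  - 10_000.0) < 1e-6 : "should buy 10k of bonds";
        System.out.println("rebalance test passed");
    }
}
```

Once checks like this run on every commit, no engineer has to inspect every portfolio by hand, which is exactly the scaling problem automation was meant to solve.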
Automation became the first priority. Unexpectedly, Steve met resistance from the engineering team.
“One of the engineers said, ‘We didn’t go to computer science school to write tests.’ Think about that given modern software architecture. A lot of that team doesn’t work here anymore. That kind of thinking is unacceptable.”
When all tests had been automated, the platform was still monolithic. The development team ran some of the services behind URI-based routing with HashiCorp Consul.
“We have strangely become the poster child for HashiCorp.”
For databases, Steve proposed a temporary caching solution, and the company moved to queue-based processing.
“Now we had a couple of automated, cloud-deployed, tested, queue-driven monoliths.”
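The queue-driven idea can be sketched in a few lines: producers enqueue work and return immediately, and a consumer drains the queue asynchronously, so components stop calling each other directly. This is an illustrative in-memory version, not Trizic’s implementation, which would use a real message broker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal sketch of queue-driven decoupling (names are hypothetical).
public class QueueDriven {
    static final BlockingQueue<String> orders = new ArrayBlockingQueue<>(100);

    // Producer: enqueue the order and return immediately.
    static boolean submit(String order) {
        return orders.offer(order);
    }

    // Consumer: process whatever has accumulated on the queue.
    static List<String> drain() {
        List<String> processed = new ArrayList<>();
        orders.drainTo(processed);
        return processed;
    }

    public static void main(String[] args) {
        submit("BUY 10 VTI");
        submit("SELL 5 BND");
        System.out.println(drain()); // both queued orders, in submission order
    }
}
```

Because the producer never waits on the consumer, each side can be scaled, tested, and deployed on its own, which is what made carving services out of the monolith tractable.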
Then, piece by piece, they took away parts of the monolith and created a microservices architecture.
When the company started working with John Hancock and other corporate clients, these clients were not expected to use Trizic’s UI, but to apply their own. Steve wondered, “What would [happen] if we got 5,000 API transactions in a second? We’d better scale that out.” They therefore built a consumer API set that communicated via the queue using the near-cache concept implemented in Redis. “We started doing a tiered data architecture with Spring, Java, Redis, RDS, and Amazon.”
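The near-cache concept mentioned above can be sketched as a small local map sitting in front of a remote cache. The `RemoteStore` function here is a stand-in; in Trizic’s stack it would be a Redis lookup, and the class and field names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hedged sketch of a near-cache: a local in-process map in front of a
// remote cache (Redis in the article's stack). Not production code.
public class NearCache {
    private final Map<String, String> local = new HashMap<>();
    private final Function<String, String> remote; // stand-in for a Redis GET
    int remoteHits = 0;                            // counts trips to the remote tier

    NearCache(Function<String, String> remote) {
        this.remote = remote;
    }

    // Serve from the local map; fall through to the remote tier only once per key.
    String get(String key) {
        return local.computeIfAbsent(key, k -> {
            remoteHits++;
            return remote.apply(k);
        });
    }

    public static void main(String[] args) {
        NearCache cache = new NearCache(k -> "value-for-" + k);
        cache.get("account:42");
        cache.get("account:42"); // second read is local
        System.out.println("remote hits: " + cache.remoteHits); // 1
    }
}
```

Under a burst of 5,000 API transactions a second, repeated reads of hot keys stay in-process, so only cache misses ever touch Redis or the database tier behind it.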
Microservices and multi-tiered architecture allowed Trizic to quickly create new services once the need appeared. For new microservices, the most suitable languages could be chosen.
The team then dissected the ontology and sharded data along their ontology lines to benefit from good distribution.
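Sharding along ontology lines can be illustrated with a stable hash: entities of the same kind (accounts, portfolios) are routed to a shard by a deterministic function of their key, so related data distributes evenly. The key format and shard count below are assumptions for the sketch.

```java
// Illustrative shard router: map an entity key to one of N shards.
public class ShardRouter {
    static int shardFor(String entityKey, int shardCount) {
        // Math.floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(entityKey.hashCode(), shardCount);
    }

    public static void main(String[] args) {
        int shards = 4;
        // The same key always maps to the same shard, so reads find their writes.
        System.out.println(shardFor("account:12345", shards));
        System.out.println(shardFor("portfolio:777", shards));
    }
}
```

The essential property is determinism: any service holding a key can compute its shard independently, with no central lookup in the hot path.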
Step 3. Config flags and continuous deployment
Steve continued to augment the platform. He explained to the company’s tech leaders that processes should be established so that nobody experienced daily collisions with anyone else at various points.
“Never the two should meet. If we break out every business process into its own set of services, we can actually turn this into [the] Henry Ford of software. We could parallelize assembly lines of software. Isn’t this anything other than ISO 9001 for software?”
Steve decided to use the most valuable people in the company to provide junior developers with guidance, direction, code review, and training. This helped inexperienced employees be more effective, “because they’re in a framework of caring guidance that produces quality code.” Creating a FinTech and WealthTech knowledge base allows new developers who lack domain expertise to obtain industry insights, get to know approved practices, and receive clarification on core aspects and critical issues.
At that time, many solutions were appearing that needed to be developed and deployed as quickly as possible, so config flags were introduced. Steve explains, “Config flags is the concept [in which] you continuously write little bits of software, little micro chunks of code and you deploy every hour, as fast as you can. You merge, test, and deploy. The config flag tells others, ‘Don’t do anything,’ and then you build up your prod like Legos, as opposed to large deploys.”
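The pattern Steve describes can be sketched as follows: half-built code ships to production “dark” behind a flag, and flipping the flag in configuration turns it on without a redeploy. The flag names and code paths here are illustrative.

```java
import java.util.Map;

// Minimal config-flag sketch: unfinished code ships dark behind a flag.
public class ConfigFlags {
    static final Map<String, Boolean> flags = Map.of(
        "new-rebalancer", false,  // merged and deployed, but inactive
        "tiered-cache",   true);  // fully rolled out

    static String rebalance() {
        if (flags.getOrDefault("new-rebalancer", false)) {
            return "new path";    // invisible to users until the flag is flipped
        }
        return "old path";
    }

    public static void main(String[] args) {
        System.out.println(rebalance()); // "old path" while the flag is off
    }
}
```

Because every micro-chunk of code is guarded this way, merging and deploying hourly is safe: production accumulates dormant pieces, “like Legos,” and features are assembled by configuration rather than by big-bang releases.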
As a result, developers can deploy multiple times a day, whenever code is ready or a patch is needed. It takes about 40 minutes to go through deployment and the test suites, and the team is working on parallelizing even that.
“We work to break up the different tests, and they’re not all one big stovepipe. We’ve even parallelized and automated our deployment, so it really only takes one command to do a complete deployment for us.”
Moreover, the team has improved its tests to provide test-driven infrastructure, architecture, and development.
With respect to the production environment, Steve says that they deploy to Amazon East and Amazon West.
“We have a hot–hot live environment. We can turn up a customer at either site. I don’t believe in hot–cold environments. Cold is that thing that won’t run when you really need it, so we constantly divvy up traffic between Amazon sites.”
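A minimal sketch of divvying traffic between two live sites might alternate requests between regions; a real hot–hot setup would use weighted DNS or load balancers rather than application code, and the region names here simply echo the article’s Amazon East/West.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative hot-hot router: both sites serve live traffic all the time,
// so there is no "cold" site that might fail to start when it is needed.
public class HotHotRouter {
    static final String[] sites = {"us-east-1", "us-west-2"};
    static final AtomicLong counter = new AtomicLong();

    // Alternate successive requests between the two live regions.
    static String route() {
        return sites[(int) (counter.getAndIncrement() % sites.length)];
    }

    public static void main(String[] args) {
        System.out.println(route());
        System.out.println(route()); // the other region
    }
}
```

Since both regions constantly carry production load, a failover is just a shift in the traffic split, not a gamble on whether an idle environment will come up.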