Production Engineer: A Q&A with a PEnguin

Timothy Frison
Published in Clio Labs
9 min read · Sep 22, 2023


Rafe, a Senior Systems Engineer in the Production Engineering group, is an invaluable asset at Clio. When he joined Clio, he brought 23 years of diverse experience, including various support roles in software, hardware, and networking, as well as over a decade of wearing a developer hat. His goal when joining Clio? To become a PEnguin* and share his expertise with the team. In this Q&A, we share Rafe’s perspective on life as a Clion and highlight the critical role of our Production Engineers.

*We call the members of our Production Engineering (PEng) group PEnguins.

Rafe Slattery, Senior Systems Engineer — Production

Thanks for taking the time to do this interview, Rafe! Could you tell us about yourself and your role here at Clio?

Although I started as a developer here, my goal was always a more DevOps-oriented role. At a previous company, my manager said to me, “Given your mix of skills, I reckon DevOps would be a good thing.” That looked to me like going back into a mix of development and application support, networking, and all that stuff I spent years learning (rather than just throwing it behind me and never touching it again).

Why not go into a role I can continue to grow into, and use all that stuff that’s in my head? I was enjoying what I was doing, but I thought that long-term I’d like to incorporate something else. When I became a PEnguin in 2021, it meant that every now and again I could stir up something from my past experience that I wouldn’t touch as a developer. As a developer, you’re up to your neck in Ruby and JavaScript; barring an occasional performance thing, I think that’s where you live.

As a PEnguin, what I love is that no two days are the same. You’re never all-in on one language or technology. Let’s see, the stuff I’ve picked up: Golang, Ruby, and Terraform, just to name a few. You’re interacting with an awful lot more technologies, and that’s just fun.

Can you describe what it means to be a PEnguin at Clio and how it contributes to the success of both the company and customers?

As PEnguins, we develop the tools that developers use to do their jobs better. That’s one of our primary roles. But then, we’re also heavily involved in providing the infrastructure for production and making sure that it’s running as it should. You know: security, reliability, and performance, all within expected parameters. Can machine A talk to machine B? All that stuff.

The folks writing the Ruby and the JavaScript have one set of needs: that our development tooling works, that our beta environments work, and that authentication is working for all of these internal tools. These are our developer-experience customers. If their code can’t be deployed to production, then it can’t be used by Clio’s customers.

Then we have Clio’s customers: the Mr. Clio Grow, Ms. Manage, and Mx. Lawyaws of the world. They have a completely different set of requirements. They don’t care if our CI/CD system is down. They care about whether they can create matters, or “can I send a bill today”, stuff like that.

If we’re not doing our job and doing it well, then production goes down. If we don’t do anything day-to-day, in X number of days the system will degrade and none of the people using Grow or Manage will be able to do anything.

It’s weird for me, because my work is focused almost exclusively on helping the developers. When it comes to Clio’s customers, most of the time things are just ticking along; it’s only if things fall over during the day that I step in to help.

Can you share an example of a significant project or initiative that the Production Engineering team has undertaken at Clio, highlighting the impact it had on the company and its customers?

The current project I’m working on is something we’re calling the “Transient Tracker”. Anyone who is technical will understand that transient, or flaky, tests happen in test infrastructure. And we’re coming up with a means to address that.

Like any large software project, we have a lot of tests. Any given CI run or build will be executing north of 100,000 tests. These tests are distributed across, say, 100 machines, and some of them fail sometimes. It’s a fact we have to live with, but we’re trying to limit their impact on the delivery of our software.

We’re working on tracking these transient test failures when they occur. Then we’re moving on to preventing them from failing future builds. When they happen, we’re going to notify the person most likely to have caused them, so the failures can be investigated more deeply. We’ll also warn the developers, but we won’t block the code from being pushed to production.

We’re going to do this while maintaining an exceptional quality bar for our software. That will increase the effective speed of developers, because they will no longer have to wait another 20 minutes for their builds to complete before their changes can be shipped. Everyone’s happy. Developers aren’t sitting around scratching their heads and Clio’s customers get features faster.
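Editor’s sketch: the interview does not describe how the Transient Tracker is built, so the following is only a rough illustration of the flow Rafe outlines (track failures, warn and notify, but don’t block on known transient tests). Everything here is an assumption made for illustration: the class and method names, the rule that a test counts as transient once it has both passed and failed on the same commit, and the idea of notifying the last recorded author.

```python
from collections import defaultdict


class TransientTracker:
    """Hypothetical sketch, not Clio's implementation."""

    def __init__(self):
        # test name -> commit SHA -> set of outcomes seen (True = pass, False = fail)
        self.outcomes = defaultdict(lambda: defaultdict(set))
        # test name -> last recorded author (assumed to be supplied by the CI system)
        self.last_author = {}

    def record(self, test, commit, passed, author=None):
        """Store one test result from a CI run."""
        self.outcomes[test][commit].add(passed)
        if author is not None:
            self.last_author[test] = author

    def is_transient(self, test):
        # Assumed rule: the same code both passed and failed, so the test is flaky.
        return any(len(seen) == 2 for seen in self.outcomes[test].values())

    def evaluate_build(self, failed_tests):
        """Block only on failures that are not known to be transient."""
        blocking = [t for t in failed_tests if not self.is_transient(t)]
        notify = [
            (t, self.last_author.get(t, "unknown"))
            for t in failed_tests
            if self.is_transient(t)
        ]
        return {"blocked": bool(blocking), "blocking": blocking, "notify": notify}
```

In this sketch, a build whose only failures are known transient tests comes back with "blocked" set to False and a "notify" list of (test, author) pairs to warn, while any failure that is not known to be transient still blocks the build.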

What are some unique technical challenges that arise in the legal tech industry, and how does the Production Engineering team tackle them at Clio?

I spent 10 years in software for hardware asset management, and they’re just completely different fields. That was a case of “here’s your MSI [Windows installer], you’ve installed it on your machine, you go right at it.” You had full admin rights and everything; there were no privacy concerns.

Not only do we need to have our own privacy, data, and security practices, but our Clio customers also have their own legal requirements for their data. Solicitors are holding data for their clients, and the fields our customers are working in have knock-on effects on us. Be it data escrow or data residency, we have to make sure we satisfy the requirements of all that fun stuff.

From a security perspective, the impact of screwing up is a company-killer. It’s a huge one, perhaps not legal-tech specific, but we have a high security requirement. Solicitors might have data for a client who has a traffic ticket, or might have data for a client who is involved in cases against a state.

We have to have the highest possible security. We don’t know if we’ll be targeted by Jenny and Johnny the script kiddies, or by more sophisticated hacking groups. There is a significant security team ensuring that what we have is inaccessible when required, and monitoring it. It has to be there, and it has to be right.

What technical skills and expertise are crucial for success in the Production Engineering role at Clio? How does Clio support the technical growth and development of its PEnguins?

There are very few specific requirements. What there is, is a lot to learn, which is made easier with a breadth of previous experience across development, ops, and security. Nothing I do is something I couldn’t have picked up along the way given my background. There is no requirement to come into this job and know Terraform, because I didn’t. There is no requirement to know Go, because I didn’t know that either. If a new-to-me technology becomes a part of my day-to-day, I’m given the time to get to grips with it.

Yes, I spent years doing networking work. I know about routers, CIDRs, subnet masks, and all that stuff. But day-to-day you don’t need to know that. What you need is the ability to learn, and the ability to cope with not knowing enough.

I came in as a PEnguin having been a developer with our code bases for, let’s say, two years. When I hit PEng that was useful, and I occasionally reference it, but there is very little need for it. What you really need is a desire to learn and a desire to be challenged.

For our team, what we do today is probably going to be different than what we do next week. It’s almost certainly going to be different from a month down the line. You will never be an expert in everything.

What you’ll need is a basic knowledge, at a broad level, of a huge number of technologies. Then, you need to be able to refresh yourself quickly when you get back into stuff. Right now, I’m up to my neck in Ruby. Previously I wasn’t; it was Terraform. Before that, a huge amount of Docker and Kubernetes. Our gang in particular changes technology stacks consistently. We’re siloed for a little while, then we hop into the next one.

What was maybe helpful was that I knew a little bit about everything that holds it all together. Everything that we do I’ve probably covered in a small fashion previously, and I suppose that’s been helpful. I love that what we do is constantly changing. You’re never going to settle into a comfortable rut of “I know all of this”.

Beyond the technical skills, what other attributes or experiences do you think are valuable as a PEnguin at Clio?

Don’t be afraid to ask questions. No one here knows everything. Mr. DBA doesn’t know Kubernetes, Ms. Kubernetes doesn’t know how our dev tools roll together. You need a network of people you’re interacting with. That’s something that comes with time. It’s about always being willing to ask the loads of smart people around here questions. You don’t know everything. They don’t know everything. But you get enough heads on a problem and you can muddle it out. Don’t be afraid to talk, don’t be afraid to ask questions, and know that people will help.

I’m also the only PEnguin working in EMEA. I’m 5 hours ahead of the nearest Production Engineer. The whole Dublin team reaches out to me on Slack, so I need to be organized. I even have my own label for tickets to help with this, because sometimes you don’t know if something will take a few minutes or a few hours. This also helps me feel like I have an impact on the team here, because when I look at the tickets I’ve done recently, I think, “There are 10 things here, and I remember when I was in their role and there wasn’t an accessible PEnguin until 5 PM.” Either you were blocked or you had to spend 5–6 hours of your day trying to maybe get there, because it’s not in your realm of experience. I do those things consistently, so I can be very helpful.

Also, being outside of the primary time zones for PEnguins, I need a lot of self-motivation to crack on. If I hit an issue at 9 AM my time, I have two choices: I can either do nothing for 5 hours, or I can just crack on with it for a couple of hours. Maybe I’ll find a solution, or maybe I’ll need some help and have to work on other stuff. That’s a feature particular to me. So the ability to roll with stuff and be flexible matters; I usually have 2–3 things on the go at any given time in case I get stuck.

How does the Production Engineering team collaborate and engage with the broader Clio organization?

The office here is very tightly knit. I also have the benefit of being the third-longest-serving person in the Dublin office. I’m lucky in that way, and am long-tenured by the Dublin office’s standards.

For other groups, like the marketing folks, I am often “the person” in their time zone who can help. In effect, I am PEng in EMEA. That’s a different flow from that of many other Production Engineers.

We have regular Dublin-focused events; for example, this coming Friday we have a led yoga session. Then there are the monthly business reviews, in which we learn about what’s happening in marketing, sales, and customer support.

Unless you go out of your way to avoid stuff, there is consistent information about how other teams are doing. You also have the weekly Town-Halls that are Friday for ye, but I have them at 1 PM on Mondays — it’s been a recurring appointment in my calendar since joining in 2018.

Also, we’re going to have a little West-Ireland Clio meetup. We’re going to hire out a meeting room in a hotel for a day, so everyone on this side of the country can come and meet up in person. And if you believe my Slack messages today, I hosted a kayaking party in August.

We hope this Q&A session has provided you with valuable insight into the pivotal role of a Production Engineer at Clio. These exceptional PEnguins play a critical role in ensuring the security, reliability, and scalability of Clio’s infrastructure — all while evolving the developer experience to ensure the most effective tooling exists for our broader engineering organization. Join us in our pursuit of technical excellence while we transform the legal experience of all.
