Migrating 3.7 Million Lines of Code to TypeScript with Tyler Krupicka

Joe chats with Tyler about how Stripe migrated 3.7 million lines of code to TypeScript in a single pull request.

Show Notes

Tyler is part of the JavaScript infrastructure team at Stripe. He states that since 2016 Stripe had been heavily flow-typed JavaScript and React, but as time went on, they began to notice cracks in the functionality as the applications grew. This is when they started investigating TypeScript as an alternative moving forward.

Tyler said that, as of now, the Stripe Dashboard is the biggest migration project they have taken on, but that they are in the process of migrating all of the Stripe codebases to TypeScript. After deciding to take on a more “all in” approach, they began to test the waters by converting pieces of code to see what they would be able to do with the migration.

Due to the size of the codebase and the difficulties involved in migrating from one type system to another, Tyler’s team began looking into methods of automated migration, building, adapting, and customizing different tools to help accomplish this.

On Sunday, March 6, 2022, they migrated Stripe’s largest JavaScript codebase (powering the Stripe Dashboard) from Flow to TypeScript. They converted more than 3.7 million lines of code in a single pull request. In this episode, we'll deep-dive into the details of large-scale migration.

Transcript

[00:00:00] Tyler Krupicka: So my name is Tyler Krupicka. I work on the JavaScript infrastructure team at Stripe, which is a part of the larger developer productivity organization, which is like 10 plus teams dedicated to things like CI, code builds, developer environments, as well as specific workflows for different languages.

[00:00:24] Tyler Krupicka: So basically anything that is about saving developers time at Stripe, that's what developer productivity is working on. And I'm based out of San Diego, California, and I've been on the team for a little under a year and a half at this point.

[00:00:39] Joe Previte: Nice. And how long have you been at Stripe?

[00:00:42] Tyler Krupicka: I've been at Stripe for about that time. I joined in early 2021.

[00:00:47] Joe Previte: Cool. And for those who aren't familiar, what's the history behind having a JS infra team?

[00:00:57] Tyler Krupicka: Yeah. Stripe, if you think [00:01:00] about like our products it might be pretty easy to say, Oh, Stripe is an API specific company.

[00:01:06] Tyler Krupicka: Like a lot of our core products are just APIs that people call. And that's definitely the case with our products. But you still have to build a lot of front end for all of the things that we do. So some examples of that: our main large application is the Stripe Dashboard, which is where, if you're a Stripe customer, you're administering your account.

[00:01:28] Tyler Krupicka: There's also a lot of our Stripe SDKs, which are in JavaScript, and a lot of other hosted pages like Stripe Checkout, things like that. We also have the Stripe marketing pages and documentation, and the documentation site is really impressive if you haven't taken a look at it. There's some really complicated things happening in there.

[00:01:50] Tyler Krupicka: There's like a whole integration builder that's like an interactive editor that's got your tokens and everything in it. So it's much more than just a static doc as well. So [00:02:00] there's a lot of these different surfaces where we're building front end. And so over time there's been a need to focus on making that editing experience better, to figure out what language features we want, making sure builds are fast, making sure we have all of the syntax and everything, shared ESLint rules.

[00:02:20] Tyler Krupicka: And I think one thing that is really helpful as a company grows: it's really easy to get into a state where different pockets of the code base feel like completely isolated sections of code. They don't have shared practices across them. So if you're a developer and you have to work across all of these different folders, you might be in one folder in the morning that has completely different lint rules and formatting and everything than one that you're working on in the afternoon.

[00:02:47] Tyler Krupicka: And that can start to be a little bit jarring. And having a central team that can manage some of those dependencies and make sure the development experience is good is really important. And we have quite a few developers who are writing [00:03:00] JavaScript. It's one of the top languages at Stripe.

[00:03:03] Tyler Krupicka: And so we wanna make sure that it's a really productive experience.

[00:03:07] Joe Previte: Nice. Yeah, that totally makes sense. I guess when you're at a size like what Stripe is at, you wanna make sure that you are keeping your developers super productive. How much money does that save the organization?

[00:03:20] Tyler Krupicka: And like engineers are not going to enjoy their job if it's really frustrating. If they feel like they're spending two thirds of their time just dealing with processes versus writing code, it's gonna be really difficult. So we wanna make sure that on top of everything, as much as we can, we keep engineers happy and make sure they enjoy working on the products they're working on as well.

[00:03:41] Joe Previte: Yeah, exactly. You wanna keep the devs happy, reduce any friction that you can that's not related to writing code and solving problems for the business. So let's talk about, I guess, the main topic for this season, which is obviously migrating to TypeScript.

[00:03:59] Joe Previte: And [00:04:00] I'm sure a lot of people in the audience probably saw an article come out from Stripe about this big migration. So maybe we could start at the beginning, with Flow and JavaScript at Stripe.

[00:04:14] Tyler Krupicka: Let's start that. Yeah, I think Flow dates back to somewhere around 2016 at Stripe.

[00:04:20] Tyler Krupicka: And it really came along with a real increase in React usage as well. And so React has really good Flow types with it. It's built around it. And I think that's probably the same case for other organizations: if you're starting to lean in a lot more to using React throughout your codebases,

[00:04:38] Tyler Krupicka: then you might also have used the type system that came along with it. And I think Stripe has a strong bias towards strongly typed languages. We've recognized the benefit of them, and Stripe also maintains Sorbet, which is the type system for Ruby, and uses that for a lot of the Ruby code.

[00:04:59] Tyler Krupicka: I think there's [00:05:00] generally a good understanding that type safety is helpful and that you can build good editor integrations around it. And so with that we started to have a lot of Flow adoption, and going into 2021, when we were starting some of the migration, we had

[00:05:17] Tyler Krupicka: probably about 5 million lines of Flow-typed JavaScript throughout all of our different applications and everything. So it was really widespread usage. Pretty much every app we had used it and had decent coverage with it. And so over time it had just continued to grow and grow. But we had started to see some cracks over time in

[00:05:39] Tyler Krupicka: the functionality that it was giving us. And so that's where we started investigating TypeScript.

[00:05:46] Joe Previte: Okay. And I guess one thing that I think might be helpful to the audience to highlight or illustrate is so from the article that I read, this migration was for Stripe dashboard.

[00:05:59] Tyler Krupicka: Yeah, that [00:06:00] was our biggest migration that we did. But we are moving all of the Stripe codebases to TypeScript. Okay.

[00:06:06] Joe Previte: Cool. So Stripe has the Stripe Dashboard and all these other codebases. Is this in a monorepo? Are these individual repos?

[00:06:14] Tyler Krupicka: Yeah, primarily Stripe is organized around a monorepo.

[00:06:18] Tyler Krupicka: There are some separations depending on different products, but by and large a lot of our JavaScript code is in a single repo, and those might be organized into different folders or applications throughout that. But everything is inside that one place. Okay, cool.

[00:06:37] Joe Previte: And okay, Stripe has a history of strongly typed languages. I didn't realize that you all maintain Sorbet, which is cool. I guess when it comes to decision making at an organization like Stripe, who ultimately makes that decision of yes, we are going to migrate away from Flow towards TypeScript?

[00:06:56] Tyler Krupicka: Yeah, that came down to basically [00:07:00] decisions on our team. There was a bunch of factors that kind of led into making the decision. One of the big ones was that we send regular surveys to a lot of the engineers at Stripe and ask, how is your development experience?

[00:07:17] Tyler Krupicka: And a lot of people had been coming back and saying that Flow was becoming something that was slowing them down, or they might be more used to the TypeScript syntax, or they had been working in some open source projects and TypeScript had been working really well for them and they were

[00:07:31] Tyler Krupicka: hoping they could use that. And so there was a bit of that feedback that came into it. But then beyond that, there was also a big announcement in 2021 from the Flow team that basically said, we're going to be focusing a little less on some of the community aspects of open source development.

[00:07:51] Tyler Krupicka: And so that kind of signaled to us as well that it might be a good time to start pursuing this further. And then once there was [00:08:00] a push to start looking at it, it came down to taking a look and first determining: did we think it was possible that we could do it? What were the timeframes for it and the effort involved, and

[00:08:12] Tyler Krupicka: did that meet up with our expectations for all of the benefits that we would get if we actually completed it? All of that was weighed and discussed, and at the end of the day it came down to some decisions from our team. But we came back and put together the proposal for doing it, and there's a lot of people involved with it.

[00:08:31] Tyler Krupicka: The manager on the team and some of the other engineers and leaders in the developer productivity org were involved in discussing it. And so it was really an effort between our team and our greater developer productivity org.

[00:08:45] Joe Previte: Can we elaborate on that, the proposal?

[00:08:49] Joe Previte: Like how do as a small team, when you're trying to evaluate how long this is gonna take, if it's feasible, can you walk us through, how you figure those things out?

[00:08:59] Tyler Krupicka: Yeah. [00:09:00] That was definitely difficult. One of the first things we did was to try to look around and see what other companies had done a migration like this before, and that was obviously kind of the main route for it: hey, has anyone else even attempted something like this before?

[00:09:16] Tyler Krupicka: How long did it take them? Can we extrapolate on that? And I think what we discovered when looking around was that there are some companies who had done pretty substantial migrations up until that point, I think Airtable being one of the largest ones. And so that was encouraging to see, and it was encouraging to see that they had some success with that.

[00:09:37] Tyler Krupicka: I don't think we really found anything that was quite at the scale that we knew we were going to have to do. But we knew some of the pieces were in place from that, which would make it a little clearer. And we also knew that it would have to be a longer term effort. But one thing that also came up a lot in discussions

[00:09:58] Tyler Krupicka: is, if this is something that we [00:10:00] think is really important long term for the productivity of engineers at Stripe, and it's always going to take a long time to do, and it's only going to get harder as more JavaScript code is written, then if it takes a while now, is the trade-off of doing it now definitely worth it?

[00:10:18] Tyler Krupicka: Or is there anything that will change in the future that will make it easier, that might cause us to hold off? And I think what we found through the research was, again, that some other companies had attempted smaller migrations and had some success. So we knew there were some things that could help us get started.

[00:10:35] Tyler Krupicka: But also that this was something that was fairly urgent and would take a while, but now it was the right time to do it. Those kind of all informed how we went into planning things.

[00:10:46] Joe Previte: Cool. Okay. So at this point the team has made the decision that we're going to do this, it's worth it.

[00:10:54] Joe Previte: So what happens next?

[00:10:57] Tyler Krupicka: Yeah. So from there [00:11:00] we spent some time kind of planning out what the different phases of it would be and where we wanted to get started. One early thing that we had looked at was actually whether or not making significant changes to our Flow types could make a TypeScript migration easier.

[00:11:17] Tyler Krupicka: There was discussion about, for example, how Flow has a setting called types-first, and another one called well-formed exports. Those are all about removing some of the inference that Flow is doing and explicitly typing function returns and module boundaries and things like that.

[00:11:37] Tyler Krupicka: And so initially we had some thoughts that putting a lot of work into improving some of those types might actually allow us to iterate on our current code while still making the TypeScript migration easier down the road. And so we did a bunch of initial evaluation around that to see what different features we could have.

[00:11:57] Tyler Krupicka: There's also like a strict [00:12:00] object mode in Flow, which behaves a lot more closely to how TypeScript deals with objects. And we took some time to work on one of the larger apps and add types-first mode, well-formed exports, things like that, and see how it improved the code base. Did we improve the coverage a decent amount?

[00:12:21] Tyler Krupicka: If we were going into conversion, would that produce fewer TypeScript errors after conversion? And what we actually found after doing that was that it took a lot of effort to do any of those features. It could take weeks for us to turn on types-first in a single app. It could take weeks for us to get well-formed exports working, even with some automated scripts to help out with it.

[00:12:49] Tyler Krupicka: And so what we learned was, even if it is a bit helpful in getting us closer to TypeScript's syntax, and maybe making the conversion a bit cleaner, the amount of effort that we had to put in [00:13:00] was almost as much as just converting some of the code to TypeScript. And that changed how we were thinking about our longer term roadmap.

[00:13:08] Tyler Krupicka: And within a couple months, we had already started to shift our plan to, okay, how do we start working on the actual TypeScript conversion and prioritize that work? So that was one of the big things that came up. Another thing was just starting to communicate it to engineers and plan around it.

[00:13:25] Tyler Krupicka: One thing that was really difficult was there were a lot of projects inside of Stripe where different teams were starting new projects, and there was a question of, if we're going to TypeScript, can we use TypeScript now, or do we need to go on Flow? What version should we be using? And if we get in this intermediate state for quite a while, are you going to run into issues?

[00:13:46] Tyler Krupicka: And so we had to have a lot of discussions with teams to figure out: is the project you're working on going to have to be interoperable with a lot of Flow projects, and how does this match up with our plans for [00:14:00] when we're gonna convert each project? Is it going to limit you? And in a lot of cases the answer ended up being that teams started on Flow, and we explicitly came into it saying, here's some of the things that you can do

[00:14:13] Tyler Krupicka: to make conversion to TypeScript easier down the road, but if you need type safety right now, and we have nothing converted to TypeScript, this is going to be best for you. But it also meant that we had some projects that were willing to start in TypeScript, and it added some goals for our migration, where we knew that in order to help out X and Y team, we might want to prioritize conversion in a certain part of the code base.

[00:14:39] Tyler Krupicka: So that was a little bit of the planning that happened around then as well.

[00:14:44] Joe Previte: Yeah, no, that makes sense. That's interesting. Can we go deeper into that? From an outsider's perspective, I would assume that if it's a greenfield project, yeah, use TypeScript, because we're gonna land on that eventually.

[00:14:57] Joe Previte: Why not?

[00:14:58] Tyler Krupicka: [00:15:00] Yeah, so one example might be: we have a shared design system, and one of our initial projects to convert was going to be the design system. But if you need it right now and you need type safety with it, it's in Flow right now, so you can stick with that. There's also the case where, say, you're going to be

[00:15:19] Tyler Krupicka: building part of this experience as part of another application that's written in Flow, and it has all of its utility types and everything written in Flow. If it was a completely brand new application that didn't really have any ties to some other parts of the code base, then that might be fine.

[00:15:34] Tyler Krupicka: But a lot of the things tended to be a little bit integrated in some way. So we had to weigh how integrated it was going to be, and how many of the utility packages and everything it was going to need.

[00:15:46] Joe Previte: Okay. That makes more sense. Okay, so we've talked a little bit about some of these other decisions that you had to make coming up.

[00:15:54] Joe Previte: What about actually deciding which migration approach to go with? Can you [00:16:00] walk us through that?

[00:16:01] Tyler Krupicka: Yeah. I'd be interested to get your thoughts on how to approach it, but there was obviously a big discussion about how to scope some of the migrations. So we have these different applications throughout the code base, and they're naturally pretty segmented, so we can do the conversions on their own. But some of them, like the Stripe Dashboard, are like 3.5 million lines of code.

[00:16:26] Tyler Krupicka: Can we try to do it incrementally? Because if we can do it directory by directory in there, that makes a lot of things much easier and probably removes a lot of risk. And so I think the initial discussions were all around how do we make that goal work? And as soon as we got into it, we started to run into a bunch of things that made it seem less appealing.

[00:16:52] Tyler Krupicka: One thing was, how does your editor work if you have half the code base in Flow and half the [00:17:00] code base in TypeScript? There's a few options. We could try to generate types at the boundaries, so we'd have to have a Flow-to-TypeScript generator and a TypeScript-to-Flow generator.

[00:17:13] Tyler Krupicka: And then anywhere there's a module boundary or an export in a file, we'd make sure to have a declaration for it. And then your editor, like VS Code, is gonna have both the Flow extension installed and the TypeScript extension installed. And you could get in a state where you're a developer and you have one file open in Flow and one file open in TypeScript.

[00:17:35] Tyler Krupicka: And if you have to do that for a long period of time, that's gonna be a worse experience for developers. And our thought was, depending on how long the inconsistent state lasts, it could be worse for engineers to be in an inconsistent state than it is to be just stuck on either type system. And those kinds of things weighed in a bit, where we were like, eh, [00:18:00] I'm not sure we can do a migration like this and have a really good developer experience while we're doing it.

[00:18:07] Tyler Krupicka: The other thing that came up too was that in a lot of these big applications, it wasn't easy to just segment off a certain part of the app and say these engineers are only working in this folder, so maybe we can take these bigger folders and do it in chunks of three or something. Unfortunately, in the Dashboard and some of the other applications we were working in, it just wasn't possible to segment that so easily.

[00:18:32] Tyler Krupicka: And in addition to all of that, I don't know what your experience is, but when you're working in a super large code base, there's always some sort of migration going on, just always. And it's really rare to hit a hundred percent completion. You switch part of your testing runner, maybe you're going from Enzyme to React Testing Library, or you're [00:19:00] switching from one internationalization library to another, and it's just impossible to get to a hundred percent unless there's some people

[00:19:09] Tyler Krupicka: whose core responsibility is fixing all of that. It becomes really difficult, and I think we also had seen from a lot of experience that any incremental migration like that was almost guaranteed to never hit a hundred percent without a lot of extra work. And so that also made an incremental approach just a little bit less appealing, where there was this

[00:19:30] Tyler Krupicka: understanding that, huh, these don't usually get to a hundred percent, and in order for a type system to be really good, it really needs to get to a hundred percent. And so when we started weighing all of those options, it became clearer that, as difficult as it was, it was probably going to actually simplify a lot of things to just try to do everything in one shot if we could.

[00:19:55] Tyler Krupicka: I think one other thing that came up was just that in the case where you have a [00:20:00] partial migration, in addition to having the tool to convert Flow to TypeScript, you also need to have a tool that converts TypeScript to Flow. And so we also weighed that as maybe doubling the amount of pre-work required to make that tool really good.

[00:20:14] Tyler Krupicka: But yeah, so that's what was weighing for us.

[00:20:19] Joe Previte: Yeah, I hadn't really thought about what happens if you're a developer and you have one Flow file open and one TypeScript file. Yeah, you really don't wanna sacrifice the developer productivity, because how long is that gonna go on for?

[00:20:29] Joe Previte: Especially with such a large code base. It's interesting because I wanna say this is maybe the fifth or sixth episode we've done so far on the podcast, and you're the only one, as of right now, that we've talked to about doing a Flow to TypeScript migration. Most of the other ones have been

[00:20:50] Joe Previte: just JavaScript to TypeScript. And I think there was one person we were talking to who took an incremental approach, and I think it took a [00:21:00] year and it was a large code base, and it was, okay, any new files that we create are TypeScript first.

[00:21:06] Joe Previte: And then when they had time, like tackling tech debt at the end of a sprint or whatever, they would convert one file or one directory. And yeah, it took a really long time. It was a smaller team, smaller work. But it's interesting. It's, you wave outside.

[00:21:20] Tyler Krupicka: And I think one of the big differences with that is, when you're going from JavaScript to TypeScript, you're going from no type safety to type safety. And so you're already used to no type safety. So if you head back to a file that's missing types, you're like, oh, I wish this had types, but I can add it if I want.

[00:21:36] Tyler Krupicka: Every TypeScript file is a net improvement to your development experience. But when you're going from one type system to another type system, if everything's very inconsistent and you're going from a decently working type system to one that isn't working very well for a year while you're migrating, then it's definitely a bit trickier to do things incrementally.[00:22:00]

[00:22:00] Joe Previte: Yeah, exactly. And I'm just trying to picture myself as a developer working on that, and it sounds a little exhausting to have to juggle these two type systems in your head, especially while you're trying to just solve problems, like fix bugs and implement features.

[00:22:13] Joe Previte: So

[00:22:14] Tyler Krupicka: Yeah, you're having to deal with the multiple syntaxes. Maybe your VS Code is sluggish because you now have both the "Initializing JS/TS language features" and the Flow "restarting Flow server" going at the same time. Yeah, there's some tricky aspects to that.

[00:22:31] Joe Previte: Sounds like a nightmare that you know, you're avoiding for the team.

[00:22:34] Joe Previte: Cool. Let's keep going. So you figured out, okay, incremental adoption is not the approach we want to go for. So what comes next in the process?

[00:22:44] Tyler Krupicka: Yeah, so the first, like really the next thing coming up was we had to start trying to convert some code and seeing what we could do.

[00:22:53] Tyler Krupicka: And so we did a little bit of evaluation of some of the open source tools that [00:23:00] existed for doing Flow-to-TypeScript conversion. I think the two main ones I was looking at were the Khan Academy flow-to-ts tool and then the Airtable codemod, which was fairly new at the time. And we knew one of the first projects we were going to have to tackle was our design system.

[00:23:18] Tyler Krupicka: Basically any project that you have that is a core dependency of every app you have is going to need to support both Flow and TypeScript for a period of time as you're migrating. Or it could go over to TypeScript, but then you can't update it anywhere in other places. So there's a bit of discussion about that.

[00:23:39] Tyler Krupicka: But we knew we needed to start figuring out how to convert and evaluate some of the codemods. And we knew we needed to address the design system, which was this core dependency. And so we just took some of the open source tools and we started running them against design system components, which are much smaller than a full application, [00:24:00] but exercise a lot of the different code paths that you could hit in just a normal React application.

[00:24:06] Tyler Krupicka: And I think through doing that, it became pretty clear that both the Khan Academy tool and the Airtable tool actually worked pretty well, especially for just the most basic Flow type conversions, like when you're converting from mixed to unknown or something like that. There were some areas where maybe one was better than the other.
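
For readers who have not seen a "basic Flow type conversion", here is a minimal sketch of the idea. It is not code from either tool mentioned above: `convertSimpleType` and both lookup tables are hypothetical names, and real codemods rewrite the parsed AST rather than strings. The mappings themselves (`mixed` to `unknown`, `?T` to `T | null | undefined`, `$ReadOnly<T>` to `Readonly<T>`, `$Keys<T>` to `keyof T`) are standard Flow and TypeScript equivalents.

```typescript
// Toy illustration only: real codemods transform the parsed AST, not strings.
// Plain type names that map one-to-one between Flow and TypeScript.
const SIMPLE_TYPES = new Map<string, string>([
  ["mixed", "unknown"],
  ["empty", "never"],
]);

// Flow utility types that have a directly equivalent TypeScript generic.
const UTILITY_TYPES = new Map<string, string>([
  ["$ReadOnly", "Readonly"],
  ["$ReadOnlyArray", "ReadonlyArray"],
]);

// Hypothetical helper: converts one simple Flow type annotation.
// Limitation of this sketch: generics with a single type argument only.
function convertSimpleType(flowType: string): string {
  const trimmed = flowType.trim();

  // Flow's maybe type `?T` accepts T, null, and undefined.
  if (trimmed.startsWith("?")) {
    return `${convertSimpleType(trimmed.slice(1))} | null | undefined`;
  }

  // `$Keys<T>` becomes TypeScript's `keyof T` operator.
  const keys = /^\$Keys<(.+)>$/.exec(trimmed);
  if (keys) {
    return `keyof ${convertSimpleType(keys[1] ?? "")}`;
  }

  // Other generics: rename the utility if needed and convert the argument.
  const generic = /^(\$?\w+)<(.+)>$/.exec(trimmed);
  if (generic) {
    const name = generic[1] ?? "";
    const inner = generic[2] ?? "";
    return `${UTILITY_TYPES.get(name) ?? name}<${convertSimpleType(inner)}>`;
  }

  return SIMPLE_TYPES.get(trimmed) ?? trimmed;
}

console.log(convertSimpleType("mixed"));
console.log(convertSimpleType("?$ReadOnlyArray<mixed>"));
```

Anything the sketch does not recognize is passed through unchanged, which mirrors why these simple cases are the easy part of a conversion: the hard cases are the ones with no direct counterpart.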

[00:24:29] Tyler Krupicka: But the couple things that we saw with the Airtable codemod that made it seem like a good starting point were that they had built in this system where, any place where TypeScript explicitly needed a type, like a function parameter type or something, and Flow didn't have that specified,

[00:24:50] Tyler Krupicka: it would reach out to Flow and get the type that Flow is inferring and then place that in there for you, using a command called flow type-at-pos, [00:25:00] which is built into the Flow command-line tool. And that was a really neat feature that was immediately unique and useful.

[00:25:09] Tyler Krupicka: We had some cases where we had functions that worked totally fine in Flow, but they were using inferred types for the function parameters, and it was able to go ask Flow, what is this? And Flow would a lot of the time come back with a string or a number or something useful that we could place in there.
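
The step described above, taking a type the checker inferred and writing it into the source, can be sketched roughly as follows. This is an illustration under assumptions, not the codemod's implementation: `MissingAnnotation` and `insertAnnotations` are hypothetical names, the inferred types are passed in by hand rather than obtained by calling `flow type-at-pos`, and a real tool would edit the AST rather than splice strings.

```typescript
// Hypothetical record of one parameter that needs an explicit annotation.
interface MissingAnnotation {
  // Zero-based offset just past the parameter name in the source text.
  offset: number;
  // Type text the checker inferred (in the real workflow, reported by Flow).
  inferredType: string;
}

// Splices `: type` into the source at each recorded offset.
function insertAnnotations(source: string, missing: MissingAnnotation[]): string {
  // Work from the end of the file backwards so earlier offsets stay valid.
  const ordered = [...missing].sort((a, b) => b.offset - a.offset);
  let result = source;
  for (const { offset, inferredType } of ordered) {
    result = `${result.slice(0, offset)}: ${inferredType}${result.slice(offset)}`;
  }
  return result;
}

// A function that Flow accepts with inferred parameter types.
const exampleSource = "function total(price, quantity) { return price * quantity; }";

const annotatedSource = insertAnnotations(exampleSource, [
  { offset: exampleSource.indexOf("price") + "price".length, inferredType: "number" },
  { offset: exampleSource.indexOf("quantity") + "quantity".length, inferredType: "number" },
]);

console.log(annotatedSource);
// function total(price: number, quantity: number) { return price * quantity; }
```

Applying the edits back to front is the one design choice worth noting: inserting text shifts every later offset, so processing the largest offset first keeps the remaining positions accurate.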

[00:25:26] Tyler Krupicka: So that was good. And so I think that was the main feature that we saw in there that was different, that we knew would be useful. The one other thing was the code that they had written for actually going and finding files, and then also renaming them to .ts or .tsx, which was pretty useful because some of the other tools didn't have that renaming functionality built in as well.
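
As a rough sketch of that find-and-rename step: a file qualifies if it carries a Flow pragma, and it becomes `.tsx` if it contains JSX, otherwise `.ts`. The names `SourceFile`, `hasFlowPragma`, and `typescriptPath` are hypothetical, and whether a file contains JSX is taken as an input here on the assumption that a real tool would read it off the parsed AST.

```typescript
// Hypothetical description of one candidate file.
interface SourceFile {
  path: string;
  contents: string;
  // Assumed to come from the parser; guessing from text is unreliable.
  containsJsx: boolean;
}

// Matches `// @flow`, `/* @flow strict */`, or ` * @flow` inside a block comment.
const FLOW_PRAGMA = /^\s*(?:\/\/|\/\*+|\*)\s*@flow\b/m;
const JS_EXTENSION = /\.(?:jsx?|mjs)$/;

function hasFlowPragma(contents: string): boolean {
  return FLOW_PRAGMA.test(contents);
}

// Returns the new path, or null when the file should be left alone.
function typescriptPath(file: SourceFile): string | null {
  if (!hasFlowPragma(file.contents) || !JS_EXTENSION.test(file.path)) {
    return null;
  }
  // TypeScript only parses JSX in .tsx files, so the extension matters.
  return file.path.replace(JS_EXTENSION, file.containsJsx ? ".tsx" : ".ts");
}

console.log(
  typescriptPath({
    path: "src/Button.js",
    contents: "// @flow\nexport const Button = () => <button />;",
    containsJsx: true,
  }),
); // src/Button.tsx
```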

[00:25:50] Tyler Krupicka: But we knew that with some of those core basic things working, we could make improvements on top of it. If there were type conversions that didn't work as well, we [00:26:00] could just add them. So that was our starting point. And once we had picked that out, we were able to make a fork of the code base and start doing revisions on it, just running it against design system components and making improvements as we found issues.

[00:26:17] Tyler Krupicka: Yeah, so that was really how we got started with all of that.

[00:26:22] Joe Previte: And then, as you said, that was your starting point, and now you're iterating on it to find edge cases and things like that. What was your process like? Did you write tests or something, and you were coding against the tests?

[00:26:34] Joe Previte: Or was it just manual: make some changes, run the codemod, make some changes? What was your process?

[00:26:39] Tyler Krupicka: Yeah, tests were exactly what we did. One thing we noticed that was pretty interesting was that we hadn't actually seen, with some of the other codemods, a really extensive test suite of conversions.

[00:26:53] Tyler Krupicka: And basically anytime we ran into a new issue, we'd write a test for that type of conversion and then just work [00:27:00] on the conversion until it came out how we wanted. And I think that was hugely beneficial. We very quickly went from having maybe a dozen tests to 150, and I think at the time that we open sourced the codemod, this last month, it was over 400 different conversion tests.

[00:27:20] Tyler Krupicka: And those have really helped, and they've just gotten more useful over time. You're tweaking things with how we're parsing function signatures or something, and it's very easy to break other conversions. And not only are they useful for test driven development and getting to a working solution, but

[00:27:38] Tyler Krupicka: they're useful for making sure we don't break anything in the future on it. So yeah, basically every time we'd find one of those issues, we'd jot it down, write a test for it, and then when we had time, go in and patch in the fixes until it passed. Yeah, I think initially some of the biggest things that we were working with were just a lot of React [00:28:00] specific types.
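
The workflow described here, one small input and expected output pair per conversion, can be sketched as a table-driven harness. Everything below is hypothetical and is not the codemod's actual test suite: `runConversionCases` is an illustrative name, and `toyConvert` is a deliberately incomplete stand-in converter, so the last case fails the way a newly recorded issue would before the fix is written.

```typescript
// One conversion test: Flow input and the TypeScript we expect back.
interface ConversionCase {
  name: string;
  input: string;
  expected: string;
}

type Converter = (flowSource: string) => string;

// Runs every case and returns a readable message for each mismatch.
function runConversionCases(convert: Converter, cases: ConversionCase[]): string[] {
  const failures: string[] = [];
  for (const { name, input, expected } of cases) {
    const actual = convert(input);
    if (actual !== expected) {
      failures.push(`${name}: expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`);
    }
  }
  return failures;
}

// Stand-in converter for the sketch: it only knows about `mixed` annotations.
const toyConvert: Converter = (flowSource) =>
  flowSource.replace(/:\s*mixed\b/g, ": unknown");

const conversionCases: ConversionCase[] = [
  { name: "mixed annotation becomes unknown", input: "let value: mixed;", expected: "let value: unknown;" },
  { name: "identifier named mixed is left alone", input: "const mixed = 1;", expected: "const mixed = 1;" },
  // Newly discovered issue, recorded as a test before the fix exists.
  { name: "maybe type is expanded", input: "let name: ?string;", expected: "let name: string | null | undefined;" },
];

const conversionFailures = runConversionCases(toyConvert, conversionCases);
console.log(conversionFailures);
```

Because every earlier case keeps running, a change made to fix the failing case that accidentally breaks `mixed` handling would show up immediately, which is the regression protection described above.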

[00:28:01] Tyler Krupicka: Between all of the different codemods that exist, we might be able to see that one of them has a conversion for it, or there's some documentation somewhere, but there's a lot of different types that come out of the React types package, and it's really hard to have a conversion for all of them.

[00:28:17] Tyler Krupicka: So early on, I think a lot of the ones we were converting were related to that.

[00:28:22] Joe Previte: Gotcha. Okay. And this might get into the weeds a little bit, but I was peeking around the codemod repo, and I noticed there's a patches directory, with a recast patch. What was that for?

[00:28:36] Tyler Krupicka: The way that the code mod works is it basically has a script that goes and finds all the files in a directory that have the @flow annotation.

[00:28:46] Tyler Krupicka: Then it opens them up and parses them with a tool called recast. And for those that might be unfamiliar, recast allows you to get the whole body of a source code file as the [00:29:00] Babel AST, and then make modifications to the source code and then save that off. And so, as opposed to having to jump around the file and edit strings or something, you're actually able to work with the graph that Babel builds describing the source code file.
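To make the file-discovery step concrete, here is a minimal sketch of detecting the `@flow` annotation. It is an assumption-laden illustration, not the code mod's implementation: `hasFlowPragma` and `selectFlowFiles` are hypothetical names, and the files are passed in as in-memory strings instead of being read from disk and parsed with recast.

```typescript
// Returns true when the file's leading comments contain an @flow pragma.
// Only the leading run of comments is inspected, so an "@flow" that
// appears inside a string literal further down is ignored.
function hasFlowPragma(source: string): boolean {
  const header = /^(?:\s*(?:\/\/[^\n]*|\/\*[\s\S]*?\*\/))*/.exec(source);
  const text = header ? header[0] : "";
  return /@flow\b/.test(text);
}

// Given a map of path -> file contents, return the paths that opt in to Flow.
function selectFlowFiles(files: Record<string, string>): string[] {
  return Object.keys(files)
    .filter((path) => hasFlowPragma(files[path] ?? ""))
    .sort();
}
```

Files selected this way would then be handed to the parse, transform, and print stages Tyler describes.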

[00:29:17] Tyler Krupicka: And one thing that we found that was really difficult with Babel and recast and a bunch of other tools is just comment placement. There's a bunch of different ways you could handle comments in a file. I think recast handles them slightly differently than how Babel handles them, but in a lot of cases, either a comment might be floating and it might be attached to the top level source code itself.

[00:29:43] Tyler Krupicka: Sometimes it might be attached to a line above or below it, and so like that line will have a comment attached to it. And when you're doing like the code mod and you're, you have a block of code that maybe is like a type instantiation, that type instantiation might [00:30:00] have a comment attached to it.

[00:30:01] Tyler Krupicka: We go and make modifications to it and suddenly the comment is on the wrong line now, or we shifted around the lines in the code and now the comment is above or below. And there were a couple of cases where we found places, I think just in recast, where the way that it was recomputing where comments went was a little wrong for what we were trying to do and the modifications we were making.

[00:30:25] Tyler Krupicka: Yeah we had to put in a patch file for that. It's pretty minor if I remember correctly, but it does help make sure that as we're modifying lines with types, that the comments get reattached to the right line.
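The comment problem is easier to see with a toy model. The sketch below is not recast's data model and not the patch Stripe shipped; `Statement`, `replaceStatement`, and `printProgram` are hypothetical names. It only shows the general idea: when a node is replaced, the comments attached to the old node have to be carried over explicitly or they end up lost or on the wrong line.

```typescript
// Toy model: each statement owns the comments printed directly above it.
interface Statement {
  code: string;
  leadingComments: string[];
}

// Replace the statement at `index` with one or more new statements,
// re-attaching the original statement's comments to the first replacement.
function replaceStatement(
  program: Statement[],
  index: number,
  replacements: Statement[],
): Statement[] {
  const original = program[index];
  const first = replacements[0];
  if (original === undefined || first === undefined) {
    return program.slice(); // nothing to replace; return an unchanged copy
  }
  const carried: Statement = {
    code: first.code,
    leadingComments: original.leadingComments.concat(first.leadingComments),
  };
  return program
    .slice(0, index)
    .concat([carried], replacements.slice(1), program.slice(index + 1));
}

// Print comments immediately above the statement they belong to.
function printProgram(program: Statement[]): string {
  const lines: string[] = [];
  for (const statement of program) {
    lines.push(...statement.leadingComments, statement.code);
  }
  return lines.join("\n");
}
```

Replacing a Flow type alias with its TypeScript form through `replaceStatement` keeps a comment such as `// Nullable id` directly above the converted line.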

[00:30:37] Joe Previte: Gotcha. Cool. Yeah, no, that's super interesting. I wasn't familiar with recast and I'm sure there's other listeners who aren't. So thanks for explaining that. So you've made the code mod, you've got, let's say, 150 tests or more. How do you know when you're ready to use it?

[00:30:53] Tyler Krupicka: Yeah, so once we had completed getting our design system in a spot where it had both [00:31:00] Flow types and TypeScript types,

[00:31:02] Tyler Krupicka: we felt pretty comfortable about the code mod for doing basic conversions, but we knew that as soon as we went to convert an actual application, like a React single page app, that would have different issues. Suddenly you have React Router and Redux, and a bunch of other packages that have more complicated syntax involved with them.

[00:31:24] Tyler Krupicka: It's a different way of writing code versus a component, and we knew that we needed to do some tests on things that were actual applications, and so we basically went through apps that Stripe has internally, especially ones inside of my developer productivity organization that

[00:31:42] Tyler Krupicka: we could reach out to. For example, there's a team that helps with the developer environments and they have a really small single page app for helping you navigate your developer environment, and we could use that to convert to TypeScript. It's pretty small, maybe less than 50 [00:32:00] files total. And then incrementally take on more challenging code bases.

[00:32:05] Tyler Krupicka: And after we had finished the design system, we basically did that. We went to a code base that was maybe 50 files and went and converted that, and then went to another code base that was maybe a few thousand files and went and converted that. And what we started to notice as we were doing that is

[00:32:25] Tyler Krupicka: there was a process for actually doing the conversion. So you need to have Babel support TypeScript. You need to have all of these other tools, Webpack, Jest, ESLint, everything, support it. So we started building documentation for doing that. But we also started seeing there was a bit of a process to actually running the code mod and getting the code base into a state where you could get a green check mark on CI and merge it.

[00:32:52] Tyler Krupicka: And I think the biggest thing was just the TypeScript errors. After you convert, TypeScript's not just going to pass immediately. No [00:33:00] matter what we do, there's always going to be some errors. And as you just get larger and larger projects, it gets to a point where we can't have even a team of four or five people go and try to fix up errors and

[00:33:12] Tyler Krupicka: add suppression comments and things like that. So we needed to start to come up with more automated tools to scale up that process. And so yeah, we started building a whole workflow for taking an app from Flow to TypeScript. How do you upgrade all the tools? What scripts do we need in place in order to actually run the conversion?

[00:33:33] Tyler Krupicka: What scripts do we need in place to go and clean up things afterwards so that we can get to a passing build? Things like that.

[00:33:41] Joe Previte: That's super fascinating. Okay, so we've got the main tool, right? The code mod. What were some ways that you automated some of these other things to scale it up?

[00:33:50] Tyler Krupicka: Yeah. One of our teammates went and started experimenting a bit with a tool called ts-morph, which I don't know if you're familiar with, [00:34:00] but it's basically a tool for code modding, but it hooks into the TypeScript compiler and uses TypeScript's internal functions. And so what we realized pretty quickly was we needed a way to automatically

[00:34:15] Tyler Krupicka: @ts-expect-error all of the new errors that were coming up, because we can't manually take the time to do all of that. There's also a lot of things that we could do after migration. Once we have type information we can start to use that to fix some of the errors as well. So, like, TypeScript has an auto import feature.

[00:34:36] Tyler Krupicka: If we see a type error that's "this type doesn't have an import," can we ask TypeScript to go find the import for us? Things like that, that can help clear up a lot of the errors that we were seeing. And our team worked on developing some different scripts built around that. And we were able to put that in place.

[00:34:57] Tyler Krupicka: So there's what we call the second [00:35:00] pass code mod. Also in our code mod is the fix command. And that's basically a script that's running with TypeScript's full information after conversion and adding suppressions where they're needed, fixing up imports where needed, things like that.
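The suppression half of that second pass can be sketched without the compiler. The example below is an illustration under assumptions, not the fix command itself: `Diagnostic` and `suppressErrors` are hypothetical names, and the diagnostics are supplied as plain data where a real tool would get them from the TypeScript compiler (for example through ts-morph). It inserts a `// @ts-expect-error` comment above each failing line.

```typescript
// A reported type error: 1-based line number plus the TypeScript error code.
interface Diagnostic {
  line: number;
  code: number;
}

// Insert one `// @ts-expect-error` comment above every line that has errors.
function suppressErrors(source: string, diagnostics: Diagnostic[]): string {
  const lines = source.split("\n");

  // Group codes per line so each line gets a single suppression comment.
  const byLine = new Map<number, number[]>();
  for (const d of diagnostics) {
    const codes = byLine.get(d.line) ?? [];
    if (codes.indexOf(d.code) === -1) codes.push(d.code);
    byLine.set(d.line, codes);
  }

  // Work from the bottom up so earlier line numbers stay valid after inserts.
  const targets = Array.from(byLine.keys()).sort((a, b) => b - a);
  for (const line of targets) {
    const text = lines[line - 1];
    if (text === undefined) continue; // diagnostic points outside the file
    const indent = /^\s*/.exec(text)?.[0] ?? "";
    const codes = (byLine.get(line) ?? []).map((c) => `TS${c}`).join(", ");
    lines.splice(line - 1, 0, `${indent}// @ts-expect-error - ${codes}`);
  }
  return lines.join("\n");
}
```

One known limitation of this sketch: errors inside JSX need a `{/* @ts-expect-error */}` style comment, which it does not handle.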

[00:35:17] Joe Previte: Okay, cool. Okay, so you do the first run.

[00:35:19] Joe Previte: You have some learnings, you add some more scripts. So now let's say you go to a new project within Stripe. Say it's a hundred files. What does the migration workflow look like?

[00:35:30] Tyler Krupicka: Yeah, at that point things are actually fairly streamlined. The workflow basically looks like: step one,

[00:35:37] Tyler Krupicka: we go and we install all of the dependency types we need. Go grab the @types/lodash, things like that, so that they all resolve. And then step two is going into each of the tools and upgrading them to use TypeScript. So, just support TypeScript. Can we run the test suite with a TypeScript file?

[00:35:56] Tyler Krupicka: Does Babel support it? Things like that. When upgrading [00:36:00] those tools, we would usually try to keep it so that it supported both Flow and TypeScript, just because it meant we could merge that all in and have it all in place before we actually run the conversion script. And then from there we're running the code mod, and on like a hundred file code base, it's probably taking 30 seconds or something to convert it.

[00:36:24] Tyler Krupicka: It's pretty quick. And from there we are running the remaining scripts to go and suppress the errors and pushing it up to CI, seeing what errors come up. What we found around that stage was when you're doing these changes, it might seem like a really high risk migration because you're modifying so many lines of code.

[00:36:48] Tyler Krupicka: But if the code mod's working properly and it's only changing type annotations, the actual behavior of what the code does at runtime should be more or less identical. And the issues where that [00:37:00] starts to become risky are where we're modifying imports a bunch, where something might import differently, or when we're changing the file extension. And that started to be the main thing we'd noticed, which is some tool might reference the .js extension explicitly.

[00:37:17] Tyler Krupicka: Some import might use .js explicitly. And so changing the extension was going to start causing more of those errors. But from there we just clean up whatever of those errors came up and get to a green CI build, do whatever testing we can on it, and then merge it in in one PR.
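One way to deal with explicit `.js` references in imports is to drop the extension from relative specifiers so that module resolution finds the renamed file. The sketch below shows that idea only; it is an assumption about a possible approach, not what Stripe's scripts do, and `stripJsExtension` and `rewriteImports` are hypothetical names. It is regex-based for brevity, where a real tool would more likely rewrite the AST.

```typescript
// Drop an explicit .js/.jsx extension, but only for relative specifiers.
// Package names such as "chart.js" must be left alone.
function stripJsExtension(specifier: string): string {
  const isRelative = specifier.startsWith("./") || specifier.startsWith("../");
  return isRelative ? specifier.replace(/\.jsx?$/, "") : specifier;
}

// Rewrite specifiers in `from "..."`, `require("...")`, and side-effect
// `import "..."` forms.
function rewriteImports(source: string): string {
  return source.replace(
    /(\bfrom\s+|\brequire\(\s*|\bimport\s+)(["'])([^"']+)\2/g,
    (_match: string, prefix: string, quote: string, specifier: string) =>
      `${prefix}${quote}${stripJsExtension(specifier)}${quote}`,
  );
}
```

Tool configuration that names `.js` paths explicitly (test globs, bundler entries, and so on) is a separate cleanup that this sketch does not cover.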

[00:37:37] Joe Previte: Wow. Okay. So you're doing all these things, right? You're continuing to learn and improve the process. How do you know when you're ready for the big kahuna?

[00:37:46] Tyler Krupicka: Yeah. At a certain point, it became clear there was never going to be a point where

[00:37:53] Tyler Krupicka: we felt a hundred percent ready for migrating something that big. I think the jump that we did was from migrating a [00:38:00] 50,000 line application to the 3.5 million line application. So it's like a 75 times increase in complexity. But at that point we had already converted some apps. We had a good team of engineers at this point.

[00:38:16] Tyler Krupicka: There was like four or five of us working on it. To actually get all these skills that we need to understand, get all the scripts in place to do it. And so once we realized we could do the 50,000 line app, it became clear that we should start taking a look at the biggest application.

[00:38:33] Tyler Krupicka: And we actually didn't do the biggest application last. We knew it is more complicated, but there's a couple things that come up with that. One is because it's the biggest application, the most engineers are working on it, and so if our goal is to convert to TypeScript, the guiding star of that is we're getting TypeScript

[00:38:55] Tyler Krupicka: to engineers. And so if we're converting all of the applications that [00:39:00] are less frequently developed on, then yeah, we have more lines of code in TypeScript, but people are still gonna be using Flow. And we knew we needed to get on top of that and make sure that we were prioritizing where the people were working.

[00:39:15] Tyler Krupicka: But the other thing we knew too was once we took on the largest code base, that would just shake everything out of the woodwork. Basically everything that could go wrong in conversion, everything that could go wrong with tools, would probably go wrong with the largest application. And so once we had that figured out, it would also make every migration after

[00:39:38] Tyler Krupicka: much more straightforward. We could be a lot more confident in the quality of conversion. And once we had decided on that, which was around January or like late December of this year, then we went into a three month or two and a half month work stream to actually get prepared and migrate that large [00:40:00] application.

[00:40:00] Tyler Krupicka: And so it followed a lot of the steps that I outlined before. Like one of the first things we did was go and find all the types we needed for dependencies. And we actually ended up writing a script that will go and find those for you because there's quite a few of them. And that's included in the code mod.
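A dependency-types lookup like the one mentioned here can be approximated from `package.json` alone. The sketch below is hypothetical and much simpler than whatever the code mod ships: `typesPackageName` and `candidateTypePackages` are made-up names. It relies only on the DefinitelyTyped naming convention, where `lodash` maps to `@types/lodash` and a scoped package such as `@babel/core` maps to `@types/babel__core`.

```typescript
// Map a package name to its DefinitelyTyped package name, or null when the
// package is itself already a types package.
function typesPackageName(pkg: string): string | null {
  if (pkg.startsWith("@types/")) return null;
  if (pkg.startsWith("@")) {
    // Scoped packages are flattened: @scope/name -> @types/scope__name
    return `@types/${pkg.slice(1).replace("/", "__")}`;
  }
  return `@types/${pkg}`;
}

// List the @types packages that might need installing for a dependency map
// (the "dependencies" object of a package.json).
function candidateTypePackages(dependencies: Record<string, string>): string[] {
  const installed = new Set(Object.keys(dependencies));
  const candidates: string[] = [];
  for (const name of Object.keys(dependencies)) {
    const candidate = typesPackageName(name);
    if (candidate !== null && !installed.has(candidate)) {
      candidates.push(candidate);
    }
  }
  return candidates.sort();
}
```

These are only candidates: many packages bundle their own type declarations and some have no `@types` package at all, so a real script would still need to check each name against the registry.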

[00:40:17] Tyler Krupicka: But like doing some of that pre-work. One other thing that came up was in a large code base there is usually code generation happening. So say you're using like a REST API client or GraphQL or something, it might be generating types for you that you can use. And so we were using Flow types for a lot of those, so we needed to go in and make every generator support TypeScript as well.

[00:40:41] Tyler Krupicka: And we could have it just basically output both. And then from there we actually had to get to a point where we could run the conversion on this large code base. It took quite a bit longer than we were used to. And I mentioned earlier that the code mod we were using could go out and reach out to Flow [00:41:00] and get inferred types out of it.

[00:41:02] Tyler Krupicka: That is really helpful, but it's also really slow in the grand scheme of what the code mod can do. So most of the operations the code mod is doing are just parsing a file, finding different function signatures or type declarations, and then going within them and converting things, which is really quick.

[00:41:23] Tyler Krupicka: And then as soon as you start adding a delay for type system requests constantly, it can get a lot slower. And so we were seeing with the large code base, we're used to running the conversion in a couple minutes on all these smaller apps and now suddenly it's 45 minutes or an hour if your laptop has other stuff going on.

[00:41:43] Tyler Krupicka: And so we had to balance our work a little bit better as well to know that, oh, suddenly conversion is a bit more of an expensive operation. Maybe we can convert part of it. But once we got to a point where we could run the [00:42:00] conversion across all of the code base, we added a feature to the code mod to give us an Excel spreadsheet or like a CSV file of all of the TypeScript errors.

[00:42:12] Tyler Krupicka: And so we could take that, sort them by error code, and then start to just parse through and say, is the conversion going cleanly, or are some of these things that we can fix? And I think the first time we ran that it was almost a hundred thousand TypeScript errors, and TypeScript was pretty slow.

[00:42:32] Tyler Krupicka: It gets really unhappy when you have that many errors in the code base. And we had to do a work stream to start organizing and prioritizing those issues and burning that down any way that we could. A lot of them just ended up being errors where it was like, maybe we missed a package

[00:42:52] Tyler Krupicka: that had types, or maybe we're using the wrong version of its types, and now everywhere it's imported is an error, or we have some core [00:43:00] utility whose types are a little wrong. And so we worked through fixing some of the errors and then went into a third phase of the project, which was all about making the conversion a repeatable thing that we can do

[00:43:12] Tyler Krupicka: quickly, and making sure that we were testing and could ensure the high quality of the conversion, and that this is something that we could release to production and not have a ton of things break. And yeah, that was really the last few weeks of the development cycle.

[00:43:31] Tyler Krupicka: We put together a script that would try to do the conversion end to end, get it as close to a passing build as possible in an automated fashion. And it would take a while, but it's definitely much faster than doing everything by hand.
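The error spreadsheet described earlier in this answer is straightforward to sketch. The code below is an illustration, not the code mod's reporting feature: `TsError`, `countByCode`, and `toCsv` are hypothetical names, and the errors are passed in as plain objects where a real tool would collect them from the compiler. It produces a CSV sorted by error code plus a per-code tally, which is the view that makes it easy to spot one missing types package causing thousands of identical errors.

```typescript
interface TsError {
  file: string;
  line: number;
  code: number; // for example 2307 (cannot find module) or 2322 (not assignable)
  message: string;
}

// Tally errors per code, most frequent first (ties broken by code).
function countByCode(errors: TsError[]): Array<[number, number]> {
  const counts = new Map<number, number>();
  for (const e of errors) {
    counts.set(e.code, (counts.get(e.code) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1] || a[0] - b[0],
  );
}

// Quote a CSV field only when it contains a quote, comma, or line break.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Order rows by error code, then file, then line.
function compareErrors(a: TsError, b: TsError): number {
  if (a.code !== b.code) return a.code - b.code;
  if (a.file !== b.file) return a.file < b.file ? -1 : 1;
  return a.line - b.line;
}

function toCsv(errors: TsError[]): string {
  const rows = errors
    .slice()
    .sort(compareErrors)
    .map((e) =>
      [csvField(e.code), csvField(e.file), csvField(e.line), csvField(e.message)].join(","),
    );
  return ["code,file,line,message"].concat(rows).join("\n");
}
```

Sorting by frequency first is a design choice for triage: the codes at the top of `countByCode` are usually the ones where a single root-cause fix removes the most errors.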

[00:43:48] Joe Previte: Wow. And okay, so you get to this point, it's ready and you do it, and then you merge the PR, and I think if I saw right, it was on like a Sunday. Is that right?

[00:43:58] Tyler Krupicka: Yeah, [00:44:00] so for context, there's hundreds of developers working on the Stripe Dashboard in any given week, and we have engineers in pretty much every time zone at this point. So really the only way to do a merge for that particular code base was to merge outside of hours, and outside of hours is not even all day Sunday.

[00:44:24] Tyler Krupicka: It's before the evening on Sunday, because there are some engineers starting work in Singapore or somewhere else that's in a time zone that's ahead of us. The plan that we came up with was on Saturday, go ahead and try to run through a lot of the conversion steps so that we can get a gauge of whether or not we're going to make it, where we're at.

[00:44:48] Tyler Krupicka: Get that done, and then on Sunday, if there happened to be any changes, just rebase the small changes that happened. And then focus on making sure the deployment has a [00:45:00] lot of time to go out, making sure we have time for testing and making sure everything is safe. So that was the structure that we followed going into it, but it was definitely a bit nerve-racking on that Sunday hitting merge on a PR where you see the diff and it's 3.5 million added, 3.7 removed or something like that.

[00:45:21] Tyler Krupicka: And just hoping that a lot of the systems handle it nicely.

[00:45:26] Joe Previte: Yeah, I can imagine. I would feel a little nervous about that too. Okay. And I know we're coming close on time, so I guess maybe we can end around here, but so you merge the PR. And what's the reaction internally?

[00:45:41] Joe Previte: What did people think and what happens after?

[00:45:43] Tyler Krupicka: Yeah, thankfully the reaction was pretty overwhelmingly positive. I think people came in on Monday morning and were just shocked to have their code base be so drastically modified. There was definitely a little bit of bumpiness that [00:46:00] day, just making sure people's editors got updated and restarted and whatever, so that they were running TypeScript.

[00:46:07] Tyler Krupicka: But the feedback we got from everyone was basically a lot of excitement about TypeScript, a lot of excitement that it was a hundred percent done and over with in such a short amount of time. And also just a lot of enthusiasm as well around getting started to clean up everything and just continue to improve the types.

[00:46:29] Tyler Krupicka: One thing that we had focused on towards the end of the migration was understanding what the impact was going to be on type safety. And so one of the metrics we had been looking at was type coverage, which you can get out of both Flow and TypeScript to see which lines are just missing types in general.

[00:46:50] Tyler Krupicka: And we were happy to find that the coverage was pretty similar before and after conversion. But then engineers could step in, go [00:47:00] to their section of the code base, and if they were really excited about TypeScript, they're already in there, improving the types and taking advantage of the different type system.

[00:47:07] Tyler Krupicka: So it was really encouraging to see everyone just jump into it and and be really enthusiastic about it.

[00:47:14] Joe Previte: That's awesome. Yeah. That, to have such an overwhelmingly positive response and have people like jumping in like that first week and Hey, like I know how to fix this, and all that, so that is awesome. Cool. I want to be respectful of your time. I know we're close to an hour. Is there anything else that you wanted to touch on?

[00:47:33] Tyler Krupicka: I think the one main thing that I'd just like to call out was, this was a huge team effort with a lot of different teams involved.

[00:47:41] Tyler Krupicka: Within my team, pretty much everyone had a part in helping out either with planning or with just specific aspects of the migration. But even outside of our team, there were people in developer productivity who were giving us some guidance on how to go and make a change this big. There was a bunch [00:48:00] of other teams who were just enthusiastic about TypeScript, who were willing to test things out, and there were teams who, coming into the day before the migration, were willing to pull up our branch and go and test their product in it and give a lot of feedback too.

[00:48:16] Tyler Krupicka: And so it was really with the coordination and help from everyone there that we were able to pull this off. But at the same time, it was a pretty huge change to happen in a very short amount of time. And you'd expect a lot more people to be involved than were.

[00:48:34] Joe Previte: Yeah. No, I mean, especially having such a small JS Infra team but then being able to leverage all of the other team members like you mentioned in developer productivity and I'm sure other teams, yeah, it's incredible. It's like 3.5 million lines. How many other engineering teams have done that? Or at least talked about it publicly?

[00:48:52] Tyler Krupicka: Yeah. No, it was really exciting and it was great. Stripe has a lot of Ruby development as well. There's a Ruby Infra team [00:49:00] and they've also done some migrations with big code bases. That's always really handy to have people like that around who can give some insights and who've been there before.

[00:49:10] Joe Previte: Yeah. Oh, definitely. Being able to leverage the prior art, prior experiences, even if it's in a different language. There's so much you can learn there.

[00:49:17] Tyler Krupicka: And I think too, it was good that there's the Khan Academy code mod and the Airtable code mod and just all the open source work that had already been done.

[00:49:26] Tyler Krupicka: I was really happy that we were able to take all of the code mod changes that we made and package that and open source it. It's up on the Stripe Archive GitHub repo if people wanna take a look at it. But that was all possible because of a lot of the work that had happened in open source already.

[00:49:43] Tyler Krupicka: And hopefully people can use our project to also convert their code base.

[00:49:49] Joe Previte: Yeah, totally. We'll make sure to add a link in the show notes for that. Yeah, it's amazing that, I don't remember who was the first one to do it, but somebody kind of starts this path of open sourcing their own code mod,

[00:49:59] Joe Previte: and [00:50:00] then someone forks that and it evolves. So I'm sure within two years, maybe within five years, we'll see a bunch of other companies fork yours, modify it, improve it, until eventually everybody will be on TypeScript.

[00:50:12] Tyler Krupicka: So yeah. One day we'll see.

[00:50:15] Tyler Krupicka: Yeah,

[00:50:15] Joe Previte: Exactly. Exactly. Cool. Tyler, thank you so much for coming on the show today and just sharing all the insights about the migration. Where can people go to find you online?

[00:50:26] Tyler Krupicka: Yeah. So I am on GitHub, on LinkedIn, and on Twitter. It's usually just under my name, Tyler Krupicka.

[00:50:34] Tyler Krupicka: Happy to answer questions there. You can also find a lot of my teammates linked in the GitHub repository as well. They also have a lot of insights, so if people have more questions, feel free to reach out, and yeah, we're happy to talk.

[00:50:47] Joe Previte: Cool. Awesome. We will link those in the show notes as well. Thank you so much again, Tyler.

[00:50:53] Tyler Krupicka: Great. Yeah. Thank you so much for having me. It was great to talk about it.