We kick off 2017 talking to Uber Engineer Gautam. In the first part of this two-part series, Gautam talks to us about the Uber Android app, the complexity of its architecture, the scaling challenges, and the pain points Android developers faced working on such a massive app.
He then goes on to explain how his team (Android Developer Experience) at Uber has approached these challenges and come up with elegant solutions.
We then dive deep into Buck, the build system for Android development: its advantages and the benefits that the folks at Uber have observed after migrating.
- Buck overview (Facebook/Uber)
- Bazel (Google)
- Pants (Twitter)
- Android test app comparing different build systems [github.com]
- @kageiit [twitter.com]
- @fragmentedcast [twitter.com]
- @donnfelker and +DonnFelker
- @kaushikgopal and +KaushikGopalIsMe
Donn Felker: How are you doing, Kaushik?
Kaushik Gopal: I am doing good. How has your day been?
DF: My day has been fantastic. Just another day in paradise, as they say.
KG: That sounds pretty good! My day has been interesting. I had a pleasant Uber ride, but it took me a quick minute to get here.
Speaking about Uber, I hear we have an excellent engineer from Uber with us today. Can you tell us more?
DF: I’m just going to hop right into it. Without further ado, I’d like to welcome Gautam to the show. Welcome to the Fragmented podcast!
Gautam Korlam: Thanks for having me, guys.
DF: Gautam, I think it would be good for our listeners to become familiar with you, and I think the best way you can do that is by giving us a little background information on yourself: how you got into Android development, where you work at now, and what you’ve been up to.
GK: I’ve been an Android developer for about three years now. On my first job out of college, I joined a company called Lookout Mobile Security. I’m sure you know some people who work there. There was a guy there named Israel Ferrer (he works at Twitter now).
DF: Hey, he was on our show before!
GK: Yeah, I used to work with him. Anyway, when I joined the company, I didn’t know anything about Android. I did a bunch of stuff with Ruby on Rails back in college, and I wasn’t really sure of what mobile did, but I was given the job. I started with Android UI. I think one of my first projects was designing a live wallpaper (those actually used to exist!).
Eventually, while working at that company, I began to develop an interest in Android tooling, especially in continuous integration (CI) and the build system. That was at a time when we didn’t have Gradle or any of the other tools that we have today. It was mostly just Eclipse.
DF: Oh man, that word brings back nightmares for me.
GK: Back in the day, we weren’t even on Maven.
KG: So you had to deal with all that good XML stuff, huh?
GK: Exactly. Those were interesting times.
But at that point in time, Gradle came out. I was pretty interested in the build system and all the new stuff that was coming out. I think they announced it during Google I/O. So I began diving into that stuff, and I migrated the company’s build system over to Gradle. So I grew into more of a tooling background.
When I was starting to look for another job, I landed at Uber. The first thing I said during my interview was that I wanted to make tools for developers—and today, I’m on the Android developer experience team at Uber. We focus a lot on things—the build system, the IDE, and any general productivity tools that developers use in day-to-day jobs. We try to help them do their job better. That’s the gist of my Android experience.
DF: So you help the Android team at Uber do their jobs more effectively? That’s cool. I had no idea there was even a position like that at Uber.
GK: It was a very organic thing. I was on the platform team before, which used to coordinate many of the core libraries, like networking, experimentation, and analytics. Since my main passion was in tooling, I carved this team out over time, so I have a couple of people on it right now.
I think this sort of team becomes very important as you scale a company. Especially with a lot of developers, you get a lot of unique challenges and problems. I think we currently have a couple hundred Android engineers at Uber.
KG: Oh wow! That explains a lot of stuff. That seems like paradise!
GK: Yeah, there are a lot of smart people—and a lot of people committing core changes. People tend to step on each other’s toes, so I want to make sure that they’re doing their job effectively, and (at the same time) that they’re able to commit fast and move quickly. That’s where we come in.
KG: Suddenly, the whole Android developer experience team is like, “Oh my God, we obviously need someone who’ll help with that.”
GK: It wouldn’t be much of an Android developer experience team if we had just two engineers.
DF: Holy cow, that’s amazing.
KG: One of the topics that both Donn and I have been really interested in (we’ve always wanted to talk about this) is the Buck system. I know that Uber now uses Buck, so it looks like you’ve transitioned from Gradle to that. Is that true? Did you ever use Gradle, or did you always use Buck?
GK: To give you a bit of a background about why we did this, when I joined the company, there were a handful of engineers. We had a very small codebase: the rider app, the driver app, and one common library, (which was conveniently called “Android Library”) that contained the shared code between these two apps. There was no versioning for that artifact, so people would just consume it and build off of it, basically. Those were dark times!
When I joined, I wanted to transition us onto a system where we could add more libraries and start taking things apart. We began looking at things like Artifactory, which is a Maven server for artifacts. We already had a very basic setup of Gradle, so we pushed all of our libraries up to the Artifactory server and all the apps consumed them. Organically, over time, we’ve grown to a situation where we have somewhere close to a hundred libraries with a lot of different functions. So things have gotten interesting.
DF: I have a question about that. I have a friend who worked, very early on, at Uber. He told me that everything is based on micro-services there. Does the same thing happen in the Android ecosystem too? In other words, does everyone create these micro-libraries for various different components or what?
GK: When we initially started out, we had a macro situation, where we’d try to dump all of our common utility code (along with anything that could be a secondary) into this library. But we later realized that if you start doing that, there will be a lot of bad coupling, and that can be a very big problem. So we started splitting out into micro-libraries, but not to the extent that there are only two or three classes in a library. That would just get ridiculous to maintain over time.
If you think about it at a high level, you’d probably have a bunch of different components in your app: stuff for networking, threading, UI, analytics, experimentation, and mapping and navigation, for instance. So we started splitting out our libraries based on the core feature specs of the app.
KG: How many submodules does a typical app the size of Uber have?
GK: It’s totally dependent on the particular app. Some apps, when they were being developed, didn’t have any modules in their repository. They had other modules outside the repository that were all built individually and uploaded to this artifact server, and would later consume it in their different ways.
DF: You guys consume a ton of modules and libraries, so (at some point) you have to have run into some kind of scaling issues with Gradle. Actually, I haven’t ever had that many submodules (and most other developers probably haven’t, either), so I guess the big question is: did you run into any scaling issues with Gradle when you reached that level?
KG: I mean, I had a project that I worked on in the early days of Gradle that had three submodules, and it was already a nightmare, so I’m really curious to find out how that’s worked out. I imagine there are more than three in Uber!
GK: Yeah, definitely more than three. To answer the question, when you initially start a new project in Android Studio, the wizard tells you, “Hey, you can make an app module”, right? I’m sure that if you started a hobby project or something on the side, you’d just use one module. But the interesting thing about that module is, as you start adding more code to it, every single change you make needs to recompile the entire module. As time goes on, you get to a state where small changes take so much time that you can go for a coffee break or do a lap around the office. We actually did that at Uber.
At that point, a lot of the engineers said, “Hey, this is too big. Let’s break it apart.” They started breaking things apart, but then they started seeing something interesting. Their Gradle sync times started getting really slow. Whenever they’d make a change, they’d think, “Oh my God, Gradle is telling me that the project is out of sync. I need to re-sync this,” but then it took a while. That can be a very grating proposition for engineers when they’re working, and they just want to keep working on their codebase.
KG: Like, “Oh my God, do not change your Gradle file—don’t even add a space!—because then it’s going to ask you to sync, and that’s going to take another five minutes.”
GK: Exactly. That’s the situation where we started putting stuff outside, because those were pre-built artifacts, which meant that we didn’t need to rerun their tests or recompile them.
At scale, that came with other problems, though. If you have hundreds of libraries, it’s very hard to make big refactors in a safe, automatic manner. Whenever you want to make a change in the networking library, there could be hundreds of other libraries that depend on it. That could be very problematic, and you can break an API. And, since everything is precompiled, you won’t even see the problem until it’s a runtime issue.
KG: Yikes. That doesn’t sound too pleasant, especially in terms of discovering the problem.
GK: For a while, I think we were still kind of in a system where we had a lot of different instances. So, to alleviate that, we had a system internally where, whenever some dependency changed, we recompiled and retested all the code for that particular version, to make sure that everything was compatible.
But that’s not really feasible either, when you have a lot of different instances. At some point, our Gradle build times started getting ridiculous: around 20-40 minutes for a single line change.
DF: No more going to coffee! Go to get lunch.
GK: If you guys think that’s bad, look at Swift build times. I don’t know if you have any iOS friends…
KG: 20 minutes is dysfunctional. Wow! How does anyone get any work done?
GK: I think we started seeing this issue back in May.
I don’t know if you guys have used the new Uber app. Have you tried it out yet?
KG: You mean the new redesign that came out? Yeah, it’s pretty slick.
DF: I used it yesterday.
GK: That app actually started out around that time. It was a brand new architecture, where a lot of the principles relied on the code being broken apart into a lot of modules. This was very worrying for me, as an Android developer, because I was thinking, “The platform team wants to bring out this cool new architecture, and they have a lot of modules coming up, but then the build times are going to suck.” I didn’t want to stop them from progressing, but…
DF: So you get to this level, break everything apart, and have all of these modules. Then you have this humongous build time. What did you do at that point to speed everything up?
GK: We didn’t even think about speeding it up at first. We just waited to see if we could make it work. We talked with the Android tools team quite a bit and tried for the longest time to see if it could scale well. But then we realized (at some point) that it didn’t scale, so we started looking at other companies that had scaled, like Google, Facebook, and Twitter. They have a gigantic amount of repos, with a lot of code, but they’re still functional. That got us interested in what they’re using to build their codebases.
At that point, we evaluated some of the automated build systems out there, and there were a couple of different contenders: Buck, from Facebook; Bazel, from Google (which they open-sourced); and Pants, from Twitter.
KG: Just so I understand, and so it’s clear to our listeners, these are all alternatives to Gradle? They’re all completely one-to-one build systems?
GK: They are all independent build systems, so you can build a lot of different Java or mobile codebases. But they’re all built by different companies.
KG: Does Bazel, the one that Google open-sourced, have anything to do with Gradle? I know that Gradle is independent (i.e. it’s not part of Google), but that’s a question that could come up with the listeners: “There’s Gradle and there’s Bazel. Does Google internally use Gradle? Do they use Bazel? Do we know?”
GK: Gradle (the build system) and the Android Gradle plugin (a plugin that lets you build Android projects with Gradle) are two separate things. If you treat them separately like that, it starts making sense. Gradle is something that was open-sourced by a company called Gradleware. At that point in time (I believe), it was very easy for small projects to get started with it very quickly. So the Android tools team, at that point in time, decided to create a way for external developers to start using the system, because it was really low-friction.
At that stage, Bazel was not open-sourced yet. Google had something (which they still use internally) called Blaze. In fact, a lot of the systems I just talked about are all designed by engineers who left that team at Google and moved on to different companies. They built a very similar system, based on the ways that Google internally builds most of their codebase.
We wanted to focus more on something that would work really well for mobile, and Android specifically. At that point in time, Buck was the only one which had a lot of the features that we already get in the Android Gradle plugin, so that was something we began to take a closer look at. It had really good features. It promised so many different things, like really fast incremental builds, a lot of customizability, and being built to scale for the codebases the size of Facebook’s.
KG: You’re talking about “fast”. Do you have an approximate idea of how fast? Is it two seconds faster? A minute faster? 2x faster? Do we have any idea?
DF: Take Uber. How long is your build now, compared to how long it was before?
GK: I’ll give you a break down by different phases. When we started looking at Buck, we initially did everything by hand. We wrote all of the different files for Buck, and the build times were maybe 2x-3x faster than what Gradle gave us. That was very early on.
When you do incremental builds especially, there are a lot of optimizations Buck makes. There’s a thing called exopackage, which is very similar to multidex (and it was around way before multidex existed). It uploads only the changed class files to the device whenever you rebuild the project after you’ve changed some Java files. That cuts down the build time even further, and we started seeing 6x improvements.
KG: 6X?! Holy cow! Okay, just to backtrack a little, is this similar to what happens with Instant Run in that case?
GK: Instant Run is very similar, but it has a bunch of limitations. When it has anything to do with annotation processors, it basically does a full rebuild, purely because the processors a lot of the Android community uses these days (like Dagger and ButterKnife) require the entire code. Anytime you change a file, it’s very hard for Instant Run to predict what’s actually supposed to be rebuilt, so it has to run the annotation processor again and rebuild the code again. If you notice, a lot of the samples you actually see from Instant Run don’t have the annotation processor turned on.
KG: Oh, interesting!
DF: Enlightenment just hit.
KG: So how does Buck handle that, then? I imagine it’s the same thing. Nothing has changed about the annotation processors, right?
GK: Buck actually takes a different approach to this. It knows that incremental builds are very hard to do in general. If you looked at the world of, say, Ant back in the day, you saw that incremental builds would give you correct results 98% of the time, but 2% of the time, you had to do a full build.
KG: That happens even today, and it’s super frustrating.
GK: Buck prefers really small, reusable modules, so any individual module can be built or rebuilt really quickly. That means that you have a lot of flexibility in how far you want to scale your build process. Let’s say that you have one giant application module in Gradle. There are some parts of that module that aren’t required to be recompiled every time, so you can start spreading things out. There are parts where you don’t even have to run annotation processors. So, as you start spreading things out, the build time becomes a function of how small you can get the modules and how many of them you actually have to build.
Which brings us to the next point about Buck: it’s very heavily based on inputs and outputs, compared to Gradle, which is task-based (similar to how Ant worked). Let’s say that you want to build an Android library. Buck builds by looking at all the different inputs that are required (resources, sources, and other dependencies) and saying, “Hey, are all the dependencies currently ready for me to start building this or not?” If they’re not, it will go down one more level and try to build the ones that are further down. It’s a very simple execution model that can parallelize really, really well. That’s what makes it very input/output based. There’s no more task model.
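To make the "are all my dependencies ready?" idea concrete, here is a minimal Python sketch. It is not Buck's scheduler; the target names and the `build_waves` helper are made up for illustration. It groups targets into waves, where everything in one wave has all of its dependencies built already and could, in principle, be built in parallel.

```python
from typing import Dict, List, Set


def build_waves(deps: Dict[str, List[str]]) -> List[List[str]]:
    """Group targets into waves: each target's deps are all built in earlier waves."""
    built: Set[str] = set()
    remaining = set(deps)
    waves: List[List[str]] = []
    while remaining:
        # A target is ready once every one of its declared inputs has been built.
        ready = sorted(t for t in remaining if all(d in built for d in deps[t]))
        if not ready:
            raise ValueError("dependency cycle or undeclared dependency")
        waves.append(ready)
        built.update(ready)
        remaining.difference_update(ready)
    return waves


# Hypothetical target graph: target name -> targets it depends on.
GRAPH = {
    "app": ["ui", "networking"],
    "ui": ["analytics"],
    "networking": [],
    "analytics": [],
}

waves = build_waves(GRAPH)
print(waves)
```

In this toy graph the two leaf libraries land in the first wave together, which is the property that lets an input/output model spread work across cores.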
KG: Here’s a quick question then: you said that it encourages the use of smaller submodules, right? Does that mean that when I write a generic application, it would serve me better to force myself to use these submodules, or does that not necessarily matter? The alternative question is: if I don’t have submodules, am I going to see a benefit with Buck?
GK: Submodules definitely help with build times, but they’re not the only way that Buck is faster—for instance, the parallel execution model. The parallel execution model is very, very fine-grained. All it cares about is the input and the output. So, if you had multiple modules, it’s kind of like the Gradle switch that you can turn on called “parallel build”, except that Buck makes really good use of all the cores in your machine. It understands what the best way is to run all of these targets. Even when you have a vanilla Gradle module setup and you haven’t really spent time to speed it up, you typically see gains of 2x-3x for a clean build.
To actually prove that fact, we have a test project on GitHub that shows how the different build systems stack up against each other. You can actually build a project with Gradle and build the exact same thing with Buck, and you can see for yourself how it makes a big difference with clean builds.
Since everything is declared beforehand in Buck (both input and output), there is no such thing as the Gradle configuration time. I don’t know if you ever saw this message: “Gradle configuring project x of y.” You wonder what it’s actually doing in the background. It’s actually doing a lot of stuff to configure the tasks, and figuring out what the inputs and outputs are, but all of that stuff is given to Buck, since the build language it uses makes it very easy for it to skip that configuration phase.
KG: That’s interesting. You said it’s highly customizable, and then you mentioned the build language. Gradle is written in Groovy, right? So Groovy is the language that’s used. Does Buck also use a language, or is it more declarative (like Ant or Maven)?
GK: The Buck build files are actually mostly just Python rules. It’s a macro, if you think of it that way. You can actually write Python in the Buck files.
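As a rough illustration of "you can write Python in the Buck files", the sketch below shows a macro that declares several libraries in a loop. `android_library` is a real Buck rule name, but the function defined here is only a stand-in that records its arguments so the snippet runs under plain Python; the `feature_library` macro, the target names, and the source paths are all hypothetical.

```python
# Stand-in so this runs under plain Python. Inside a real BUCK file,
# Buck itself provides android_library; this stub only records the call.
DECLARED = []


def android_library(name, srcs, deps=()):
    """Record a library declaration (not Buck's implementation)."""
    DECLARED.append({"name": name, "srcs": list(srcs), "deps": list(deps)})


def feature_library(name, deps=()):
    """Hypothetical macro: one library per feature, with a conventional source path."""
    android_library(
        name=name,
        srcs=["%s/src/main/java/%s.java" % (name, name.title())],
        # ":target" refers to another target declared in the same build file.
        deps=[":" + d for d in deps],
    )


feature_library("analytics")
feature_library("networking", deps=["analytics"])
# Ordinary Python control flow can stamp out similar targets.
for feature in ["rider", "driver"]:
    feature_library(feature, deps=["networking", "analytics"])

names = [target["name"] for target in DECLARED]
print(names)
```

The point of the design is that inputs and dependencies are spelled out up front, even when a loop or helper function generates them.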
DF: Do you know what the Buck build system is written in? Is it purely Python, or what is it?
GK: The actual build system is written in Java.
KG: I know Java!
DF: No more Groovy! Yes!
GK: Even Gradle is mostly written in Java nowadays. Older versions of Gradle were written in Groovy, but almost 90% of Gradle is currently in Java.
KG: That’s interesting and surprising. I mean, I guess it’s more of a familiarity thing as well, but Groovy is supposed to be a better language. Well, actually, I shouldn’t say that…
DF: Who are you talking to?
KG: I’m probably going to get a lot of hate mail on that.
DF: From what I understand, Buck isn’t just used for building Android projects. It’s a build system, so you can use it to build Android projects or other projects, like a Go project. Is that correct?
GK: Yeah, it actually supports quite a few languages, and it’s really easy to add new ones. I think it natively supports Java on Android, and Objective-C on iOS (Swift is in the works right now), along with C++, Go, Rust, Haskell, and what have you.
KG: I imagine Facebook is using this for their iOS applications, but have you seen Buck being used for other iOS applications?
GK: Actually, we have a counterpart at Uber called the iOS Developer Experience team, and they migrated to Buck before we did. So we basically have both iOS and Android builds orchestrated in Buck at Uber.
The other really nice thing about Buck, which I have to quickly mention, is that it’s really good at caching stuff. Since it’s based on the input/output rule, if the inputs haven’t changed, it reads the output as up to date. It’s similar to Gradle, but the checks are a lot faster, since they’re not based on timestamps but on the file content. Let’s say you make a build right now, change a file, and do a build. Then you later change your mind and want to go back to the previous state, and you build again. It basically says that everything is up-to-date, because it has already previously built that state of the codebase.
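The "change a file, then change it back" scenario can be sketched in a few lines of Python. This is a toy model, not Buck's cache: the `rule_key` and `BuildCache` names are illustrative, SHA-1 is an arbitrary choice, and the "artifact" is just a string. The key is computed from file names and contents only, so reverting a file produces the same key as before and the earlier output is reused.

```python
import hashlib
from typing import Dict


def rule_key(inputs: Dict[str, bytes]) -> str:
    """Hash file names and contents (never timestamps) into one key."""
    digest = hashlib.sha1()
    for path in sorted(inputs):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(hashlib.sha1(inputs[path]).digest())
    return digest.hexdigest()


class BuildCache:
    """Toy cache mapping an input key to a previously built output."""

    def __init__(self) -> None:
        self.outputs: Dict[str, str] = {}
        self.compiles = 0

    def build(self, inputs: Dict[str, bytes]) -> str:
        key = rule_key(inputs)
        if key not in self.outputs:
            self.compiles += 1  # the expensive compile step would run here
            self.outputs[key] = "artifact-" + key[:8]
        return self.outputs[key]


cache = BuildCache()
original = {"Main.java": b"class Main {}"}
edited = {"Main.java": b"class Main { int x; }"}

first = cache.build(original)
second = cache.build(edited)
reverted = cache.build(original)  # same contents as the first build
print(cache.compiles)
```

Three builds are requested, but only two distinct input states exist, so the third request is served from the cache.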
KG: That’s so cool! How does it do that? Since it doesn’t look at timestamps, does it use a hash or something?
GK: It maintains a hash, and it also does something else very clever. It runs a file-watching service in the background (like a daemon). It’s called Watchman, and it watches certain files which are a part of this target chain. It knows what has changed before you even kick off a build, so it can make decisions really, really fast.
KG: Interesting. So this is like an additional thing that you have, beyond just the regular Buck system?
GK: You don’t really need it to build in Buck, but it actually makes it much faster, so it’s nice to have.
KG: Okay, that’s pretty cool. I know that one of the other advantages that has been touted for Buck is reproducible builds. How does that work? First, what does it mean? Why is reproducibility important? Why do you need it? And how does Buck provide it?
GK: Have you ever had a situation where you or a colleague adds some configuration to your build.gradle file that depends on things like the version? I know that a lot of people want to have the version of the codebase in BuildConfig, so they typically add it through buildConfigField entries in Gradle. That means that every time you build, the generated BuildConfig Java file changes, so it’ll end up being a full rebuild.
Basically, we’re talking about state that’s outside of the build system that you’re trying to build in. It’s very easy to depend on that in Gradle. Sometimes, if you’re (say) developing annotation processors, and you include tools.jar, that JAR is actually different on every machine and JDK. Your coworker could have a different JDK from you, which means that what you’re building might be slightly different in terms of bytecode, which means that the actual APK that you’re building might have some differences from what they’re seeing. That can be very frustrating when you have small changes that are very hard to debug.
It’s really hard to do that with Buck, because you have to be very deliberate to have it happen to you. The way that the build language is defined, you can’t depend on environment variables or have dynamically generated strings. You can’t really have dynamic dependencies that change for every invocation, for that matter, the way you can with Gradle. That’s part of the reproducibility, which means that if I check out a particular version of the codebase and run it on any machine at any given point of time, I would always be guaranteed to get the same exact output.
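A small Python sketch of why per-invocation values hurt: if the build's inputs are fingerprinted, a checked-in configuration gives the same fingerprint on every run, while a configuration that embeds something like a build timestamp gives a different one each time and forces a rebuild. The field names, the `fingerprint` helper, and the timestamps are all invented for illustration; this is not how Gradle or Buck compute their keys.

```python
import hashlib
import json

# Hypothetical declared inputs: fixed values that live in version control.
DECLARED_CONFIG = {"version_name": "1.0.0", "min_sdk": 16}


def fingerprint(inputs: dict) -> str:
    """Hash a canonical JSON encoding of the declared inputs."""
    payload = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def config_with_timestamp(build_time: str) -> dict:
    """Anti-pattern: mix a value that differs per invocation into the inputs."""
    config = dict(DECLARED_CONFIG)
    config["build_time"] = build_time
    return config


# Same checkout, two builds: identical inputs, identical fingerprint.
same_every_time = fingerprint(DECLARED_CONFIG) == fingerprint(dict(DECLARED_CONFIG))

# Same checkout, two builds at different times: the fingerprint changes.
morning = fingerprint(config_with_timestamp("2017-01-09T09:00:00"))
afternoon = fingerprint(config_with_timestamp("2017-01-09T15:00:00"))

print(same_every_time, morning == afternoon)
```

Keeping such values out of the declared inputs is what lets the same checkout produce the same output on any machine.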
KG: And this is not necessarily something you can guarantee with Gradle at the moment?
GK: You can actually guarantee it, but it’s very easy to mess up. We had an instance where we made this mistake ourselves. When we started migrating to Buck, we realized, “Hey, why is this target being rebuilt every time?” Then we figured it out: “Oh, it’s because in Gradle, it’s not a problem. We were already used to it being rebuilt every time. But with Buck, it can get pretty annoying, especially when you’re used to really fast build times.” Anything that rebuilds every time can be a problem.
DF: So I’m an Android developer. You’re an Android developer. Kaushik is an Android developer. Everyone who listens is usually an Android developer. We spend all of our day inside of Android Studio, so any time I hear about any of these new tools or things I can use, the number one question in my mind is, “Does this work with Android Studio?” Does Buck work with Android Studio?
GK: Android Studio is based on the excellent IntelliJ IDE by the JetBrains folks. If you think about it that way, Buck actually does work with Android Studio, but you may not see all the latest features that Studio might push out. For example, ConstraintLayout is a new thing that Google pushed out. It hadn’t been upstreamed to IntelliJ for a long time, but the latest version actually has it. Eventually, these features do end up in IntelliJ, but Studio is definitely something people want to use with any build system. Buck does work with Android Studio in a limited capacity, but typically, we at Uber use IntelliJ instead of Android Studio, and we don’t miss the Gradle sync times. Those have completely gone out of the picture.
KG: So you use it with IntelliJ, but I imagine you can potentially use it with Android Studio, right?
GK: Yes, you totally can use it with Android Studio.
KG: Okay, so we talked about some of the major pain points in the build system for a company as big as Uber, and we found out some of the cool ways in which Buck can help. In the next episode, we’ll actually dive into more specifics with OkBuck, which makes it way easier to add a build system like Buck to your Android project. Stay tuned for part two!