The eBay engineering team recently outlined how they came up with a scalable release system. The release solution leverages a distributed architecture to release more than 3,000 dependent libraries in about two hours. The team uses Jenkins in combination with Groovy scripts to perform the release.
As we learnt from Randy Shoup (VP of Engineering and Chief Architect at eBay) and Mark Weinberg (Vice President, Core Product Engineering), eBay had systemic challenges with releasing major dependencies, leading to the equivalent of distributed monoliths. Late last year, eBay began migrating its legacy libraries to Mavenized source code. The engineering team needed to consider the complicated dependency relationships between the libraries before the release.
A library can be released only after all of its dependencies have been released. Given the large number of candidate libraries and the complicated dependency relationships among them, a poorly orchestrated release sequence has a considerable impact on release performance.
Understanding the library release sequence is therefore critical to ensuring optimal release performance. Mapping the dependency relationships for a large number of libraries yields a Directed Acyclic Graph (DAG), shown as an example below:
Source – eBay Engineering Tech Blog
In the above diagram, a library is represented by a node (circle) and the dependent libraries are connected using lines. When the library represented by number one is released, two, three and four can be released in parallel. The libraries shown in blue get released separately, as they are not dependent on any other libraries.
Using the central service, the distributed release system calculates the DAG and then queues all nodes of the same priority for release in parallel. Furthermore, a node with more parent nodes gets higher priority for release. The central service then uses the optimum number of Jenkins nodes to perform the release.
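The grouping described above can be illustrated with a small sketch. It is not eBay's implementation: it assumes the dependency map is available as a plain dictionary, uses Kahn's algorithm to split the DAG into "waves" of libraries that can be released in parallel, and orders each wave so that libraries with more direct dependents come first, which is only one possible reading of the priority rule. The function name `release_waves` and the sample data are hypothetical.

```python
from collections import defaultdict


def release_waves(deps):
    """Group libraries into waves that can be released in parallel.

    deps maps each library name to the set of libraries it depends on.
    Every library in wave N depends only on libraries in earlier waves.
    """
    dependents = defaultdict(set)   # library -> libraries that depend on it
    remaining = {}                  # library -> dependencies not yet released
    for lib, needs in deps.items():
        remaining[lib] = set(needs)
        for dep in needs:
            dependents[dep].add(lib)
            remaining.setdefault(dep, set())

    waves = []
    released = 0
    ready = [lib for lib, needs in remaining.items() if not needs]
    while ready:
        # Assumed tie-break: more direct dependents first, then by name.
        ready.sort(key=lambda lib: (-len(dependents[lib]), lib))
        waves.append(ready)
        released += len(ready)
        next_ready = []
        for lib in ready:
            for child in dependents[lib]:
                remaining[child].discard(lib)
                if not remaining[child]:
                    next_ready.append(child)
        ready = next_ready

    if released != len(remaining):
        raise ValueError("dependency cycle detected; the graph is not a DAG")
    return waves


if __name__ == "__main__":
    # Hypothetical graph: 2, 3 and 4 depend on 1; 5 depends on 2 and 3;
    # 6 has no dependencies and nothing depends on it.
    sample = {"1": set(), "2": {"1"}, "3": {"1"}, "4": {"1"},
              "5": {"2", "3"}, "6": set()}
    for number, wave in enumerate(release_waves(sample), start=1):
        print(f"wave {number}: {wave}")
```

Each wave could then be handed to as many release workers as are available, since no library in a wave depends on another library in the same wave.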
The eBay engineering team has architected the Jenkins “pull mode” for parallel releases.
Source – eBay Engineering Tech Blog
Each Jenkins job has a Groovy script. Jenkins nodes use the central service to pull the candidate library when the release is triggered. After releasing it and reporting the results, the next candidate library is pulled for release.
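The pull loop described above can be sketched as follows. This is a simplified, in-process illustration rather than eBay's Jenkins and Groovy setup: threads stand in for Jenkins nodes, and `CentralService`, `worker`, `run_release` and `release_fn` are hypothetical names. It assumes every dependency also appears as a key in the dependency map and that every release succeeds; failure handling and retries are left out.

```python
import queue
import threading


class CentralService:
    """Hypothetical stand-in for the central service that hands out libraries."""

    def __init__(self, deps):
        self._lock = threading.Lock()
        self._remaining = {lib: set(needs) for lib, needs in deps.items()}
        self._ready = queue.Queue()
        self._pending = len(deps)
        self.order = []  # libraries in the order their release was reported
        for lib, needs in self._remaining.items():
            if not needs:
                self._ready.put(lib)

    def pull(self):
        """Return the next releasable library, or None once all are released."""
        while True:
            with self._lock:
                if self._pending == 0:
                    return None
            try:
                return self._ready.get(timeout=0.05)
            except queue.Empty:
                continue  # other workers are still releasing prerequisites

    def report(self, lib):
        """Record a finished release and unblock libraries that waited on it."""
        with self._lock:
            self.order.append(lib)
            self._pending -= 1
            for other, needs in self._remaining.items():
                if lib in needs:
                    needs.discard(lib)
                    if not needs:
                        self._ready.put(other)


def worker(service, release_fn):
    # Pull a candidate, release it, report the result, then pull the next one.
    while (lib := service.pull()) is not None:
        release_fn(lib)
        service.report(lib)


def run_release(deps, release_fn, workers=3):
    service = CentralService(deps)
    threads = [threading.Thread(target=worker, args=(service, release_fn))
               for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return service.order


if __name__ == "__main__":
    sample = {"1": set(), "2": {"1"}, "3": {"1"}, "4": {"1"}, "5": {"2", "3"}}
    print(run_release(sample, lambda lib: None))
```

Because workers ask for work instead of having it pushed to them, a slow release only occupies one worker while the others keep pulling whatever has become releasable.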
The eBay engineering team stated that the above distributed architecture is not limited to release-related tasks. Given its generic nature, it is also applicable to other cases, such as:
- Running distributed integration test cases and creating summarized results post-run
- Simultaneous data collection/analysis from different channels to generate reports
In the context of dependencies between microservices, at QCon Plus November 2021 we saw that a growing number of dependencies are being managed between tightly coupled microservices. The need for simultaneous release arises because all the dependent microservices need to be tested together in a batch.
An interesting conversation on Hacker News reveals that multiple source code repositories in a distributed system are also a major source of frustration. HN user wreath mentions that “syncing and waiting for (repository) deployment/release (if it’s a library) just to add a small feature easily wastes a few hours of the day and most importantly drains cognitive ability by context switching.”
Additional details on the new eBay release architecture can be found in a recent article on the eBay Engineering Blog: A Lightweight Distributed Architecture to Handle Thousands of Library Releases at eBay