Anonymous
- Data Gathering: I asked my engineers to gather data on resource utilization across internal customers. We discovered that most customers were using less than 5% of their provisioned capacity.
- Engagement with Stakeholders: I began by reaching out to two of the largest internal users, XYZ and PQR, since I had close relationships with both. I proposed a multi-phased approach that would let them resize their resources gradually without impacting their services.
- Cross-team Alignment: I then engaged the Front-End (FE) team and its Senior Manager, presenting this as an opportunity to shape traffic and reduce strain on the backend fleet. This led to further discussions about aligning future traffic patterns to optimize fleet usage.
- Initiating Pricing Discussions: With the success of the initial efforts, I initiated discussions around refining the internal pricing model, presenting data to show that many more customers could benefit from resizing, thereby reducing backend load and unnecessary resource allocation.
- Backend Fleet Reduction: We successfully reduced the backend fleet size by 7-9%, leading to a reduction in overall system load and minimizing metadata overhead.
- Cost Savings: Customers also cut their costs by shrinking over-provisioned resources and lowering the number of workers their applications required.
- System Improvements: By pushing out scaling cliffs for the load aggregator service, we avoided potential bottlenecks, improving operational stability.
- Pricing Strategy Refinement: This work led to ongoing refinements in ABC's internal pricing model, aiming to align more closely with public pricing or move towards a more robust per-use product.
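
The utilization finding from the data-gathering step (most customers using less than 5% of provisioned capacity) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `CustomerUsage` record, the field names, the choice of peak usage as the measure, and the sample numbers are all hypothetical and are not taken from the actual analysis.

```python
from dataclasses import dataclass


@dataclass
class CustomerUsage:
    # Hypothetical record; field names are illustrative only.
    name: str
    provisioned: float  # provisioned capacity, in any consistent unit
    peak_used: float    # peak observed usage, in the same unit


def utilization(c: CustomerUsage) -> float:
    """Fraction of provisioned capacity actually used (0.0 if nothing is provisioned)."""
    if c.provisioned <= 0:
        return 0.0
    return c.peak_used / c.provisioned


def resize_candidates(customers, threshold=0.05):
    """Customers below the utilization threshold, least-utilized first."""
    return sorted(
        (c for c in customers if utilization(c) < threshold),
        key=utilization,
    )


# Made-up sample data, not real measurements.
sample = [
    CustomerUsage("team-a", provisioned=1000, peak_used=30),   # 3% utilized
    CustomerUsage("team-b", provisioned=500, peak_used=200),   # 40% utilized
    CustomerUsage("team-c", provisioned=2000, peak_used=40),   # 2% utilized
]

candidates = [c.name for c in resize_candidates(sample)]
print(candidates)
```

Sorting by utilization puts the most over-provisioned customers first, which matches the approach of starting the conversation with the users where a gradual resize frees the most capacity.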