What is a Service Level Agreement (SLA)?
The Service Level Agreement (SLA) is the document that defines, objectively, what levels of performance, availability, and support a provider commits to delivering for a given service. In technology environments, the SLA is used to ensure that platforms, APIs, support services, and critical systems remain available for the time necessary to support the business, reducing risks of prolonged downtime and directly impacting the end-customer experience.
Instead of vague promises, the SLA works with measurable indicators, such as availability percentage (uptime), maximum response time for tickets, resolution deadlines for incidents, and scheduled maintenance windows. This allows the client and provider to speak the same language, establishing clear expectations and creating a solid basis for audits, performance reviews, and eventual application of penalties or service credits.
Why convert SLA percentage into downtime?
Seeing an SLA described as 99.5% or 99.9% may seem, at first glance, like just a small decimal difference. In practice, however, the amount of downtime allowed in each scenario changes significantly. A 99.0% SLA in a 30-day month (720 hours) allows up to 7 hours and 12 minutes of downtime, while a 99.9% SLA reduces this number to about 43 minutes. This difference is critical when thinking about purchasing journeys, online payments, invoicing, or logistics operations.
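The arithmetic behind this conversion is short enough to sketch. The function name and the 30-day default below are illustrative assumptions, not the calculator's actual implementation:

```python
def allowed_downtime_minutes(sla_percent: float, period_days: float = 30) -> float:
    """Return the maximum downtime, in minutes, that an SLA allows over a period."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60          # 30 days -> 43,200 minutes
    downtime_fraction = (100 - sla_percent) / 100  # 99.0% -> 0.01
    return total_minutes * downtime_fraction


# Compare a few common SLA levels over a 30-day month.
for sla in (99.0, 99.5, 99.9):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% over 30 days: {minutes:.1f} minutes of downtime")
```

For a 30-day month this gives 432 minutes (7 hours and 12 minutes) at 99.0% and roughly 43 minutes at 99.9%.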
Converting the percentage into time helps managers, product owners, and technical teams evaluate if the contracted service level aligns with business objectives. In many cases, a stricter SLA implies higher costs in infrastructure and redundancy. On the other hand, overly flexible SLAs can generate revenue loss, increased churn, and brand damage. The SLA Calculator brings these numbers closer to reality, translating percentages into real downtime minutes.
How to use the SLA Calculator in practice
To use the calculator, simply enter the Contracted SLA percentage, choose the Main Analysis Period (day, week, month, or year) and define the Operation Model. In 24x7 scenarios, all time in the period is considered, while the business hours mode takes into account only 8 hours of operation per business day, which usually makes more sense for support teams or services that do not operate on weekends.
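As a rough sketch of how the period and operation model could feed the calculation, the example below uses assumed period lengths. The 8 hours per business day comes from the text. The 30-day month, 365-day year, and the business-day counts (5 per week, 22 per month, 260 per year) are assumptions for illustration, and the calculator may use different values:

```python
# Assumed period lengths; the calculator's real values are not stated in the text.
CALENDAR_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}
BUSINESS_DAYS = {"day": 1, "week": 5, "month": 22, "year": 260}  # assumption
BUSINESS_HOURS_PER_DAY = 8  # from the text: 8 hours of operation per business day


def period_seconds(period: str, model: str = "24x7") -> int:
    """Total service time in the period, in seconds, for the chosen operation model."""
    if period not in CALENDAR_DAYS:
        raise ValueError(f"unknown period: {period!r}")
    if model == "24x7":
        return CALENDAR_DAYS[period] * 24 * 3600
    if model == "business":
        return BUSINESS_DAYS[period] * BUSINESS_HOURS_PER_DAY * 3600
    raise ValueError(f"unknown model: {model!r}")


def allowed_downtime_seconds(sla_percent: float, period: str, model: str = "24x7") -> float:
    """Maximum downtime the SLA allows in the period, in seconds."""
    return period_seconds(period, model) * (100 - sla_percent) / 100


# The same 99.9% SLA allows far less downtime when only business hours count.
print(allowed_downtime_seconds(99.9, "month", "24x7"))
print(allowed_downtime_seconds(99.9, "month", "business"))
```

Because the business hours model shrinks the total time in the period, the same percentage translates into a much smaller downtime budget.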
After calculation, the tool presents the maximum allowed downtime in each period in an easy-to-read format, combining days, hours, minutes, and seconds. This makes it simple to answer questions like: “With this contract, how long can we be offline per month?” or “Is this SLA sufficient for the criticality of this service?” The information can also be copied for requirements documents, executive presentations, or vendor negotiations.
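One possible way to produce such a readable format is to split a number of seconds with repeated integer division. The function name and the output style ("7h 12m") are assumptions, not the tool's actual format:

```python
def format_duration(total_seconds: float) -> str:
    """Render a duration as days, hours, minutes, and seconds, skipping zero parts."""
    remaining = int(round(total_seconds))
    if remaining < 0:
        raise ValueError("duration cannot be negative")
    days, remaining = divmod(remaining, 86400)   # 86,400 seconds per day
    hours, remaining = divmod(remaining, 3600)
    minutes, seconds = divmod(remaining, 60)
    parts = [(days, "d"), (hours, "h"), (minutes, "m"), (seconds, "s")]
    text = " ".join(f"{value}{unit}" for value, unit in parts if value)
    return text or "0s"  # an empty result means the duration was zero


print(format_duration(25920))  # 1% of a 30-day month -> 7h 12m
print(format_duration(2592))   # 0.1% of a 30-day month -> 43m 12s
```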
Common SLA examples and what they mean
In technology literature, it is common to see expressions like “two nines”, “three nines”, or “four nines”. These refer to the total number of 9 digits in the availability percentage, not only those after the decimal point. A 99% SLA is called “two nines”; 99.9% is “three nines”; 99.99% is “four nines”, and so on. Each additional nine cuts the allowed downtime by a factor of ten and requires more robust architectures, higher redundancy, advanced monitoring, and mature disaster recovery processes.
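The relationship between the count of nines and yearly downtime can be sketched as follows. The function names are illustrative, and a 365-day year is assumed:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a non-leap year


def availability_for_nines(nines: int) -> float:
    """Availability percentage for a given count of nines (2 -> 99%, 3 -> 99.9%)."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return 100 * (1 - 10 ** -nines)


def yearly_downtime_minutes(nines: int) -> float:
    """Allowed downtime per year, in minutes, for a given count of nines."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return MINUTES_PER_YEAR * 10 ** -nines


# Each extra nine divides the yearly downtime budget by ten.
for n in (2, 3, 4, 5):
    print(f"{n} nines: {availability_for_nines(n):.3f}% -> "
          f"{yearly_downtime_minutes(n):.2f} min/year")
```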
For internal services of lower criticality, an SLA between 97% and 99% may be sufficient, especially when there is tolerance for small downtime windows. For payment gateways, tax issuance systems, ERPs, and e-commerce platforms, it is common to seek SLAs above 99.5%, often close to 99.9% or more. With the calculator, you can quickly see how going from 99.0% to 99.9% changes the amount of allowed downtime.
Best practices for negotiating and tracking SLAs
- Clearly define which services, APIs, or features are covered by the SLA and in which environments (production, staging, geographic regions).
- Specify how downtime will be measured, which monitoring tools will be used, and which events will be disregarded (scheduled maintenance, change windows, force majeure incidents).
- Review the SLA periodically, aligning expectations between business areas, technology, vendors, and partners, avoiding unattainable or overly loose goals.
- Use complementary indicators, such as Mean Time to Restore (MTTR), number of critical incidents, and response time during peak hours, to enrich your view of service quality.
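To illustrate how measured downtime, exclusions, and MTTR might be tracked together, here is a minimal sketch. The `Incident` type, the function names, and the sample data are hypothetical. It also assumes that excluded events (such as scheduled maintenance) are simply ignored rather than subtracted from the total period, which is a choice each contract should state explicitly:

```python
from dataclasses import dataclass


@dataclass
class Incident:
    minutes: float          # how long the service was down
    excluded: bool = False  # True for events the contract disregards (e.g. scheduled maintenance)


def achieved_availability(period_minutes: float, incidents: list[Incident]) -> float:
    """Availability percentage after subtracting downtime that counts against the SLA."""
    counted = sum(i.minutes for i in incidents if not i.excluded)
    return 100 * (1 - counted / period_minutes)


def mean_time_to_restore(incidents: list[Incident]) -> float:
    """Average duration, in minutes, of the incidents that count against the SLA."""
    counted = [i.minutes for i in incidents if not i.excluded]
    return sum(counted) / len(counted) if counted else 0.0


# Hypothetical month: two outages plus one scheduled maintenance window.
incidents = [Incident(30), Incident(60, excluded=True), Incident(12)]
period = 30 * 24 * 60  # 30-day month in minutes
print(f"availability: {achieved_availability(period, incidents):.3f}%")
print(f"MTTR: {mean_time_to_restore(incidents):.1f} minutes")
```

Comparing the achieved availability with the contracted percentage shows whether the SLA was met in the period, while MTTR indicates how quickly individual incidents were resolved.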