Performance Engineering

fast systems, faster business 

Fast systems make customers happy, so they do more business, more often. Fast systems prevent failures, reduce support tickets and make teams productive. And fast systems operate at lower costs, much lower.

Low Latency

Realtime trading and messaging systems and APIs need low-latency responses on every interaction. We consider everything from language, libraries, algorithms and data structures to caching, disk and network I/O to ensure these systems are tuned for low latency irrespective of scale.

High Throughput

Data processing and async transactions need high throughput to scale, usually measured in rps (records per second). We achieve multi-million rps by segregating processing steps into CPU-bound and I/O-bound stages, executing and scaling them in parallel, batching all I/O, and reducing or eliminating local and distributed locks.
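
To illustrate the "batch all i/o" point, here is a minimal Python sketch, not our production code. The `BatchingWriter` class, its `sink` callback and the batch size of 1000 are hypothetical names and values chosen for the example. The sink stands in for any real I/O call such as a disk write or a network send.

```python
from typing import Callable, List


class BatchingWriter:
    """Buffers records and hands them to `sink` in batches,
    so the I/O cost is paid once per batch instead of once per record."""

    def __init__(self, sink: Callable[[List[bytes]], None], batch_size: int = 1000) -> None:
        self._sink = sink              # hypothetical I/O function (disk, socket, db)
        self._batch_size = batch_size
        self._buffer: List[bytes] = []

    def write(self, record: bytes) -> None:
        self._buffer.append(record)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered, then start a fresh buffer.
        if self._buffer:
            self._sink(self._buffer)
            self._buffer = []


# Usage: count how many I/O calls 2500 records cost with a batch size of 1000.
io_calls: List[int] = []
writer = BatchingWriter(sink=lambda batch: io_calls.append(len(batch)), batch_size=1000)
for i in range(2500):
    writer.write(b"record-%d" % i)
writer.flush()  # push out the final partial batch
print(io_calls)  # [1000, 1000, 500]
```

In this sketch 2500 records cost three sink calls rather than 2500. The right batch size depends on the workload and is something to measure, not assume.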

User Experience

User experience on a web or mobile app has multiple layers that affect performance, both perceived and real. We tune both: we optimize and background the download of assets and media, tune API latency, data formats and compression, cache data online and offline, and prioritize requests by whether the content is above or below the fold.

Databases

The database is a critical component and is often the bottleneck, either because it is not tuned or because it receives load it should not. We tune the db starting from the data model: creating the necessary indices, optimizing query plans, removing unnecessary constraints, avoiding clever hacks such as update on unique key error, batching updates, etc.
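
As a small illustration of two of these steps (batched writes and checking a query plan before and after adding an index), here is a sketch using SQLite from Python's standard library. SQLite is chosen only because it runs anywhere, and the `orders` table and `idx_orders_customer` index are hypothetical. Plan output and tuning options differ across database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")

# Batch the inserts: one transaction and one executemany call
# instead of 10,000 separate statements and commits.
rows = [(i, i % 100, float(i)) for i in range(10_000)]
with conn:
    conn.executemany("INSERT INTO orders (id, customer_id, total) VALUES (?, ?, ?)", rows)


def plan(sql: str) -> str:
    """Return the human-readable detail column of SQLite's query plan."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT total FROM orders WHERE customer_id = 42"
plan_before = plan(query)   # expected to be a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)    # expected to search via the new index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies by SQLite version, but the second plan should name the index while the first should not.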

Events and Streams

High-throughput event streaming needs expert tuning of the message bus, publishers and consumers. Publishers and consumers can batch messages and process them asynchronously while the read loop keeps running at a constant pace. We tuned Kafka to consume at 250MBps from a single node for a high-volume log management product. Read it here.
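
The batching and async pattern can be sketched in plain Python as below. This is an illustration only: it uses no Kafka client, and the `consume` function, the batch size and the queue depth are assumptions made for the example. The read loop does nothing but read and batch, while a separate worker thread does the processing.

```python
import queue
import threading
from typing import Iterable, List, Optional


def consume(source: Iterable[bytes], batch_size: int = 500) -> int:
    """Read messages in a tight loop and hand full batches to a worker thread."""
    # A bounded queue applies backpressure if the worker falls behind.
    batches: "queue.Queue[Optional[List[bytes]]]" = queue.Queue(maxsize=8)
    processed = 0

    def worker() -> None:
        nonlocal processed
        while True:
            batch = batches.get()
            if batch is None:          # sentinel: no more batches
                return
            processed += len(batch)    # stand-in for real processing work

    thread = threading.Thread(target=worker)
    thread.start()

    batch: List[bytes] = []
    for message in source:             # the read loop only reads and batches
        batch.append(message)
        if len(batch) >= batch_size:
            batches.put(batch)
            batch = []
    if batch:
        batches.put(batch)             # final partial batch
    batches.put(None)
    thread.join()
    return processed


total = consume(b"event" for _ in range(1234))
print(total)  # 1234
```

With a real message bus, the `source` iterable would be replaced by the client's poll or fetch call, and the worker would typically be a pool sized to the processing cost.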

Separate Storage from Compute

With the increase in the variety and volume of data, cloud/block storage becomes an integral part of high-throughput systems. Compute can be separated and provisioned as needed to process data from storage at scale. Products such as DuckDB and data formats such as Parquet and Arrow make this approach quite scalable.


faster system

1M rps on a JVM with 2 CPUs and 1GB RAM

10MBps logs processed per core

5X online store throughput increased

10ms avg API latency for a PCI-DSS store

perf tuning, the process

1. Define The Ask

A clear definition of the requirements (latency, throughput and cost) sets the context for the subsequent phases. Sometimes latency and throughput requirements get mixed up when only one of them is actually required. For instance, a PnL calculation for 1M entries in 10 seconds does not imply a latency requirement of 10 microseconds per record.
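
The arithmetic behind the PnL example can be sketched as follows. The 1 ms per-record latency is an assumed figure for illustration, and the worker count comes from Little's law (work in flight = throughput x latency), which is our framing of the example and holds for a steady-state system.

```python
import math


def required_throughput(records: int, window_seconds: float) -> float:
    """Records per second needed to finish `records` within the window."""
    return records / window_seconds


def workers_needed(throughput_rps: float, per_record_latency_us: float) -> int:
    """Little's law: records in flight = throughput x latency.
    Each in-flight record needs one worker (thread, core or task)."""
    return math.ceil(throughput_rps * per_record_latency_us / 1_000_000)


rps = required_throughput(1_000_000, 10.0)
print(rps)  # 100000.0 records per second

# Assumed: each record takes 1 ms (1000 microseconds), 100x slower than 10 microseconds.
print(workers_needed(rps, 1000))  # 100 parallel workers still meet the 10 second target

# Only a strictly sequential design needs 10 microseconds per record.
print(workers_needed(rps, 10))    # 1
```

The requirement here is throughput: 100,000 records per second. Per-record latency only becomes a constraint once the available parallelism is fixed.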

2. Collect the metrics

Data is collected from production monitoring systems if available and sufficient. Otherwise it is taken from a perf test environment that puts the system under the anticipated load. Additional non-intrusive monitoring components are installed to collect the required data at a granular level. Config and code are reviewed to form the complete picture and correlate metrics to code.

3. Tune with Roadrunnr

Roadrunnr is a performance toolkit from 91social that analyzes metrics, JVM GC logs, query plans, etc. to discover bottlenecks faster. Systems are tuned at 4 levels (hardware and software config, non-intrusive code changes, code refactoring and design refactoring) to ensure they meet latency and throughput requirements at optimal cost.

Contact us to performance tune your systems

Client Stories

E-commerce firm sails through the holiday season taking 3X load with zero dropped orders

A San Francisco-based e-commerce firm that makes high-quality fashion affordable was looking to improve its application readiness and performance for the holiday season. Roadrunnr helped the company optimise Shopify platform spending, improved application responsiveness by 10X, and scaled the platform to take up to 3X load without dropping a single order.
