
Optimizing MongoDB for Real-Time Climate APIs

April 2, 2026
2 min read
A 15-month study tracking nearly 230 pediatric asthma patients found that elevated PM2.5 concentrations and pollen severity were directly associated with poorer asthma control. Freely available, timely alerts on pollution peaks could genuinely help patients modify outdoor activity during dangerous windows.

When we think about API performance, it’s easy to reduce everything to metrics like latency and response time. But the moment your data starts informing decisions that affect someone’s health, performance stops being just a UX preference.

In the context of climate and environmental data, slow responses can carry real consequences.

A delayed air quality alert might mean someone with asthma doesn’t get timely information about unsafe conditions. A pollen forecast that takes too long to load can affect how someone plans their day, or whether they step out at all.


That’s why optimizing API performance has become a priority for the entire team rather than a concern pushed to the backend alone: it is a core part of the trust between developers and users.

Why climate data APIs are uniquely challenging

Climate and environmental APIs operate under a very different set of constraints compared to typical CRUD-based systems.

A single request may involve multiple environmental parameters such as air quality components, pollen levels, meteorological variables, and derived indices. Most queries are also location-based, meaning every request involves geospatial lookups tied to latitude and longitude.

On top of that, these APIs operate at a global scale. Requests can arrive simultaneously from different regions, and the backend must quickly fetch the most relevant environmental data based on coordinates, time windows, and specific parameters. This combination of high request concurrency, geospatial queries, and multi-parameter filtering makes database performance critical.
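To make that concrete, here is the rough shape such a request can take as a MongoDB query: an equality filter on a parameter, a time window, and a geospatial lookup. This is a minimal sketch; the field names ("parameter", "timestamp", "location"), the collection, and the coordinates are hypothetical, not Ambee’s actual schema.

```python
from datetime import datetime, timezone

# Illustrative query combining equality, range, and geospatial filters.
query = {
    "parameter": "pm25",  # equality filter: which pollutant
    "timestamp": {"$gte": datetime(2026, 4, 1, tzinfo=timezone.utc)},  # time window
    "location": {  # geospatial lookup around a coordinate
        "$geoWithin": {
            # [lng, lat] center; radius in radians = km / Earth radius in km
            "$centerSphere": [[77.59, 12.97], 25 / 6378.1]
        }
    },
}
# With a live connection this would run as: db.measurements.find(query)
```

Every one of these clauses hits a different access pattern, which is why index design ends up mattering so much.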

For instance, Kleenex® built a platform called Your Pollen Pal, powered by Ambee’s pollen and weather APIs, to alert users about local pollen levels and recommend relevant products via email and SMS. In its first week, the platform saw 100,000 visits, and it continues to draw around 30,000 unique visitors on a regular basis.

Improving MongoDB query performance

To improve response times, we started by analyzing the most frequent query patterns across our APIs. Many of these queries filtered data based on geolocation and timestamps, so optimizing how MongoDB handled those filters became the first priority.

One of the key improvements was introducing compound indexes aligned with our query patterns. Instead of indexing fields independently, we created indexes that reflected how the data was actually queried. 

For example, combining geographic identifiers with time-based filters in a single index.
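The idea can be sketched as a pymongo-style index specification. The field names below are hypothetical, and the pymongo direction constants are declared inline so the sketch is self-contained (pymongo defines ASCENDING = 1 and DESCENDING = -1).

```python
ASCENDING, DESCENDING = 1, -1  # pymongo.ASCENDING / pymongo.DESCENDING

# Instead of separate single-field indexes on the geographic identifier
# and the timestamp, one compound index mirrors the query pattern:
# narrow by geographic cell first, then by time (newest first).
compound_index = [
    ("grid_cell", DESCENDING if False else ASCENDING),  # hypothetical geographic identifier
    ("timestamp", DESCENDING),                          # time-based filter / sort
]
# Applied (with a live MongoDB) as: db.measurements.create_index(compound_index)
```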

When designing these indexes, we followed the ESR rule (Equality → Sort → Range), which is a common MongoDB indexing guideline. Fields used for equality filters were placed first in the index, followed by sorting fields, and finally range-based filters such as timestamps.

Structuring indexes this way allowed MongoDB to efficiently narrow down the candidate documents and avoid unnecessary scans.
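A minimal sketch of ESR ordering as a helper function, using hypothetical field names; all fields are marked ascending here for simplicity, whereas in practice sort fields take the direction of the query’s sort.

```python
# Order index fields per the ESR rule: Equality -> Sort -> Range.
def esr_index(equality, sort, range_):
    """Return a pymongo-style index spec with ESR ordering."""
    ASCENDING = 1  # pymongo.ASCENDING
    return [(field, ASCENDING) for field in equality + sort + range_]

# Hypothetical fields matching the query pattern described above:
spec = esr_index(
    equality=["parameter"],   # e.g. parameter == "pm25"
    sort=["station_id"],      # e.g. sort results by station
    range_=["timestamp"],     # e.g. timestamp >= window_start
)
# spec == [("parameter", 1), ("station_id", 1), ("timestamp", 1)]
```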

Since location-based queries are central to climate data retrieval, we also ensured that geospatial indexes were properly configured.

We rely heavily on MongoDB’s 2dsphere indexes for coordinate-based lookups. One important limitation, though, is that geospatial indexes behave differently when used inside compound indexes.

When combining geospatial and compound indexes, index ordering becomes especially important so the query planner can use them efficiently. Because of this, we structured our indexes so that non-geospatial filters, such as equality filters or timestamps, appear before the geospatial field.
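That ordering looks like this as a pymongo-style spec. The string "2dsphere" is the value pymongo exposes as GEOSPHERE; the field names are hypothetical.

```python
ASCENDING, GEOSPHERE = 1, "2dsphere"  # pymongo.ASCENDING / pymongo.GEOSPHERE

# Non-geospatial filters come first, the geospatial key last.
geo_compound_index = [
    ("parameter", ASCENDING),   # equality filter
    ("timestamp", ASCENDING),   # range filter on time
    ("location", GEOSPHERE),    # GeoJSON point field, geospatial key last
]
# Applied (with a live MongoDB) as: db.measurements.create_index(geo_compound_index)
```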

Additionally, we monitored query plans and execution statistics using explain() to identify slow operations. This helped us pinpoint queries that were triggering collection scans and adjust them to better align with the available indexes.
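A check like that can be automated with a small helper that walks the "winningPlan" section of an explain() document and reports whether any stage is a COLLSCAN (a full collection scan). Because it only inspects the plan document, it runs without a live MongoDB; with a connection you would feed it collection.find(query).explain()["queryPlanner"]["winningPlan"].

```python
def has_collscan(stage):
    """Recursively search an explain plan stage tree for COLLSCAN."""
    if stage.get("stage") == "COLLSCAN":
        return True
    child = stage.get("inputStage")
    if child is not None and has_collscan(child):
        return True
    # Stages like OR carry multiple children under "inputStages".
    return any(has_collscan(s) for s in stage.get("inputStages", []))

# A plan shaped like MongoDB's explain output, using an index scan
# (the index name here is illustrative):
plan = {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "grid_cell_1_timestamp_-1"},
}
```

Running this over explain output for the most frequent queries is an easy way to catch a query that silently fell back to a collection scan after a schema or index change.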

Performance improvements

After implementing these optimizations, we saw measurable improvements in both database performance and overall API latency.

Average query latency dropped from roughly 300–400ms to around 80–90ms for some of our most frequently used data. The p95 latency improved by over 40%, and the number of queries triggering collection scans decreased significantly after the indexing changes.

These improvements also helped stabilize performance during peak traffic periods, especially for endpoints relying heavily on geospatial lookups. In practical terms:

  • Requests that previously took noticeable time to return results now respond much faster
  • Latency is more consistent and predictable across queries

What this means for developers

For developers integrating with our climate APIs, these improvements translate directly into faster and more reliable data access. Applications that rely on environmental intelligence, whether for air quality alerts, pollen forecasts, or climate insights, can now fetch data with lower latency and improved consistency. 

Ultimately, optimizing our MongoDB queries ensures that developers can focus on building impactful applications, while the infrastructure delivering the data remains efficient and scalable.
