August 15, 2023

Ookla for Good Adds Centroid Coordinates to Open Data

Faster and Easier Mapping for Fixed and Mobile Network Performance Map Tiles

When we first offered our open performance datasets to the public in 2020, Ookla®’s main priorities were to make sure that our fixed and mobile network performance data would be accessible, relevant, and useful to those researching or trying to improve the state of networks worldwide. As conversations around connectivity continue to evolve, so too has our open dataset. For instance, earlier this year we extended our open performance dataset to include additional responsiveness insights by showing latency under load for both saturated downlink and uplink measurements.

As we continue to expand our datasets with new and relevant metrics, we remain committed to ensuring that key data views are accessible to as many interested parties as possible. More people accessing network performance and consumer quality of experience insights means more important conversations surrounding connectivity and, ultimately, better public policy to help improve broadband access globally. This is why we are excited to announce the addition of centroid coordinates to our open data.

What are centroids?

So what exactly are centroid coordinates, and how do they help provide more access to our data? For a triangle, the centroid is the point where the medians intersect; more generally, it is the geometric center of a shape, the average position of all the points in it. Put simply, a centroid is the center point of a shape. For more in-depth information on the math behind how centroids are computed, check out the RPubs documentation on the subject.
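
As a rough illustration of the idea (not a description of how the centroids in the open data are produced), the sketch below computes the centroid of a simple polygon in base R using the standard area-weighted (shoelace) formula. The helper name `polygon_centroid` and the sample shapes are hypothetical, and planar coordinates are assumed.

```r
# Hypothetical helper: centroid of a simple (non-self-intersecting) polygon.
# `x` and `y` hold the vertex coordinates in order, without repeating the first vertex.
polygon_centroid <- function(x, y) {
  n <- length(x)
  nxt <- c(2:n, 1)                      # index of the next vertex, wrapping around
  cross <- x * y[nxt] - x[nxt] * y      # cross product of each edge's endpoints
  area <- sum(cross) / 2                # signed area (shoelace formula)
  cx <- sum((x + x[nxt]) * cross) / (6 * area)
  cy <- sum((y + y[nxt]) * cross) / (6 * area)
  c(x = cx, y = cy)
}

# A 2 x 2 square and a right triangle
square <- polygon_centroid(c(0, 2, 2, 0), c(0, 0, 2, 2))
triangle <- polygon_centroid(c(0, 3, 0), c(0, 0, 3))
print(square)
print(triangle)
```

For the triangle, the result coincides with the average of the three vertices, which is also where its medians meet.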

Why centroids are important

The Ookla for Good open dataset makes use of spatial tiles. While the geometries of tiles provide precision in mapping, working with and plotting spatial data can often be a long and arduous process. This issue is further compounded by the global nature of our datasets, making it difficult for users to subset specific areas to look at. Machines with limited RAM may take multiple hours to do a single spatial join to a small area, and if more metrics are added, the joins take longer and longer to run.

The addition of centroid coordinates means that rather than needing a geospatial toolset, users can filter to an approximation of their area of interest using a numerical bounding box. This will dramatically cut down on the time to join our data to other data sources, improving accessibility for those looking to work with Ookla’s Fixed and Mobile Network Performance Map Tiles.
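
To make the idea concrete, here is a minimal base R sketch of a numerical bounding box filter. The tile rows are made up for illustration; the `tile_x` and `tile_y` column names and the approximate Brazil bounding box values are the ones used later in this article.

```r
# Hypothetical tile centroids (tile_x = longitude, tile_y = latitude)
tiles <- data.frame(
  tile_id = c("a", "b", "c", "d", "e"),
  tile_x = c(-47.9, -58.4, -43.2, 2.35, -57.6),
  tile_y = c(-15.8, -34.6, -22.9, 48.86, -25.3),
  stringsAsFactors = FALSE
)

# Approximate bounding box for Brazil
bbox <- c(xmin = -73.99, ymin = -33.75, xmax = -28.85, ymax = 5.27)

# Plain numeric comparisons: no geospatial toolset required
in_box <- tiles[
  tiles$tile_x >= bbox[["xmin"]] & tiles$tile_x <= bbox[["xmax"]] &
    tiles$tile_y >= bbox[["ymin"]] & tiles$tile_y <= bbox[["ymax"]],
]
print(in_box$tile_id)
```

Tile "e" sits near Asunción, Paraguay: it passes the filter because it lies inside the rectangle, even though it is outside Brazil. This is why the bounding box is only an approximation of the area of interest.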

Mapping open data using centroid coordinates

For this example let’s use the country of Brazil to demonstrate one way of mapping our open data with centroid coordinates in R.

library(geobr)
library(ggrepel)
# remotes::install_github("teamookla/ooklaOpenDataR")
library(ooklaOpenDataR)
library(scales)
library(sf)
library(tidyverse)
library(usethis)

# Set global defaults
theme_set(theme_minimal())
theme_update(text = element_text(color = "#181512")) 
theme_update(plot.subtitle = element_text(hjust = 0.5))

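# plot colors
purple <- "#8D5DB2"
light_gray <- "#EDEAE6"
mid_gray <- "#9B9893"
# dark_gray is used for the map labels below; this value is an assumption
# (it matches the theme text color set above)
dark_gray <- "#181512"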

colPal <- colorRampPalette(c("#E4AECF", "#3E0E1C"))

We use the geobr package to grab the geometries for the country. The package includes functions for grabbing geometries at different administrative levels, in case you want to follow this tutorial for a smaller portion of the country.

We will use st_bbox() to create a bounding box to help subset the performance map tiles. Alternatively, you can find the min/max coordinates online (for example, by searching for “Brazil bounding box”). The dimensions of Brazil’s bounding box are highlighted on the following map.

#Set up map
brazil <- read_country(year=2020) %>%
  st_transform(4326)

br_bbox <- brazil %>% # use to subset open data
  st_bbox()

ggplot(brazil) +
  geom_sf(color = mid_gray, fill = light_gray, lwd = 0.08) +
  labs(title = "Brazil") +
  geom_text_repel(aes(label = name_state, geometry = geom),
    family = "sans",
    color = dark_gray,
    size = 2.2,
    stat = "sf_coordinates",
    min.segment.length = 2
  ) +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    axis.text = element_blank(),
    axis.title = element_blank()
  ) +
  annotate(geom = "rect", ymax = 5.27, ymin = -33.75, xmax = -28.85, xmin = -73.99, colour = purple, fill = colorspace::lighten("#8D5DB2", 0.5), alpha = 0.1)

Figure: Base map of Brazil

Next, we download the performance map tiles using the ooklaOpenDataR package and subset the global data to the bounding box, which makes it more manageable to work with when we do our join.

fixed_br_q2 <- get_performance_tiles(service = "fixed", quarter = 2, year = 2023) %>%
  filter(
    tile_x >= br_bbox[["xmin"]], tile_x <= br_bbox[["xmax"]],
    tile_y >= br_bbox[["ymin"]], tile_y <= br_bbox[["ymax"]]
  ) %>%
  st_as_sf(wkt = "tile", crs = 4326)

You can then check to see what the remaining data looks like. There will be points outside of the boundaries, but that is to be expected.

ggplot(brazil) + # Check
  geom_sf(color = mid_gray, fill = light_gray, lwd = 0.08) +
  geom_sf(data = fixed_br_q2, color = purple) +
  labs(
    title = "Brazil",
    subtitle = "Ookla® Open Data Fixed Tiles | Q2 2023"
  ) +
  geom_text_repel(aes(label = name_state, geometry = geom),
    family = "sans",
    color = dark_gray,
    size = 2.2,
    stat = "sf_coordinates",
    min.segment.length = 2
  ) +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    axis.text = element_blank(),
    axis.title = element_blank()
  )

Figure: Map of Brazil with Ookla® Open Data Fixed Tiles, Q2 2023

Voilà! You now have a smaller dataset that can then be used to perform analysis as needed.

Aggregating Loaded Latency

As mentioned above, we added loaded latency metrics to our data in January. We can build on the previous code to show a simple example of how to aggregate them. For more examples and information on how to perform your own analysis on our open data, check out this tutorial.

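First, use the following code to perform a spatial join. This keeps only the tiles that intersect Brazil’s boundaries and attaches each one to its state, giving a more precise location than the bounding box alone.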

#Join 
br_tiles <- brazil %>%
  st_join(fixed_br_q2, left = FALSE)
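
To show what the inner spatial join does after a bounding box filter, here is a small self-contained sketch using sf. The triangle "state" and the two points are hypothetical, and points are used instead of tile polygons only to keep the example short.

```r
library(sf)

# Hypothetical "state": a triangle with corners (0, 0), (2, 0), and (0, 2)
state <- st_sf(
  name_state = "Example",
  geometry = st_sfc(
    st_polygon(list(rbind(c(0, 0), c(2, 0), c(0, 2), c(0, 0)))),
    crs = 4326
  )
)

# Hypothetical tile locations; both lie inside the triangle's bounding box
tiles <- st_as_sf(
  data.frame(
    tile_id = c("in", "out"),
    x = c(0.5, 1.8),
    y = c(0.5, 1.8),
    stringsAsFactors = FALSE
  ),
  coords = c("x", "y"),
  crs = 4326
)

# left = FALSE makes this an inner join: only intersecting pairs are kept
joined <- st_join(state, tiles, left = FALSE)
print(joined$tile_id)
```

Both points would survive a bounding box filter, but only the one that actually falls inside the polygon survives the join.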

Then aggregate the data up to the desired level of granularity. In this case, we aggregated the data by states in Brazil.

br_aggs <- br_tiles %>%
  st_set_geometry(NULL) %>%
  group_by(name_state) %>%
  reframe(
    tiles = n(),
    avg_lat_download_ms = round(weighted.mean(avg_lat_down_ms, tests, na.rm = TRUE), 2),
    avg_lat_upload_ms = round(weighted.mean(avg_lat_up_ms, tests, na.rm = TRUE), 2),
    total_tests = sum(tests)
  )
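
The aggregation above weights each tile’s average latency by its number of tests. The toy example below, with made-up values, sketches why that matters: a tile with few tests should not pull the state average as much as a tile with many. Only the column names are taken from the code above.

```r
# Hypothetical tiles: average loaded latency and the number of tests behind each value
tiles <- data.frame(
  name_state = c("A", "A", "B"),
  avg_lat_down_ms = c(100, 400, 50),
  tests = c(90, 10, 5),
  stringsAsFactors = FALSE
)

# Unweighted mean for state A treats both tiles equally
simple_a <- mean(tiles$avg_lat_down_ms[tiles$name_state == "A"])

# Weighted mean per state, using the number of tests as weights
by_state <- sapply(split(tiles, tiles$name_state), function(d) {
  weighted.mean(d$avg_lat_down_ms, d$tests, na.rm = TRUE)
})

print(simple_a)
print(by_state)
```

For state A, the unweighted mean is 250 ms, while the test-weighted mean is 130 ms, because 90 of the 100 tests came from the lower-latency tile.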

Re-join the geometries from Brazil.

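# Join on the state name explicitly rather than relying on the shared column being detected
br_aggs <- left_join(brazil, br_aggs, by = "name_state")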

And then map.

ggplot(br_aggs) +
  geom_sf(aes(fill = avg_lat_download_ms), color = "white", lwd = 0.08) +
  scale_fill_stepsn(colors = colPal(5), labels = label_number(suffix = " ms"), n.breaks = 4, guide = guide_colorsteps(title = "")) +
  theme(
    panel.grid.minor = element_blank(),
    panel.grid.major = element_blank(),
    axis.title.x = element_text(hjust = 1),
    legend.text = element_text(color = "black"),
    axis.text = element_blank()
  ) +
  labs(
    title = "Average Download Latency Across Brazil",
    subtitle = "Ookla® Open Data Fixed Tiles | Q2 2023"
  )

Figure: Map of Average Download Latency Across Brazil, Q2 2023

This example offers one way to filter with the added centroid coordinates, but it is by no means the only approach. If you are interested in exploring more ways to interact with our data, additional tutorials are available on our GitHub page.

Ookla retains ownership of this article including all of the intellectual property rights, data, content graphs and analysis. This article may not be quoted, reproduced, distributed or published for any commercial purpose without prior consent. Members of the press and others using the findings in this article for non-commercial purposes are welcome to publicly share and link to report information with attribution to Ookla.

August 17, 2022

The World Bank Uses Ookla Data to Examine Digital Infrastructure in Latin American Countries

Adherence to public health measures such as non-pharmaceutical interventions (NPIs) was found to play a crucial role in protecting communities from the spread of COVID-19. A team of World Bank academics analyzed Ookla® Speedtest Intelligence® data alongside other publicly available data to explore the role of digital infrastructure in NPI compliance during the pandemic in their new report: “The adoption of non-pharmaceutical interventions and the role of digital infrastructure during the COVID-19 Pandemic in Colombia, Ecuador, and El Salvador.”

Highlights from the report:

  • Adherence to NPIs correlates with socioeconomic factors: It’s been established that disadvantaged communities struggled to implement movement restriction measures during the acute phases of the pandemic. These measures radically changed many people’s lives, but disadvantaged communities (in both low- and high-income countries) were disproportionately affected, most likely because people in those communities were less able to stay at home for work, school, etc.
  • Movement reduction: The research conducted in this report finds that NPIs implemented in Colombia, Ecuador, and El Salvador caused a significant drop in movement during 2020. These drops from pre-pandemic baselines were 53% in Colombia and 64% in both Ecuador and El Salvador.
  • Connections between movement reduction and digital infrastructure: Furthermore, the research found that for every 10 Mbps increase in average fixed download speed, movement reduction increases by 13% in Colombia, 4% in Ecuador, and 19% in El Salvador.
  • Digital infrastructure, socioeconomic status, and movement reduction: These findings may also correlate with higher socioeconomic status, as suggested by the fact that mobility reductions were more pronounced in larger, denser, and wealthier municipalities.
  • Further steps: This research shows that there is a significant association between the quality of digital infrastructure and adoption of NPIs, even after controlling for socioeconomic indicators. Given the disproportionate gap between low-income and wealthier areas, policies and targeted investments should aim at “closing the digital gap, improving network reliability as well as equality across communities.”

We would like to thank Nicolò Gozzi, Niccolò Comini, and Nicola Perra for their work on this project.


Ookla joined the World Bank’s Development Data Partnership platform over two years ago as part of our Ookla for Good™ initiative. We are thankful to them and to the many other organizations we partner with to improve the state of internet performance around the world. Learn more about Ookla for Good.
