Unit-average lift or cluster-average lift in cluster-randomized experiments

network experiment

In a cluster-randomized experiment, we can choose between computing a unit average or cluster average. The two averages certainly bare different units, but which one should we use if we are reporting on relative terms?

Author

Kenneth Hung

Published

March 14, 2025

In a cluster-randomized experiment, the effect is often expressed as a difference-in-mean between the two treatment groups. Certainly the two choices, unit average and cluster average, carry different units and have different interpretations as explained in Kahan et al. (2023). But what if we were to compute a lift, i.e. a ratio-of-means, does this distinction matter?

Adopting but modifying the notation in Karrer et al. (2021), suppose \(S_0\) and \(Y_0\) are the average unit count and average total metric (sum of the metric across all units in a cluster) across control clusters, and \(S_1\) and \(Y_1\) be the counterparts for the test clusters. For simpler exposition, we will also assume that there are equal number of control and test clusters. Then the unit-average lift estimate is given by \[\frac{Y_1 / S_1}{Y_0 / S_0} = \frac{Y_1}{Y_0} \cdot \frac{S_0}{S_1}\] and the cluster-average lift estimate is given by \[\frac{Y_1}{Y_0}.\] The two are only off by a factor of \(S_0 / S_1\), which asymptotically is one, unless the test treatment can somehow lead the clusters to change in size. Asymptotically both estimators are converging to the same estimand.

So the remaining question is: is one more efficient than the other? We can study the lift in log scale, which asymptotically by delta method would not change our conclusion. Then we have \[ \begin{align*} \log(\text{unit-average lift}) &= \log(Y_1 / Y_0) - (\log S_1 - \log S_0) \\ \log(\text{cluster-average lift}) &= \log(Y_1 / Y_0). \end{align*} \] More generally, for any \(\beta\), the estimator \[\log(Y_1 / Y_0) - \beta (\log S_1 - \log S_0)\] will converge to the same estimand that we can justifiably call \(\log(\text{true lift})\) now. When \(\beta = 1\) we are estimating the unit-average lift, when \(\beta = 0\) we are estimating the cluster-average lift.

This should probably remind you of regression adjustment from Lin (2013) or CUPED from Deng et al. (2013). Observing this gives some insights: there is no reason that a particular choice of \(\beta\) will always dominate the other. For example, if the metric is pretty homogenous across units, then \(Y\) scales with \(S\) and \(\beta = 1\) is probably a good choice. If the metric is on average inversely proportional to the number of units in the same cluster, then \(Y\) does not grow with \(S\) and \(\beta = 0\) is better.

Taking this idea further: if we are computing ratio-of-means, we should perform regression adjustment on \(S\) directly, which will yield an optimal choice of \(\beta\). If the treatment is not supposed to affect the number of units in a cluster, then it is a pre-experiment characteristic and should be regressed on, and post-adjustment both the unit-average lift and the cluster-average lift should be the same estimator asymptotically.

References

Deng, A., Xu, Y., Kohavi, R., and Walker, T. (2013), “Improving the sensitivity of online controlled experiments by utilizing pre-experiment data,” in Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, Rome, Italy: ACM Press, p. 123. https://doi.org/10.1145/2433396.2433413.

Kahan, B. C., Li, F., Copas, A. J., and Harhay, M. O. (2023), “Estimands in cluster-randomized trials: Choosing analyses that answer the right question,” International Journal of Epidemiology, 52, 107–118. https://doi.org/10.1093/ije/dyac131.

Karrer, B., Shi, L., Bhole, M., Goldman, M., Palmer, T., Gelman, C., Konutgan, M., and Sun, F. (2021), “Network Experimentation at Scale,” in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual Event Singapore: ACM, pp. 3106–3116. https://doi.org/10.1145/3447548.3467091.

Lin, W. (2013), “Agnostic notes on regression adjustments to experimental data: Reexamining Freedman’s critique,” The Annals of Applied Statistics, 7. https://doi.org/10.1214/12-AOAS583.