When we looked at DiD, we had data on units from 2 different cities and 2 different time periods (before & after intervention). What if we only have aggregated data on the city level?
With Synthetic Controls, we don’t need to find a single untreated unit that closely matches the treated one. Instead, we can forge our own comparison unit as a weighted average of multiple untreated units, with the weights chosen so that it best mirrors the treated unit’s characteristics.
Uber operates in markets where cash is used far more widely than credit cards. Uber charges drivers ~25% of their earnings as a service fee, so drivers who are paid in cash have to wire that share of their cash earnings to Uber, which is a hassle for them. On the other hand, drivers may prefer cash over credit payments.

Donor Pool
Weight Each Unit
choose features X that can be used to predict outcome Y and find their importances (V)
build a model to predict each unit’s pre-treatment outcome, then optimize the weights W to minimize the distance between the treated unit’s features (X_1) and the weighted average of the donor pool’s features (X_0 W)
$$ \|X_1 - X_0 W\| = \sqrt{\sum_{h=1}^{k} v_h \bigg(X_{h1} - \sum_{j=2}^{J+1} w_j X_{hj}\bigg)^2} $$
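The weight search can be sketched as a small constrained optimization. This is a minimal illustration, not a definitive implementation. It assumes the weights are non-negative and sum to one (a common constraint that these notes do not spell out), it minimizes the squared distance (which has the same minimizer as the square root), and it uses SciPy's SLSQP solver as one possible choice. The function name `fit_synthetic_weights` and all numbers are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def fit_synthetic_weights(x_treated, x_donors, v=None):
    """Find donor weights W that minimize sum_h v_h * (X_h1 - sum_j w_j X_hj)^2.

    x_treated: shape (k,)   features of the treated unit (X_1)
    x_donors:  shape (k, J) features of the donor pool (X_0), one column per donor
    v:         shape (k,)   feature importances (V); uniform if omitted
    """
    k, n_donors = x_donors.shape
    v = np.full(k, 1.0 / k) if v is None else np.asarray(v, dtype=float)

    def loss(w):
        gap = x_treated - x_donors @ w
        return float(np.sum(v * gap ** 2))

    result = minimize(
        loss,
        np.full(n_donors, 1.0 / n_donors),  # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_donors,  # assumption: no negative weights
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],  # assumption: weights sum to 1
        options={"ftol": 1e-12, "maxiter": 500},
    )
    return result.x


# Hypothetical data: 4 features (rows) for 3 donor cities (columns).
x_donors = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 1.0, 2.0],
    [2.0, 5.0, 1.0],
    [3.0, 3.0, 6.0],
])
# Build a treated unit that is exactly a 60/40 mix of the first two donors.
true_w = np.array([0.6, 0.4, 0.0])
x_treated = x_donors @ true_w

weights = fit_synthetic_weights(x_treated, x_donors)
print(np.round(weights, 3))
```

Because the toy treated unit is constructed from the donors, the fitted weights should land close to the mix used to build it. With real data the fit is only approximate, and the quality of the pre-treatment match is worth inspecting before trusting the result.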
After obtaining W, use the weighted average of the donor pool’s outcomes to project what the treated unit’s post-treatment outcome would have been if the treatment hadn’t occurred:
$$ \sum_{j=2}^{J+1} w_j Y_{jt} $$
This “Synthetic Control” is the best guess for the counterfactual reality that would have occurred in the treated city.
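As a rough sketch of this projection step: once the weights are known, the synthetic control is just a matrix product, and the estimated effect is the gap between the observed and synthetic outcomes after the intervention. The helper name `synthetic_control_gap`, the weights, and the outcome values below are all made up for illustration.

```python
import numpy as np


def synthetic_control_gap(y_treated, y_donors, weights):
    """Return the synthetic outcome path and the treated-minus-synthetic gap.

    y_treated: shape (T,)   observed outcomes of the treated unit
    y_donors:  shape (T, J) observed outcomes of the donor pool, one column per donor
    weights:   shape (J,)   donor weights W obtained beforehand
    """
    synthetic = y_donors @ weights  # sum_j w_j * Y_jt for every period t
    return synthetic, y_treated - synthetic


# Hypothetical outcomes for 3 periods and 3 donor cities.
y_donors = np.array([
    [10.0, 20.0, 30.0],
    [12.0, 22.0, 28.0],
    [14.0, 24.0, 26.0],
])
weights = np.array([0.5, 0.3, 0.2])  # assumed to come from the weighting step
y_treated = np.array([17.0, 18.0, 25.0])
t0 = 2  # index of the first post-treatment period (assumption)

synthetic, gap = synthetic_control_gap(y_treated, y_donors, weights)
estimated_effect = gap[t0:].mean()  # average post-treatment gap
print(synthetic, estimated_effect)
```

The pre-treatment part of `gap` is a useful diagnostic: if it is not close to zero, the synthetic control does not track the treated unit well and the post-treatment gap is hard to interpret.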
Hypothesis Test to check the effect of the treatment
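These notes don't say which test is meant. One common choice with synthetic controls is a placebo (permutation) test: pretend each unit in turn was the treated one, refit the weights on the remaining units, and check how extreme the real treated unit's post/pre fit ratio is compared with the placebos. The sketch below assumes that approach. For brevity it matches on pre-treatment outcomes only, and it uses simulated data with an artificial effect added to unit 0. All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def fit_weights(y_target_pre, y_donors_pre):
    """Non-negative weights summing to 1 that best match pre-treatment outcomes."""
    n_donors = y_donors_pre.shape[1]
    result = minimize(
        lambda w: np.sum((y_target_pre - y_donors_pre @ w) ** 2),
        np.full(n_donors, 1.0 / n_donors),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_donors,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


def rmspe_ratio(outcomes, unit, t0):
    """Post/pre RMSPE ratio when `unit` is treated as if it received the treatment."""
    donors = np.delete(outcomes, unit, axis=1)  # every other unit acts as a donor
    w = fit_weights(outcomes[:t0, unit], donors[:t0])
    gap = outcomes[:, unit] - donors @ w
    pre = np.sqrt(np.mean(gap[:t0] ** 2))
    post = np.sqrt(np.mean(gap[t0:] ** 2))
    return post / pre


def placebo_p_value(outcomes, treated, t0):
    """Share of units whose ratio is at least as large as the treated unit's."""
    n_units = outcomes.shape[1]
    ratios = np.array([rmspe_ratio(outcomes, j, t0) for j in range(n_units)])
    return np.mean(ratios >= ratios[treated]), ratios


# Simulated data (assumption): 12 periods, 8 units, treatment starts at period 8.
rng = np.random.default_rng(0)
n_periods, n_units, t0 = 12, 8, 8
offsets = rng.uniform(0.0, 2.0, size=n_units)
trend = 0.5 * np.arange(n_periods)[:, None]
outcomes = trend + offsets[None, :] + rng.normal(0.0, 0.1, (n_periods, n_units))
outcomes[t0:, 0] += 10.0  # artificial treatment effect on unit 0

p_value, ratios = placebo_p_value(outcomes, treated=0, t0=t0)
print(p_value, np.round(ratios, 2))
```

With N units, the smallest p-value this procedure can produce is 1/N, so a small donor pool limits how strong the evidence can be.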