
Main Workshop Day 2 (SantAnna matching) - Shared screen with speaker view
Scott Cunningham
23:43
Now fingers crossed we can figure out zoom
Scott Cunningham
23:45
zoom chat
Scott Cunningham
25:01
Soon recordings will be posted
Scott Cunningham
25:15
This is a special treat -- Pedro is a great econometrician and teacher
Scott Cunningham
25:37
I'll do my best to answer questions and will re-paste them publicly so we can all read them, fyi
Scott Cunningham
25:56
https://github.com/Mixtape-Sessions/Causal-Inference-1
Scott Cunningham
26:24
Remember you can access these slides here
Scott Cunningham
26:25
https://www.dropbox.com/sh/aaxwxii2oa2v2g2/AAANzdSwTAlkyO4OK8lAEWx_a?dl=0
Scott Cunningham
41:57
Your outcome (person i) is based on your treatment assignment (person i), not someone else.
Scott Cunningham
42:05
This is what Rubin meant by interference
Scott Cunningham
50:30
I just reminded him to repeat questions for us
Scott Cunningham
50:38
They are asking about local aggregations
Scott Cunningham
52:14
Someone asked difference between ITT and ATE
Scott Cunningham
52:26
Someone is asking about quantile treatment. But Pedro is repeating
Scott Cunningham
54:11
I’ll be right back
Scott Cunningham
01:13:01
okay i'm back
Scott Cunningham
01:31:13
So what he is saying is that we are in a world where randomization DID NOT occur. Not that it CANNOT occur -- it DID NOT occur
Scott Cunningham
01:31:35
People chose treatment bc they thought it helped them. That means they chose action bc Y1>Y0, which violates independence
Scott Cunningham
01:31:48
So now he's building out the selection on observables case
Scott Cunningham
01:32:05
Where ppl with covariates are then randomized (“as good as random”).
Scott Cunningham
01:45:13
m is I think referring to E[Y1|X] or E[Y0|X]
Scott Cunningham
01:45:41
I think this is Heckman's notation using m.
Scott Cunningham
01:48:10
okay i think that makes sense. i wonder if it's notation to use m bc it's basically just equal to E[Y1|X] but bc he's pulling it out of the conditions, maybe he moved to m0
Scott Cunningham
01:48:15
m1(x) rather
Scott Cunningham
01:50:01
oh, i think i was wrong
Scott Cunningham
01:50:12
m1(x) is E[Y|X,D=1]
Scott Cunningham
01:50:32
It's equal to Y1 under the switching equation, but he's technically already moved to realized outcomes when he uses m1(x)
Scott Cunningham
02:02:40
Y0 ⊥ D | X
Scott Cunningham
02:02:44
So what does that mean?
Scott Cunningham
02:02:57
That means my decision to get a PhD is independent of how much I made if I don't get a PhD
Scott Cunningham
02:03:02
It's unrelated to Y0
Scott Cunningham
02:03:11
But it can be connected to Y1
Scott Cunningham
02:03:23
So whichever potential outcome is independent, that's the one you're going to impute using matching
Scott Cunningham
02:03:35
That's why we get ATT here if we have independence with respect to Y0
Scott Cunningham
02:03:55
You're stuck with Y1, so you impute the missing Y0, and since it's independent you can use Pedro's lines of reasoning
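A minimal sketch of the imputation logic being described, with made-up toy data (exact matching on a single discrete X; not code from the session):

```python
# Under Y0 _||_ D | X, we impute each treated unit's missing Y0 from a
# control unit with the same X, then average Y1 - imputed Y0 to get the ATT.
# All values are made up for illustration.
treated  = [{"x": "BA", "y1": 60}, {"x": "HS", "y1": 40}]
controls = [{"x": "BA", "y0": 50}, {"x": "HS", "y0": 35}]

# look up a control's Y0 for each covariate cell
y0_by_x = {c["x"]: c["y0"] for c in controls}

# ATT: average of (observed Y1 - imputed Y0) over the treated units
att = sum(t["y1"] - y0_by_x[t["x"]] for t in treated) / len(treated)
print(att)  # (60-50 + 40-35) / 2 = 7.5
```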
Scott Cunningham
02:09:02
What's kind of cool that we are seeing today is that matching and weighting end up being connected, and what's then interesting is that all of this sort of leads naturally into diff-in-diff and synthetic control
Scott Cunningham
02:09:15
SO MUCH causal inference involves weighting to impute missing potential outcomes
Scott Cunningham
02:10:55
This is one of the things I have slowly learned over the last 10 years
Scott Cunningham
02:11:14
Pedro really advocates hard for this:
Scott Cunningham
02:11:22
1. here's the target (the ATT, for instance)
Scott Cunningham
02:11:30
2. here's the assumptions that will let me estimate it
Scott Cunningham
02:11:34
3. choose the weights
Scott Cunningham
02:19:34
https://www.dropbox.com/s/zgtmnnqr1h9oobg/Cunningham%20workshop%20document.docx?dl=0
Scott Cunningham
02:19:36
For later
Farah Mejri
02:21:15
Thanks
Scott Cunningham
02:23:21
so the weights are basically gender ratios if X is sex and k is a specific sex group
Scott Cunningham
02:23:38
nk/n is "total males divided by total population"
Scott Cunningham
02:23:51
n1k/n1 is "total males in treatment group divided by total population of treatment group"
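A tiny sketch of how these stratum shares weight stratum-specific effects into an aggregate, with made-up numbers (not from the session); it also shows how opposite-signed effects can partially cancel:

```python
# stratum k -> (n1k = treated count in stratum, effect_k = within-stratum effect)
# All numbers are invented for illustration.
strata = {"male": (60, 2.0), "female": (40, -1.0)}

n1 = sum(n1k for n1k, _ in strata.values())  # total treated units

# ATT-style aggregation: weight each stratum by its treated share n1k/n1
att = sum((n1k / n1) * effect for n1k, effect in strata.values())
print(round(att, 3))  # 0.6*2.0 + 0.4*(-1.0) = 0.8
```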
Scott Cunningham
02:25:17
someone asked if you could have positive effects for males, but negative effects for females (different strata) and they cancel out in aggregation
Scott Cunningham
02:25:19
answer is yes
Scott Cunningham
02:25:22
in practice
Monika Avila Marquez
02:27:02
Why are the weights random?
Scott Cunningham
02:27:33
I didn't quite catch that, but if the units are random variables, then their shares will be too.
Scott Cunningham
02:27:37
Probably something like that
Monika Avila Marquez
02:27:54
Thank you. Would it be possible to ask Pedro?
Scott Cunningham
02:27:57
Yes
Monika Avila Marquez
02:28:03
Thank you
Scott Cunningham
02:28:46
right - that's what i thought. it's sampling based
Monika Avila Marquez
02:28:52
Yes thanks
Scott Cunningham
02:45:10
yes, j is the matched unit
Qixin Ye
02:45:12
A request: Is it possible for the speaker to use the mouse instead of the pointer? It's difficult for online audience to follow the pointer. Thanks!
Naoka Carey
02:45:43
Yes - can't see pointer at all
Scott Cunningham
02:45:47
He's pointing at middle line
Scott Cunningham
02:45:56
now the bottom line
Scott Cunningham
02:46:01
let me remind him
Scott Cunningham
02:47:47
w is the weight
Scott Cunningham
02:48:10
that's why w sums to 1
Scott Cunningham
02:48:22
oh wait
Scott Cunningham
02:48:23
hold on
Hongyu Fu
02:49:00
is w an indicator of whether the unit is matched?
Scott Cunningham
02:49:39
so w is basically marking the matched unit
Scott Cunningham
02:49:46
if there is nearest neighbor, you've chosen 1 match
Scott Cunningham
02:49:51
that's why it's w=1
Scott Cunningham
02:49:59
but if several were chosen, then the w's sum to 1
Scott Cunningham
02:50:10
but it is a weight, which if there is one match is just equal to 1
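A minimal sketch of the weight w being described, assuming M-nearest-neighbor matching and made-up distances (not the speaker's code):

```python
# Each of the M nearest controls gets w = 1/M, so the weights sum to 1;
# with a single nearest neighbor (M = 1), w = 1. Distances are invented.
controls = {"a": 0.1, "b": 0.3, "c": 0.9}   # distance of each control to the treated unit
M = 2

nearest = sorted(controls, key=controls.get)[:M]   # the M closest controls
w = {j: (1 / M if j in nearest else 0.0) for j in controls}

print(w)               # {'a': 0.5, 'b': 0.5, 'c': 0.0}
print(sum(w.values())) # 1.0
```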
Scott Cunningham
02:50:45
so this is all getting into distance for "nearest"
Scott Cunningham
02:50:57
when you have one covariate, distance is with respect to the scale of that covariate
Scott Cunningham
02:51:07
but when there's many variables, that's not the case
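A sketch of why scale matters once there are several covariates, with invented data and assumed scales (not from the session):

```python
import numpy as np

# With more than one covariate, raw Euclidean distance is dominated by
# whichever variable has the largest scale; here income (dollars) swamps
# age (years) unless we standardize. All numbers, including the assumed
# standard deviations, are made up.
target = np.array([30.0, 50000.0])           # (age, income) of a treated unit
cands  = np.array([[31.0, 52000.0],          # close in age, off in income
                   [60.0, 50100.0]])         # far in age, close in income
sd     = np.array([10.0, 5000.0])            # assumed scale of each covariate

raw    = np.linalg.norm(cands - target, axis=1)
scaled = np.linalg.norm((cands - target) / sd, axis=1)

print(raw.argmin())     # 1 -> the 60-year-old "wins" on raw distance
print(scaled.argmin())  # 0 -> standardizing restores the sensible match
```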
Monika Avila Marquez
02:54:57
Could you repeat the question. please?
Scott Cunningham
02:57:06
question was asked abt matching with replacement versus no replacement
Scott Cunningham
02:57:10
affects confidence intervals
Monika Avila Marquez
02:58:24
Thanks
Scott Cunningham
03:00:34
he asked abt "close neighbors" who are actually far away
Scott Cunningham
03:00:38
pedro brings up caliper
Monika Avila Marquez
03:00:48
Thanks
Scott Cunningham
03:00:51
pedro emphasizes this a lot
Scott Cunningham
03:01:01
how choices that start trimming change the parameter
Scott Cunningham
03:02:07
hope you guys could hear me
Monika Avila Marquez
03:02:26
I am sorry, could he repeat why there is need to do trimming?
Scott Cunningham
03:03:14
he was talking about trimming that happens when setting the caliper
Scott Cunningham
03:03:29
so you'll set a caliper (max distance to travel to find a neighbor)
Scott Cunningham
03:03:39
and if no one is there, you drop that treated unit
Scott Cunningham
03:03:51
you do that and each time it’s further from the ATT
Scott Cunningham
03:04:03
Recall Pedro stresses starting with the estimand
Scott Cunningham
03:04:08
what is the parameter? Is it the ATT?
Scott Cunningham
03:04:19
Then anytime you drop a unit for matching, you move away from the ATT
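A small sketch of the caliper-trimming point, using made-up propensity scores (not from the session):

```python
# Treated units with no control inside the caliper are dropped, so the
# estimate covers only the retained treated units -- no longer the full
# treated population, hence no longer quite the ATT. Data are invented.
treated  = [0.30, 0.50, 0.95]    # propensity scores of treated units
controls = [0.28, 0.52, 0.55]
caliper  = 0.10                  # max allowed distance to a match

kept = []
for t in treated:
    dists = [abs(t - c) for c in controls]
    if min(dists) <= caliper:
        kept.append((t, controls[dists.index(min(dists))]))

print(len(kept))  # 2 -> the 0.95 treated unit found no match and was dropped
```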
Monika Avila Marquez
03:05:26
Thank you
Hongyu Fu
03:06:33
Do we allow repeated sampling when picking up “close” matches?
Scott Cunningham
03:06:53
He said earlier to allow repeated sampling
Scott Cunningham
03:07:08
When you don't, you can get weird results that are sensitive to the random sort order of the data
Scott Cunningham
03:07:22
“Aren't we assuming we can divide into treatment and control?”
Hongyu Fu
03:07:29
I missed that part. Thanks.
Laura Nicolae
03:10:42
Is there a paper we can read to learn more about the bias introduced by matching on covariates and the bias correction?
Monika Avila Marquez
03:11:12
If you use a nonparametric model, is that better?
Scott Cunningham
03:11:16
“why propensity score over OLS"
Scott Cunningham
03:15:47
okay we have a break
Scott Cunningham
03:15:54
until 1:15 CST
Scott Cunningham
03:20:54
hmmm
Scott Cunningham
03:46:37
Sorry - I don't have Dr. Rubin's talk
Scott Cunningham
03:46:49
He may post it later
Naoka Carey
03:47:27
I think there is at least a version of it in the Dropbox folder with his name
Monika Avila Marquez
03:49:15
Yes, it is in the dropbox
Monika Avila Marquez
03:51:33
The file
Monika Avila Marquez
03:51:34
https://www.dropbox.com/sh/aaxwxii2oa2v2g2/AAAlulSGoQqp_GKDPIrh-yFEa/2022%20Slides%20and%20Materials/Day%201.%20Rubin/2022-Rubin-Tues-lunch-Mu%3Bt-Imput-Simple-Example.for%20valid%20inferences.pdf?dl=0
Scott Cunningham
04:18:08
everyone I have to jump off; I'll be back in a half hour.
Laura Nicolae
04:22:56
Can you bootstrap pi* if you don’t want to assume a distribution for the missing values (e.g. not Bernoulli bc it’s not a dummy variable)?
Laura Nicolae
04:24:42
And what’s the intuition for taking p = the mean value of the variable over the entire dataset rather than conditioning on covariates?
Qixin Ye
04:36:36
So is the goal here to infer \pi only, or to fill in each missing individual's gender as accurately as possible?
Yitong Hu
05:31:10
If I use psmatch2 to do PSM, do I need to do regression with all covariates again after the automatic estimation? In case the matching is found to have significant differences among some covariates.
Scott Cunningham
05:32:58
I think you want to use teffects, not psmatch2. psmatch2 got superseded and teffects became an internal Stata command.
Scott Cunningham
05:33:16
but i'll ask
Scott Cunningham
05:34:24
https://www.ssc.wisc.edu/sscc/pubs/stata_psmatch.htm
Yitong Hu
05:34:31
Thank you! Then if the teffects results are not perfectly matched, should I do regression again manually myself?
Scott Cunningham
05:34:35
For many years, the standard tool for propensity score matching in Stata has been the psmatch2 command, written by Edwin Leuven and Barbara Sianesi. However, Stata 13 introduced a new teffects command for estimating treatment effects in a variety of ways, including propensity score matching. The teffects psmatch command has one very important advantage over psmatch2: it takes into account the fact that propensity scores are estimated rather than known when calculating standard errors. This often turns out to make a significant difference, and sometimes in surprising ways. We thus strongly recommend switching from psmatch2 to teffects psmatch, and this article will help you make the transition.
Tommaso Tamburelli
05:37:05
doesn't it also work against you in a good case
Scott Cunningham
05:38:42
which part tommaso?
Scott Cunningham
05:39:00
the normalization of the weights?
Scott Cunningham
06:40:37
You're going to see this show up again tomorrow when he discusses Andrew Goodman-Bacon's decomposition of the twoway fixed effects diff in diff model
Monika Avila Marquez
06:46:29
If you interact the dummy of the treatment with the strata indicators, and then you perform a weighted average of the coefficients. Then you obtain the parameter of interest, right?
Yitong Hu
06:52:47
do you recommend matching 1:1 instead of 1:n? Is 1:1 more balanced?
Scott Cunningham
06:53:39
I don't know if there is a theoretical answer to that, Yitong
Scott Cunningham
06:53:50
Most do use a small number of matches though.
Scott Cunningham
06:53:57
M=2 or 3
Yitong Hu
06:54:32
do you think reduced sample size by matching 1:1 is an issue?
Scott Cunningham
06:54:42
sampling with replacement?
XIANGYU WANG
06:55:18
Do we have codes for this section-“Simulation Exercise”?
Scott Cunningham
06:55:22
if you sample with replacement, then you'll have a matched sample of the same starting size, but it depends on whether you place a caliper on distance.
Scott Cunningham
06:55:32
Not sure Xiangyu but I'll find out
XIANGYU WANG
06:55:52
👍
Yitong Hu
06:56:28
I didn't quite get "sampling with replacement" when he introduces it... what's the opposite of sampling with replacement?
Scott Cunningham
06:57:06
So sampling without replacement would be: unit i gets matched to person j. Then we go to the second unit. Say the nearest neighbor for the second unit is that same person j.
Scott Cunningham
06:57:26
With sampling with replacement, you’d put person j back in the donor pool and could use her as a match however many times she is a nearest neighbor
Scott Cunningham
06:57:42
But if you remove her ("without replacement"), by definition you are matching to someone who is a worse match
Scott Cunningham
06:57:55
So it increases bias but reduces variance
Scott Cunningham
06:58:05
Typically, Pedro recommended sampling with replacement
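A minimal sketch of the person-j example above, with made-up covariate values (not code from the session), showing why matching without replacement forces a worse second match:

```python
# Two treated units share the same best control (1.05). With replacement,
# both reuse it; without replacement, the second treated unit is forced
# onto a far-away control (more bias, less variance). Values are invented.
treated  = [1.0, 1.1]
controls = [1.05, 3.0]

def match(treated, controls, replace):
    pool, matches = list(controls), []
    for t in treated:
        j = min(range(len(pool)), key=lambda i: abs(t - pool[i]))  # nearest neighbor
        matches.append(pool[j])
        if not replace:
            pool.pop(j)   # without replacement: j leaves the donor pool
    return matches

print(match(treated, controls, replace=True))   # [1.05, 1.05] -> reuse the good match
print(match(treated, controls, replace=False))  # [1.05, 3.0]  -> second unit gets a worse match
```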
Monika Avila Marquez
06:59:16
Then, what I propose is good right?
Monika Avila Marquez
06:59:21
I didn't hear well
Scott Cunningham
06:59:32
What you are proposing is called regression adjustment
Monika Avila Marquez
06:59:38
Yeah,
Monika Avila Marquez
07:00:06
Which would be fine if you perform a weighted average
Monika Avila Marquez
07:00:17
of the coefficients of each strata
Scott Cunningham
07:00:21
https://www.dropbox.com/s/t2k93k6zry09vj9/2022-SantAnna-Matching-2.pdf?dl=0
Monika Avila Marquez
07:00:22
I think
Monika Avila Marquez
07:00:27
Thanks
Scott Cunningham
07:02:15
Doubly robust is going to show up tomorrow again in his diff-in-diff talk fyi for the Callaway and Sant'Anna estimator
Scott Cunningham
07:02:25
and the Sant'Anna and Zhao estimator
Yitong Hu
07:02:37
oh I got it! Thank you
Laura Nicolae
07:03:40
This might’ve been covered (I can’t see earlier in chat) but is there a simple reweighting fix for OLS so that we do get ATE/ATT?
Scott Cunningham
07:08:05
The method Monika mentioned — regression adjustment — was one he discussed in which you saturate the regression model interacting D with X, but I need to follow up one more time with him
Amit Mehra
07:08:29
For zoom participants - It will really help if he uses the cursor instead of the pointer
Laura Nicolae
07:09:35
Thank you! Is it true that that one only makes sense if you have discrete regressors with a finite number of values?
Scott Cunningham
07:10:09
Amit - I told him, but I think he keeps naturally gravitating towards this style of presenting. I apologize for that.
Scott Cunningham
07:10:26
It's because he's standing in the front of the room.
Scott Cunningham
07:10:38
I'm going to see if I can somehow get into screens tomorrow
Amit Mehra
07:10:49
Thanks
Scott Cunningham
07:14:20
Rubin is asking whether there is a theorem proving this
Scott Cunningham
07:14:35
I would unmute, but it causes feedback
Scott Cunningham
07:14:45
He is challenging this claim that it's doubly robust
Scott Cunningham
07:15:28
He and Pedro are debating it
Scott Cunningham
07:17:02
If the regression model is right, can we use parallelization methods?
Scott Cunningham
07:20:04
Laura, I haven't forgotten your question
Scott Cunningham
07:23:37
I’ve got to close my laptop
Monika Avila Marquez
07:23:54
Regression Adjustment
Monika Avila Marquez
07:23:56
https://www.tandfonline.com/doi/full/10.1080/07474938.2020.1824732?casa_token=tc_-VuhVAtsAAAAA%3A5_NoKFcDi4-j9CEtBNoxm_DWKEwFNntaLBjo_Zoq3SanCH0Iycjqn9pqyD2ZfARcMDcMU_uqf1nbcw
Scott Cunningham
07:25:51
Laura can you ask your question again
Scott Cunningham
07:25:58
I had to close my laptop
Laura Nicolae
07:30:41
No worries and thanks Monika! My question was whether this adjustment relies on creating indicators Z for all the X values and therefore whether it only works when you have discrete regressors that take on a finite number of values
Laura Nicolae
07:30:47
Which is probably not true most of the time
Scott Cunningham
07:32:16
Ok - tomorrow I’m going to have a new system of relaying questions for the speaker
Scott Cunningham
07:33:30
As long as you recenter the covariates and interact, the weird variance weighting doesn't occur and you can recover the ATT or ATE, whichever
Laura Nicolae
07:37:00
Thank you :) I was misunderstanding and thought it’s to interact with indicators, which seemed weird
Scott Cunningham
07:37:03
Ok
Scott Cunningham
07:37:26
Only interact — the original Angrist work saturated and interacted, but it's not necessary, he said
Scott Cunningham
07:37:40
A lot of this is modeling the heterogeneity
Scott Cunningham
07:37:58
So you’re interacting continuous X after centering with D
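A sketch of the centered-interaction regression being described, on simulated data with made-up numbers (an illustration of the mechanics, not the speaker's code; here D is randomly assigned for simplicity):

```python
import numpy as np

# Regress Y on D, centered X, and D * (X - mean(X)). With X centered,
# the coefficient on D recovers the average treatment effect even when
# the effect varies with X. All parameters are invented.
rng = np.random.default_rng(0)
n = 10_000
X = rng.normal(size=n)
D = rng.integers(0, 2, size=n)
# heterogeneous effect 2 + 1*X, so the ATE is 2 (at E[X] = 0)
Y = 0.5 * X + D * (2.0 + 1.0 * X) + rng.normal(size=n)

Xc = X - X.mean()                                   # recenter the covariate
design = np.column_stack([np.ones(n), D, Xc, D * Xc])
beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
print(beta[1])   # close to 2.0, the ATE
```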
Monika Avila Marquez
07:38:18
Interacting with indicators if the variable is categorical
Scott Cunningham
07:38:25
And Pedro does support having more choice over the weighting
Laura Nicolae
07:38:39
Thank you!
Boyu Shen
07:38:49
Thank you Scott!
Scott Cunningham
07:38:56
So one of his complaints with OLS is that the estimator is creating weights that may not be your preferred ones
Scott Cunningham
07:39:23
I can ask Don's question fyi
Scott Cunningham
07:47:59
Bernie is asking what he thinks abt methods trying to get exact balance
Scott Cunningham
07:48:47
Diagnostics that are transparent. Do it a few ways
Scott Cunningham
08:10:34
I’m going to talk to Bernie and change up how we get your questions answered more efficiently. And we are talking to speakers about using cursors too
DANDAN GU
08:11:25
Thanks Scott!
Scott Cunningham
08:14:42
When calculating scores, what variables can we include and which ones not?
Scott Cunningham
08:14:47
That’s the question
Scott Cunningham
08:14:54
Propensity scores
Scott Cunningham
08:22:44
Are there theorems on bias with propensity scores when unconfoundedness isn't satisfied?