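10. Research question
Patient: inpatients aged 18 years or older with acute myocardial infarction who were prescribed an antipsychotic
Exposure: use of oral haloperidol
Comparison: use of oral risperidone, olanzapine, or quetiapine
Outcome: in-hospital death within 7 days of starting the prescription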
21. Using the propensity score
Patients who received an atypical antipsychotic were
matched to patients who received haloperidol using
a 1:1 nearest neighbor matching algorithm with a
caliper of 0.2 of the standard deviation of the
propensity score on the logit scale.
Key settings: nearest neighbor matching, 1:1 ratio, caliper of 0.2 SD
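The matching rule quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name, the toy propensity scores, the greedy processing order, matching without replacement, and the use of the pooled standard deviation of the logit are all assumptions made for the example.

```python
import math
from statistics import stdev

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def nearest_neighbor_match(ps_treated, ps_control, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbor matching on the logit of the propensity score.

    Returns (treated_index, control_index) pairs. Each control is used at most
    once (matching without replacement, an assumption of this sketch).
    """
    lt = [logit(p) for p in ps_treated]
    lc = [logit(p) for p in ps_control]
    # Caliper: caliper_sd times the SD of the logit PS in the pooled sample (assumed).
    caliper = caliper_sd * stdev(lt + lc)
    available = set(range(len(lc)))
    pairs = []
    for i, x in enumerate(lt):
        if not available:
            break
        j = min(available, key=lambda k: abs(lc[k] - x))  # closest unused control
        if abs(lc[j] - x) <= caliper:                      # accept only within the caliper
            pairs.append((i, j))
            available.remove(j)
    return pairs

# Hypothetical propensity scores
ps_treated = [0.30, 0.55, 0.90]
ps_control = [0.10, 0.25, 0.32, 0.52, 0.70]
pairs = nearest_neighbor_match(ps_treated, ps_control)
print(pairs)
```

In this toy data the third treated subject (propensity score 0.90) has no control within the caliper and stays unmatched, which is how a caliper trades sample size for closer matches.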
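34. Estimates differ by type of effect × method
Effect of CPAP in acute decompensated heart failure (use vs. non-use)
Method and type of effect | n | Difference in survival (95% confidence interval)
Matching, ATT | 952 | 0.03 (−0.02, 0.08)
Weighting, ATT | 4953 | 0.02 (0.01, 0.02)*
Weighting, ATE | 4953 | 0.05 (−0.01, 0.12)
ATE = average treatment effect; ATT = average treatment effect on the treated
Pirracchio R et al: Stat Methods Med Res. 2016 Oct;25(5):1938-1954.
The type of effect should be chosen before the analysis.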
35. Two factors that make estimates differ substantially by type of effect
Violation or near violation of the positivity assumption
Non-uniformity of the treatment effect across the propensity score strata
Pirracchio R et al: Stat Methods Med Res. 2016 Oct;25(5):1938-1954.
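The second factor can be made concrete with a small calculation. The sketch below uses two hypothetical propensity score strata with invented sizes and stratum-specific effects; it only shows why the ATE and the ATT diverge when the effect is larger in the stratum where treatment is more common.

```python
# Hypothetical strata: size, propensity score, and stratum-specific effect
# (difference in survival). All numbers are invented for illustration.
strata = [
    {"n": 100, "ps": 0.2, "effect": 0.00},
    {"n": 100, "ps": 0.8, "effect": 0.10},
]

def average_effect(strata, weight):
    """Weighted average of stratum-specific effects."""
    total_weight = sum(weight(s) for s in strata)
    return sum(weight(s) * s["effect"] for s in strata) / total_weight

# ATE: every subject counts equally, so strata are weighted by their size.
ate = average_effect(strata, lambda s: s["n"])
# ATT: only treated subjects count, so strata are weighted by the expected
# number of treated subjects (size times propensity score).
att = average_effect(strata, lambda s: s["n"] * s["ps"])

print(round(ate, 3), round(att, 3))
```

Here the ATE is 0.05 and the ATT is 0.08: the two estimands answer different questions, which is why the target effect should be fixed before the analysis.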
36. Reporting examples
Average treatment effect (ATE)
In our study, the insured were men with health insurance whilst the
counterfactual group were those without insurance coverage. Under ideal
conditions, the effective strategy would be to obtain the average
effect of insurance coverage on prostate cancer screening, also known as
the average treatment effect (ATE).[1]
Average treatment effect on the treated (ATT)
The question of interest was whether the treated group (the BIC
[Breakfast in the Classroom] schools) had different outcomes than they
would have if not provided with the BIC program or the average
treatment effect among the treated;[2]
[1] Kangmennaang J, Luginaah I: J Cancer Epidemiol. 2016;2016:7284303.
[2] Anzman-Frasca S et al: JAMA Pediatr. 2015 Jan;169(1):71-7.
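44. Points to note when selecting covariates
① Covariates must be true pretreatment variables [1]
② Prioritize including more covariates [2]
③ Select covariates that are measured reliably [3]
④ When categorizing a covariate, use clinically meaningful cut-offs [4]
⑤ Where a nonlinear relationship is expected, add quadratic and cubic terms [4]
⑥ Run sensitivity analyses, for example by varying the covariate assessment period [1]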
[1] Jackson JW et al: Curr Epidemiol Rep. 2017 Dec;4(4):271-280.
[2] Ali MS et al: Front Pharmacol. 2019 Sep 18;10:973.
[3] Harris H, Horst SJ: Pract Assess Res Eval. (2016) 21:1–11
[4] Yang JY et al: Gastrointest Endosc. 2019 Sep;90(3):360-369.
45. Reporting example ①: rationale for covariate selection
We identified potential confounders that were plausibly associated with
both the choice of antipsychotic and the risk of in-hospital death based
on clinical knowledge, using information from hospital admission to the
day before initiation of an antipsychotic. In addition to hospital
characteristics (teaching, urban), the covariates included patient
characteristics and conditions that were plausibly associated with the
choice of antipsychotic and the risk of in-hospital death (see
supplementary table S1 for the list of covariates).
Park Y et al: BMJ. 2018 Mar 28;360:k1218.
50. Comparative studies of statistical models ①
Logistic regression vs. CART vs. bagging vs. random forest vs. neural network vs. naive Bayes [1]
Random forest performed best
Logistic regression and neural network also performed well
Logistic regression vs. boosted CART vs. covariate-balancing propensity score (CBPS) [2]
CBPS performed well
Logistic regression also performed well when covariate balance was assessed
[1] Cannas M, Arpino B: Biom J. 2019 Jul;61(4):1049-1072.
[2] Wyss R et al: Am J Epidemiol. 2014 Sep 15;180(6):645-55.
51. Comparative studies of statistical models ②
Logistic regression vs. CART vs. pruned CART vs. bagged CART vs. random forest vs. boosted CART [1]
Under either nonlinearity or nonadditivity alone, all methods performed at an acceptable level
Under both nonlinearity and nonadditivity, boosted CART and random forest are recommended
Logistic regression vs. CART vs. pruned CART vs. neural network [2]
Logistic regression is highly robust
[1] Lee BK et al: Stat Med. 2010 Feb 10;29(3):337-46.
[2] Setoguchi S et al: Pharmacoepidemiol Drug Saf. 2008 Jun;17(6):546-55.
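52. Roles of variables in propensity score estimation
Dependent variable: treatment variable (Z) (treated/control)
Independent variables: covariates (X)
Estimated propensity score (ê_i)
Predicted value: ê_i = P(Z_i = 1 | X_i), the predicted probability of belonging to the treated group
Characteristics
Possible range: 0 to 1
Size: same as the sample size
[Diagram: covariates (X) predict the treatment variable (Z); ê_i is the predicted probability of the treated group]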
Austin PC et al: Analysis of observational health care data using SAS (pp51-84). SAS press. 2010.
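The variable roles above can be sketched as a logistic regression of the treatment variable on the covariates. The example below is a toy written in plain Python so that it runs without extra packages; the function names, the gradient-ascent fitting routine, and the data are assumptions for illustration, and a real analysis would normally use a statistical package.

```python
import math

def predict(beta, x):
    """Predicted probability of treatment, e_hat = P(Z = 1 | X = x)."""
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-eta))

def fit_logistic(X, z, lr=0.5, n_iter=2000):
    """Fit a logistic regression of z on X by batch gradient ascent.

    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, zi in zip(X, z):
            resid = zi - predict(beta, xi)   # observed minus predicted treatment
            grad[0] += resid
            for j in range(p):
                grad[j + 1] += resid * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical data: one binary covariate X and the treatment indicator Z
X = [[0], [0], [0], [0], [1], [1], [1], [1]]
z = [0, 0, 0, 1, 0, 1, 1, 1]

beta = fit_logistic(X, z)
e_hat = [predict(beta, x) for x in X]   # one propensity score per subject
print([round(e, 3) for e in e_hat])
```

The result has one score per subject, each between 0 and 1. With a single binary covariate the fitted scores approach the observed treated proportions in each covariate group (1/4 and 3/4 in this toy data).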
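53. Propensity scores in an observational study
ID | Treatment variable (Zi) | Propensity score (ei)
1 | Control | 0.2
2 | Treated | 0.2
3 | Control | 0.3
4 | Treated | 0.3
... | ... | ...
N | Control | 0.4
The propensity score varies from person to person.
People with similar propensity scores are likely to have similar covariate patterns.
Ali MS et al: Front Pharmacol. 2019 Sep 18;10:973.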
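54. Propensity scores in an RCT
ID | Treatment variable (Zi) | Propensity score (ei)
1 | Control | 0.5
2 | Treated | 0.5
3 | Control | 0.5
4 | Treated | 0.5
... | ... | ...
N | Control | 0.5
The propensity score is fixed by the study design.
Ali MS et al: Front Pharmacol. 2019 Sep 18;10:973.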
58. Reporting example ①: assessment after propensity score estimation
The largely overlapping distributions of propensity scores (see
supplementary figure S2) suggest that haloperidol and atypical
antipsychotics were used interchangeably in many instances, judged by
the measured covariates.
[Figure: propensity score distributions; legend: haloperidol, atypical antipsychotics]
Park Y et al: BMJ. 2018 Mar 28;360:k1218.
59. Reporting example ②: logistic regression
The propensity score for fluoroquinolone exposure was estimated by a
logistic regression model, including 47 covariates as predictors,
covering demographic information, medical history, prescription drug use,
and healthcare use (web table 2).
Pasternak B et al: BMJ. 2018 Mar 8;360:k678.
60. Reporting example ③: boosted CART
Because of the observational nature of the present study and to minimize
confounding, we modeled the probabilities of developing high ESs at 11
and 15 years old using the PS weighting (PSW) method, which used
generalized boosted modeling (GBM) to calculate PSs, in twang
package in R.[ref] GBM has been made popular in the machine learning
community as one of the latest prediction methods, allowing researchers
to powerfully estimate exposure probability (PS) on the basis of many
predicting covariates. It fits several models, both linear and nonlinear,
using a regression tree and then merging predictions computed by each
model.[ref] Regression trees do not require researchers to specify functional
forms of variables (ie, they handle continuous, nominal, ordinal, and
missing independent variables, as well as nonlinear and interaction
effects).[ref] Covariables used to compute PS at 11 years old and PS at 15
years old using GBM were chosen considering previous work on ESs and
CVD risk[ref] and can be found in the Figure. Number of interaction trees
was set on 5000, shrinkage in 0.01 and level of interactions in 2, which
were basically set to minimize prediction errors by means of subsampling
strategies.[ref]
Belem da Silva CT et al: J Am Heart Assoc. 2019 Jan 22;8(2):e011011.
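91. Five approaches to subgroup analysis ①
Element | Description
Method | Estimate the propensity score in the whole sample and match in the whole sample. Then match again within each subgroup using the same propensity score (ignoring whether a subject was matched in the whole-sample match).
Advantage | The sample size of the main analysis is maximized.
Disadvantage | The subgroups do not necessarily correspond to the main analysis population.
Wang SV et al: Am J Epidemiol. 2018 Aug 1;187(8):1799-1807.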
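92. Five approaches to subgroup analysis ②
Element | Description
Method | Estimate the propensity score in the whole sample and match in the whole sample. Then match again within each subgroup using the same propensity score (restricted to subjects matched in the whole-sample match).
Advantage | The subgroups correspond to the main analysis population.
Disadvantage | Subgroup sample sizes are smaller than with approach ①.
Wang SV et al: Am J Epidemiol. 2018 Aug 1;187(8):1799-1807.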
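93. Five approaches to subgroup analysis ③
Element | Description
Method | Estimate the propensity score and match within each subgroup. Pool the subgroups for the main analysis.
Advantage | The subgroups correspond to the main analysis population.
Disadvantage | Subgroups analysed post hoc do not correspond to the main analysis population. For post hoc subgroups to correspond, the subgroups must be pooled anew for the main analysis. Convergence problems are likely when estimating the propensity score.
Wang SV et al: Am J Epidemiol. 2018 Aug 1;187(8):1799-1807.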
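94. Five approaches to subgroup analysis ④
Element | Description
Method | Estimate the propensity score in the whole sample and match within each subgroup. Pool the subgroups for the main analysis.
Advantage | The subgroups correspond to the main analysis population.
Disadvantage | Subgroups analysed post hoc do not correspond to the main analysis population. For post hoc subgroups to correspond, the subgroups must be pooled anew for the main analysis.
Wang SV et al: Am J Epidemiol. 2018 Aug 1;187(8):1799-1807.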
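95. Five approaches to subgroup analysis ⑤
Element | Description
Method | Estimate the propensity score in the whole sample and match in the whole sample. Estimate the effect within each subgroup without further adjustment.
Advantage | No second round of matching within subgroups is needed.
Disadvantage | The statistical properties are the worst of all five approaches.
Wang SV et al: Am J Epidemiol. 2018 Aug 1;187(8):1799-1807.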
96. Reporting example ①: caliper matching / 1:1 / without replacement
We applied a 1:1 nearest-neighbor risk-set matching algorithm on the
propensity score without replacement, with a maximum caliper width
of 0.1 of the SD of the logit of the propensity score.
Henriquez DDCA et al: JAMA Netw Open. 2019 Nov 1;2(11):e1915628.
97. Reporting example ②: optimal matching / 1:M / without replacement
After deriving a propensity score for each patient, variable optimal
matching for each hypothermia-treated patient was performed, with up
to 4 controls without replacement for each treated patient, using an
algorithm match with a caliper width no greater than 0.2 times the
standard deviation of the logit of the propensity score.
Chan PS et al: JAMA. 2016 Oct 4;316(13):1375-1382.
98. Reporting example ③: full matching / M:M / without replacement
In the next step, patients were matched on estimated propensity scores
using a combination of exact and full matching.[ref] The matching was
exact concerning calendar quarter and in-hospital PCI. Full matching
means that a patient treated with fondaparinux could be matched to
several patients treated with LMWH and vice versa. The caliper (upper
limit to the allowed difference in propensity score between matched
patients treated with LMWH and fondaparinux) was 0.002 (except for
eGFR >15-30, for which the caliper was 0.005, and eGFR ≤15, for which the
caliper was 0.01). Unmatched patients were removed in the subsequent
analysis.
Szummer K et al: JAMA. 2015 Feb 17;313(7):707-16.
99. Reporting example ④: subgroup analysis
We tested for the presence of effect modification in several relevant
subgroups. First, we restricted the analysis to patients without evidence
of antibiotic use, disease-modifying antirheumatic drug use, or infections
in the baseline period (defined as 180 days before cohort entry). Second,
we stratified the analysis by sex to reflect the differences in incidence and
severity of UTIs[Urinary Tract Infections][ref]. ... (omitted) ...
Within each subgroup, the propensity score was reestimated and
patients were rematched on the newly estimated score using 1:1
nearest-neighbor matching within a caliper width of 0.01.
Dave CV et al: Ann Intern Med. 2019 Jul 30. doi: 10.7326/M18-3136.
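120. Four types of weights ④: overlap weighting
Definitions
ei: propensity score
zi: treatment variable (treated = 1; control = 0)
ow: overlap weight, equal to 1 − e for treated subjects and e for control subjects
Treatment variable | Propensity score (e) | 1 − propensity score (1 − e) | ow
1 | 0.84 | 0.16 | 0.16
1 | 0.40 | 0.60 | 0.60
0 | 0.03 | 0.97 | 0.03
0 | 0.53 | 0.47 | 0.53
Weight of subject 1 (treated): 1 − e = 0.16
Weight of subject 3 (control): e = 0.03
Li F et al: Am J Epidemiol. 2019 Jan 1;188(1):250-257.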
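The table on this slide can be reproduced with a one-line rule: a treated subject is weighted by 1 − e and a control subject by e. The function name below is an illustrative choice; the four rows are the ones shown in the table.

```python
def overlap_weight(z, e):
    """Overlap weight: 1 - e for treated subjects (z = 1), e for controls (z = 0)."""
    return 1.0 - e if z == 1 else e

# (treatment variable, propensity score) for the four subjects in the table
rows = [(1, 0.84), (1, 0.40), (0, 0.03), (0, 0.53)]
weights = [round(overlap_weight(z, e), 2) for z, e in rows]
print(weights)
```

Subjects whose treatment was close to certain given their covariates (subject 1 with e = 0.84, subject 3 with e = 0.03) receive small weights, so the analysis concentrates on subjects who could plausibly have received either treatment.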
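121. Weighting for the average treatment effect on the treated
Standardised mortality ratio weights
Definitions
z: treatment variable (treated = 1; control = 0)
e: propensity score
Weight: w = z + (1 − z) × e / (1 − e)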
[1] Austin PC, Stuart EA et al: Stat Med. 2015 Dec 10;34(28):3661-79.
[2] Desai RJ, Franklin JM: BMJ. 2019 Oct 23;367:l5657.
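With z and e defined as on this slide, standardised mortality ratio weights give each treated subject a weight of 1 and each control subject a weight of e / (1 − e), the odds of treatment. The sketch below uses an illustrative function name and invented propensity scores.

```python
def att_weight(z, e):
    """Standardised mortality ratio weight targeting the ATT.

    Treated subjects (z = 1) keep weight 1; controls (z = 0) are weighted by
    the odds of treatment, e / (1 - e).
    """
    return 1.0 if z == 1 else e / (1.0 - e)

# Hypothetical subjects: (treatment variable, propensity score)
subjects = [(1, 0.8), (0, 0.8), (0, 0.5), (0, 0.2)]
weights = [att_weight(z, e) for z, e in subjects]
print([round(w, 2) for w in weights])
```

Controls who resemble treated subjects (high e) are up-weighted and controls who do not (low e) are down-weighted, so the weighted control group takes on the covariate distribution of the treated group.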
122. Reporting example ①: inverse probability weighting
Inverse probability of treatment weighting on the propensity score
was used to balance comparison groups on recorded indicators of
baseline health, including known indications for baclofen use (including
off-label indications).[ref]
Muanda FT et al: JAMA. 2019 Nov 9. doi: 10.1001/jama.2019.17725.
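Inverse probability of treatment weighting gives each subject the inverse of the probability of the treatment actually received: 1 / e for treated subjects and 1 / (1 − e) for controls. The sketch below is a generic illustration with invented data, not the cited study's analysis; the function names and the simple weighted risk difference are assumptions.

```python
def iptw_weight(z, e):
    """Inverse probability of treatment weight (targets the ATE)."""
    return 1.0 / e if z == 1 else 1.0 / (1.0 - e)

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical data: treatment, propensity score, binary outcome
z = [1, 1, 0, 0, 0, 1]
e = [0.8, 0.5, 0.5, 0.2, 0.2, 0.2]
y = [1, 0, 1, 0, 1, 1]

w = [iptw_weight(zi, ei) for zi, ei in zip(z, e)]

# Weighted outcome risk in each group, then their difference
risk_treated = weighted_mean([yi for yi, zi in zip(y, z) if zi == 1],
                             [wi for wi, zi in zip(w, z) if zi == 1])
risk_control = weighted_mean([yi for yi, zi in zip(y, z) if zi == 0],
                             [wi for wi, zi in zip(w, z) if zi == 0])
risk_difference = risk_treated - risk_control
print(round(risk_difference, 4))
```

The treated subject with e = 0.2 receives a weight of 5, which shows how subjects with an unlikely treatment can dominate the estimate; this is one reason extreme weights are usually inspected.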
123. Reporting example ②: matching weights
Comparison of OS[overall survival] between patients who underwent a
second PM[pulmonary metastasectomy] and patients who did not
required attention to factors associated with selection. To address this, we
used the matching weights method [ref]. This approach is a weighting
analogue to the 1:1 pair-matching method, although shown to be more
efficient, that provides better balance across covariates. Unlike 1:1 pair
matching, which excludes any unmatched patients, the matching weights
approach never discards any patients; instead, it only down-weights
some of the patients. The matching weights approach is a variant of the
inverse probability weights method; the matching weights can be
considered the probability of being selected to the matched data set. With
the application of the patient-level matching weights, each patient
contributes a fraction of itself to the overall cohort used in the analyses.
Chudgar NP et al: Ann Thorac Surg. 2017 Dec;104(6):1837-1845.
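The matching weight described in this excerpt is commonly written as min(e, 1 − e) divided by the probability of the treatment actually received. The sketch below assumes that definition; the function name and the propensity scores are invented for illustration.

```python
def matching_weight(z, e):
    """Matching weight: min(e, 1 - e) over the probability of the received treatment."""
    received = e if z == 1 else 1.0 - e
    return min(e, 1.0 - e) / received

# Hypothetical subjects: (treatment variable, propensity score)
subjects = [(1, 0.8), (0, 0.8), (1, 0.3), (0, 0.3)]
weights = [matching_weight(z, e) for z, e in subjects]
print([round(w, 3) for w in weights])
```

Every weight lies between 0 and 1, so no subject is discarded and none is up-weighted beyond 1, which matches the excerpt's description of each patient contributing a fraction of itself.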
124. Reporting example ③: weighting for the average treatment effect on the treated
We used propensity score weighting because it could produce one
interpretable overall treatment effect and would not diminish our sample
size. To estimate the average effect of treatment on individuals using
SGLT-2[sodium-glucose cotransporter 2] inhibitors, the average
treatment effect of the treated (ATT) weighting was applied; that is,
we compared the hazards of outcomes among individuals using SGLT-2
inhibitors with the hypothesized situation had they taken DPP-4[dipeptidyl
peptidase 4] inhibitors, GLP-1[glucagon-like peptide 1] agonists, or older
agents instead of SGLT-2 inhibitors. This approach is specifically useful
when systematic differences likely occur between the study sample and
the overall population.[ref]
Chang HY et al: JAMA Intern Med. 2018 Sep 1;178(9):1190-1198.
131. Reporting example ①: standardized difference, methods and results
Methods
Covariate balance between the two groups was assessed after
matching, and we considered an absolute standardized difference less
than 0.1 as evidence of balance.[ref]
Results
We matched 99.5% of the haloperidol initiators to atypical antipsychotic
initiators (n=1659), and all covariates included in the propensity score
were well balanced after matching.
Park Y et al: BMJ. 2018 Mar 28;360:k1218.
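The standardized difference used as the balance criterion here is the difference in means (or proportions) divided by a pooled standard deviation. The sketch below shows the usual formulas for a continuous and a binary covariate; the function names and the data are invented for illustration.

```python
import math
from statistics import mean, variance

def std_diff_continuous(x_treated, x_control):
    """Standardized difference for a continuous covariate."""
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2.0)
    return (mean(x_treated) - mean(x_control)) / pooled_sd

def std_diff_binary(p_treated, p_control):
    """Standardized difference for a binary covariate, from group proportions."""
    pooled_sd = math.sqrt((p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2.0)
    return (p_treated - p_control) / pooled_sd

# Hypothetical matched sample: age in years, and proportion with diabetes
age_treated = [70, 72, 68, 75, 71]
age_control = [69, 73, 67, 74, 72]
d_age = std_diff_continuous(age_treated, age_control)
d_diabetes = std_diff_binary(0.40, 0.35)

for name, d in [("age", d_age), ("diabetes", d_diabetes)]:
    status = "balanced" if abs(d) < 0.1 else "imbalanced"
    print(name, round(d, 3), status)
```

In this toy data age meets the 0.1 criterion while diabetes narrowly misses it, even though the raw difference in proportions is only five percentage points.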
133. Reporting example ③: rationale for changing the balance criterion
We calculated standardised differences to evaluate the balance of
variables in each predicted propensity score matched cohort. We first
regarded standardised differences less than 0.1 as having well
matched balance,[ref] but we could not achieve the value for the variable
of “defibrillation before matching” in the shockable cohort even with a
very narrow calliper width (0.001). When we attempted to achieve better
balancing of standardised differences (<0.1) by setting the calliper width
much narrower (<0.001), we lost a large number of patients. In the end,
we decided to avoid losing these patients by using a tight range of
target and chose a value of 0.25 rather than 0.1 of standardised
differences, as some statisticians have suggested,[ref] before doing our
final analyses.
Izawa J et al: BMJ. 2019 Feb 28;364:l430.
134. Reporting example ④: variance ratio, methods and results
Methods
A standardised difference with an absolute value less than 0.10 and a
variance ratio between 4/5 and 5/4 was considered sufficient to
support the assumption of balance of the covariate between the
treatment groups [ref].
Results
The absolute standardised differences were as high as 0.89, with 79% of
the covariates having an absolute standardised difference >0.10. The most
extreme variance ratio was 10.1 and 68% of the covariates had a variance
ratio <4/5 or >5/4. After matching, the largest absolute standardised
difference was 0.12 and only 11% of the covariates had an absolute
standardised difference >0.10. The most extreme variance ratio was 1.34
and only 7% of the covariates had a variance ratio <4/5 or >5/4. Thus,
the propensity score matching largely removed the imbalances in the
covariates (Figs. 1 and 2).
Jakobsen CJ et al: Eur J Cardiothorac Surg. 2009 Nov;36(5):863-8.
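The variance ratio criterion in this excerpt can be checked alongside the standardized difference. The sketch below applies the thresholds quoted above (absolute standardized difference below 0.10, variance ratio between 4/5 and 5/4) to invented data; the function names are illustrative.

```python
import math
from statistics import mean, variance

def variance_ratio(x_treated, x_control):
    """Ratio of the covariate variance in the treated group to that in the control group."""
    return variance(x_treated) / variance(x_control)

def standardized_difference(x_treated, x_control):
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2.0)
    return (mean(x_treated) - mean(x_control)) / pooled_sd

def is_balanced(x_treated, x_control):
    """Both criteria from the excerpt must hold."""
    d = standardized_difference(x_treated, x_control)
    vr = variance_ratio(x_treated, x_control)
    return abs(d) < 0.10 and 4 / 5 <= vr <= 5 / 4

# Hypothetical covariate (age in years) in a matched sample
age_treated = [70, 72, 68, 75, 71]
age_control = [69, 73, 67, 74, 72]

vr = variance_ratio(age_treated, age_control)
print(round(vr, 3), is_balanced(age_treated, age_control))
```

In this toy data the means are close, but the variance ratio falls just below 4/5, so the covariate fails the combined criterion. The variance ratio catches differences in spread that the standardized difference alone does not.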
139. Whether matched analyses should account for pairing is unsettled
Although there is still some debate as to whether accounting for the
matched nature of the data is necessary, Austin and colleagues[ref]
advocate that accounting for the matched nature of the sample when
estimating the precision or significance of the treatment effect is
necessary, as matching was done after exposure.[1]
Whether or not to account for the matched nature of the data in
estimating the variance of the treatment effect, for example, using
paired t-test for continuous outcome or McNemar’s test for binary
outcome, is an ongoing discussion.[ref][2]
[1] Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
[2] Ali MS et al: Front Pharmacol. 2019 Sep 18;10:973.
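For a binary outcome, McNemar's test is one way to account for the matched nature of the data: it uses only the discordant pairs. The sketch below computes the test without continuity correction on invented 1:1 matched pairs; the function name and the counts are assumptions.

```python
import math

def mcnemar_test(pairs):
    """McNemar's test (no continuity correction) for 1:1 matched binary outcomes.

    pairs: list of (outcome_treated, outcome_control), each 0 or 1.
    Returns (chi-square statistic, two-sided p-value on 1 degree of freedom).
    """
    b = sum(1 for t, c in pairs if t == 1 and c == 0)   # event in treated only
    c_only = sum(1 for t, c in pairs if t == 0 and c == 1)  # event in control only
    if b + c_only == 0:
        return 0.0, 1.0
    stat = (b - c_only) ** 2 / (b + c_only)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical matched pairs: 15 + 5 discordant, 80 concordant
pairs = [(1, 0)] * 15 + [(0, 1)] * 5 + [(1, 1)] * 30 + [(0, 0)] * 50
stat, p_value = mcnemar_test(pairs)
print(round(stat, 2), round(p_value, 4))
```

The 80 concordant pairs do not enter the statistic at all, which is the practical difference from an unpaired comparison of the two groups' event proportions.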
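157. Sensitivity analysis for weighting methods ⑤
[Figure annotations: the measured covariate with the largest influence on the treatment variable and the outcome; the region in which statistical significance is lost]
Carnegie NB et al: Journal of Research on Educational Effectiveness, 9:3, 395-420, 2016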
158. Reporting example ①: matching, methods and results
Methods
Relative risk (RR) was estimated as the ratio of the probability of the
outcome event in patients treated using the transcarotid approach
compared with patients treated using the transfemoral approach. The
95% CIs were constructed using methods that accounted for the
matched nature of the cohorts.[ref]
Results
[figure not reproduced]
Schermerhorn ML et al: JAMA. 2019 Dec 17;322(23):2313-2322.
159. Reporting example ②: weighting, methods and results
Methods
For each outcome, 3 separate Cox proportional hazards regression
models with propensity score ATT weighting were constructed to
examine the association between the use of SGLT-2[sodium-glucose
cotransporter 2] inhibitors (relative to 3 reference groups) and the
outcome. We calculated robust estimates of SEs for all variables in the
models. [ref]
Results
[figure not reproduced]
Chang HY et al: JAMA Intern Med. 2018 Sep 1;178(9):1190-1198.
160. Reporting example ③: doubly robust estimation, methods
To address potential for selection bias, we used augmented inverse
probability weighting (AIPW) propensity score methods to estimate
the average effect of drug benefit user group and PUM[potentially unsafe
medication] exposure. We used the teffects aipw command in STATA,
version 13 (StataCorp). Augmented inverse probability weighting
combines 2 models: inverse probability weighting in the drug benefit user
group selection propensity model with regression adjustment in the
outcomes model (PUM exposure) [ref]. When these 2 approaches are
combined, AIPW is called “doubly robust estimation” because only 1 of the
2 models needs to be correctly specified to obtain an unbiased estimator.
Specifically, logistic regression was used in the propensity model to
estimate the probability of belonging to either user group, and
weighted logistic regression and weighted linear regression were
used to model PUM exposure and the number of days of PUM
exposure, respectively [ref]. All covariates described here were included in
both the drug benefit user group selection and PUM exposure models. To
account for the highly skewed nature of the days of PUM exposure
variables, we estimated SEs and 95% CIs using a bias-corrected
bootstrap approach [ref].
Thorpe JM et al: Ann Intern Med. 2017 Feb 7;166(3):157-163.
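The doubly robust idea in this excerpt can be illustrated with the general AIPW formula for the average treatment effect. The sketch below is not a reproduction of Stata's teffects aipw; it takes the propensity scores and the outcome-model predictions as given inputs, and all names and numbers are invented for illustration.

```python
def aipw_ate(y, z, e, mu1, mu0):
    """Augmented inverse probability weighted estimate of the ATE.

    y: observed outcomes; z: treatment (1/0); e: propensity scores;
    mu1, mu0: outcome-model predictions under treatment and under control.
    """
    total = 0.0
    for yi, zi, ei, m1, m0 in zip(y, z, e, mu1, mu0):
        # Outcome-model prediction plus an inverse-probability-weighted residual
        treated_part = m1 + zi * (yi - m1) / ei
        control_part = m0 + (1 - zi) * (yi - m0) / (1.0 - ei)
        total += treated_part - control_part
    return total / len(y)

# Hypothetical data
y = [3.0, 1.0, 4.0, 2.0]
z = [1, 0, 1, 0]

# Case A: outcome model fits the observed outcomes, propensity scores are off
ate_a = aipw_ate(y, z, e=[0.9] * 4, mu1=[3.0, 3.0, 4.0, 4.0], mu0=[1.0, 1.0, 2.0, 2.0])

# Case B: outcome model is uninformative (all zeros), propensity scores of 0.5
ate_b = aipw_ate(y, z, e=[0.5] * 4, mu1=[0.0] * 4, mu0=[0.0] * 4)

print(ate_a, ate_b)
```

Both cases return the same estimate in this toy data set: in case A the residual terms vanish and the outcome model carries the estimate, and in case B the estimator reduces to plain inverse probability weighting.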
161. Reporting example ④: sensitivity analysis, methods and results
Methods
In addition, we performed a sensitivity analysis to evaluate the impact
of an unmeasured confounder as previously described.[ref] Sensitivity
analysis was implemented using the R package Rbounds available at
http://cran.r-project.org/web/packages/rbounds/.
Results
Furthermore, in a sensitivity analysis, the association between bivalirudin
use and vascular complications and GI bleed was robust enough to the
effect of an unmeasured confounder. The association with transfusion
was moderately robust, whereas the results corresponding to CABG
were only mildly robust (Table I in the online-only Data Supplement).
Perdoncin E et al: Circ Cardiovasc Interv. 2013 Dec;6(6):688-93
162. Reporting example ⑤: sensitivity analysis, discussion and appendix
Discussion
However, our sensitivity analysis suggested that the antibleeding
efficacy estimates were fairly robust to the presence of an unmeasured
confounder and are likely extant.
Appendix
[figure not reproduced]
Perdoncin E et al: Circ Cardiovasc Interv. 2013 Dec;6(6):688-93
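172. Criteria for choosing a missing-data method ①
Unmeasured confounding = absent; effect modification = absent
Method | MCAR | MAR | MNAR
Complete case analysis | ◎ | ◎ | ◎
Missing indicator | ✕ | ✕ | ✕
Multiple imputation (note) | ◎ | ◎ | ◎
Multiple imputation combined with missing indicator (note) | ◎ | ◎ | ◎
Note: the imputation model also includes the outcome
MCAR = missing completely at random
MAR = missing at random
MNAR = missing not at random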
Choi J et al: Eur J Epidemiol. 2019 Jan;34(1):23-36.
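173. Criteria for choosing a missing-data method ②
Unmeasured confounding = absent; effect modification = present
Method | MCAR | MAR | MNAR
Complete case analysis | ◎ | ✕ | ✕
Missing indicator | ✕ | ✕ | ✕
Multiple imputation (note) | ◎ | ◎ | ✕
Multiple imputation combined with missing indicator (note) | ◎ | ◎ | ✕
Note: the imputation model includes interaction terms between the treatment variable and the covariates, between the covariates and the outcome, and between the treatment variable and the outcome
MCAR = missing completely at random
MAR = missing at random
MNAR = missing not at random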
Choi J et al: Eur J Epidemiol. 2019 Jan;34(1):23-36.
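174. Criteria for choosing a missing-data method ③
Unmeasured confounding = present; effect modification = absent
Method | MCAR | MNAR
Complete case analysis | ✕ | △
Missing indicator | ✕ | ✕
Multiple imputation (note) | ✕ | ✕
Multiple imputation combined with missing indicator (note) | ✕ | △
Note: the imputation model also includes the outcome
MCAR = missing completely at random
MAR = missing at random
MNAR = missing not at random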
Choi J et al: Eur J Epidemiol. 2019 Jan;34(1):23-36.
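175. Missingness mechanism ①: MCAR (missing completely at random)
[Diagram nodes: X1, X2, X2*, R, Z, Y; labels: missing indicator, covariate with missing values]
Missingness is independent of both observed and unobserved variables
(e.g., missingness in X2 is unrelated to any other variable)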
Choi J et al: Eur J Epidemiol. 2019 Jan;34(1):23-36.
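177. Missingness mechanism ③: MNAR (missing not at random)
[Diagram nodes: X1, X2, X2*, R, Z, Y; labels: missing indicator, covariate with missing values]
Missingness depends on unobserved variables
(e.g., X2 is missing when the value of X2 is high)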
Choi J et al: Eur J Epidemiol. 2019 Jan;34(1):23-36.
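178. Conventions ①: sample size
Studies | 1st quartile | Median | 3rd quartile
[1] Surgery 2016-2018 (303 articles) | 503 | 1803 | 6658
[2] Cancer surgery 2014-2015 (306 articles) | 307 | 699 | 2783
[3] Surgery 2013-2014 (129 articles) | 348 | 904 | 4133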
[1] Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
[2] Yao XI et al: J Natl Cancer Inst. 2017 Aug 1;109(8).
[3] Lonjon G et al: Ann Surg. 2017 May;265(5):901-909.
179. Conventions ②: number of covariates
Studies | 1st quartile | Median | 3rd quartile
[1] Intensive care 2006-2009 (47 articles) | 9 | 15 | 22
[2] All fields 1983-2003 (177 articles) | 10 | 17 | 28
[3] All fields 2001 (47 articles) | 8 | 17 | 27
[4] Surgery 2013-2014 (129 articles) | 7 | 12 | 18
[1] Gayat E et al: Intensive Care Med. 2010 Dec;36(12):1993-2003.
[2] Stürmer T et al: J Clin Epidemiol. 2006 May;59(5):437-47.
[3] Weitzen S et al: Pharmacoepidemiol Drug Saf. 2004 Dec;13(12):841-53.
[4] Lonjon G et al: Ann Surg. 2017 May;265(5):901-909.
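183. Grose's recommendations ①: statistical model
Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
Item to report (reporting rate)
Statistical model used to estimate the propensity score (79%)
Reporting example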
Propensity scores for each patient were obtained from a multivariable
logistic regression model based on patient characteristics, year of
surgery, comorbidities, and hospital volume and location.
184. Grose's recommendations ②: covariates
Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
Item to report (reporting rate)
Covariates included in propensity score estimation (91%)
Reporting example
The model included the following variables with pretreatment
characteristics: sex (male or female), age (≤17 years, 18-64 years, or ≥65
years), witness (witnessed or unwitnessed), bystander CPR (any CPR or no
CPR), first rhythm (ventricular fibrillation, ventricular tachycardia, pulseless
electrical activity, asystole, or others), and response time (<10 minutes or
≥10 minutes).
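185. Grose's recommendations ③: rationale for covariate selection
Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
Item to report (reporting rate)
Rationale for selecting the covariates included in propensity score estimation (10%)
Reporting examples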
The 2 mesh groups were propensity-score matched using factors that
have been shown previously to be associated with increased risk of 30-
day wound events after ventral hernia repair.
Variables were selected from an initial univariate analysis comparing
the surgery and chemotherapy groups, and variables that differed
significantly between the 2 groups were chosen for propensity matching.
186. Grose's recommendations ④: sample size after matching
Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
Item to report (reporting rate)
Sample size after matching (94%)
Reporting example
After co-variable adjustment, 31 of the 37 patients in the hepatectomy+
RFA group were matched 1:3 with 93 of the 516 patients in the
hepatectomy-alone group.
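187. Grose's recommendations ⑤: details of the matching method
Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
Item to report (reporting rate)
① Algorithm (74%)
② Matching ratio (97%)
③ Sampling with or without replacement (30%)
Reporting examples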
Patients were matched (1:1) using the nearest neighbor method
without replacement and a caliper width of 0.2 of the standard
deviation of the logit of the estimated propensity score.
After score calculation, we performed 1:1 matching using a greedy
nearest-neighbor algorithm without replacement of the remaining 88
GDP patients to 88 control patients using a caliper of 20% of the logit
of the score’s standard deviation.
188. Grose's recommendations ⑥: balance assessment
Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
Item to report (reporting rate)
Assessment of covariate balance (52%)
Reporting examples
Standardized differences were estimated before and after matching to
evaluate the balance of covariates; small absolute values <0.10 SD
indicated balance between the cohorts.
Standardized differences greater than 0.2 were considered to indicate
large imbalance among covariates used in the propensity score for
matching.
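189. Grose's recommendations ⑦: effect estimation method
Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
Item to report (reporting rate)
Effect estimation method that accounts for matching (56%)
Reporting example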
The analysis compared matched pairs using McNemar test for
categorical variables, paired t tests for symmetrically distributed
variables, and Wilcoxon signed-rank test for skewed continuous
variables.
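190. Grose's recommendations ⑧: predictive accuracy
Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
Item to report (reporting rate)
Predictive accuracy of the propensity score model (21%)
Reporting example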
We tested discrimination of the propensity model with the c-statistic.
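The c-statistic mentioned in this example is the probability that a randomly chosen treated subject has a higher estimated propensity score than a randomly chosen control subject, with ties counted as one half. The sketch below computes it by comparing all treated-control pairs; the function name and the scores are invented for illustration.

```python
def c_statistic(ps, z):
    """Concordance (c) statistic of a propensity score model.

    ps: estimated propensity scores; z: treatment indicator (1/0).
    """
    treated = [p for p, zi in zip(ps, z) if zi == 1]
    control = [p for p, zi in zip(ps, z) if zi == 0]
    concordant = 0.0
    for t in treated:
        for c in control:
            if t > c:
                concordant += 1.0   # treated subject ranked higher
            elif t == c:
                concordant += 0.5   # tie counts as half
    return concordant / (len(treated) * len(control))

# Hypothetical estimated propensity scores
ps = [0.8, 0.6, 0.4, 0.7, 0.3, 0.2]
z = [1, 1, 1, 0, 0, 0]
c = c_statistic(ps, z)
print(round(c, 3))
```

A value of 0.5 means the model does not separate the groups at all and 1.0 means complete separation; in this toy data 7 of the 9 treated-control pairs are concordant.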
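191. Grose's recommendations ⑨: assessing representativeness
Grose E et al: J Am Coll Surg. 2020 Jan;230(1):101-112.e2.
Item to report (reporting rate)
Characteristics of the cases that could not be matched (15%)
Reporting example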
Details of outcome data for the overall and unmatched population are
presented in TABLE E2.
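192. Lonjon's recommendations ①: rationale for covariate selection
Explanation
① Enter all baseline characteristics
② Select variables using domain-specific knowledge or statistics
Reporting examples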
Preoperative risk factors and demographic and operative variables were
entered in the propensity models irrespective of their significance (all
factors in Table 1).
The factors considered to be the most important confounders also
contributing to deep-infection risk were chosen for the propensity-score
algorithm. [. . .] These factors were chosen, based on consensus among
the investigators, as the factors most important for predicting later
infection but also as those most divergent between the immediate and
delayed-closure groups (Table II).
Lonjon G et al: Ann Surg. 2017 May;265(5):901-909.
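193. Lonjon's recommendations ②: ITT analysis
Explanation
As in RCTs, intention-to-treat (ITT) analysis is preferable so that the effect estimate is not overestimated
Reporting example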
Every unplanned extension of an incision for any task other than retrieving
the specimen was considered to be a conversion. Laparoscopic operations
that had to be converted to open surgery were analyzed according to the
intention-to-treat principle.
Lonjon G et al: Ann Surg. 2017 May;265(5):901-909.
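194. Lonjon's recommendations ③: handling missing values
Explanation
① Exclude cases with missing values
② Treat missingness as a separate category
③ Multiple imputation
Reporting examples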
Consistent with previously established methods specific to addressing
missing data in propensity score calculations, a separate category
for ’unknown’ was created for missing data for nominal variables.
Missing data were infrequent (5% on any variable). We performed
additional analyses using various missing data statistical approaches
including multiple imputation and weighted estimating equations.
Lonjon G et al: Ann Surg. 2017 May;265(5):901-909.
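195. Lonjon's recommendations ④: details of the matching method
Explanation
① Algorithm
② Matching ratio
③ Sampling with or without replacement
Reporting example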
A 1:1 match on the propensity score, without replacement, was
performed using the psmatch2 procedure, with a conservative caliper
width of 20% of the standard deviation of the log of propensity score.
Lonjon G et al: Ann Surg. 2017 May;265(5):901-909.
196. Lonjon's recommendations ⑤: balance assessment method
Explanation
A standardized difference below 10% is desirable
Reporting example
We estimated standardised differences for all covariates before and
after matching, with a standardised difference of 10% or more
considered indicative of imbalance.
Lonjon G et al: Ann Surg. 2017 May;265(5):901-909.
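197. Lonjon's recommendations ⑥: effect estimation method
Explanation
An effect estimation method that accounts for matching is desirable
Reporting example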
Continuous outcomes were compared in the PS-matched groups using
paired t-tests or Wilcoxon signed rank test as appropriate; differences
in proportions were compared using the McNemar’s test.
Lonjon G et al: Ann Surg. 2017 May;265(5):901-909.
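203. Stuart's recommendations ①
Item to report | Section
State the causal question of interest clearly by defining the intervention group, the control group, the outcome, and the unit of analysis (e.g., patients or wards). | I, M, D
Specify the type of effect of interest (average treatment effect or average treatment effect on the treated) together with the outcome contrast (risk ratio, odds ratio, risk difference, etc.). | I, M, R, D
State the eligibility criteria and the reasons for exclusion that lead to the analysis population. Eligibility criteria must be defined using pretreatment variables. | M
I = introduction; M = methods; R = results; D = discussion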
Stuart EA: Propensity scores and matching methods. “The Reviewer’s Guide to
Quantitative Methods in the Social Sciences“ Routledge. 2018.