Constructing within and between hospital physician social networks for modeling physician research participation | BMC Medical Research Methodology

In a previous trial [11, 12], we partnered with the physician organization that facilitated the trial to recruit physicians across 40 acute-care hospitals to participate in a stepped-wedge cluster-randomized trial with five steps between July 2020 and May 2021. The physician organization serves more than 200 hospitals across the United States, each with distinct geographical and organizational attributes; this breadth of affiliations enhances the relevance and applicability of our findings. The trial intervention sought to increase the rate of advance care planning conversations between patients and clinicians. The protocol and trial results were previously published [11, 12]. Hospitals were randomized across the five steps based on previous rates of advance care planning, region of the US, and practice size. We obtained permission from physician leaders to approach the hospitalist staff at each site, and invited eligible physicians to participate via email. Physicians were eligible if they had worked with the physician organization for at least 6 months, had worked at the trial hospital for at least 3 months, and indicated that they bill for advance care planning conversations. Electronic consent was obtained from hospitalists who agreed to participate. Hospitals where the physician leader did not respond were not included in the trial.

In this work, we compared physicians who agreed to participate in the research study to those who were invited but did not participate, for reasons that may include declining to participate, ignoring the invitation, or being ineligible to participate based on study design. Billing data and physician characteristic data, including age, sex, supervisor, and other employment details, were provided by the physician organization. Our collaboration with the physician organization gave us access to the participation and recruitment data as well as the billing data. This rare combination of datasets allowed us to construct physician networks and examine whether professional networks are associated with trial participation.

All analyses were conducted in R version 3.6.0 [13]. Billing data were used to construct a physician-physician network in which physicians are nodes and an edge indicates that a patient visited both physicians during 2019. This network was subdivided into hospital-specific networks to facilitate the study of hospitalist relationships within a hospital as well as across hospitals. Beyond sharing patients across hospitals, physicians may work and bill at more than one hospital. This was increasingly true during the COVID-19 pandemic. Thus, a second network was constructed in which hospitals are nodes and an edge indicates that a physician billed at both hospitals between March 1, 2020 and May 31, 2021. Within both networks we calculated the degree of each node (the number of edges to other nodes), the betweenness centrality of each node (the extent to which the node lies on geodesic, or shortest, paths between other nodes in the network), and the overall network density (the proportion of possible edges that are present in the network). Descriptions of these definitions and others used to summarize our networks are included in Table 1. For each physician in the physician network, we also calculated the proportion of other invited physicians at their primary billing hospital who agreed to participate, as well as the corresponding proportion at their secondary, tertiary, …, billing hospitals. To guard against reverse causality, peer-physician participation rates were calculated using only physicians at hospitals at the same trial step or earlier. All network analyses were conducted using ‘igraph’ version 1.5.1 in R [14].
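The shared-patient network and two of its summary measures can be illustrated with a minimal sketch. It is written in plain Python rather than the R/igraph code used in the study, and the billing records, physician labels, and function names are hypothetical. Betweenness centrality is omitted for brevity.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical billing records: (patient_id, physician_id) pairs for one year.
billing = [
    ("p1", "A"), ("p1", "B"),
    ("p2", "B"), ("p2", "C"),
    ("p3", "A"), ("p3", "B"), ("p3", "C"),
    ("p4", "D"),
]

def shared_patient_edges(records):
    """Undirected edges linking physicians who saw at least one common patient."""
    by_patient = defaultdict(set)
    for patient, physician in records:
        by_patient[patient].add(physician)
    edges = set()
    for physicians in by_patient.values():
        # Every pair of physicians seen by the same patient shares an edge.
        for u, v in combinations(sorted(physicians), 2):
            edges.add((u, v))
    return edges

def degree(nodes, edges):
    """Number of edges incident to each node."""
    deg = {n: 0 for n in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def density(nodes, edges):
    """Proportion of possible undirected edges that are present."""
    n = len(nodes)
    possible = n * (n - 1) / 2
    return len(edges) / possible if possible else 0.0

nodes = sorted({physician for _, physician in billing})
edges = shared_patient_edges(billing)
deg = degree(nodes, edges)
dens = density(nodes, edges)
```

In this toy example physician D shares no patients and is therefore an isolate with degree zero, which lowers the overall density.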

Table 1 Definition of network metrics in the context of our physician network

Additional physician characteristics were derived from the billing data, such as the number of years with the physician organization, the overall number of patient encounters billed, the primary billing hospital, and the number of hospitals a physician billed at between March 1, 2020 and May 31, 2021 (our trial overlapped with the COVID-19 pandemic). We also used the Shannon diversity index to construct a measure of hospital diversity for each physician [15]. For physician \(i=1,\ldots ,N\) this is defined as:

$$\begin{aligned} H_{i}=-\sum \limits _{l=1}^R p_{il} \ln p_{il} \end{aligned}$$

where, in this case, R is the number of hospitals at which physician i billed and \(p_{il}\) is the proportion of physician i’s encounters at hospital l. A physician who bills at only one hospital would hence have \(H=0\), a physician whose encounters are split equally between R hospitals would have \(H=\ln (R)\), and a physician who practices primarily at one hospital but delivers a small amount of care at another would have an H value close to zero.
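The three cases just described can be checked with a short sketch. The function name, hospital labels, and encounter counts are hypothetical and serve only to illustrate the formula.

```python
import math

def shannon_diversity(encounters_by_hospital):
    """Shannon diversity H_i = -sum_l p_il * ln(p_il) over a physician's billing hospitals."""
    total = sum(encounters_by_hospital.values())
    if total == 0:
        raise ValueError("physician has no billed encounters")
    h = 0.0
    for count in encounters_by_hospital.values():
        if count > 0:  # hospitals with zero encounters contribute nothing
            p = count / total
            h -= p * math.log(p)
    return h

# Hypothetical encounter counts per hospital for three physicians.
single = shannon_diversity({"hosp_1": 120})                              # one hospital only
even = shannon_diversity({"hosp_1": 50, "hosp_2": 50, "hosp_3": 50})      # equal split, R = 3
skewed = shannon_diversity({"hosp_1": 99, "hosp_2": 1})                   # mostly one hospital
```

Here `single` equals zero, `even` equals ln(3), and `skewed` is small but positive.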

Both physician and network characteristics were assessed univariately for association with likelihood of participating in the study using a two-sided Student’s t-test for continuous variables or a chi-squared test for categorical variables [16, 17]. Logistic regression models were constructed using a logit-link function to calculate adjusted odds ratios using both physician-level and network-level characteristics as predictors [18]. Models were reduced manually by iteratively removing predictors until only significant predictors remained (\(p < 0.05\)). Additionally, we constructed mixed-effect logistic regression models, in which primary billing hospitals were assigned a random intercept to account for hospital-level cultural differences (and other unmeasured hospital-level factors) that may affect a physician’s decision to participate, and compared them to our most parsimonious logistic regression models.

Statistical models

The primary network used in our statistical model is the shared-patient network for 348 physicians who were approached to participate in the trial. Let \(z_{ik}\) denote the number of encounters that patient k had with physician i, and let \(a_{ij} = \sum _{k=1}^{n} I(z_{ik}>0)I(z_{jk}>0)\) denote the number of patients seen by both physicians i and j during the study time period, where \(I({\textrm{event}})\) denotes the indicator function equalling 1 if “event” is true and 0 otherwise. The matrix \(A=[a_{ij}]\) denotes the adjacency matrix for the Physician Trial Invitee (PTI) network. By construction, A is a weighted network with weights corresponding to the number of shared patients between the two physicians. However, by using a function other than the indicator or step function, different edge weights may be easily determined; for example, the geometric mean \((z_{ik}z_{jk})^{1/2}\) has also been used previously [19]. For some computations we will use the binarized network, \(B=[b_{ij}]\), formed by applying a threshold rule to A, such as \(b_{ij}=I(a_{ij}>a_{\textrm{low}})\) where \(a_{\textrm{low}}\) is a non-negative number (e.g., \(a_{\textrm{low}}=0\) for any patient-sharing, \(a_{\textrm{low}}=100\) for a 100 shared patient minimum threshold to constitute a network edge).
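The construction of A and its binarized counterpart B can be sketched as follows. This is an illustrative Python sketch, not the study code; the encounter counts and function names are hypothetical, and the matrices are stored as nested dictionaries for readability.

```python
def shared_patient_weights(z):
    """Weighted adjacency: a[i][j] = number of patients with encounters with both i and j.

    z[i][k] is the number of encounters patient k had with physician i.
    """
    physicians = sorted(z)
    a = {i: {} for i in physicians}
    for i in physicians:
        for j in physicians:
            if i != j:
                # Count patients k with I(z_ik > 0) * I(z_jk > 0) = 1.
                a[i][j] = sum(
                    1 for k, n_ik in z[i].items()
                    if n_ik > 0 and z[j].get(k, 0) > 0
                )
    return a

def binarize(a, a_low=0):
    """Threshold rule b_ij = I(a_ij > a_low)."""
    return {i: {j: int(w > a_low) for j, w in row.items()} for i, row in a.items()}

# Hypothetical encounter counts z[physician][patient].
z = {
    "A": {"p1": 2, "p2": 1, "p3": 1},
    "B": {"p1": 1, "p2": 3},
    "C": {"p3": 1, "p4": 2},
}

A = shared_patient_weights(z)
B_any = binarize(A, a_low=0)     # edge for any patient sharing
B_strict = binarize(A, a_low=1)  # edge only when more than one patient is shared
```

Raising `a_low` removes weaker ties: physicians A and C share a single patient, so their edge is present in `B_any` but absent from `B_strict`.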

Important attributes of a physician include their “official” hospital affiliation, the amount of care they deliver at each of the 40 hospitals, and their personal characteristics including age, sex, and years with the physician organization. We let \(S_{i}\) denote the primary hospital affiliation of physician i, \(V_{i}=[v_{il}]\) be a vector with the volume of care delivered by physician i at each hospital l, and \(X_{i}\) denote the personal characteristics of physician i included as predictors in the model. Two derived attributes are the number of distinct hospitals that a physician has practiced at in the period of time leading up to the trial and the Shannon diversity defined above and denoted \(H_{i}\).

The predictors listed in Table 1 are features of each physician’s position in the PTI network. We also use two types of networks derived from the PTI network: the sub-network containing a physician’s primary hospital (a distinct network for each of the 40 participating hospitals) and the residual of the PTI network after a physician’s own hospital network, other than their own node, is excluded. The decomposition of the PTI network allows separate peer-physician exposure measures to be computed from each physician’s perspective at their primary hospital and outside of that (i.e., across all other hospitals). Let \(B_{wi}\) denote the within-hospital network and \(B_{ac}\) the corresponding across-hospital portion of the full network. (If the physicians are ordered by their primary hospital, \(B_{wi}\) is block diagonal and \(B_{ac}\) has block zero matrices on its diagonal.) The elements of each row of \(B_{wi}\) and \(B_{ac}\) are divided by their corresponding row sums yielding row-stochastic weight matrices, \(W_{wi}\) and \(W_{ac}\) (rows sum to 1), respectively.

The outcome variable, trial participation, for physician i is a binary random variable denoted \(Y_{i}\). The measures of exposure of physician i to participating peers at the same hospital and across different hospitals are computed as \(WY_{wi,i}=[W_{wi}Y]_{i}\) and \(WY_{ac,i}=[W_{ac}Y]_{i}\), respectively. When derived from a binary-valued source network, \(WY_{wi}\) and \(WY_{ac}\) are vectors of proportions reflecting the fraction of a physician’s within-hospital (“wi”) and across-hospital (“ac”) peers who had, at the time of the current observation, agreed to participate in the trial.
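The row normalization and the peer-exposure products can be illustrated with a small sketch. The four-physician networks and outcomes below are hypothetical, and the sketch ignores the timing restriction (peers at the same trial step or earlier) for simplicity. Rows with no edges are left as zeros, which is an assumption made here to avoid division by zero.

```python
def row_normalize(b):
    """Divide each row of a binary adjacency matrix by its row sum.

    Rows with no edges are returned as all zeros (an assumption of this sketch).
    """
    w = []
    for row in b:
        s = sum(row)
        w.append([x / s for x in row] if s else [0.0] * len(row))
    return w

def peer_exposure(w, y):
    """Matrix-vector product [W Y]_i: fraction of physician i's peers who participated."""
    return [sum(w_ij * y_j for w_ij, y_j in zip(row, y)) for row in w]

# Hypothetical example: physicians 0 and 1 have hospital 1 as primary hospital,
# physicians 2 and 3 have hospital 2. Ordered this way, b_wi is block diagonal
# and b_ac has zero blocks on its diagonal.
b_wi = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
b_ac = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
y = [1, 0, 1, 0]  # hypothetical participation outcomes

wy_wi = peer_exposure(row_normalize(b_wi), y)
wy_ac = peer_exposure(row_normalize(b_ac), y)
```

For physician 0, the single within-hospital peer did not participate, while one of two across-hospital peers did, giving exposures of 0 and 0.5 respectively.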

Because the dependent variable is binary, we use statistical models with the logistic regression form to estimate the association of the predictors with the likelihood that a physician with given characteristics agrees to participate in the trial. Our most general statistical model accounts for effect modification of across hospital peer participation by the number of non-primary hospitals a physician has practiced at, \(N_{i}\), and includes random effects for primary hospital to account for clustering. The model is specified mathematically as

$$\begin{aligned} Y_{i} \sim {\textrm{Bern}}(\pi _{i}), \textrm{where}\ \pi _{i} = {\textrm{Pr}}(Y_{i}=1 \mid S_{i}=l,\theta _{l})\ \textrm{and} \end{aligned}$$

$$\begin{aligned} {\textrm{logit}}(\pi _{il})=\beta _{0}+\varvec{\beta }_{1}^{T}\varvec{X}_{il}+\beta _{2}H_{i}+\beta _{3}WY_{wi,i}+(\beta _{4}+\beta _{5}N_{i})WY_{ac,i}+\theta _{l}, \end{aligned} \tag{1}$$


where \(\theta _{l} \sim {\textrm{Normal}}(0,\tau ^{2})\) and \(\tau ^{2}\) quantifies the amount of unexplained between-hospital variation in participation. The key terms in the above model are the elements of \(\varvec{\beta }_{1}\) corresponding to the physician characteristics and the positional physician network summary measures in Table 1, the effect of a physician’s Shannon diversity (\(\beta _{2}\)), the effect of within-hospital peer participation exposure (\(\beta _{3}\)), and the main (\(\beta _{4}\)) and interaction (\(\beta _{5}\)) associations of across-hospital peer participation exposure. Because \(N_{i}\) is not centered, \(\beta _{4}\) corresponds to a hypothetical physician with \(N_{i}=0\), that is, one who is practicing at a lone hospital. When all of the within-hospital networks are fully connected (as noted in the “Results” section, several of the trial hospital physician networks are fully connected), \(\beta _{3}\) largely reduces to the effect of the hospital-level participation rate on a physician’s likelihood of participation.
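As a reading aid, the effect-modification term can be written out explicitly. The expression below follows algebraically from the linear predictor of the model and is not an additional result of the study:

$$\begin{aligned} \frac{\partial \, {\textrm{logit}}(\pi _{il})}{\partial \, WY_{ac,i}}=\beta _{4}+\beta _{5}N_{i} \end{aligned}$$

so the association of across-hospital peer participation with the log-odds of participation is \(\beta _{4}\) when \(N_{i}=0\) and shifts by \(\beta _{5}\) for each additional non-primary hospital. Holding all other terms fixed, the odds ratio comparing \(WY_{ac,i}=1\) to \(WY_{ac,i}=0\) is \(\exp (\beta _{4}+\beta _{5}N_{i})\).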

The model in Eq. (1) is only well-defined for estimation if both \(WY_{wi,i}\) and \(WY_{ac,i}\) involve outcomes from prior time periods, so that no outcome entering the peer-exposure predictors of physician i belongs to an observation whose own peer-exposure predictors depend on \(Y_{i}\). This is the case for \(WY_{ac,i}\) as there was a clear order by which hospitals (and their physicians) at different steps of the trial were asked to participate. However, all physicians at the same hospital were invited to participate at the same time, and so \(Y_{i}\) contributes to \(WY_{wi,j}\) just as \(Y_{j}\) contributes to \(WY_{wi,i}\), leading to endogenous feedback (simultaneity) and inconsistent estimation [20,21,22]. Therefore, we excluded \(WY_{wi,i}\) from the theoretical model in (1) to obtain our final statistical model.
