Package 'ImputeLongiCovs' reference manual

Title:	Longitudinal Imputation of Categorical Variables via a Joint Transition Model
Description:	Imputation of longitudinal categorical covariates. We use a methodological framework which ensures that the plausibility of transitions is preserved, overfitting and collinearity issues are resolved, and confounders can be utilized. See Mamouris (2023) <doi:10.1002/sim.9919> for an overview.
Authors:	Pavlos Mamouris [aut, cre], Vahid Nassiri [aut, ctb], Geert Molenberghs [ctb], Geert Verbeke [ctb]
Maintainer:	Pavlos Mamouris <[email protected]>
License:	GPL-2
Version:	0.1.0
Built:	2025-03-22 03:58:38 UTC
Source:	https://github.com/cran/ImputeLongiCovs

Analyses Data for imputing categorical covariates

Description

A dataset containing longitudinal data. The outcome of interest is smoking status with three states (smoker, exsmoker, neversmoker), which are represented via transitions. It differs from 'initial_data' only in the additional prob_matrix column.

Usage

data(analyses_data)

Format

A data frame with 2000 rows and 10 variables

Details

patient_id: Unique identifier for each patient
tran_Year: numeric, starting from 1 up to the number of transitions
transition_year: text explanation of the transition
state_from: the state at the beginning of a transition
state_to: the state at the end of a transition
prob_matrix: the transition type ("initial", "forward", "backward", "intermittent", "observed"), generated from the initial data by create_probMatrix
cardio_state_from: cardiovascular disease at the beginning of the transition, binary, if 1 == Yes, else No
cardio_state_to: cardiovascular disease at the end of the transition, binary, if 1 == Yes, else No
flu_vaccination_state_from: flu vaccination at the beginning of the transition, binary, if 1 == Yes, else No
flu_vaccination_state_to: flu vaccination at the end of the transition, binary, if 1 == Yes, else No

create_probMatrix

Description

create_probMatrix adds a variable, "prob_matrix", that records the type of each transition ("initial", "forward", "backward", "intermittent", "observed").

Usage

create_probMatrix(input_data, patient_id)

Arguments

input_data

A dataset in a format similar to 'initial_data'. It must contain the variables "state_from", the status at the beginning of the transition (say, smoker in 2010); "state_to", the status at the end of the transition (say, ex-smoker in 2011); and "tran_Year", an integer index of the transition that runs from 1 to the total number of transitions. For example, "tran_Year" == 1 denotes the transition from 2010 to 2011 and "tran_Year" == 2 the transition from 2011 to 2012.

patient_id

A character string giving the name of the column that holds the unique patient ID
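
The column requirements and the year-to-index mapping described above can be sketched in base R. This is only an illustration: `check_transition_columns` is a hypothetical helper that is not part of the package, and the years 2010 to 2012 are taken from the example in the argument description.

```r
# Hypothetical helper (not part of ImputeLongiCovs): stop early if the
# columns that the documentation lists as required are absent.
check_transition_columns <- function(input_data, patient_id) {
  required <- c(patient_id, "state_from", "state_to", "tran_Year")
  missing_cols <- setdiff(required, names(input_data))
  if (length(missing_cols) > 0) {
    stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Map calendar years to transition indices:
# 2010 -> 2011 is tran_Year 1, 2011 -> 2012 is tran_Year 2.
years <- 2010:2012
transitions <- data.frame(
  tran_Year = seq_len(length(years) - 1),
  year_from = years[-length(years)],
  year_to   = years[-1]
)
print(transitions)
```

Three observed years therefore give two rows, one per transition.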

Value

a data frame containing the column "prob_matrix"

Examples

create_probMatrix(initial_data, patient_id = "patient_id")

impute_categorical_covariates

Description

impute_categorical_covariates imputes longitudinal categorical covariates through a joint model that accommodates initial, forward, backward, and intermittent transitions.

Usage

impute_categorical_covariates(
  input_data,
  patient_id,
  number_of_transitions,
  covariates_initial = NULL,
  covariates_transition = NULL,
  missing_variable_levels,
  startingyear = NULL,
  without_trans_prob,
  m = 1
)

Arguments

`input_data`	A dataset in a format similar to 'analyses_data'. It must contain the variables "state_from", the status at the beginning of the transition (say, smoker in 2010); "state_to", the status at the end of the transition (say, ex-smoker in 2011); and "tran_Year", an integer index of the transition that runs from 1 to the total number of transitions. For example, "tran_Year" == 1 denotes the transition from 2010 to 2011 and "tran_Year" == 2 the transition from 2011 to 2012. It must also contain "prob_matrix", which captures the type of each transition ("initial", "forward", "backward", "intermittent", "observed") and is calculated with the 'create_probMatrix' function.
`patient_id`	A character string giving the name of the column that holds the unique patient ID
`number_of_transitions`	The number of transitions needed. For example for years 2010, 2011 and 2012 there exist 2 transitions.
`covariates_initial`	The covariates to be used in the initial model
`covariates_transition`	The covariates to be used in the transition model
`missing_variable_levels`	The levels of the missing categorical outcome (e.g. "smoker", "ex-smoker", "never-smoker")
`startingyear`	If the starting year per patient has no missing values, specify it
`without_trans_prob`	Fallback used when the proportion of missing data is so high that the initial and transition models cannot converge. Two options are available: "notImpute", which returns NA, and "ImputeEqualProbabilities", which samples from the levels with equal probabilities.
`m`	Numeric, the number of imputed datasets
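
The two `without_trans_prob` options can be illustrated with a small base-R sketch. `fallback_impute` is a hypothetical stand-in written for this example, not the package's internal code, and it ignores the condition (model non-convergence) under which the package would actually use the fallback.

```r
set.seed(1)  # only to make the illustration reproducible

missing_variable_levels <- c("never-smoker", "smoker", "ex-smoker")
states <- c("smoker", NA, "ex-smoker", NA)

# Hypothetical helper: mimic the two documented fallback options.
fallback_impute <- function(x, levels, without_trans_prob) {
  if (without_trans_prob == "notImpute") {
    return(x)  # missing values stay NA
  }
  if (without_trans_prob == "ImputeEqualProbabilities") {
    is_missing <- is.na(x)
    # sample() draws each level with equal probability by default
    x[is_missing] <- sample(levels, size = sum(is_missing), replace = TRUE)
    return(x)
  }
  stop("Unknown option: ", without_trans_prob)
}

kept_na <- fallback_impute(states, missing_variable_levels, "notImpute")
filled  <- fallback_impute(states, missing_variable_levels, "ImputeEqualProbabilities")
```

Observed entries are left untouched in both cases; only the missing positions differ.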

Details

The function wraps three internal functions. 'initial_forward_function' imputes the longitudinal categorical covariate for transitions whose 'prob_matrix' value is 'initial' or 'forward', 'imputeIntermittent' imputes it for intermittent transitions, and 'backward_function' imputes it for backward transitions.
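
The routing described above can be pictured as a lookup from each 'prob_matrix' label to the routine that handles it. The sketch below uses invented toy rows and only records routine names as strings; it does not call the package. Treating "observed" rows as needing no imputation is an assumption.

```r
# Invented toy rows, one per transition type.
toy <- data.frame(
  patient_id  = 1:5,
  prob_matrix = c("initial", "forward", "intermittent", "backward", "observed"),
  stringsAsFactors = FALSE
)

# Label-to-routine lookup following the Details paragraph.
routine_for <- c(
  initial      = "initial_forward_function",
  forward      = "initial_forward_function",
  intermittent = "imputeIntermittent",
  backward     = "backward_function",
  observed     = NA  # assumption: nothing to impute for observed transitions
)

toy$routine <- unname(routine_for[toy$prob_matrix])
print(toy)
```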

Value

a list of m data frames with no missing values in the categorical outcome
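
Because the result is a list of m data frames, it is usually processed with `lapply`-style functions. The sketch below uses a hand-made stand-in for the returned list (the values are invented) to show one way to check completeness and to average state proportions across imputations. This is a simple descriptive summary, not a full multiple-imputation pooling procedure.

```r
# Stand-in for the return value with m = 2; values are invented.
imputed <- list(
  data.frame(patient_id = c(1, 2), state_to = c("smoker", "ex-smoker"),
             stringsAsFactors = FALSE),
  data.frame(patient_id = c(1, 2), state_to = c("smoker", "smoker"),
             stringsAsFactors = FALSE)
)

# Confirm that no imputed data set has a missing outcome.
no_missing <- vapply(imputed, function(d) !anyNA(d$state_to), logical(1))

# Proportion of each state within every imputed data set (levels x m matrix).
levels_all <- c("never-smoker", "smoker", "ex-smoker")
props <- vapply(
  imputed,
  function(d) as.numeric(prop.table(table(factor(d$state_to, levels = levels_all)))),
  numeric(length(levels_all))
)
rownames(props) <- levels_all

# Average the proportions over the m imputations.
pooled <- rowMeans(props)
print(pooled)
```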

Examples

impute_categorical_covariates(
  analyses_data,
  patient_id = "patient_id",
  number_of_transitions = 2,
  covariates_initial = c("cardio_state_from", "flu_vaccination_state_from"),
  covariates_transition = c("cardio_state_to", "flu_vaccination_state_to"),
  missing_variable_levels = c("never-smoker", "smoker", "ex-smoker"),
  startingyear = NULL,
  without_trans_prob = "notImpute",
  m = 2
)

Initial Data for imputing categorical covariates

Description

A dataset containing longitudinal data. The outcome of interest is the smoking status with three states (smoker, exsmoker, neversmoker), which are represented via transitions.

Usage

data(initial_data)

Format

A data frame with 2000 rows and 9 variables

Details

patient_id: Unique identifier for each patient
tran_Year: numeric, starting from 1 up to the number of transitions
transition_year: text explanation of the transition
state_from: the state at the beginning of a transition
state_to: the state at the end of a transition
cardio_state_from: cardiovascular disease at the beginning of the transition, binary, if 1 == Yes, else No
cardio_state_to: cardiovascular disease at the end of the transition, binary, if 1 == Yes, else No
flu_vaccination_state_from: flu vaccination at the beginning of the transition, binary, if 1 == Yes, else No
flu_vaccination_state_to: flu vaccination at the end of the transition, binary, if 1 == Yes, else No
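
To make the layout concrete, the following base-R sketch builds a small data frame with the same nine columns. Every value, including the wording of `transition_year`, is invented for illustration and is not taken from 'initial_data'.

```r
# Toy rows with the documented column layout; all values are invented.
toy_initial <- data.frame(
  patient_id = c(1, 1, 2, 2),
  tran_Year = c(1, 2, 1, 2),
  transition_year = c("2010 to 2011", "2011 to 2012",
                      "2010 to 2011", "2011 to 2012"),
  state_from = c("smoker", "smoker", NA, NA),
  state_to = c("smoker", NA, NA, "ex-smoker"),
  cardio_state_from = c(0, 0, 1, 1),
  cardio_state_to = c(0, 0, 1, 1),
  flu_vaccination_state_from = c(1, 1, 0, 1),
  flu_vaccination_state_to = c(1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Count the missing smoking states that an imputation would need to fill.
n_missing <- colSums(is.na(toy_initial[c("state_from", "state_to")]))
print(n_missing)
```

Within each patient, the `state_to` of one transition matches the `state_from` of the next, which is what a long-format transition layout implies.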
