Specifically, we consider a semiparametric logistic model in which h is a nonparametric function of 5 genes within the cell growth pathway. The details of the estimation procedure are provided in the Methods section. In summary, we fit this model using the kernel machine method via the logistic mixed model representation, using the Gaussian kernel function in estimating h. Under the mixed model representation, we estimated the regression coefficients and h using penalized quasi-likelihood, and estimated the smoothing parameter and the Gaussian kernel scale parameter simultaneously by treating them as variance components. The results are presented in Table 1.

The test for the cell growth pathway effect on prostate cancer status, H0: h = 0 vs. H1: h ≠ 0, was conducted using the proposed score test as described in the Methods section. For comparison, we also conducted the global test proposed by Goeman et al., which assumes a linear pathway effect. Note that our test allows a nonlinear pathway effect and gene-gene interactions. Table 1 gives the p-values for both tests. The p-value of our test suggests that the cell growth pathway has a highly significant effect on disease status, while the test of Goeman et al. indicates only marginal significance of the growth pathway effect.

Simulation Study for the Parameter Estimates

We conducted a simulation study to evaluate the performance of the parameter estimates of the proposed logistic kernel machine regression using the logistic mixed model formulation. We considered the following model. The simulation results are shown in Table 2.

Due to the multi-dimensional nature of the variables z, it is difficult to visualize the fitted curve h. We hence summarized the goodness of fit of h in the following way. For each simulated data set, we regressed the true h on the fitted h, both evaluated at the design points. We then empirically summarized the goodness of fit of h by calculating the average intercepts, slopes and R² values obtained from these regressions over the 300 simulations. If the kernel machine method fits the nonparametric function well, we would expect the intercept to be close to 0, the slope close to 1, and R² also close to 1. Our results show that even when the sample size is as low as 100, estimation of the regression coefficient and the nonparametric function has only small bias.
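The goodness-of-fit summary described above (regress the true h on the fitted h at the design points, then average the intercept, slope and R² over simulations) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the code used in the study; the function names `regress_true_on_fitted` and `summarize_fit` are hypothetical, and ordinary least squares is assumed for the regression.

```python
import numpy as np


def regress_true_on_fitted(h_true, h_fit):
    """Least-squares fit of h_true = a + b * h_fit.

    Returns (intercept, slope, R^2). Hypothetical helper for illustration.
    """
    h_true = np.asarray(h_true, dtype=float)
    h_fit = np.asarray(h_fit, dtype=float)
    # Design matrix with an intercept column and the fitted values.
    X = np.column_stack([np.ones_like(h_fit), h_fit])
    coef, *_ = np.linalg.lstsq(X, h_true, rcond=None)
    resid = h_true - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((h_true - h_true.mean()) ** 2).sum())
    return float(coef[0]), float(coef[1]), 1.0 - ss_res / ss_tot


def summarize_fit(pairs):
    """Average intercept, slope and R^2 over (h_true, h_fit) pairs,
    one pair per simulated data set."""
    stats = np.array([regress_true_on_fitted(t, f) for t, f in pairs])
    return stats.mean(axis=0)
```

In use, `pairs` would hold one `(h_true, h_fit)` pair per simulated data set, both evaluated at that data set's design points. A fit that recovers h well yields averages near (0, 1, 1).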
When the kernel scale parameter is estimated, the biases tend to be smaller than when it is held fixed. As the sample size increases, the estimates of the regression coefficient and h move closer to the true values, especially when the scale parameter is estimated, while some bias remains when it is fixed at values farther away from the estimated one. Table 3 compares the estimated standard errors of the regression coefficient estimates with the empirical standard errors. Our results show that the two agree well when the scale parameter is estimated.
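The comparison of estimated and empirical standard errors can be sketched as below. The text does not define the empirical standard error, so this sketch assumes the usual convention: the sample standard deviation of the coefficient estimates across simulated data sets, set against the average of the model-based standard errors. The function name `compare_standard_errors` is hypothetical.

```python
import numpy as np


def compare_standard_errors(estimates, estimated_ses):
    """Compare model-based and empirical standard errors across simulations.

    estimates, estimated_ses: arrays of shape (n_sim,) or (n_sim, n_coef),
    one row per simulated data set. Assumed convention, not from the source:
    empirical SE = sample standard deviation of the estimates.
    """
    estimates = np.asarray(estimates, dtype=float)
    estimated_ses = np.asarray(estimated_ses, dtype=float)
    # Spread of the point estimates across simulation runs.
    empirical_se = estimates.std(axis=0, ddof=1)
    # Average of the standard errors reported by each fit.
    mean_estimated_se = estimated_ses.mean(axis=0)
    # A ratio near 1 indicates that the two agree.
    return empirical_se, mean_estimated_se, mean_estimated_se / empirical_se
```

A ratio well below 1 would suggest that the model-based standard errors understate the sampling variability, and a ratio well above 1 the reverse.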