SRD: Sparse ramp discrimination for classification and variable selection on high-dimensional biological data

Xin Zhou; Zuoheng Wang; Xin Zhou; Zuoheng Wang

doi:10.48130/stati-0025-0001

A massive amount of biological data generates a high volume of information. However, high-dimensional and noisy data also present significant challenges for data analysis, particularly in pattern classification. Many large-margin classification methods follow a regularization framework with the 'loss + penalty' structure. By incorporating LASSO or elastic-net penalties, we may perform classification and variable selection simultaneously. Most large margin methods rely on convex loss functions, which are computationally advantageous but provide poor approximations of the 0-1 loss due to their unbounded nature. In contrast, non-convex loss functions offer better approximations and greater robustness to outliers. We propose the sparse ramp discrimination (SRD) method, which integrates the non-convex smoothed ramp loss function with the elastic-net penalty. We apply the difference of the convex (d.c.) algorithm to efficiently solve the non-convex optimization problem through a sequence of convex subproblems. The robustness of SRD makes it well-suited for noisy, high-dimensional biological data. The performance of the proposed SRD is illustrated through simulation studies and on a colon cancer involving high throughput biological datasets.

HTML

Introduction

The rapid advancement of high throughput technologies has enabled the collection of diverse omics data generated through whole genome tiling arrays, mass spectrometry, and more recently next-generation sequencing platforms. While these massive data provide us with vast information, they also pose significant challenges for data analysis. One major issue is the high dimensionality of the data—for instance, microarray gene expression data can measure tens of thousands of genes simultaneously—coupled with the typically small sample sizes. This 'curse of dimensionality' makes it difficult to accurately estimate model parameters. Additionally, the inherent noise in biological data and the uncertainty in the target response often lead to overfitting, further complicating analysis.

Classification plays a pivotal role in biomedical and bioinformatics research^[1,2]. It involves learning a function from training data to predict the class label of any valid input covariates. The support vector machine (SVM) is one of the commonly used classification methods^[3−5]. However, SVMs cannot perform variable selection simultaneously, which is essential for analyzing high-dimensional data. Zhu et al.^[6] proposed the ℓ₁-norm SVM, which incorporates the LASSO penalty^[7] for automatic variable selection. Wang et al.^[8] applied the elastic-net penalty^[9] to SVMs for variable selection on high dimensional data. The elastic-net penalty exhibits a 'grouping effect', which tends to select or remove highly correlated covariates together.

Many large margin classification algorithms, including SVMs, are in the regularization framework, typically following the 'loss + penalty' format. In addition to the LASSO and elastic-net penalties, other penalties, such as ℓ_1/2^[10] and SCAD^[11], have also been applied to pattern classification in high-throughput biological data analysis.

In this work, we focus on the loss function. For binary classification problems, the most straightforward loss function is the 0-1 loss. However, it is well known that minimizing the 0-1 loss is NP-hard. As a result, surrogate loss functions are commonly used instead. Generally, loss functions can be categorized into two types: convex and non-convex.

Convex loss functions, such as the hinge loss used in SVMs, are computationally advantageous due to the convexity. However, the unbounded property of the convex function often leads to poor approximations to the 0-1 loss function. To address this limitation, various non-convex loss functions have been proposed, including the Ψ loss^[12], the ramp loss^[13,14], and the sigmoid loss^[15,16]. Non-convex loss functions, such as the ramp loss, are particularly robust to outliers^[14]. In contrast, convex loss functions can be sensitive to outliers, making the corresponding classifiers prone to being dominated by outliers.

The robustness of non-convex loss functions makes them well-suited for handling noisy, high dimensional biological data. However, non-convex optimization poses significant challenges. In this work, we use a smoothed version of the ramp loss function, and propose applying the difference of convex (d.c.) algorithm to efficiently solve the non-convex optimization problem through a sequence of convex subproblems. We refer to the proposed method as Sparse Ramp Discrimination (SRD).

Simulations

In this section, we conducted simulation studies to evaluate the performance of the proposed SRD method. The training data consisted of 100 subjects (50 cases and 50 controls), where each input x was a d = 500-dimensional vector. Independent tuning and testing data were generated in the same way as the training data. The sample sizes for the tuning and testing data were 100 and 10,000, respectively. The tuning data were used to select the optimal tuning parameters via a grid search, and the testing data were used to evaluate the accuracy of various classification methods.

We considered two scenarios similar to those in Wang et al.^[25]. In the first scenario, all covariates are independent. The '+' class follows a normal distribution with a mean of:

$ \mu_{+} = \left( \underbrace{0.5{{,}} \cdots{{,}} 0.5}_{10}{{,}} \underbrace{0{{,}} \cdots{{,}} 0}_{490} \right)^T{{}} $

and a covariance matrix $ \mathbf{\Sigma }={\mathbf{I}}_{d\times d} $. The '−' class follows a similar distribution except that the mean is:

$ \mu_{-} = \left( \underbrace{-0.5{{,}} \cdots{{,}} -0.5}_{10}{{,}} \underbrace{0{{,}} \cdots{{,}} 0}_{490} \right)^T $

The Bayes optimal classification rule depends solely on the first 10 covariates, with a Bayes error of 5.69%. The second scenario represents an example where the relevant covariates are correlated. The '+' class follows a normal distribution with a mean of:

$ \mu_{+} = \left( \underbrace{1{{,}} \cdots{{,}} 1}_{10}{{,}} \underbrace{0{{,}} \cdots{{,}} 0}_{490} \right)^T{{}} $

and a covariance matrix:

$ \mathbf{\Sigma }=\left(\begin{array}{cc}{\mathbf{\Sigma }}_{10\times 10}^{\mathrm{*}}& {0}_{10\times 490}\\ {0}_{490\times 10}& {\mathbf{I}}_{490\times 490}\end{array}\right){{}} $

where, the diagonal elements of $ {\mathbf{\Sigma }}^{\mathrm{*}} $ are 1 and all off-diagonal elements are 0.8. The '−' class has a similar distribution except that the mean is:

$ \mu_{-} = \left( \underbrace{-1{{,}} \cdots{{,}} -1}_{10}{{,}} \underbrace{0{{,}} \cdots{{,}} 0}_{490} \right)^T $

The Bayes optimal classification rule depends only on the first 10 highly correlated covariates, and its Bayes error is 13.47%.

We compared the standard SVM, hybrid huberized SVM (HHSVM)^[25], and the proposed SRD method. HHSVM uses a smoothed hinge loss, called huberized hinge loss, combined with an elastic-net penalty. SVM was implemented using the e1071 R package, and the R package GCDnet^[22] was used for HHSVM.

We used the tuning dataset to select the tuning parameters for each method. For SVM, we searched over a pre-specified set of tuning parameter values for C, and selected the one that minimized the prediction error on the tuning data. For HHSVM and SRD, we chose a pre-specified set of λ₂ values, and adaptively determined a fine grid of λ₁ values using the procedure described in the methods, with τ = 0.01 and K = 98. The best pair (λ₁, λ₂) was selected based on the smallest prediction error on the tuning data. The performance of the selected models was then evaluated on independent testing data. Each experiment was repeated 1,000 times.

The mean prediction errors and corresponding standard deviations (in parentheses) are summarized in Table 1 (first two rows). As shown in Table 1, the prediction errors of SRD and HHSVM are similar, and outperform those of SVM in both scenarios. This improvement is attributed to the variable selection mechanism employed by SRD and HHSVM. Unlike SVM, which uses all covariates and is thereby affected by noise covariates, SRD and HHSVM are more robust due to their ability to select relevant features. The similar performance of SRD and HHSVM in scenario 2 highlights that SRD effectively identifies all correlated relevant covariates, a result of the 'grouping effect' provided by the elastic-net penalty, as in HHSVM.

Table 1. Mean (std) of prediction errors evaluated on independent test data for two simulation scenarios.

		SVM	HHSVM	SRD
perc = 0%	Scenario I	0.230_(0.014)	0.084_(0.015)	0.083_(0.015)
perc = 0%	Scenario II	0.171_(0.013)	0.146_(0.010)	0.144_(0.009)
perc = 10%	Scenario I	0.285_(0.021)	0.131_(0.032)	0.112_(0.023)
perc = 10%	Scenario II	0.204_(0.025)	0.154_(0.014)	0.149_(0.012)
perc = 20%	Scenario I	0.340_(0.025)	0.216_(0.056)	0.172_(0.040)
perc = 20%	Scenario II	0.253_(0.039)	0.174_(0.033)	0.153_(0.015)
The number of training samples is 100, the number of total covariates is 500, and the number of relevant covariates is 10. The reported results are the average test errors over 1,000 repetitions on a test set of 10,000 samples, with the values in parentheses indicating the corresponding standard deviations. In scenario I, the covariates are independent, while in scenario II, the relevant covariates are highly correlated. perc: describes the percent of training samples randomly selected to have their class labels flipped. Specifically, perc = 0% indicates no perturbation in the training data, perc = 10% corresponds to a 10% perturbation, and perc = 20% corresponds to a 20% perturbation.

The computation of SRD is higher than that of HHSVM due to the nature of non-convex optimization. At our computational facility, HHSVM took approximately 26.6 min for scenario I with 1,000 repeats. In contrast, with the same number of pre-specified (λ₁, λ₂) pairs for tuning, SRD took approximately 145.2 min for 1,000 repeats.

We perturbed the training data by randomly selecting perc(10% or 20%) of the samples and flipping their class labels to the opposite. We repeated the above testing procedure on the perturbed training data. The results are also presented in Table 1. The performance of all methods deteriorated due to the perturbation. However, SRD was the least affected, as expected. This supports our claim that the non-convex smoothed ramp loss enhances the robustness of the model compared to convex loss functions, leading to more accurate classifiers in the presence of outliers in the training data.

We also compared the covariates selected by HHSVM and SRD (SVM retains all covariates). We consider q_signal, the number of selected relevant covariates, and q_noise, the number of selected noise covariates. The results are presented in Table 2. When there is no perturbation in the training data, both HHSVM and SRD perform similarly. However, in the presence of perturbation, SRD tends to select more relevant covariates. Despite this, both methods also select a significant number of noise covariates. One way to further remove the noise covariates is to use tuning methods that favor more parsimonious models. For example, applying the 'one-standard error' rule^[26] during cross-validation for parameter tuning has proven effective in screening out noise.

Table 2. Comparison of variable selection.

		HHSVM		SRD
		q_signal	q_noise	q_signal	q_noise
perc = 0%	Scenario I	9.8_(0.6)	28.0_(38.0)	9.7_(0.6)	17.5_(30.9)
perc = 0%	Scenario II	6.0_(3.4)	50.8_(93.7)	6.5_(3.4)	53.7_(94.6)
perc = 10%	Scenario I	8.8_(1.3)	22.0_(29.1)	9.2_(1.0)	17.0_(21.0)
perc = 10%	Scenario II	6.9_(3.1)	41.9_(86.6)	7.5_(2.9)	45.5_(82.9)
perc = 20%	Scenario I	7.0_(2.0)	36.9_(56.3)	7.7_(1.8)	28.5_(36.7)
perc = 20%	Scenario II	6.3_(3.2)	41.2_(83.1)	7.5_(3.0)	31.4_(71.3)
The setups are the same as those described in Table 1. q_signal is the number of selected relevant covariates, and q_noise is the number of selected noise covariates. The results are averages over 1000 repetitions, and the numbers in parentheses are the standard deviations.

Data analysis

In this section, we investigated the performance of SVM, HHSVM, and our proposed SRD method using microarray gene expression data from a colon cancer study^[2]. This dataset includes 62 samples, consisting of 40 colon cancer tumors and 22 normal tissues. Each sample contains expression measurements for 2,000 genes obtained using an Affymetrix gene chip. The data are publicly available.

We pre-processed the data following the procedure described by Dudoit et al.^[27]. Microarray gene expression levels below the minimum threshold of 100 were set to 100, and the maximum threshold was capped at 16,000. We excluded genes with max/min ≤ 5 and (max − min) ≤ 500, where max and min denote the maximum and minimum expression levels of a gene across samples. The expression levels were then logarithmically transformed. After thresholding, filtering, and logarithmic transformation, the data were standardized to have a zero mean and unit standard deviation for each gene. The pre-processed data (p = 1,224) were used for all subsequent analysis. It has been reported that five samples (three tumors and two normal tissues) were contaminated^[28], making this dataset particularly suitable for assessing the robustness of the proposed SRD method.

We compared the performance of SVM, HHSVM, and SRD. All methods involve at least one tuning parameter. We used 10-fold cross-validation to select the optimal parameters. Since the sample size is small, the cross-validation error is known to have high variance. To mitigate this, we repeated the cross-validation procedure ten times and averaged the cross-validation errors to reduce the variance. We selected the parameter with the smallest average cross-validation error. This tuning strategy is referred to as the optimal rule in this article. For HHSVM and SRD, we aimed to choose a parsimonious model if its prediction performance was not compromised too much. A common approach, the 'one-standard error' rule^[26] is used in cross-validation to select the most parsimonious model whose prediction error is no more than one standard error above the optimal error. In this work, we used a slightly different way to compute the standard error to incorporate into the repeated cross-validation procedure. That is, the standard error was computed through the individual prediction error rates for each of the ten repetitions. Here, we applied both tuning strategies: the optimal rule and the 'one-standard error' rule.

To evaluate classification performance, we conducted a nested 10-fold cross-validation procedure^[29]. The data were randomly partitioned into ten roughly equal-sized folds. In each iteration, nine folds were used as training data, where an internal cross-validation procedure was performed to tune the model parameters. The classification model was then trained on the full training set using the selected parameters and applied to predict class labels for the left-out fold. This process was repeated nine more times to obtain predictions for all folds, yielding the overall prediction error. For a reliable evaluation of prediction accuracy, we repeated the outer cross-validation procedure 100 times with different fold partitions. The average and standard deviation of these 100 prediction errors are reported in Table 3. During the cross-validation procedure with 100 repetitions, we trained a total of 1,000 classification models for each method. For HHSVM and SRD, the average number of selected genes is also reported in Table 3.

Table 3. Results on the colon cancer dataset.

	Optimal rule		'One-standard error' rule
	Test error	No. of genes	Test error	No. of genes
SVM	12.3%_(1.7%)	1,224
HHSVM	13.6%_(2.1%)	86.2_(100.1)	13.7%_(2.0%)	60.5_(78.5)
SRD	11.9%_(1.9%)	219.0_(136.4)	12.3%_(2.2%)	140.4_(100.8)
Test errors are averages of 100 cross validation repetitions. The numbers of genes are averages of selected genes in 1,000 classification models. The numbers in parentheses are the corresponding standard deviations.

We can see that SRD seems to have a better classification accuracy than HHSVM. It confirms again the robustness of SRD in classification and variable selection when there exist outliers in the training data. Interestingly, SVM performed similarly with SRD although SVM used all covariates. The reason perhaps is that the gene expression levels are highly correlated for many genes. The correlation enhances the signal. In the simulation studies, both scenarios have ten relevant covariates, and the only difference is that in scenario II the relevant covariates are highly correlated. SVM performed much better in scenario II than in scenario I.

We used two tuning strategies. From Table 3, the 'one-standard error' rule is effective to select a parsimonious model without sacrificing too much on the prediction accuracy. We investigated the genes selected in the 1,000 SRD models determined by the 'one-standard error' rule during the cross-validation procedure. The top 20 most frequently selected genes are listed in Table 4.

Table 4. The 20 most frequently selected genes by SRD with the 'one-standard error' rule from the colon cancer dataset.

Accession	Gene name	Freq.
R87126	Myosin, heavy chain 9, non-muscle (MYH9)	990
M76378	Cysteine and glycine-rich protein 1 (CSRP1)	989
J02854	Myosin, light chain 9, regulatory (MYL9)	988
M63391	Desmin (DES)	983
X12671	Heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1)	983
Z50753	Guanylate cyclase activator 2B (GUCA2B)	981
R36977	General transcription factor IIIA	973
T47377	S100P Protein	972
J03040	Secreted protein, acidic, cysteine-rich (SPARC)	970
X14958	High mobility group AT-hook 1 (HMGA1)	967
T51571	P24480 Calgizzarin	967
H43887	Complement factor D precursor	966
L07648	MAX interactor 1, dimerization protein (MXI1)	957
M22382	Heat shock 60kDa protein 1 (HSPD1)	955
H64489	Tetraspanin 1 (TSPAN1)	945
H20709	Myosin, light chain 6, alkali (MYL6)	942
H40095	Macrophage migration inhibitory factor	935
R84411	Small nuclear ribonucleoprotein associated proteins B	929
T51023	Heat shock protein 90kDa alpha (cytosolic)	922
T71025	Metallothionein 1G (MT1G)	916

The molecular carcinogenesis of colorectal cancer is complex and poorly understood. Some comments about the top frequently selected genes are worthy of mention. MYH9 and its regulatory MYL9 are components of the cytoskeletal network involved in actomyosin-based contractility and implicated in the invasive behavior of tumor cells^[30]. hnRNP A1, a major member of the hnRNP family, binds to G-rich repetitive sequences and quadruplex structures in DNA. The overexpression of hnRNP A1 in colorectal cancer cells leads to evasion of cancer cell apoptosis^[31]. Uroguanylin, encoded by GUCA2B, is markedly down-regulated in adenocarcinomas of the colon, and the treatment with uroguanylin leads to induction of apoptosis in human colon carcinoma cells in vitro^[32]. S100P overexpressed in colorectal carcinomatous tissues, and the overexpression is significantly correlated with tumor metastasis, advanced clinical stage, and recurrence^[33].

[1]	Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, et al. 1999. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531−37 doi: 10.1126/science.286.5439.531 CrossRef Google Scholar
[2]	Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, et al. 1999. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences of the United States of America 96:6745−50 doi: 10.1073/pnas.96.12.6745 CrossRef Google Scholar
[3]	Brown MP, Grundy WN, Lin D, Cristianini N, Sugnet CW, et al. 2000. Knowledge-based analysis of microarray gene expression data by using support vector machines. Proceedings of the National Academy of Sciences of the United States of America 97:262−67 doi: 10.1073/pnas.97.1.262 CrossRef Google Scholar
[4]	Furey TS, Cristianini N, Duffy N, Bednarski DW, Schummer M, et al. 2000. Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics 16:906−14 doi: 10.1093/bioinformatics/16.10.906 CrossRef Google Scholar
[5]	Guyon I, Weston J, Barnhill S, Vapnik V. 2002. Gene selection for cancer classification using support vector machines. Machine learning 46:389−422 doi: 10.1023/A:1012487302797 CrossRef Google Scholar
[6]	Zhu J, Rosset S, Hastie T, Tibshirani R, Zhu J, et al. 2003. 1-norm support vector machines. Proceedings of the 17 ^th International Conference on Neural Information Processing Systems, 9−11 December 2003, Whistler, British Columbia, Canada. Cambridge, MA, United States: MIT Press. pp. 49−56. https://dl.acm.org/doi/10.5555/2981345.2981352
[7]	Tibshirani R. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58:267−88 doi: 10.1111/j.2517-6161.1996.tb02080.x CrossRef Google Scholar
[8]	Wang L, Zhu J, Zou H. 2006. The doubly regularized support vector machine. Statistica Sinica 16:589−615 Google Scholar
[9]	Zou H, Hastie T. 2005. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B: Statistical Methodology 67:301−20 doi: 10.1111/j.1467-9868.2005.00503.x CrossRef Google Scholar
[10]	Liang Y, Liu C, Luan XZ, Leung KS, Chan TM, et al. 2013. Sparse logistic regression with a L_1/2 penalty for gene selection in cancer classification. BMC Bioinformatics 14:198 doi: 10.1186/1471-2105-14-198 CrossRef Google Scholar
[11]	Zhang HH, Ahn J, Lin X, Park C. 2006. Gene selection using support vector machines with non-convex penalty. Bioinformatics 22:88−95 doi: 10.1093/bioinformatics/bti736 CrossRef Google Scholar
[12]	Shen X, Tseng GC, Zhang X, Wong WH. 2003. On ψ-learning. Journal of the American Statistical Association 98:724−34 doi: 10.1198/016214503000000639 CrossRef Google Scholar
[13]	Collobert R, Sinz F, Weston J, Bottou L. 2006. Trading convexity for scalability. Proceedings of the 23 ^rd international conference on Machine learning, 25−29 June 2006, Pittsburgh, Pennsylvania, USA. New York, USA: ACM. pp. 201−8. doi: 10.1145/1143844.1143870
[14]	Wu Y, Liu Y. 2007. Robust truncated hinge loss support vector machines. Journal of the American Statistical Association 102:974−83 doi: 10.1198/016214507000000617 CrossRef Google Scholar
[15]	Mason L, Baxter J, Bartlett P, Frean M. 1999. Boosting algorithms as gradient descent. Advances in neural information processing systems. Cambridge, MA, United States: MIT Press. pp. 512−18. https://dl.acm.org/doi/10.5555/3009657.3009730
[16]	Bartlett PL, Jordan MI, McAuliffe JD. 2006. Convexity, classification, and risk bounds. Journal of the American Statistical Association 101:138−56 doi: 10.1198/016214505000000907 CrossRef Google Scholar
[17]	Zhou X, Tuck DP. 2007. MSVM-RFE: extensions of SVM-RFE for multiclass gene selection on DNA microarray data. Bioinformatics 23:1106−14 doi: 10.1093/bioinformatics/btm036 CrossRef Google Scholar
[18]	Fan J, Li R. 2001. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96:1348−60 doi: 10.1198/016214501753382273 CrossRef Google Scholar
[19]	Thi Hoai An L, Dinh Tao P. 1997. Solving a class of linearly constrained indefinite quadratic problems by DC algorithms. Journal of global optimization 11:253−85 doi: 10.1023/A:1008288411710 CrossRef Google Scholar
[20]	Yuille AL, Rangarajan A. 2003. The concave-convex procedure. Neural computation 15:915−36 doi: 10.1162/08997660360581958 CrossRef Google Scholar
[21]	Friedman J, Hastie T, Tibshirani R. 2010. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33:1−22 Google Scholar
[22]	Yang Y, Zou H. 2013. An efficient algorithm for computing the HHSVM and its generalizations. Journal of Computational and Graphical Statistics 22:396−415 doi: 10.1080/10618600.2012.680324 CrossRef Google Scholar
[23]	Gong P, Zhang C, Lu Z, Huang J, Ye J. 2013. A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. In international conference on machine learning 28:37−45 Google Scholar
[24]	Boyd S, Vandenberghe L. 2004. Convex optimization. Cambridge, UK: Cambridge University Press. doi: 10.1017/cbo9780511804441
[25]	Wang L, Zhu J, Zou H. 2008. Hybrid huberized support vector machines for microarray classification and gene selection. Bioinformatics 24:412−19 doi: 10.1093/bioinformatics/btm579 CrossRef Google Scholar
[26]	Hastie T, Tibshirani R, Friedman J. 2009. The elements of statistical learning: data mining, inference, and prediction. New York, NY: Springer. doi: 10.1007/978-0-387-84858-7 Google Scholar
[27]	Dudoit S, Fridlyand J, Speed TP. 2002. Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association 97:77−87 doi: 10.1198/016214502753479248 CrossRef Google Scholar
[28]	Li L, Weinberg CR, Darden TA, Pedersen LG. 2001. Gene selection for sample classification based on gene expression data: study of sensitivity to choice of parameters of the GA/KNN method. Bioinformatics 17:1131−42 doi: 10.1093/bioinformatics/17.12.1131 CrossRef Google Scholar
[29]	Ambroise C, McLachlan GJ. 2002. Selection bias in gene extraction on the basis of microarray gene-expression data. Proceedings of the National Academy of Sciences of the United States of America 99:6562−66 doi: 10.1073/pnas.102102699 CrossRef Google Scholar
[30]	Betapudi V, Licate LS, Egelhoff TT. 2006. Distinct roles of nonmuscle myosin II isoforms in the regulation of MDA-MB-231 breast cancer cell spreading and migration. Cancer Research 66:4725−33 doi: 10.1158/0008-5472.CAN-05-4236 CrossRef Google Scholar
[31]	Fujiya M, Konishi H, Mohamed Kamel MK, Ueno N, Inaba Y, et al. 2014. microRNA-18a induces apoptosis in colon cancer cells via the autophagolysosomal degradation of oncogenic heterogeneous nuclear ribonucleoprotein A1. Oncogene 33:4847−56 doi: 10.1038/onc.2013.429 CrossRef Google Scholar
[32]	Shailubhai K, Yu HH, Karunanandaa K, Wang JY, Eber SL, et al. 2000. Uroguanylin treatment suppresses polyp formation in the Apc(Min/+) mouse and induces apoptosis in human colon adenocarcinoma cells via cyclic GMP. Cancer Research 60:5151−57 Google Scholar
[33]	Dong L, Wang F, Yin X, Chen L, Li G, et al. 2014. Overexpression of S100P promotes colorectal cancer metastasis and decreases chemosensitivity to 5-FU in vitro. Molecular and cellular biochemistry 389:257−64 doi: 10.1007/s11010-013-1947-5 CrossRef Google Scholar

{{lists.name}}

SRD: Sparse ramp discrimination for classification and variable selection on high-dimensional biological data

Abstract

Rights and permissions

References

About this article

Cite this article

Article Metrics

Access History

Other Articles By Authors