
Identification of Significant Genes and Pathways Related to Lung Cancer via Statistical Methods

In the 21st century, cancer research, which draws on biology, genetics, cytology and statistics, remains a very active field. Since the last century, many researchers have worked in this field through clinical observation and theoretical deduction. Among these lines of research, genetic approaches, a relatively new way of studying the causes and prevention of cancer, have begun to show their potential. Over the past decades, large-scale research projects have been launched, but they have faced certain challenges.

Researchers often have to deal with tens of thousands of genes but only a relatively small number of patient cases, a dilemma referred to as the "curse of dimensionality": the data are sparse in a high-dimensional space, which makes it hard to fit reliable models. To address this, the study used the p-values of individual genes as input to pathway enrichment in order to find statistically significant pathways. The aim was to identify genes and biological pathways significantly related to lung cancer using statistical methods and pathway enrichment analysis.
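As a rough illustration of how per-gene significance can feed a pathway-level test, the sketch below computes an over-representation p-value from the hypergeometric distribution. This is an assumption made for illustration: the summary does not say which enrichment statistic the study used, and the function name and the gene counts in the usage line are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_significant, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    n_universe    : number of genes tested in total
    n_pathway     : number of those genes annotated to the pathway
    n_significant : number of genes called significant (e.g. p < 0.05)
    n_overlap     : significant genes that fall inside the pathway
    """
    total = comb(n_universe, n_significant)
    upper = min(n_pathway, n_significant)
    # Sum the probability of seeing n_overlap or more pathway genes
    # among the significant ones purely by chance.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_significant - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 genes tested, 50 in the pathway,
# 400 significant genes, 8 of which lie in the pathway.
p = hypergeom_enrichment_p(20000, 50, 400, 8)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value here means the pathway contains more significant genes than chance alone would explain. With many pathways tested, a multiple-testing correction would normally follow.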

The dataset was retrieved from Gene Set Enrichment Analysis (GSEA), which collected the data in collaboration with the National Cancer Institute, the National Institutes of Health, the National Institute of General Medical Sciences, and others. Normalized RNA sequencing data from 868 lung cancer patients were analyzed together with the patients' clinical observations. Two clinical variables were chosen as regression outcomes: recurrence-free survival (RFS) and whether the patient had a new tumor event after initial treatment (cancer recurrence). Two main statistical methods, linear regression and logistic regression, were used, along with gene set pathway enrichment analysis.
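To make the per-gene testing idea concrete, here is a minimal sketch of a single-gene logistic regression with a Wald test, written in plain Python. It assumes that the binary recurrence outcome is regressed on one gene's normalized expression at a time; the summary does not spell out the exact model, and the function name and toy data are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def logistic_wald_p(x, y, n_iter=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson.

    x : expression values of one gene (one per patient)
    y : 0/1 outcome (e.g. 1 = new tumor event)
    Returns (b1, two-sided Wald p-value for b1 = 0).
    Assumes the classes are not perfectly separated by x.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        # Accumulate the gradient (g) and information matrix (h).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g for the 2x2 system.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = sqrt(h00 / det)  # sqrt of the (b1, b1) entry of H^{-1}
    z = b1 / se_b1
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return b1, p_value


# Toy data (hypothetical): expression of one gene in 8 patients.
expr = [0, 1, 2, 3, 4, 5, 6, 7]
recur = [0, 0, 1, 0, 1, 1, 0, 1]
slope, p = logistic_wald_p(expr, recur)
print(f"slope = {slope:.3f}, p = {p:.3f}")
```

Repeating this for every gene yields one p-value per gene. With tens of thousands of genes and 868 patients, those p-values would usually be adjusted for multiple testing before genes are called significant.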

The results identified several significant genes, such as WNT2B and VAV2, and significant pathways, such as Metabolism of xenobiotics by cytochrome P450-Homo sapiens (human) and Fatty acid degradation-Homo sapiens (human), that were both statistically significant and supported by biological studies. Other significant genes, including TESK2, C5orf43, and ZSCAN21, and significant pathways, such as Pentose and glucuronate interconversions-Homo sapiens (human), were identified as new cancer-related genes and pathways.

Overall, this study is one of the many steps humanity is taking to understand and ultimately defeat cancer. As statistics, biology and genetics continue to advance, a world without cancer draws steadily closer.


Article by Yuhang Wu, from Cranbrook Educational Community, Bloomfield Hills, MI, USA.

Full access: http://suo.im/4SwHc5
