PhD candidate / guest
As an undergraduate at Henan Agricultural University, I majored in Biological Science and also developed an interest in Computer Science. Alongside my coursework, I studied advanced mathematical theory and practical algorithms in depth. As a result, I chose bioinformatics as the major for my postgraduate study at China Agricultural University.
During my master's study, my project was the prediction of self-interacting proteins. To address this problem, I studied machine learning intensively. Machine learning algorithms are powerful tools for handling the large amounts of data generated by high-throughput biological technologies such as next-generation sequencing. In addition, I learned molecular simulation to study the function of biological macromolecules.
At present, I mainly focus on the connections between drugs, targets and cancer, applying machine learning methods to analyze medicinal big data. My aim is to design computational models that identify candidate GPCRs as targets, and that discriminate and generate effective compounds as drugs to bind these GPCR targets. To explore the connections between drugs, diseases and targets, the following directions will be investigated in the future:
1. Recently, deep learning methods such as convolutional and recurrent neural networks can improve performance by extracting features automatically from image or sequence data, instead of relying on pre-defined features. Deep learning is therefore a very promising approach for improving the performance of QSAR models.
2. To study the relationship between mutations in GPCRs and cancer, we need a very precise descriptor that captures the effect of these mutations. Deep learning is also a powerful tool for generating such feature vectors; it will be applied to structure data and sequence data separately to produce an optimal feature encoding scheme.
3. Borrowing ideas from image and text generation with generative adversarial networks, we can use such models to enlarge the effective drug space and find optimal drug candidates without side effects. In addition, I am also trying to apply reinforcement learning to drug discovery. For example, to construct the generative model, the target and the bioactivity can be regarded as the environment and the reward, respectively. After training, we expect the model to generate a group of candidate drugs when a specific target is given.
Liu, X., Yang, S., Li, C., Zhang, Z., and Song, J. (2016). SPAR: a random forest-based predictor for self-interacting proteins with fine-grained domain information. Amino Acids 48: 1655-1665.
Yang, Z., Wang, C., Wang, T., Bai, J., Zhao, Y., Liu, X., Ma, Q., Wu, X., Guo, Y., Zhao, Y., et al. (2015). Analysis of the reptile CD1 genes: evolutionary implications. Immunogenetics 67: 337-346.