Ahmet S. Rifaioglu

Computer Engineering, Middle East Technical University, A-409, Ankara/Türkiye
Tel: +90 (312) 210-5577
arifaioglu [at] ceng.metu.edu.tr


I am a research/teaching assistant in the Department of Computer Engineering at the Middle East Technical University (METU), Ankara, Turkey. My main research interests are bioinformatics and machine learning. I am actively working on applications of machine learning to protein function prediction and drug-target interaction prediction.

Experience

Research/Teaching Assistant

Middle East Technical University

I am working as a research/teaching assistant at the Computer Engineering Department of Middle East Technical University, where I am also pursuing my PhD.

October 2011 - Present

Research Trainee

European Bioinformatics Institute (EMBL-EBI)

I worked on the "Comprehensive Resource of Biomedical Relations with Deep Learning and Network Representations" project, a joint project between METU and EBI.

November 2017 - March 2018

Research Trainee

European Bioinformatics Institute (EMBL-EBI)

I worked on the development of the UniGOPred protein function prediction method.

October 2014 - January 2015

.NET Developer/Junior SAP Consultant

ISIS Information Technologies

I participated in the development of the e-Invoice Project and started my training for SAP consultancy.

July 2010 - December 2011

Education

Middle East Technical University

PhD
Computer Engineering

GPA: 3.67

February 2015 - Present

Middle East Technical University

Master's Degree
Computer Engineering

GPA: 3.86

September 2012 - February 2015

İstanbul Doğuş University

Computer Engineering

GPA: 3.16

September 2006 - August 2010

Publications and Talks

Journal Publications

  • Large-scale automated multi-functional annotation of protein sequences and an experimental case study validation on PTEN transcript variants
    (Proteins. 2017;00:1–17. https://doi.org/10.1002/prot.25416 PROTEINS: Structure, Function, and Bioinformatics)
    Authors : Rifaioglu, A.S., Doğan, T., Saraç, Ö.S., Ersahin, T., Saidi, R., Atalay, M.V., Martin, M.J., & Cetin-Atalay, R.

    Abstract : Recent advances in computing power and machine learning empower functional annotation of protein sequences and their transcript variations. Here, we present an automated prediction system UniGOPred, for GO annotations and a database of GO term predictions for proteomes of several organisms in UniProt Knowledgebase (UniProtKB). UniGOPred provides function predictions for 514 molecular function (MF), 2909 biological process (BP), and 438 cellular component (CC) GO terms for each protein sequence. UniGOPred covers nearly the whole functionality spectrum in Gene Ontology system and it can predict both generic and specific GO terms. UniGOPred was run on CAFA2 challenge target protein sequences and it is categorized within the top 10 best performing methods for the molecular function category. In addition, the performance of UniGOPred is higher compared to the baseline BLAST classifier in all categories of GO. UniGOPred predictions are compared with UniProtKB/TrEMBL database annotations as well. Furthermore, the proposed tool's ability to predict negatively associated GO terms that defines the functions that a protein does not possess, is discussed. UniGOPred annotations were also validated by case studies on PTEN protein variants experimentally and on CHD8 protein variants with literature. UniGOPred protein functional annotation system is available as an open access tool at http://cansyl.metu.edu.tr/UniGOPred.html.

  • Multi-task Deep Neural Networks in Protein Function Prediction
    ArXiv, 2017 Pages 1-19 arXiv:1705.04802
    Authors : Rifaioglu, A.S., Martin, M.J., Cetin-Atalay, R., Atalay, M.V., & Doğan, T.

    Abstract : In recent years, deep learning algorithms have outperformed the state-of-the-art methods in several areas thanks to efficient methods for training and for preventing overfitting, advancements in computer hardware, and the availability of vast amounts of data. The high performance of multi-task deep neural networks in drug discovery has attracted attention to deep learning algorithms in the bioinformatics area. Here, we proposed a hierarchical multi-task deep neural network architecture based on Gene Ontology (GO) terms as a solution to the protein function prediction problem and investigated various aspects of the proposed architecture by performing several experiments. First, we showed that there is a positive correlation between the performance of the system and the size of the training datasets. Second, we investigated whether the level of GO terms in the GO hierarchy is related to their performance. We showed that there is no relation between the depth of GO terms in the GO hierarchy and their performance. In addition, we included all annotations in the training of a set of GO terms to investigate whether including noisy data in the training datasets changes the performance of the system. The results showed that including less reliable annotations in the training of deep neural networks significantly increased the performance of the low-performing GO terms. We evaluated the performance of the system using a hierarchical evaluation method. The Matthews correlation coefficient was calculated as 0.75, 0.49 and 0.63 for the molecular function, biological process and cellular component categories, respectively. We showed that deep learning algorithms have great potential in the protein function prediction area. We plan to further improve DEEPred by including other types of annotations from various biological data sources. We plan to construct DEEPred as an open access online tool.



Talks and Posters in Peer-Reviewed Conferences

  • Investigation of Multi-task Deep Neural Networks for Protein Function Prediction (Talk)
    ISMB/ECCB 2017: 25th Annual International Conference on Intelligent Systems for Molecular Biology, Function-COSI Oral Presentation, July 21-25, 2017, Prague, Czech Republic. doi: 10.7490/f1000research.1114653.1
    Authors : Rifaioglu, A.S., Martin, M.J., Cetin-Atalay, R., Atalay, M.V. & Doğan, T.
  • UniGOPred: A Large Scale Automated GO Term Annotation System for UniProtKB. (Poster)
    Great Lakes Bioinformatics Conference, May 15-17, 2017, Chicago, USA.
    Authors : Rifaioglu, A.S., Doğan, T., Saraç, Ö.S., Atalay, M.V., Martin, M.J. & Cetin-Atalay, R.
  • Unsupervised Identification of Redundant Domain Entries in InterPro Database Using Clustering Techniques
    (BCB '15: Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics, Pages 505-506. https://doi.org/10.1145/2808719.2811430)
    Authors : Rifaioglu, A.S., Doğan, T., Can, T.
  • UniGOPred and ECPred : Automated Function Prediction Tools Based on A Combination of Different Classifiers (Talk)
    ISMB/ECCB 2015: 23rd Annual International Conference on Intelligent Systems for Molecular Biology, AFP-CAFA SIG, July 10-14, 2015, Dublin, Republic of Ireland.
    Authors : Rifaioglu, A.S., Doğan, T., Saraç, Ö.S., Atalay, M.V., Martin, M.J. & Cetin-Atalay, R.
  • Computational drug target prediction and validation in PI3K/AKT pathway (Poster)
    ISMB/ECCB 2015: 23rd Annual International Conference on Intelligent Systems for Molecular Biology, AFP-CAFA SIG, July 10-14, 2015, Dublin, Republic of Ireland.
    Authors : Doğan, T., Ersahin, T., Rifaioglu, A.S., Poggioli, D., Nightingale, A., Martin, M.J. & Cetin-Atalay, R.

Projects

Comprehensive Resource of Biomedical Relations with Deep Learning and Network Representations

October 2017 - October 2019

Budget : ~1,500,000 TL

Motivation : The main objectives of the proposed project can be summarized as: i) developing a novel large-scale computational system with multiple components to serve the purposes of translational life-sciences research by annotating relations between drugs, target biomolecules, systems and diseases; ii) presenting the results of the study to the research community through a publicly available web service; and iii) discussing selected results of the computational system in the framework of health and disease, to contribute to the understanding of the mechanisms active in liver cancers and in drug-induced liver toxicity.
To our knowledge, this will be the first project aiming to generate a fully integrated biomedical system at such a scale. The proposed system will bridge the biological data resources which provide highly related biomedical information, but are fairly disconnected from each other in their current state. It is expected that the new system will display a continuous data flow from drugs/compounds to diseases (with easy-to-comprehend network representations) and will be utilized to aid experimental and computational work in biomedical research, especially in the fields of precision medicine and drug discovery & repositioning.
This new computational system will contain three modules: (1) a novel computational method for the comprehensive prediction of unknown compound/drug - target protein interactions (as well as non-interactions) to obtain valuable information regarding both on-target and off-target effects of chemical substances on biomolecules, using high-dimensional feature spaces and deep learning architectures; (2) multi-partite biological entity networks where different types of nodes will represent compounds/drugs, genes/proteins, pathways and diseases, and the edges will represent the known and predicted pairwise relations between them (the relation types are "biological interaction", "cause and effect" and "belongs to"); and (3) an open access database of results and a web service where it will be possible to browse by an entity of interest to observe the related network and its components. Furthermore, selected results of the bio-interaction prediction component will be experimentally verified with target inhibition assays, to test the biological relevance of the results of the computational system.

Large Scale Automated Function Prediction By Subsequence Analysis

October 2016 - October 2017

Budget : ~6,000 TL

Motivation : Recent advances in computing power and machine learning empower functional annotation of protein sequences and their transcript variations. Identification of protein functions is a crucial research area for various purposes such as understanding the molecular mechanisms of living beings, identifying disease-causing functional changes and discovering new drugs. Traditionally, protein functions are identified by labor-intensive and expensive wet-laboratory experiments, which are insufficient to annotate the vast amount of protein sequence data. Therefore, we need automated protein function prediction methods to help annotate proteins. In this project, our aim is to predict Gene Ontology terms and Enzyme Commission numbers with high accuracy.

Awards & Certifications

  • 2017 - UniGOPred: A Large Scale Automated GO Term Annotation System for UniProtKB, GLBIO 2017 (Best Poster Award)
  • 2010 - Doğuş University Honour Student Award
  • 2006 - 1st Place - Doğuş University Highest Ranked Student Award