Research
                 
                Research Areas
                
                  
                    - Autonomous and Multi-Agent Systems
 
                    - Reinforcement Learning,
                        Markov Decision
                        Processes, Partially Observable Markov Decision
                        Processes
 
                    
                      - Induction and Control of
                          Gene Regulatory
                          Networks in Bioinformatics
 
                    
                    - Multi-Agent Path Finding
                       
                    - Behaviour Modeling in Virtual
                        Simulations and
                        Computer Games
 
                  
                 
                Research Projects
                
                
                  
                    - Incremental Multi-Agent Path Finding
                      
                     
                    
                      - Abstract:
                          In Multi-Agent Path Finding (MAPF), the aim is to
                          find conflict-free paths for multiple agents: given
                          a graph, determine a path from an initial vertex to
                          a target vertex for each agent such that no two
                          agents occupy the same vertex at the same time and
                          the sum of the costs of the agent paths is minimal.
                          Existing MAPF algorithms cannot meet the
                          requirements of many real-life MAPF problems, so
                          the problem description needs to be extended with
                          realistic, domain-specific requirements: i) changes
                          in the environment, ii) agents with more than one
                          destination, and iii) arrival of new jobs at any
                          time after the initial job assignment.
                          Conflict-Based Search (CBS) is not a suitable
                          approach for dealing with dynamic environments,
                          because it uses an offline single-agent search
                          algorithm, namely A*. This forces CBS to re-compute
                          all of the agent paths and regenerate its
                          constraint tree (CT) from scratch to provide a new
                          optimal plan for the agents. In this project, a new
                          incremental single-agent path finding algorithm
                          will first be developed and then integrated into
                          the high-level planner to swiftly generate new
                          plans after environmental changes. In the second
                          stage, we will extend our incremental solver to
                          MAPF instances where each agent can have multiple
                          delivery vertices in the graph. In the last stage,
                          we will work on a method to assign new jobs to the
                          agents considering their existing jobs and possible
                          paths; to achieve this, we will develop several
                          job-assignment heuristics and determine the one
                          with the best performance. To test the algorithms
                          developed in these three stages, we will use
                          randomly generated scenarios and benchmark MAPF
                          maps from the literature.
                        
                        
                       
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 120E504.
 
                      - Principal Investigator: Faruk Polat;
                        Scholars: Fatih Semiz, Evren Cilden.
                       
                      - Started in March 2021. To Be Delivered in May
                        2023. 
                       
                      - Budget: 401,183 TL
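The sum-of-costs objective and the vertex-conflict rule from the abstract above can be sketched as follows; the path representation and names are illustrative, not taken from the project's code:

```python
# Minimal MAPF sketch: paths are lists of vertices, one entry per
# time step. Sum of costs counts the edges each agent traverses;
# a vertex conflict is two agents at the same vertex at the same step.

def sum_of_costs(paths):
    """Sum of individual path costs (edges traversed per agent)."""
    return sum(len(p) - 1 for p in paths)

def has_vertex_conflict(paths):
    """True if any two agents occupy the same vertex at the same step."""
    horizon = max(len(p) for p in paths)
    for t in range(horizon):
        # An agent that has reached its target waits at its last vertex.
        occupied = [p[min(t, len(p) - 1)] for p in paths]
        if len(occupied) != len(set(occupied)):
            return True
    return False

paths = [["A", "B", "C"], ["D", "B", "E"]]  # both agents at B at t=1
print(sum_of_costs(paths))         # 4
print(has_vertex_conflict(paths))  # True
```

An optimal MAPF solver such as CBS searches for the assignment of paths that minimizes the first quantity while keeping the second one false.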
 
                    
                    - Management of BS Programs and Capacity Planning
                      for the Council of Higher Education
                     
                    
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 115G086.
 
                      - Principal Investigator: Faruk Polat;
                        Researcher: Dr. Cem Iyigun;
                        Scholars: Alper Demir, Huseyin Aydin, Tuna Berk Kaya,
                        Yagmur Caner.
                       
                      - Started in May 2018. To Be Delivered in May
                        2020. 
                       
                      - Budget: 228,860 TL
 
                    
                    - Subgoal Identification in Sequential
                        Decision Making under Partial Observability
                      
                     
                    
                      - Abstract:
                        Sequential decision making under partial
                        observability is a hard problem, mainly due to
                        perceptual aliasing and dimensionality issues.
                        Learning algorithms handle the sequential decision
                        making problem from an adaptive-agent perspective,
                        trying to cope with the problem using approximation
                        methods. Reinforcement learning (RL) is a strong
                        on-line learning method widely known for its fit to
                        the autonomous agent model, relatively simple
                        implementation, and ease of adaptation to real-world
                        phenomena. Although RL methods are theoretically
                        based on the Markov decision process (MDP) model,
                        partially observable MDP (POMDP) variants exist,
                        together with some assumptions and limitations.
                        Significant effort has been spent on dividing MDP
                        problems into smaller problems, so that every
                        sub-problem can be solved with less effort and the
                        solutions of all sub-problems can later be combined
                        into the grand solution. One of the popular ways to
                        do this is the identification of sub-goals, which
                        naturally clusters the problem into pieces. Although
                        there are sound methods for the MDP-RL case, the
                        sub-goal identification literature for the partially
                        observable case is still immature. The aim of this
                        project is to attack an unexplored area of sub-goal
                        identification for POMDP-RL: memory-based RL
                        algorithms for problems with hidden state. This
                        study focuses on the adaptation or re-design of
                        existing on-line sub-goal identification methods
                        already available for MDP-RL to POMDP-RL algorithms,
                        so that learning performance can be improved without
                        off-line intervention. To do this, we will rely on
                        the state estimation (or discrimination) scheme
                        generated by the memory-based POMDP-RL algorithm to
                        generate approximate but useful sub-goals. We will
                        first extensively analyze and study the existing
                        sub-goal identification approaches for both MDP-RL
                        and POMDP-RL, with emphasis on methods making use of
                        learning outcomes. Then we will focus on one of the
                        mature families of POMDP-RL algorithms, namely
                        memory-based algorithms. Since the nature of the
                        selected POMDP-RL algorithm(s) will determine the
                        solution method, we will try to devise a sub-goal
                        identification method that makes use of the
                        generated memory. Finally, to verify the
                        effectiveness of the new method(s), extensive
                        comparative test runs will be executed and reported
                        using various benchmark problems from the
                        literature.
                       
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 215E250.
 
                      - Principal Investigator: Faruk Polat;
                        Researcher: Dr. Erkin Cilden;
                        Scholars: Alper Demir, Huseyin Aydin.
                       
                      - Started in May 2016. Delivered in May 2018. 
                       
                      - Budget: 246,729 TL
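As a generic illustration of the MDP-RL starting point mentioned in the abstract above (not the project's method): one classic heuristic proposes sub-goal candidates by counting how often each state appears on successful trajectories. The state names below are invented for the example:

```python
# Frequency-based sub-goal candidate sketch: states that lie on many
# successful trajectories (e.g. a doorway between rooms) are flagged
# as candidate sub-goals.
from collections import Counter

def subgoal_candidates(successful_trajectories, k=1):
    counts = Counter()
    for traj in successful_trajectories:
        # Exclude start and goal states; they trivially appear on
        # every successful trajectory.
        counts.update(traj[1:-1])
    return [state for state, _ in counts.most_common(k)]

trajs = [
    ["s0", "door", "s3", "goal"],
    ["s1", "door", "s4", "goal"],
    ["s2", "door", "s3", "goal"],
]
print(subgoal_candidates(trajs))  # ['door']
```

Under partial observability, the difficulty the project targets is that the agent cannot count states directly; only an estimated or memory-based state representation is available for such statistics.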
 
                    
                    - Direct Abstraction for Partially Observable
                        Reinforcement
                        Learning 
 
                    
                      - Abstract:
                        This project focuses on the adaptation or re-design
                        of existing on-line direct temporal abstraction
                        methods already available for MDP-RL to POMDP-RL
                        algorithms, so that learning performance can be
                        improved without off-line intervention. First, we
                        will develop a software platform for POMDP-RL to
                        implement various algorithms. We will focus on two
                        leading categories that represent the POMDP-RL
                        family, namely belief-state-based algorithms and
                        memory-based algorithms. The direct abstraction
                        methods to be developed in this project will build
                        on the approach of the Extended Sequence Tree (EST)
                        method, which was developed by our research group
                        for MDP-RL, as it is the most recent and
                        comprehensive study of its category. Then, new
                        direct abstraction methods will be developed for the
                        selected POMDP-RL methods and implemented on the
                        software platform. Finally, to verify the
                        effectiveness of the new abstraction methods,
                        extensive comparative test runs will be executed and
                        reported using various benchmark problems from the
                        literature.
                       
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 113E239.
 
                      - Principal Investigator: Faruk Polat;
                        Researcher: Dr. Erkin Cilden;
                        Scholars: Coskun Sahin, Utku Sirin, Fatih Semiz.
                       
                      - Started in Sept 2013. Delivered in Sept 2015.
                        
                       
                      - Budget: 138,000 TL
                       
                    
                    - Effective Control of Partially Observable
                        Gene Regulatory
                        Networks
 
                    
                      - Abstract:
                        The gene regulatory network (GRN) control problem
                        has been studied mostly with the aid of
                        probabilistic Boolean networks. Partial
                        observability of the gene regulation dynamics is
                        largely ignored in existing studies on the GRN
                        control problem. On the other hand, current works
                        addressing partial observability focus on
                        formulating algorithms for the finite-horizon GRN
                        control problem. This motivated us to take up the
                        challenge and tackle the control problem from a
                        truly partially observable perspective. In this work
                        we explore the feasibility of realizing the problem
                        in a partially observable setting, mainly with
                        Partially Observable Markov Decision Processes
                        (POMDPs). The method proposed in this work is a
                        POMDP formulation for the infinite-horizon version
                        of the problem. We first decompose the problem by
                        automatically isolating its unrelated parts, and
                        then make use of existing POMDP solvers to solve the
                        obtained subproblems; the final outcome is a control
                        mechanism for the main problem. The proposed
                        approach is tested on both synthetic and real GRNs
                        to demonstrate its applicability, effectiveness, and
                        efficiency.
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 110E179.
 
                      - Principal Investigator: Faruk Polat;
                        Scholars: Utku Erdogdu, Utku Sirin, Omer Ekmekci.
                       
                      - Started in March 2011. Delivered in July 2013.
                        
                       
                      - Budget: 124,000 TL
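For concreteness, one step of a probabilistic Boolean network (the model named in the abstract above) can be sketched as follows; the two-gene network and its probabilities are invented for illustration, not taken from the project:

```python
# Probabilistic Boolean network sketch: each gene is 0/1 and has a
# list of candidate update functions, each chosen with a fixed
# probability at every step.
import random

def pbn_step(state, functions, rng):
    """Advance one step: for each gene, pick one of its update rules."""
    next_state = []
    for rules in functions:  # rules: list of (probability, fn) pairs
        r = rng.random()
        acc = 0.0
        chosen = rules[-1][1]  # fall back to the last rule
        for prob, fn in rules:
            acc += prob
            if r < acc:
                chosen = fn
                break
        next_state.append(chosen(state))
    return tuple(next_state)

# Gene 0 always copies gene 1; gene 1 is NOT gene 0 with probability
# 0.7, or keeps its value with probability 0.3.
functions = [
    [(1.0, lambda s: s[1])],
    [(0.7, lambda s: 1 - s[0]), (0.3, lambda s: s[1])],
]
print(pbn_step((0, 1), functions, random.Random(0)))
```

Controlling a GRN then amounts to choosing interventions that steer this stochastic dynamics toward desirable states, which is what the POMDP formulation makes precise when the gene states are only partially observed.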
 
                    
                  
                  
                    - Learning Temporal Abstractions and
                        Hierarchical Structures
                        in
                        Single/Multiagent Environments
 
                    
                      - Abstract:
                        In this project, two approaches were developed to
                        speed up a reinforcement learning task by making use
                        of abstractions as much as possible. In the first
                        approach, we extend McGovern's stochastic
                        terminating sequence method to build up a special
                        tree, called the extended sequence tree (EST), to
                        maintain the discovered useful abstractions and then
                        utilize it for action selection. In the second
                        approach, we develop a novel method to identify
                        states with similar sub-policies and show how they
                        can be integrated into the RL framework to improve
                        learning performance. The method uses an efficient
                        data structure to find common action sequences
                        starting from observed states and defines a
                        similarity function between states based on the
                        number of such sequences. Using this similarity
                        function, updates on the state-action value function
                        of a state are reflected to all similar states.
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 105E181.
 
                      - Principal Investigator: Faruk Polat;
                        Scholar: Sertan Girgin.
                       
                      - Started in Nov 2005. Delivered in Nov 2006. 
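The similarity function described in the abstract above can be sketched roughly as follows, assuming the action sequences observed from each state are stored as sets of tuples; the representation and the normalization are illustrative, not the project's exact formulation:

```python
# State-similarity sketch: two states are similar when many of the
# same action sequences have been observed starting from them.

def similarity(seqs_a, seqs_b):
    """Fraction of shared action sequences between two states."""
    shared = len(seqs_a & seqs_b)
    total = len(seqs_a | seqs_b)
    return shared / total if total else 0.0

# Action sequences observed from two hypothetical states.
from_s1 = {("up", "up", "right"), ("up", "right", "right")}
from_s2 = {("up", "up", "right"), ("left",)}
print(similarity(from_s1, from_s2))  # 1 shared of 3 distinct, about 0.33
```

With such a measure, a value-function update made for one state can be propagated, weighted by similarity, to the states that share its sub-policy, which is the speed-up mechanism the second approach exploits.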
                       
                    
                  
                 
                 
                Research and Development Projects
                
                  
                    - MGKMOS: Agent-Based
                          Simulation of Joint
                          Force
                          Operations
 
                    
                      - Supported by Turkish
                          General Staff of Armed
                          Forces
 
                      - Project Manager: Faruk
                          Polat
                         
                      - Project Staff: 5 research
                          assistants, 4
                          professors
                         
                      - Started Sept 2006.
                          Delivered in March 2010
 
                      - Budget: 1,100,000 USD
                          (University's share)
                         
                      - Joint work with HAVELSAN
                          A.S, Turkey.
                         
                    
                    - SAVMOS: Multi-Agent
                          Simulation of Small
                          Size
                          Contingency Operations
 
                    
                      - Supported by Turkish
                          General Staff of Armed
                          Forces
 
                      - Project Manager: Faruk
                          Polat
                         
                      - Project Staff: 4 research
                          assistants, 1
                          full-time researcher
                         
                      - Started Jan 2002. Delivered
                          April 2004
 
                      - Budget: 416,000 USD
 
                    
                    - SENSIM: Optimizing Placement of
                          Static/Mobile Sensor Platforms on 3D Terrain
 
                    
                      - Supported by Turkish
                          General Staff of Armed
                          Forces
 
                      - Project Manager: Faruk
                          Polat
                         
                      - Project Staff: 4 research
                          assistants
                         
                      - Started Sep 1999. Delivered
                          Dec 2000
 
                      - Budget: 279,000 USD