Research
                 
                Research Areas
                
                  
                    - Autonomous and Multi-Agent Systems
 
                    - Reinforcement Learning,
                        Markov Decision
                        Processes, Partially Observable Markov Decision
                        Processes
 
                    
                      - Induction and Control of
                          Gene Regulatory
                          Networks in Bioinformatics
 
                    
                    - Multi-Agent Path Finding
                       
                    - Behaviour Modeling in Virtual
                        Simulations and
                        Computer Games
 
                  
                 
                Research Projects
                
                
                  
                    - Incremental Multi-Agent Path Finding
                      
                     
                    
                      - Abstract:
                          In Multi-Agent Path Finding (MAPF), the aim is to
                          find conflict-free paths for multiple agents: given
                          a graph, determine a path from an initial vertex to
                          a target vertex for each agent such that no two
                          agents occupy the same vertex at the same time and
                          the sum of the costs of the agent paths is minimal.
                          Existing MAPF algorithms cannot meet the
                          requirements of many real-life MAPF problems, so
                          the problem description needs to be extended with
                          realistic, domain-specific requirements: i) changes
                          in the environment, ii) agents with more than one
                          destination, and iii) arrival of new jobs at any
                          time after the initial job assignment.
                          Conflict-Based Search (CBS) is not a suitable
                          approach for dealing with dynamic environments,
                          because it uses an offline single-agent search
                          algorithm, namely A*. This forces CBS to re-compute
                          all of the agent paths and regenerate its
                          constraint tree (CT) from scratch to provide a new
                          optimal plan for the agents. In this project, a new
                          incremental single-agent path finding algorithm
                          will first be developed and then integrated into
                          the high-level planner to swiftly generate new
                          plans after environmental changes. In the second
                          stage, we will extend our incremental solver to
                          MAPF instances where each agent can have multiple
                          delivery vertices in the graph. In the last stage,
                          we will work on a method to assign new jobs to the
                          agents considering their existing jobs and possible
                          paths; to achieve this, we will develop several
                          job-assignment heuristics and determine the one
                          with the best performance. To test the algorithms
                          developed in these three stages, we will use
                          randomly generated scenarios and benchmark MAPF
                          maps from the literature.
                        
                        
                       
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 120E504.
 
                      - Principal Investigator: Faruk Polat;
                        Scholars: Fatih Semiz, Evren Cilden.
                       
                      - Started in March 2021. To Be Delivered in May
                        2023. 
                       
                      - Budget: 401,183 TL
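The sum-of-costs objective and the vertex-conflict rule from the abstract above can be sketched as follows; the path representation and names are illustrative, not taken from the project's code:

```python
# Minimal MAPF sketch: paths are lists of vertices, one entry per
# time step. Sum of costs counts the edges each agent traverses;
# a vertex conflict is two agents at the same vertex at the same step.

def sum_of_costs(paths):
    """Sum of individual path costs (edges traversed per agent)."""
    return sum(len(p) - 1 for p in paths)

def has_vertex_conflict(paths):
    """True if any two agents occupy the same vertex at the same step."""
    horizon = max(len(p) for p in paths)
    for t in range(horizon):
        # An agent that has reached its target waits at its last vertex.
        occupied = [p[min(t, len(p) - 1)] for p in paths]
        if len(occupied) != len(set(occupied)):
            return True
    return False

paths = [["A", "B", "C"], ["D", "B", "E"]]  # both agents at B at t=1
print(sum_of_costs(paths))         # 4
print(has_vertex_conflict(paths))  # True
```

An optimal MAPF solver such as CBS searches for the assignment of paths that minimizes the first quantity while keeping the second one false.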
 
                    
                    - Management of BS Programs and Capacity Planning
                      for the Council of Higher Education
                     
                    
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 115G086.
 
                      - Principal Investigator: Faruk Polat;
                        Researcher: Dr. Cem Iyigun;
                        Scholars: Alper Demir, Huseyin Aydin, Tuna Berk Kaya,
                        Yagmur Caner.
                       
                      - Started in May 2018. To Be Delivered in May
                        2020. 
                       
                      - Budget: 228,860 TL
 
                    
                    - Subgoal Identification in Sequential
                        Decision Making under Partial Observability
                      
                     
                    
                      - Abstract:
                        Sequential decision making under partial
                        observability is a hard problem, mainly due to
                        perceptual aliasing and dimensionality issues.
                        Learning algorithms handle the sequential decision
                        making problem from an adaptive-agent perspective,
                        trying to cope with the problem using approximation
                        methods. Reinforcement learning (RL) is a strong
                        on-line learning method widely known for its fit to
                        the autonomous agent model, relatively simple
                        implementation, and ease of adaptation to real-world
                        phenomena. Although RL methods are theoretically
                        based on the Markov decision process (MDP) model,
                        partially observable MDP (POMDP) variants exist,
                        together with some assumptions and limitations.
                        Significant effort has been spent on dividing MDP
                        problems into smaller problems, so that every
                        sub-problem can be solved with less effort and the
                        solutions of all sub-problems can later be combined
                        into the grand solution. One of the popular ways to
                        do this is the identification of sub-goals, which
                        naturally clusters the problem into pieces. Although
                        there are sound methods for the MDP-RL case, the
                        sub-goal identification literature for the partially
                        observable case is still immature. The aim of this
                        project is to attack an unexplored area of sub-goal
                        identification for POMDP-RL: memory-based RL
                        algorithms for problems with hidden state. This
                        study focuses on the adaptation or re-design of
                        existing on-line sub-goal identification methods
                        already available for MDP-RL to POMDP-RL algorithms,
                        so that learning performance can be improved without
                        off-line intervention. To do this, we will rely on
                        the state estimation (or discrimination) scheme
                        generated by the memory-based POMDP-RL algorithm to
                        generate approximate but useful sub-goals. We will
                        first extensively analyze and study the existing
                        sub-goal identification approaches for both MDP-RL
                        and POMDP-RL, with emphasis on methods making use of
                        learning outcomes. Then we will focus on one of the
                        mature families of POMDP-RL algorithms, namely
                        memory-based algorithms. Since the nature of the
                        selected POMDP-RL algorithm(s) will determine the
                        solution method, we will try to devise a sub-goal
                        identification method that makes use of the
                        generated memory. Finally, to verify the
                        effectiveness of the new method(s), extensive
                        comparative test runs will be executed and reported
                        using various benchmark problems from the
                        literature.
                       
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 215E250.
 
                      - Principal Investigator: Faruk Polat;
                        Researcher: Dr. Erkin Cilden;
                        Scholars: Alper Demir, Huseyin Aydin.
                       
                      - Started in May 2016. Delivered in May 2018. 
                       
                      - Budget: 246,729 TL
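As a generic illustration of the MDP-RL starting point mentioned in the abstract above (not the project's method): one classic heuristic proposes sub-goal candidates by counting how often each state appears on successful trajectories. The state names below are invented for the example:

```python
# Frequency-based sub-goal candidate sketch: states that lie on many
# successful trajectories (e.g. a doorway between rooms) are flagged
# as candidate sub-goals.
from collections import Counter

def subgoal_candidates(successful_trajectories, k=1):
    counts = Counter()
    for traj in successful_trajectories:
        # Exclude start and goal states; they trivially appear on
        # every successful trajectory.
        counts.update(traj[1:-1])
    return [state for state, _ in counts.most_common(k)]

trajs = [
    ["s0", "door", "s3", "goal"],
    ["s1", "door", "s4", "goal"],
    ["s2", "door", "s3", "goal"],
]
print(subgoal_candidates(trajs))  # ['door']
```

Under partial observability, the difficulty the project targets is that the agent cannot count states directly; only an estimated or memory-based state representation is available for such statistics.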
 
                    
                    - Direct Abstraction for Partially Observable
                        Reinforcement
                        Learning 
 
                    
                      - Abstract:
                        This project focuses on the adaptation or re-design
                        of existing on-line direct temporal abstraction
                        methods already available for MDP-RL to POMDP-RL
                        algorithms, so that learning performance can be
                        improved without off-line intervention. First, we
                        will develop a software platform for POMDP-RL to
                        implement various algorithms. We will focus on two
                        leading categories that represent the POMDP-RL
                        family, namely belief-state-based algorithms and
                        memory-based algorithms. The direct abstraction
                        methods to be developed in this project will build
                        on the approach of the Extended Sequence Tree (EST)
                        method, which was developed by our research group
                        for MDP-RL, as it is the most recent and
                        comprehensive study of its category. Then, new
                        direct abstraction methods will be developed for the
                        selected POMDP-RL methods and implemented on the
                        software platform. Finally, to verify the
                        effectiveness of the new abstraction methods,
                        extensive comparative test runs will be executed and
                        reported using various benchmark problems from the
                        literature.
                       
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 113E239.
 
                      - Principal Investigator: Faruk Polat;
                        Researcher: Dr. Erkin Cilden;
                        Scholars: Coskun Sahin, Utku Sirin, Fatih Semiz.
                       
                      - Started in Sept 2013. Delivered in Sept 2015.
                        
                       
                      - Budget: 138,000 TL
                       
                    
                    - Effective Control of Partially Observable
                        Gene Regulatory
                        Networks
 
                    
                      - Abstract:
                        The gene regulatory network (GRN) control problem
                        has been studied mostly with the aid of
                        probabilistic Boolean networks. Partial
                        observability of the gene regulation dynamics is
                        largely ignored in existing studies on the GRN
                        control problem. On the other hand, current works
                        addressing partial observability focus on
                        formulating algorithms for the finite-horizon GRN
                        control problem. This motivated us to take up the
                        challenge and tackle the control problem from a
                        truly partially observable perspective. In this work
                        we explore the feasibility of realizing the problem
                        in a partially observable setting, mainly with
                        Partially Observable Markov Decision Processes
                        (POMDPs). The method proposed in this work is a
                        POMDP formulation for the infinite-horizon version
                        of the problem. We first decompose the problem by
                        automatically isolating its unrelated parts, and
                        then make use of existing POMDP solvers to solve the
                        obtained subproblems; the final outcome is a control
                        mechanism for the main problem. The proposed
                        approach is tested on both synthetic and real GRNs
                        to demonstrate its applicability, effectiveness, and
                        efficiency.
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 110E179.
 
                      - Principal Investigator: Faruk Polat;
                        Scholars: Utku Erdogdu, Utku Sirin, Omer Ekmekci.
                       
                      - Started in March 2011. Delivered in July 2013.
                        
                       
                      - Budget: 124,000 TL
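For concreteness, one step of a probabilistic Boolean network (the model named in the abstract above) can be sketched as follows; the two-gene network and its probabilities are invented for illustration, not taken from the project:

```python
# Probabilistic Boolean network sketch: each gene is 0/1 and has a
# list of candidate update functions, each chosen with a fixed
# probability at every step.
import random

def pbn_step(state, functions, rng):
    """Advance one step: for each gene, pick one of its update rules."""
    next_state = []
    for rules in functions:  # rules: list of (probability, fn) pairs
        r = rng.random()
        acc = 0.0
        chosen = rules[-1][1]  # fall back to the last rule
        for prob, fn in rules:
            acc += prob
            if r < acc:
                chosen = fn
                break
        next_state.append(chosen(state))
    return tuple(next_state)

# Gene 0 always copies gene 1; gene 1 is NOT gene 0 with probability
# 0.7, or keeps its value with probability 0.3.
functions = [
    [(1.0, lambda s: s[1])],
    [(0.7, lambda s: 1 - s[0]), (0.3, lambda s: s[1])],
]
print(pbn_step((0, 1), functions, random.Random(0)))
```

Controlling a GRN then amounts to choosing interventions that steer this stochastic dynamics toward desirable states, which is what the POMDP formulation makes precise when the gene states are only partially observed.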
 
                    
                  
                  
                    - Learning Temporal Abstractions and
                        Hierarchical Structures
                        in
                        Single/Multiagent Environments
 
                    
                      - Abstract:
                        In this project, two approaches were developed to
                        speed up a reinforcement learning task by making use
                        of abstractions as much as possible. In the first
                        approach, we extend McGovern's stochastic
                        terminating sequence method to build up a special
                        tree, called the extended sequence tree (EST), to
                        maintain the discovered useful abstractions and then
                        utilize it for action selection. In the second
                        approach, we develop a novel method to identify
                        states with similar sub-policies and show how they
                        can be integrated into the RL framework to improve
                        learning performance. The method uses an efficient
                        data structure to find common action sequences
                        starting from observed states and defines a
                        similarity function between states based on the
                        number of such sequences. Using this similarity
                        function, updates on the state-action value function
                        of a state are reflected to all similar states.
                      - Supported by the Scientific
                          and Technological Research Council of Turkey 
                        (TUBITAK)
                        under
                        Grant No. 105E181.
 
                      - Principal Investigator: Faruk Polat;
                        Scholar: Sertan Girgin.
                       
                      - Started in Nov 2005. Delivered in Nov 2006. 
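The similarity function described in the abstract above can be sketched roughly as follows, assuming the action sequences observed from each state are stored as sets of tuples; the representation and the normalization are illustrative, not the project's exact formulation:

```python
# State-similarity sketch: two states are similar when many of the
# same action sequences have been observed starting from them.

def similarity(seqs_a, seqs_b):
    """Fraction of shared action sequences between two states."""
    shared = len(seqs_a & seqs_b)
    total = len(seqs_a | seqs_b)
    return shared / total if total else 0.0

# Action sequences observed from two hypothetical states.
from_s1 = {("up", "up", "right"), ("up", "right", "right")}
from_s2 = {("up", "up", "right"), ("left",)}
print(similarity(from_s1, from_s2))  # 1 shared of 3 distinct, about 0.33
```

With such a measure, a value-function update made for one state can be propagated, weighted by similarity, to the states that share its sub-policy, which is the speed-up mechanism the second approach exploits.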
                       
                    
                  
                 
                 
                Research and Development Projects
                
                  
                    - MGKMOS: Agent-Based
                          Simulation of Joint
                          Force
                          Operations
 
                    
                      - Supported by Turkish
                          General Staff of Armed
                          Forces
 
                      - Project Manager: Faruk
                          Polat
                         
                      - Project Staff: 5 research
                          assistants, 4
                          professors
                         
                      - Started Sept 2006.
                          Delivered in March 2010
 
                      - Budget: 1,100,000 USD
                          (University's share)
                         
                      - Joint work with HAVELSAN
                          A.S, Turkey.
                         
                    
                    - SAVMOS: Multi-Agent
                          Simulation of Small
                          Size
                          Contingency Operations
 
                    
                      - Supported by Turkish
                          General Staff of Armed
                          Forces
 
                      - Project Manager: Faruk
                          Polat
                         
                      - Project Staff: 4 research
                          assistants, 1
                          full-time researcher
                         
                      - Started Jan 2002. Delivered
                          April 2004
 
                      - Budget: 416,000 USD
 
                    
                    - SENSIM: Optimizing Placement of
                          Static/Mobile Sensor Platforms on 3D Terrain
 
                    
                      - Supported by Turkish
                          General Staff of Armed
                          Forces
 
                      - Project Manager: Faruk
                          Polat
                         
                      - Project Staff: 4 research
                          assistants
                         
                      - Started Sep 1999. Delivered
                          Dec 2000
 
                      - Budget: 279,000 USD