Policy Gradient with Parameter-Based Exploration: methods, extensions and applications

January 18, 2020

Speaker: Giorgio Manganini
United Technologies Research Centre
Cork, Ireland

When: 24th January, 11.00 am

Where: GSSI Library

Abstract: Policy Search is a Reinforcement Learning approach that searches for the optimal policy of a Markov Decision Process within a restricted policy space. It has gained popularity for complex, real-world applications because it can handle high-dimensional state and action spaces while keeping the search limited to a predefined, task-appropriate class of parametrized policies. In this talk, we introduce the concepts and formulations of Policy Gradient methods, focusing on Policy Gradient with Parameter-Based Exploration (PGPE), an exploration strategy that operates in the policy parameter space.

In the first part, we endow PGPE with a novel policy parameterization that uses particles to describe entire areas of the state space associated with the same action, and which therefore scales favorably with the size of the state space. In the second part, the gradient direction of PGPE is extended to second-order Newton methods: we provide the formulation of the Hessian of the expected return, a technique for variance reduction in the sample-based estimation, and a finite-sample analysis for the case of a Normal distribution.
Besides discussing the theoretical properties, we empirically evaluate the proposed methods on both instructional and real case studies.
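
To make the idea of exploring in parameter space concrete, the following is a minimal sketch of a basic first-order PGPE loop, not the speaker's implementation. It assumes an independent Normal distribution over policy parameters (mean mu, standard deviation sigma) and a mean-return baseline. The function names, the learning rates and the toy return function (which simply rewards parameters close to a fixed target) are hypothetical and chosen only for illustration.

```python
import random


def toy_return(theta):
    """Hypothetical episodic return: largest when theta equals the target."""
    target = [1.0, -2.0]
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def pgpe_gradient(thetas, returns, mu, sigma):
    """Likelihood-ratio estimate of the gradient of the expected return
    with respect to mu and sigma of the parameter distribution."""
    n = len(thetas)
    baseline = sum(returns) / n  # mean-return baseline to reduce variance
    grad_mu = [0.0] * len(mu)
    grad_sigma = [0.0] * len(mu)
    for theta, ret in zip(thetas, returns):
        adv = ret - baseline
        for i, (t, m, s) in enumerate(zip(theta, mu, sigma)):
            diff = t - m
            # Derivatives of log N(t | m, s^2) with respect to m and s.
            grad_mu[i] += adv * diff / s ** 2 / n
            grad_sigma[i] += adv * (diff ** 2 - s ** 2) / s ** 3 / n
    return grad_mu, grad_sigma


def run_pgpe(n_iters=300, n_samples=50, lr_mu=0.05, lr_sigma=0.01, seed=0):
    """Gradient ascent on (mu, sigma); exploration happens only by
    sampling parameter vectors, never by perturbing actions."""
    rng = random.Random(seed)
    mu, sigma = [0.0, 0.0], [1.0, 1.0]
    for _ in range(n_iters):
        thetas = [[rng.gauss(m, s) for m, s in zip(mu, sigma)]
                  for _ in range(n_samples)]
        returns = [toy_return(theta) for theta in thetas]
        g_mu, g_sigma = pgpe_gradient(thetas, returns, mu, sigma)
        mu = [m + lr_mu * g for m, g in zip(mu, g_mu)]
        # Keep sigma strictly positive with a small floor (an assumption).
        sigma = [max(s + lr_sigma * g, 1e-3) for s, g in zip(sigma, g_sigma)]
    return mu, sigma


if __name__ == "__main__":
    mu, sigma = run_pgpe()
    print("mu:", mu)
    print("sigma:", sigma)
```

In a real task, toy_return would be replaced by the return of one episode run with a deterministic policy whose parameters are theta. On this toy problem, mu should drift toward the target while sigma shrinks as the search concentrates.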

Davide Di Ruscio


Study of Architectures, Networks, and Medium Access Protocols for Vehicular Communications in the Context of CCAM (Cooperative, Connected, and Automated Mobility)
Referring Professor: Prof. F. Santucci

The project aims to improve energy efficiency and reliability of communication systems for cooperative and autonomous mobility. Specifically, the research focuses on studying architectures, networks, and medium access protocols characterized by high energy efficiency and reliability for services related to vehicular communications and Cooperative, Connected, and Automated Mobility (CCAM). The project involves the following phases:

– Study of architectures, networks, and medium access protocols aimed at ensuring high energy efficiency and reliability, with a particular focus on CCAM-supported services
– Extension of the study domain to the development of innovative techniques for managing communication devices with limited computational and energy resources, such as sensors for monitoring and localization in smart road contexts
– Design and development of testbeds for experimental validation of the proposed solutions, leveraging the hardware and software assets provided by the hosting laboratories

Agile and Motivational Software Systems for Sustainable Mobility: Microservices, Gamification, and Real-time Infomobility Supporting Citizens and Businesses
Referring Professors: Prof. M. Autili, Prof. A. Bucchiarone

The project includes the following phases:

– Design of a digital platform for sustainable mobility, based on microservices, gamification, and real-time infomobility, to encourage sustainable commuting behaviors
– Needs analysis, co-design, development, and field validation in a real operational context
– Development and testing of the designed platform. Development sprints will be conducted with subsequent user testing and evaluation of usability and motivational effectiveness. GSA will provide data, user access, and experimental support.

Design and Development of Smart Environments for the Enhancement of Cultural Heritage through Gamification and Augmented Reality
Referring Professor: Prof. H. Muccini

Serious games are increasingly used to enhance user engagement with cultural heritage, promoting interactive learning and immersive exploration. The project aims to create a development environment that enables the integration of AI and Augmented Reality (AR) within gamified applications supporting the enjoyment of cultural assets. The project includes the following phases:

– State-of-the-art analysis
– Design of an architecture integrating serious games, AI, and AR modules
– Development of a platform enabling the creation of gamified applications for cultural heritage exploitation, supported by AI and AR
– Application of the platform to real-world use cases
– Validation of the results

System for Collecting and Analyzing Data from Urban Video Surveillance Networks to Enhance Territorial Safety and Resilience, with Application to the 2009 Earthquake Crater Area
Referring Professor: Prof. A. Di Marco

The project focuses on the intelligent management of urban video surveillance data to enhance territorial safety and resilience. Using data from the video surveillance system of the municipalities affected by the 2009 earthquake, the project includes the following phases:

– Definition, design, and implementation of a system that collects and preprocesses data from the video surveillance network in compliance with legal, ethical, and privacy requirements
– Definition, design, and implementation of video analysis for the data collected by the system defined in the previous objective
– Development of a Smart Urban Living system to improve citizen safety in general and particularly in disaster scenarios

Details on the grant funded by the PRIN 2022 MEDITATE project
Referring Professor: Prof. V. Cortellessa

Title: Model-based generation and optimization of advanced driver assistance systems (ADAS) testing scenarios in co-simulation

Abstract: Advanced Driver Assistance Systems (ADAS) play a crucial role in improving road safety, reducing traffic congestion, and decreasing fuel consumption, thus paving the way to safer and more environmentally friendly mobility. Novel ADAS mostly rely on AI techniques to “understand” the current situation around the vehicle (e.g., a pedestrian in front of the car), then plan and execute appropriate behavior (e.g., emergency braking). The validation of AI-based ADAS is a pain point for automotive companies, due to the sheer number of scenarios that need to be considered. In this PhD program, we aim to improve co-simulation-based validation of ADAS with an integrated framework that supports both the automated generation of a suite of testing scenarios and their optimization.

Doctoral Program in ICT
