TargetFinder: Developing an External Access System for Widget Information in Graphical User Interfaces

By Julien Gori, November 15, 2024

Modalities
+ 5-6 month internship at the ISIR laboratory (Sorbonne Université, Paris)
+ monthly stipend (about 600 €/month)
+ supervised by Julien Gori, Géry Casiez and Matthieu Nancel
+ Contact: gori@isir.upmc.fr. To apply, please send a CV, M1/M2 transcripts and, if available, a link to your GitHub account.

Objective
In most graphical applications, users interact with various elements called widgets, such as buttons, menus, and title bars. Information about these widgets (such as their size, location, and labels) is typically accessible only within the application itself. The objective of this internship is to develop methods to access this information from outside the application, enabling a broader range of uses, such as cross-application automation, user interface analysis, and accessibility enhancements.

Project Description: This project is part of an ongoing collaboration between Sorbonne Université and Université de Lille. During this internship, the student will build a system that can gather widget information from graphical user interfaces (GUIs) externally. The student will build on an existing demonstrator that can identify a single target's location and size. The demonstrator builds on a YOLOv8 pre-trained object detection network. The first goal of this project is to improve the demonstrator to identify multiple target locations and sizes at once, as well as determine their labels. To do so, the intern will apply existing image processing techniques to pre-label an existing dataset, which will, after verification, be used to retrain the existing network. The second goal of this project is to implement an existing interaction technique that leverages target information, for example semantic pointing, to showcase the possibilities offered by this project.
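To give candidates a feel for the second goal, here is a minimal sketch of how detected widget boxes could drive a semantic-pointing-style cursor. It is an illustration under stated assumptions, not the demonstrator's code: the `Widget` class, the `cursor_gain` and `move_cursor` functions, and the linear gain ramp with its `min_gain` and `radius` parameters are all hypothetical choices made for this example. The only idea borrowed from the technique is that the cursor slows down near targets, so that they behave as if they were larger in motor space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Widget:
    """Axis-aligned bounding box of a detected widget, in screen pixels.

    Hypothetical container: the real system may expose a different structure.
    """
    label: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def distance_to(self, px: float, py: float) -> float:
        """Euclidean distance from a point to the box (0 if the point is inside)."""
        dx = max(self.x - px, 0.0, px - (self.x + self.width))
        dy = max(self.y - py, 0.0, py - (self.y + self.height))
        return (dx * dx + dy * dy) ** 0.5


def cursor_gain(widgets, px, py, base_gain=1.0, min_gain=0.5, radius=40.0):
    """Return a pointer gain that drops as the cursor approaches a widget.

    Assumption: a linear ramp from `min_gain` (on a widget) to `base_gain`
    (at `radius` pixels or more from every widget).
    """
    if not widgets:
        return base_gain
    nearest = min(w.distance_to(px, py) for w in widgets)
    t = min(nearest / radius, 1.0)  # 0 on a widget, 1 at or beyond `radius`
    return min_gain + (base_gain - min_gain) * t


def move_cursor(widgets, px, py, dx, dy):
    """Apply a raw device displacement (dx, dy) scaled by the local gain."""
    gain = cursor_gain(widgets, px, py)
    return px + gain * dx, py + gain * dy


if __name__ == "__main__":
    widgets = [Widget("OK", 100, 100, 50, 20)]
    print(cursor_gain(widgets, 120, 110))   # cursor on the button
    print(cursor_gain(widgets, 500, 500))   # cursor far from any widget
    print(move_cursor(widgets, 120, 110, 10, 0))
```

In a system-wide setting, the list of `Widget` boxes would come from the retrained detection network rather than from the application itself, which is what makes the technique applicable to unmodified software.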

Potential Impact: This project could offer significant benefits for improving user interfaces. By making widget information available externally, it could open up new possibilities for interacting with software. For example, many pointing techniques outperform the regular cursor but rely on specific target information, which currently makes a system-wide implementation impossible. Another benefit of this project is the ability to log user behavior more easily, in particular the objects with which the user interacts, without having to modify existing applications. The tools developed in this work are thus expected to be re-used across a variety of applications.
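The logging use case can be pictured with a short sketch: given widget boxes obtained from outside the application, a click position is matched to the widget under it and recorded. Everything here is an assumption made for illustration, including the `Box` tuple, the `widget_at` and `log_click` functions, the rule of picking the smallest enclosing box when widgets are nested, and the shape of the log entries.

```python
import time
from typing import List, NamedTuple, Optional


class Box(NamedTuple):
    """Hypothetical widget record: label plus bounding box in screen pixels."""
    label: str
    x: float
    y: float
    width: float
    height: float


def widget_at(boxes: List[Box], px: float, py: float) -> Optional[Box]:
    """Return the smallest box containing the point, or None if there is none.

    Choosing the smallest area is an assumption: it favours a button over
    the window or panel that contains it.
    """
    hits = [
        b for b in boxes
        if b.x <= px <= b.x + b.width and b.y <= py <= b.y + b.height
    ]
    return min(hits, key=lambda b: b.width * b.height, default=None)


def log_click(boxes, px, py, log, timestamp=None):
    """Append one click event to `log` and return the widget that was hit."""
    hit = widget_at(boxes, px, py)
    log.append({
        "time": time.time() if timestamp is None else timestamp,
        "x": px,
        "y": py,
        "widget": hit.label if hit else None,
    })
    return hit


if __name__ == "__main__":
    boxes = [Box("window", 0, 0, 800, 600), Box("Save", 10, 10, 60, 24)]
    events = []
    log_click(boxes, 20, 20, events)
    log_click(boxes, 400, 300, events)
    for event in events:
        print(event)
```

Because the boxes are supplied externally, the same logging code would work for any application without modifying it.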

Requirements: We are looking for a candidate with a background in computer science and skills in machine learning, computer vision, or image processing. Knowledge of Python is required; knowledge of C++ is not required but is appreciated.

Location
ISIR, Sorbonne Université
Supervisor
Julien Gori
University referent
n/a
Tags
Assigned
No
Year
2025