Publications

Interpretable Forecasting of Energy Demand in the Residential Sector
Sakkas N, Yfanti S, Daskalakis C, Barbu E, Domnich M. Interpretable Forecasting of Energy Demand in the Residential Sector. Energies. 2021; 14(20):6568. https://doi.org/10.3390/en14206568

The paper provides a first approach to identifying the potential of counterfactual analysis in mid-term forecasting of building-stock energy consumption, and compares GP and NN approaches.

The paper's results informed the detailed elaboration of the TRUST AI specification for the building energy sector, as well as the expectations set for the experimentation and validation conducted, concluded, and reported mostly in key paper [4] below.

A cooperative coevolutionary hyper-heuristic approach to solve lot-sizing and job shop scheduling problems using genetic programming
Zeiträg, Y., Rui Figueira, J., & Figueira, G. (2024). A cooperative coevolutionary hyper-heuristic approach to solve lot-sizing and job shop scheduling problems using genetic programming. International Journal of Production Research, 1-28.

In this paper, we explore a lot-sizing and scheduling problem in a job shop environment with Genetic Programming (GP), the fundamental method used in TRUST-AI. In real-world situations, quick solution generation is crucial for frequent replanning due to dynamic events and fast decision-making. Solution time is also critical when interacting with decision support systems to iteratively generate different solutions (e.g. when considering multiple objectives). Therefore, for the first time, GP is used to generate simple rules that find good solutions to this NP-hard problem in under a second. The average gap of the obtained solutions to a state-of-the-art exact solver is less than 2%, showing another successful application of GP.

Learning efficient in-store picking strategies to reduce customer encounters in omnichannel retail
Neves-Moreira, F., & Amorim, P. (2024). Learning efficient in-store picking strategies to reduce customer encounters in omnichannel retail. International Journal of Production Economics, 267, 109074.

This paper focuses on a sequential decision making problem (as in Use Case 2 – Online Retail) where a picker agent needs to pick online orders in a retail store without disturbing physical customers. The retail store environment is modelled as a Markov Decision Process (MDP) and a Q-learning algorithm is used to learn the best shopping paths to follow inside the store. The simulation environment developed for this omnichannel retail use case can easily be adapted to allow for the extraction of symbolic policies that can be used to tackle in-store picking with an explainable-by-design approach.
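The MDP-plus-Q-learning setup described above can be sketched in a few lines. The layout, rewards, and penalty values below are invented for illustration and are not the authors' actual environment: a 2x5 store grid where the picker starts at (0, 0), must reach a pick face at (0, 4), and is penalized for stepping into a crowded cell, so that tabular Q-learning learns a detour through the bottom aisle.

```python
import random

# Hypothetical toy environment: 2x5 grid, crowded cell (0, 2) penalized.
ROWS, COLS = 2, 5
START, GOAL, CROWDED = (0, 0), (0, 4), (0, 2)
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def step(state, a):
    r = min(max(state[0] + MOVES[a][0], 0), ROWS - 1)
    c = min(max(state[1] + MOVES[a][1], 0), COLS - 1)
    nxt = (r, c)
    reward = -1.0                            # time cost per move
    if nxt == CROWDED:
        reward -= 10.0                       # customer-encounter penalty
    if nxt == GOAL:
        reward += 20.0
    return nxt, reward, nxt == GOAL

random.seed(1)
Q = {(r, c): [0.0] * 4 for r in range(ROWS) for c in range(COLS)}
alpha, gamma, eps = 0.5, 0.95, 0.2
for _ in range(5000):                        # training episodes
    s = START
    for _ in range(50):                      # episode step cap
        a = random.randrange(4) if random.random() < eps else Q[s].index(max(Q[s]))
        s2, rew, done = step(s, a)
        target = rew + (0.0 if done else gamma * max(Q[s2]))
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2
        if done:
            break

# Greedy rollout: the learned path should reach the goal and skip the crowd.
s, path = START, [START]
while s != GOAL and len(path) < 20:
    s, _, _ = step(s, Q[s].index(max(Q[s])))
    path.append(s)
print(path)
```

In the paper the state and action spaces are of course far richer; the sketch only shows how the customer-encounter penalty shapes the learned picking path.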

Quantifying reinforcement-learning agent’s autonomy, reliance on memory and internalisation of the environment
Ingel, A., Makkeh, A., Corcoll, O., & Vicente, R. (2022). Quantifying reinforcement-learning agent’s autonomy, reliance on memory and internalisation of the environment. Entropy, 24(3), 401.

Drivers of and counterfactuals for the final energy and electricity consumption in EU industry
Sakkas, N., Athanasiou, N. (2021). Drivers of and counterfactuals for the final energy and electricity consumption in EU industry. Academia Letters, Article 3451. https://doi.org/10.20935/AL3451

The paper provides a first attempt at identifying the potential of explainability and counterfactual analysis in mid-term industrial energy consumption forecasting. It adapts paper [1] above to the case of industry, based on data sourced from Eurostat and ODYSSEE-MURE. The paper highlighted the potential impact of explainable approaches to industrial energy; it was explorative in nature and was not taken up in the more detailed investigations, which focused exclusively on short-term, building-level forecasting (and not industry).

Open data or open access? The case of building data
Sakkas, N., Yfanti, S. (2021). Open data or open access? The case of building data. Academia Letters, Article 3629. https://doi.org/10.20935/AL3629

The paper details the middleware data-sharing service (https://ds.leiminte.com) developed in the early phases of TRUST AI and its potential to provide a basis for an open-source solution for the energy sector.

This work largely informed the open source specification of the TRUST AI Energy instance.

Explainable Approaches for Forecasting Building Electricity Consumption
Sakkas, N.; Yfanti, S.; Shah, P.; Sakkas, N.; Chaniotakis, C.; Daskalakis, C.; Barbu, E.; Domnich, M. Explainable Approaches for Forecasting Building Electricity Consumption. Energies 2023, 16, 7210. https://doi.org/10.3390/en16207210

This is the key paper of the Energy Use Case investigation and presents the extensive background scientific investigation carried out in TRUST AI for the Energy Use Case. GP approaches were thoroughly benchmarked against the LSTM and STS approaches typically used in the literature and were evaluated both for their performance and for their explainability (global and local) potential. Counterfactual and feature-importance analysis was found particularly important for uptake in real settings and real-life scenarios. The good accuracy of GP, as well as its ability to train in short time frames, were also among the interesting results reached.

This work has informed, from a scientific point of view, the commercial and open source customisation of TRUST AI with regard to the envisaged Energy instance of TRUST AI.

Technology Readiness Levels (TRLs) in the Era of Co-Creation
Yfanti, S.; Sakkas, N. Technology Readiness Levels (TRLs) in the Era of Co-Creation. Appl. Syst. Innov. 2024, 7, 32. https://doi.org/10.3390/asi7020032

The paper presents an approach for enhancing the TRL of the energy forecasting module developed within TRUST AI. Co-creation was found to be an interesting option for developments that are inherently limited in their TRL potential (such as the forecasting algorithm developed in TRUST AI). Thus, the paper generalises the experiences and interactions that generated this approach for the energy use case, addresses TRL at large, and proposes some key amendments that are due in the era of co-creation. It should be remembered that the TRL concept emerged in the post-war period and reflected the producer innovation model that was almost exclusively dominant in that era.

This work has informed, from a commercialisation and stakeholder interaction point of view, the commercial and open source customisation of TRUST AI with regard to the envisaged Energy instance of TRUST AI.

Building data models and data sharing. Purpose, approaches and a case study on explainable demand response
Sakkas, N., Chaniotaki, C., & Sakkas, N. (2022, December). Building data models and data sharing. Purpose, approaches, and a case study on explainable demand response. In IOP Conference Series: Earth and Environmental Science (Vol. 1122, No. 1, p. 012066). IOP Publishing.

The paper was based on the workings of a workshop organised within the framework of the SBEFin 2022 conference in Helsinki, where new business models were discussed with regard to building energy applications and possible open-source approaches of broad value to the energy community. As the workshop took place before the GP analyses had reached conclusive results, the emphasis was mostly on the functionalities to be supported and on the data management approach pursued and implemented in TRUST AI via the https://ds.leiminte.com middleware that was developed.

The paper informed the open source strategy pursued in TRUST AI in view of an open TRUST AI Energy instance.

Deep neural networks using a single neuron: folded-in-time architecture using feedback-modulated delay loops
Stelzer, F., Röhm, A., Vicente, R., Fischer, I., & Yanchuk, S. (2021). Deep neural networks using a single neuron: folded-in-time architecture using feedback-modulated delay loops. Nature communications, 12(1), 5164.

Personalized choice model for forecasting demand under pricing scenarios with observational data - The case of attended home delivery
Ali, Ö. G., & Amorim, P. (2024). Personalized choice model for forecasting demand under pricing scenarios with observational data—The case of attended home delivery. International Journal of Forecasting, 40(2), 706-720.

The delivery time slot management in the Online Retail case that we are exploring in TRUST-AI presents an important predictive task: customer choice of time slots. This paper introduces a method to personalize choice models. The model provides interpretable customer- and context-specific preferences, and price sensitivity. The results indicate that while the popular non-personalized multinomial logit model does very well at the aggregate (day–slot) level, personalization provides significantly and substantially more accurate predictions at the individual–context level.

Scheduling wagons to unload in bulk cargo ports with uncertain processing times
Ferreira, C., Figueira, G., Amorim, P., & Pigatti, A. (2023). Scheduling wagons to unload in bulk cargo ports with uncertain processing times. Computers & Operations Research, 160, 106364.

Genetic Programming (GP) can be applied to classification and regression tasks, but also to devise rules for prescriptive problems. That is the case for scheduling problems, one of the applications we are exploring in TRUST-AI in addition to the three main use cases. In this paper, we explore a real-world setting of bulk cargo ports, where the scheduling of unloading wagons to the stockyard needs to be optimized. This dynamic problem is addressed by dispatching rules, which are learned with GP. The obtained rules allow operations managers to avoid performance deterioration under schedule disruptions, while incurring fewer deviations from the original schedules. Finally, GP is also used to evolve rules for a greedy randomized heuristic that optimizes a static approximation of the problem. The rules evolved by GP are compact and elegant, demonstrating its ability to learn effective, explainable-by-design AI models for this problem.
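To illustrate what "dispatching rule" means here: the rule is simply a priority function over waiting jobs, and scheduling reduces to repeatedly dispatching the highest-priority job as the machine becomes free. The rule below is a hypothetical GP-style expression, not one of the authors' evolved rules:

```python
# A hypothetical GP expression tree flattened to Python: favor short jobs
# and urgent due dates (invented for this sketch).
def priority(job, now):
    return -(job["p"] + max(job["due"] - now, 0.0))

def dispatch(jobs):
    """Greedy single-machine simulation driven by the priority rule."""
    now, order, pending = 0.0, [], list(jobs)
    while pending:
        nxt = max(pending, key=lambda j: priority(j, now))
        pending.remove(nxt)
        order.append(nxt["id"])
        now += nxt["p"]                     # machine busy until job finishes
    return order

jobs = [
    {"id": "A", "p": 4.0, "due": 10.0},
    {"id": "B", "p": 1.0, "due": 2.0},
    {"id": "C", "p": 2.0, "due": 12.0},
]
print(dispatch(jobs))                       # B is short and urgent, so it goes first
```

Because applying such a rule is just one pass over the pending jobs, schedules can be regenerated in milliseconds after a disruption, which is exactly the property exploited in the paper.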

Multi-modal multi-objective model-based genetic programming to find multiple diverse high-quality models
Sijben, E. M. C., Alderliesten, T., & Bosman, P. A. (2022, July). Multi-modal multi-objective model-based genetic programming to find multiple diverse high-quality models. In Proceedings of the Genetic and Evolutionary Computation Conference (pp. 440-448).

For the Genetic Programming (GP) work package, we aim to enhance the capabilities of the genetic programming gene-pool optimal mixing evolutionary algorithm (GP-GOMEA). GP is often cited as being uniquely well suited to contribute to explainable AI because of its capacity to learn (small) symbolic models that have the potential to be interpreted. Nevertheless, like many ML algorithms, GP typically results in a single best model. In practice, however, the best model in terms of training error may well not be the most suitable one as judged by a domain expert, for various reasons. Hence, to increase the chances that domain experts deem a resulting model plausible, it becomes important to be able to explicitly search for multiple, diverse, high-quality models that trade off different meanings of accuracy. In this paper, we achieve precisely this with a novel multi-modal, multi-tree, multi-objective GP approach that extends GP-GOMEA, a modern model-based GP algorithm already effective at searching for small expressions. As a result, we aim to improve the learning loop of the TRUST-AI concept by enabling the user to inspect multiple models in one iteration of the learning loop. This work is part of project deliverable 4.1. This version of the GP-GOMEA algorithm will be utilized in at least one of the TRUST-AI project use cases.

Memetic semantic genetic programming for symbolic regression
Leite, A., & Schoenauer, M. (2023, March). Memetic semantic genetic programming for symbolic regression. In European Conference on Genetic Programming (Part of EvoStar) (pp. 198-212). Cham: Springer Nature Switzerland.

This paper summarizes the work done in Task 4.2 of TRUST-AI: Fundamentally advancing Memetic Semantic GP. Before the TRUST-AI project, MSGP had been limited to Boolean or multi-valued contexts. The version developed during TRUST-AI now handles continuous variables and functions. Furthermore, it can be biased toward short, and hence more likely to be explainable, expressions. Last but not least, it performs on par with GP-GOMEA (depending on the benchmark problem) and outperforms other GP-based approaches.

Explanatory World Models via Look Ahead Attention for Credit Assignment
Corcoll, O., & Vicente, R. (2022, July). Explanatory World Models via Look Ahead Attention for Credit Assignment. In UAI 2022 Workshop on Causal Representation Learning.

A Guide for Practical Use of ADMG Causal Data Augmentation
Poinsot, A., & Leite, A. (2023, March). A Guide for Practical Use of ADMG Causal Data Augmentation. In ICLR 2023 Workshop on Pitfalls of limited data and computation for Trustworthy ML.

Compact symbolic models, like those learned under the TRUST-AI paradigm, tend to be less data hungry and more robust. Still, some particular settings with very limited data represent a challenge. This paper explores data augmentation in tabular data problems. In particular, it analyzes a causal data augmentation strategy considering different settings.

Evolvability degeneration in multi-objective genetic programming for symbolic regression
Liu, D., Virgolin, M., Alderliesten, T., & Bosman, P. A. (2022, July). Evolvability degeneration in multi-objective genetic programming for symbolic regression. In Proceedings of the Genetic and Evolutionary Computation Conference (pp. 973-981).

Since Genetic Programming (GP) has the capacity to learn (small) symbolic models that have the potential to be interpreted, it is leveraged as the backbone optimization method in the TRUST-AI concept. One can even optimize for both the model's size and the model's performance by searching in a multi-objective manner. In this way, we enable the user to select a model from a list of models of varying complexity (and thus possibly varying explainability). However, problems arise when this multi-objective optimization is done with NSGA-II, since it cannot efficiently optimize objectives of varying difficulty. Optimizing for size is far easier than optimizing for model performance, since the population will naturally come to contain models of smaller size. We overcome this problem by limiting how many models of each complexity level can survive a generation. We compare this new version of NSGA-II, evoNSGA-II, with seven existing multi-objective GP approaches on ten widely used data sets, and find that evoNSGA-II is equal or superior to these approaches in almost all comparisons.
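The survival-limiting idea can be shown in isolation. This is a minimal sketch of the core mechanism only, with invented numbers; the real evoNSGA-II combines it with NSGA-II's non-dominated sorting. With models scored as (error, size) pairs, capping how many models of each size may survive prevents tiny but inaccurate models from flooding the population:

```python
# Minimal sketch (not the actual evoNSGA-II implementation): cap survivors
# per complexity level so small, poor models cannot dominate selection.
def capped_survival(population, n_survivors, cap_per_size):
    """population: list of (error, size) pairs; lower is better for both."""
    taken, counts = [], {}
    for model in sorted(population, key=lambda m: m[0]):   # best error first
        _, size = model
        if counts.get(size, 0) < cap_per_size:
            taken.append(model)
            counts[size] = counts.get(size, 0) + 1
        if len(taken) == n_survivors:
            break
    return taken

pop = [(0.9, 1), (0.8, 1), (0.7, 1), (0.6, 1),   # many tiny, poor models
       (0.20, 5), (0.10, 7), (0.15, 6)]          # larger, accurate models
print(capped_survival(pop, n_survivors=4, cap_per_size=1))
```

Without the cap, the four size-1 models would crowd out the accurate larger ones under a naive size-biased selection; with it, at most one model per complexity level survives.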

Coefficient mutation in the gene-pool optimal mixing evolutionary algorithm for symbolic regression
Virgolin, M., & Bosman, P. A. (2022, July). Coefficient mutation in the gene-pool optimal mixing evolutionary algorithm for symbolic regression. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (pp. 2289-2297).

For the Genetic Programming (GP) work package, we aim to enhance the capabilities of the genetic programming gene-pool optimal mixing evolutionary algorithm (GP-GOMEA). Although GP-GOMEA is among the top-performing algorithms for symbolic regression (SR), it lacks a mechanism to optimize coefficients. This paper studies how fairly simple approaches for optimizing coefficients can be integrated into GP-GOMEA. In particular, we considered two variants of Gaussian coefficient mutation. We applied GP-GOMEA with the best-performing coefficient mutation approach to the data sets of SR-Bench, a large SR benchmark, for which a ground-truth underlying equation is known. We find that coefficient mutation can help rediscover the underlying equation by a substantial amount, but only when no noise is added to the target variable. In the presence of noise, GP-GOMEA with coefficient mutation discovers alternative but similarly accurate equations. This work is part of the project deliverable 4.1.
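Gaussian coefficient mutation itself is a simple idea and can be sketched outside GP-GOMEA. Here the expression structure is fixed and invented for illustration; only its numeric constants are perturbed with Gaussian noise, and a perturbation is kept whenever it lowers the training error:

```python
import random

# Illustrative sketch of Gaussian coefficient mutation only (not GP-GOMEA):
# mutate the constants of a fixed symbolic expression, keep improvements.
def model(x, a, b):
    return a * x * x + b                 # candidate expression; a, b are coefficients

def mse(a, b, data):
    return sum((model(x, a, b) - y) ** 2 for x, y in data) / len(data)

random.seed(0)
data = [(x, 2.0 * x * x + 1.0) for x in range(-5, 6)]   # ground truth: 2x^2 + 1
a, b, sigma = 0.5, 0.0, 0.3
for _ in range(3000):
    a2 = a + random.gauss(0.0, sigma)    # Gaussian mutation of each coefficient
    b2 = b + random.gauss(0.0, sigma)
    if mse(a2, b2, data) < mse(a, b, data):
        a, b = a2, b2                    # keep only improving mutations
print(a, b)                              # should approach 2.0 and 1.0
```

This mirrors the paper's finding on noiseless data: when the expression's structure matches the ground truth, coefficient mutation alone can recover the underlying constants.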

Deep learning-based auto-segmentation of paraganglioma for growth monitoring
Sijben, E. M. C., Jansen, J. C., de Ridder, M., Bosman, P. A., & Alderliesten, T. (2024, March). Deep learning-based auto-segmentation of paraganglioma for growth monitoring. In Medical Imaging 2024: Image Perception, Observer Performance, and Technology Assessment (Vol. 12929, pp. 247-256). SPIE.

For the healthcare use case, we aim to predict the future tumor growth of paragangliomas in the head and neck area, and thereby possibly improve treatment strategies for these patients. Treating paraganglioma patients can prevent severe symptoms; however, treating patients who do not actually need it comes at the cost of possible unnecessary side effects and complications. Improved measurement techniques could enable growth-model studies with large amounts of tumor volume data, possibly giving valuable insights into how these tumors develop over time. This paper presents an auto-segmentation model that segments the tumor volumes in 3D MRI scans, allowing us to measure the tumor volume automatically. We thoroughly evaluated the model as a tool for automatic volumetric tumor measurement and used it to create a data set of 311 tumor volume measurements over time, with a median of 5 volume measurements per tumor (range: 3-15). Using this data set, we will train an explainable model to predict tumor growth (using a specific variant of a genetic programming algorithm developed in the TRUST-AI project). This work is part of project deliverable 5.4.

Function Class Learning with Genetic Programming: Towards Explainable Meta Learning for Tumor Growth Functionals
Sijben, E. M. C., Jansen, J. C., Bosman, P. A. N., & Alderliesten, T. (2024). Function Class Learning with Genetic Programming: Towards Explainable Meta Learning for Tumor Growth Functionals. arXiv preprint arXiv:2402.12510.

For the healthcare use case, we aim to learn the general underlying growth pattern of paragangliomas from multiple tumor growth data sets, in which each data set contains a tumor's volume over time. To do so, we propose a novel approach based on genetic programming to learn a function class, i.e., a parameterized function that can be fit anew for each tumor. We do so in a unique, multi-modal, multi-objective fashion to find multiple potentially interesting function classes in a single run. This approach nicely ties in with the explainable by design approach we are taking within the TRUST-AI concept leveraging Symbolic Regression for explainability purposes. We evaluate our approach on the tumor growth data set that was previously crafted in the TRUST-AI project. This work is part of the project deliverable 5.3, in which the genetic programming algorithm was customized for the healthcare use case (but with possible wider applications). The clinical impact of the models found using this approach will be evaluated in the validation study of the project and will be reported in project deliverable 5.5.

Enhancing Counterfactual Explanation Search with Diffusion Distance and Directional Coherence
Domnich, M., & Vicente, R. (2024). Enhancing Counterfactual Explanation Search with Diffusion Distance and Directional Coherence. arXiv preprint arXiv:2404.12810.

This paper presents an approach to enhance counterfactual explanation searches by incorporating diffusion distance and directional coherence. The proposed method improves the quality and interpretability of counterfactual explanations, making them more useful in real-world applications. By leveraging diffusion distance, the approach ensures that the generated counterfactuals are closer to the data manifold, while directional coherence maintains logical consistency in the transformations. This work contributes to the ongoing efforts in TRUST-AI to develop more explainable and robust AI models.

COIN: Counterfactual inpainting for weakly supervised semantic segmentation for medical images
Shvetsov, D., Ariva, J., Domnich, M., Vicente, R., & Fishman, D. (2024). COIN: Counterfactual inpainting for weakly supervised semantic segmentation for medical images. arXiv preprint arXiv:2404.12832.

The paper presents a novel approach for automating segmentation-label extraction for tumors in the medical domain, with the assistance of an explainable AI technique: counterfactual explanations. This work is relevant for the healthcare use case, as it deals with tumor volume assessment from medical images. Additionally, the paper furthers the formalization and understanding of counterfactual explanations, which are a core functionality of the TRUST-AI platform.

Exploring Commonalities in Explanation Frameworks: A Multi-Domain Survey Analysis
Barbu, E., Domnich, M., Vicente, R., Sakkas, N., & Morim, A. (2024). Exploring Commonalities in Explanation Frameworks: A Multi-Domain Survey Analysis. arXiv preprint arXiv:2405.11958.

The paper presents the results of user studies, which include questionnaires and expert interviews, conducted across three domains in the TRUST-AI project. These study results were crucial to define the scope and target users of the TRUST-AI framework.

Multi-objective genetic programming for explainable reinforcement learning
Videau, M., Leite, A., Teytaud, O., & Schoenauer, M. (2022, April). Multi-objective genetic programming for explainable reinforcement learning. In European Conference on Genetic Programming (Part of EvoStar) (pp. 278-293). Cham: Springer International Publishing.

This paper demonstrates the power of Genetic Programming to come up with short, easily understandable controllers for control problems, also known as reinforcement learning problems in the ML community. This work is part of WP4, though it is not directly related to MSGP: there are no fitness cases here, hence no semantics to be used (the DEAP library was used instead). However, these results, such as the explainable controller for the lunar lander problem, clearly show that Genetic Programming can find explainable control laws for reinforcement learning problems, unlike recent deep RL approaches.