1 - The Alignment Gap

Introduction

Welcome to the new and shiny blog post section of this website! I decided to use this space as an informal personal diary, collecting resources for my future projects. For the sake of my sanity I’ll keep the blog simple, with minimal structure, treating it as a stream-of-consciousness space. I’ll also disable AI while writing, so any mistakes are on me! 😉

This first post discusses what I call the “AI alignment gap”. Today we are starting to see early signs of AI use diverging from its original goal. Instead of helping humanity advance, we run the risk of creating systems that behave against our interests, whether by design or by accident. New embodied autonomous agents keep appearing, along with platforms where they can collaborate with each other without supervision. A natural question follows: how can we ensure that we don’t end up like Sarah Connor in Terminator? More importantly, how do we take concrete and effective steps that go beyond just saying “AI is bad, we should limit its development”?

[Image: Terminator meme, via https://screenrant.com/terminator-logic-memes-funny/]

The starting point - what we have

When I first started to reflect on this question, I quickly realised I knew very little about AI safety as a whole. While I had recently co-authored a chapter on Video and Motion Unlearning as a potential countermeasure (see TBA), my knowledge of the topic is still limited. As such, I created this section to recap what I know, dig deeper into suggestions from colleagues (thanks Matteo M!), and keep track of resources I collect during this research.

The big voices — This paragraph covers the work put out by people much smarter than I am on creating actual AI safety guidelines. Off the top of my head, I can already mention the AI Act from the European Union.

The research community

A non-research-oriented approach

The bigger outlook