Digital data, whether corporate or public, personal or content-related, is the fuel of artificial intelligence. So, to improve the quality of what an AI produces, you need to feed it the best-quality data. Audric Lhoas, Head of Product Management at Proximus NXT Luxembourg, explains.

Let's start at the beginning: what is data?

Audric Lhoas: Data is a collection of information, personal or otherwise, that companies collect, hold and in some cases exploit, depending on the service they offer. A bank, for example, holds a certain amount and type of information about its customers. Some of this information is required by law, but it also enables the bank to adapt its commercial offering.

In what way is this data fundamental in the digital age?

AL: Take, for example, data relating to websites and their content, but also to the users who visit these sites. This data has value and can be monetised. Companies like Google, and the GAFAM companies more broadly, have placed the monetisation of data at the heart of their business models.

Today, the subject of monetising data is being raised by all companies and organisations. They are all wondering how they can monetise, or at the very least make the most of, the data they have at their disposal, over and above their own activities.

And this raises a number of security issues?

AL: There are indeed many issues at stake, and they all lead us to think about the risks involved in using data. For a user, in the event of data leakage, loss or fraudulent use by a third party, the seriousness will vary depending on whether the data was freely shared by the user (holiday photos on a social network, for example) or whether it is confidential financial information that the user's bank was supposed to protect. For the company that stores the data and is supposed to keep it secure, its reputation is at stake.

Why is data so crucial to the deployment of Artificial Intelligence, particularly generative AI?

AL: You have to think of data as food: it is what AI is made of, what enables it to function and grow. So first you have to train the AI to recognise what it should or can ingest, i.e. what is of interest to the company and what is not. It is by retaining and interconnecting a specific type of data that AI gains skills and, just like an employee, offers added value to the company that 'employs' it.

Cleaning up data means enabling AI to process it quickly and efficiently, limiting costs and maximising gains.
Audric Lhoas, Head of Product Management, Proximus NXT

The quality of the data must also be optimal...

AL: Absolutely, and here we're touching on one of the major issues: that of adding value to data. In fact, over and above the flow of data and its quantity, we need to provide the system in question with 'cleaned' data. One of our main activities today is therefore data cleansing and enhancement. In other words, making the data 'edible', understandable and exploitable by an AI, but also limiting the data to what is of interest to the company. This last point is crucial, because the IT infrastructure that enables data to be processed quickly, i.e. the chips and IT components, has become expensive and less readily available on the market. Efficiently processing data and extracting the best from it therefore means limiting costs and maximising gains.

How does Proximus NXT support its customers in this area?

AL: We support our customers from A to Z in this process. It starts with cleaning up and structuring the data, and determining what is and isn't usable within it. It then goes on to securing the data, anonymising it if necessary, and finally modelling it so that it can be ingested and processed efficiently by the AI. We also deal with the hardware side, i.e. the supply of GPU chips, and the production of solutions using AI. By concentrating these different areas of expertise, we are a one-stop shop for our customers.
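
To make the first steps described above more concrete, here is a minimal Python sketch of cleaning, filtering and pseudonymising records before they are handed to an AI system. It is an illustration only, not Proximus NXT's actual process: the field names, the sample records, the `SALT` value and every function name are hypothetical. Note also that a salted hash is pseudonymisation rather than full anonymisation, which would typically require stronger measures.

```python
import hashlib

# Hypothetical raw customer records; field names are illustrative only.
RAW_RECORDS = [
    {"customer_id": "C001", "name": " Alice Martin ", "email": "ALICE@EXAMPLE.COM", "segment": "retail"},
    {"customer_id": "C001", "name": "Alice Martin", "email": "alice@example.com", "segment": "retail"},
    {"customer_id": "C002", "name": "Bob Klein", "email": "", "segment": "Business"},
]

# Assumption: only these fields are "of interest to the company".
RELEVANT_FIELDS = ("customer_id", "email", "segment")

# Assumption: a demo salt; a real system would manage this as a secret.
SALT = "demo-salt"


def clean(record):
    """Keep only the relevant fields, trimming whitespace and lower-casing values."""
    return {field: str(record.get(field, "")).strip().lower() for field in RELEVANT_FIELDS}


def is_usable(record):
    """A record is usable only if every relevant field is non-empty."""
    return all(record[field] for field in RELEVANT_FIELDS)


def pseudonymise(record):
    """Replace the identifier with a salted hash and drop the e-mail address."""
    out = dict(record)
    digest = hashlib.sha256((SALT + record["customer_id"]).encode("utf-8")).hexdigest()
    out["customer_id"] = digest[:12]
    out.pop("email")
    return out


def prepare(records):
    """Clean, filter, de-duplicate and pseudonymise records, in that order."""
    seen = set()
    prepared = []
    for raw in records:
        record = clean(raw)
        if not is_usable(record):
            continue  # incomplete data is set aside rather than ingested
        if record["customer_id"] in seen:
            continue  # skip duplicates of a customer already processed
        seen.add(record["customer_id"])
        prepared.append(pseudonymise(record))
    return prepared


if __name__ == "__main__":
    for row in prepare(RAW_RECORDS):
        print(row)
```

In this sketch the duplicate record and the record with a missing e-mail address are both discarded, so only one pseudonymised record remains. Restricting the data to the relevant fields before any heavier processing also reflects the cost argument made earlier: less data to process means less compute to pay for.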

What do you need to bear in mind when using these solutions?

AL: It's important to bear in mind that the AI solutions we're talking about are not yet capable of producing 100% correct results, despite the fact that the data has been enhanced. The data may still be contradictory, incomplete or false. The results are amazing and are still improving, but there is still a margin for error. That's why we always recommend that our customers test AI internally, for a given period, to see what it produces without exposing themselves directly to the market. In this way, a fine-tuning phase can be implemented and the system optimised.