Our Ethical Commitments

"AI will have a profound impact on society and needs to be built in a way that earns trust, protects against bias, and respects privacy as a fundamental human right."
– Mitra Azizirad, Corporate VP, Microsoft AI and Innovation

Human-Centric AI

Humans are at the core of all our products. While AI is an automated entity that operates independently once trained, it is essential to involve humans in the AI’s learning loop to achieve optimal performance. To do so, Dathena has developed its Classification Review module, a process where clients can review the results of our AI models and reassign labels as they see fit.

The Classification Review module is integrated into all of our dashboards. With it, you can:

     • Investigate the results of Data and User Risk Assessment to fine-tune the AI models

     • Update and/or create new groups of documents to be protected through Augmented Data Protection to allow for tailored rules

     • Review user accesses to sensitive documents through External File Sharing Management and mark them as exceptions

This module enables us to identify sensitive data faster, with fewer false alarms and lower miss rates, and it does so simply and with minimal labelling effort.
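As an illustration only, the review step can be pictured as reviewer corrections overriding the model's labels, with the corrected labels then used to measure false alarms and misses. The sketch below is not Dathena's implementation: the document IDs, the "sensitive" and "public" labels, and the function names are all hypothetical.

```python
def apply_review(predicted, corrections):
    """Return the labels after reviewer corrections override model predictions."""
    reviewed = dict(predicted)
    reviewed.update(corrections)
    return reviewed


def error_rates(predicted, reviewed, positive="sensitive"):
    """Compute (false-alarm rate, miss rate), treating reviewed labels as ground truth.

    false-alarm rate = FP / (FP + TN): share of non-sensitive documents that were flagged
    miss rate        = FN / (FN + TP): share of sensitive documents that were not flagged
    """
    tp = fp = tn = fn = 0
    for doc_id, truth in reviewed.items():
        flagged = predicted[doc_id] == positive
        is_positive = truth == positive
        if flagged and is_positive:
            tp += 1
        elif flagged and not is_positive:
            fp += 1
        elif not flagged and is_positive:
            fn += 1
        else:
            tn += 1
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    miss_rate = fn / (fn + tp) if (fn + tp) else 0.0
    return false_alarm_rate, miss_rate


# Hypothetical model output for five documents.
predicted = {
    "doc1": "sensitive",
    "doc2": "sensitive",
    "doc3": "public",
    "doc4": "public",
    "doc5": "public",
}
# Hypothetical reviewer decisions: doc2 was a false alarm, doc3 was missed.
corrections = {"doc2": "public", "doc3": "sensitive"}

reviewed = apply_review(predicted, corrections)
false_alarm_rate, miss_rate = error_rates(predicted, reviewed)
print(f"false-alarm rate: {false_alarm_rate:.2f}, miss rate: {miss_rate:.2f}")
```

In this toy example there is one false alarm among three non-sensitive documents and one miss among two sensitive ones. In a real deployment, the corrected labels would also be fed back as training data, which is outside the scope of this sketch.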

Transparency

At Dathena, we believe that for our clients to trust our AI system, it must be open and transparent. This is why transparency is at the core of our vision of ethical AI, allowing the people who use our products to do so with complete knowledge and control.

This implies strong commitments:

     • We commit to full openness of our systems and technologies. Rather than hiding our expertise behind a technological barrier, we commit to sharing the information that helps the user understand our system and its strengths to make the best use of it.

     • To build trust, we are committed to sharing with you as much knowledge and information about our products as you need. This includes in-depth tutorials on how to use them, as well as more theoretical information on the technological and scientific concepts necessary to understand them.

     • To give you a clear view of our products' integration capabilities, we are committed to working closely with the users involved in our software's deployment, and to allowing them to engage fully in the process. It is through this communication and teamwork that we build trust that our systems will integrate smoothly within your environment.

     • Finally, transparency implies a clear visibility of the results and performance of our products. We are committed to providing metrics to evaluate the performance of such models and the tools necessary to test them.

Accountability

Dathena’s AI researchers are dedicated to understanding in detail the behaviour of the models we build, and to measuring the quality of these models to provide optimal performance out of the box. By monitoring their performance and including human feedback in the training loop, models are continuously retrained to better address your organization’s needs.

Anyone using our tool must be clear about who trains their AI systems, what data was used in that training and, most importantly, what went into their algorithm’s recommendations. As such, our experts are accountable for the AI systems they develop and are committed to complying with:

     • Our internal code of good conduct

     • The Data Science Association’s “Code of Professional Conduct”

     • The international guidelines and regulations when handling sensitive data

Fairness

Fairness is a principle that aims to prevent unreasonable favouritism or discrimination. AI models are not fair or unfair per se: they are simply pattern recognition engines without any personality or inherent biases. However, if they are trained on biased data that reflects human preferences and judgement, they will inherit these biases and become unfair themselves.

At Dathena, we identify and eliminate these biases from our data at an early stage, so that our models remain objective and our clients receive results that are reliable, accurate and fair. This is made possible by the Explainability methods we have put in place to study and understand the input data used to train our models.

Explainability

Explainability is defined as “the condition that allows a user to understand decisions made by the model and its subparts through processes before, during, and after the construction of the AI model.” It offers robust assurance that the AI models we study behave as expected, because we can dissect every constituent element and interrogate its choices. This implies that our AI models are:

     • Interpretable: we can clearly understand the relationship between the input and output

     • Auditable: we can answer questions such as: Are unintentional biases included in the model? Does this model have security risks? Is this AI model compliant with a specific data protection regulation?

Through our #AINoMagic initiative, we are committed to providing end-to-end explainable solutions and turning what are usually black-box algorithms into transparent, white-box AI models.

AI Security & Privacy

At Dathena, we take your privacy and security very seriously, and we are committed to taking all necessary steps to ensure your data is safe with us. We have put two key measures in place:

1. Data Encryption

Data encryption is the process of encoding information so that only authorised parties can recover the original content. Although breaking the encryption is conceivable in theory, it is practically impossible to reverse it without the original key.

Dathena encrypts all personal data with AES (Advanced Encryption Standard) before storing it in the database, to provide anonymity. In this way, the values of PII (Personally Identifiable Information) items are hidden, but their types (name, credit card number, …) are left unencrypted so that they can be displayed on our dashboards and used to understand the sensitivity of documents.

After finishing the data processing and analysis, the full content of the database along with anything that has been processed is deleted from our system. Nothing is stored permanently, apart from the aggregated information needed to display results on the dashboards.
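The idea of hiding a PII value while keeping its type readable can be sketched as follows. This is a minimal illustration, not Dathena's implementation: it assumes the third-party Python `cryptography` package is installed, and the choice of AES-256 in GCM mode, the record layout, and the function names are our own assumptions, since the text above only states that AES is used.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def encrypt_pii(aesgcm, pii_type, value):
    """Encrypt the PII value; keep its type in clear text for display purposes."""
    nonce = os.urandom(12)  # a fresh 96-bit nonce for every record
    # The type is passed as associated data: it is authenticated but not encrypted.
    ciphertext = aesgcm.encrypt(nonce, value.encode("utf-8"), pii_type.encode("utf-8"))
    return {"type": pii_type, "nonce": nonce, "ciphertext": ciphertext}


def decrypt_pii(aesgcm, record):
    """Recover the original value; only possible with the original key."""
    plaintext = aesgcm.decrypt(
        record["nonce"], record["ciphertext"], record["type"].encode("utf-8")
    )
    return plaintext.decode("utf-8")


# Hypothetical key handling: a real system would load the key from a key store.
key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)

record = encrypt_pii(aesgcm, "credit_card_number", "4111 1111 1111 1111")
print(record["type"])              # the type stays readable
print(record["ciphertext"].hex())  # the value is unreadable without the key
```

A dashboard could count records by `record["type"]` without ever decrypting a value. Decrypting with a different key, or after tampering with the stored type, raises an error instead of returning data.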

2. Data Vectorisation

Vectorisation (or embedding) is the process of transforming text into numerical vectors, a step that is essential for Machine Learning and Deep Learning. Once the text has been converted, the vectors cannot simply be reversed back into text and read.

Indeed, those vectors only approximate the semantics of the text, using a representation that averages a group of words with similar meaning. This way, we can tell whether two vectors represent the same information, but not whether they contain a specific word, such as a client’s name or the contents of a confidential contract. This process is key to document processing, and Dathena uses it to process and analyse documents across all three products.
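To make the averaging idea concrete, here is a toy sketch. The three-dimensional word vectors below are hand-picked for illustration and are entirely hypothetical; real embeddings are learned from data and have hundreds of dimensions, and this is not a description of Dathena's models.

```python
import math

# Hypothetical word vectors: words with similar meaning get similar coordinates.
TOY_VECTORS = {
    "contract": (0.9, 0.1, 0.0),
    "agreement": (0.8, 0.2, 0.0),
    "signed": (0.1, 0.9, 0.0),
    "executed": (0.2, 0.8, 0.0),
    "lunch": (0.0, 0.1, 0.9),
    "menu": (0.0, 0.2, 0.8),
}


def embed(text):
    """Represent a text as the average of the vectors of its known words."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        raise ValueError("no known words in text")
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def cosine(a, b):
    """Cosine similarity: close to 1 for similar directions, close to 0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


doc_a = embed("contract signed")
doc_b = embed("agreement executed")  # same meaning, no word in common with doc_a
doc_c = embed("lunch menu")          # unrelated topic

print(f"a vs b: {cosine(doc_a, doc_b):.2f}")
print(f"a vs c: {cosine(doc_a, doc_c):.2f}")
```

The first two texts share no words yet end up with almost the same vector, while the third lands far away. Because different word combinations average to the same point, the vector alone does not reveal which exact words were in the text.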