About me


Hello, welcome to my personal website! My name is Nelly Barret and I'm a third-year PhD student at École Polytechnique and Inria Saclay, under the supervision of Ioana Manolescu. I'm a member of the CEDAR team, a joint team between Inria and LIX (the Computer Science lab of École Polytechnique) focusing on rich data analytics at cloud scale. My thesis is about user-oriented exploration of semi-structured data sources. More precisely, it is two-fold: (a) we produce compact and expressive descriptions of any data source in any semi-structured data model, i.e., JSON and XML documents, RDF and Property Graphs; (b) we enumerate interesting paths connecting Named Entities, e.g., people, companies and places, in and across heterogeneous datasets. I developed Abstra, a tool that builds such descriptions, and PathWays, a tool for entity-focused path enumeration. Abstra relies on the ConnectionLens project, a tool that aims to build a single graph from different data sources (structured, semi-structured or unstructured). My thesis work is also a collaboration with WeDoData, an SME specialized in data visualization. They provide us with data and many use cases, as well as their feedback as ConnectionLens users. Before my thesis, I graduated from Université Lyon 1, where I obtained my Master's degree in Artificial Intelligence. Finally, I wish to become an associate professor in fields I enjoy: heterogeneous data management, database systems, data integration, GIS (geographic information systems) and coding. Dynamic and hard-working, I'm always ready to take on new challenges!

My curriculum vitae (last updated: Oct. 2023)

ORCID

Education


PhD student in computer science

2020 - 2023
Inria Saclay and École Polytechnique, Palaiseau
Supervised by Ioana Manolescu
  • PhD title: "User-oriented exploration of semi-structured datasets"
  • Main achievements: Entity-Relationship-like summaries out of semi-structured datasets; entity path enumeration between entities of interest in heterogeneous data
  • Attending conferences: BDA 2021, 2022, 2023; CIKM 2021; EDBT/ICDT 2023, 2024; ESWC 2023; CoopIS 2023; EGC 2024, among others
  • Giving talks about my thesis and research to different audiences; highlights include:
    • "Semi-structured data user exploration" - 10/2023, LaHDAK team @ LISN, Université Paris Saclay
    • "Artificial Intelligence: a tool for investigative journalism" - 7/2023, CFI (French media development agency)
    • "From data to journalism" - 6/2023, École Polytechnique students
    • "Computing Generic Abstractions from Application Datasets" - 6/2023, CEDAR team @ Inria Saclay
    • "Doing research in data integration" - 2/2022, RJMI (young female mathematicians and computer scientists meeting)
    • "Understand data intelligently" - 2/2022, middle school pupils
  • Teaching:
    • "Object-Oriented Programming in Java" labs for 1st year engineering students at École Polytechnique
    • "Basic algorithms in Java" labs for 2nd year engineering students at École Polytechnique
    • "Machine Learning in Python" labs for 2nd year Bachelor students at École Polytechnique
  • Scientific training: HiParis summer school 2021, MDD summer school 2022, HiParis summer school 2023
  • Transversal training: "Teaching at university", "Public speaking: captivate, convince and unite your audience" and English courses

Computer Science Master, Artificial Intelligence track

2018 - 2020
Université Lyon 1, Villeurbanne
  • Artificial intelligence: machine learning, multi-agent systems, neural networks
  • Web programming (Web and mobile applications): HTML, CSS, JavaScript
  • Databases: MySQL
  • Programming languages: Python, Java, C/C++
  • Project management and software engineering: Agile methods
  • Others: introduction to networks and cryptography

Computer Science Bachelor

2015 - 2018
Université Lyon 1, Villeurbanne
  • Web programming: HTML, CSS, JavaScript, Bootstrap
  • Databases: MySQL
  • Programming languages: C/C++, Java, Unix, Scheme, Prolog
  • Mathematics: algebra, statistics, optimisation
  • Others: introduction to networks, formal logic, physics and chemistry

Scientific Baccalaureate - with honours

2012 - 2015
Lycée Blaise Pascal, Charbonnières-les-bains
  • Earth and Life Sciences track, with Computer Science option

Publications and research projects


Check my publications on my ORCID record and my DBLP entry!
Research tools

Work experience


Short-term contract: comprehension of complex objects in ConnectionLens graphs

October 2020 - December 2020
INRIA/École Polytechnique, Palaiseau
Supervised by Ioana Manolescu

ConnectionLens is an efficient data integration tool that handles a large number of data formats.
  • Study the current software and suggest refactoring operations to make the code more distributable
  • Study the current classification process and suggest new ideas to make it more efficient
Java, graphs, knowledge extraction

Internship student: Predicting the environment of a neighbourhood with Predihood

February 2020 - July 2020
LIRIS, Villeurbanne
Supervised by Fabien Duchateau and Franck Favetta

Internship in collaboration with the HiL (Home in Love) startup.
  • Write a state-of-the-art review of prediction techniques
  • Propose solutions to scientific challenges: an algorithm for selecting a set of the top-k relevant indicators about neighborhoods, taking into account the distribution of the indicators to help the prediction
    • Select relevant indicators by using several selection techniques
    • Produce a set of relevant indicators to be used by the prediction process
    • Automatically compare the distributions of indicators to get better results
  • Predict the environment variables (e.g. the landscape, the activity or the wealth of a neighborhood)
    • Tune several prediction algorithms, such as KNN, Random Forest, AdaBoost
    • Predict the six environment variables of any neighborhood in France
  • Develop a cartographic visualization interface for neighborhoods
  • Develop an interface for generic tuning of prediction algorithms
  • Popularize the work and present it to different audiences (project members, students)
Internship report (in French)
variable selection, prediction algorithms, experimental evaluation, Python

Factory worker

August 2019
Métaldyne (AAM), Vénissieux
  • Inspection of automotive pulleys
summer job

Student project: Matching and merging geographic entities with GeoAlign

January 2019 - June 2019
LIRIS, Villeurbanne
Supervised by Fabien Duchateau and Franck Favetta

  • Integrate heterogeneous cartographic data from different data providers: Geonames, Bing, Here and OpenStreetMap
  • Create a unique schema from the individual schema of each data provider
  • Propose a tunable formula for detecting correspondences, in terms of attributes (name, address, phone number, ...) and measures (Levenshtein or Jaro-Winkler for strings, Haversine or self-defined measures for geographical attributes)
  • Estimate the matching quality in an automatic way
  • Merge correspondences according to different strategies (e.g. random, data provider first, quality first)
  • Develop an interface for matching and merging correspondences between points of interest (geographic entities) with an estimation of the quality
JavaScript, PHP, matching and merging algorithms, similarity measures

Internship student: Real estate recommendations tailored to user wishes with VizLiris

May 2019 - July 2019
LIRIS, Villeurbanne
Supervised by Fabien Duchateau and Franck Favetta

Internship in collaboration with the HiL (Home in Love) startup.
  • Write a state-of-the-art review of recommendation techniques
  • Integrate data from heterogeneous sources (Excel, JSON, GeoJSON...)
  • Use prediction algorithms to recommend neighborhoods
  • Use clustering algorithms to classify neighborhoods
  • Develop an interface to facilitate the comparison and recommendation of neighbourhoods in France
  • Write the "Comparaison de quartiers" (neighbourhood comparison) and "Scénarios d’utilisation" (usage scenarios) sections of a scientific article
Internship report (in French)
JavaScript, Python, recommendation algorithms

Contract employee for integration week

Summers 2017, 2018 & 2019
Université Lyon 1, Villeurbanne
  • Lead back-to-school lecture-hall sessions and workshops to help new students discover the university
  • Manage student registrations
public speaking, workshop management

Shop assistant

Summers 2016 & 2017
Bershka, Lyon
  • Maintain the store (sales floor, fitting rooms, stock), inform customers, manage online orders and deliveries
multi-task, customer help

Gymnastics coach

2010 - 2017
Le Cran, Tassin
  • Lead a gymnastics class for 22 young gymnasts (between 6 and 11 years old) for 2 hours per week
  • Participate in annual events and competitions
  • Participate in community life (trainings, meetings, etc.)
  • Train in group leadership and gymnastics skills (AF2 training in 2015, UFF training in 2014, AF1 training in 2012)
volunteering, sports

Skills


Programming languages

Python, Java, C/C++, PHP, Prolog, (Scheme)

Frameworks

Scikit-learn, Leaflet, Bootstrap

Web

HTML, CSS, JavaScript

Databases

PostgreSQL, MySQL, (MongoDB)

Tools

Git, JMerise

Others

LaTeX, Microsoft Office

Languages

French (native), English (professional proficiency)

Certifications

Google Digital Active certification, C2I

Leisure activities