Resume
Data Scientist & ML Engineer · Brasília, Brazil
Experience
Machine Learning Engineer
Centro de Gestão e Estudos Estratégicos · Brasília, Distrito Federal
• Develop and deploy end-to-end data analytics applications used daily by analysts, with Python/Flask and FastAPI backends, JavaScript/HTML frontends, and Docker-based deployment pipelines with GitLab documentation for the IT department.
• Lead the company's first fine-tuning project, training a specialist embedding model on neuroscience literature using QLoRA, with a classifier on top to identify domain-specific articles from OpenAlex for statistical analysis.
• Built a RAG pipeline and a speaker diarization API for YouTube videos, integrating LLMs into internal backend services.
• Develop state-of-the-art methodologies to identify technology emergence from patent data, combining NLP techniques such as GloVe embeddings, TF-IDF, and recurrent neural networks with network and multi-layer network analysis.
• Investigate transformer internals and token embedding interactions in trained models, with ongoing research on sparse embeddings and fine-tuning performance.
• Manage Linux/Ubuntu server infrastructure, including Docker image creation, multi-user container management, and Nginx configuration.
• Analyze deep learning model performance on small datasets, applying data augmentation techniques in NLP contexts.
• Regularly present deep mathematical analyses of model architectures to the team, including end-to-end derivations of deep learning models, bridging theoretical understanding and practical research directions.
Oct 2021 — Present
Data Science Intern
Centro de Gestão e Estudos Estratégicos · Brasília, Distrito Federal
• Analyzed the performance of machine learning and deep learning models on small datasets, applying data augmentation techniques in natural language processing contexts where needed.
• Researched state-of-the-art methods in technology emergence, machine learning, deep learning, and natural language processing.
• Created and maintained internal data analytics applications, using JavaScript and HTML for the front end and Flask (Python) for the back end.
• Converted several web applications to desktop applications.
Feb 2021 — Sept 2021
Education
Doctor of Philosophy - PhD
University of Brasília · Aug 2023 — Present
• Research focuses on applying maximum entropy graph ensemble models and random walk dynamics to characterize the Brazilian innovation system at the individual researcher level, bridging statistical mechanics and network science.
• First paper (under review): constructed an assist network connecting science, technology, and business activities from patent, publication, and company data. Applied a Maximum Entropy null model (BiCM) to filter statistically significant interactions, revealing a power-law backbone and domain-specific 3-node motif structures that characterize the Brazilian researcher profile.
• Current direction: comparing random walk diffusion on real networks against maximum entropy ensemble baselines to quantify what genuine network structure adds beyond degree effects, applied to patent classification networks from the same dataset.
Master of Physics
University of Brasília · Aug 2019 — Dec 2021
• Developed reservoir computing models from scratch to forecast COVID-19 time series for Distrito Federal, building recurrent neural networks with deep mathematical grounding in dynamical systems theory and Fourier analysis.
• Derived and implemented the full mathematical framework of Echo State Networks, from spectral radius conditions to memory capacity analysis.
Bachelor of Physics
University of Brasília · Jul 2014 — Aug 2019
• Analyzed a physical dynamical system.
• Developed algorithms to generate Poincaré sections, bifurcation diagrams, and recurrence plots.
• Built a deep understanding of dynamical systems theory.
• Technologies used: MATLAB.
Skills
Languages
Data Science
ML Engineering
AI Engineering
Data Engineering
MLOps