ML Engineer • Data Scientist • PhD Researcher

Stefan Seman

Building production-ready AI systems that ship, scale, and deliver impact

Building AI solutions with LLMs, RAG, and Generative AI

AI + Biomedicine • Research-driven engineering
PhD Researcher

Hello.

I'm an ML Engineer, Data Scientist, and PhD Researcher with a strong background in higher education and published research. I specialize in machine learning, AI, generative AI, and software development, with unique expertise bridging data science and biomedicine. I'm an analytical thinker with a pragmatic problem-solving approach and excellent communication skills.

LLMs RAG Systems Software Development Document Intelligence

Featured Projects.

Showcasing my work in ML/AI, Software Engineering, and Cloud

Generative AI

AI-Powered Document Summarizer

As the ML/AI engineer, I developed a document summarizer leveraging RAG, Large Language Models (LLMs), and NLP techniques. Delivered a scalable, high-accuracy solution for extracting insights from unstructured text data.

Python AWS RAG Docker
Generative AI

AI-Code Copilot

Built an AI code copilot from scratch for integration with enterprise infrastructure. Designed the architecture, implemented RAG techniques, and fine-tuned LLMs. Handled data collection, extraction, transformation, and analysis.

Python Redis RAG AI Agents
ML/AI

Document Chatbot

Enterprise chatbot application with optimized data ingestion pipelines and document retrieval. Evaluated and fine-tuned LLMs for improved performance and accuracy.

Python FastAPI Azure Search Azure Functions
Cloud / AWS

Large-Scale Document ETL Pipelines

Built large-scale document ingestion and transformation pipelines using AWS services. Implemented workflows that load, parse, and transform documents and write the results to S3 for resilient processing.

Python AWS Lambda Step Functions DynamoDB
Data Engineering

Auto Market Data Platform

Ingested auto market data from databases and web scraping, processed it in Databricks, and loaded curated datasets for analytics.

Python Azure Docker Data Lake
Research

SILICOFCM - EU Horizon 2020

Development of a computational platform for clinical trials. Conducted ECG and CPET analysis, statistical analysis, and published research in peer-reviewed journals.

Python PyTorch Data Analysis Biostatistics

Career Journey.

Core competencies and growth across roles

Oct 2024 - Present

Senior Machine Learning Engineer & Software Engineer

404 Solutions

  • Development: Designed and operated multi-tenant AI systems for enterprise clients
  • Document Intelligence: Delivered extraction, classification, and summarization solutions
  • Cloud Delivery: Built AWS workflows (S3, Step Functions, Bedrock, APIs) for multi-org deployments
Python AWS Snowflake FastAPI RAG
Oct 2023 - Sep 2024

Senior Data Scientist & Software Engineer in AI, R&D

Publicis Sapient

  • Lead Developer: Led development from PoC to production-grade systems
  • Engineering: Built enterprise RAG systems with retrieval optimization
  • Stakeholder Delivery: Delivered client-facing solutions and technical propositions
Python Azure Snowflake AI Agents Transformers
May 2022 - Oct 2023

Data Scientist & Developer → Senior Specialist

Valcon Netherlands

  • Team Leadership: Led and mentored a team of 6 engineers
  • Architecture: Owned design of document processing microservices
  • Innovation: Developed PoC systems for conversational AI and NLP integration
Python Azure Snowflake NLP ML Team Leadership
Oct 2021 - Apr 2022

Data Scientist

Optimal Systems GmbH

  • Experimentation: Systematic model evaluation and hyperparameter tuning
  • Analysis: Data-driven insights to support development decisions
  • Quality: Testing and validation of ML classification systems
Python Docker ML Azure
Mar 2020 - Mar 2022

Researcher

BioIRC - Bioengineering R&D Center

  • Platform Development: Built components for EU Horizon 2020 computational platform
  • Signal Analysis: ECG, CPET, and biomedical data processing
  • Publication: Peer-reviewed papers and international conference presentations
Python Biostatistics ECG CPET
Oct 2018 - Mar 2022

PhD Researcher

University of Belgrade

  • Research Design: Developed methodology and statistical frameworks
  • ML Application: Applied machine learning to biomedical datasets
  • Academic Output: Published papers and presented internationally
PyTorch Machine Learning Biostatistics
Oct 2014 - Jul 2018

Sports Scientist

Sport and Recreation Center Obliqus

  • Development: Built algorithms and services for signal processing and data collection
  • Diagnostics: Functional testing and performance analysis for athletes
  • Reporting: Program evaluation and evidence-based recommendations
Python LabVIEW SPSS

Publications & Research.

Academic contributions and research work

A Machine Learning Approach to Gene Expression in Hypertrophic Cardiomyopathy

Pavic, J., Zivanovic, M.N., Seman, S., et al.

Pharmaceuticals (MDPI), 2024

Analysis of apoptosis-regulating gene expression in HCM patients using machine learning for feature importance and clustering analysis, identifying potential biomarkers for disease progression.

Ventilatory Efficiency Parameters Outperform Peak VO2 in Monitoring Therapy Effects in HCM

Seman, S., Tesic, M., Popovic, D., et al. (SILICOFCM Investigators)

Progress in Cardiovascular Diseases, Vol. 87, 2024

Study identifying CPET parameters that most accurately reflect therapeutic efficacy in patients with hypertrophic cardiomyopathy over 16-week treatment and 36-month follow-up.

Defining the Importance of Stress Reduction in Managing Cardiovascular Disease - The Role of Exercise

Popovic, D., Bjelobrk, M., Tesic, M., Seman, S., et al. (HL-PIVOT Collaborative)

Progress in Cardiovascular Diseases, Vol. 70, pp. 84-93, 2022

Examines psychological and social factors influencing cardiovascular health, demonstrating that structured exercise programs enhance resilience to stress and reduce CVD risk. Cited 60+ times.

Physical activity and exercise as an essential medical strategy for the COVID-19 pandemic and beyond

Seman, S., Srzentic Drazilov, S., et al. (HL-PIVOT Network)

Experimental Biology and Medicine, Vol. 246, Issue 21, 2021

Highlights how physical activity supports psychological, social, and physical health during the COVID-19 pandemic, and outlines exercise-driven strategies to strengthen immune response and preventive care.


Technical Skills.

Technologies and tools I work with

Programming Languages

Python SQL LabVIEW Bash

AI & ML Frameworks

FastAPI Flask PyTorch PySpark LangChain LlamaIndex Transformers

Cloud & DevOps

AWS Azure Docker Kubeflow Git Grafana

Data & Storage

DynamoDB Redis PostgreSQL Snowflake pgvector Pinecone Milvus

AI Specializations

Generative AI RAG LLMs NLP AI Agents Fine-Tuning

Research Domains

Biomedicine CPET Sport Science ECG Cardiovascular Physiology

Education.

Ph.D. Candidate

Sport Science

University of Belgrade

2018 - Present

Research focus on machine learning applications in exercise physiology and biomedical signal processing.

Master's Degree

Sports Medicine

University of Novi Sad

Completed

Bachelor's Degree

Sport Science

University of Belgrade

Completed

Certifications

Microsoft Azure AI Associate
Microsoft Azure Data Fundamentals
Databricks Lakehouse Fundamentals
Databricks Generative AI Fundamentals
Coursera COVID-19 Contact Tracing (Johns Hopkins)

Get in Touch.

Let's collaborate on something great

I'm always interested in discussing research collaborations, data science opportunities, and innovative projects at the intersection of AI and healthcare.