CV | Sangeet Sagar

General Information

Full Name	Sangeet Sagar
Languages	Hindi (Native), English (C1), German (A2)

Education

April 2023
M.Sc.

Universität des Saarlandes, Saarbrücken, Germany
- GPA- 1.8 (ECTS, lower is better)
- Thesis- Noise Robust Speech Recognition for Search and Rescue Domain
- Major- Language Science and Technology
June 2019
B.Tech.

The LNM Institute of Information Technology, Jaipur, India
- GPA- 7.13/10.0
- Thesis- Analysis of Emotion Recognition using Speech Features
- Major- Electronics and Communication Engineering

Experience

Sept 2023 - July 2025
Research Engineer for Automatic Speech Recognition

EML Speech Technology GmbH, Munich, Germany
- Led the development of end-to-end models for real-time Automatic Speech Recognition (ASR) at live conferences in a commercial setting.
- Developed a C++ runtime for a streaming faster Conformer-Transducer (NeMo) and integrated its CPU-based decoder with our in-house end-to-end ASR decoder.
- Initiated and guided the integration of a target speaker extraction system into the core ASR pipeline, improving WER by 18% on overlapping speech and enabling deployment in challenging multi-speaker scenarios.
May 2023 - Sept 2023
Speech-to-Text Intern

Airbus Defence and Space GmbH, Munich, Germany
- Developed speech-to-text systems for aerospace-domain data using state-of-the-art ASR models such as Wav2Vec2 and Whisper, with the goal of improving communication between pilots and air traffic control (ATC).
June 2021 - Feb 2023
Research Assistant | HiWi

German Research Center for Artificial Intelligence (DFKI) GmbH, Saarbrücken, Germany
- Designed and developed a noise-robust German automatic speech recognition (speech-to-text) system as part of my M.Sc. thesis, built to operate under hostile noise conditions such as search and rescue operations.
- Trained an open-source attention-based BiRNN punctuation restoration system with truecasing for German. The system outperformed the Vosk baseline model by over 14% in recall.
Oct. 2019 - Dec 2020
University research assistant

Institute of Formal and Applied Linguistics, Charles University, Prague, Czechia
- Served as the principal tester and evaluator for a live spoken language translation (SLT) system (ELITR project), identifying key failure points and providing critical feedback.
- Trained and tested (with in-domain and out-of-domain data) a Czech punctuation system for live ASR, improving the usability of live ASR transcripts.
Feb. 2019 - Sept. 2019
University research assistant

Faculty of Information Technology, Brno University of Technology, Brno, Czechia
- Developed and implemented a novel system for cross-lingual topic identification in low-resource languages (Kinyarwanda, Zulu, Hindi), building on a linear transformation into the English embedding space. Achieved a weighted average precision of 0.52 on Kinyarwanda.
- Managed core tasks including text feature extraction, classifier training, and embedding generation for cross-lingual analysis.

Open Source Projects

2022
English and Chinese poetry generation
- Generated English poetry with LSTM, encoder-decoder, and transformer (GPT) models and compared them using perplexity (PPL) as the evaluation metric. We conclude that the fine-tuned GPT-2 model generated the highest-quality poems, i.e. those with the lowest PPL score. Also trained a topic-prediction model to study how well a machine-generated poem is interpreted by a system trained on human-written poems. Report | Code
2022
Span extraction based slot-filling using attention and RNNs
- Performed slot-filling (and intent recognition) using an RNN with a multi-head attention mechanism. The main idea was to model slot-filling as span extraction and to use the available information about the slot type whose value is to be provided. Achieved an F1 score of 0.83, compared to 0.96 for the baseline model. Report | Code
2022
Evaluating and defending against stealthy backdoor attacks
- Presented defense strategies that counter backdoor attacks. These defenses decrease the attack success rate by 77% on specific samples designed by the attacker. Each input is transformed so that the trigger words are replaced and the attack is not triggered. The defenses add no significant runtime cost. Report
2021
Out-of-vocabulary (OOV) word estimation using subword representation
- Achieved a better OOV rate and perplexity score than the baseline at three levels of granularity (character level, small vocabulary, large vocabulary) with appropriate hyperparameter tuning. This was done by training an RNN-based language model to generate artificial corpora and computing the OOV rate on generated corpora of varying sizes. Report | Code
2021
LSTM-Based Part-of-Speech Tagger
- Used LSTM-based models with word embeddings that incorporate sub-word information to perform POS tagging. Compared LSTM and Bi-LSTM models against a bigram HMM-based part-of-speech tagger that uses the Viterbi algorithm. Report | Code
2019
Low Resource Languages for Emergent Incidents (LORELEI)
- Performed topic identification based on multinomial subspace models (for the DARPA-funded LORELEI program) on low-resource languages. Achieved a weighted average precision score of 0.5212 (across 10 different topics) on Kinyarwanda. PPT | Code
2018
(Bachelor's Thesis) Analysis of Emotion Recognition using Speech Features
- Achieved a 7% improvement over the baseline in classifying speech signals by human emotion (angry, disgust, fear, happy, etc.). Implemented on the SAVEE and Emo-DB datasets using classifiers such as GMM, CNN, and MLPNN. We propose feature extraction algorithms such as the S-Transform and the image spectrogram of the speech signal for emotion recognition. Report | Code

Programming Skills

Programming Languages: Python, C++, Bash, MATLAB
Libraries/Frameworks: PyTorch, K2/Icefall, SpeechBrain, Huggingface
Tools & Platforms: Docker, Git, AWS, HPC (SLURM), Linux (advanced user)
