CV
General Information
Full Name | Sangeet Sagar |
Date of Birth | March 1997 |
Languages | English, Hindi, German (A1.1) |
Education
-
April 2023
M.Sc.
Universität des Saarlandes, Saarbrücken, Germany
- Thesis: Noise Robust Speech Recognition for Search and Rescue Domain
- Major: Language Science and Technology
-
June 2019
B.Tech.
The LNM Institute of Information Technology, Jaipur, India
- Thesis: Analysis of Emotion Recognition using Speech Features
- Major: Electronics and Communication Engineering
Experience
-
2021-2023
Research Assistant
Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH, Saarbrücken, Germany
- As part of the M.Sc. thesis, built a noise-robust German automatic speech recognition (ASR/STT) system capable of working under hostile conditions such as search and rescue missions. Also trained RNN-based large language models for speech recognition.
- Trained an open-source, attention-based BiRNN system for punctuation restoration and truecasing in German. The system outperformed the baseline Vosk model by over 14% in recall.
-
2019-2020
University research assistant
Institute of Formal and Applied Linguistics, Charles University, Prague, Czechia
- Principal tester of the live speech-language translation (SLT) system at live test sessions for the ELITR project.
- Trained and tested a Czech punctuator system for live ASR, using in-domain and out-of-domain data.
- Performed domain adaptation for live ASR by fine-tuning the language model on speaker-specific data.
-
2019
University research assistant
Faculty of Information Technology, Brno University of Technology, Brno, Czechia
- Developed a system for cross-lingual topic identification in low-resource scenarios, covering Kinyarwanda, Zulu, Hindi, and the Reuters corpus.
- Achieved a weighted average precision of 0.52 (over 10 topics) on Kinyarwanda by learning a simple linear transformation from the Kinyarwanda language space to the English language space.
Open Source Projects
-
2022
English and Chinese poetry generation
- Generated English poetry with LSTM, encoder-decoder, and transformer (GPT) models and compared them using perplexity (PPL) as the evaluation metric. The fine-tuned GPT-2 model produced the highest-quality poems, i.e. those with the lowest PPL. Also trained a topic-prediction model to study how well a machine-generated poem is interpreted by a system trained on human-written poems. Report | Code
-
2022
Span extraction based slot-filling using attention and RNNs
- Performed slot-filling (and intent recognition) using an RNN with a multi-head attention mechanism. The main idea was to model slot-filling as span extraction and to use the available information about the slot type whose value is to be provided. Achieved an F1 score of 0.83, compared to 0.96 for the baseline model. Report | Code
-
2022
Evaluating and defending against stealthy backdoor attacks
- Presented defense strategies that counter backdoor attacks. These defenses significantly decrease the attack success rate (by 77%) on specific samples designed by the attacker. Each input is transformed so that the trigger words are replaced and the attack is not triggered. The defenses incur no significant runtime cost. Report
-
2021
Out-of-vocabulary (OOV) word estimation using subword representation
- Achieved a better OOV rate and perplexity than the baseline at three levels of granularity (character level, small vocabulary, large vocabulary) with appropriate hyperparameter tuning. This was done by training an RNN-based language model to artificially generate a corpus and computing the OOV rate on generated corpora of varying sizes. Report | Code
-
2019
Low Resource Languages for Emergent Incidents (LORELEI)
-
2018
(Bachelor's Thesis) Analysis of Emotion Recognition using Speech Features
- Achieved a 7% improvement over the baseline in classifying speech signals by human emotion (angry, disgust, fear, happy, etc.). Proposed feature extraction algorithms such as the S-Transform and spectrogram images of the speech signal for emotion recognition, implemented on the SAVEE and Emo-DB datasets with classifiers such as GMM, CNN, and MLPNN. Report | Code
Programming Skills
- Languages: Python, Unix shell, MATLAB, R
- Toolkits/Libraries: PyTorch, SpeechBrain, GPT, Scikit-learn, NumPy, Hugging Face, Matplotlib
- Misc.: Git, advanced Linux user, extensive experience with cluster computing (e.g. SLURM)