CV

General Information

Full Name Sangeet Sagar
Date of Birth March 1997
Languages English, Hindi, German (A1.1)

Education

Experience

  • 2021-2023
    Research Assistant
    Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH, Saarbrücken, Germany
    • As part of my MS thesis, built a noise-robust automatic speech recognition (STT) system for German, capable of working under hostile conditions such as search and rescue missions. Also trained large RNN-based language models for speech recognition.
    • Trained an open-source attention-based BiRNN punctuation restoration and truecasing system for German. The system outperformed the baseline Vosk model by over 14% in recall.
  • 2019-2020
    University research assistant
    Institute of Formal and Applied Linguistics, Charles University, Prague, Czechia
    • Principal tester of the live speech-language translation (SLT) system at live test sessions for the ELITR project.
    • Trained and tested a Czech punctuator system for live ASR with in-domain and out-of-domain data.
    • Performed domain adaptation for live ASR by fine-tuning the LM on speaker-specific data.
  • 2019
    University research assistant
    Faculty of Information Technology, Brno University of Technology, Brno, Czechia
    • Developed a system for cross-lingual topic identification in low-resource scenarios, covering Kinyarwanda, Zulu, Hindi, and the Reuters corpus.
    • Achieved a weighted average precision of 0.52 (over 10 topics) on Kinyarwanda by learning a simple linear transformation from the Kinyarwanda space to the English space.

Open Source Projects

  • 2022
    English and Chinese poetry generation
    • Generated English poetry with LSTM, encoder-decoder, and transformer (GPT) models and compared them using perplexity (PPL) as the evaluation metric. The fine-tuned GPT-2 model generated the highest-quality poems, i.e. those with the lowest PPL. Also trained a topic-prediction model to study how well a machine-generated poem is interpreted by a system trained on human-written poems. Report | Code
  • 2022
    Span extraction based slot-filling using attention and RNNs
    • Performed slot filling (and intent recognition) using an RNN with a multi-head attention mechanism. The main idea was to model slot filling as span extraction and to use the available information about the slot type for which a value is to be provided. Achieved an F1 score of 0.83, compared to 0.96 for the baseline model. Report | Code
  • 2022
    Evaluating and defending against stealthy backdoor attacks
    • Presented defense strategies that counter backdoor attacks. These defenses decrease the attack success rate by 77% on specific samples designed by the attacker. They work by transforming each input so that the trigger words are replaced and the attack is not triggered. The defenses can be used without any significant runtime cost. Report
  • 2021
    Out-of-vocabulary (OOV) word estimation using subword representation
    • Achieved a better OOV rate and perplexity than the baseline at three levels of granularity (character level, small vocabulary, large vocabulary) with appropriate hyperparameter tuning. This was done by training an RNN-based language model to artificially generate corpora and computing the OOV rate on generated corpora of varying sizes. Report | Code
  • 2021
    LSTM-Based Part-of-Speech Tagger
    • Used LSTM-based models with word embeddings that incorporate sub-word information to perform POS tagging. Compared LSTM and Bi-LSTM models with a bigram HMM-based part-of-speech tagger that uses the Viterbi algorithm. Report | Code
  • 2019
    Low Resource Languages for Emergent Incidents (LORELEI)
    • Performed topic identification based on multinomial subspace models (for the DARPA-funded LORELEI program) on low-resource languages. Achieved a weighted average precision of 0.5212 (over 10 topics) on Kinyarwanda. PPT | Code
  • 2018
    (Bachelor's Thesis) Analysis of Emotion Recognition using Speech Features
    • Achieved a 7% improvement over the baseline on the classification of speech signals by human emotion (anger, disgust, fear, happiness, etc.). Proposed the use of feature extraction algorithms such as the S-Transform and spectrogram images of the speech signal for emotion recognition, implemented on the SAVEE and Emo-DB datasets with classifiers such as GMM, CNN, and MLPNN. Report | Code

Programming Skills

  • Languages: Python, Unix shell, MATLAB, R
  • Toolkits/Libraries: PyTorch, SpeechBrain, GPT, Scikit-learn, NumPy, Hugging Face, Matplotlib
  • Misc.: Git, advanced Linux user, extensive experience with cluster computing (e.g. SLURM)