Umid Suleymanov

My Resume

Experience

2024 - Present

Graduate Research Assistant, Virginia Tech, USA

Researching and implementing meta-learning and few-shot learning to prepare defenses against zero-day attacks. Built a prototypical neural network for few-shot learning.
Published a paper on Deep Unlearning of Breast Cancer Histopathological Images for Enhanced Responsibility. Researching machine unlearning techniques that, for privacy reasons, allow specific data points to be forgotten after a deep learning model has been trained.
Working on Membership Inference Attacks (MIA) to measure the privacy leakage of various machine learning models.
Contributing to the development of AI systems that are ethical, reliable, and secure.

2022 - 2023

Senior Data Scientist @ E-Gov Development Center

Developed a sentiment analysis model for measuring user satisfaction, which saved 12 hours per week of manual labeling and analysis.
Built a machine learning model to predict customer waiting time and notify users in the queue, leading to a 16% reduction in queue-related complaints.

2019 - 2022

Leading Data Scientist @ E-Gov Development Center

Generated pretrained word embeddings and applied them to sentiment analysis. Worked with a text corpus of 164 million tokens. Improved the accuracy of the previous model by 4%.
Strategized the full machine learning lifecycle: idea generation, proofs of concept, predictive model development, implementation in the production environment, and monitoring.
Improved the process speed by 32% by training a machine learning model to detect and anonymize certain special nouns. Modeled the problem as a named entity recognition task.

2018 - 2019

Data Engineer @ E-Gov Development Center

Identified ways to improve data reliability, efficiency, and quality, mainly for queue and NLP datasets.

Education

2024 - 2028

Virginia Tech University

PhD in Computer Science and Applications

Graduate Research Assistant

2019 - 2021

Khazar University

Master of Science in Computer Science

First Class Honors Degree (top 1%) / GPA: 4/4.

2014 - 2018

ADA University

Bachelor of Science in Computer Science

Activities and societies: ACM ICPC North-Eastern Regional Contest finals.
Senior Design Project: Text Classification for Azerbaijani News Articles Using Machine Learning Approaches.

2013 - 2014

ADA University

English for Academic Purposes

Graduated from English for Academic Purposes at ADA University.

Skills

NLP, Tabular Data

Python

Tensorflow, Keras

Pandas, scikit-learn, Numpy

Docker, FastAPI, Flask

SQL, BigQuery, Spark

Languages

Azerbaijani - Native

English - Fluent

Turkish - Fluent

My Publications

Training and Evaluation of Word Embedding Models for Azerbaijani Language

Recently, natural language representation models have attracted an increasing amount of attention from researchers. Various approaches have been proposed for learning these continuous vector representations. In this work, we analyze the effectiveness of various word embedding learning approaches for the Azerbaijani language. We concentrate mainly on two methodologies: (1) word2vec and (2) GloVe. We trained both models on a text corpus of cleaned Azerbaijani news articles and parsed books. Moreover, we created intrinsic analogy tasks for Azerbaijani, as introduced by Mikolov et al. To evaluate word vector models in Azerbaijani, we performed the intrinsic analogy tasks as well as two separate extrinsic evaluation tasks. This work is one of the initial reports on the evaluation of word embeddings on both intrinsic and extrinsic tasks for Azerbaijani, which is a low-resource, agglutinative language.
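As a rough illustration of how an intrinsic analogy task of the kind introduced by Mikolov et al. is usually scored, here is a minimal sketch using vector arithmetic and cosine similarity. The two-dimensional vectors, the English placeholder words, and the function names are all hypothetical and chosen only for readability; they are not the embeddings, vocabulary, or evaluation code from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def solve_analogy(vectors, a, b, c):
    """Return the word d that best completes 'a is to b as c is to d'."""
    # Target point: vec(b) - vec(a) + vec(c)
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three question words are excluded from the candidates.
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

def analogy_accuracy(vectors, questions):
    """Fraction of (a, b, c, expected_d) questions answered correctly."""
    correct = sum(solve_analogy(vectors, a, b, c) == d
                  for a, b, c, d in questions)
    return correct / len(questions)

# Hypothetical toy embeddings (not trained on any corpus).
TOY_VECTORS = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "city": [0.0, 3.0],
}

# Hypothetical analogy questions in (a, b, c, expected_d) form.
TOY_QUESTIONS = [
    ("man", "woman", "king", "queen"),
    ("king", "queen", "man", "woman"),
]

if __name__ == "__main__":
    print(solve_analogy(TOY_VECTORS, "man", "woman", "king"))
    print(analogy_accuracy(TOY_VECTORS, TOY_QUESTIONS))
```

With real embeddings, the same accuracy figure would be computed over a full question set per analogy category, which is what makes the score comparable between models such as word2vec and GloVe.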

Text Classification for Azerbaijani Language Using Machine Learning

Text classification systems will help to solve the text clustering problem in the Azerbaijani language. There are some text classification applications for foreign languages, but we tried to build a newly developed system to solve this problem for the Azerbaijani language. Firstly, we tried to identify potential application areas. The system will be useful in many areas. It will mostly be used in news feed categorization: news websites can automatically categorize news into classes such as sports, business, education, science, etc. The system is also used in sentiment analysis for product reviews. For example, a company shares a photo of a new product on Facebook and receives a thousand comments about it; the system classifies the comments as positive or negative. The system can also be applied in recommender systems, spam filtering, etc. Various machine learning techniques such as Naive Bayes, SVM, and Multi-layer Perceptron have been applied to solve the text classification problem in the Azerbaijani language.

Sentiment Polarity Detection in Azerbaijani Social News Articles

The text classification field of natural language processing has been experiencing remarkable growth in recent years. In particular, sentiment analysis has received considerable attention from both industry and the research community. However, only a few research examples exist for the Azerbaijani language. The main objective of this research is to apply various machine learning algorithms to determine the sentiment of news articles in the Azerbaijani language. Approximately 30,000 social news articles have been collected from online news sites and labeled manually as negative or positive according to their sentiment. Initially, text preprocessing was applied to the data in order to eliminate noise. Secondly, to convert the text to a more machine-readable form, the BOW (bag of words) model has been applied. More specifically, two variants of the BOW model, tf-idf and a frequency-based model, have been used as vectorization methods. Additionally, SVM, Random Forest, and Naive Bayes have been applied as the classification algorithms, and their combinations with the two vectorization approaches have been tested and analyzed. Experimental results indicate that SVM outperforms the other classification algorithms.
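To make the two bag-of-words vectorizations mentioned above concrete, here is a minimal pure-Python sketch that builds frequency vectors and tf-idf vectors for the same toy documents. The documents, the function names, and the plain idf formula log(N / df) are assumptions made for illustration; libraries typically use smoothed variants, and this is not the preprocessing or code used in the paper. The classification step (SVM, Random Forest, Naive Bayes) is not shown.

```python
import math
from collections import Counter

def build_vocabulary(docs):
    """Sorted list of all distinct whitespace-separated tokens."""
    return sorted({tok for doc in docs for tok in doc.split()})

def count_vectors(docs, vocab):
    """Frequency-based BOW: one row of raw term counts per document."""
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        vectors.append([counts[term] for term in vocab])
    return vectors

def tfidf_vectors(docs, vocab):
    """tf-idf BOW: raw term count multiplied by log(N / document frequency)."""
    counts = count_vectors(docs, vocab)
    n_docs = len(docs)
    # Document frequency is at least 1 because vocab is built from docs.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return [[tf * w for tf, w in zip(row, idf)] for row in counts]

# Hypothetical toy documents standing in for preprocessed news articles.
TOY_DOCS = ["good news good", "bad news"]
TOY_VOCAB = build_vocabulary(TOY_DOCS)

if __name__ == "__main__":
    print(TOY_VOCAB)
    print(count_vectors(TOY_DOCS, TOY_VOCAB))
    print(tfidf_vectors(TOY_DOCS, TOY_VOCAB))
```

In this toy example the token "news" appears in every document, so its tf-idf weight is zero, while the frequency-based vector still counts it. That difference is what the comparison between the two vectorization approaches probes.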

Automated News Categorization using Machine Learning methods

Despite being one of the most linguistically rich languages, Azerbaijani has been researched less in the context of natural language processing. The text corpus created from Azerbaijani news articles is designed for applying supervised machine learning approaches to automatic news labeling. The chi-squared test and LASSO methods have been implemented for feature selection and pre-processing. Applying supervised machine learning approaches to the text corpus allowed us to compare the performance of well-established supervised methods on the Azerbaijani language.

Empirical Study of Online News Classification Using Machine Learning Approaches

The developed text classification system is designed to automate the news classification process. The text corpus for training and testing the system is formed from Azerbaijani news articles. The system will be useful for online news categorization and automatic news labeling for news agencies. Naive Bayes, Support Vector Machines, and Artificial Neural Networks have been implemented to solve the news classification problem. Moreover, a number of approaches such as stemming, stop word removal, and feature reduction have been implemented to improve both performance and accuracy.

Instance Segmentation of Handwritten Text On Historical Document Images Using Deep Learning Approaches

Handwritten text segmentation is one of the essential initial steps for higher-level document processing tasks such as text recognition. The topic of handwriting segmentation and recognition of archive documents, especially with the presence of printed characters, has been encountered rarely in academic research. In this paper, we tried to address the task of segmentation and at later stage recognition of handwritten archive documents to purify and extend information databases from the past years. In our case, we defined the problem as an instance-based image segmentation task, for which we propose two different methods; one-stage and two-stage architectures.

Latest Blogs

MLOps-with-MLflow

By: Umid

In this project, I aim to develop an end-to-end machine learning project, from problem formulation to model deployment and monitoring in a production environment. MLflow will be used for experiment tracking and model registry management, and Docker for containerizing the model. A machine learning model goes through different phases in its lifecycle, from requirements elicitation to monitoring in production. Various industry standards describe the exact phases and their outcomes, such as CRISP-DM and the Microsoft Team Data Science Process. Nonetheless, a typical machine learning process involves business understanding, requirements elicitation, exploratory data analysis, data transformation, feature engineering, modeling, experiment tracking, evaluation, productionizing, and finally monitoring and continuous improvement.

Song Recommendation with Spark

By: Umid

A Spark project that recommends new songs to users based on log files of their listening history. The project was built with Spark and Spark MLlib.
The system was developed using implicit feedback derived from user behaviour.

Az-Wikipedia Parser and Analyzer

By: Umid

The project aims at cleaning, parsing, and analysing Az Wikipedia. The distribution of properties inside article templates, the categories, and the most used external references outside Wikipedia have been analyzed. Templates in Az Wikipedia are mostly hard to locate inside the article body and not readily available for analysis, so manual, heuristic-based code is provided for parsing them.

Who am I?

Graduate Research Assistant / Machine Learning Engineer

Personal Info

Phone : 5406050963

Address : Virginia, USA

Email : umidsulleymanov@gmail.com

GitHub : https://github.com/usuleymanov

Certificates & Achievements

TensorFlow Developer Certificate

8 Publications on ML

2x finalist @ International Data Analysis Olympiads

AzInTelecom Hackathon Winner
