I am Shivasai

Machine Learning Engineer / Data Scientist

Name: Shivasai B

Profile: Machine Learning Engineer / Data Scientist

Email: shivasaib05@gmail.com

About Me

I’m a passionate Data Scientist / Machine Learning Engineer and a recent graduate with a Master of Science in Computer Science (Data Science Specialization) from the University of Central Missouri, where I earned a 4.0 GPA.

I have 3.5+ years of professional experience as a Machine Learning Engineer.

My expertise includes Data Analysis, Statistical Modelling, Predictive Analytics and Machine Learning, combining data from disparate sources to draw actionable insights.

To learn more or connect on an opportunity, don’t hesitate to reach out.


Skillset
Languages
Python 85%
SQL 80%
Machine Learning 80%
Statistics 80%
Skills

  • Python - NumPy, Pandas, Scikit-learn, NLTK, OpenCV, PySpark
  • Python (Visualization) - Matplotlib, Seaborn, Plotly
  • Machine Learning - Linear and Non-Linear Regression, Logistic Regression, KNN, SVM, PCA, Decision Tree, Random Forest, AdaBoost, Gradient Boosting, XGBoost, K-Means, Naive Bayes
  • Statistics - Hypothesis Testing, PCA, A/B Testing, Chi-Square
  • SQL - Oracle 11g, PostgreSQL, SQLite3, MS SQL
  • IDE - PyCharm, Anaconda, Visual Studio, Eclipse, IntelliJ
  • Big Data - Hadoop (HDFS, MapReduce), Spark, Hive
  • Tableau
  • Cloud Technologies - AWS (EC2, S3, SageMaker), GCP, Heroku
  • Tools - PuTTY, WinSCP, Git, GitHub, SoapUI, HP-QC ALM, Oracle Siebel CRM

3.5

YEARS OF EXPERIENCE

12

PROJECTS
Experience

NTT DATA SERVICES
Machine Learning Engineer Dec 2020 - Present
Dallas, TX, USA

    Responsibilities:
  • Predicted credit card customer churn using a Random Forest classifier and other ensemble ML algorithms.
  • Gained hands-on experience with Hadoop ecosystem components such as HDFS, Spark (MLlib) and Hive.
  • Detected anomalies/outliers in credit card transactions using Isolation Forest (an unsupervised algorithm).
  • Built financial domain knowledge through practical application of various ML models.
  • Collaborated with team members on building new ML projects and on knowledge transfer.

Tata Consultancy Services (TCS)
Machine Learning Engineer Feb 2016 - Jan 2019
Hyderabad, India

    Saudi Telecom Company Project

    Description:
    The main goals of the project are to build machine learning models that predict customer churn to improve the customer retention rate, and to perform customer segmentation to enhance targeted marketing.

    Responsibilities:
  • Examined the datasets, addressed data quality and tidiness issues, and then trimmed and cleaned the data for analysis.
  • Merged the datasets, keeping only the metrics useful for the analysis, and performed Exploratory Data Analysis on the result.
  • Built ML models for predicting customer churn using the XGBoost algorithm and increased the overall customer retention rate by 10%.
  • Handled data issues such as outliers and missing values using statistical analysis methods.
  • Developed ML models using clustering (unsupervised) methods that improved the customer segmentation and targeted marketing.
  • Implemented machine learning concepts of supervised and unsupervised algorithms to generate valuable business insights.
  • Improved model performance through hyperparameter tuning.
  • Worked on external tables and wrote complex SQL queries for data extraction on the Oracle 11g relational database management system.
  • Built pipelines by sequentially applying a list of transformers (data modelling) and then a final estimator (ML model).
  • Worked on both structured and unstructured data, implementing machine learning techniques to generate business insights.

Education

University of Central Missouri
Master of Science in Computer Science (Data Science Specialization) Jan 2019 - May 2020
GPA 4.0/4.0

  • Coursework: Machine Learning, Artificial Intelligence, Data Mining, Big Data: Storage, Analytics, and Visualization, Advanced Algorithms, Database Theory and Applications, Advanced Database Systems, Advanced Applications Programming in C# and .NET, Compiler Design and Construction, Advanced Operating Systems

Jawaharlal Nehru Technological University Hyderabad
Bachelor of Technology in Electronics & Communication Engineering Aug 2012 - May 2015

Projects

Heart Disease Prediction

  • Developed a model using Deep Forest (Cascade Random Forests), Naive Bayes and SVM to predict the presence of heart disease for a given patient record.
  • Handled the missing values and outliers in the data using statistical methods.
  • Measured the model performance among the algorithms and then optimized the model to enhance performance.
  • The model predicted the risk of heart disease with up to 80% accuracy.

Bank Customer Churn Prediction

  • Built a model to predict bank customer churn using the Random Forest algorithm in Python.
  • Converted categorical attributes to numeric using One-Hot Encoding and performed feature engineering.
  • The model classifies which customers will churn with an accuracy of 86.25%.

Used Car Price Prediction

  • Predicted the selling price of a used car from input features such as purchase year, gas type, etc.
  • Performed Exploratory Data Analysis using Pandas and removed irrelevant features.
  • Developed the ML model and deployed it to the Heroku cloud platform using the Flask framework.

NLP: Twitter Sentiment Analysis

  • Implemented Tokenization, Stemming, Lemmatization and Bag of Words on a dataset of tweets.
  • Performed sentiment analysis on the tweets using the NLTK library.
  • Categorized the tweets as positive or negative to analyze brand performance on social media and identify opportunities for improvement.

SQL Database Optimization: Shopify App

  • Optimized a large database with 300k records by applying different indexing techniques.
  • Chose the best indexing technique based on query processing time.
  • Improved the overall SQL query performance.
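
A minimal sketch of how an index can be evaluated by query processing time. It assumes SQLite (via Python's built-in sqlite3 module) and a made-up `orders` table with synthetic rows; the project's actual database engine, schema, and index types are not stated above, so every name here is hypothetical.

```python
import sqlite3
import time


def build_demo_db(rows=50_000):
    """Create an in-memory table of synthetic rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        ((i % 5000, float(i % 97)) for i in range(rows)),
    )
    conn.commit()
    return conn


def time_query(conn, sql, params, repeats=50):
    """Return the average wall-clock time in seconds for one run of the query."""
    start = time.perf_counter()
    for _ in range(repeats):
        conn.execute(sql, params).fetchall()
    return (time.perf_counter() - start) / repeats


def query_plan(conn, sql, params):
    """Return SQLite's query plan for the statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(str(row[-1]) for row in rows)


conn = build_demo_db()
sql = "SELECT total FROM orders WHERE customer_id = ?"
params = (1234,)

# Measure the query without an index on the filtered column.
plan_before = query_plan(conn, sql, params)
time_before = time_query(conn, sql, params)

# Add a single-column index, then measure the same query again.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, params)
time_after = time_query(conn, sql, params)

print(f"before: {time_before:.6f}s  plan: {plan_before}")
print(f"after:  {time_after:.6f}s  plan: {plan_after}")
```

Comparing the two query plans alongside the timings shows whether the engine actually uses the new index, which is a more reliable signal than timing alone on small datasets.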

Geocoding Locator

  • Created an app in Python that, given an address, displays its latitude, longitude, and formatted address using a geocoding API.
  • Decoded the UTF-8 encoded address search result, converted it to JSON, and extracted information from the JSON data.
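
The decode-and-extract step can be sketched roughly as follows. The response bytes, the field names (`results`, `geometry`, `location`, `formatted_address`), and the helper `extract_location` are illustrative assumptions with made-up values; they are not taken from the app itself or from any specific geocoding API, and no network call is made.

```python
import json

# Hypothetical response body with made-up values; the field layout is an assumption.
SAMPLE_RESPONSE = json.dumps({
    "results": [{
        "formatted_address": "123 Example St, Sample City",
        "geometry": {"location": {"lat": 10.0, "lng": 20.0}},
    }],
    "status": "OK",
}).encode("utf-8")


def extract_location(raw_bytes):
    """Decode UTF-8 bytes, parse the JSON, and return (lat, lng, formatted_address).

    Returns None when the payload contains no results.
    """
    payload = json.loads(raw_bytes.decode("utf-8"))
    results = payload.get("results") or []
    if not results:
        return None
    first = results[0]
    location = first["geometry"]["location"]
    return location["lat"], location["lng"], first["formatted_address"]


print(extract_location(SAMPLE_RESPONSE))
```

Returning None for an empty result list keeps the "address not found" case separate from malformed responses, which would still raise an error.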