I have a two-part data analysis and modeling project that requires expertise in machine learning and statistical methods. Below are the details:
1. Scope: This dataset contains 500 observations with two features (x1 and x2) and a target variable (y). The task involves building a polynomial regression model and determining the optimal polynomial degree based on performance metrics.
Tasks:
a. Split the dataset into 80% training and 20% testing sets.
b. Evaluate polynomial degrees up to 4 using Leave-One-Out Cross-Validation (LOOCV). Select the degree with the lowest mean squared error (MSE) and report LOOCV errors for all degrees.
c. Refit the model using the selected polynomial degree on the training set and compute the test MSE.
d. Report the coefficients of the fitted model.
e. Test different polynomial degrees again using 5-fold cross-validation and report the MSE for each degree.
f. Refit the model with the selected degree based on 5-fold CV and compute the test MSE and R² score.
g. Compare the polynomial degrees selected using LOOCV and 5-fold CV. Are they the same?
h. Report the coefficients of the final model.
2. Scope: Using a dataset that predicts default status, compare Logistic Regression, K-Nearest Neighbors (KNN), and Naïve Bayes models to determine the best-performing model.
Tasks:
a. Split the dataset into 80% training and 20% testing sets.
b. Train Logistic Regression, Naïve Bayes, and KNN (with different numbers of neighbors). Evaluate them using 10-fold cross-validation and select the model with the highest mean ROC-AUC score. Report the ROC-AUC for all models and settings.
c. Refit the selected model on the entire training set and compute test set metrics: ROC-AUC, accuracy, precision, and recall.
d. Create a confusion matrix.
e. Experiment with different thresholds to improve performance on "default yes" customers without compromising "default no" performance.
f. Recommend an appropriate threshold for a credit card company, considering their perspective.
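As a rough illustration of tasks a through f, the comparison and threshold experiment might look like the sketch below. It is an outline under stated assumptions, not a prescribed solution: the default dataset is not shown here, so the imbalanced synthetic data, the stratified split, the KNN neighbor counts, the threshold grid, and the helper name metrics_at are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import (StratifiedKFold, cross_val_score,
                                     train_test_split)
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the default dataset (1 = "default yes").
X, y = make_classification(n_samples=2000, n_features=5, n_informative=3,
                           n_redundant=0, weights=[0.9, 0.1], random_state=0)

# Task a: 80/20 split; stratifying on y is a choice made for this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Task b: candidate models, with several neighbor counts for KNN.
candidates = {
    "logistic": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "naive_bayes": GaussianNB(),
}
for k in (3, 5, 11, 21):
    candidates[f"knn_k={k}"] = make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=k))

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_auc = {
    name: cross_val_score(model, X_train, y_train, cv=cv, scoring="roc_auc").mean()
    for name, model in candidates.items()
}
best_name = max(cv_auc, key=cv_auc.get)

# Task c: refit the winner on the whole training set and score the test set.
best_model = candidates[best_name].fit(X_train, y_train)
proba = best_model.predict_proba(X_test)[:, 1]
test_auc = roc_auc_score(y_test, proba)


def metrics_at(threshold):
    """Hypothetical helper: test metrics when predicting 1 if proba >= threshold."""
    pred = (proba >= threshold).astype(int)
    return {
        "threshold": threshold,
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred),
        "confusion_matrix": confusion_matrix(y_test, pred),  # task d
    }


# Tasks e and f: lowering the threshold trades precision for recall on defaulters.
sweep = [metrics_at(t) for t in (0.2, 0.3, 0.4, 0.5)]

print("CV ROC-AUC:", cv_auc, "-> selected", best_name)
print("Test ROC-AUC:", test_auc)
for row in sweep:
    print(row)
```

The threshold recommendation in task f then comes down to how the company weighs a missed default against a wrongly flagged good customer; the sweep only supplies the numbers for that judgment.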
Requirements for Freelancer:
Proficient in Python (preferably using libraries such as Scikit-learn, Pandas, NumPy, and Matplotlib/Seaborn) or R.
Strong understanding of machine learning models, cross-validation techniques (LOOCV, k-fold), and evaluation metrics (MSE, R², ROC-AUC, precision, recall).
Experience with polynomial regression, logistic regression, KNN, and Naïve Bayes.
Ability to clearly document code and explain results in a report format.
Deliverables include:
Source code/script file.
A detailed report with metrics, coefficients, confusion matrix, and interpretations.
Insights and recommendations based on model evaluation.
Submission Expectations:
Clean and well-commented code.
A concise and clear report summarizing methodology, results, and conclusions.
Delivery timeline and cost estimate.
If interested, please provide your qualifications, tools you’ll use, and an approximate timeline for completion.
As a professional Data Engineer with extensive experience in machine learning and statistical analysis, I am excited about tackling this project. In addition to the specified models, we can also explore advanced algorithms such as Random Forests and XGBoost to enhance the predictive performance. These models are particularly effective in handling non-linear relationships and complex interactions in data, providing an opportunity to achieve better results. I will carefully evaluate all models using robust cross-validation techniques and relevant metrics, ensuring that the final recommendation is not only accurate but also practical for your use case. Let’s collaborate to deliver a comprehensive solution tailored to your needs.
I am an expert data scientist and can implement linear regression and KNN. I can implement ML algorithms and currently work as a researcher, so this is a routine task for me, though a time-consuming one.
Hello Sir/Madam,
I am a skilled full-stack developer with rich experience in Java, C++, C, C#, Python, Eclipse, SQL, MySQL, .NET, Oracle, object-oriented programming, data structures, and algorithms.
I have a firm grip on artificial intelligence and automation, and I work in machine learning and deep learning.
My track record, demonstrated by my 100% job completion rate and 5-star review rating, showcases my ability to deliver exceptional results on time and with the utmost quality.
I believe that my skill set makes me the ideal candidate for this project.
Please come on chat so we can discuss this further.
I will be waiting for your reply.
Thanks and best regards
Assalamu alaikum,
I would be delighted to offer my expertise to support your data analysis project.
Please don't hesitate to contact me for more information.