
QUESTION

Logistic Regression (RStudio)

Use the Kaggle Credit Card data set for this exercise. Use both a 100K-record sample and the entire data set, each containing fraudulent and non-fraudulent transactions. Use the same approach to generate test and training data sets.

1. Perform ridge and lasso regression to reduce the input feature set. Rerun the logistic regression using the reduced feature set, and identify which input features remain.

2. Compare the result with the raw logistic regression. Total accuracy is not a good measure for this comparison. Explain why, and use other measures to compare the two models.
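One way to approach item 1 is sketched below with the glmnet package. The data here is simulated as a stand-in for the credit card file, so the column names V1 to V10, the sample size, and the coefficients are all assumptions made for illustration. Note that ridge (alpha = 0) shrinks coefficients without setting them exactly to zero, so in this sketch the feature reduction comes from the lasso fit (alpha = 1).

```r
library(glmnet)

# Simulated stand-in data (assumption): 10 predictors, only V1 and V2 carry signal.
set.seed(42)
n <- 2000
x <- matrix(rnorm(n * 10), nrow = n, dimnames = list(NULL, paste0("V", 1:10)))
y <- rbinom(n, size = 1, prob = plogis(-3 + 2 * x[, "V1"] - 1.5 * x[, "V2"]))

# Cross-validated lasso logistic regression (alpha = 1 selects the lasso penalty).
cv_lasso <- cv.glmnet(x, y, family = "binomial", alpha = 1)

# Keep the predictors whose coefficients are non-zero at lambda.1se.
coefs <- as.matrix(coef(cv_lasso, s = "lambda.1se"))
selected <- setdiff(rownames(coefs)[coefs[, 1] != 0], "(Intercept)")
print(selected)

# Rerun plain logistic regression on the reduced feature set,
# and fit the raw model on all predictors for comparison.
reduced <- data.frame(y = y, x[, selected, drop = FALSE])
fit_reduced <- glm(y ~ ., data = reduced, family = binomial)
fit_raw <- glm(y ~ ., data = data.frame(y = y, x), family = binomial)
```

With the real data, x would be the matrix of predictor columns and y the fraud indicator; choosing lambda.min instead of lambda.1se would typically keep more features.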

As explained in class, this credit card data set is unbalanced. Read https://journal.r-project.org/archive/2014-1/menardi-lunardon-torelli.pdf for a discussion of how to handle unbalanced data sets.
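To see why total accuracy misleads on unbalanced data, consider the small base R sketch below. The labels and both sets of predictions are made up for illustration (1000 transactions with 10 frauds); classification_metrics is a hypothetical helper, not part of any package.

```r
# Hypothetical labels: 1000 transactions, 10 of them fraudulent (1 = fraud).
actual <- c(rep(1, 10), rep(0, 990))

# Model A (hypothetical): predicts "not fraud" for every transaction.
pred_a <- rep(0, 1000)

# Model B (hypothetical): catches 8 of the 10 frauds and raises 20 false alarms.
pred_b <- c(rep(1, 8), rep(0, 2), rep(1, 20), rep(0, 970))

# Compute confusion-matrix based measures for 0/1 labels and predictions.
classification_metrics <- function(actual, predicted) {
  tp <- sum(predicted == 1 & actual == 1)
  fp <- sum(predicted == 1 & actual == 0)
  fn <- sum(predicted == 0 & actual == 1)
  tn <- sum(predicted == 0 & actual == 0)
  precision <- if (tp + fp > 0) tp / (tp + fp) else NA_real_
  recall <- tp / (tp + fn)
  f1 <- if (is.na(precision) || precision + recall == 0) {
    NA_real_
  } else {
    2 * precision * recall / (precision + recall)
  }
  c(accuracy = (tp + tn) / length(actual),
    precision = precision,
    recall = recall,
    specificity = tn / (tn + fp),
    f1 = f1)
}

metrics_a <- classification_metrics(actual, pred_a)
metrics_b <- classification_metrics(actual, pred_b)
print(round(rbind(model_a = metrics_a, model_b = metrics_b), 3))
```

Model A reaches 99% accuracy while detecting no fraud at all (recall 0), whereas Model B has lower accuracy (97.8%) but a recall of 0.8. Measures such as precision, recall, F1, or the area under the ROC curve therefore separate the two models in a way accuracy does not.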

3. Make a PowerPoint presentation of the technique the paper uses for unbalanced data: https://journal.r-project.org/archive/2014-1/menardi-lunardon-torelli.pdf.

4. Use the ROSE package discussed in the paper to adjust for the imbalance in the credit fraud data. Run logistic regression with the new data set. Also check https://cran.r-project.org/web/packages/ROSE/ROSE.pdf for a more concise explanation.
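A minimal sketch of the workflow in item 4 follows. It uses the hacide data set that ships with the ROSE package as a stand-in for the credit card data, which is an assumption made so the example runs without the Kaggle file; with the real data, the formula and data frames would be replaced by the fraud indicator and the training and test sets built earlier.

```r
library(ROSE)

# hacide ships with ROSE: a small simulated imbalanced data set
# (cls is the binary response; hacide.train and hacide.test are provided).
data(hacide)
print(table(hacide.train$cls))

# Generate a synthetic training set with roughly balanced classes.
balanced <- ROSE(cls ~ ., data = hacide.train, seed = 1)$data
print(table(balanced$cls))

# Logistic regression on the original and on the rebalanced training data.
fit_raw <- glm(cls ~ ., data = hacide.train, family = binomial)
fit_rose <- glm(cls ~ ., data = balanced, family = binomial)

# Score the untouched test set with both models.
prob_raw <- predict(fit_raw, newdata = hacide.test, type = "response")
prob_rose <- predict(fit_rose, newdata = hacide.test, type = "response")

# Compare with measures suited to unbalanced data.
print(roc.curve(hacide.test$cls, prob_raw, plotit = FALSE))
print(roc.curve(hacide.test$cls, prob_rose, plotit = FALSE))
print(accuracy.meas(hacide.test$cls, prob_rose))
```

Only the training data is rebalanced here; the test set keeps its original class proportions so the evaluation reflects the imbalance the model would face in practice. The package's ovun.sample function offers plain over- and under-sampling as an alternative to the synthetic generation done by ROSE.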

Data set: https://www.kaggle.com/dalpozz/creditcardfraud

Mehathi
ANSWER

Attached files: Logistic Regression.doc, CreditCardFraudCode.zip