donghyun / ML team project / Commits / 9d436222

Commit 9d436222, authored May 30, 2021 by 윤성혁
Commit message: add
Parent: 35fdd65c
No related tags found. No related merge requests found.

Showing 2 changed files with 930 additions and 32 deletions:
- .ipynb_checkpoints/Term Project_2(final)-checkpoint.ipynb (+465, -16)
- Term Project_2(final).ipynb (+465, -16)

.ipynb_checkpoints/Term Project_2(final)-checkpoint.ipynb (+465, -16)
...
@@ -900,20 +900,463 @@

The changed cell's output switches from a matplotlib figure ("Figure size 432x288 with 4 Axes", a display_data output with an inline PNG; image data omitted) to an eli5 HTML table (an execute_result, execution_count 38 → 59) listing permutation-importance weights:

Weight             Feature
 0.0084 ± 0.0047   age
 0.0057 ± 0.0105   goout
 0.0040 ± 0.0040   sex_M
 0.0030 ± 0.0025   Mjob_other
 0.0030 ± 0.0013   Fjob_health
 0.0030 ± 0.0058   romantic_no
 0.0030 ± 0.0033   guardian_mother
 0.0023 ± 0.0040   Grade
 0.0020 ± 0.0033   activities_yes
 0.0020 ± 0.0058   famsup_no
 0.0017 ± 0.0064   studytime
 0.0017 ± 0.0060   Fedu
 0.0013 ± 0.0025   nursery_yes
 0.0013 ± 0.0025   Mjob_teacher
 0.0013 ± 0.0039   famsup_yes
 0.0007 ± 0.0016   failures
 0.0007 ± 0.0040   reason_reputation
 0.0007 ± 0.0045   address_U
 0.0003 ± 0.0013   Pstatus_T
 0.0003 ± 0.0025   Mjob_at_home
 0.0003 ± 0.0033   Mjob_health
 0.0003 ± 0.0025   Mjob_services
 0.0003 ± 0.0039   guardian_other
 0.0000 ± 0.0064   health
-0.0000 ± 0.0042   Medu
-0.0003 ± 0.0054   freetime
-0.0003 ± 0.0025   activities_no
-0.0003 ± 0.0013   higher_yes
-0.0003 ± 0.0013   Fjob_teacher
-0.0003 ± 0.0039   famrel
… 25 more …
...
@@ -926,18 +1369,24 @@
     "               max_features='sqrt',min_samples_leaf=1)\n",
     "model.fit(X_train,y_train)\n",
     "# result = permutation_importance(model, scoring = \"accuracy\", random_state=1).fit(X_test,y_test)\n",
-    "## result = PermutationImportance(model, scoring = \"accuracy\", random_state=1).fit(X_test,y_test)\n",
+    "result = PermutationImportance(model, scoring = \"accuracy\", random_state=1).fit(X_test,y_test)\n",
     "# result.importances_mean\n",
-    "## eli5.show_weights(result, top = 30, feature_names = X_train.columns.tolist())\n",
+    "eli5.show_weights(result, top = 30, feature_names = X_test.columns.tolist())\n",
-    "# disp = plot_partial_dependence(model, X_train, [1, 2])\n",
-    "\n",
-    "\"\"\"\n",
-    "disp = plot_partial_dependence(model,\n",
-    "                               X_train,\n",
-    "                               features = ['goout','age','studytime'],\n",
-    "                               target=5,\n",
-    "                               kind='individual')\n",
-    "\"\"\""
+    "# disp = plot_partial_dependence(model, X_train, [1, 2])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "disp = plot_partial_dependence(model,\n",
+    "                               X_train,\n",
+    "                               features = [('goout','age'),'studytime'],\n",
+    "                               target=2)\n",
+    "\n",
+    "# print(disp)"
    ]
   },
   {
...
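The hunk above uncomments eli5's PermutationImportance in place of sklearn's permutation_importance; both implement the same idea: shuffle one feature column at a time and measure how much the test score drops. A minimal stdlib-only sketch of that idea (the toy scorer and data below are hypothetical stand-ins, not the notebook's model):

```python
import random

def permutation_importance(score_fn, X, y, n_repeats=5, seed=1):
    """Mean drop in score when each feature column is shuffled.

    X is a list of rows (lists); score_fn(X, y) returns a float.
    """
    rng = random.Random(seed)
    base = score_fn(X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the feature/label relationship for column j
            X_perm = [row[:j] + [col[i]] + row[j + 1:] for i, row in enumerate(X)]
            drops.append(base - score_fn(X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy scorer: accuracy of "predict y = feature 0".
# Feature 0 fully determines y; feature 1 is noise.
def score(X, y):
    return sum(1 for row, t in zip(X, y) if row[0] == t) / len(y)

X = [[t, i % 3] for i, t in enumerate([0, 1] * 20)]
y = [0, 1] * 20
imp = permutation_importance(score, X, y)
# Shuffling feature 0 hurts accuracy; shuffling feature 1 leaves it unchanged.
```

Features whose shuffling barely moves the score (like Medu or famrel in the table above, with weights near zero) carry little predictive signal for this model.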
%% Cell type:code id: tags:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from imblearn.over_sampling import RandomOverSampler

# Load the data
por = pd.read_csv("./student-por.csv")
math = pd.read_csv("./student-mat.csv")
data = pd.concat([por, math], ignore_index=True)

# Balance the Walc classes by random oversampling
ros = RandomOverSampler(random_state=0, sampling_strategy='auto')
X, Y = ros.fit_resample(data.drop(['Dalc', 'Walc'], axis=1), data['Walc'])
data = pd.concat([X, Y], axis=1)
```
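RandomOverSampler with sampling_strategy='auto' duplicates rows of every minority class (sampling with replacement) until each class has as many samples as the majority class. A stdlib-only sketch of that resampling step, with toy rows standing in for the notebook's DataFrame:

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows until every class matches the
    largest class's count (imblearn's sampling_strategy='auto')."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        out_rows.extend(members)
        out_labels.extend([label] * len(members))
        for _ in range(target - len(members)):
            out_rows.append(rng.choice(members))  # sample with replacement
            out_labels.append(label)
    return out_rows, out_labels

rows = [[1], [2], [3], [4], [5]]
labels = [1, 1, 1, 2, 2]
X_res, y_res = random_oversample(rows, labels)
# Both classes now have 3 samples each.
```

Oversampling before the train/test split, as this notebook does, means duplicated rows can land in both sets, which tends to inflate test accuracy; splitting first and oversampling only the training set avoids that.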
%% Cell type:code id: tags:

```python
# Confirmed there are no null values; combine G1, G2, G3 into a single Grade column
data["Grade"] = data['G1'] + data['G2'] + data['G3']
data = data.drop(columns=['G1', 'G2', 'G3'])
print(data.info())
data.shape
```
%% Output

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1990 entries, 0 to 1989
Data columns (total 30 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   school      1990 non-null   object
 1   sex         1990 non-null   object
 2   age         1990 non-null   int64
 3   address     1990 non-null   object
 4   famsize     1990 non-null   object
 5   Pstatus     1990 non-null   object
 6   Medu        1990 non-null   int64
 7   Fedu        1990 non-null   int64
 8   Mjob        1990 non-null   object
 9   Fjob        1990 non-null   object
 10  reason      1990 non-null   object
 11  guardian    1990 non-null   object
 12  traveltime  1990 non-null   int64
 13  studytime   1990 non-null   int64
 14  failures    1990 non-null   int64
 15  schoolsup   1990 non-null   object
 16  famsup      1990 non-null   object
 17  paid        1990 non-null   object
 18  activities  1990 non-null   object
 19  nursery     1990 non-null   object
 20  higher      1990 non-null   object
 21  internet    1990 non-null   object
 22  romantic    1990 non-null   object
 23  famrel      1990 non-null   int64
 24  freetime    1990 non-null   int64
 25  goout       1990 non-null   int64
 26  health      1990 non-null   int64
 27  absences    1990 non-null   int64
 28  Walc        1990 non-null   int64
 29  Grade       1990 non-null   int64
dtypes: int64(13), object(17)
memory usage: 466.5+ KB
None
(1990, 30)
%% Cell type:code id: tags:

```python
# Store the int64 columns in `numbers` for EDA; confirmed there are no outliers
numbers = data.select_dtypes('int64').columns
numbers = data[numbers]
numbers.hist(figsize=(18, 18), edgecolor='white')
plt.show()
display(numbers.describe())
```
%% Output

%% Cell type:code id: tags:

```python
# Correlation analysis between the (numeric) variables
fig, ax = plt.subplots(figsize=(12, 12))
# corr() computes the correlations; vmin/vmax fix the scale, cmap sets the palette,
# annot toggles the numeric labels
sns.heatmap(numbers.corr(), vmin=-1, vmax=1,
            cmap='RdYlBu_r', annot=True)
plt.show()
```
%% Output

%% Cell type:code id: tags:

```python
# One-hot encode the categorical variables into integer columns
data_dummies = pd.get_dummies(data)
data_dummies.head(5)
```
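pd.get_dummies expands every object column into one 0/1 indicator column per category, which is how the 30-column frame grows to the 56 columns shown below. The mechanics can be sketched without pandas (the record dicts and column names here are illustrative):

```python
def one_hot(records, column):
    """Expand `column` of each record dict into 0/1 indicator keys,
    mirroring what pd.get_dummies does for one categorical column."""
    categories = sorted({r[column] for r in records})
    out = []
    for r in records:
        encoded = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            encoded[f"{column}_{cat}"] = 1 if r[column] == cat else 0
        out.append(encoded)
    return out

students = [{"age": 18, "sex": "F"}, {"age": 17, "sex": "M"}]
encoded = one_hot(students, "sex")
# -> [{'age': 18, 'sex_F': 1, 'sex_M': 0}, {'age': 17, 'sex_F': 0, 'sex_M': 1}]
```

Note that this produces mutually redundant pairs such as activities_no/activities_yes, exactly as in the frame below; tree models tolerate that, though drop_first=True would remove the redundancy.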
%% Output

   age  Medu  Fedu  traveltime  studytime  failures  famrel  freetime  goout  \
0   18     4     4           2          2         0       4         3      4
1   17     1     1           1          2         0       5         3      3
2   15     1     1           1          2         0       4         3      2
3   15     4     2           1          3         0       3         2      2
4   16     3     3           1          2         0       4         3      2

   health  ...  activities_no  activities_yes  nursery_no  nursery_yes  \
0       3  ...              1               0           0            1
1       3  ...              1               0           1            0
2       3  ...              1               0           0            1
3       5  ...              0               1           0            1
4       5  ...              1               0           0            1

   higher_no  higher_yes  internet_no  internet_yes  romantic_no  romantic_yes
0          0           1            1             0            1             0
1          0           1            0             1            1             0
2          0           1            0             1            1             0
3          0           1            0             1            0             1
4          0           1            1             0            1             0

[5 rows x 56 columns]
%% Cell type:code id: tags:

```python
# Split off the features and the label
X = data_dummies.drop(['Walc'], axis=1)
y_w = data_dummies['Walc']
# y_d = data_dummies['Dalc']
X.head(5)
```
%% Output

   age  Medu  Fedu  traveltime  studytime  failures  famrel  freetime  goout  \
0   18     4     4           2          2         0       4         3      4
1   17     1     1           1          2         0       5         3      3
2   15     1     1           1          2         0       4         3      2
3   15     4     2           1          3         0       3         2      2
4   16     3     3           1          2         0       4         3      2

   health  ...  activities_no  activities_yes  nursery_no  nursery_yes  \
0       3  ...              1               0           0            1
1       3  ...              1               0           1            0
2       3  ...              1               0           0            1
3       5  ...              0               1           0            1
4       5  ...              1               0           0            1

   higher_no  higher_yes  internet_no  internet_yes  romantic_no  romantic_yes
0          0           1            1             0            1             0
1          0           1            0             1            1             0
2          0           1            0             1            1             0
3          0           1            0             1            0             1
4          0           1            1             0            1             0

[5 rows x 55 columns]
%% Cell type:code id: tags:

```python
print(y_w.head(5))
# print(y_d.head(5))
```
%% Output

0    1
1    1
2    3
3    1
4    2
Name: Walc, dtype: int64
%% Cell type:code id: tags:

```python
# Predict weekend alcohol consumption (Walc)
X_train, X_test, y_train, y_test = train_test_split(X, y_w, test_size=0.3)
print("X_train's shape : ", X_train.shape)
print("X_test's shape : ", X_test.shape)
print("y_train's shape : ", y_train.shape)
print("y_test's shape : ", y_test.shape)
```
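train_test_split with test_size=0.3 shuffles the rows and carves off 30% for testing, which is where the 1393/597 split in the output comes from. A stdlib sketch of the same idea:

```python
import random

def simple_train_test_split(X, y, test_size=0.3, seed=0):
    """Shuffle the indices, then cut off the first `test_size` fraction
    as the test set and keep the rest for training."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(X) * test_size)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])

X = [[i] for i in range(1990)]
y = [i % 5 + 1 for i in range(1990)]
X_train, X_test, y_train, y_test = simple_train_test_split(X, y)
# 1990 * 0.3 = 597 test rows and 1393 train rows, matching the shapes above.
```

The notebook's call passes no random_state, so the split (and any downstream accuracy) changes on every run; fixing random_state would make the results reproducible.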
%% Output

X_train's shape :  (1393, 55)
X_test's shape :  (597, 55)
y_train's shape :  (1393,)
y_test's shape :  (597,)
%% Cell type:code id: tags:

``` python
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier
from sklearn import svm
from sklearn import tree
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import GradientBoostingClassifier
import torch
import torch.nn as nn
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import accuracy_score

# Candidate hyperparameter grids for the different model families.
para_grid = {
    'n_estimators': [150, 200],
    'max_depth': [10, 15, 20],
    'gamma': [0.1, 0.5, 1]
}
para_agrid = {
    'n_estimators': [150, 200]
}
para_lgrid = {
    'max_depth': [20, 200],
    'min_child_weight': [3, 5, 10, 60],
    'gamma': [0, 8, 1]
}
para_cgrid = {
    'n_estimators': [150, 200],
    'max_depth': [20, 200],
    'min_samples_split': [3, 5, 10],
    'learning_rate': [0.1, 0.3, 0.5, 0.7, 0.9]
}

svc = svm.SVC()
clf = tree.DecisionTreeClassifier()
rf = RandomForestClassifier(max_depth=2, random_state=0)
agb = AdaBoostClassifier(random_state=0)
xgb = XGBClassifier(eval_metric='mlogloss', learning_rate=0.4, subsample=0.7, colsample_bytree=0.5)
lgb = LGBMClassifier(learning_rate=0.1, subsample=1)
gb = GradientBoostingClassifier(max_features='sqrt', min_samples_leaf=1)

# Grid-search the AdaBoost model, then evaluate the refitted best
# estimator on the held-out test set.
grid_s = GridSearchCV(estimator=agb, param_grid=para_agrid, n_jobs=-1, verbose=2)
grid_s.fit(X_train, y_train)
print('final params', grid_s.best_params_)
print('best score', grid_s.best_score_)
# gb.fit(X_train, y_train)
model = grid_s.best_estimator_
pred = model.predict(X_test)
accuracy = accuracy_score(y_test, pred)
print(accuracy)
```
%% Output

Fitting 5 folds for each of 2 candidates, totalling 10 fits
final params {'n_estimators': 200}
best score 0.4005595523581135
0.38190954773869346
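`GridSearchCV` cross-validates every grid point and then (with the default `refit=True`) retrains the winning configuration on the whole training set, which is what `best_estimator_` returns. A minimal self-contained sketch of that workflow on synthetic data (the grid and data here are illustrative, not the project's):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary task: label depends on the first two features.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 5))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

# 5-fold CV per n_estimators value; the winner is refit on all of X_demo.
gs = GridSearchCV(AdaBoostClassifier(random_state=0),
                  param_grid={'n_estimators': [50, 100]},
                  cv=5, n_jobs=-1)
gs.fit(X_demo, y_demo)
print(gs.best_params_, round(gs.best_score_, 3))
```

Note that `best_score_` is a cross-validation mean, so it can sit above or below the single held-out test accuracy, as seen in the output above (0.401 CV vs. 0.382 test).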
%% Cell type:code id: tags:

``` python
import eli5
from eli5.sklearn import PermutationImportance
from sklearn.inspection import plot_partial_dependence, permutation_importance
from sklearn.ensemble import GradientBoostingClassifier

model = GradientBoostingClassifier(n_estimators=200, max_depth=20, min_samples_split=3,
                                   learning_rate=0.9, max_features='sqrt', min_samples_leaf=1)
model.fit(X_train, y_train)

# result = permutation_importance(model, scoring="accuracy", random_state=1).fit(X_test, y_test)
result = PermutationImportance(model, scoring="accuracy", random_state=1).fit(X_test, y_test)
# result.importances_mean
eli5.show_weights(result, top=30, feature_names=X_test.columns.tolist())

# disp = plot_partial_dependence(model, X_train, [1, 2])
disp = plot_partial_dependence(model, X_train, features=['goout', 'age', 'studytime'],
                               target=5, kind='individual')
# print(disp)
```
%% Output

<IPython.core.display.HTML object>
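eli5's `PermutationImportance` shuffles one column at a time and scores the drop in accuracy; `sklearn.inspection.permutation_importance` (already imported above) computes the same quantity without the eli5 dependency. A self-contained sketch on synthetic data where, by construction, only the first feature carries signal:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

# Synthetic task: the label depends only on column 0.
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(300, 3))
y_demo = (X_demo[:, 0] > 0).astype(int)

est = GradientBoostingClassifier(random_state=0).fit(X_demo, y_demo)

# Shuffle each column n_repeats times; importances_mean is the average
# accuracy drop per feature.
res = permutation_importance(est, X_demo, y_demo,
                             scoring='accuracy', n_repeats=5, random_state=1)
print(res.importances_mean.argmax())  # 0 — the informative feature
```

The sklearn variant returns plain arrays instead of an HTML table, which is convenient for plotting the importances alongside the partial-dependence results.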
%% Cell type:code id: tags:

``` python
"""
disp = plot_partial_dependence(model,
                               X_train,
                               features=[('goout', 'age'), 'studytime'],
                               target=2)
"""
```
%% Cell type:code id: tags:

``` python
"""
from sklearn.model_selection import validation_curve
train_scores, valid_scores = validation_curve(
    GradientBoostingClassifier(max_features='sqrt', min_samples_split=5,
                               min_samples_leaf=1, learning_rate=0.2),
    X_train, y_train, "max_depth", np.arange(10, 20, 2), cv=5)
"""
```
%% Output

'\nfrom sklearn.model_selection import validation_curve\ntrain_scores, valid_scores = validation_curve(\n    GradientBoostingClassifier(max_features=\'sqrt\',min_samples_split=5,min_samples_leaf=1,learning_rate=0.2),\n    X_train, y_train, "max_depth", np.arange(10,20,2), cv=5)\n'
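The commented-out call above passes `"max_depth"` positionally, which recent scikit-learn versions reject: `param_name` and `param_range` are keyword-only now. A runnable sketch of the keyword form on synthetic data (estimator and ranges here are illustrative):

```python
import numpy as np
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary task.
rng = np.random.default_rng(3)
X_demo = rng.normal(size=(200, 4))
y_demo = (X_demo[:, 0] > 0).astype(int)

# One row of CV scores per candidate max_depth value.
train_scores, valid_scores = validation_curve(
    DecisionTreeClassifier(random_state=0), X_demo, y_demo,
    param_name='max_depth', param_range=np.arange(1, 6), cv=5)

print(train_scores.shape)  # (5, 5): 5 depths x 5 folds
```

Plotting the mean of each row of `train_scores` against `valid_scores` shows where the tree starts overfitting as depth grows.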
%% Cell type:code id: tags:
%% Cell type:code id: tags:
```
python
```
python
"""
"""
from sklearn.inspection import plot_partial_dependence
from sklearn.inspection import plot_partial_dependence
plot_partial_dependence(clf, X, features)
plot_partial_dependence(clf, X, features)
"""
"""
```
```
%% Output
%% Output
'\nfrom sklearn.inspection import plot_partial_dependence\nplot_partial_dependence(clf, X, features) \n'
'\nfrom sklearn.inspection import plot_partial_dependence\nplot_partial_dependence(clf, X, features) \n'
...