DS 4100 Weekly Review

This week I began my journey into data science. It’s the first time I’ve taken a class like this, and so far I’m pretty excited. The class began with an introduction to our syllabus and other things we’re going to be doing in and out of class, including these blog reflections. In these reflections I’m going to be thinking about what I learned in class, and connecting it to what I’ve experienced in the world.

Towards the end of last class, I looked into different data sets and how to analyze them with basic queries. Outside of class, I was looking into other data that I’ll be able to analyze in the future. What I really want to do is generate a data set based on my own gameplay. I don’t really game much, but when I do, my favorite thing to play is Civ 6. It’s the newest game in the series and a very welcome addition that I have been obsessed with. I think it would be really cool to write a script/mod that sends and stores my in-game data in a database. Later, I want to go through and analyze my gameplay and see how I can do better. Maybe I’ll release the script/mod to the public so I can make my own AI based on the best players’ data. I’m really interested in making an AI by training it with data sets.
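Here’s a rough sketch of what the storage side of that script could look like. I don’t know yet what data a Civ 6 mod can actually export, so the table name and the columns (game_id, turn, science, gold) are placeholders I made up, and the two rows at the bottom are fake sample data. It uses Python’s built-in sqlite3 module and ends with the kind of basic query we looked at in class.

```python
import sqlite3


def create_store(path=":memory:"):
    """Open a SQLite database and make sure the (hypothetical) turns table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS turns ("
        "game_id TEXT, turn INTEGER, science REAL, gold REAL)"
    )
    return conn


def record_turn(conn, game_id, turn, science, gold):
    """Store one turn's stats. The column choice is an assumption, not a real mod API."""
    conn.execute(
        "INSERT INTO turns (game_id, turn, science, gold) VALUES (?, ?, ?, ?)",
        (game_id, turn, science, gold),
    )
    conn.commit()


def average_science(conn, game_id):
    """Basic query: average science per turn for a single game."""
    row = conn.execute(
        "SELECT AVG(science) FROM turns WHERE game_id = ?", (game_id,)
    ).fetchone()
    return row[0]


# Fake sample data, only to show how the pieces fit together.
conn = create_store()
record_turn(conn, "game-1", 1, 2.0, 5.0)
record_turn(conn, "game-1", 2, 4.0, 7.0)
print(average_science(conn, "game-1"))
```

Passing a file path instead of ":memory:" would keep the data between play sessions, which is what I’d need to compare games later.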

My teacher mentioned in class that we’re going to be learning how to do basic predictive analysis, which I’m also eager to learn about. I’m curious to see how I can implement predictive analysis on a backend. I wonder what the function would look like and how I would use it. In the coming week, I’ll be attending PennApps with my friend. While there, I plan to code an idea I’ve been sitting on for a couple of months. I don’t want to go into too much detail because I don’t want to spoil it :P. Once it’s created, however, predictive analysis would be a great feature to add in. Almost every major social platform uses similar algorithms to provide better content for their users. I see it all the time on Facebook, Netflix, Hulu, etc.
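To get a feel for what such a function might look like, here’s a minimal sketch of about the simplest predictive model I know of: fitting a straight line to past data with least squares and using it to guess a new value. This is my own guess and not necessarily what we’ll do in class. The function names (fit_line, predict) and the numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to the data using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use a fitted (slope, intercept) pair to predict y for a new x."""
    slope, intercept = model
    return slope * x + intercept


# Made-up example: hours since a post went up vs. number of views.
hours = [1, 2, 3, 4]
views = [3, 5, 7, 9]

model = fit_line(hours, views)
print(predict(model, 5))
```

On a backend, I imagine the fitting step would run over stored data every so often, and the predict step would be called whenever a request comes in.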

Finally, I want to talk about R. It is a TERRIBLE AND WEIRD language that I am not enjoying. I talked with some of my PL friends this weekend about their opinions on the language, and they agreed that it’s a real mystery why it’s the industry standard for data science. Other languages like Julia and Python are just as equipped (if not better) to handle large data sets. Maybe R’s unsafety is what makes it a good language for handling huge amounts of data, but I’m still not sure. I’m excited to learn more about the language throughout class. Maybe my opinion will change eventually :P but as of right now I’m not convinced at all.