Big Data and Climate Change
There are many ways to use Big Data to observe and monitor climate change. One of them is to use machine learning to make predictions. Below are the steps I took as part of my group’s project.

First, I found a world population dataset from the World Bank and wrote a MapReduce program, with a mapper, a reducer, and a driver class, to clean and profile the data. The code extracted every country’s population from 1990 through the most recent years. However, once our focus narrowed to just the USA, I no longer needed this code and found a new dataset. The new dataset, from USA Facts, covers US precipitation from 1990 to 2016. It did not need any cleaning, so I did not write a MapReduce program for it. I then merged it with the two datasets my groupmates had found.

Next, I did more research on climate change and on how other people have used Big Data tools to monitor it, and I found interesting articles and papers on the topic. Since most of the papers I found used linear regression for their machine learning portion, we decided it was the best option for us as well. I helped work out how to code a simple linear regression model using Spark Scala. Using the predictions from our model, I analyzed the results and made the visualizations so we could share them with everyone.

Then, I wrote the readme file for our project code directories and prepared the project slides. After that, I talked with my groupmates about how they did their parts and what they wanted me to know, so that I could write the majority of the paper and have them proofread it, edit it, and add sections such as their motivation. Finally, I discussed the paper with my group one last time and then submitted it!
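
For readers unfamiliar with the simple linear regression step mentioned above, here is a minimal sketch of the idea. It is not our project code: we used Spark Scala, while this stand-in is plain Python using ordinary least squares. The function names (`fit_simple_linear_regression`, `predict`) and the sample numbers are made up purely for illustration and are not values from the precipitation dataset.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and sum of co-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Hypothetical placeholder values, NOT taken from the real dataset.
years = [1990, 1991, 1992, 1993, 1994]
precipitation = [30.0, 30.5, 31.0, 31.5, 32.0]

slope, intercept = fit_simple_linear_regression(years, precipitation)
forecast = predict(slope, intercept, 2017)
print(f"slope={slope}, intercept={intercept}, forecast for 2017={forecast}")
```

The fitted slope is the estimated change in the measured quantity per year, and `predict` extends that line to a year outside the data. A Spark-based version would presumably follow the same fit-then-predict pattern, with the library handling the fitting at scale.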