Wednesday, July 29, 2015

Project Notes - Parse JSON log file using MapReduce on AWS AMI

This blog entry discusses how we parse a JSON log file using MapReduce.
We will use the AWS AMI we set up earlier to run the MapReduce task.
1. We use a simple JSON log file generator to produce the following JSON log file, saved as demo.txt. Each line is one self-contained JSON record:

{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID964", "location": {"y": 156, "x": 292}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID442", "location": {"y": 135, "x": 323}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID314", "location": {"y": 153, "x": 316}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID711", "location": {"y": 131, "x": 310}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID397", "location": {"y": 170, "x": 347}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID120", "location": {"y": 122, "x": 355}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID591", "location": {"y": 117, "x": 213}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID164", "location": {"y": 125, "x": 341}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID187", "location": {"y": 135, "x": 382}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID623", "location": {"y": 137, "x": 359}}
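
The generator itself is not shown in these notes. As a rough sketch only, something like the following Python script would produce lines in the same shape. The function names, the user ID range, and the coordinate ranges are assumptions chosen to resemble the sample above, not the actual generator's settings.

```python
import json
import random
from datetime import datetime


def make_record(now, rng):
    """Build one log record with the same fields as the sample lines."""
    return {
        "time_stamp": now.strftime("%Y-%m-%d %H:%M:%S"),
        # Assumed ID format: "UID" followed by three digits.
        "user_id": "UID{:03d}".format(rng.randint(100, 999)),
        # Assumed coordinate ranges, picked to look like the sample data.
        "location": {"y": rng.randint(100, 200), "x": rng.randint(200, 400)},
    }


def generate_log(count, now=None, seed=None):
    """Return `count` JSON strings, one record per string."""
    rng = random.Random(seed)
    now = now or datetime.now()
    return [json.dumps(make_record(now, rng)) for _ in range(count)]


if __name__ == "__main__":
    # Print one JSON record per line; redirect stdout to save the file.
    for line in generate_log(10):
        print(line)
```

Redirecting the script's output into demo.txt gives a file with one JSON record per line, which is the layout line-oriented MapReduce input expects.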

2. We use PuTTY to connect over SSH to the AWS instance we created earlier
3. Download the GameAnalysisDemo zip from S3, then unzip it
  • sudo wget https://s3-us-west-1.amazonaws.com/gameanalysisdemo/GameAnalysisDemo-master.zip
  • sudo unzip GameAnalysisDemo-master.zip -d GameAnalysisDemo
4. Log in as the Hadoop user
  • sudo su hduser
5. Verify that Hadoop is running by checking the output of this command
  • jps
6. Run Maven
  • mvn clean
  • mvn package
7. Create the input/output directories on HDFS and put the input file on HDFS
  • hadoop fs -mkdir /output-file/GameAnalysisDemo/1
  • hadoop fs -put demo.txt /input-file/GameAnalysisDemo
8. Run MapReduce
  • hadoop jar target/src.mapreduce.demo-0.0.1-SNAPSHOT.jar mapreduce.demo /input-file/
9. See the output
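
These notes do not show the job's source, so the following is only an illustrative sketch of how a map step and a reduce step can handle a JSON log like demo.txt. It is written in Python and runs in memory, not on Hadoop. We assume, purely for illustration, that the job counts records per user_id; the function names map_line, reduce_counts, and run_job are hypothetical and are not taken from GameAnalysisDemo.

```python
import json
from collections import defaultdict


def map_line(line):
    """Map step: parse one JSON log line and emit (user_id, 1).

    Blank or malformed lines emit nothing, so one bad record does not
    abort the whole run.
    """
    line = line.strip()
    if not line:
        return []
    try:
        record = json.loads(line)
        return [(record["user_id"], 1)]
    except (ValueError, KeyError, TypeError):
        return []


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(lines):
    """Run map then reduce over an iterable of log lines, in memory."""
    pairs = []
    for line in lines:
        pairs.extend(map_line(line))
    return reduce_counts(pairs)


if __name__ == "__main__":
    sample = [
        '{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID964", "location": {"y": 156, "x": 292}}',
        '{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID442", "location": {"y": 135, "x": 323}}',
    ]
    for user_id, count in sorted(run_job(sample).items()):
        print(user_id, count)
```

Because every record sits on its own line, each line can be parsed independently in the map step. That independence is what lets the input file be split across mappers.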
