This blog entry discusses how we parse a JSON log file using MapReduce.
We will use the AWS AMI we set up earlier to run the MapReduce task.
1. We use a simple JSON log file generator to generate the following JSON log file and save it as demo.txt:
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID442", "location": {"y": 135, "x": 323}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID314", "location": {"y": 153, "x": 316}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID711", "location": {"y": 131, "x": 310}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID397", "location": {"y": 170, "x": 347}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID120", "location": {"y": 122, "x": 355}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID591", "location": {"y": 117, "x": 213}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID164", "location": {"y": 125, "x": 341}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID187", "location": {"y": 135, "x": 382}}
{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID623", "location": {"y": 137, "x": 359}}
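The generator itself is not shown in this post. As a rough sketch only, assuming it is written in Python, a generator producing records with the same three fields could look like the following. The function names, the user ID range, and the coordinate ranges are all assumptions made for illustration.

```python
import json
import random
from datetime import datetime


def make_record(rng, now):
    """Build one log record with the same fields as the sample above."""
    return {
        "time_stamp": now.strftime("%Y-%m-%d %H:%M:%S"),
        "user_id": "UID%d" % rng.randint(100, 999),  # assumed ID range
        "location": {
            "y": rng.randint(100, 200),  # assumed coordinate range
            "x": rng.randint(200, 400),  # assumed coordinate range
        },
    }


def generate_lines(count, seed=None):
    """Return `count` log lines, each a single JSON object."""
    rng = random.Random(seed)
    now = datetime.now()
    return [json.dumps(make_record(rng, now)) for _ in range(count)]


if __name__ == "__main__":
    # One JSON object per line, which is the layout the sample file uses.
    with open("demo.txt", "w") as handle:
        handle.write("\n".join(generate_lines(9)) + "\n")
```

Writing one JSON object per line matters here because Hadoop's default text input hands the mapper one line at a time, so each line can be parsed on its own.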
2. We use PuTTY to SSH into the AWS instance we created earlier

3. Download the GameAnalysisDemo zip from S3, then unzip it
- sudo wget https://s3-us-west-1.amazonaws.com/gameanalysisdemo/GameAnalysisDemo-master.zip
- sudo unzip GameAnalysisDemo-master.zip -d GameAnalysisDemo
4. Log in as the Hadoop user
- sudo su hduser
5. Verify that Hadoop is running; jps should list the Hadoop daemons, such as NameNode and DataNode
- jps

6. Build the project with Maven, from the directory that contains pom.xml
- mvn clean
- mvn package
7. Create the input/output directory on HDFS and put the input file on HDFS
- hadoop fs -mkdir /output-file/GameAnalysisDemo/1
- hadoop fs -put demo.txt /input-file/GameAnalysisDemo
8. Run the MapReduce job
- hadoop jar target/src.mapreduce.demo-0.0.1-SNAPSHOT.jar mapreduce.demo /input-file/
9. See the output
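
This post does not show the job's source code or its output. To illustrate what a JSON-parsing MapReduce job does with demo.txt, here is a small local simulation in Python. It assumes, purely for illustration, that the job counts log records per user_id; the real GameAnalysisDemo job may compute something else, and the function names are hypothetical.

```python
import json
from collections import defaultdict


def map_line(line):
    """Map step: parse one JSON log line and emit (user_id, 1) pairs."""
    line = line.strip()
    if not line:
        return []
    try:
        record = json.loads(line)
    except ValueError:
        return []  # skip malformed lines instead of failing the whole run
    if not isinstance(record, dict) or "user_id" not in record:
        return []
    return [(record["user_id"], 1)]


def reduce_pairs(pairs):
    """Shuffle and reduce step: group pairs by key and sum the values."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_local(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = [pair for line in lines for pair in map_line(line)]
    return reduce_pairs(pairs)


sample = [
    '{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID442", "location": {"y": 135, "x": 323}}',
    '{"time_stamp": "2015-07-01 19:19:13", "user_id": "UID314", "location": {"y": 153, "x": 316}}',
    "not json",
]

# Print key and value separated by a tab, one pair per line.
for user_id, count in sorted(run_local(sample).items()):
    print("%s\t%d" % (user_id, count))
```

The tab-separated print mirrors the key, tab, value layout that Hadoop's default text output uses, so the result of a real run can be compared line by line against a local run like this one.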