Modifying the Number of Mappers or Reducers on a Running EMR Cluster
Amazon EMR unfortunately doesn’t give you an easy way to change the number of mappers and reducers on a running cluster. To do so before booting the cluster, add
as appropriate to the elastic-mapreduce.rb command.
For a running EMR cluster, you can use the following commands. First navigate to the conf directory on the head node; it will be in a path similar to /home/hadoop/.versions/1.0.3/conf.
Edit mapred-site.xml and change the value of either or both of
mapred.tasktracker.map.tasks.maximum
or
mapred.tasktracker.reduce.tasks.maximum
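If you would rather script the edit than make it by hand, here is a minimal sketch. The helper name set_property is hypothetical, and it assumes the property's <name> and <value> elements sit on the same line of mapred-site.xml; if your file spreads them over several lines, edit it by hand instead.

```shell
#!/bin/sh
# Hypothetical helper: set the value of one property in a Hadoop *-site.xml file.
# Assumption: <name>...</name> and <value>...</value> are on the same line.
set_property() {
    file=$1; prop=$2; value=$3
    # Escape the dots in the property name so sed matches them literally.
    prop_re=$(printf '%s' "$prop" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    # Replace only the text between <value> and </value> for this property.
    sed "s|\(<name>${prop_re}</name>[[:space:]]*<value>\)[^<]*\(</value>\)|\1${value}\2|" \
        "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back in place so the original file's owner and permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example call (run from the conf directory; the value 4 is only an illustration):
#   set_property mapred-site.xml mapred.tasktracker.map.tasks.maximum 4
```

The function rewrites only the matching property and leaves every other line untouched, so it is worth running diff against a backup copy before distributing the file.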
Then copy and paste these commands:
# distribute the file to all nodes
hadoop job -list-active-trackers | sed "s/^.*_//" | sed "s/:.*//" | xargs -t -I{} -P10 scp -o StrictHostKeyChecking=no mapred-site.xml hadoop@{}:.versions/1.0.3/conf/

# bounce the tasktrackers on each node
hadoop job -list-active-trackers | sed "s/^.*_//" | sed "s/:.*//" | xargs -t -I{} -P10 ssh -o StrictHostKeyChecking=no hadoop@{} sudo /etc/init.d/hadoop-tasktracker stop

# restart the jobtracker on the headnode
sudo /etc/init.d/hadoop-jobtracker stop
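To see what the two sed expressions in those pipelines do, the sketch below applies them to a made-up tracker name. The sample string and the function name tracker_hosts are illustrative assumptions; the exact names your cluster reports may differ, so check the real output of hadoop job -list-active-trackers first.

```shell
#!/bin/sh
# Reduce tracker names read on stdin to bare hostnames, using the same two
# sed expressions as the distribution commands.
tracker_hosts() {
    # First sed: drop everything up to and including the last underscore.
    # Second sed: drop everything from the first colon onward.
    sed 's/^.*_//' | sed 's/:.*//'
}

# Illustrative tracker name only; real names come from
#   hadoop job -list-active-trackers
sample='tracker_ip-10-0-0-1.ec2.internal:localhost/127.0.0.1:45678'
printf '%s\n' "$sample" | tracker_hosts
# prints: ip-10-0-0-1.ec2.internal
```

A cautious way to preview the real commands is to put echo in front of scp or ssh in the xargs call, which prints one command per node instead of running it.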
One way to verify that this worked is to check the jobtracker web page, whose cluster summary lists the map and reduce task capacity.