Tuesday, January 6, 2015

Data locality Understanding

In1990 , a normal disk drive used be a capacity of 1.3GB.The reading speed from drive is 4.4 MB/Sec. So it will take near about 5 mins.

Now a normal disk drive is 1TB. It has multiplied to 10^6 times.
The reading speed is close 100MB/sec . So it will take around

1TB/(100MB/Sec)= 10^4 Sec= It is approximately  2hours 30 minutes.

It makes sense to use distributed system . But in traditional distributed system where data moves from storage to computing node . So evenif the processpr speed has improved a lot , it has to still wait for the data to reach . So there comes Hadoop  Distributed  System  where the data does not move to the node  where computation  happens instead code  . This is actually  called  as data  locality.

No comments:

Post a Comment