The Google File System – Google’s core storage platform

Google File System (GFS) is a proprietary distributed file system that Google developed for its own use, and it serves as the company’s core storage platform. It is a large, distributed, log-structured file system into which Google pours a lot of data. Reliable, scalable storage is a core need of any application, and GFS has two aims: to ensure reliability by keeping redundant copies, and to let the most heavily used data selectively receive more resources (more dedicated hardware, more redundant copies, or both).

GFS is optimized for Google’s core data storage need, web search, which can generate enormous amounts of data that must be retained. It grew out of an earlier Google effort, “BigFiles”, developed by Larry Page and Sergey Brin in the early days of Google, while the company was still based at Stanford. The data is stored persistently in very large, multi-gigabyte files (around 100 GB) that are only very rarely deleted, overwritten, or shrunk; files are usually appended to or read.

GFS is also designed and optimized to run on Google’s computing clusters, whose nodes are cheap “commodity” computers, so precautions must be taken against the high failure rate of individual nodes and the data loss that follows. Other design decisions favor high data throughput, even when it comes at the cost of latency.
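To make the redundancy idea concrete, here is a minimal toy sketch in Python. It is not GFS's actual design or API: the `ToyCluster` class, its method names, the default of three copies, and the random placement are all assumptions made for illustration. It only shows the two ideas described above: every chunk of data is kept on several unreliable nodes, a "hot" chunk can be given extra copies, and copies lost to a node failure are recreated from a surviving replica.

```python
import random


class ToyCluster:
    """Toy model of replicated storage on unreliable nodes (hypothetical, not GFS's API)."""

    def __init__(self, node_names, default_replicas=3, seed=0):
        self.live_nodes = set(node_names)
        self.default_replicas = default_replicas  # assumed default for this sketch
        self.placement = {}  # chunk id -> set of nodes holding a copy
        self.targets = {}    # chunk id -> desired number of copies
        self.rng = random.Random(seed)

    def store(self, chunk_id, replicas=None):
        """Place a chunk on `replicas` distinct live nodes (more for hot data)."""
        want = replicas or self.default_replicas
        nodes = sorted(self.live_nodes)
        self.targets[chunk_id] = want
        self.placement[chunk_id] = set(self.rng.sample(nodes, min(want, len(nodes))))

    def fail_node(self, node):
        """Remove a node, then re-copy under-replicated chunks from survivors."""
        self.live_nodes.discard(node)
        for chunk_id, holders in self.placement.items():
            survivors = holders & self.live_nodes
            if survivors:  # a chunk can only be re-copied if one replica survived
                missing = max(self.targets[chunk_id] - len(survivors), 0)
                spare = sorted(self.live_nodes - survivors)
                survivors |= set(self.rng.sample(spare, min(missing, len(spare))))
            self.placement[chunk_id] = survivors


cluster = ToyCluster(["n1", "n2", "n3", "n4", "n5"])
cluster.store("cold-chunk")             # default number of copies
cluster.store("hot-chunk", replicas=4)  # heavily used data gets more copies
cluster.fail_node(sorted(cluster.placement["cold-chunk"])[0])
print(cluster.placement)
```

A real system would also have to detect failures, copy the actual bytes, and spread copies across racks; the sketch leaves all of that out and tracks only where copies live.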
