The Embedded Theologian
Politics, Embedded Linux, Theology, and Other Eclectica
Observations on Web Site Data
How compressible is web data? For a recent project of mine...
135 GB of web pages downloaded.
15 GB of actual data extracted.
0.6 GB compressed data archives.
So, about a 25:1 compression ratio for text data.
Derived from the above:
15/135 ~= 11%
So, approximately 90% of the web is crud.
Sturgeon's Law - ask for it by name!
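For anyone who wants to check the arithmetic, here is a minimal sketch in Python that recomputes the ratios from the three sizes quoted above. The variable names are my own, and the end-to-end figure (raw download versus final archive) is an extra number derived from the same data rather than something measured separately.

```python
# Sizes quoted in the post, in gigabytes.
downloaded_gb = 135.0   # web pages as downloaded
extracted_gb = 15.0     # actual data extracted from those pages
compressed_gb = 0.6     # extracted data after compression

# Compression ratio of the extracted text.
compression_ratio = extracted_gb / compressed_gb

# Share of the download that was actual data, and the remainder.
useful_fraction = extracted_gb / downloaded_gb
crud_fraction = 1.0 - useful_fraction

# Derived extra: raw download size versus final archive size.
end_to_end_ratio = downloaded_gb / compressed_gb

print(f"compression ratio: about {compression_ratio:.0f}:1")
print(f"useful data: {useful_fraction:.1%}")
print(f"crud: {crud_fraction:.1%}")
print(f"download to archive: about {end_to_end_ratio:.0f}:1")
```

The exact useful share is 1/9, so the "90% crud" figure is a rounding of roughly 89%.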