Archive formats / Rzip / Lossless data compression / Lempel–Ziv–Markov chain algorithm / 7z / Gzip / 7-Zip / Bzip2 / Zip / Software / Data compression / System software


On Compressing the Textual Web
Giovanni Manzini, Paolo Ferragina

Document Date: 2009-12-30 02:02:18



File Size: 577,97 KB


City

Piemonte Orientale / Pisa / Cache / New York City

Company

Google / Limited-Bandwidth Networks / Yahoo! / Microsoft

Country

Italy / United States / United Kingdom

Currency

pence / USD

Facility

Polytechnic University

IndustryTerm

real Web applications / lossless data compression algorithm / Web applications / Web structure / web-based kernel function / differencing algorithms / Web miners / query processing / Web IR / Web collections / web links / pattern mining approach / algorithmic technology / open-source tools / pairs similarity search / classic solution / Web-application / software developers / unknown Web-collection / data compression tools / Web collection / compressed-storage solution / web indexes / post-processing tasks / Web-scale scenario / bwt-based algorithms / open-source search engines / disk-based bwt-construction algorithm / distributed k-means algorithm / differencing-algorithm / indexed Web / Web-clustering tools / Web-storage system / web-transfer / energy consumption / large Web collections / Web-links / Web-compression setting / individual Web pages / data processing / Web-page collections / search engines / Web search engines / Web-page storage / energy/maintenance costs / distributed web crawler / Web-data / large Web-collections / web search / mining / Web-page storage systems / compact web graph representation / social networks / bmi+gzip algorithm / Web Graphs / raw Web pages / compression algorithms / Web collection / classic compression tools / engineered solutions / recent algorithmic technology / designed compressed-storage solutions / state-of-the-art solutions / Web-link / Web data / Web-scale / search engine / compressed-storage solutions / classic tools / Web-page

OperatingSystem

Linux

Organization

Italy Univ. / Polytechnic University / Giovanni Manzini Paolo Ferragina Univ. / IDF / Apache Foundation / American Society for Information Science and Technology

Person

Giovanni Manzini Paolo Ferragina Univ / Morgan Kaufmann / Addison Wesley / Web Graphs

Position

Dean / U. Hölzle

Product

Move / Hadoop system / Hadoop

ProgrammingLanguage

XML

ProvinceOrState

New York / Manitoba

PublishedMedium

Journal of the ACM

Technology

three Algorithms / RAM / bwt-based algorithms / distributed k-means algorithm / Linux / differencing algorithms / disk-based bwt-construction algorithm / compression algorithms / ESA / search engine / random access / html / Terms Algorithms / bmi+gzip algorithm / recent algorithmic technology / Bentley-McIlroy algorithm / technology of compressed self-indexes / lossless data compression algorithm / HTTP / dictionary-based algorithms / dictionary-based algorithm / caching / algorithmic technology

URL

http /

SocialTag